• Earthman_Jim@lemmy.zip
      3 days ago

      Yeah, I wonder how long it will take them to clue in that no one wants to trade gaming for an AI fucking girlfriend ffs…

      • setsubyou@lemmy.world
        2 days ago

        I mean if they came with a cool android body we could talk about it. It should at least be able to do cleaning and cooking. Otherwise my wife won’t like it.

  • RegularJoe@lemmy.worldOP
    3 days ago

    Nvidia’s Vera Rubin platform is the company’s next-generation architecture for AI data centers that includes an 88-core Vera CPU, Rubin GPU with 288 GB HBM4 memory, Rubin CPX GPU with 128 GB of GDDR7, NVLink 6.0 switch ASIC for scale-up rack-scale connectivity, BlueField-4 DPU with integrated SSD to store key-value cache, Spectrum-6 Photonics Ethernet, and Quantum-CX9 1.6 Tb/s Photonics InfiniBand NICs, as well as Spectrum-X Photonics Ethernet and Quantum-CX9 Photonics InfiniBand switching silicon for scale-out connectivity.

    • TropicalDingdong@lemmy.world
      3 days ago

      288 GB HBM4 memory

      jfc…

      Looking at the specs… fucking hell, these things probably cost over 100k.

      I wonder if we’ll see a generational performance leap with LLMs scaling to this much memory.

      • AliasAKA@lemmy.world
        3 days ago

        Current models are speculated to have 700 billion parameters or more. At 32-bit precision (single-precision float, 4 bytes per parameter), that’s 2.8 TB of RAM per model, or about 10 of these units. There are ways to lower it, but if you’re trying to run full precision (say for training) you’d use over 2x this, maybe something like 4x depending on how you store gradients and updates. By full precision I mean 32-bit. I suppose it’s possible they actually train at 32-bit, but I’d be kind of surprised.

        Edit: Also, they don’t release parameter counts anymore, but some folks think newer models are around 1.5 trillion parameters. So figure around 2-3x the number above for newer models. The only real strategy for these guys is bigger. I think it’s dumb, and the returns are diminishing rapidly, but you’ve got to sell the investors. If reciting nearly whole works verbatim is easy now, it’s going to be exact if they keep going. They’ll approach parameter counts big enough to just straight up store things in the weights.
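
        For anyone who wants to check the napkin math above, here is a rough sketch. The 700 billion and 1.5 trillion parameter counts are the speculated figures from this comment, the 288 GB is the Rubin HBM4 figure quoted upthread, and the `overhead` multiplier is just the "2x to 4x" training guess, not a measured number. The function names are made up for illustration.

```python
import math

GB = 10**9                 # decimal gigabyte, matching the "2.8 TB" figure
HBM_PER_GPU_GB = 288       # Rubin GPU HBM4 capacity quoted in the thread


def weight_memory_gb(params: int, bytes_per_param: int) -> float:
    """Memory needed just to hold the weights, in GB."""
    return params * bytes_per_param / GB


def gpus_needed(params: int, bytes_per_param: int, overhead: float = 1.0) -> int:
    """GPUs required by memory capacity alone.

    `overhead` is a rough multiplier for gradients and optimizer state
    (an assumption taken from the comment's 2x-4x guess).
    """
    total_gb = weight_memory_gb(params, bytes_per_param) * overhead
    return math.ceil(total_gb / HBM_PER_GPU_GB)


if __name__ == "__main__":
    for label, params in [("700B", 700 * 10**9), ("1.5T", 1500 * 10**9)]:
        for bits, nbytes in [(32, 4), (16, 2)]:
            print(
                f"{label} @ {bits}-bit: "
                f"{weight_memory_gb(params, nbytes):.0f} GB, "
                f"{gpus_needed(params, nbytes)} GPUs for inference, "
                f"{gpus_needed(params, nbytes, overhead=4.0)} GPUs at 4x overhead"
            )
```

        This only counts memory capacity. It ignores activations, KV cache, and interconnect limits, so real deployments would need more headroom than these numbers suggest.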

  • fubarx@lemmy.world
    3 days ago

    Question is, how long before it makes it to the next DGX Spark? Some people don’t have $10B to burn.