2024-11-15

NVIDIA Working On An RTX 3080 Ti With 9984 CUDA Cores And 34 TFLOPs

A leaker who predicted the specifications of the RTX 3000 series months in advance has just revealed that the company is working on an RTX 3080 Ti. The Twitter user in question has a stellar record and we have no reason to doubt this information. That said, this card does appear to be in the very early stages, and considering there are already rumors of NVIDIA dropping the 20 GB variant of the RTX 3080, it might be wise to take this with a grain of salt in case Jensen changes his mind again.

NVIDIA RTX 3080 Ti with 9984 cores and 34 TFLOPs baking in Jensen’s oven right now

According to Kopite, the RTX 3080 Ti will be based on the GA102 and will have specifications between those of the RTX 3080 and the RTX 3090. The exact chip nomenclature is GA102-250-A1 and it will feature a 384-bit bus with GDDR6X memory. The bus width means NVIDIA will be using either a 12 GB buffer or a 24 GB one. Considering Microsoft Flight Simulator 2020 is already bottlenecked by the 11 GB of VRAM on the RTX 2080 Ti, it would be disappointing to see NVIDIA ship another powerful GPU with a small memory buffer. With RTX IO and asset streaming (as seen in the Unreal Engine demo for next-generation consoles) becoming a thing in the next year or so, every bit of buffer will help in this paradigm shift.
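The 12 GB or 24 GB conclusion follows from how memory chips attach to the bus. Here is a minimal sketch of that arithmetic, assuming each GDDR6X chip has a 32-bit interface and 1 GB (8 Gb) of capacity, and that the larger option is reached by pairing two chips per channel (clamshell mode). The function name and parameters are illustrative and not part of the leak.

```python
def memory_configs(bus_width_bits, chip_interface_bits=32, chip_capacity_gb=1):
    """Return (single, clamshell) VRAM capacities in GB for a given bus width.

    Assumption: one chip per 32-bit channel, or two chips sharing each
    channel in clamshell mode, with 1 GB per chip.
    """
    channels = bus_width_bits // chip_interface_bits
    single = channels * chip_capacity_gb
    clamshell = 2 * single
    return single, clamshell


print(memory_configs(384))  # (12, 24)
print(memory_configs(320))  # (10, 20)
```

Under the same assumptions, a 320-bit bus gives the 10 GB and 20 GB options discussed for the RTX 3080.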

It is also unclear at this point how the revelation of this new GPU fits into the rumors about NVIDIA planning a move back to TSMC. It does, however, lend credence to the belief that the RTX 3000 series, at least for now, is staying on the Samsung 8nm process. The company initially faced less-than-ideal yields and supply constraints at launch, but those are expected to improve significantly as we enter the new year.

This is going to be an insanely powerful card with 34 TFLOPs of FP32 compute. That said, the current API, driver, and application infrastructure cannot fully take advantage of all this raw power. This is the actual reason why the performance of NVIDIA’s insanely powerful cards doesn’t scale linearly with TFLOPs. The company essentially made cards that are ahead of their time when compared to the surrounding software ecosystem.
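The 34 TFLOPs figure can be sanity-checked with the usual peak FP32 formula: two floating-point operations (one fused multiply-add) per CUDA core per clock. The sketch below is a rough check only. The 1700 MHz boost clock used for the RTX 3080 Ti is an assumption borrowed from the RTX 3090, since no clock has been reported for this card, and the function name is illustrative.

```python
def fp32_tflops(cuda_cores, boost_clock_mhz):
    """Peak FP32 throughput in TFLOPs: 2 FLOPs per core per clock (one FMA)."""
    return 2 * cuda_cores * boost_clock_mhz * 1e6 / 1e12


# Known cards, using their listed boost clocks
print(round(fp32_tflops(8704, 1710), 1))   # RTX 3080: 29.8
print(round(fp32_tflops(10496, 1700), 1))  # RTX 3090: 35.7

# Rumored RTX 3080 Ti: 9984 cores, boost clock ASSUMED to be 1700 MHz
print(round(fp32_tflops(9984, 1700), 1))   # 33.9
```

With that assumed clock, 9984 cores land at roughly 34 TFLOPs, which is consistent with the leaked figure.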

The hard evidence for this lies in the fact that the RTX 3000 series has been showing non-linear positive scaling when going down the stack: the lower-end cards deliver more performance per TFLOP than the ones above them. An RTX 3070, which has slightly more cores than the RTX 2080 Ti, beats the former flagship, putting to rest any rumors or allegations that the CUDA cores in the Ampere series are not as strong as those in the Turing series, as well as misplaced blame on the architecture design for lacking INT performance.

AMD’s Big Navi series drops later today and it remains to be seen how the GPU market shapes up as we enter the holiday season. NVIDIA’s pricing is great, but the company needs to work with developers to fix the API and driver stacks to properly take advantage of the raw performance offered by the Ampere series (fine wine of the highest order). In the meantime, AMD is going to churn out budget-friendly performance cards with what appears to be ample supply. AMD has also taken a lot of steps to make sure that the bot-scalping that happened with NVIDIA does not happen at its Radeon RX 6000 series launch.

NVIDIA GeForce RTX 30 Series ‘Ampere’ Graphics Card Specifications:

| Graphics Card Name | NVIDIA GeForce RTX 3050 | NVIDIA GeForce RTX 3050 Ti | NVIDIA GeForce RTX 3060 | NVIDIA GeForce RTX 3060 Ti | NVIDIA GeForce RTX 3070 | NVIDIA GeForce RTX 3070 Ti? | NVIDIA GeForce RTX 3080 | NVIDIA GeForce RTX 3080 Ti? | NVIDIA GeForce RTX 3090 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GPU Name | Ampere GA107 | Ampere GA106? | Ampere GA106? | Ampere GA104-200 | Ampere GA104-300 | Ampere GA102-150 | Ampere GA102-200 | Ampere GA102-250 | Ampere GA102-300 |
| Process Node | Samsung 8nm | Samsung 8nm | Samsung 8nm | Samsung 8nm | Samsung 8nm | Samsung 8nm | Samsung 8nm | Samsung 8nm | Samsung 8nm |
| Die Size | TBA | TBA | TBA | 395.2 mm² | 395.2 mm² | 628.4 mm² | 628.4 mm² | 628.4 mm² | 628.4 mm² |
| Transistors | TBA | TBA | TBA | 17.4 Billion | 17.4 Billion | 28 Billion | 28 Billion | 28 Billion | 28 Billion |
| CUDA Cores | 2304 | 3584 | 3840 | 4864 | 5888 | 7424 | 8704 | 10496 | 10496 |
| TMUs / ROPs | TBA | TBA | TBA | 152 / 80 | 184 / 96 | 232 / 80 | 272 / 96 | 328 / 112 | 328 / 112 |
| Tensor / RT Cores | TBA | TBA | TBA | 152 / 38 | 184 / 46 | 232 / 58 | 272 / 68 | 328 / 82 | 328 / 82 |
| Base Clock | TBA | TBA | TBA | 1410 MHz | 1500 MHz | TBA | 1440 MHz | TBA | 1400 MHz |
| Boost Clock | TBA | TBA | TBA | 1665 MHz | 1730 MHz | TBA | 1710 MHz | TBA | 1700 MHz |
| FP32 Compute | TBA | TBA | TBA | 16.2 TFLOPs | 20 TFLOPs | TBA | 30 TFLOPs | TBA | 36 TFLOPs |
| RT TFLOPs | TBA | TBA | TBA | 32.4 TFLOPs | 40 TFLOPs | TBA | 58 TFLOPs | TBA | 69 TFLOPs |
| Tensor-TOPs | TBA | TBA | TBA | TBA | 163 TOPs | TBA | 238 TOPs | TBA | 285 TOPs |
| Memory Capacity | 4 GB GDDR6? | 6 GB GDDR6? | 6 GB GDDR6? | 8 GB GDDR6 | 8 GB GDDR6 | 10 GB GDDR6X? | 10 GB GDDR6X | 20 GB GDDR6X | 24 GB GDDR6X |
| Memory Bus | 128-bit | 192-bit? | 192-bit? | 256-bit | 256-bit | 320-bit | 320-bit | 320-bit | 384-bit |
| Memory Speed | TBA | TBA | TBA | 14 Gbps | 14 Gbps | TBA | 19 Gbps | 19 Gbps | 19.5 Gbps |
| Bandwidth | TBA | TBA | TBA | 448 GB/s | 448 GB/s | TBA | 760 GB/s | 760 GB/s | 936 GB/s |
| TGP | 90W? | TBA | TBA | 180W? | 220W | 320W? | 320W | 320W | 350W |
| Price (MSRP / FE) | $149? | $199? | $299? | $399 US? | $499 US | $599 US? | $699 US | $899 US? | $1499 US |
| Launch (Availability) | 2021? | 2021? | 2021? | November 2020? | 29th October | Q4 2020? | 17th September | January 2021? | 24th September |
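
The Bandwidth figures above can be reproduced from the Memory Bus and Memory Speed rows: bus width in bits times per-pin speed in Gbps, divided by 8 bits per byte, gives gigabytes per second. This is a minimal sketch of that calculation, and the function name is illustrative.

```python
def bandwidth_gb_per_s(bus_width_bits, speed_gbps_per_pin):
    """Peak memory bandwidth in GB/s: (bus width x per-pin data rate) / 8."""
    return bus_width_bits * speed_gbps_per_pin / 8


print(bandwidth_gb_per_s(256, 14))    # RTX 3070: 448.0
print(bandwidth_gb_per_s(320, 19))    # RTX 3080: 760.0
print(bandwidth_gb_per_s(384, 19.5))  # RTX 3090: 936.0
```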