NVIDIA Ampere GA100 GPU-Powered Tesla A100: World's Largest 7nm GPU, 54 Billion Transistors, 1 Petaflops Compute & Up To 96 GB HBM2 Memory

NVIDIA has unveiled the GA100 GPU, its first 7nm chip and also the world's largest, based on the next-gen Ampere GPU architecture. Featuring 20 times the performance of its predecessor, the Volta GPU, Ampere ushers in a new era of high-performance computing as the first GPU in the world to deliver a peak compute power of more than 1 Peta-OPs for AI/DNN.

NVIDIA Unveils The World's Largest 7nm GPU, The Ampere GA100 GPU - Powering The Tesla A100 With 54 Billion Transistors and Up To 96 GB Of The Fastest HBM2 Memory

Powered by the next-generation Ampere GPU architecture, the Tesla A100 is an impressive board for the HPC market. The first thing we have to talk about with any HPC GPU is its specs, and Ampere is a monster of a chip. NVIDIA went all out with the 7nm process node, making GA100 the largest 7nm chip in production. That's not all: it's also the most advanced and feature-packed chip in the industry right now.

The Ampere GA100 GPU is built on a bleeding-edge 7nm process node and packs a gargantuan 54 billion transistors. The chip is expected to feature 128 SM units for a total of 8192 CUDA cores. That alone is a 60% increase in core count over the 5120 cores of the Volta GV100. For memory, we are looking at six HBM stacks, which point to a 6144-bit bus interface. The memory dies are definitely from Samsung, which has been NVIDIA's strategic memory partner for HPC-centric GPUs.
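The core-count figures above are easy to sanity-check. The short Python sketch below takes the expected GA100 numbers from the text and the Tesla V100 core count from the spec table. The function names are illustrative only, and the GA100 values remain pre-launch expectations rather than confirmed specs.

```python
# Figures quoted in the article; the GA100 values are expectations, not confirmed specs.
GA100_SM_COUNT = 128
GA100_CUDA_CORES = 8192
GV100_CUDA_CORES = 5120  # Tesla V100, per the spec table


def cores_per_sm(total_cores: int, sm_count: int) -> int:
    """CUDA cores per SM implied by a total core count and an SM count."""
    return total_cores // sm_count


def percent_increase(new: int, old: int) -> float:
    """Relative increase of `new` over `old`, in percent."""
    return (new - old) * 100.0 / old


print(cores_per_sm(GA100_CUDA_CORES, GA100_SM_COUNT))        # 64
print(percent_increase(GA100_CUDA_CORES, GV100_CUDA_CORES))  # 60.0
```

The implied 64 cores per SM matches the per-SM figure the table lists for Pascal and Volta, and the uplift over the V100's 5120 cores works out to 60%.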

NVIDIA's Ampere GA100 GPU is a massive chip featuring 54 billion transistors. (Image Credits: EETimes via Videocardz)

Samsung recently announced its HBM2E DRAM, which features 16 Gb dies. Depending on the height of the stacks, NVIDIA could offer anywhere from 48 GB (4-Hi) all the way up to 96 GB (8-Hi), an insane amount of VRAM compared to the existing Tesla V100, which maxes out at 32 GB. The HBM2E stacks also deliver increased speeds of up to 3.2 Gbps per pin, allowing for up to 410 GB/s of bandwidth per stack, or about 2.5 TB/s across all six stacks. It could be even faster if NVIDIA decides to go for the 4.2 Gbps dies, which would result in 3.2 TB/s of bandwidth for the entire chip, an amazing technical feat.
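The capacity and bandwidth figures follow from simple arithmetic, sketched below in Python. It assumes the standard 1024-bit interface per HBM2/HBM2E stack (which is what six stacks on a 6144-bit bus implies) and uses the die density, stack heights, and pin speeds quoted above. The helper names are made up for illustration.

```python
STACKS = 6                 # six HBM stacks, as described in the article
BUS_BITS_PER_STACK = 1024  # assumption: standard HBM2/HBM2E stack interface width
DIE_GBIT = 16              # Samsung HBM2E die density quoted in the article


def total_bus_width_bits() -> int:
    """Aggregate memory bus width across all stacks."""
    return STACKS * BUS_BITS_PER_STACK


def total_capacity_gb(dies_per_stack: int) -> float:
    """Total capacity in GB for a given stack height (4-Hi, 8-Hi, ...)."""
    per_stack_gb = dies_per_stack * DIE_GBIT / 8  # 8 bits per byte
    return STACKS * per_stack_gb


def stack_bandwidth_gbs(pin_gbps: float) -> float:
    """Bandwidth of one stack in GB/s at a given per-pin data rate."""
    return pin_gbps * BUS_BITS_PER_STACK / 8


def total_bandwidth_tbs(pin_gbps: float) -> float:
    """Aggregate bandwidth of all stacks in TB/s (1 TB/s = 1000 GB/s)."""
    return STACKS * stack_bandwidth_gbs(pin_gbps) / 1000


print(total_bus_width_bits())                # 6144
print(total_capacity_gb(4))                  # 48.0
print(total_capacity_gb(8))                  # 96.0
print(stack_bandwidth_gbs(3.2))              # 409.6
print(round(total_bandwidth_tbs(3.2), 2))    # 2.46
print(round(total_bandwidth_tbs(4.2), 2))    # 3.23
```

The 410 GB/s figure is therefore a per-stack number, while 2.5 TB/s and 3.2 TB/s are the rounded totals for six stacks at 3.2 Gbps and 4.2 Gbps respectively.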

In terms of performance, the Ampere GA100 GPU delivers 1 Peta-OPs, a 20x increase over the Volta GV100 GPU. Double-precision performance is rated at 2.5x that of the Volta GV100, which should land somewhere around 20 TFLOPs FP64 since Volta offers around 8 TFLOPs of FP64 compute. Assuming the same 2:1 FP32-to-FP64 ratio as Volta, this would put single-precision performance at around 40 TFLOPs (FP32), which would be mind-blowing for the HPC segment.
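The estimate above can be reproduced in a few lines of Python. This is only a sketch of the article's reasoning: the 2.5x multiplier and the Volta figures come from the text and the spec table, while the 2:1 FP32-to-FP64 ratio is an assumption carried over from Volta (15.7 vs 7.8 TFLOPs), not a confirmed Ampere spec.

```python
FP64_SPEEDUP = 2.5         # claimed double-precision uplift over Volta
FP32_TO_FP64_RATIO = 2.0   # assumption: same 2:1 ratio as Volta GV100


def estimate_fp64_tflops(volta_fp64_tflops: float) -> float:
    """Projected GA100 FP64 throughput from the Volta baseline."""
    return volta_fp64_tflops * FP64_SPEEDUP


def estimate_fp32_tflops(fp64_tflops: float) -> float:
    """Projected FP32 throughput under the assumed FP32:FP64 ratio."""
    return fp64_tflops * FP32_TO_FP64_RATIO


# Using the article's rounded "around 8 TFLOPs" Volta baseline.
rounded_fp64 = estimate_fp64_tflops(8.0)
print(rounded_fp64, estimate_fp32_tflops(rounded_fp64))  # 20.0 40.0

# Using the 7.8 TFLOPs listed for Tesla V100 (SXM2) in the spec table.
table_fp64 = estimate_fp64_tflops(7.8)
print(round(table_fp64, 1), round(estimate_fp32_tflops(table_fp64), 1))  # 19.5 39.0
```

Either baseline lands close to the roughly 20 TFLOPs FP64 and 40 TFLOPs FP32 figures quoted for GA100.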

NVIDIA Ampere GA100 GPU Based Tesla A100 Specs:

NVIDIA Tesla Graphics Card	Tesla K40 (PCI-Express)	Tesla M40 (PCI-Express)	Tesla P100 (PCI-Express)	Tesla P100 (SXM2)	Tesla V100 (SXM2)	Tesla V100S (PCIe)	Tesla A100 (SXM3)
GPU	GK110 (Kepler)	GM200 (Maxwell)	GP100 (Pascal)	GP100 (Pascal)	GV100 (Volta)	GV100 (Volta)	GA100 (Ampere)
Process Node	28nm	28nm	16nm	16nm	12nm	12nm	7nm
Transistors	7.1 Billion	8 Billion	15.3 Billion	15.3 Billion	21.1 Billion	21.1 Billion	54 Billion
GPU Die Size	551 mm2	601 mm2	610 mm2	610 mm2	815 mm2	815 mm2	~800-850 mm2
SMs	15	24	56	56	80	80	128?
TPCs	15	24	28	28	40	40	TBD
CUDA Cores Per SM	192	128	64	64	64	64	TBD
CUDA Cores (Total)	2880	3072	3584	3584	5120	5120	8192?
Texture Units	240	192	224	224	320	320	TBD
FP64 CUDA Cores / SM	64	4	32	32	32	32	TBD
FP64 CUDA Cores / GPU	960	96	1792	1792	2560	2560	TBD
Base Clock	745 MHz	948 MHz	1190 MHz	1328 MHz	1297 MHz	TBD	TBD
Boost Clock	875 MHz	1114 MHz	1329 MHz	1480 MHz	1530 MHz	1601 MHz	TBD
FP16 Compute	N/A	N/A	18.7 TFLOPs	21.2 TFLOPs	30.4 TFLOPs	32.8 TFLOPs	~80 TFLOPs
FP32 Compute	5.04 TFLOPs	6.8 TFLOPs	10.0 TFLOPs	10.6 TFLOPs	15.7 TFLOPs	16.4 TFLOPs	~40 TFLOPs
FP64 Compute	1.68 TFLOPs	0.2 TFLOPs	4.7 TFLOPs	5.30 TFLOPs	7.80 TFLOPs	8.2 TFLOPs	~20 TFLOPs
TOPs (DNN/AI)	N/A	N/A	N/A	N/A	125 TOPs	130 TOPs	>1000 TOPs
Memory Interface	384-bit GDDR5	384-bit GDDR5	4096-bit HBM2	4096-bit HBM2	4096-bit HBM2	4096-bit HBM2	6144-bit HBM2e
Memory Size @ Bandwidth	12 GB GDDR5 @ 288 GB/s	24 GB GDDR5 @ 288 GB/s	16 GB HBM2 @ 732 GB/s / 12 GB HBM2 @ 549 GB/s	16 GB HBM2 @ 732 GB/s	16 GB HBM2 @ 900 GB/s	32 GB HBM2 @ 1134 GB/s	Up To 96 GB HBM2e @ 2.5-3.2 TB/s
L2 Cache Size	1536 KB	3072 KB	4096 KB	4096 KB	6144 KB	6144 KB	TBD
TDP	235W	250W	250W	300W	300W	250W	250W?
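As a rough cross-check of the table, peak FP32 throughput is conventionally computed as CUDA cores x 2 FLOPs per clock (one fused multiply-add) x boost clock. The Python sketch below applies that formula to a few rows of the table. The function names are illustrative, and the GA100 inputs are the table's tentative figures, so the implied clock is an arithmetic consequence of those figures rather than a spec.

```python
def peak_fp32_tflops(cuda_cores: int, boost_mhz: float) -> float:
    """Peak FP32 TFLOPs: cores * 2 FLOPs per clock (FMA) * boost clock."""
    return cuda_cores * 2 * boost_mhz * 1e6 / 1e12


def implied_boost_mhz(cuda_cores: int, target_tflops: float) -> float:
    """Boost clock (MHz) needed to reach a target peak FP32 figure."""
    return target_tflops * 1e12 / (2 * cuda_cores) / 1e6


# Rows taken from the spec table: (name, CUDA cores, boost clock in MHz).
cards = [
    ("Tesla K40", 2880, 875),
    ("Tesla V100 (SXM2)", 5120, 1530),
    ("Tesla V100S (PCIe)", 5120, 1601),
]
for name, cores, boost in cards:
    print(name, round(peak_fp32_tflops(cores, boost), 2))
# Tesla K40 5.04
# Tesla V100 (SXM2) 15.67
# Tesla V100S (PCIe) 16.39

# Clock implied by the tentative GA100 figures (8192 cores, ~40 TFLOPs FP32).
print(round(implied_boost_mhz(8192, 40.0)))  # 2441
```

The Kepler and Volta rows reproduce the table's FP32 values to within rounding. For GA100, the tentative 8192 cores and ~40 TFLOPs together would imply a boost clock of roughly 2.4 GHz under this formula. The table still lists the clocks as TBD, so this is best read as a consistency check on the projected numbers.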

NVIDIA's Ampere GA100 also introduces a new numerical format for Tensor operations known as Tensor Float 32, or TF32, which runs on the 3rd Generation Tensor Cores and offers higher AI/DNN throughput. The Tensor cores also natively support double-precision compute, which allows the GA100 GPU to hit a 2.5x performance increase over its predecessor. As of right now, nothing announced by the competition comes close to this beast.

The DGX-A100 - The First HPC System With 140 Peta-OPs Compute Shipping Now For $199,000

Finally, NVIDIA is announcing its next-generation DGX-A100 system, which Jensen Huang teased a few days ago. The DGX-A100 delivers 5 Petaflops of peak performance with its eight Ampere-based Tesla A100 GPUs. The system itself is 20x faster than the previous DGX based on NVIDIA's Volta GPU architecture. The reference cluster design features 140 DGX-A100 systems with a 200 Gbps Mellanox InfiniBand interconnect. The DGX-A100 starts at $199,000 and is shipping as of today.

The post NVIDIA Ampere GA100 GPU Powered Tesla A100: Worlds Largest 7nm GPU, 54 Billion Transistors, 1 Petaflops Compute & Up To 96 GB HBM2 Memory by Hassan Mujtaba appeared first on Wccftech.

Reference: https://wccftech.com
