Imagine a system that explores multiple approaches to complex problems, draws on its understanding of vast amounts of data, from scientific datasets to source code to business documents, and reasons through possibilities in real time. This kind of rapid reasoning is not on some distant horizon; it is happening today in our customers’ AI production environments. The scale of the AI systems customers are building today, across drug discovery, enterprise search, software development, and more, is truly remarkable. And there is much more ahead.
We are pleased to announce the general availability of P6e-GB200 UltraServers, accelerated by NVIDIA Grace Blackwell Superchips, to accelerate innovation in new frontiers of generative AI such as reasoning models and agentic AI systems. P6e-GB200 UltraServers are designed for training and deploying the largest, most sophisticated AI models. Earlier this year, we launched P6-B200 instances, accelerated by NVIDIA Blackwell GPUs, for a broad range of AI and high-performance computing workloads.
In this post, we share how these powerful compute solutions build on everything we have learned about delivering secure and reliable GPU infrastructure at scale, so that customers can confidently push the boundaries of AI.
Meeting the growing computational demands of AI workloads
P6e-GB200 UltraServers represent our most powerful GPU offering to date, with up to 72 NVIDIA Blackwell GPUs interconnected using fifth-generation NVIDIA NVLink. Each UltraServer delivers a massive 360 petaflops of dense FP8 compute and 13.4 TB of total high-bandwidth GPU memory (HBM3e), more than 20 times the compute and more than 11 times the memory within a single NVLink domain compared to P5en instances, and supports up to 28.8 Tbps of total fourth-generation Elastic Fabric Adapter (EFAv4) networking. P6-B200 instances offer eight NVIDIA Blackwell GPUs interconnected using NVLink, 1.4 TB of high-bandwidth GPU memory, up to 3.2 Tbps of EFAv4 networking, and fifth-generation Intel Xeon Scalable processors. Compared to P5en instances, P6-B200 instances provide up to 2.25 times the GPU TFLOPs, 1.27 times the GPU memory size, and 1.6 times the GPU memory bandwidth.
How do you choose between P6e-GB200 and P6-B200? The choice comes down to the requirements and architectural needs of your specific workload:
- P6e-GB200 UltraServers are ideal for the most compute- and memory-intensive AI workloads, such as training and deploying frontier models. Their NVIDIA GB200 NVL72 architecture really shines at this scale: think of all 72 GPUs operating as one, with a unified memory space and coordinated workload distribution. The architecture enables more efficient distributed training by reducing communication overhead between GPU nodes. For inference workloads, the ability to contain trillion-parameter models entirely within a single NVLink domain means faster, more consistent response times at scale. Combined with optimization techniques such as disaggregated serving with NVIDIA Dynamo, the large domain size of the GB200 NVL72 architecture unlocks significant inference efficiencies for various model architectures, such as mixture of experts models. P6e-GB200 UltraServers are especially powerful when you need to handle large context windows or run high-concurrency applications in real time.
- P6-B200 instances support a broad range of AI workloads and are an ideal option for medium-to-large-scale training and inference workloads. If you are porting an existing GPU workload, P6-B200 instances offer a familiar 8-GPU configuration that minimizes code changes and simplifies migration from current-generation instances. Additionally, while NVIDIA’s AI software stack is optimized for both Arm and x86, if your workloads are specifically built for x86 environments, P6-B200 instances, with their Intel Xeon processors, are the ideal choice. For a programmatic look at the per-instance GPU specifications, see the sketch after this list.
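As a quick decision aid, you can query the per-instance GPU specifications that the EC2 API reports. The following is a minimal boto3 sketch; the instance type string is an assumption for illustration (confirm the exact name in the EC2 documentation for your Region), and UltraServer-level NVLink topology is not captured in per-instance metadata.

```python
import boto3

ec2 = boto3.client("ec2")

# The instance type name below is an assumption for illustration;
# check the EC2 documentation for the exact string in your Region.
response = ec2.describe_instance_types(InstanceTypes=["p6-b200.48xlarge"])
gpu_info = response["InstanceTypes"][0]["GpuInfo"]

gpus = gpu_info["Gpus"][0]
print(f"{gpus['Count']} x {gpus['Manufacturer']} {gpus['Name']} GPUs")
print(f"Total GPU memory: {gpu_info['TotalGpuMemoryInMiB'] / 1024:.0f} GiB")
```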
Innovation built on AWS core strengths
Bringing NVIDIA Blackwell to AWS is not about a single breakthrough; it is about continuous innovation across multiple layers of infrastructure. By building on years of learning and innovation across compute, networking, operations, and managed services, we have delivered the full capabilities of NVIDIA Blackwell with the reliability and performance customers expect from AWS.
Robust instance security and stability
One critical point comes up consistently when customers tell us why they choose to run their GPU workloads on AWS: the security and stability of cloud instances. The specialized hardware, software, and firmware of the AWS Nitro System are designed to enforce restrictions so that nobody, including anyone at AWS, can access your sensitive AI workloads and data. Beyond security, the Nitro System fundamentally changes how infrastructure is maintained and optimized. Because the Nitro System offloads networking, storage, and other I/O functions, firmware updates, bug fixes, and optimizations can be deployed while instances keep running. This ability to update without downtime, which we call live update, is crucial in today’s AI landscape, where disruptions have a major impact on production timelines. Both P6e-GB200 and P6-B200 feature the sixth generation of the Nitro System, but these security and stability benefits are nothing new: the innovative Nitro architecture has been protecting and optimizing Amazon Elastic Compute Cloud (Amazon EC2) workloads since 2017.
Reliable performance at massive scale
With AI infrastructure, the challenge is not only reaching massive scale but also delivering consistent performance and reliability at that scale. We have deployed P6e-GB200 UltraServers in third-generation EC2 UltraClusters, which create a single fabric that can span our largest data centers. Third-generation UltraClusters reduce power consumption by up to 40% and cabling requirements by more than 80%, which not only improves efficiency but also significantly reduces potential points of failure.
To deliver consistent performance at this scale, we use Elastic Fabric Adapter (EFA) with its Scalable Reliable Datagram protocol, which intelligently routes traffic across multiple network paths to maintain smooth operation even during congestion or failures. We have continuously improved EFA performance across four generations: P6e-GB200 and P6-B200 instances use EFAv4, which delivers up to 18% faster collective communication in distributed training compared to P5en instances, which use EFAv3.
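To make the collective communication path concrete, here is a minimal PyTorch sketch that runs an all-reduce with NCCL over EFA. It assumes the EFA drivers and the aws-ofi-nccl plugin are installed, as they are on the AWS Deep Learning AMIs; the environment variables are standard libfabric and NCCL settings used to prefer the EFA provider and to confirm which transport NCCL selects.

```python
import os

import torch
import torch.distributed as dist

def main():
    # Prefer the EFA libfabric provider; the aws-ofi-nccl plugin (assumed
    # to be installed) exposes EFA to NCCL for collective operations.
    os.environ.setdefault("FI_PROVIDER", "efa")
    os.environ.setdefault("FI_EFA_USE_DEVICE_RDMA", "1")  # GPUDirect RDMA over EFA
    os.environ.setdefault("NCCL_DEBUG", "INFO")  # logs the transport NCCL selects

    dist.init_process_group(backend="nccl")  # rank/world size come from torchrun
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # A single all-reduce as a smoke test of the collective path.
    x = torch.ones(4096, device="cuda")
    dist.all_reduce(x)
    if dist.get_rank() == 0:
        print("all-reduce result, first element:", x[0].item())

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Launched with torchrun (for example, torchrun --nproc_per_node=8 allreduce_check.py on a single instance), the NCCL_DEBUG output should show the EFA provider being used.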
Infrastructure efficiency
While P6-B200 instances use proven air-cooled infrastructure, P6e-GB200 UltraServers are liquid cooled, which enables higher compute density in large NVLink domain architectures and delivers higher system performance. P6e-GB200 UltraServers are cooled with novel mechanical cooling solutions that provide configurable liquid-to-chip cooling in both new and existing data centers, so the same facility can support liquid-cooled accelerators alongside air-cooled network and storage infrastructure. This flexible cooling design lets us deliver maximum performance and efficiency at the lowest cost.
Get started with NVIDIA Blackwell on AWS
You can launch P6e-GB200 UltraServers and P6-B200 instances through multiple deployment paths, so you can get started with Blackwell GPUs quickly while keeping the operating model that works best for your organization.
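If you first want to confirm where capacity is offered, a minimal boto3 sketch follows. The Region and the instance type string are illustrative assumptions; note that UltraServer capacity is typically reserved through mechanisms such as EC2 Capacity Blocks rather than discovered this way.

```python
import boto3

# The Region and instance type string are illustrative assumptions.
ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.describe_instance_type_offerings(
    LocationType="availability-zone",
    Filters=[{"Name": "instance-type", "Values": ["p6-b200.48xlarge"]}],
)

for offering in response["InstanceTypeOfferings"]:
    print(f"{offering['InstanceType']} is offered in {offering['Location']}")
```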
Amazon SageMaker HyperPod
If you want to accelerate AI development and spend less time managing infrastructure and cluster operations, that is exactly where Amazon SageMaker HyperPod excels. It provides managed, resilient infrastructure that automatically handles provisioning and management of large GPU clusters. We continue to strengthen SageMaker HyperPod with innovations such as flexible training plans, which help you get predictable training timelines and run training workloads within your budget requirements.
SageMaker HyperPod supports both P6e-GB200 UltraServers and P6-B200 instances, with optimizations that maximize performance by keeping workloads within the same NVLink domain. It also comes with a comprehensive, multi-layered recovery system: SageMaker HyperPod automatically replaces faulty instances with preconfigured spares in the same NVLink domain. Built-in dashboards give you visibility into everything from GPU utilization and memory usage to workload metrics and UltraServer health.
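For illustration, creating a HyperPod cluster is a single API call. The following boto3 sketch uses assumed values throughout: the cluster name, the ML instance type string, the lifecycle script location, and the execution role ARN all need to be adapted to your account.

```python
import boto3

sagemaker = boto3.client("sagemaker")

# All names, ARNs, S3 paths, and the instance type string below are
# illustrative assumptions; replace them with values from your account.
response = sagemaker.create_cluster(
    ClusterName="blackwell-training-cluster",
    InstanceGroups=[
        {
            "InstanceGroupName": "gpu-workers",
            "InstanceType": "ml.p6-b200.48xlarge",
            "InstanceCount": 4,
            "LifeCycleConfig": {
                "SourceS3Uri": "s3://my-bucket/hyperpod-lifecycle/",
                "OnCreate": "on_create.sh",
            },
            "ExecutionRole": "arn:aws:iam::111122223333:role/HyperPodExecutionRole",
        }
    ],
)
print("Cluster ARN:", response["ClusterArn"])
```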
Amazon EKS
If you run large-scale AI workloads and prefer to manage your infrastructure with Kubernetes, Amazon Elastic Kubernetes Service (Amazon EKS) is the control plane of choice. We continue to drive innovation in Amazon EKS with features such as Amazon EKS Hybrid Nodes, which let you manage both on-premises and EC2 GPUs in a single cluster.
Amazon EKS supports both P6e-GB200 UltraServers and P6-B200 instances, with automated provisioning and lifecycle management through managed node groups. For P6e-GB200 UltraServers, nodes are automatically labeled with their UltraServer ID and network topology information, enabling topology-aware scheduling that understands the GB200 NVL72 architecture and places workloads optimally. Node groups can span multiple UltraServers or be dedicated to individual UltraServers, giving you flexibility in organizing your training infrastructure. Amazon EKS also monitors GPU and accelerator errors and relays them to the Kubernetes control plane for optional remediation.
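As a sketch of how those topology labels might be consumed, the following uses the official Kubernetes Python client to group worker nodes by UltraServer. The label key shown is a placeholder assumption; inspect the labels on your own nodes (for example, with kubectl get nodes --show-labels) to find the actual key.

```python
from collections import defaultdict

from kubernetes import client, config

# Hypothetical label key for illustration only; the real UltraServer ID
# label on your nodes may differ.
ULTRASERVER_LABEL = "topology.k8s.aws/ultraserver-id"

config.load_kube_config()  # or config.load_incluster_config() inside a pod
nodes = client.CoreV1Api().list_node().items

groups = defaultdict(list)
for node in nodes:
    ultraserver_id = (node.metadata.labels or {}).get(ULTRASERVER_LABEL, "unlabeled")
    groups[ultraserver_id].append(node.metadata.name)

for ultraserver_id, node_names in sorted(groups.items()):
    print(f"{ultraserver_id}: {len(node_names)} node(s)")
```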
NVIDIA DGX Cloud on AWS
P6e-GB200 UltraServers will also be available through NVIDIA DGX Cloud. DGX Cloud is a unified AI platform, optimized at every layer, that offers multi-node AI training and inference capabilities along with NVIDIA’s complete AI software stack. You benefit from NVIDIA’s latest optimizations, benchmarking recipes, and technical expertise to improve efficiency and performance, with flexible term lengths and comprehensive NVIDIA expert support and services to help accelerate your AI initiatives.
This launch is an important milestone, and it is only the beginning. As AI capabilities evolve rapidly, you need infrastructure built not just for today’s demands but for all the possibilities ahead. With innovation across compute, networking, operations, and managed services, P6e-GB200 UltraServers and P6-B200 instances are ready to enable those possibilities. I can’t wait to see what you build with them.
About the author
David Brown is Vice President of AWS Compute and Machine Learning (ML) Services. In this role, he is responsible for building all AWS compute and ML services, including Amazon EC2, Amazon Container Services, AWS Lambda, Amazon Bedrock, Amazon SageMaker, and more. These services are used by all external AWS customers, and they also support most of Amazon’s internal applications. He also leads newer solutions, such as AWS Outposts, that bring AWS services into customers’ private data centers.
David joined AWS in 2007 as a software development engineer based in Cape Town, South Africa, working on the early development of Amazon EC2. In 2012, he moved to Seattle and continued to work on the broader Amazon EC2 organization. Over the past 11 years, he has taken on larger leadership roles as more AWS compute and ML products have become part of his organization.
Before joining Amazon, David worked as a software developer at a financial industry startup. He holds a degree in Computer Science and Economics from Nelson Mandela University in Port Elizabeth, South Africa.

