Oracle has announced the upcoming availability of new Oracle Cloud Infrastructure (OCI) Compute instances powered by NVIDIA H100 Tensor Core GPUs, NVIDIA L40S GPUs, and Ampere AmpereOne CPUs. The new OCI Compute instances are designed to make running a variety of workloads in the cloud, from AI model training, fine-tuning, and inferencing to cloud-native applications and video transcoding, more accessible to organizations while providing improved price-performance.
The upcoming OCI Compute instances based on next-generation NVIDIA GPUs will include:
OCI Compute Bare Metal Instances Powered by NVIDIA H100 GPUs: Can help customers reduce the time it takes to train large AI models, such as those that power natural language processing and recommendation systems. Organizations using NVIDIA H100 Tensor Core GPUs have seen as much as a 30x improvement in performance for AI inference use cases and 4x better performance training AI models compared to using the previous generation of NVIDIA A100 Tensor Core GPUs. For customers running intense computing workloads such as AI model training, OCI Supercluster enables them to connect tens of thousands of NVIDIA H100 GPUs over a high-performance, ultra-low latency cluster network. These instances are planned to be generally available in the Oracle Cloud London Region and Oracle Cloud Chicago Region later this year, with others expected to follow.
OCI Compute Bare Metal Instances Powered by NVIDIA L40S GPUs: Will provide customers an alternative for workloads such as AI inferencing or training small to medium AI models. These instances have been tested to deliver up to a 20% improvement in performance for generative AI workloads and up to a 70% improvement in fine-tuning models over the previous generation of NVIDIA A100 GPUs. The instances are planned to be available within the next year.
The upcoming OCI Compute Instances based on Ampere Computing CPUs will include:
OCI Compute A2 Instances Powered by Ampere AmpereOne™ CPUs: Are expected to deliver leading price-performance and the highest available processor core count in the industry (320 cores in the bare metal shape and up to 156 cores in the flexible VM shape) to power a variety of general-purpose cloud workloads, including running web servers, transcoding video, and servicing CPU-based AI inference requests. The high core count available in these instances can support increased levels of performance, virtual machine density, and scaling to help customers more efficiently manage their computing workloads while reducing data center footprint and power consumption. These instances can also run flexible shapes for virtual machines, giving customers granular options for the amount of processing power and memory to help maximize resource utilization and minimize costs while providing a simple and predictable pricing model. These instances are planned to become available next year.
“OCI was one of the first cloud providers to offer bare metal instances natively, which is a key part of our ability to make high-performance computing more accessible to organizations everywhere. By providing access to processors from NVIDIA and Ampere Computing in OCI, we are giving our customers the performance, efficiency, and flexibility they need in their cloud infrastructure to power anything from general-purpose workloads all the way up to high-performance AI projects,” said Donald Lu, senior vice president, software development, Oracle Cloud Infrastructure. “Oracle is early to the market with cloud compute offerings designed specifically to support the development and use of AI. We are well-positioned to lead the cloud computing industry as the market grows by supporting the increasing number of AI providers and users.”
“The collaboration between NVIDIA and Oracle is helping democratize access to cutting-edge GPUs on Oracle Cloud Infrastructure,” said Ian Buck, vice president of Hyperscale and High Performance Computing, NVIDIA. “NVIDIA H100 and L40S GPUs on OCI will enable AI innovation with unprecedented performance, scalability, and security for customers across all verticals.”
“Oracle was the first cloud services provider to globally deploy compute instances based on Ampere processors,” said Jeff Wittich, chief product officer, Ampere Computing. “This new generation of Ampere A2-based instances from Oracle Cloud Infrastructure will provide up to an industry-leading 320 cores per instance for even better performance, workload density, and scale.”
“The upcoming OCI Compute instances, powered by NVIDIA GPUs, will give us the power we need to train and serve the next generation of industry-leading Cohere enterprise AI models,” said Martin Kon, president and COO, Cohere. “Oracle’s cloud provides reliable and powerful computing resources to build high-performance models that can be embedded into any application and used in a wide range of industries.”
“Training large language models on the MosaicML Platform requires thousands of NVIDIA GPUs running on OCI’s bare metal compute instances, which leverage high-performance storage and ultrafast cluster networking,” said Naveen Rao, vice president of Generative AI, Databricks. “We chose OCI for its superior price-performance for AI training and inferencing at scale and look forward to using the OCI Compute instances with NVIDIA H100 and L40S GPUs.”
“Uber is revolutionizing the way people and things move around cities. As part of a multicloud architecture, we leverage Oracle Cloud Infrastructure for critical workloads because of its superior security, performance, and flexibility,” said Kamran Zargahi, senior director of tech strategy, Uber. “We use Standard and Dense I/O instances based on AMD processors, and plan to use OCI Compute with NVIDIA GPUs in the future.”