Oracle Cloud Infrastructure: Compute and High-Performance Computing Roadmap Update

Today, I wanted to share some important news on Oracle’s compute infrastructure offerings. Listening to our customers about their requirements and experiences, we found that the promise of high-performance computing (HPC) and performance in the public cloud still comes with too many trade-offs, caveats, and gaps.

So, we have updates both to our current capabilities and to our Compute services roadmap. These announcements are designed to meet the high expectations of our infrastructure customers, including those running HPC solutions, and to deliver the best of both worlds: the absolute performance they expect from on-premises systems and the pay-per-use pricing, flexibility, and scalability of cloud, all at a leading price point.

Doubling Down on Future High-Performance Computing Investments with Intel

Our existing portfolio of HPC capabilities is best-in-class and more comparable to on-premises purpose-built clusters. Services from other cloud providers have underserved this market. Over the last year, we’ve invested in this arena to provide customers with the benefits of both on-premises and cloud with none of the downsides, such as overbuilding to peak capacity.

And customers have responded. Our recent announcement with Nissan Motors in Japan is a great example, where Nissan moved to Oracle Cloud Infrastructure for its Computational Fluid Dynamics (CFD), crash simulations, and 3D visualization workloads. Altair Engineering announced today that they’re betting on Oracle Cloud Infrastructure as their preferred provider for internal use and for customer-facing SaaS products.

“Our goal is to help customers solve complex problems faster, easier, and smarter than ever before,” said Sam Mahalingam, CTO of Altair. “We’ve standardized on Oracle Cloud Infrastructure, because it delivers the best price-performance, enabling our customers to easily design innovative sustainable products.”

As part of our HPC platform roadmap, we’re collaborating closely with Intel, and we’re excited to announce that early next year, we’re offering our next-generation HPC Compute instances based on Intel’s Ice Lake processors. Workloads such as crash simulations, CFD, and Electronic Design Automation (EDA) are expected to see more than 30 percent better performance than on our existing X7 HPC generation of instances. Customers can also deploy these instances as bare metal, with NVMe storage for local checkpointing, a balanced core-to-memory ratio, and the ability to build clusters of these instances on our remote direct memory access (RDMA)-enabled cluster network. Finally, to feed all this computing, you can also build large distributed storage clusters running high-performance file systems, and we’ve built a highly scalable file system to serve this purpose. The recent IO500 results show that Oracle has the fastest cloud-based BeeGFS cluster in the world.

A graphic showing a quote from and photo of Bob Swan, CEO of Intel.

Put simply, Oracle Cloud Infrastructure has had, and continues to have, the best cloud HPC platform.

NVIDIA A100 General Availability

Continuing our investments in HPC, our existing set of NVIDIA GPU offerings is best-in-class within the public cloud space. In fact, Oracle Cloud Infrastructure’s current NVIDIA GPU infrastructure is comparable to on-premises systems such as NVIDIA’s DGX, making it a great alternative for customers who need the absolute best performance for workloads such as deep learning training or hardware-accelerated visualization.

We announced earlier this year that Oracle is working with NVIDIA on the next generation of GPU instances in Oracle Cloud Infrastructure. Today we’re announcing general availability of those instances starting September 30 in our US, EMEA, and JAPAC regions, at an industry-low on-demand price of $3.05 per GPU per hour. This bare metal instance joins our cluster network architecture, allowing customers to scale up to 512 GPUs in a single cluster for large-scale artificial intelligence training or HPC workloads.

A graphic showing a quote from and a picture of Jensen Huang, Founder and CEO of NVIDIA.

These instances provide up to 1.6 Tb per second of bandwidth per bare metal node, each of which houses eight A100 GPUs, all fully interconnected with NVLink. These instances also provide you with over 25 TB of local NVMe storage and 2 TB of RAM for large-scale graph workloads or accelerated databases. Finally, we also allow for cutting-edge capabilities such as GPUDirect through RDMA, not yet available from any other cloud provider. Furthermore, you can use all the existing toolsets, preconfigured Data Science virtual machines, and Marketplace images, along with support for NVIDIA GPU Cloud (NGC). We’ve been working with several customers in preview over the last few months, including DeepZen, IDenTV, and Altair.
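To make the sizing and pricing figures above concrete, here is a minimal back-of-the-envelope sketch in Python. It uses only the numbers quoted in this post (eight GPUs per node, up to 512 GPUs per cluster, $3.05 per GPU per hour). The function name `cluster_estimate` is hypothetical, and the rule that a bare metal node is billed for all eight of its GPUs is an assumption for illustration, not a statement of Oracle's billing policy.

```python
import math

GPUS_PER_NODE = 8            # from the post: eight A100 GPUs per bare metal node
MAX_GPUS_PER_CLUSTER = 512   # from the post: up to 512 GPUs in a single cluster
PRICE_PER_GPU_HOUR = 3.05    # from the post: on-demand price in USD per GPU per hour


def cluster_estimate(gpus_needed: int, hours: float) -> dict:
    """Round a GPU request up to whole nodes and estimate the on-demand cost.

    Assumption (not from the post): a bare metal node is billed for all
    eight of its GPUs, even if the job uses fewer.
    """
    if not 1 <= gpus_needed <= MAX_GPUS_PER_CLUSTER:
        raise ValueError(f"gpus_needed must be between 1 and {MAX_GPUS_PER_CLUSTER}")
    if hours < 0:
        raise ValueError("hours must be non-negative")

    nodes = math.ceil(gpus_needed / GPUS_PER_NODE)
    billed_gpus = nodes * GPUS_PER_NODE
    cost_usd = round(billed_gpus * PRICE_PER_GPU_HOUR * hours, 2)
    return {"nodes": nodes, "billed_gpus": billed_gpus, "cost_usd": cost_usd}


if __name__ == "__main__":
    # A full-size cluster for one hour, and a small job for two hours.
    print(cluster_estimate(512, 1))
    print(cluster_estimate(10, 2))
```

Under these assumptions, a full 512-GPU cluster works out to 64 nodes and roughly $1,561.60 per hour of on-demand use.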

“Replicating the human voice with AI is highly dependent on processing power, and Oracle Cloud Infrastructure delivers that with the new NVIDIA A100 GPU, which provided an immediate performance increase of 37% enabling us to scale our business.” Kerem Sozugecer, Co-Founder and CTO of DeepZen Limited.

“The amount of streaming video data being created is growing exponentially. To deliver real-time analytics and insights demands the highest level of graphics processing units. Oracle Cloud Infrastructure delivers that with the new NVIDIA A100 GPU where we expect an immediate performance gain of 35%.” Amro Shihadah, Cofounder and COO of IDenTV.

One thing is clear. This is the cloud-based GPU instance, with on-premises performance, that you’ve been waiting for. Find out more in NVIDIA and Oracle Cloud Infrastructure NVIDIA GPU Cloud Platform.

Partnership with Rescale

We’re also announcing a partnership with Rescale, one of the leading HPC providers today. Availability of Rescale on Oracle Cloud Infrastructure makes it even easier for you to onboard and get your jobs running in under a day. Rescale has more than 450 applications preinstalled on Oracle’s high-performance computing instances, and you can bring your own licenses as well.

“Rescale takes the complexity out of high-performance computing in the cloud by providing an HPC platform that can be deployed in minutes,” said Terry Danzer, COO at Rescale. “Oracle Cloud Infrastructure, with its industry first bare metal HPC shapes and low latency RDMA networking, offers the ideal platform for high-performance computing.”

Partnering with Ampere for Oracle’s First Arm Offering

Arm-based compute instances provide a great alternative for developers to diversify and take advantage of hardware innovation. To achieve this goal, we partnered with Ampere to bring an Arm offering to Oracle Cloud. These Arm-based instances provide developers with the best price-performance compared to any x86 Compute instance on a per-core basis, with an order of magnitude of cost savings.

A graphic showing a quote from and picture of Renee James, Chairman and CEO of Ampere.

Early next year, we’re going to offer customers the ability to launch bare metal or virtual machine instances with up to 160 cores and a 3.3 GHz turbo frequency on a wide variety of Linux distros, including Oracle Linux and Ubuntu. These instances are also part of our flexible Compute family, giving you the ability to pick and choose varying levels of cores or memory based on your workload characteristics and requirements.

Apart from pure Compute instances, you can also use these instances as part of our Always Free Tier, allowing you to develop and test. Look out for more information on this offering later in the year.

Next Generation of Flexible, Elastic Computing

We recently announced our E3 instances, which were the first generation of our flexible infrastructure. Using this infrastructure, customers can design their instances with a custom number of CPU cores and amount of memory to suit their workload and application needs, as opposed to being locked into fixed sizes with wasted resources and costs. Built on AMD’s Rome generation of CPUs, this service allows you to design custom shapes. For example, if you want two cores but 12 GB of RAM, you can now design that exact shape to match the workload. As the least expensive per-core Compute offering in our portfolio of instances, it’s a great choice for general-purpose workloads.
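The short Python sketch below illustrates the "wasted resources" point with the two-core, 12 GB example from this post. It compares a custom request against a fixed-size catalog and reports what would be paid for but left unused. The `FIXED_SHAPES` catalog and the names `Shape`, `smallest_fixed_fit`, and `overprovision` are hypothetical, chosen only for illustration; they are not actual Oracle Cloud Infrastructure shapes or APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shape:
    cores: int
    memory_gb: int


# Hypothetical fixed-size catalog, sorted from smallest to largest.
# For illustration only; this is not an actual OCI shape list.
FIXED_SHAPES = [Shape(1, 8), Shape(2, 16), Shape(4, 32), Shape(8, 64)]


def smallest_fixed_fit(need: Shape) -> Shape:
    """Return the smallest catalog shape that covers the requested cores and memory."""
    for shape in FIXED_SHAPES:
        if shape.cores >= need.cores and shape.memory_gb >= need.memory_gb:
            return shape
    raise ValueError("no fixed shape is large enough for this request")


def overprovision(need: Shape) -> Shape:
    """Cores and memory that go unused when the request is forced into a fixed shape."""
    fit = smallest_fixed_fit(need)
    return Shape(fit.cores - need.cores, fit.memory_gb - need.memory_gb)


if __name__ == "__main__":
    need = Shape(cores=2, memory_gb=12)  # the example from the post
    print("closest fixed shape:", smallest_fixed_fit(need))
    print("unused in fixed shape:", overprovision(need))
    # With a flexible shape, the instance is sized to `need` itself,
    # so the unused amount is zero by construction.
```

With this made-up catalog, the two-core, 12 GB request lands on a two-core, 16 GB fixed shape and leaves 4 GB unused, whereas a flexible shape is sized to the request exactly.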

To get even further toward our goal of truly responsive, performant, and cost-effective Compute services, early next year we’re going to offer customers E4 instances that build upon this platform with AMD’s next generation of CPUs. They allow for greater performance at the same economics, making this service ideal for general-purpose workloads or custom applications.

A graphic showing a quote from and picture of Lisa Su, President and CEO of AMD

Get Started Today!

If you haven’t already, I recommend watching Clay’s keynote, along with special appearances from Intel CEO Bob Swan, NVIDIA Founder and CEO Jensen Huang, Ampere CEO Renee James, and AMD CEO Dr. Lisa Su.

You also get to hear from customers and partners about how Oracle Cloud Infrastructure helped them transform, partner, and provide the best infrastructure available today.

You can get started today with some of our latest offerings by signing up for our Oracle Cloud Free Tier, and you can get access to previews of future offerings by getting in touch with us directly.

Resources:

What is high-performance computing (HPC)?

High Performance Computing (HPC) on Oracle Cloud Infrastructure
