I hadn’t realized just how important graphics processing units (GPUs) have become in the cloud over the last few years. Today, almost all cloud services offer GPUs for demanding users.
When I think about services in the cloud, I think about storage and file sharing (infrastructure as a service); cloud-based applications such as Office 365 and Gmail (software as a service); or building applications on the cloud (platform as a service). But GPUs? On the cloud? Supercomputers? Sure, many of the fastest supercomputers now rely on GPUs, but clouds?
Well, after looking into it, it turns out there are many useful, advanced applications that run better on clouds enriched with GPUs. The major public clouds, including Google Compute Engine, Microsoft Azure, IBM Cloud, and Amazon Web Services (AWS), all offer GPU-enabled instances. Almost all of them use NVIDIA Tesla GPUs.
NVIDIA, while the market leader, doesn’t totally dominate the market. AMD Radeon Sky Series graphics cards are used to accelerate graphics in cloud-based games, and AMD FirePro S7150x2 Server GPUs are used for graphics applications.
It turns out, although I didn’t know it, that NVIDIA had set its sights beyond video gaming and supercomputing to the cloud back in 2012. Its goal then was to use its GPUs’ fast streaming and virtualization capabilities to “accelerate cloud computing.”
Fast forward to 2017, and NVIDIA has announced its own cloud: the NVIDIA GPU Cloud. The offering focuses on NVIDIA-optimized deep learning software and high-performance computing applications. The cloud relies not on NVIDIA’s own hardware (except for specialized access to the NVIDIA Saturn V supercomputer) but on AWS- and Azure-based hardware. That makes NVIDIA’s GPU Cloud more of a channel and demand aggregator for its partners’ cloud services.
Along the way, cloud GPUs have created their own niches. The first important use was remote desktops, such as MyGDaaS, a remote desktop for high-end graphics users, and Citrix XenDesktop for more general-purpose end-user computing.
That proved to be just the tip of the iceberg. Now GPU-enabled clouds are used for a multitude of other purposes. These include machine learning training and inference, geophysical data processing, simulation, seismic analysis, molecular modeling, financial analysis, and other high-performance computing (HPC) use cases.
For example, Aon Benfield Securities, a provider of risk management, insurance, and reinsurance brokerage services, uses GPUs to speed up its risk-analysis scenarios with its in-house PathWise financial modeling program. The company found it could do its work faster and more affordably using GPUs on the AWS cloud. Specifically, AWS’s pay-as-you-go pricing let the business spin up multiple GPUs quickly and inexpensively, so it moved its infrastructure to AWS and deprecated its co-located data center. “We realized that by using AWS, we could have a whole turnkey environment up and running in no time,” says Peter Phillips, Aon Benfield Securities Managing Director.
Google Compute Engine, on the other hand, uses GPU-enabled instances to power the Google Cloud Machine Learning Engine (GCMLE). It uses the same TensorFlow framework that powers many Google products, from Google Photos to Google Cloud Speech. TensorFlow is a widely used open-source machine intelligence software library. GCMLE can take any TensorFlow model and perform large-scale training on a managed cluster. It can also manage the trained models for large-scale online and batch predictions. For preprocessing, GCMLE can pull data from Google Cloud Dataflow, which lets you access data from Google Cloud Storage, Google BigQuery, and other data sources. These models can be deployed anywhere on the Google Cloud.
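The point of “any TensorFlow model” is that the code you train at scale on GCMLE’s managed cluster is the same code you’d run on your laptop. Here’s a minimal sketch using TensorFlow’s Keras API; the toy data and one-layer model are illustrative assumptions of mine, not anything Google ships:

```python
# A minimal TensorFlow model of the kind GCMLE can train at scale.
# The data and architecture here are toy examples for illustration.
import numpy as np
import tensorflow as tf

# Toy training data: learn y = 2x from a handful of points.
x = np.array([[0.0], [1.0], [2.0], [3.0]], dtype=np.float32)
y = 2.0 * x

# A single linear layer is the smallest possible "model".
model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.05),
              loss="mse")

# Locally this trains on the CPU; submitted to GCMLE, the same code
# can run on a managed cluster of GPU-enabled instances.
model.fit(x, y, epochs=200, verbose=0)

pred = model.predict(np.array([[4.0]], dtype=np.float32), verbose=0)
print(pred[0][0])  # should be close to 8.0
```

In practice you’d package code like this and submit it as a training job, swapping the toy arrays for data read from Cloud Storage or BigQuery.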
TensorFlow can, in turn, be sped up by pairing it with NVIDIA’s TensorRT, a high-performance, deep-learning inference optimizer and runtime that delivers low-latency, high-throughput inference for deep learning applications. It can rapidly optimize, validate, and deploy trained neural networks for inference to hyperscale data centers and embedded or automotive product platforms.
In short, anywhere you need HPC, you can now turn to your cloud of choice and look for the right combination of processing speed and capabilities.
Fair warning: It won’t be cheap. You’ll need to optimize your programs to take advantage of GPU parallel processing, and GPU cloud time is expensive. For example, adding a single NVIDIA Tesla P40 GPU to six vCPUs and 128GB of RAM will likely cost you over $1,500 a month to start.
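To see where a monthly figure like that comes from, here’s a back-of-the-envelope sketch. The $2.10 hourly rate below is a hypothetical assumption of mine for illustration, not a quoted price from any provider:

```python
# Back-of-the-envelope cloud GPU cost arithmetic. The hourly rate is a
# hypothetical assumption, not a real provider's price.
HOURS_PER_MONTH = 730  # average hours in a month (8,760 / 12)

def monthly_cost(hourly_rate, hours=HOURS_PER_MONTH):
    """Cost of running one GPU-enabled instance for the given hours."""
    return hourly_rate * hours

# A GPU instance at a hypothetical $2.10/hour, running 24/7:
always_on = monthly_cost(2.10)
print(f"${always_on:,.2f}/month")  # → $1,533.00/month

# Pay-as-you-go cuts that sharply: the same instance run only
# 40 hours a week (40 * 52 weeks / 12 months):
part_time = monthly_cost(2.10, hours=40 * 52 / 12)
print(f"${part_time:,.2f}/month")  # → $364.00/month
```

This is also the flip side of the Aon Benfield Securities story above: pay-as-you-go pricing only pays off if your workload is bursty rather than always-on.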
On the other hand, if you can make proper use of GPU processing, it’s well worth the money.