For the first time in a couple of years, we’ll be hopping a plane to AWS re:Invent right after we’ve digested our Thanksgiving turkey. There are plenty of third-party services that promise to babysit your cloud footprint to keep your monthly bills in check. But each year, when we hit the expo floor in Vegas, we’ve wondered when somebody would come up with a solution that trains a machine learning model on the job to perform the job more systematically. One firm is getting ahead of all the ruckus by announcing just that.
CAST AI is a two-year-old startup making the kind of bold claim that service providers typically make: in this case, that it can cut your cloud compute bills in half. In a previous life, the cofounders headed Zenedge, a cloud-based cybersecurity firm eventually acquired by Oracle. Like any born-in-the-cloud company, Zenedge was seeking a better way to contain its monthly cloud computing bills. And so, in their next act, the cofounders trained their sights on this problem.
In the data world, we’ve seen AI being aimed at optimizing queries, tuning database performance, and, in the case of Oracle’s autonomous database, running the whole darn thing. There is plenty of machine learning being employed to predict or prevent outages.
So why not apply machine learning to shaping the cloud compute footprint? It’s a natural problem for machine learning to solve because there is no shortage of log data, and the problem is pretty linear and sharply defined. The key variables are the nature and characteristics of the workload and the underlying compute infrastructure. It’s a problem that outscales human learning because, in the case of AWS (and other cloud providers), there are easily hundreds of compute instance types and related storage permutations.
CAST AI introduced its first service about six months ago, providing real-time analysis of workload snapshots to identify the best instance configuration. For instance, a compute-intensive workload using eight c5a.large instances might run more cheaply using three c5a.2xlarge instances instead. The service restricts itself to cloud-native, containerized workloads that run under Kubernetes (K8s).
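To make the idea of picking a "best instance configuration" concrete, here is a minimal sketch of the underlying arithmetic. It is not CAST AI's method, which the company describes as machine learning over live workload data. The instance names, sizes, and hourly prices are invented for illustration, and the function `cheapest_fleet` is our own hypothetical helper.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    name: str
    vcpus: int
    memory_gib: float
    hourly_usd: float  # invented price for illustration, not a real quote


def cheapest_fleet(catalog, need_vcpus, need_memory_gib):
    """Return (instance_type, count, hourly_cost) for the cheapest
    fleet built from a single instance type that covers the workload."""
    best = None
    for itype in catalog:
        # Enough instances to satisfy both the CPU and the memory need.
        count = max(
            math.ceil(need_vcpus / itype.vcpus),
            math.ceil(need_memory_gib / itype.memory_gib),
        )
        cost = count * itype.hourly_usd
        if best is None or cost < best[2]:
            best = (itype, count, cost)
    return best


# Hypothetical catalog; real providers list hundreds of instance types.
CATALOG = [
    InstanceType("compute-small", vcpus=2, memory_gib=4, hourly_usd=0.08),
    InstanceType("compute-large", vcpus=8, memory_gib=16, hourly_usd=0.27),
    InstanceType("general-large", vcpus=8, memory_gib=32, hourly_usd=0.40),
]

choice, count, cost = cheapest_fleet(CATALOG, need_vcpus=14, need_memory_gib=24)
print(f"{count} x {choice.name} at ${cost:.2f}/hour")
```

Even this toy version shows why the search gets out of hand for humans: every added instance family, size, and pricing model multiplies the number of candidate fleets to compare.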
By keeping its focus on cloud-native containerized workloads orchestrated by K8s, CAST AI takes advantage of the declarative container APIs that describe the characteristics of the workload. And by working only in the K8s environment, it clears the way for the “instant rebalancing” optimization service being announced this week. The service allows clusters to right-size their configuration on the fly, taking advantage of the automation (through K8s orchestration) to perform the autoscaling. This feature takes the place of manual load rebalancing steps that are performed periodically.
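What makes declarative workload descriptions so useful is that the resource needs are written down where a tool can read them. As a rough sketch, the snippet below totals the CPU and memory requests found under `containers[].resources.requests` in a Kubernetes pod spec, represented here as a plain Python dict. The helper names are ours, the sample spec is made up, and the parsers handle only a small subset of Kubernetes quantity suffixes (`m` for CPU; `Ki`, `Mi`, `Gi` for memory).

```python
def parse_cpu(quantity: str) -> float:
    """Convert a CPU quantity such as "500m" or "2" to cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


_MEMORY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def parse_memory(quantity: str) -> int:
    """Convert a memory quantity such as "256Mi" (or plain bytes) to bytes."""
    for suffix, factor in _MEMORY_UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)


def total_requests(pod_spec: dict, replicas: int = 1):
    """Sum the declared requests of all containers, times the replica count."""
    cpu_cores = 0.0
    memory_bytes = 0
    for container in pod_spec["containers"]:
        requests = container.get("resources", {}).get("requests", {})
        cpu_cores += parse_cpu(requests.get("cpu", "0"))
        memory_bytes += parse_memory(requests.get("memory", "0"))
    return cpu_cores * replicas, memory_bytes * replicas


# Made-up pod spec with two containers.
POD_SPEC = {
    "containers": [
        {"name": "api", "resources": {"requests": {"cpu": "500m", "memory": "256Mi"}}},
        {"name": "sidecar", "resources": {"requests": {"cpu": "250m", "memory": "128Mi"}}},
    ]
}

cpu, mem = total_requests(POD_SPEC, replicas=4)
print(f"{cpu} cores, {mem / 2**30} GiB requested")
```

Totals like these are the kind of input a right-sizing service could match against available instance types, though how CAST AI actually weighs them is not something the announcement spells out.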
Cost optimization of the cloud is an obvious target for applying machine learning; there is no shortage of cloud customers seeking to get their bills under control. This has traditionally required managers to monitor CloudWatch or implement rules-based controls that abruptly throttle down workloads. When we reach the expo floor of re:Invent, we expect that CAST AI will have a lot more company.