This week, Oracle announced a major extension of its cloud-based Autonomous Data Warehouse service that transforms it into an end-to-end offering with a heavy dose of self-service for business users. The new version expands from being just a standalone data warehouse database service to a broader one supporting self-service capability for data ingest, loading, transformation, cataloging, and modeling.
Oracle’s move is very much in line with those of some rivals, such as Microsoft with Azure Synapse Analytics and SAP with Data Warehouse Cloud, which have also broadened their cloud data warehouses into end-to-end services. But it is starkly differentiated from others, such as AWS, Snowflake, and Google, which are taking more portfolio-oriented or best-of-breed partner strategies.
As the name states, Oracle’s Autonomous Data Warehouse is built on the Autonomous Database. Until now, its main claim to fame has been that the Autonomous Database was, and still is, the only database that combines automation and machine learning to make the system fully self-driving. With three years of service under its belt, Oracle has disclosed results: for example, that 85% of all bugs are found automatically and often nipped in the bud before customers are even aware of them, and that a self-driving database is ultimately cheaper and more efficient to operate. With the new release, Oracle is shifting the spotlight from bottom-line TCO to top-line business benefit.
LOOKING UNDER THE HOOD
Oracle has often used self-driving cars as the metaphor for explaining what its autonomous database is. With the new release of Autonomous Data Warehouse, Oracle is changing the subject. They are targeting a broader audience beyond the usual IT constituency: business and data analysts, citizen data scientists, and line of business developers. Unlike IT, they are not as concerned about how cheap the database is to run; they care about how to get their jobs done as analysts working on business problems.
The broadened service includes drag and drop data loading, data cleansing, data transformation, data modeling tools, and AutoML tools for business analysts and citizen data scientists. It also includes a rudimentary data catalog for data discovery. The catalog for now is confined to displaying table metadata and tracking lineage, and will gradually evolve with future releases.
The service does not extend to self-service visualization – that is the domain of Oracle Analytics Cloud or third parties like Tableau, Qlik, or Spotfire – but it includes integrations and also provides some basic visualizations plus some machine learning guided discovery capabilities.
A good example of the focus on ease of use is the data loading capability. In many competing cloud data warehousing services, it takes lines of SQL code to create tables and then load data into them. In the Autonomous Data Warehouse, a web-based visual data loading tool (based on Oracle Data Integrator) lets you accomplish that through drag-and-drop functions that load from external files or database sources; these can reside in Oracle Cloud Infrastructure (OCI), on-premises systems, or other public clouds. Once the file or database table has been selected, the system introspects the data, recommends column names, lets the user preview the data, and, once the user approves, creates the database table automatically.
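To make the introspect, recommend, preview, and create sequence concrete, here is a minimal sketch of that kind of workflow. It is not Oracle's tool or API: it uses Python's standard `csv` and `sqlite3` modules as stand-ins, and the function names (`recommend_column_names`, `infer_type`, `load_csv`) are hypothetical, chosen only for illustration.

```python
import csv
import io
import re
import sqlite3


def recommend_column_names(header):
    """Turn raw header cells into safe, unique SQL identifiers."""
    names, seen = [], set()
    for i, cell in enumerate(header):
        name = re.sub(r"\W+", "_", cell.strip().lower()).strip("_") or f"col_{i + 1}"
        if name[0].isdigit():
            name = f"c_{name}"
        base, n = name, 2
        while name in seen:  # de-duplicate repeated headers
            name = f"{base}_{n}"
            n += 1
        seen.add(name)
        names.append(name)
    return names


def infer_type(values):
    """Introspect sample values and pick INTEGER, REAL, or TEXT."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "TEXT"
    for sql_type, cast in (("INTEGER", int), ("REAL", float)):
        try:
            for v in non_empty:
                cast(v)
            return sql_type
        except ValueError:
            continue
    return "TEXT"


def load_csv(conn, table, text, preview_rows=3):
    """Introspect CSV text, then create and fill a table.

    Assumes a header row and rows of equal length (an illustrative
    simplification). A real tool would pause after the preview and
    wait for the user's approval before creating anything.
    """
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    columns = recommend_column_names(header)
    types = [infer_type([r[i] for r in data]) for i in range(len(columns))]
    preview = data[:preview_rows]
    ddl = ", ".join(f'"{c}" {t}' for c, t in zip(columns, types))
    conn.execute(f'CREATE TABLE "{table}" ({ddl})')
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', data)
    return columns, types, preview
```

The point of the sketch is the division of labor: the user supplies only a file and a table name, while naming, typing, and the `CREATE TABLE` statement are derived from the data itself.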
There is a similar capability for creating what Oracle terms “business models,” or materialized views of data, which provides a guided experience for creating facts, dimensions, and measures. In turn, a variation of data profiling spots underlying patterns, anomalies, and outliers in the data, pointing to places where drilling down could uncover hidden insights. And all results can be visualized in any SQL-based BI visualization tool or service, such as Oracle Analytics Cloud or third-party offerings.
There are also several paths for machine learning. Data scientists can work through built-in Zeppelin notebooks writing SQL or Python code, but for citizen data scientists, there are AutoML capabilities that guide them through selecting, configuring, and deploying ML models. The business user creates an experiment and chooses the data source(s); the AutoML tool then provides a guided experience, identifying features, comparing several algorithms side by side, displaying a leaderboard of results, and recommending which algorithm is the best fit.
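The compare, rank, and recommend part of that flow can be sketched in a few lines. This is a toy illustration under stated assumptions, not Oracle's AutoML implementation: the three candidate algorithms, the fold scheme, and every name here (`CANDIDATES`, `cross_val_accuracy`, `leaderboard`) are hypothetical, and feature identification is left out entirely.

```python
import math
from collections import Counter


def majority_class(train_X, train_y):
    """Baseline: always predict the most common training label."""
    label = Counter(train_y).most_common(1)[0][0]
    return lambda x: label


def nearest_centroid(train_X, train_y):
    """Predict the label whose mean feature vector is closest."""
    sums, counts = {}, Counter(train_y)
    for x, y in zip(train_X, train_y):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
    centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}
    return lambda x: min(centroids, key=lambda y: math.dist(x, centroids[y]))


def one_nearest_neighbor(train_X, train_y):
    """Predict the label of the single closest training point."""
    def predict(x):
        i = min(range(len(train_X)), key=lambda j: math.dist(x, train_X[j]))
        return train_y[i]
    return predict


# Hypothetical candidate pool; a real AutoML service would search far more.
CANDIDATES = {
    "nearest_centroid": nearest_centroid,
    "one_nearest_neighbor": one_nearest_neighbor,
    "majority_class": majority_class,
}


def cross_val_accuracy(fit, X, y, folds=3):
    """Score one algorithm with simple interleaved k-fold cross-validation."""
    correct = 0
    for k in range(folds):
        test_idx = set(range(k, len(X), folds))
        train_X = [x for i, x in enumerate(X) if i not in test_idx]
        train_y = [t for i, t in enumerate(y) if i not in test_idx]
        predict = fit(train_X, train_y)
        correct += sum(predict(X[i]) == y[i] for i in test_idx)
    return correct / len(X)


def leaderboard(X, y, folds=3):
    """Return (name, accuracy) pairs, best first."""
    scores = [(name, cross_val_accuracy(fit, X, y, folds))
              for name, fit in CANDIDATES.items()]
    return sorted(scores, key=lambda s: s[1], reverse=True)
```

Calling `leaderboard(X, y)` on a labeled dataset returns the ranked list, and the first entry plays the role of the recommended algorithm; the user never has to choose or configure a model by hand.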
With RESTful APIs exposed, developers can in turn create apps, using tools such as Oracle’s low-code/no-code APEX tool, which is bundled as part of the Autonomous Data Warehouse.
COMPARING NOTES
As we noted in our Snowflake assessment last week, the cloud data warehousing space is aligning around two poles. On one side are services that have largely remained database-centric, such as Amazon Redshift, Google Cloud BigQuery, and Snowflake. Characterizing these as standalone offerings doesn’t tell the full story: in AWS’s and Google’s cases, there are integrations with other services across their portfolios, such as streaming, ML, visualization, and other databases. Snowflake, as we noted last week, is taking a more partner-centric strategy, but at the same time is also positioning itself as a data marketplace.
ADW sits at the other pole: cloud data warehousing services that can be used standalone but also emphasize end-to-end experiences. The prime examples are Azure Synapse Analytics and SAP Data Warehouse Cloud. Synapse extends the former Azure SQL Data Warehouse to embed the data pipelining and data transformation capabilities of Azure Data Factory, along with dual SQL and Spark execution engines for running against database tables and Azure Data Lake Storage (ADLS) Gen2. While not formally part of Synapse, Power BI for visualization and Azure Machine Learning can be called directly from it. SAP Data Warehouse Cloud, by contrast, extends the experience toward self-service visualization, integrating capabilities of SAP Analytics Cloud. Functionally, Oracle’s offering is a closer apples-to-apples match with Azure Synapse, although at this point ADW provides SQL access to the data lake but does not support Spark operations.
The back story to Oracle’s offering is the Autonomous Database, a self-driving platform that remains unique to Oracle.
Built on Exadata, the Autonomous Database evolved from the automation that Oracle has been building into its database for years. Introduced with Oracle Database 18c, the autonomous database was built on a foundation of automation – we have counted nearly 20 discrete automation features that Oracle has built over the years. They date back as far as the 9i generation well over a decade ago, handling functions such as storage management, memory management, undo tablespaces, storage indexing, and so on. Atop this sits the self-driving part, where machine learning is applied to cluster provisioning, patching, testing, change management, error handling, and other functions. While some of the automation, such as patching, updating, and provisioning, is common to most cloud managed database services, here the “knobs” that DBAs figuratively turn to optimize resources are also handled by the database itself.
Until now, Oracle has promoted the Autonomous Database for its low TCO – the guiding notion being that a machine learning approach is ultimately more effective than humans constantly running the controls. It has made guarantees to underprice Redshift. Oracle has run numerous benchmarks – especially against Amazon Redshift – claiming superior performance and cost.
But with the new release, Oracle is turning its attention to the business by building better-rounded experiences. Business users may not directly care about the advantages of the autonomous database; they simply want a database that will give them decent SLAs for their queries and models, at the right price. They want more self-service. With APEX (which we have already covered in these pages), Oracle is treading into new territory: ease of use and self-service, for which it traditionally was not known. The latest release of Autonomous Data Warehouse marks another step in that direction.
Disclosure: AWS, Microsoft, Oracle, and SAP are dbInsight clients.