Alteryx, the public company best known for self-service data preparation and pipelines, has long offered substantial AI/machine learning (ML) capabilities as part of its Designer platform. Today, at its Virtual Global Inspire event, the company is announcing new AI/ML capabilities that should resonate with business users and power users alike.
CDAO 411
ZDNet was briefed on the new products by Alteryx’s Chief Data and Analytics Officer (CDAO), Alan Jacobson, who joined the company two years ago from his post as director of global analytics at Ford Motor Company. Jacobson personally demoed the Intelligence Suite features and the Alteryx Machine Learning product, covering the latter in great detail.
Alteryx’s Intelligence Suite brings Machine Learning and Text Mining tabs into Designer, adding Natural Language Processing (NLP) and text mining; computer vision capabilities for image-based data and optical character recognition (OCR); as well as topic modeling and sentiment analysis. Jacobson described this set of features as the “Pythonic” equivalent of Alteryx’s longstanding predictive capabilities based on the R programming language. Intelligence Suite also adds some light automated machine learning (AutoML) capabilities.
Automated modeling, standalone version
But if AutoML is what you’re after, you’ll want to take a look at the new Alteryx Machine Learning product, a standalone, cloud- (and browser-) based offering. The product offers full API-driven operation and, perhaps more important for many of Alteryx’s customers, it offers an excellent UI that helps non-data scientists go from dataset to optimized ML model through a series of steps.
The first step, “Prep Data,” offers data ingest, basic data profiling, and “data health” (data quality) support. The second step, “Auto Insight,” features Automated Insight Generation to support what Alteryx calls its insights-first model development process. Jacobson explained that Alteryx takes this approach because drawing insights from the model itself, rather than deploying it for scoring and predictive analytics, is often the goal in data science. In that spirit, users can see visualizations for correlations (including the interactive chord diagram shown in the figure at the top of this post), outliers and the prediction distribution for user-selected target variables.
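To make the correlation and outlier checks concrete, here is a minimal Python sketch of the kind of profiling such a step might perform. It is not Alteryx’s implementation: the function names, the sample columns, and the choice of Pearson correlation and the 1.5 × IQR outlier rule are all assumptions made for illustration.

```python
from math import sqrt
from statistics import quantiles

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric columns."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v for v in values if v < low or v > high]

# Hypothetical columns, invented purely for this example.
ad_spend = [1, 2, 3, 4]
revenue = [2, 4, 6, 8]
order_sizes = [10, 12, 11, 13, 12, 11, 14, 95]

r = pearson(ad_spend, revenue)       # perfectly linear relationship
flagged = iqr_outliers(order_sizes)  # values far outside the typical range
print(r, flagged)
```

A real insight-generation step would run checks like these across every column pair and render the results visually; the sketch only shows the arithmetic underneath.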
Knobs and dials
Advanced settings in the Auto Insight step let users get ready for generating ML models by setting the ranking metric, a time limit for the AutoML test runs, maximum iterations within them, cross validation settings and whether model “ensembling” should be enabled. Users can also perform automated feature engineering using Alteryx’s “Deep Feature Synthesis” and built-in feature store derived from its 2019 acquisition of MIT spinoff Feature Labs.
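The idea behind automated feature engineering across related tables can be sketched in a few lines of plain Python. This is a simplified illustration, not the Featuretools API or Alteryx’s implementation: the tables, column names, and the `synthesize_features` helper are hypothetical, and the sketch applies only one level of aggregation rather than stacking feature-building operations.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical parent and child tables, invented for this example.
customers = [{"customer_id": 1}, {"customer_id": 2}, {"customer_id": 3}]
transactions = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 35.0},
    {"customer_id": 2, "amount": 5.0},
]

# Aggregation functions applied to each parent's child rows.
AGGREGATIONS = {"COUNT": len, "SUM": sum, "MEAN": mean, "MAX": max}

def synthesize_features(parents, children, child_name, key, value):
    """Attach aggregate features of child rows to each parent row."""
    grouped = defaultdict(list)
    for row in children:
        grouped[row[key]].append(row[value])

    enriched = []
    for parent in parents:
        values = grouped.get(parent[key], [])
        features = dict(parent)
        for name, fn in AGGREGATIONS.items():
            column = f"{name}({child_name}.{value})"
            if values:
                features[column] = fn(values)
            else:
                # A parent with no child rows has a count of 0 and no other stats.
                features[column] = 0 if name == "COUNT" else None
        enriched.append(features)
    return enriched

features = synthesize_features(
    customers, transactions, "transactions", "customer_id", "amount"
)
for row in features:
    print(row)
```

Each customer row gains columns such as `SUM(transactions.amount)`, which a model can then use as inputs without anyone hand-writing the joins and group-bys.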
Next comes the “Auto Model” step, which kicks off the AutoML probing, with progress visualized in a leaderboard that ranks models’ performance relative to a baseline model. At the end of the process, the model at the top of the leaderboard becomes the recommended model. In the final “Review Model” step, users can see the general stats for, and processing pipeline used to generate, the recommended model and can view visualizations of its performance. The Insights tab within the Auto Model step is quite valuable, as it summarizes feature importance, partial dependence and Shapley-based prediction explainability, something uncommon in AutoML platforms. Alteryx Machine Learning also provides a handy feature that exports its visualizations to a PowerPoint slide deck.
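For readers unfamiliar with Shapley-based explainability, the following Python sketch computes exact Shapley values for a toy model. It illustrates the general technique only and says nothing about how Alteryx computes its explanations: the `toy_model` function, the feature names, and the all-zero baseline are assumptions for the example. Exact computation enumerates every feature ordering, which is feasible only for a handful of features; practical tools rely on approximations.

```python
from itertools import permutations

def shapley_values(predict, instance, baseline):
    """Exact Shapley values: average marginal contribution over all orderings."""
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)       # start from the reference input
        previous = predict(current)
        for name in order:
            current[name] = instance[name]   # reveal one feature at a time
            value = predict(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}

def toy_model(row):
    """Hypothetical scoring function with an interaction term."""
    return 2 * row["income"] + 3 * row["tenure"] + row["income"] * row["tenure"]

instance = {"income": 2, "tenure": 1}
baseline = {"income": 0, "tenure": 0}

explanation = shapley_values(toy_model, instance, baseline)
print(explanation)
```

The per-feature values always sum to the difference between the model’s prediction for the instance and its prediction for the baseline, which is what makes them useful for explaining an individual prediction.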
Beyond the core product
Back in Alteryx Designer, users can create pipelines that score new data against models created in Alteryx Machine Learning. In addition, the model can be uploaded to Alteryx Promote, for managed deployment to development, test and production environments. A subset of Alteryx Machine Learning’s functionality is available in the open source libraries Woodwork, Compose, Featuretools and EvalML, all of which are available on GitHub. Alteryx Designer continues to offer connectors to other AutoML platforms too, including DataRobot, H2O Driverless AI, and Microsoft’s AutoML.
Alteryx has other announcements today too, including a new 2021.2 version of Designer, as well as a new cloud-based version called, logically enough, Alteryx Designer Cloud. There’s also a Unified Platform API and software development kit (SDK) and a revamped user experience and features within the Alteryx Community portal. The company is also announcing Alteryx Ventures, a new $50M fund that will invest in companies that complement Alteryx’s analytics and data science products, as well as its analytics process automation (APA) platform.
Ducks in a row?
Alteryx’s acquisitions of Yhat, ClearStory Data (whose former CEO, Sharmila Mulligan, is Alteryx’s chief strategy and marketing officer) and Feature Labs have seemingly coalesced to power the various products and capabilities the company is announcing today. With the company’s stock down almost 57 percent from its July 2020 high, it has needed to up its game. Today’s announcements certainly point it in the right direction.