Although the hype around machine learning and AI is giving way to a more realistic view, enterprises and institutions still have work to do to help the data science discipline achieve maturity and deliver sustained business value. These findings and others are covered in a report recently released by Anaconda, Inc., provider of the world’s most popular data science platform.
Titled “The State of Data Science 2020: Moving from Hype Toward Maturity,” the report explores industry trends across a variety of topics, ranging from preferred tools and languages to open-source security practices and concerns around bias in machine learning (ML). A total of 2,360 people from more than 100 countries participated in the online survey between February 12 and April 20, 2020.
Demand for data science talent has been strong in recent years, and even with the recent economic downturn, organizations that have seen positive ROI from data science are likely to maintain their investments in this area. Anaconda’s report provides insight into the practices, concerns, and trends that are shaping this key industry, from education to enterprise deployment. Respondents spanned age groups, industries, and job functions, allowing for granular analysis of responses across multiple variables.
Among the report’s findings:
- While enterprises have embraced open-source tools in a wide variety of functions, their security practices don’t always keep pace. A concerning 30% of respondents who have knowledge of their company’s security practices stated that their organization does not have any mechanism in place to secure open-source tools used for data science and machine learning.
- Getting data science outputs into production, where they can impact a business, isn’t always straightforward. Respondents reported that on average 45% of their time is spent getting data ready (loading and cleansing) before they can use it to develop models and visualizations. Once their models are ready for production, they contend with numerous environments, dependencies, and even skills gaps before models see the light of day.
- Perhaps as a result of these production struggles, fewer than half (48%) of respondents feel they can demonstrate the impact of data science on business outcomes.
- Concerns about bias and privacy are on the minds of data professionals, with nearly half of respondents citing one of these two themes as the “biggest problem to tackle in the AI/ML arena today.” Yet, concerningly, only 15% of respondents said their team is actively addressing the issue of bias, and only 15% of universities include courses in ethics.
- Although COVID-19 has brought uncertainty to the employment landscape, enterprises should still pay attention to job satisfaction among data scientists. One-third of data professionals reported they plan to find a new role within the next 12 months, and another 24% expect to make a move within the next three years.
“Data science has the ability to be transformational for businesses, but our 2020 survey shows that both organizations and professionals in the space are still in the process of maturing,” said Peter Wang, CEO and co-founder of Anaconda. “From broadening the data science educational curriculum to being more intentional with open-source security, there are clear learnings here for the industry at large to implement in order to improve. We’ve seen positive progress in many of these areas, but there is still work to be done.”