Programming language Python is a big hit for machine learning. But now it needs to change

Open-source programming language Python has become one of the few languages that won’t disappear anytime soon. It’s the top or one of the top two languages in most notable language popularity indexes, and even looks set to beat Java these days.

But 30-year-old Python does have its weaknesses, not necessarily for the data-science and machine-learning communities built around Python extensions like NumPy and SciPy, but as a general programming language.

Python is the top language according to IEEE Spectrum’s electrical engineering audience, yet you can’t run Python in a browser and you can’t easily run it on a smartphone. Plus no one builds games in Python these days.

To build browser applications, developers tend to go for JavaScript, Microsoft’s type-safety take on it, TypeScript, Google-made Go, or even old but trusty PHP. On mobile, why would application developers use Python when there’s Java, Java-compatible Kotlin, Apple’s Swift, or Google’s Dart?

Python doesn’t even support compilation to the WebAssembly runtime, a web application standard supported by Mozilla, Microsoft, Google, Apple, Intel, Fastly, Red Hat and others.

These are just some of the limitations raised by Armin Ronacher, a developer with a long history in Python who 10 years ago created the popular Flask Python microframework to solve problems he had when writing web applications in Python.

Austria-based Ronacher is the director of engineering at US startup Sentry – an open-source project and tech company used by engineering and product teams at GitHub, Atlassian, Reddit and others to monitor user app crashes due to glitches on the frontend, backend or in the mobile app itself.

Much of Sentry is written in Python, putting it in the same class of Python-heavy tech firms like Instagram, Netflix, and Dropbox, from which Python’s creator Guido van Rossum announced his retirement a year ago. He stepped down as Python’s ‘Benevolent Dictator For Life’ in 2018.

While Ronacher contributes little to Flask today – because new Python features for data science don’t interest him – it’s become popular for deploying machine-learning models thanks to an abundance of tutorials and university courses that teach it. Flask was the most popular web framework, ahead of Django, in IDE-maker JetBrains’s 2018 Python developer survey.

Python at Sentry is mostly legacy code but it’s still used on the backend for integrating Sentry with other systems, such as when a crash report in Sentry needs to be passed to Atlassian’s Jira issue-tracking system.

“The sync between Sentry and Jira is written in Python,” Ronacher tells ZDNet. “A lot of the backend business logic that’s connected to Sentry’s frontend is written in Python. The event processing and all that complexity is slowly moving into Rust for performance reasons.”

Ronacher is a major fan of Rust, a programming language created at Mozilla Research five years ago. Microsoft is eyeing it as a memory-safe replacement for parts of its C and C++ codebases in Windows and Office.

Despite Python’s success as a language, Ronacher reckons it’s at risk of losing its appeal as a general-purpose programming language and being relegated to a specific domain, such as Wolfram’s Mathematica, which has also found a niche in data science and machine learning.

“Your expectation is not going to be that I’ll develop a desktop application in Mathematica,” said Ronacher.

“At the moment, it feels like the total fields for Python are super applicable and expanding, but we can already see that there will always be smartphones – or something that replaces smartphones – and there will be browser applications. Python cannot serve these two things right now and comes with a lot of restrictions,” he says.

Peter Wang, co-founder and CEO of Anaconda, maker of the popular Anaconda Python distribution for data science, cringes at Python’s limitations for building desktop and mobile applications.

“It’s an embarrassing admission, but it’s incredibly awkward to use Python to build and distribute any applications that have actual graphical user interfaces,” he tells ZDNet.

“On desktops, Python is never the first-class language of the operating system, and it must resort to third-party frameworks like Qt or wxPython.”

Packaging and redistribution of Python desktop applications are also really difficult, he says.

“On the web, the frontend is always JavaScript or a derivative. And on mobile, Python is barely used at all. It’s kind of a miracle that Python is even on the radar, much less ranked in the top three languages. In an ironic way, it’s somewhat a testament to the power and popularity of Python for backend and data-science workloads.”

The Python community realizes that app distribution is its Achilles heel, but Ronacher doesn’t see a way forward without fracturing the Python community.

The last time Python developers tried to introduce major changes was Python 3, released in 2008. Yet by 2014, van Rossum was almost begging developers to “move on to Python 3”. At that stage, companies like Instagram with huge Python codebases still hadn’t made the transition to Python 3.

“Python boxed itself into a corner where it’s very hard to innovate without breaking people’s code. Last time it was attempted to make some bigger changes to Python, which was Python 3, a lot of people’s code broke. It took 10 years for the ecosystem to heal up again,” says Ronacher.

Why can’t Python run in the browser?

According to Ronacher, Python’s limitations come down to design, the risk of breaking Python code used in production systems, and alienating sizable parts of its large user base.

Python’s interpreter, along with its C-language application binary interface (ABI) and application programming interface (API), has hindered innovation in the browser, says Ronacher.

While JavaScript developers can embed their code in a browser and have every tab running its own JavaScript engine, Python can’t because of the current ABI, which is exposed to Python extensions like NumPy. This prevents it from running two independent copies of its interpreter in the same process space, a constraint tied to what’s known as the ‘global interpreter lock’.

“If you have two interpreters, they would share the same object,” explains Ronacher. “So, if ‘tab one’ would modify an object, ‘tab two’ would also observe that modification.”
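The effect Ronacher describes can be loosely illustrated with a short sketch. It is only an analogy: it uses two threads inside one ordinary interpreter rather than two embedded interpreters, and the names `shared_settings`, `tab_one`, `tab_two` and `run_tabs` are hypothetical, chosen to mirror his wording.

```python
import threading

# Hypothetical stand-in for an object that lives in process-wide state.
shared_settings = {"theme": "light"}


def tab_one():
    # "Tab one" modifies the shared object in place.
    shared_settings["theme"] = "dark"


def tab_two(observed):
    # "Tab two" reads the very same object and records what it sees.
    observed.append(shared_settings["theme"])


def run_tabs():
    observed = []
    first = threading.Thread(target=tab_one)
    first.start()
    first.join()  # make sure the write has finished before reading

    second = threading.Thread(target=tab_two, args=(observed,))
    second.start()
    second.join()
    return observed


if __name__ == "__main__":
    print(run_tabs())  # ['dark']
```

Because both "tabs" run in the same process and reach the same dictionary, the second one sees the first one's change, which is exactly the kind of leakage a browser cannot allow between tabs.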

Though he admits running Python in a browser is unpopular anyway, that limitation has other ramifications for its future and use as a general programming language.

“It would have been nice if we could run multiple processes and have one pinned to one CPU core and do message passing to get around the global interpreter lock. But basically the exposure of this C API prevents a whole range of these things.

“To implement this, you would have to take away or change the C API, which would break Python’s biggest ecosystem: NumPy, SciPy, and the entire machine-learning environment.”
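The general idea of isolated workers that only exchange messages can be approximated today by launching separate interpreter processes, at the cost of the process overhead Ronacher would like to avoid. The sketch below is a minimal illustration under that assumption, not anything Sentry or the Python core team uses: `WORKER_SOURCE` and `run_worker` are hypothetical names, messages are plain JSON lines sent over standard input, and no CPU pinning is attempted.

```python
import json
import subprocess
import sys

# Source code for a hypothetical worker. Each worker runs in its own
# interpreter process, so its `state` dict is private to that process.
WORKER_SOURCE = """
import json, sys
state = {"counter": 0}
for line in sys.stdin:
    message = json.loads(line)
    state["counter"] += message["add"]
print(json.dumps(state))
"""


def run_worker(increments):
    # Encode each message as one JSON line and pass them via stdin.
    payload = "".join(json.dumps({"add": n}) + "\n" for n in increments)
    result = subprocess.run(
        [sys.executable, "-c", WORKER_SOURCE],
        input=payload,
        capture_output=True,
        text=True,
        check=True,
    )
    # The worker replies with a single JSON message on stdout.
    return json.loads(result.stdout)


if __name__ == "__main__":
    print(run_worker([1, 2, 3]))  # {'counter': 6}
    print(run_worker([10]))       # {'counter': 10}
```

Each call starts a fresh interpreter with its own lock and its own objects, so nothing one worker does is visible to another except through the messages it sends back.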

Ronacher says Python has been “stuck like this for many years”. Time and again, efforts to “kill the global interpreter lock” fail because removing it would cause trouble for extensions like NumPy.

It’s less a core technical challenge than one of making enough people care enough about an innovation for one group that could be painful for another group in the ecosystem.

“If what you’re doing is going to break NumPy, and that thing is not going to benefit the NumPy community, it’s unclear if you will get the backing to execute on that,” he points out.

Instead, the Python community has moved towards machine learning and data science, which is less concerned with Python’s performance problems because they can be overcome by moving code to a GPU or a cluster where many processes are running independently.

“But it’s meant that Python is not used in computer games anymore, not being compiled to the browser, and also is less commonly used in web applications,” he says.

‘New Python’ is possible but might split the community

Wang and Ronacher agree that a successor to Python could emerge – one that isn’t tied to existing design choices that limit its use in fields outside data science and backend systems.

“If Python is to be long-term competitive with other application development languages outside data analysis, it must have a coherent vision of what it stands for and that vision must be differentiated,” says Anaconda’s Wang.

“Programmers are always looking for new shiny objects, so languages rarely get rewarded for doing many things well. Of course, Python’s value as a lingua franca for backend system automation and scripting is a tough thing to displace; but it shouldn’t be content to just be relegated to that,” he adds.

“Many of Python’s core concepts and objects are intrinsically tied to a ‘single-node PC’ computation model. Truly next-gen languages must move beyond this paradigm.”

Ronacher is optimistic that Python can be reinvented but also points to the fate of Perl, a hit for programming on the web in the early 2000s but now well outside of Tiobe’s top 10. Perl 6 is now known as Raku due to fractures within the community over its future and legacy.

Using Perl as a yardstick shows why bending Python’s future will take a brave and committed character.

“I’m not saying you can’t fix Python. I think you can make a new version of Python that fixes a lot of those things, but someone would have to come in and say, ‘It’s OK for us to plot a path towards a future version of the language that’s incompatible’,” says Ronacher.

“Some efforts where people tried to do that ended up separating communities. Perl 6 is no longer even called Perl because the community never fully subscribed to it and the cost of going to the new version was writing code from scratch.”

The other risk of big breaking changes is that former devotees of a language simply switch to a modern alternative.

“The moment you say, ‘Could we do this really cool version of Python that fixes all those problems?’, is that it doesn’t look like Python anymore and there’s no clear migration path,” says Ronacher.

“There’s always a risk that people go to another language entirely. I don’t know what that path will be but if Python wants to have a future as a language that’s generally applicable, then it probably needs to start exploring some radical improvements.”

What of new languages like Rust and Julia?

Ronacher says he felt the same “positive emotions” using Rust as when he started using Python. However, Rust has youth on its side, making it extremely well liked yet not widely used.

“You can contribute, it’s exciting to work on, and every six weeks you get an update of the compiler and you read the changelog. It makes you happy to upgrade and it has a really good ecosystem around it.”

Ronacher reckons Rust’s mission statement is critical to its popularity, despite low adoption, because it helps define what parts of the language apply to some developers and not others.

“I think the community self-segregates into people who could and couldn’t use Rust. Over the years, Python never had that,” he says.

Ronacher points to Python support for Windows. For example, until Windows 10 1903, the May 2019 Update, Windows was the only mainstream operating system that didn’t come with a Python interpreter.

“These were always independent community efforts and it never felt like it was part of the language,” he says.

It’s a different story with Rust due to its lack of baggage and its mission statement, according to Ronacher.

“When people started porting Rust to work on Windows, that really became a core effort in the language,” he says.

For example, there’s a team of Rust contributors who are responsible for making sure it works on Windows, which includes support from Rust sponsor, Microsoft.

But in Python, Ronacher says, porting the language to Windows “was always an afterthought”.

“It was like, if someone is willing to do it, we’ll do it, but if not, it was also not a concern. And you feel it. In Rust, they decided that good packaging is necessary for the language, so even though there was a separate team doing it, it became everybody’s responsibility.”

Python also has emerging rivals in data science, such as Julia, a language born out of MIT’s CSAIL artificial-intelligence lab. But it’s yet to gain mainstream popularity.

Wang reckons Julia is appealing and its makers pitch it as being as easy to use as Python, but with the best qualities of R for statistics and Matlab for algebra. However, he sees Python and R’s popularity as an obstacle, since many developers already know those languages and ecosystems.

“Julia as a language has lots of nice features. But languages are a viral technology, in that there is a network effect,” says Wang.

“Until a certain percentage of the user base is actively using it and pushing it onto their friends and teams, then adoption is very slow. One challenge Julia has is that for a lot of teams, Python and R already exist and while they are far from perfect, they may be ‘good enough’.”

By ZDNet