On Friday, Google debuted a new product developed with OpenMined that allows any Python developer to process data with differential privacy.
The two have been working on the project for a year, and Google said the freely available privacy infrastructure will help millions in “the global developer community – researchers, governments, nonprofits, businesses and more – build and launch new applications for differential privacy, which can provide useful insights and services without revealing any information about individuals.”
Google began its open source differential privacy efforts in 2019 and saw significant interest in them, prompting the company to launch the new open source differential privacy product in Python. Its work with OpenMined included efforts to train third-party experts to educate anyone who wants to learn how to leverage differential privacy tech.
Miguel Guevara, a product manager in Google's privacy and data protection office, told ZDNet that Google reached out to OpenMined last year to propose building this Python product, with the goal of making it the most usable end-to-end differential privacy solution freely available. OpenMined immediately jumped on board, Guevara added.
“It’s been a truly amazing experience to work collectively with OpenMined towards building a more private Internet. The energy that their developers had through this journey over the past year demonstrated the appetite there is for expanding access to these privacy-enhancing technologies that we believe will play a critical role in the future of the web for every user,” Guevara said.
“Beyond the joint work our engineers did for the design and implementation of the library, we’re also thrilled that OpenMined now offers trained experts to provide guidance and resources for any developer looking to implement differential privacy in their projects.”
Google initially launched an open source version of its foundational differential privacy library in C++, Java and Go in 2019. Developers immediately took to the project, wanting to use the library for their own applications.
Google noted that startups like Arkhn have used it to help hospitals share data, and that Australian researchers have used it in a variety of scientific studies.
“With this new Python library, we’ve already had organizations begin experimenting with new use cases, such as showing a site’s most visited webpages on a per country basis in an aggregate and anonymized way. The library is unique as it can be used with Spark and Beam frameworks, two of the leading engines for large data processing, yielding more flexibility in its usage and implementation,” Guevara explained.
“We are also releasing a new differential privacy tool that allows practitioners to visualize and better tune the parameters used to produce differentially private information. Finally, we are also publishing a paper sharing the techniques that we use to efficiently scale differential privacy to datasets of a petabyte or more.”
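To give a rough sense of what a per-country "most visited pages" aggregate and its tunable parameters involve, here is a minimal sketch in plain Python. It is not the API of the Google and OpenMined library. The function names `laplace_noise` and `private_page_counts`, the `epsilon` and `max_pairs_per_user` parameters, and the sample data are all hypothetical, and the sketch assumes the list of (country, page) keys is already public. It uses the textbook Laplace mechanism, which may differ from what the library does internally.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_page_counts(visits, public_keys, epsilon, max_pairs_per_user, rng):
    """Return noisy visitor counts for each (country, page) key in public_keys.

    visits: iterable of (user_id, country, page) tuples (hypothetical schema).
    """
    # Contribution bounding: each user may add to at most
    # max_pairs_per_user distinct keys, and at most once per key.
    allowed = set(public_keys)
    kept = {}
    for user_id, country, page in visits:
        key = (country, page)
        pairs = kept.setdefault(user_id, set())
        if key in allowed and key not in pairs and len(pairs) < max_pairs_per_user:
            pairs.add(key)

    counts = Counter(key for pairs in kept.values() for key in pairs)

    # Adding or removing one user changes the counts by at most
    # max_pairs_per_user in total (L1 sensitivity), so Laplace noise with
    # scale = sensitivity / epsilon gives epsilon-differential privacy.
    scale = max_pairs_per_user / epsilon
    return {key: counts[key] + laplace_noise(scale, rng) for key in public_keys}


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample_visits = [
        ("u1", "FR", "/home"),
        ("u1", "FR", "/pricing"),
        ("u2", "FR", "/home"),
        ("u3", "AU", "/home"),
    ]
    keys = [("FR", "/home"), ("FR", "/pricing"), ("AU", "/home")]
    noisy = private_page_counts(sample_visits, keys, epsilon=1.0,
                                max_pairs_per_user=2, rng=random.Random())
    for key, value in noisy.items():
        print(key, round(value, 2))
```

A smaller `epsilon` or a larger `max_pairs_per_user` means more noise per count, which is the kind of trade-off a parameter-tuning tool would help practitioners explore. A production system would also need to handle key selection privately and guard against floating-point weaknesses in the noise sampling, neither of which this sketch attempts.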
Guevara urged researchers and developers to use the tool and provide feedback, noting that Google would continue “investing in democratizing access to critical privacy enhancing technologies.”