
Pinterest details the AI that powers its content moderation



Pinterest this morning peeled back the curtain on the AI and machine learning technologies it’s using to combat harmful content on its platform. The company leverages algorithms to automatically detect adult content, hateful activities, medical misinformation, drugs, graphic violence, and more before it’s reported, and it says policy-violating reports per impression have declined by 52% since fall 2019, when the technologies were first introduced. Reports of self-harm content have decreased by 80% since April 2019.

One of the challenges in building multi-category machine learning models for content safety is the scarcity of labeled data, which forces engineers to use simpler models that can’t be extended to multi-modal inputs. Pinterest addresses this with a system trained on millions of human-reviewed Pins, drawn from both user reports and proactive model-based sampling by its Trust and Safety operations team, which assigns categories and takes action on violating content. The company also employs a Pin model trained using a mathematical, model-friendly representation of Pins based on their keywords and images; these representations are aggregated to generate scores that indicate which Pinterest boards might be in violation.
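To make the approach concrete, here is a minimal sketch of what a multi-category classifier over multi-modal inputs could look like. The architecture, embedding dimensions, and category list are illustrative assumptions, not Pinterest’s published model.

```python
# Minimal sketch of a multi-category, multi-modal Pin classifier.
# All names and dimensions are illustrative, not Pinterest's actual model.
import torch
import torch.nn as nn

CATEGORIES = ["adult", "hateful", "medical_misinfo", "drugs", "graphic_violence"]

class PinSafetyModel(nn.Module):
    def __init__(self, image_dim=256, text_dim=128, hidden=256):
        super().__init__()
        # Fuse precomputed image and text embeddings into one vector.
        self.fusion = nn.Sequential(
            nn.Linear(image_dim + text_dim, hidden),
            nn.ReLU(),
        )
        # One score per policy category (multi-label, not multi-class).
        self.head = nn.Linear(hidden, len(CATEGORIES))

    def forward(self, image_emb, text_emb):
        fused = self.fusion(torch.cat([image_emb, text_emb], dim=-1))
        return torch.sigmoid(self.head(fused))  # per-category violation scores

model = PinSafetyModel()
scores = model(torch.randn(1, 256), torch.randn(1, 128))  # one Pin's embeddings
```

Training such a model end to end is exactly what scarce labels make difficult, which is why the human-reviewed Pins from reports and proactive sampling matter.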

“We’ve made improvements to the information derived by optical character recognition on images and have deployed an online, near-real-time version of our system. Also new is the scoring of boards and not just Pins,” Vishwakarma Singh, head of Pinterest’s trust and safety machine learning team, told VentureBeat via email. “An impactful multi-category [model] using multi-modal inputs — embeddings and text — for content safety is a valuable insight for decision makers … We use a combination of offline and online models to get both performance and speed, providing a system design that’s a nice learning for others and generally applicable.”

Pinterest content moderation

In production, Pinterest employs a family of models to proactively detect policy-violating Pins. When enforcing policies across Pins, the platform groups together Pins with similar images and identifies them by a unique hash called “image-signature.” Models generate scores for each image-signature, and based on these scores, the same content moderation decision is applied to all Pins with the same image-signature.
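The grouping step can be illustrated in a few lines. The hash function, scoring stub, and threshold below are stand-in assumptions; Pinterest has not published its actual signature scheme.

```python
# Illustrative sketch of image-signature grouping; hashlib stands in
# for whatever near-duplicate hash Pinterest actually uses.
import hashlib
from collections import defaultdict
from dataclasses import dataclass

BLOCK_THRESHOLD = 0.9  # assumed policy threshold

@dataclass
class Pin:
    pin_id: int
    image_bytes: bytes

def image_signature(image_bytes: bytes) -> str:
    return hashlib.sha256(image_bytes).hexdigest()

def model_score(signature: str) -> float:
    # Placeholder for the model that scores each unique signature.
    return 0.95

pins = [Pin(1, b"cat"), Pin(2, b"cat"), Pin(3, b"dog")]
by_signature = defaultdict(list)
for pin in pins:
    by_signature[image_signature(pin.image_bytes)].append(pin)

# Score each signature once; apply the same decision to every Pin in the group.
for sig, group in by_signature.items():
    if model_score(sig) > BLOCK_THRESHOLD:
        for pin in group:
            print(f"blocking Pin {pin.pin_id} (signature {sig[:8]})")
```

Scoring per signature rather than per Pin means a single decision automatically covers every near-duplicate copy of an image.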

For example, one of Pinterest’s models identifies Pins that it believes violate the platform’s policy on health misinformation. Trained using labels from Pinterest, the model internally finds keywords and text associated with misinformation and blocks Pins containing that language, while simultaneously identifying visual representations associated with medical misinformation. It accounts for factors like the image and URL, and blocks offending images across Pinterest search, the home feed, and related Pins, according to Singh.
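As a rough illustration of how text, image, and URL signals might feed one decision, consider the sketch below. The keyword list, weights, and threshold are entirely hypothetical, and the real model learns these associations from labels rather than from hand-written rules.

```python
# Hypothetical combination of text, image, and URL signals for a
# misinformation decision; weights and keywords are made up for illustration.
MISINFO_KEYWORDS = {"miracle cure", "vaccine detox"}  # hypothetical
FLAGGED_DOMAINS = {"example-misinfo.net"}             # hypothetical

def misinfo_score(pin_text: str, image_model_score: float, url_domain: str) -> float:
    text_signal = any(kw in pin_text.lower() for kw in MISINFO_KEYWORDS)
    url_signal = url_domain in FLAGGED_DOMAINS
    # Simple weighted sum standing in for a learned model.
    return 0.5 * image_model_score + 0.3 * text_signal + 0.2 * url_signal

score = misinfo_score("This miracle cure works!", 0.7, "example-misinfo.net")
if score > 0.6:  # assumed threshold
    print("block across search, home feed, and related Pins")
```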

Since users usually save thematically related Pins together on boards around topics like recipes, Pinterest deployed a machine learning model to produce scores for boards and enforce board-level moderation. A Pin model trained using only embeddings — i.e., representations — generates content safety scores for each Pinterest board. An embedding for each board is constructed by aggregating the embeddings of the most recent Pins saved to it. When fed into the Pin model, these embeddings produce a content safety score for each board, allowing Pinterest to identify policy-violating boards without training a separate model for boards.
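That reuse of the Pin model can be sketched briefly. Mean pooling, the embedding size, and the scoring function here are assumptions; the article says only that recent Pin embeddings are aggregated and fed to the Pin model.

```python
# Sketch of board scoring by aggregating recent Pin embeddings
# (pooling choice and dimensions are assumptions, not Pinterest's spec).
import numpy as np

def board_embedding(recent_pin_embeddings: np.ndarray) -> np.ndarray:
    # recent_pin_embeddings has shape (num_recent_pins, dim).
    return recent_pin_embeddings.mean(axis=0)

def pin_model_score(embedding: np.ndarray) -> float:
    # Placeholder for the embedding-only Pin model reused for boards.
    w = np.ones_like(embedding) / embedding.size
    return float(1 / (1 + np.exp(-embedding @ w)))  # sigmoid score

recent = np.random.randn(20, 128)  # embeddings of the last 20 Pins on a board
board_score = pin_model_score(board_embedding(recent))
```

The appeal of this design is that no board-specific model has to be trained; the Pin model’s scoring function is applied unchanged to an aggregated embedding.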

“These technologies, along with an algorithm that rewards positive content, and policy and product updates such as blocking anti-vaccination content, prohibiting culturally insensitive ads, prohibiting political ads, and launching compassionate search for mental wellness, are the foundation for making Pinterest an inspiring place online,” Singh said. “Our work has demonstrated the impact graph convolutional methods can have in production recommender systems, as well as in other graph representation learning problems at large scale, including knowledge graph reasoning and graph clustering.”


By VentureBeat
