Blog

Company Updates & Technology Articles

November 18, 2025

Investing in the People Behind Reliable AI

Scale AI is strengthening its commitment to the contributors behind Outlier, investing in improvements that make work more consistent, transparent, and rewarding.

November 14, 2025

Research

Breaking Out of the Lab: Testing AI in Professional Domains

AI excels on academic tests, but it fails at real professional jobs. That's the stark finding from PRBench, our new benchmark series designed to move AI testing out of the lab and into the real world. We're launching the series with two of the most complex domains: Law and Finance. Using 1,100 tasks sourced from 182 professionals, we tested how today's frontier models handle the nuanced, high-stakes reasoning that defines these fields. While models are great at following instructions, they fail at the expert judgment, auditable reasoning, and deep diligence required for tasks with real economic consequences.

November 13, 2025

Product

Introducing Agentex: Open-Source Infrastructure for Enterprise AI Agents

We are open-sourcing Agentex, the agentic infrastructure layer in Scale GenAI Platform. Our Enterprise team sits down to demo Agentex and share how our enterprise customers use it today. We also dive into our decision to open-source it and our hopes for collaborating with the community.

November 7, 2025

Research

Beyond "Out-of-the-Box": Why Enterprises Need Specialized RL Agents

While general-purpose AI models are powerful, they often fail to deliver on complex, specialized enterprise workflows that use private data. We share results from our real-world work in the insurance and legal industries, highlight how our RL-tuned agents outperformed leading LLMs, and dive into how we achieved these performance gains.

November 5, 2025

Company

Expanding Our Presence with New Offices Around the World

Scale AI is expanding offices in New York, London, Washington, D.C., and St. Louis to support growth, innovation, and reliable AI development worldwide.

October 29, 2025

People

Why I Joined Scale: Building the Applications for Saudi Arabia's AI Future

Talal AlBakr joins Scale AI to build production-ready AI applications that power Saudi Arabia’s Vision 2030.

October 29, 2025

Research

The Remote Labor Index: Measuring the Automation of Work

Can AI actually automate complex, professional jobs? The new Remote Labor Index (RLI) from Scale and the Center for AI Safety (CAIS) provides the first data-driven answer. By testing AI agents against 240 real-world, paid freelance projects, the RLI found that the best-performing agents successfully automated only 2.5% of them. This new benchmark reveals a critical gap between AI's generative skill and the end-to-end reliability required for professional work, showing that the immediate impact is augmentation, not mass automation.

October 28, 2025

Company

Scale AI Partners with Korea’s AI Safety Institute to Advance Global AI Evaluation and Governance

October 27, 2025

Engineering

Beyond Code Exploits: Red Teaming the New AI Attack Surface

Your cybersecurity playbook is obsolete. In the age of AI, the greatest risks aren't traditional code exploits but unpredictable model behaviors, from prompt injections and data leakage to emergent misuse. Drawing on insights from live red teaming exercises with members of Congress, NATO, and the UK Parliament, AI security expert David Campbell explains why we must treat the model itself as the new attack surface. This post unveils an enterprise playbook for proactive AI red teaming, moving beyond static checks to continuously test systems like an adversary. Learn how to map, score, and measure AI risks to get ahead of the threat before an incident occurs.

October 22, 2025

General

Partnering with Lindbergh Schools to Prepare the Next Generation for the Age of AI

As part of our Pledge to America’s Youth, Scale AI is helping bring AI literacy into classrooms across America, starting in St. Louis.