Blog

Company Updates & Technology Articles

March 17, 2025

Scale AI’s Proposal for the U.S. AI Action Plan

The next four years will be critical to the future of AI leadership around the world, and the case for bold action has never been clearer. More than a year ago, the United States was leading the world in the development of AI systems, but today that is no longer the case. Chinese AI advancements, most notably the launch of DeepSeek, have shown that China has closed the gap, and the race is now nearly tied. It is not enough for the United States to match China’s intensity on AI; we must exceed it or, simply put, we lose. President Trump rightly called DeepSeek’s release “a wake up call,” and now the U.S. needs to heed that call and determine how best to respond in order to win.

March 5, 2025

Government

Introducing Thunderforge: AI for American Defense

Scale is proud to have been awarded a prime contract by the Defense Innovation Unit (DIU) for Thunderforge, the DoD’s flagship program leveraging AI for military planning and wargaming. Thunderforge represents our commitment to advancing U.S. military capabilities. Following its initial deployment, Thunderforge will expand throughout combatant commands, leveraging Scale AI's agentic applications and GenAI evaluation expertise.

March 3, 2025

General

Scale AI and Inception Announce Strategic Partnership to Drive AI Innovation

Scale AI, a leader in building frontier AI solutions, and Inception, a G42 company developing AI-native products for enterprises, have announced a strategic partnership aimed at accelerating global AI adoption across the public and private sectors. The partnership agreement was signed by Ashish Koshy, COO of Inception, and Trevor Thompson, Global Managing Director at Scale AI.

February 27, 2025

Government

Scale AI & Center for Strategic and International Studies (CSIS) Introduce Foreign Policy Decision Benchmark

Scale AI, in collaboration with the Center for Strategic and International Studies (CSIS), is proud to introduce the Critical Foreign Policy Decision (CFPD) Benchmark—a pioneering effort to evaluate large language models (LLMs) on national security and foreign policy decision-making tendencies.

February 23, 2025

General

MCIT & Scale AI: Paving the Way for Qatar’s Digital Future

The Ministry of Communications and Information Technology (MCIT) and Scale AI, the leader in frontier AI solutions, are announcing a strategic, long-term partnership to drive Qatar’s digital transformation.

February 19, 2025

Product

Machine Perception for Human Protection: Creating Vision Algorithms to Augment Perimeter Security

Contested basing environments require scalable solutions for perimeter security. Scale AI, the Defense Innovation Unit, and the U.S. Air Force are demonstrating the value of computer vision for force protection challenges globally.

February 11, 2025

Research

Jailbreaking to Jailbreak: A Novel Approach to Safety Testing

Scale researchers have discovered a groundbreaking method for AI safety testing called J2 (Jailbreaking to Jailbreak), where language models are taught to systematically test their own and other models' safety measures. This hybrid approach combines human-like strategic reasoning with automated scalability, achieving success rates of over 90% in vulnerability testing, nearly matching professional human red-teaming effectiveness. While highlighting significant advances in automated security testing, these findings also reveal important challenges for the future of AI safety.

February 11, 2025

Research

Advancing Safe and Reliable AI: Scale's Research in Post-Training, Reasoning, and Evaluation

Scale AI leads groundbreaking research to build safer, more capable AI systems through innovative approaches in post-training optimization, agent development, and evaluation frameworks. Its comprehensive work spans from improving model performance and reliability to developing robust safety measures, all while maintaining a commitment to open collaboration and industry-wide advancement. Through the Safety, Evaluations, and Alignment Lab (SEAL) and various research initiatives, Scale AI is shaping the future of responsible AI development.

February 10, 2025

Company

Scale AI Partnering with the U.S. AI Safety Institute to Evaluate AI Models

Scale’s AISI-approved AI model evaluations are setting a new standard for pre-deployment testing. By offering voluntary, efficient, and third-party validated assessments, we are empowering AI developers to create more reliable models—without the complexities that typically slow down the process.

January 24, 2025

Research

When RLHF Meets Text2SQL

Text2SQL systems promise to democratize access to enterprise data but often fail to handle the complexity of real-world database queries, even if they perform well on test datasets. We found that Reinforcement Learning from Human Feedback (RLHF) is a viable approach for active learning from incorrect production queries to improve Text2SQL accuracy.