CalypsoAI Launches Security Index; Provides First Comprehensive Safety Ranking of Major GenAI models

By AP News - Feb 26, 2025, 11:11 AM ET

Last Updated - Feb 26, 2025, 11:11 AM EST

AI Security Leader's Inference Red-Team Product Compromises World's Largest Foundation Models

NEW YORK and DUBLIN, Feb. 26, 2025 /PRNewswire/ -- CalypsoAI, the leader in securing GenAI for enterprises, launched the CalypsoAI Security Leaderboard, the world's first index of all the major AI models based on their security performance.

The CalypsoAI Security Leaderboard ranks all the major models on their ability to withstand advanced security attacks and presents a risk-to-performance (RTP) ratio as well as a valuable cost of security (CoS) metric. CalypsoAI compiled the Leaderboard after stress-testing AI models with its new Inference Red-Team solution, which combines Agentic Warfare with automated attacks.

"Our Inference Red-Team product has successfully broken all the world-class GenAI models that exist today," said Donnchadh Casey, CEO of CalypsoAI. "Many organizations are adopting AI without understanding the risks to their business and clients; moving forward, the CalypsoAI Security Leaderboard provides a benchmark for business and technology leaders to integrate AI safely and at scale."

CalypsoAI Inference Red-Team delivers automated, scalable assessments that run real-world attacks to proactively identify vulnerabilities and create a CalypsoAI Security Index-scored (CASI) AI inventory. By combining Agentic Warfare with a continuously updated library of extensive signature attacks and operational testing, Inference Red-Team empowers organizations to enhance governance, ensure compliance, and maintain proactive, secure and resilient AI systems.

"GenAI presents unparalleled opportunities for business transformation, yet security, governance and compliance risks remain significant barriers to adoption," said Amit Levinstein, VP Security Architecture & CISO at CYE. "CalypsoAI's breakthrough red teaming solution is a quantum leap in AI security and provides the hard evidence executives need, and confidence they desire, to deploy AI applications safely."

With more than 70 years of combined security and AI experience across its engineering team, CalypsoAI recognized the need for clear, actionable reports that identify vulnerabilities, so that security teams can strengthen their AI systems and stay ahead of the latest threats.

"CalypsoAI Inference Red-Team introduces Agentic Warfare as the latest method of finding security gaps in GenAI models," said James White, President and CTO of CalypsoAI. "It signals the end of inefficient and inconsistent manual red teaming, which is still used by even the largest AI model developers, and with the CalypsoAI Security Leaderboard, closes a significant gap in publicly-available information on model security."

The AI threat landscape is constantly evolving, and most companies aren't equipped to test their AI systems the way attackers will. With its Agentic Warfare capability, CalypsoAI Inference Red-Team uses AI-powered adversaries that engage dynamically with a target, exposing hidden weaknesses that static tests miss. This agentic approach lets teams select the safest models with confidence, both before deploying applications and as use cases evolve.

"CalypsoAI Red Team is a game-changer for businesses leading or even experimenting in AI," said Jay Choi, CEO of Typeform. "In many initiatives, we saw great opportunities to innovate by embedding Generative AI; but we constantly struggled with risk of security. It was the unknown unknowns that were the most challenging. Calypso Inference Red-Teaming really addresses that risk so companies integrating AI-driven technology no longer need to choose between security and innovation."

The CalypsoAI Security Index (CASI)

CASI is a metric developed to answer the complex question of how secure a given model is. A higher CASI score indicates a more secure model or application. Many studies rely on Attack Success Rate (ASR), but this traditional metric often oversimplifies reality because it treats all attacks as equal. Under ASR, for example, an attack that bypasses a bicycle lock counts the same as one that compromises nuclear launch codes. Similarly, in AI, a small, unsecured model might be compromised with a simple request for sensitive information, while a larger model might require sophisticated techniques such as Agentic Warfare to break its alignment.
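
To make the contrast concrete, the sketch below computes a plain ASR alongside a severity-weighted variant. The weighting scheme, the AttackResult type, the severity scale and the sample data are all illustrative assumptions. The release does not disclose how CASI is calculated, so this is not the CASI formula.

```python
from dataclasses import dataclass


@dataclass
class AttackResult:
    """One attack attempt against a model (hypothetical record shape)."""
    succeeded: bool
    severity: float  # assumed scale: 1.0 = trivial impact, 10.0 = critical impact


def attack_success_rate(results):
    """Plain ASR: fraction of attacks that succeeded, all counted equally."""
    return sum(r.succeeded for r in results) / len(results)


def severity_weighted_rate(results):
    """Share of total severity accounted for by successful attacks.

    Illustrative assumption only; not the CASI formula.
    """
    total = sum(r.severity for r in results)
    breached = sum(r.severity for r in results if r.succeeded)
    return breached / total


# Two hypothetical models: each fails exactly one of two attacks.
model_a = [AttackResult(True, 1.0), AttackResult(False, 10.0)]  # only the minor attack succeeds
model_b = [AttackResult(False, 1.0), AttackResult(True, 10.0)]  # only the critical attack succeeds

print(attack_success_rate(model_a), attack_success_rate(model_b))
print(severity_weighted_rate(model_a), severity_weighted_rate(model_b))
```

Both hypothetical models have the same ASR, yet the weighted rate separates them, which is the distinction the bicycle-lock analogy draws.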

CalypsoAI will continue to uncover new vulnerabilities and work with model providers to responsibly disclose and resolve them. As a result, model CASI scores are updated quarterly.

Model Provider	Model Name	CASI	Avg. Performance	RTP	CoS
Anthropic	Claude 3.5 Sonnet	96.25	84.50 %	0.93	18.70
Microsoft	Phi4-14B	94.25	75.90 %	0.68	0.66
Anthropic	Claude 3.5 Haiku	93.45	68.28 %	0.57	5.14
OpenAI	GPT-4o	75.06	80.50 %	0.52	16.65
Meta	Llama 3.3 70b	74.79	74.50 %	0.69	1.85
DeepSeek	DeepSeek-R1-Distill-Llama-70B	74.42	72.67 %	0.74	1.24
DeepSeek	DeepSeek-R1	74.26	86.53 %	0.58	4.24
OpenAI	GPT-4o-mini	73.08	71.78 %	0.73	1.03
Google	Gemini 1.5 Flash	73.06	66.70 %	0.92	0.51
Google	Gemini 1.5 Pro	72.85	74.10 %	0.63	5.58
OpenAI	GPT-3.5 Turbo	72.76	59.20 %	0.82	2.75
Alibaba Cloud	Qwen QwQ-32B-preview	67.77	68.87 %	0.65	2.14
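
Because the columns measure different things, the ordering of models changes depending on which column is used. The short sketch below re-sorts a few rows of the table by CASI and by average performance. The tuple layout and variable names are illustrative, and the values are copied from the table above. RTP and CoS are left out because the release does not define how they are computed.

```python
# (provider, model, CASI, avg. performance in %) copied from the leaderboard table
rows = [
    ("Anthropic", "Claude 3.5 Sonnet", 96.25, 84.50),
    ("Microsoft", "Phi4-14B", 94.25, 75.90),
    ("OpenAI", "GPT-4o", 75.06, 80.50),
    ("DeepSeek", "DeepSeek-R1", 74.26, 86.53),
    ("Google", "Gemini 1.5 Pro", 72.85, 74.10),
]

# Sort descending on each column of interest.
by_casi = sorted(rows, key=lambda r: r[2], reverse=True)
by_performance = sorted(rows, key=lambda r: r[3], reverse=True)

print("Highest CASI:", by_casi[0][1])
print("Highest avg. performance:", by_performance[0][1])
```

In this subset the model with the highest average performance is not the one with the highest CASI score.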

About CalypsoAI 
The first AI solution to secure the model inference layer, CalypsoAI protects AI systems against data leakage, misuse and misbehavior, empowering enterprises to confidently and safely deploy GenAI at scale. CalypsoAI provides the only full-lifecycle platform to secure AI models and applications at the inference layer, deploying agentic warfare to protect businesses from evolving adversaries.

Trusted by global enterprises like Palantir and SGK, CalypsoAI's leading team of experts are doing the hard miles to ensure a mature security approach keeps pace with today's incredible AI innovation trajectory. CalypsoAI was founded in 2018 and has secured over $40 million in venture funding from investors including Paladin Capital Group, Lockheed Martin Ventures and Hakluyt Capital. Learn more at calypsoai.com and LinkedIn.

View original content to download multimedia: https://www.prnewswire.com/news-releases/calypsoai-launches-security-index-provides-first-comprehensive-safety-ranking-of-major-genai-models-302386190.html

SOURCE CalypsoAI
