Open-source AI just got a significant boost. OpenAI has released its first-ever open-weight models, and Cloudflare has been making them available to developers since day one. By integrating these cutting-edge models into Workers AI, Cloudflare is reinforcing its commitment to openness, transparency, and enterprise-grade performance in the evolving AI landscape. This collaboration enables developers to build faster, more innovative, and more customizable applications with improved data control and lower latency.
Cloudflare Joins OpenAI at Launch
Cloudflare announced that it is a Day 0 launch partner for OpenAI’s new open-weight models. The integration immediately brings OpenAI’s GPT-OSS-120B and GPT-OSS-20B models to Cloudflare’s Workers AI platform, allowing developers and enterprises to run these models directly within Cloudflare’s global network infrastructure.
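To make the integration concrete, here is a minimal sketch of what calling one of these models through the Workers AI REST API might look like. The model ID and the Responses-style "input" field follow Cloudflare's announced conventions, but the exact endpoint path and payload shape are assumptions; consult the Workers AI documentation before relying on them.

```python
import json

# Hypothetical sketch: build a Workers AI inference request for GPT-OSS-120B.
# Model ID and "input" field are assumptions based on announced conventions.

API_BASE = "https://api.cloudflare.com/client/v4/accounts"

def build_request(account_id: str, prompt: str,
                  model: str = "@cf/openai/gpt-oss-120b"):
    """Return the endpoint URL and JSON payload for an inference call."""
    url = f"{API_BASE}/{account_id}/ai/run/{model}"
    payload = {"input": prompt}  # Responses-API-style input (assumption)
    return url, json.dumps(payload)

url, body = build_request("YOUR_ACCOUNT_ID", "Summarize MoE in one sentence.")
# Send with any HTTP client, e.g.:
#   curl -X POST "$url" -H "Authorization: Bearer $CF_API_TOKEN" -d "$body"
```

The request itself is ordinary authenticated HTTPS, so any HTTP client works; the snippet stops short of the network call since it requires account credentials.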
Company officials stated that Workers AI has always championed open AI models. This new partnership with OpenAI advances that vision by offering robust, accessible AI tools that meet modern demands for transparency, control, and deployment flexibility. For organizations requiring complete data privacy and compliance, the ability to run open models securely within Cloudflare’s environment presents a strong alternative to closed solutions.
Two Model Sizes for Versatile Use
OpenAI released two sizes of its new GPT-OSS models: one with 120 billion parameters and another with 20 billion. Both use a Mixture-of-Experts (MoE) architecture, a modern approach that enhances efficiency by only activating relevant portions of the model per query.
This MoE structure allows each model to provide high-quality responses with reduced compute overhead. Notably, both models run natively on FP4 quantization, which drastically reduces the GPU memory footprint. Compared to traditional FP16 models, these versions deliver faster execution and lower latency.
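The memory savings from FP4 are easy to estimate with back-of-the-envelope arithmetic. The sketch below counts weight storage only, ignoring activations, KV cache, and runtime overhead, and uses the advertised parameter counts:

```python
# Rough weights-only memory footprint: params * bits per param, converted to GB.
# Ignores activations, KV cache, and runtime overhead.

def weight_memory_gb(params: float, bits_per_param: int) -> float:
    """Approximate GPU memory needed just to hold the weights, in GB."""
    return params * bits_per_param / 8 / 1e9

for name, params in [("gpt-oss-20b", 20e9), ("gpt-oss-120b", 120e9)]:
    fp4 = weight_memory_gb(params, 4)
    fp16 = weight_memory_gb(params, 16)
    print(f"{name}: ~{fp4:.0f} GB at FP4 vs ~{fp16:.0f} GB at FP16")
```

By this estimate, FP4 cuts the weight footprint to a quarter of FP16 (roughly 60 GB instead of 240 GB for the 120B model), which is what makes serving models of this size practical on fewer GPUs.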
The smaller GPT-OSS-20B offers performance suitable for lightweight applications, while the larger GPT-OSS-120B provides greater depth for more complex reasoning tasks. This range supports use cases from basic chat applications to advanced enterprise logic workflows.
Designed for Real-World Capabilities
The GPT-OSS models are text-only but come equipped with a set of advanced reasoning capabilities. They support tool calling and are compatible with enhanced features such as Code Interpreter and Web Search. According to Cloudflare, Web Search support will be available soon, while Code Interpreter is already implemented within the Workers AI environment.
OpenAI trained these models to perform robust logical reasoning and code execution tasks—areas where many LLMs struggle. Rather than handling complex computation through internal prediction alone, the models can invoke tools to process tasks programmatically. This hybrid capability enables more accurate and context-aware results, especially in use cases involving programming, data analysis, and mathematics.
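The tool-invocation loop described above can be sketched with a stubbed "model" standing in for GPT-OSS. The message shapes and tool names here are illustrative only, not the actual Workers AI wire format: the point is the pattern of detecting a tool call, executing it on the host, and returning the result.

```python
# Illustrative tool-calling loop with a stubbed model (not the real API shape).

def calculator(expression: str) -> str:
    # Toy tool: evaluate simple arithmetic; reject anything but digits/operators.
    allowed = set("0123456789+-*/(). ")
    if not set(expression) <= allowed:
        raise ValueError("unsupported expression")
    return str(eval(expression))

TOOLS = {"calculator": calculator}

def fake_model(messages):
    """Stub: a real model decides when to emit a tool call. Here we route
    arithmetic-looking prompts to the calculator."""
    last = messages[-1]["content"]
    if last.strip("0123456789+-*/(). ") == "":
        return {"tool_call": {"name": "calculator", "arguments": last}}
    return {"content": last}

def run_with_tools(prompt: str) -> str:
    messages = [{"role": "user", "content": prompt}]
    reply = fake_model(messages)
    if "tool_call" in reply:
        call = reply["tool_call"]
        result = TOOLS[call["name"]](call["arguments"])
        messages.append({"role": "tool", "content": result})
        return result
    return reply["content"]

print(run_with_tools("12 * (3 + 4)"))  # -> 84
```

Delegating arithmetic to a tool rather than to token prediction is exactly why this hybrid pattern yields more reliable answers for math and code tasks.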
Cloudflare’s Unique Support for Code Interpreter
To maximize the utility of Code Interpreter, Cloudflare leveraged its broader Developer Platform ecosystem. Workers AI now integrates directly with Cloudflare Sandboxes, providing a secure and persistent environment for executing AI-generated code.

When a user submits a code-related query, Workers AI creates a dedicated Sandbox container that remains active for 20 minutes. This allows for stateful sessions, where the model can retain context across multiple interactions. Users can modify, re-run, or build upon previous executions without loss of session data.
The containerized environment is built on Cloudflare's new Containers offering, which provides secure, elastic code execution on demand. According to officials, Code Interpreter Sandboxes are warmed up in advance to minimize startup wait time, resulting in a fast, uninterrupted developer experience.

This stateful execution capability is particularly valuable for enterprise and advanced developer workflows. It transforms the typical stateless interaction model into a more flexible, iterative environment, enabling powerful applications in education, analytics, and automation.
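A toy model makes the stateful-session behavior easier to picture: each session keeps its own namespace so later executions build on earlier ones, and the session expires after the 20-minute container lifetime. The class below is a simplified sketch, not Cloudflare's implementation; a real Sandbox isolates execution rather than running code in-process.

```python
import time

SESSION_TTL = 20 * 60  # seconds, matching the 20-minute container lifetime

class SandboxSession:
    """Toy stand-in for a stateful Code Interpreter sandbox session."""

    def __init__(self):
        self.created = time.monotonic()
        self.namespace = {}          # state carried across executions

    def expired(self) -> bool:
        return time.monotonic() - self.created > SESSION_TTL

    def run(self, code: str):
        if self.expired():
            raise RuntimeError("session expired; start a new sandbox")
        exec(code, self.namespace)   # NOTE: a real sandbox isolates this!
        return self.namespace.get("_")

session = SandboxSession()
session.run("x = [1, 2, 3]")         # first execution defines state
session.run("_ = sum(x) * 2")        # later execution reuses it
print(session.namespace["_"])        # -> 12
```

The key property is that the second `run` call sees the variable defined by the first, which is what lets users iterate on AI-generated code without resubmitting the whole history.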
A Platform Built for Open AI
Cloudflare emphasized that Workers AI is not merely a model-hosting service. It is part of a larger developer-focused platform that provides scalable compute, secure storage, and advanced AI functionality. The release of the OpenAI models further supports the company's vision of letting application developers build Internet-scale apps without sacrificing privacy, performance, or transparency.
The company pointed out that open-weight models offer key advantages for many users. Developers can inspect, fine-tune, and deploy models with full control. Enterprises gain confidence in data governance, knowing their data stays within their controlled infrastructure. And startups benefit from reduced costs and increased customization compared to closed-model alternatives.
With OpenAI’s models now available through Workers AI, users can combine industry-leading AI performance with Cloudflare’s global edge network and security offerings.
Collaboration Across the Ecosystem
The launch was made possible through support from several contributors in the open AI ecosystem. Cloudflare thanked teams from vLLM and Hugging Face for their assistance in enabling efficient model serving on day one. These partnerships reflect the collaborative nature of open-source AI development and underscore the importance of community-driven innovation.
Bringing the OpenAI models to the platform also required substantial integration work across Cloudflare's internal teams. The engineering behind Sandboxes, Containers, and the new Responses API support demonstrates that the platform is ready for next-generation AI workloads.
Looking Ahead: What’s Next for Developers
Now that the GPT-OSS models are live, Cloudflare will publish documentation and example projects showing how to use them. These materials will include best-practice guidance on calling the models, using Code Interpreter in Sandboxes, and incorporating the tools into larger applications.
Developers who want to experiment with the models' capabilities can start building today on the Workers AI platform. Fine-tuning workflows and other customization options for advanced users may arrive in future releases as more models become available.

The immediate availability of both model sizes ensures that developers across all experience levels can explore, prototype, and deploy AI-powered features at scale.