Open-source AI just got a significant boost. OpenAI has released its first-ever open-weight models, and Cloudflare has been making them available to developers since day one. By integrating these cutting-edge models into Workers AI, Cloudflare is reinforcing its commitment to openness, transparency, and enterprise-grade performance in the evolving AI landscape. This collaboration enables developers to build faster, more innovative, and more customizable applications with improved data control and lower latency.
Cloudflare Joins OpenAI at Launch
Cloudflare announced that it is a Day 0 launch partner for OpenAI’s new open-weight models. The integration immediately brings OpenAI’s GPT-OSS-120B and GPT-OSS-20B models to Cloudflare’s Workers AI platform, allowing developers and enterprises to run these models directly within Cloudflare’s global network infrastructure.
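To make the integration concrete, here is a minimal sketch of what calling one of these models through the Workers AI REST API might look like. The model ID and the Responses-style "input" field follow Cloudflare's announced conventions, but the exact endpoint path and payload shape are assumptions; consult the Workers AI documentation before relying on them.

```python
import json

# Hypothetical sketch: build a Workers AI inference request for GPT-OSS-120B.
# Model ID and "input" field are assumptions based on announced conventions.

API_BASE = "https://api.cloudflare.com/client/v4/accounts"

def build_request(account_id: str, prompt: str,
                  model: str = "@cf/openai/gpt-oss-120b"):
    """Return the endpoint URL and JSON payload for an inference call."""
    url = f"{API_BASE}/{account_id}/ai/run/{model}"
    payload = {"input": prompt}  # Responses-API-style input (assumption)
    return url, json.dumps(payload)

url, body = build_request("YOUR_ACCOUNT_ID", "Summarize MoE in one sentence.")
# Send with any HTTP client, e.g.:
#   curl -X POST "$url" -H "Authorization: Bearer $CF_API_TOKEN" -d "$body"
```

The request itself is ordinary authenticated HTTPS, so any HTTP client works; the snippet stops short of the network call since it requires account credentials.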
Company officials stated that Workers AI has always championed open AI models. This new partnership with OpenAI advances that vision by offering robust, accessible AI tools that meet modern demands for transparency, control, and deployment flexibility. For organizations requiring complete data privacy and compliance, the ability to run open models securely within Cloudflare’s environment presents a strong alternative to closed solutions.
Two Model Sizes for Versatile Use
OpenAI released two sizes of its new GPT-OSS models: one with 120 billion parameters and another with 20 billion. Both use a Mixture-of-Experts (MoE) architecture, a modern approach that enhances efficiency by only activating relevant portions of the model per query.
This MoE structure allows each model to provide high-quality responses with reduced compute overhead. Notably, both models run natively on FP4 quantization, which drastically reduces the GPU memory footprint. Compared to traditional FP16 models, these versions deliver faster execution and lower latency.
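The memory savings from FP4 are easy to estimate with back-of-the-envelope arithmetic. The sketch below counts weight storage only, ignoring activations, KV cache, and runtime overhead, and uses the advertised parameter counts:

```python
# Rough weights-only memory footprint: params * bits per param, converted to GB.
# Ignores activations, KV cache, and runtime overhead.

def weight_memory_gb(params: float, bits_per_param: int) -> float:
    """Approximate GPU memory needed just to hold the weights, in GB."""
    return params * bits_per_param / 8 / 1e9

for name, params in [("gpt-oss-20b", 20e9), ("gpt-oss-120b", 120e9)]:
    fp4 = weight_memory_gb(params, 4)
    fp16 = weight_memory_gb(params, 16)
    print(f"{name}: ~{fp4:.0f} GB at FP4 vs ~{fp16:.0f} GB at FP16")
```

By this estimate, FP4 cuts the weight footprint to a quarter of FP16 (roughly 60 GB instead of 240 GB for the 120B model), which is what makes serving models of this size practical on fewer GPUs.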
The smaller GPT-OSS-20B offers performance suitable for lightweight applications, while the larger GPT-OSS-120B provides greater depth for more complex reasoning tasks. This range supports use cases from basic chat applications to advanced enterprise logic workflows.
Designed for Real-World Capabilities
The GPT-OSS models are text-only but come equipped with a set of advanced reasoning capabilities. They support tool calling and are compatible with enhanced features such as Code Interpreter and Web Search. According to Cloudflare, Web Search support will be available soon, while Code Interpreter is already implemented within the Workers AI environment.
OpenAI trained these models to perform robust logical reasoning and code execution tasks—areas where many LLMs struggle. Rather than handling complex computation through internal prediction alone, the models can invoke tools to process tasks programmatically. This hybrid capability enables more accurate and context-aware results, especially in use cases involving programming, data analysis, and mathematics.
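The tool-invocation loop described above can be sketched with a stubbed "model" standing in for GPT-OSS. The message shapes and tool names here are illustrative only, not the actual Workers AI wire format: the point is the pattern of detecting a tool call, executing it on the host, and returning the result.

```python
# Illustrative tool-calling loop with a stubbed model (not the real API shape).

def calculator(expression: str) -> str:
    # Toy tool: evaluate simple arithmetic; reject anything but digits/operators.
    allowed = set("0123456789+-*/(). ")
    if not set(expression) <= allowed:
        raise ValueError("unsupported expression")
    return str(eval(expression))

TOOLS = {"calculator": calculator}

def fake_model(messages):
    """Stub: a real model decides when to emit a tool call. Here we route
    arithmetic-looking prompts to the calculator."""
    last = messages[-1]["content"]
    if last.strip("0123456789+-*/(). ") == "":
        return {"tool_call": {"name": "calculator", "arguments": last}}
    return {"content": last}

def run_with_tools(prompt: str) -> str:
    messages = [{"role": "user", "content": prompt}]
    reply = fake_model(messages)
    if "tool_call" in reply:
        call = reply["tool_call"]
        result = TOOLS[call["name"]](call["arguments"])
        messages.append({"role": "tool", "content": result})
        return result
    return reply["content"]

print(run_with_tools("12 * (3 + 4)"))  # -> 84
```

Delegating arithmetic to a tool rather than to token prediction is exactly why this hybrid pattern yields more reliable answers for math and code tasks.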
Cloudflare’s Unique Support for Code Interpreter
To maximize the utility of Code Interpreter, Cloudflare leveraged its broader Developer Platform ecosystem. Workers AI now integrates directly with Cloudflare Sandboxes, providing a secure and persistent environment for executing AI-generated code.

When a user submits a code-related query, Workers AI creates a dedicated Sandbox container that remains active for 20 minutes. This allows for stateful sessions, where the model can retain context across multiple interactions. Users can modify, re-run, or build upon previous executions without loss of session data.
The containerized environment is built on Cloudflare's new Containers offering, which provides secure, elastic code execution on demand. According to officials, Code Interpreter Sandboxes are warmed up in advance to minimize startup wait time, resulting in a fast, uninterrupted developer experience.

This stateful execution capability is particularly valuable for enterprise and advanced developer workflows. It transforms the typical stateless interaction model into a more flexible, iterative environment, enabling powerful applications in education, analytics, and automation.
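A toy model makes the stateful-session behavior easier to picture: each session keeps its own namespace so later executions build on earlier ones, and the session expires after the 20-minute container lifetime. The class below is a simplified sketch, not Cloudflare's implementation; a real Sandbox isolates execution rather than running code in-process.

```python
import time

SESSION_TTL = 20 * 60  # seconds, matching the 20-minute container lifetime

class SandboxSession:
    """Toy stand-in for a stateful Code Interpreter sandbox session."""

    def __init__(self):
        self.created = time.monotonic()
        self.namespace = {}          # state carried across executions

    def expired(self) -> bool:
        return time.monotonic() - self.created > SESSION_TTL

    def run(self, code: str):
        if self.expired():
            raise RuntimeError("session expired; start a new sandbox")
        exec(code, self.namespace)   # NOTE: a real sandbox isolates this!
        return self.namespace.get("_")

session = SandboxSession()
session.run("x = [1, 2, 3]")         # first execution defines state
session.run("_ = sum(x) * 2")        # later execution reuses it
print(session.namespace["_"])        # -> 12
```

The key property is that the second `run` call sees the variable defined by the first, which is what lets users iterate on AI-generated code without resubmitting the whole history.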
A Platform Built for Open AI
Cloudflare emphasized that Workers AI is not merely a model-hosting service. It is part of a larger developer-focused platform that provides scalable compute, secure storage, and advanced AI functionality. The release of the OpenAI models further supports the company's vision of letting application developers build Internet-scale apps without sacrificing privacy, performance, or transparency.
The company pointed out that open-weight models offer key advantages for many users. Developers can inspect, fine-tune, and deploy models with full control. Enterprises gain confidence in data governance, knowing their data stays within their controlled infrastructure. And startups benefit from reduced costs and increased customization compared to closed-model alternatives.
With OpenAI’s models now available through Workers AI, users can combine industry-leading AI performance with Cloudflare’s global edge network and security offerings.
Collaboration Across the Ecosystem
The launch was made possible through support from several contributors in the open AI ecosystem. Cloudflare thanked teams from vLLM and Hugging Face for their assistance in enabling efficient model serving on day one. These partnerships reflect the collaborative nature of open-source AI development and underscore the importance of community-driven innovation.
Bringing the OpenAI models to the platform also required substantial integration work across Cloudflare's internal teams. The engineering behind Sandboxes, Containers, and the new Responses API support demonstrates that the platform is ready for next-generation AI workloads.
Looking Ahead: What’s Next for Developers
Now that the GPT-OSS models are live, Cloudflare will publish documentation and example projects showing how to use them. These materials will include best-practice guidance on calling the models, using Code Interpreter in Sandboxes, and incorporating the tools into larger applications.
Developers who want to experiment with the models' capabilities can start building today on the Workers AI platform. Fine-tuning workflows and other customization options for advanced users may arrive in future releases as more models become available.

The immediate availability of both model sizes ensures that developers across all experience levels can explore, prototype, and deploy AI-powered features at scale.