Two major players in the open transformation of artificial intelligence have just announced an expanded collaboration to boost AI. The deepened partnership between Red Hat and Google Cloud focuses on broadening the range of enterprise AI applications, combining Red Hat's open source technologies with Google's specialized infrastructure and its Gemma family of open models.
Together, the companies will advance AI at scale through:
- The launch of the open source llm-d project, with Google as a founding contributor;
- Support for vLLM on Google Cloud TPUs and GPU-powered virtual machines (VMs) to enhance AI inference;
- Zero-day support for vLLM with Gemma 3 model distributions;
- Availability of Red Hat AI Inference Server on Google Cloud;
- Development of agentic AI, with Red Hat as a contributor to Google's Agent2Agent (A2A) protocol.
Reinforcing AI inference with vLLM
Demonstrating its commitment to day-zero readiness, Red Hat is now one of the first testers of Google's open model family, Gemma, starting with Gemma 3, with immediate support in vLLM. vLLM is an open source inference server that accelerates the execution of generative AI applications. As a leading commercial contributor to vLLM, Red Hat is making the platform more efficient and responsive for gen AI applications.
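For a sense of what that support looks like in practice, here is a minimal sketch of offline inference with vLLM, assuming vLLM is installed and the Gemma 3 weights are accessible; the Hugging Face model id below is illustrative, so check the model card for the exact name and license terms.

```python
# Minimal sketch: offline batch inference with vLLM and a Gemma 3 checkpoint.
# Assumes vLLM is installed (pip install vllm) and the model id below is
# accessible; the id is shown for illustration only.
from vllm import LLM, SamplingParams

llm = LLM(model="google/gemma-3-4b-it")  # illustrative Gemma 3 checkpoint

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(
    ["Explain what an inference server does, in two sentences."],
    params,
)

# Each result carries the generated completions for one prompt.
for out in outputs:
    print(out.outputs[0].text)
```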
In addition, Google Cloud TPUs, the high-performance AI accelerators that power Google's AI portfolio, are now fully compatible with vLLM. This integration lets developers maximize resources while achieving the performance and efficiency essential for fast, accurate inference.
Recognizing the transition from AI research to real-world deployment, many organizations face the complexities of a diverse AI ecosystem and the need for more distributed computing strategies. To address this, the companies launched the open source project llm-d, with Google as a founding contributor. Building on the momentum of the vLLM community, the initiative aims to usher in a new era of gen AI inference. The goal is to enable greater scalability across heterogeneous resources, optimize costs, and increase workload efficiency, all while driving continuous innovation.
Boosting enterprise AI with community-based innovation
Bringing the latest advances from the open source community to the enterprise, Red Hat AI Inference Server is now available on Google Cloud. As Red Hat's enterprise distribution of vLLM, AI Inference Server helps organizations optimize model inference across their hybrid cloud environments. By leveraging Google Cloud's trusted infrastructure, organizations can deploy production-ready generative AI models that are both highly responsive and cost-effective at scale.
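Because AI Inference Server builds on vLLM, deployments typically expose an OpenAI-compatible API. The sketch below shows one way a client might query such an endpoint; the base URL, API key, and model name are placeholders for a specific deployment, not values from this announcement.

```python
# Minimal sketch: querying a vLLM-compatible inference server (such as a
# Red Hat AI Inference Server deployment) through its OpenAI-compatible API.
# base_url, api_key, and model are placeholders for your own deployment;
# such servers conventionally serve under the /v1 path.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # replace with your endpoint
    api_key="not-needed-for-local",       # many self-hosted setups ignore this
)

resp = client.chat.completions.create(
    model="google/gemma-3-4b-it",  # must match the model the server loaded
    messages=[{"role": "user", "content": "Summarize vLLM in one sentence."}],
    max_tokens=64,
)
print(resp.choices[0].message.content)
```

The practical upshot is that applications written against the OpenAI API shape can point at a self-hosted endpoint without code changes beyond configuration.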
Highlighting the joint commitment to open AI, Red Hat has also begun contributing to Google's Agent2Agent (A2A) protocol, an application-level protocol that facilitates communication between agents and end users across diverse platforms and clouds. By actively participating in the A2A ecosystem, Red Hat seeks to accelerate innovation and ensure that AI workflows remain dynamic and effective through the power of agentic AI.
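As a rough illustration of how A2A discovery works, the sketch below fetches an agent's public "agent card", the JSON document the protocol uses to advertise an agent's identity and skills. The host URL is hypothetical and the field names reflect a reading of the published A2A spec, so treat this as an assumption-laden sketch rather than a reference client.

```python
# Minimal sketch: discovering an A2A agent by fetching its agent card.
# The host below is hypothetical; the well-known path and card fields
# follow the published A2A spec as we understand it.
import requests

AGENT_BASE_URL = "https://agent.example.com"  # hypothetical agent host

# A2A agents conventionally publish their card at /.well-known/agent.json.
card = requests.get(f"{AGENT_BASE_URL}/.well-known/agent.json", timeout=10).json()

print(card.get("name"))
print(card.get("description"))
for skill in card.get("skills", []):
    print("-", skill.get("id"), skill.get("name"))
```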
Red Hat Summit
Watch Red Hat Summit keynotes to hear the latest from Red Hat executives, customers, and partners:
- Modern infrastructure aligned with enterprise AI: May 20, 8:00-10:00 a.m. EDT (YouTube)
- Hybrid cloud evolves to drive business innovation: May 21, 8:00-9:30 a.m. EDT (YouTube)

