Red Hat and Google Cloud Expand Alliance to Boost Open AI and AI Agents

Red Hat and Google Cloud, two major players in the open transformation of artificial intelligence, have announced an expansion of their collaboration to drive AI forward. The extended partnership aims to broaden the range of enterprise applications for AI by combining Red Hat's open-source technologies with Google Cloud's specialized infrastructure and Google's open model family, Gemma.

Together, the companies will advance use cases for scaling AI through:

  • Launch of the open-source project llm-d, with Google as a founding contributor;
  • Support for vLLM on Google Cloud TPUs and GPU virtual machines (VMs) to enhance AI inference;
  • Day-zero support for vLLM with Gemma 3 model distributions;
  • Availability of Red Hat AI Inference Server on Google Cloud;
  • Development of agent-to-agent AI with Red Hat as a collaborator on Google’s Agent2Agent (A2A) protocol.

Reinforcing AI inference with vLLM

Demonstrating its commitment to day-zero readiness, Red Hat is among the first testers of Google's open model family, Gemma, starting with Gemma 3, with immediate support for vLLM. vLLM is an open-source inference server that accelerates the execution of generative AI applications. As the leading commercial contributor to vLLM, Red Hat is working to make the platform more efficient and responsive.
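As a concrete illustration of what serving a model on vLLM looks like, the commands below launch vLLM's OpenAI-compatible server with a Gemma 3 model and query it. This is a hedged sketch, not part of the announcement: the model identifier follows Hugging Face naming conventions and is an assumption, and running it requires a GPU plus access to the gated Gemma weights.

```shell
# Install vLLM and start its OpenAI-compatible server with a Gemma 3
# model (model ID assumed; adjust to the variant you have access to).
pip install vllm
vllm serve google/gemma-3-4b-it --port 8000

# From another terminal, send a chat completion request to the server.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "google/gemma-3-4b-it",
       "messages": [{"role": "user", "content": "Hello!"}]}'
```

Because vLLM exposes the OpenAI-compatible API, any client library that speaks that API can be pointed at the server without code changes.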

Moreover, Google Cloud’s TPUs, high-performance AI accelerators that are part of Google’s AI portfolio, are now fully compatible with vLLM. This integration enables developers to maximize resources while achieving the essential performance and efficiency for quick and accurate inference.

As AI moves from research into real-world deployment, many organizations face the complexities of a diverse AI ecosystem and the need to adopt more distributed computing strategies. To meet this demand, Red Hat has launched the llm-d open-source project, with Google as a founding collaborator. Building on the momentum of the vLLM community, this initiative aims to usher in a new era for gen AI inference. The goal is to enable greater scalability across heterogeneous resources, optimize costs, and increase workload efficiency — all while driving continuous innovation.

Empowering enterprise AI with community-driven innovation

Bringing the latest advancements from the open-source community to the enterprise environment, the Red Hat AI Inference Server is now available on Google Cloud. Similar to Red Hat’s enterprise distribution of vLLM, the AI Inference Server helps companies optimize model inference across their hybrid cloud environment. Leveraging the reliable infrastructure of Google Cloud, organizations can deploy production-ready generative AI models that are both highly responsive and cost-effective at scale.

Highlighting the shared commitment to open AI, Red Hat has also started contributing to Google's Agent2Agent (A2A) protocol — an application-level protocol that facilitates communication between agents and end users across platforms and clouds. By actively participating in the A2A ecosystem, Red Hat aims to accelerate innovation and ensure that AI workflows remain dynamic and effective with the power of agentic AI.
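To make the protocol a little more concrete: in A2A, an agent advertises itself to other agents through an "agent card," a JSON document it publishes (conventionally at a well-known URL) describing its identity, capabilities, and skills. The sketch below builds such a card in Python. The field names follow the publicly documented A2A schema, but the agent itself ("docs-helper"), its URL, and its skill are hypothetical examples, not anything from this announcement.

```python
import json

def make_agent_card() -> dict:
    """Build a minimal A2A-style agent card (illustrative sketch)."""
    return {
        "name": "docs-helper",                          # hypothetical agent
        "description": "Answers questions about product documentation.",
        "url": "https://agents.example.com/docs-helper",  # placeholder URL
        "version": "1.0.0",
        "capabilities": {
            "streaming": True,           # can stream partial responses
            "pushNotifications": False,  # no async callbacks
        },
        "skills": [
            {
                "id": "doc-qa",
                "name": "Documentation Q&A",
                "description": "Look up answers in indexed docs.",
                "tags": ["search", "qa"],
            }
        ],
    }

if __name__ == "__main__":
    # Serialize the card as it would be served over HTTP for discovery.
    print(json.dumps(make_agent_card(), indent=2))
```

Other agents fetch this document, inspect the advertised skills, and then exchange tasks with the agent over the protocol's HTTP-based messaging layer.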

Red Hat Summit

Watch the keynotes from the Red Hat Summit to hear the latest news from Red Hat executives, customers, and partners: