New updates across Red Hat's AI portfolio drive major transformations in the business sector. Through Red Hat AI, the company aims to further enhance the capabilities needed to accelerate adoption of the technology, offering customers more freedom and confidence in implementing generative AI (gen AI) in hybrid cloud environments. With the launch of Red Hat AI Inference Server, third-party validated models on Red Hat AI, and integration with Llama Stack APIs and the Model Context Protocol (MCP), the company is positioning itself across multiple AI modalities in the market.
According to Forrester, open source software will be the engine that accelerates enterprise AI efforts. As the AI landscape becomes more complex and dynamic, the Red Hat AI Inference Server and third-party validated models offer efficient inference and a tested collection of AI models optimized for performance on the Red Hat AI platform. With the integration of new APIs for gen AI agent development, including Llama Stack and MCP, Red Hat is working to reduce deployment complexity, empowering IT leaders, data scientists, and developers to advance their AI initiatives with greater control and efficiency.
Efficient inference in hybrid cloud with Red Hat AI Inference Server
The Red Hat AI portfolio includes the new Red Hat AI Inference Server, providing faster, more consistent, and cost-effective inference at scale in hybrid cloud environments. This addition is integrated into the latest versions of Red Hat OpenShift AI and Red Hat Enterprise Linux AI, and is also available as a standalone solution, enabling organizations to deploy intelligent applications with greater efficiency, flexibility, and performance.
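As a hedged illustration of what "deploying against an inference server" looks like in practice (the endpoint and model name below are assumptions, not details from this announcement), vLLM-style inference servers typically expose an OpenAI-compatible REST API, so a client only needs to construct a standard chat-completions request:

```python
import json

# Hypothetical local endpoint -- substitute whatever your deployment serves.
INFERENCE_URL = "http://localhost:8000/v1/chat/completions"

def build_chat_request(model: str, prompt: str, max_tokens: int = 256) -> str:
    """Build an OpenAI-compatible chat-completions request body."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return json.dumps(payload)

# "my-validated-model" is a placeholder model name.
body = build_chat_request("my-validated-model", "Summarize hybrid cloud inference.")
# POST `body` to INFERENCE_URL with any HTTP client once a server is running.
```

Because the interface is the de facto OpenAI wire format, the same client code works whether the model runs on-premises or in any public cloud, which is the portability the announcement emphasizes.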
Tested and optimized models with Red Hat AI and third-party validation
Third-party validated Red Hat AI models, available on Hugging Face, make it easier for companies to find the right models for their needs. Red Hat AI offers a collection of validated models, as well as deployment guidelines that increase customer confidence in model performance and result reproducibility. Selected models are also optimized by Red Hat, using model compression techniques that reduce their size and increase inference speed, helping to minimize resource consumption and operational costs. Furthermore, the continuous model validation process helps Red Hat AI customers stay at the forefront of gen AI innovation.
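The compression idea can be sketched in a few lines. The symmetric int8 scheme below is a generic illustration of weight quantization, one common compression technique, and not Red Hat's actual optimization pipeline:

```python
def quantize_int8(weights):
    """Symmetric int8 quantization: map floats onto [-127, 127] with a
    single per-tensor scale, so each weight fits in one byte instead of
    the four bytes a float32 needs."""
    scale = max(abs(w) for w in weights) / 127 or 1.0
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    """Recover approximate float weights at inference time."""
    return [q * scale for q in quantized]

weights = [0.5, -1.27, 0.0]
quantized, scale = quantize_int8(weights)  # roughly 4x smaller than float32
restored = dequantize(quantized, scale)
```

Smaller weights mean less memory traffic per token, which is where the inference-speed gain mentioned above comes from; the trade-off is a small, bounded rounding error per weight.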
Standardized APIs for application development and AI agents with Llama Stack and MCP
Red Hat AI is integrating Llama Stack, initially developed by Meta, together with Anthropic's MCP, to provide standardized APIs for building and deploying AI applications and agents. Currently available as a developer preview on Red Hat AI, Llama Stack offers a unified API for inference access with vLLM, retrieval-augmented generation (RAG), model evaluation, guardrails, and agents, for any gen AI model. MCP allows models to integrate with external tools, providing a standardized interface for connecting to APIs, plugins, and data sources in agent workflows.
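To make the protocol concrete: MCP messages use the JSON-RPC 2.0 envelope, so invoking an external tool amounts to sending a `tools/call` request. The sketch below only constructs the message; the tool name and arguments are hypothetical, and a real client would send the serialized message to an MCP server over stdio or HTTP:

```python
import itertools
import json

_request_ids = itertools.count(1)  # JSON-RPC requests need unique ids

def mcp_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build an MCP tools/call request (JSON-RPC 2.0 envelope)."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# Hypothetical tool exposed by some MCP server:
msg = mcp_tool_call("search_docs", {"query": "inference server tuning"})
wire = json.dumps(msg)  # what actually travels to the server
```

Because every tool is addressed through this one message shape, an agent framework only needs to implement MCP once to reach any compliant tool server, which is the standardization benefit described above.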
The latest version of Red Hat OpenShift AI (v2.20) offers additional improvements to build, train, deploy, and monitor generative and predictive AI models at scale. The highlights include:
- Optimized model catalog (technical preview): Easier access to validated Red Hat and third-party models, with deployment via the web console and complete lifecycle management through the integrated OpenShift registry.
- Distributed training with the Kubeflow Training Operator: Runs model tuning with InstructLab and distributed PyTorch workloads across multiple nodes and GPUs on Red Hat OpenShift, with distributed RDMA networking for acceleration and better GPU utilization, helping to reduce costs.
- Feature store (technical preview): Based on the upstream Kubeflow Feast project, it offers a centralized repository for managing and serving data for training and inference, streamlining the data flow and improving model accuracy and reusability.
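The feature-store pattern those bullets describe can be illustrated with a toy in-memory version (a conceptual sketch, not the Feast API): features are computed once, materialized into a fast online store, and fetched by entity key at inference time so training and serving read the same values:

```python
from datetime import datetime, timezone

class ToyOnlineStore:
    """Minimal illustration of a feature store's online serving path."""

    def __init__(self):
        self._rows = {}

    def materialize(self, entity_id, features):
        # A real feature store (e.g. Feast) would load precomputed
        # features from offline storage into a low-latency online store.
        self._rows[entity_id] = {**features, "_ts": datetime.now(timezone.utc)}

    def get_online_features(self, entity_id, feature_names):
        """Fetch a feature vector by entity key at inference time."""
        row = self._rows.get(entity_id, {})
        return {name: row.get(name) for name in feature_names}

store = ToyOnlineStore()
store.materialize("user:42", {"avg_session_minutes": 12.5, "plan": "pro"})
feats = store.get_online_features("user:42", ["avg_session_minutes", "plan"])
```

Centralizing this lookup is what gives the reusability benefit: multiple models can share one validated feature definition instead of each recomputing its own.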
Red Hat Enterprise Linux AI 1.5 brings new updates to Red Hat's foundation model platform, focused on the development, testing, and deployment of large language models (LLMs). The main features of RHEL AI version 1.5 include:
- Availability on Google Cloud Marketplace: Expands customer options to run Red Hat Enterprise Linux AI on public clouds (beyond AWS and Azure), facilitating the deployment and management of AI workloads on Google Cloud.
- Enhanced multilingual capabilities: Support for Spanish, German, French, and Italian via InstructLab, allowing model customization with native scripts and expanding the possibilities of multilingual AI applications. Users can also bring their own "teacher" and "student" models for greater control in customization and testing, with future support planned for Japanese, Hindi, and Korean.
Red Hat AI InstructLab on IBM Cloud is now generally available. This new cloud service further simplifies the model customization process, enhancing scalability and user experience, and lets companies use their data more efficiently and with greater control.
Red Hat's vision: any model, any accelerator, any cloud
The future of AI should be defined by unlimited opportunities and not restricted by infrastructure silos. Red Hat envisions a horizon where organizations can deploy any model, on any accelerator, on any cloud, delivering an exceptional and more consistent user experience without exorbitant costs. To unlock the true potential of investments in gen AI, companies need a universal inference platform — a new standard for continuous and high-performance AI innovations, both now and in the coming years.
Red Hat Summit
Participate in the Red Hat Summit keynotes to hear the latest news from Red Hat executives, customers, and partners:
- Modern infrastructure aligned with enterprise AI - Tuesday, May 20, 8-10 a.m. EDT (YouTube)
- The hybrid cloud evolves to drive business innovation - Wednesday, May 21, 8-9:30 a.m. EDT (YouTube)