New updates across Red Hat’s AI portfolio aim to accelerate enterprise adoption of generative AI. Through Red Hat AI, the company is expanding the capabilities customers need to adopt the technology faster, with more freedom and confidence when implementing generative AI in hybrid cloud environments. With the launch of the Red Hat AI Inference Server, third-party validated models on Red Hat AI, and the integration of the Llama Stack and Model Context Protocol (MCP) APIs, the company is strengthening its position across the enterprise AI market.
According to Forrester, open source software will be the engine that accelerates enterprise AI efforts. As the AI landscape grows more complex and dynamic, the Red Hat AI Inference Server and third-party validated models provide efficient inference and a proven collection of AI models optimized for performance on the Red Hat AI platform. With the integration of new APIs for gen AI agent development, including Llama Stack and MCP, Red Hat is working to reduce implementation complexity, empowering IT leaders, data scientists, and developers to advance their AI initiatives with greater control and efficiency.
Efficient inference in hybrid cloud with Red Hat AI Inference Server
The Red Hat AI portfolio now includes the Red Hat AI Inference Server, which delivers faster, more consistent, and more cost-effective inference at scale across hybrid cloud environments. The server is integrated into the latest versions of Red Hat OpenShift AI and Red Hat Enterprise Linux AI and is also available as a standalone offering, enabling organizations to deploy intelligent applications with greater efficiency, flexibility, and performance.
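Assuming the server exposes an OpenAI-compatible endpoint, as vLLM-based inference servers typically do, applications can query it with standard client libraries. The sketch below is a minimal illustration; the base URL, API key, and model name are placeholders rather than values from the announcement.

```python
# Minimal sketch: querying an OpenAI-compatible inference endpoint.
# Base URL, API key, and model name are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # hypothetical server address
    api_key="EMPTY",                      # self-hosted servers often ignore the key
)

response = client.chat.completions.create(
    model="granite-3.1-8b-instruct",  # placeholder model id
    messages=[{"role": "user", "content": "Summarize hybrid cloud inference in one sentence."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```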
Tested and optimized models with Red Hat AI and third-party validation
Third-party validated models for Red Hat AI, available on Hugging Face, make it easier for businesses to choose the right models for their needs. Red Hat AI offers a collection of validated models, along with deployment guidance that increases customer confidence in model performance and the reproducibility of results. Selected models are also optimized by Red Hat using model compression techniques that reduce size and increase inference speed, helping to minimize resource consumption and operating costs. In addition, the ongoing model validation process helps Red Hat AI customers stay at the forefront of gen AI innovation.
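To make the workflow concrete, discovering and downloading a checkpoint from Hugging Face could look like the sketch below; the organization and repository ids are assumptions for illustration, not confirmed names.

```python
# Minimal sketch: browsing and fetching model checkpoints from Hugging Face.
# The organization and repository ids are assumed for illustration.
from huggingface_hub import list_models, snapshot_download

# List a few models published under the (assumed) organization.
for model in list_models(author="RedHatAI", limit=5):
    print(model.id)

# Download one checkpoint locally for serving or evaluation.
local_dir = snapshot_download(repo_id="RedHatAI/example-quantized-model")  # placeholder id
print(f"Downloaded to {local_dir}")
```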
Standardized APIs for developing AI applications and agents with Llama Stack and MCP
Red Hat AI is integrating Llama Stack, initially developed by Meta, together with Anthropic’s MCP to provide standardized APIs for building and deploying AI applications and agents. Currently available as a developer preview on Red Hat AI, Llama Stack offers a unified API for vLLM inference, retrieval-augmented generation (RAG), model evaluation, guardrails, and agents across any gen AI model. MCP enables models to integrate with external tools by providing a standardized interface for connecting to APIs, plugins, and data sources in agent workflows.
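As an illustration, a client application might call Llama Stack’s unified inference API roughly as follows. This is a minimal sketch against the preview-era llama-stack-client Python SDK, which is still evolving; the server URL and model id are placeholders.

```python
# Minimal sketch: chat completion through Llama Stack's unified API.
# The base URL and model id are placeholders, not values from the announcement.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

response = client.inference.chat_completion(
    model_id="my-vllm-served-model",  # placeholder id for a vLLM-served model
    messages=[{"role": "user", "content": "What is retrieval-augmented generation?"}],
)
print(response.completion_message.content)
```

On the MCP side, exposing a tool to an agent can be sketched with the official MCP Python SDK; the tool below is a stub invented for this example.

```python
# Minimal sketch: an MCP server exposing one tool via the official Python SDK.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-tools")

@mcp.tool()
def get_order_status(order_id: str) -> str:
    """Return the status of an order (stubbed for illustration)."""
    return f"Order {order_id}: shipped"

if __name__ == "__main__":
    mcp.run()  # serves the tool over stdio by default
```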
The latest version of Red Hat OpenShift AI (v2.20) offers additional enhancements for building, training, deploying, and monitoring generative and predictive AI models at scale. Highlights include:
- Optimized model catalog (technology preview): easy access to validated models from Red Hat and third parties, with deployment through the web console and full lifecycle management via the integrated OpenShift registry.
- Distributed training with the Kubeflow Training Operator: run model tuning with InstructLab and distributed PyTorch workloads across multiple Red Hat OpenShift nodes and GPUs, with support for distributed RDMA networking to accelerate training and improve GPU utilization, reducing costs.
- Feature store (technology preview): based on the upstream Kubeflow Feast project, it provides a centralized repository for managing and serving data for training and inference, streamlining data flow and improving model accuracy and reusability; a minimal Feast sketch follows this list.
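For orientation, retrieving features at inference time with the upstream Feast project looks roughly like the sketch below. The feature references and entity key follow Feast’s own quickstart conventions and are placeholders, not part of the OpenShift AI preview.

```python
# Minimal sketch: online feature retrieval with Feast.
# Feature references and the entity key are quickstart-style placeholders.
from feast import FeatureStore

store = FeatureStore(repo_path=".")  # directory containing feature_store.yaml

features = store.get_online_features(
    features=[
        "driver_hourly_stats:conv_rate",        # placeholder feature reference
        "driver_hourly_stats:avg_daily_trips",  # placeholder feature reference
    ],
    entity_rows=[{"driver_id": 1001}],          # placeholder entity key
).to_dict()
print(features)
```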
Red Hat Enterprise Linux AI 1.5 brings new updates to Red Hat’s foundation model platform for developing, testing, and running large language models (LLMs). Key features of RHEL AI 1.5 include:
- Availability on Google Cloud Marketplace, expanding customers’ options for running Red Hat Enterprise Linux AI on public clouds (in addition to AWS and Azure) and easing the deployment and management of AI workloads on Google Cloud.
- Enhanced multilingual capabilities for Spanish, German, French, and Italian via InstructLab, allowing model customization with native-language scripts and expanding the possibilities of multilingual AI applications. Users can also bring their own ‘teacher’ and ‘student’ models for greater control over customization and testing, with future support planned for Japanese, Hindi, and Korean.
Red Hat AI InstructLab on IBM Cloud is now generally available. The new cloud service further streamlines the model customization process, improving scalability and the user experience, so companies can use their data more efficiently and with greater control.
Red Hat’s Vision: any model, any accelerator, any cloud
The future of AI should be defined by limitless opportunity, not restricted by infrastructure silos. Red Hat envisions a future in which organizations can deploy any model, on any accelerator, in any cloud, delivering an exceptional and more consistent user experience without exorbitant costs. To unlock the true potential of gen AI investments, companies need a universal inference platform: a standard for continuous, high-performance AI innovation both now and in the years to come.
Red Hat Summit
Join the Red Hat Summit keynotes to hear the latest updates from Red Hat executives, customers, and partners:
- Modern infrastructure aligned with enterprise AI — Tuesday, May 20, 8:00–10:00 a.m. EDT (YouTube)
- The hybrid cloud evolves to drive enterprise innovation — Wednesday, May 21, 8:00–9:30 a.m. EDT (YouTube)