IBM Releases Granite 4.0, Hybrid AI Models for Enterprise Efficiency

IBM has released Granite 4.0, its latest family of open-source language models. The series introduces a hybrid Mamba-2/Transformer architecture aimed at improving enterprise efficiency and cutting inference costs.

The core innovation of Granite 4.0 is the combination of Mamba-2 state space model layers with conventional Transformer blocks. IBM claims the hybrid design maintains strong performance while cutting memory requirements by more than 70% compared to traditional LLMs on workloads involving long inputs and many concurrent sessions.
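To see why a state space layer saves memory at inference time, it helps to compare its fixed-size recurrent state with a Transformer's KV cache, which grows with every token. The sketch below uses hypothetical layer counts and dimensions (not IBM's actual configuration) purely to make the scaling visible:

```python
# Toy illustration (not IBM's implementation): inference memory of a
# Transformer KV cache vs. a Mamba-style fixed-size state.
# All sizes below are hypothetical, chosen only to show the scaling.

def kv_cache_bytes(seq_len, n_layers, n_heads, head_dim, bytes_per_val=2):
    # A Transformer caches keys and values for every past token, so
    # memory grows linearly with sequence length and concurrent sessions.
    return seq_len * n_layers * n_heads * head_dim * 2 * bytes_per_val

def ssm_state_bytes(n_layers, d_model, state_dim, bytes_per_val=2):
    # A Mamba-style state space layer keeps one fixed-size recurrent
    # state per layer, independent of how many tokens were processed.
    return n_layers * d_model * state_dim * bytes_per_val

short = kv_cache_bytes(seq_len=1_000, n_layers=32, n_heads=32, head_dim=128)
long = kv_cache_bytes(seq_len=128_000, n_layers=32, n_heads=32, head_dim=128)
ssm = ssm_state_bytes(n_layers=32, d_model=4096, state_dim=128)

print(f"KV cache @ 1k tokens:   {short / 2**20:.0f} MiB")   # ~500 MiB
print(f"KV cache @ 128k tokens: {long / 2**30:.1f} GiB")    # ~64 GiB
print(f"SSM state (any length): {ssm / 2**20:.0f} MiB")     # ~32 MiB
```

Replacing some attention layers with state space layers therefore caps the part of memory that would otherwise scale with context length, which is where the long-context savings come from.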

The new series comes in several sizes, each available in instruction-tuned and base variants. The main models are:

  • Granite-4.0-H-Small: A Mixture-of-Experts (MoE) model with 32 billion total parameters, but only 9 billion active parameters during inference.
  • Granite-4.0-H-Tiny: A hybrid MoE model with 7 billion total parameters and only 1 billion active parameters.
  • Granite-4.0-H-Micro: A dense hybrid model with 3 billion parameters.
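The gap between total and active parameters in the MoE variants comes from sparse routing: each token is sent to only a few experts, so most weights sit idle on any given forward pass. A minimal top-k router sketch (with toy dimensions, not Granite's actual architecture) illustrates the idea:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions for illustration only.
d_model, n_experts, top_k = 16, 8, 2

# Each "expert" is a small feed-forward weight matrix.
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
router_w = rng.standard_normal((d_model, n_experts))

def moe_forward(x):
    """Route a token vector to its top-k experts and mix their outputs."""
    logits = x @ router_w
    top = np.argsort(logits)[-top_k:]        # indices of the chosen experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                 # softmax over chosen experts only
    # Only top_k of n_experts weight matrices are used for this token.
    out = sum(w * (x @ experts[i]) for w, i in zip(weights, top))
    return out, top

x = rng.standard_normal(d_model)
y, chosen = moe_forward(x)
print(f"token routed to {top_k} of {n_experts} experts: {sorted(chosen.tolist())}")
```

This is how Granite-4.0-H-Small can hold 32 billion parameters while activating only about 9 billion per token: compute and memory bandwidth per token scale with the active subset, not the full model.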

These models are designed for the specific needs of enterprises, such as Retrieval-Augmented Generation (RAG), multi-agent workflows, long document summarization, and deployment on local devices or edge hardware.
