
Apple’s GenAI Takes a ‘Lazy’ Leap with LazyLLM

They’ve introduced LazyLLM, which selectively computes the key-value (KV) pairs for important tokens during the pre-filling and decoding stages.


Apple recently introduced LazyLLM, a novel technique designed to enhance the efficiency of LLM inference. Detailed in a recent research paper, it aims to accelerate the generation of responses in transformer-based language models without compromising accuracy.

The paper proposes LazyLLM as a technique for efficient LLM inference, particularly in long-context scenarios. LazyLLM selectively computes the KV pairs of the tokens important for the next-token prediction and ‘lazily’ defers the computation of the remaining tokens to later steps, when they become relevant.

Developed by Qichen Fu, Thomas Merth, Sachin Mehta, and Mahyar Najibi of Apple, alongside Mohammad Rastegari, who now works at Meta, LazyLLM offers flexibility by enabling the model to reconsider tokens that were previously pruned, making the process more adaptive and efficient.
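For intuition, here is a minimal PyTorch sketch of the general idea described above: rank prompt tokens by the attention they receive, compute KV pairs only for the top fraction during pre-filling, and lazily revive deferred tokens if a later step needs them. The function names, the keep_ratio parameter, and the cache layout are illustrative assumptions, not Apple’s actual implementation.

```python
# Minimal sketch of LazyLLM-style progressive token pruning (hypothetical
# helper names, not Apple's code): compute KV only for tokens that look
# important now, and defer the rest until a later step asks for them.
import torch

def select_important_tokens(attn_to_last: torch.Tensor, keep_ratio: float) -> torch.Tensor:
    """Rank prompt tokens by the attention they receive from the last prompt
    token (e.g. taken from an earlier layer) and keep the top fraction."""
    k = max(1, int(attn_to_last.numel() * keep_ratio))
    return torch.topk(attn_to_last, k).indices.sort().values

def lazy_prefill(hidden, attn_to_last, kv_proj, keep_ratio=0.5):
    """Compute KV pairs only for the important tokens; defer the rest.

    hidden:  (seq_len, d_model) token representations entering a layer.
    kv_proj: callable mapping hidden states to (key, value) tensors.
    """
    keep = select_important_tokens(attn_to_last, keep_ratio)
    all_idx = torch.arange(hidden.size(0))
    deferred = all_idx[~torch.isin(all_idx, keep)]
    k, v = kv_proj(hidden[keep])          # KV computed only for kept tokens
    cache = {"indices": keep, "k": k, "v": v}
    return cache, deferred

def revive_tokens(cache, deferred, hidden, kv_proj, needed):
    """Lazily compute KV for previously deferred tokens that a later
    decoding step now needs, and merge them into the cache."""
    revive = deferred[torch.isin(deferred, needed)]
    if revive.numel():
        k, v = kv_proj(hidden[revive])
        cache["indices"] = torch.cat([cache["indices"], revive])
        cache["k"] = torch.cat([cache["k"], k])
        cache["v"] = torch.cat([cache["v"], v])
        deferred = deferred[~torch.isin(deferred, revive)]
    return cache, deferred
```

Because pruned tokens remain addressable, their KV cost is paid only if and when they turn out to matter, which is where the pre-filling savings come from.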

By reducing the heavy computation of the pre-filling stage, LazyLLM paves the way for more responsive and agile AI systems, potentially transforming applications that rely on large language models.

Apple Generative AI Innovations 

Apple recently released a new open-source LLM, DCLM-Baseline 7B, featuring 7 billion parameters. The model, released along with its weights, training code, and dataset, is trained on 2.5 trillion tokens from open datasets. It primarily uses English data and has a 2048-token context window.

https://twitter.com/_philschmid/status/1814274909775995087

The new model is licensed under the Apple Sample Code License and is available on Hugging Face with support for the Transformers library. Trained with PyTorch and OpenLM, it matches the performance of models trained on closed datasets, such as Mistral.
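As a rough illustration, loading the model through the Transformers library could look like the sketch below. The repo id and generation settings are assumptions for illustration; the official model card should be consulted for the exact loading steps (the checkpoint may also require the open_lm package).

```python
# Hypothetical usage sketch, assuming the checkpoint is published under the
# repo id "apple/DCLM-7B" and exposes the standard causal-LM interface.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "apple/DCLM-7B"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

prompt = "Large language models are"
inputs = tokenizer(prompt, return_tensors="pt")
# Keep prompt plus generated tokens within the 2048-token context window.
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```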

This comes after Apple introduced Apple Intelligence at WWDC 2024 to enhance Siri’s capabilities with generative AI.


Tarunya S
