DeepLearning.AI Offers New Course on Large Multimodal Model Prompting with Gemini

The new course focuses on combining video and images with text prompts for GenAI output.

Andrew Ng’s DeepLearning.AI, in collaboration with Google Cloud, has launched a new course named ‘Large Multimodal Model Prompting with Gemini’, which aims to give learners essential skills in prompting multimodal models for a range of AI applications. Unlike Large Language Models (LLMs), which accept only text prompts as input, Large Multimodal Models (LMMs) like Gemini can take text, images and video as input prompts, and the course teaches how to combine them to get more comprehensive and accurate outputs.

The course aims to teach effective techniques for multimodal prompting, along with the differences between, and use cases for, the Gemini Nano, Pro, Flash and Ultra models. Taught by Erwin Huizenga, developer advocate for Gen AI on Google Cloud, it also covers how to integrate Gemini with external APIs using function calling, and best practices for building multimodal applications.


Innovation First

The ability of a single model to reason across text and images is quite new. Before LMMs such as Gemini, one effective way to work with an image and text together was to use an image captioning model to describe the image and then feed that caption to an LLM. With LMMs, the model can interpret images directly as input alongside text.
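The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ImagePart`, `caption_then_llm`, `multimodal_prompt` and the `fake_*` functions are hypothetical names invented here, and the models are toy stand-ins rather than calls to Gemini or any real API.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass
class ImagePart:
    """Stand-in for raw image data (hypothetical type, not a Gemini API class)."""
    data: bytes


# A prompt is an ordered list of text strings and image parts.
Prompt = List[Union[str, ImagePart]]


def caption_then_llm(
    image: ImagePart,
    question: str,
    captioner: Callable[[ImagePart], str],
    llm: Callable[[str], str],
) -> str:
    """Older two-stage approach: the LLM only ever sees a text caption."""
    caption = captioner(image)
    return llm(f"Image description: {caption}\nQuestion: {question}")


def multimodal_prompt(
    image: ImagePart,
    question: str,
    lmm: Callable[[Prompt], str],
) -> str:
    """LMM approach: image and text go to the model together in one prompt."""
    return lmm([image, question])


# Toy stand-ins so the sketch runs without any model or network access.
def fake_captioner(image: ImagePart) -> str:
    return "a cat on a sofa"


def fake_llm(text: str) -> str:
    return f"LLM saw {len(text)} characters of text"


def fake_lmm(prompt: Prompt) -> str:
    n_images = sum(isinstance(p, ImagePart) for p in prompt)
    n_texts = sum(isinstance(p, str) for p in prompt)
    return f"LMM saw {n_images} image(s) and {n_texts} text part(s)"
```

In the first function, any detail the captioner leaves out never reaches the language model. In the second, the model receives the image itself.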

DeepLearning.AI has been releasing a flurry of new courses. This month alone, the company has unveiled a course on federated learning, which allows secure training on private data, in partnership with Flower Labs. Another collaboration, with Upstage, teaches students how to pretrain large language models efficiently, including cost-saving techniques like depth up-scaling.

Rounding out the recent launches is a course on optimising retrieval-augmented generation (RAG), offered in partnership with MongoDB. It equips learners with the skills to build efficient and scalable RAG applications, covering techniques like vector search and prompt compression. With this diverse range of offerings, DeepLearning.AI is providing a valuable resource for those seeking to advance their skills in the ever-evolving field of AI.

Tanisha Bhattacharjee

Journalist with a passion for art, technological development and travel. Discovering the dynamic world of AI, one article at a time.