Andrew Ng’s DeepLearning.AI, in collaboration with Google Cloud, has launched a new course named ‘Large Multimodal Model Prompting with Gemini’, aimed at providing learners with essential skills in prompting multimodal models for various applications in AI. Unlike Large Language Models (LLMs), which accept only text prompts as input, the course teaches how Large Multimodal Models (LMMs) like Gemini can take text, images and video as input prompts to deliver more comprehensive and accurate outputs.
With this course, Andrew Ng aims to teach effective techniques for multimodal prompting, as well as the differences and use cases for the Gemini Nano, Pro, Flash and Ultra models. Taught by Erwin Huizenga, developer advocate for generative AI at Google Cloud, it also covers how to integrate Gemini with external APIs using function calling and best practices for creating multimodal applications.
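Function calling lets the model decide when to invoke developer-supplied functions, so a prompt can be answered with live data from an external API. Below is a minimal sketch of what this can look like with the google-generativeai Python SDK; the get_exchange_rate function, its hard-coded return value and the model name are illustrative assumptions, not material from the course.

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")


def get_exchange_rate(currency_from: str, currency_to: str) -> dict:
    """Look up the latest exchange rate between two currencies."""
    # Placeholder: a real implementation would call an external FX API here.
    return {"currency_from": currency_from, "currency_to": currency_to, "rate": 0.92}


# Register the function as a tool so Gemini can decide when to call it.
model = genai.GenerativeModel("gemini-1.5-pro", tools=[get_exchange_rate])

# Automatic function calling runs the tool and feeds its result back to the model.
chat = model.start_chat(enable_automatic_function_calling=True)
response = chat.send_message("How many euros is 100 US dollars?")
print(response.text)
```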
Check out the course here
Innovation First
The ability to have a model that can reason across text and image is quite new. Before LMMs such as Gemini, one effective way to work with an image and text simultaneously was to use an image captioning model to describe the image and then feed that caption to an LLM. With LMMs, images can be interpreted directly as inputs along with text by the AI.
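As a concrete illustration of the difference, here is a minimal sketch of prompting Gemini with an image and text together via the google-generativeai Python SDK; the model name and image file are assumptions made for the example.

```python
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")

# The image is passed directly alongside the text prompt; no separate
# captioning model is needed to translate it into text first.
image = Image.open("sales_chart.png")
response = model.generate_content(
    ["Summarise the trend shown in this chart in two sentences.", image]
)
print(response.text)
```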
DeepLearning.AI is offering a flurry of new courses. This month alone, the company has unveiled a course on federated learning, which allows secure training on private data, in partnership with Flower Labs. A significant collaboration with Upstage is helping students efficiently pretrain large language models, including cost-saving techniques like depth up-scaling.
Rounding out the recent launches is a course on optimising retrieval augmented generation (RAG) in partnership with MongoDB, which equips learners with the skills to build efficient and scalable RAG applications, covering techniques like vector search and prompt compression. With this diverse range of offerings, DeepLearning.AI is providing a valuable resource for those seeking to advance their skills in the ever-evolving field of AI.
Also Read: Andrew Ng’s DeepLearning.AI Unveils New Course on Building AI Applications with Haystack