At Microsoft Build 2024, Microsoft announced its partnership with Khan Academy to provide time-saving and lesson-enhancing AI tools to millions of teachers. Last year, Khan Academy introduced an AI-powered teaching assistant called ‘Khanmigo for Teachers’.
By donating access to Azure AI-optimised infrastructure, Microsoft will enable Khan Academy to offer all K-12 educators in the U.S. free access to the pilot program of Khanmigo for Teachers, now powered by Azure OpenAI Service.
Moreover, Khan Academy will use Phi-3 to enhance math tutoring. The edtech giant will give Microsoft access to educational content, such as math problems and their step-by-step solutions, to help develop AI-powered math tutoring built on Phi-3.
Khan Academy will also offer continuous feedback and benchmarking data to assess performance. Notably, none of Khan Academy’s user data will be used to train the model.
“These state-of-the-art teacher tools we are going to be able to give for free to every teacher in the United States, so they can get productivity improvements,” said Salman Khan, founder and CEO of Khan Academy, adding that teaching will be the first mainstream profession to really benefit from generative AI.
Apart from Khan Academy, other developers are also leveraging Phi-3 for innovative applications. For instance, ITC is using it to build a copilot for Indian farmers.
Epic is employing Phi-3 to summarise complex patient histories, helping address clinician burnout and staffing shortages. Digital Green’s AI assistant, Farmer.Chat, integrates video to aid over 6 million farmers.
Phi-3 Family of Models
Microsoft announced the expansion of its Phi-3 family of small, open models with the introduction of Phi-3-vision, a multimodal model combining language and vision capabilities.
Phi-3-vision is a 4.2B parameter model capable of reasoning over real-world images and text, optimised for chart and diagram understanding.
The model enhances text and image reasoning, setting new benchmarks in visual reasoning tasks, OCR, and chart understanding. It outperforms larger models like Claude-3 Haiku and Gemini 1.0 Pro V, maintaining Phi-3’s reputation for delivering high performance in a compact size.
Phi-3-small and Phi-3-medium, previewed earlier, are now available on Microsoft Azure, catering to generative AI applications that need strong reasoning under compute and latency constraints. These models join Phi-3-mini in Azure AI’s models-as-a-service offering, providing developers with quick and easy access.
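Models served this way are reachable over a plain HTTPS endpoint. As a minimal sketch, the snippet below builds an OpenAI-style chat-completions request for a Phi-3 deployment; the endpoint URL, the `api-key` header, and the environment variable names are illustrative placeholders, so substitute the values from your own Azure AI deployment.

```python
import json
import os
import urllib.request

# Hypothetical values: replace with the endpoint and key issued for
# your own Phi-3 deployment on Azure AI models as a service.
ENDPOINT = os.environ.get("AZURE_AI_ENDPOINT", "https://example.models.ai.azure.com")
API_KEY = os.environ.get("AZURE_AI_API_KEY", "")


def build_chat_request(prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat-completions payload for a Phi-3 deployment."""
    return {
        "messages": [
            {"role": "system", "content": "You are a helpful math tutor."},
            {"role": "user", "content": prompt},
        ],
        "max_tokens": max_tokens,
        "temperature": 0.2,
    }


def ask_phi3(prompt: str) -> str:
    """POST the payload to the deployment and return the model's reply text."""
    payload = build_chat_request(prompt)
    req = urllib.request.Request(
        f"{ENDPOINT}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": API_KEY},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Because the serverless endpoints follow the familiar chat-completions shape, the same payload works unchanged whether the deployment serves Phi-3-mini, Phi-3-small, or Phi-3-medium.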
Phi-3 models, known for their capability and cost-effectiveness, outperform similarly sized models across language, reasoning, coding, and math benchmarks. They are developed with high-quality training data, adhering to Microsoft’s responsible AI standards.
The Phi-3 family includes:
- Phi-3-vision: A 4.2B parameter multimodal model.
- Phi-3-mini: A 3.8B parameter language model.
- Phi-3-small: A 7B parameter language model.
- Phi-3-medium: A 14B parameter language model.
These models are optimised for various hardware, supporting mobile and web deployments through ONNX Runtime and DirectML. They are also available as NVIDIA NIM inference microservices, optimised for NVIDIA GPUs and Intel accelerators.
Takes on Google Gemma 2
Earlier, at Google I/O 2024, Google introduced PaliGemma, an open vision-language model designed for leading performance across a wide range of vision-language tasks. These tasks include image and short video captioning, visual question answering, text recognition in images, object detection, and object segmentation.
Google also introduced Gemma 2, the next generation of its Gemma models. The new 27 billion parameter model offers performance comparable to Llama 3’s 70B model at less than half the size, setting a new standard in the open model landscape.