Cosine Unveils Genie, the AI Software Engineer that Beats Cognition’s Devin

Cosine has secured $2.5 million in funding from SOMA and Uphonest, with additional investment from Lakestar and Focal, and is part of Y Combinator’s W23 batch.

The race to build a team of AI software engineers shows no sign of slowing. After Cognition’s Devin, Cosine, which describes itself as a human reasoning lab, has introduced Genie. The company bills it as the most capable AI software engineering model in the world, with a score of 30.08% on SWE-Bench.

Genie is designed to emulate the cognitive processes of human engineers, enabling it to solve complex problems with remarkable accuracy and efficiency. “We believe that if you want a model to behave like a software engineer, it has to be shown how a human software engineer works,” said Alistair Pullen, the founder of Cosine.

The UK-based startup has secured $2.5 million in funding from SOMA and Uphonest, with additional investment from Lakestar and Focal, and is part of Y Combinator’s W23 batch.

Billed by Cosine as the first AI software engineering colleague, Genie is trained on data that mirrors the logic, workflow, and cognitive processes of human engineers.

According to the company, this lets it overcome the limitations of existing AI tools, which are often foundation models extended with features such as web browsers or code interpreters. Unlike these, Genie can tackle unseen problems, iteratively test solutions, and proceed logically, much as a human engineer would.

Genie has set a new standard on SWE-Bench with a score of 30.08%, a 57% improvement over the previous best scores, held by Amazon’s Q and Factory’s Code Droid.

According to Cosine, this is both the highest score recorded on the benchmark so far and the largest single increase in its history. The company adds that Genie’s enhanced reasoning and planning capabilities extend beyond software engineering, positioning it as a versatile tool for other domains.

In its development, Genie was evaluated using SWE-Bench and HumanEval, with a strong focus on its ability to solve software engineering problems and retrieve the correct code for tasks. 

Genie scored 64.27% in retrieving necessary code lines, identifying 91,475 out of 142,338 required lines. This marks significant progress, though Cosine acknowledges room for improvement in this area.
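The retrieval figure can be checked with simple arithmetic. The sketch below assumes the metric is a plain ratio of required lines found to required lines in total; the function name `line_recall` is hypothetical and is not taken from Cosine's evaluation code.

```python
def line_recall(found: int, required: int) -> float:
    """Return the share of required lines that were retrieved, as a percentage.

    Assumption: the reported score is found / required, with no weighting.
    """
    if required <= 0:
        raise ValueError("required must be positive")
    return 100.0 * found / required


# Figures quoted in the article.
FOUND_LINES = 91_475
REQUIRED_LINES = 142_338

recall = line_recall(FOUND_LINES, REQUIRED_LINES)
missed = REQUIRED_LINES - FOUND_LINES

print(f"Retrieved: {recall:.2f}% of required lines")
print(f"Missed: {missed:,} lines")
```

Under that assumption, the two line counts reproduce the reported 64.27%, leaving 50,863 required lines that were not retrieved.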

Genie’s development involved overcoming challenges related to training models with limited context windows. Early efforts using smaller models highlighted the need for a larger context model, leading to Genie’s training on billions of tokens. The training mix was carefully selected to ensure proficiency in the programming languages most relevant to users.

Cosine’s innovative approach to Genie’s development included the use of self-improvement techniques, where the model was exposed to imperfect scenarios and learned to correct its mistakes. This iterative process significantly strengthened Genie’s problem-solving abilities.

Looking ahead, Cosine plans to continue refining Genie and to expand its capabilities across more programming languages and frameworks. The company aims to develop smaller models for simpler tasks and larger ones for complex challenges, drawing on its proprietary dataset. Planned developments include fine-tuning Genie on specific codebases so that it can understand large legacy systems, even those written in less common languages.

Mohit Pandey

Mohit dives deep into the AI world to bring out information in simple, explainable, and sometimes funny words.