Devin, the world’s first AI software engineer, is democratising coding like never before, bringing ease and efficiency to developers, similar to how Canva simplified design for non-designers.
“Devin feels UI/UX first, not GenAI first,” said Andrew Kean Gao, an avid developer who got early access to play around with the world’s first AI software engineer. He said that AI is a core component, but the surrounding infra they built is the star of the show.
Within just a couple of days of its release, Devin AI has captured developers’ minds and social media feeds worldwide, thanks to an impressive demo released by founder and CEO Scott Wu.
Built on advanced reasoning and planning capabilities, Devin aims to handle entire projects independently, offering greater value to users by streamlining the development process from start to finish. The tool can successfully plan and execute complex projects that require ‘thousands of decisions.’ “Devin can recall relevant context at every step, learn over time, and fix mistakes,” the company said.
Devin gets the job done.
The progression from GitHub Copilot to AutoGPT and now Devin shows developers are keen to replace the tedious parts of their jobs with AI.
What sets the AI software engineer apart is that it does not derail the way earlier autonomous systems did. Silas Alberti, a computer scientist and co-founder of another stealth AI startup who tried out Devin, confirmed this and said, “Using a tool like Devin effectively seems like a skill of its own, and it might be a very valuable skill to practise.”
Devin vs GitHub Copilot
This software assistant is a huge leap from AutoGPT and other AI agents, and it differs substantially from code completion tools such as GitHub Copilot. AutoGPT is an experimental open-source project that creates AI agents which, like Devin, break down high-level tasks. However, its performance was inconsistent, especially with complex tasks.
After the hype around AutoGPT and similar autonomous agents, Devin’s performance has elicited both excitement and scepticism from developers. On the SWE-bench benchmark, Devin correctly resolves 13.86% of issues unassisted, far exceeding the previous state of the art of 1.96%.
Brian Atwood, CEO of Sindarin, a conversational speech AI company, noted that Devin’s performance is being compared to that of LLMs. He explained that an agentic assistant with access to a command-line interface and internet browsing will outperform one-shot GPT-4 inference. The team has not confirmed that Devin is built on GPT models, but to get a proper baseline, Atwood asked, “Someone hack together a gpt-4 dev loop agent.”
GitHub Copilot, on the other hand, is a popular AI-powered code completion tool that provides suggestions and code snippets based on the context of the user’s code. It assists developers in writing code faster and more efficiently but does not autonomously complete entire projects.
However, the question remains: will Devin live up to its claims and deliver a truly autonomous software development experience? If it does, it could significantly impact the user base of tools like GitHub Copilot. A user on HackerNews who goes by the name observationist said, “AI should be able to make a good employee able to do 100x the output at the same level of quality, and AI only gets more efficient and capable from here on out.” Developers who currently rely on Copilot for code completion and suggestions may find themselves turning to Devin AI for end-to-end project management and execution.
But the transition may not be straightforward. Copilot’s seamless integration with popular IDEs and ability to learn from a developer’s coding style has made it a favourite among many. “You will need human intervention at some point and at that point you’re building a full-fledged web IDE,” commented breadsniffer, a user on HackerNews. Devin will need to offer a similarly smooth user experience and prove its reliability in real-world scenarios to win over developers.
Moreover, the cost of using Devin could be a determining factor in its adoption. If the pricing model is not competitive with existing tools like Copilot, developers may hesitate to make the switch. “Devin uses GPT-4, which can get expensive very quickly. Yes, it can get even more expensive than a human,” Bindu Reddy, CEO of Abacus.AI, speculated on X.
As the AI-assisted software development landscape continues to evolve, it will be interesting to see how tools like GitHub Copilot and Devin AI coexist and complement each other. Dawid van Straaten, the technical director at ConfigureTerminal, said, “GitHub Copilot would still be my preferred way to generate small boilerplate pieces of code.” While Devin may take on the heavy lifting of project management and execution, Copilot could still play a crucial role in helping developers write cleaner, more efficient code.
Andrej Karpathy, the former senior director at Tesla who is bullish on software engineering automation, responded to Devin’s release and said, “Software engineering is on track to change substantially. And it will look a lot more like supervising the automation while pitching in high-level commands, ideas, or progression strategies in English.”
Ultimately, Devin’s success will depend on its ability to deliver on its promises and provide tangible benefits to developers. Karpathy stressed that it could become a game-changer in the industry if it can significantly reduce the time and effort required to complete complex software projects while maintaining code quality and performance.