AI for code: the next frontier in software development?

Published on May 2, 2022

Recently, DeepMind announced a breakthrough in AI for code generation. AlphaCode, which uses large-scale transformer models to generate code, was pitted against humans in a series of programming contests run by Codeforces. Contestants were asked to design novel solutions to abstract problems, which required a blend of natural language understanding, formal logic, creative algorithm design, and coding. In an unprecedented outcome for AI systems, AlphaCode’s performance was in line with the median human competitor.

This is the latest in a flurry of developments in the field – and has captured headlines just as OpenAI’s Codex launch did last summer.

What are the implications? Will we start to see real-world computer programs being synthesised automatically – and what could this mean for the global software economy? And, more immediately, how do we evaluate the startups that are at the forefront of building AI-for-code products?

The opportunity: from workflow efficiency to full program synthesis

In recent years, the ML community has made great progress in understanding and generating natural language. Programming is one of the more promising applications for large language models and deep learning techniques. As demand for talent continues to outstrip supply, there is an opportunity to build tools that make developers more effective, or give non-developers tools to help them create software – hence the rise of no-code / low-code tooling in recent years.

As natural language now interacts with code, it is easy to imagine highly valuable applications emerging. A code-to-documentation tool could make it easier for collaborators to understand what a piece of code does. A highly effective general-purpose transpiler could help us shift entire codebases out of obsolete or deprecated languages – perhaps alleviating the bottleneck of COBOL engineers to maintain legacy infrastructure, for example. And if an IDE can autocomplete snippets of code from natural language prompts, shaving seconds from a developer’s workflow at scale, this might translate into highly valuable productivity improvements. One might perhaps build a virtual coding tutor, answering students’ natural language queries with working code. Each of these could lead to a multi-billion-dollar revenue opportunity if adopted at global scale.

OpenAI’s Codex has been a notable pioneer in improving the developer workflow. Codex is a descendant of GPT-3, trained on natural language as well as publicly available code sources like GitHub. Launched last summer, it helps users translate natural language to functioning code in over a dozen languages. Applications like GitHub’s Copilot, as well as a crop of early-stage startups, are using Codex to offer ‘virtual pair programmers’, query generators, docstring writers, and other tools to save developer time and minimise errors. These have been met with acclaim, albeit with some reservations about code quality, security, bias, and IP issues.

Program synthesis is a much more substantial challenge, but with potentially further-reaching implications. DeepMind is now demonstrating that AI systems can generate entirely novel programs based on problems described in natural language – even when they cannot duplicate solutions seen before, and cannot try out every potentially viable algorithm. Yet we are very much in the early phases of program synthesis, and there will long remain a need for humans to apply context and creativity, as they break down abstract business problems into machine-readable ones. We remain some distance from code generation systems that truly pass the Turing Test.

Commercialising AI for code: key questions for startups

There is a rising groundswell of startups operating at the interface between natural language and code. As we evaluate them, our three biggest questions are:

How do you sell AI for code? What’s the insertion point, and how can you expand from there? Is there a specific burning pain point – perhaps in a vertical or a language where developer capacity is severely constrained? Our hypothesis is that the first successful products will be adopted bottom-up by individual developers for the productivity savings they bring. This might then lead to organic intra-organisational spread. There is a higher burden of technical proof required for top-down, C-level adoption of a larger project-based application (e.g. transpiling a large legacy codebase).
How do you build a differentiated platform? If large pre-trained models are openly accessible, how do you build a moat? One approach is to acquire a proprietary domain-specific dataset to fine-tune a model – most obviously in the sectors with the tightest IP controls, including life sciences and financial services. We’re also looking out for regulatory barriers to entry that might allow us to build a moat on top of open-access models. Product packaging and UX are another angle from which to differentiate: the winners will be the companies that embed assistive features most elegantly into users’ existing workflows, and eventually provide a managed offering with enterprise-grade support and security.
How do you fit into the user workflow? The most obvious wedge for a new entrant is to assist the developer within an IDE, as GitHub Copilot does in VS Code and other editors. Yet few IDEs are meaningful businesses in their own right – and being in thrall to a third-party platform is rarely an easy foundation on which to build a large standalone business. In the long run, our hypothesis is that the winning platforms will be an extension of today’s low-code trend: i.e. consumerised products, with purely natural language interfaces, enabling non-technical business users to self-serve and generate the insights or applications which are beyond their reach today.

In conclusion – we are in the very early stages of the adoption curve, and there are many open questions for early-stage founders to navigate today. An enthusiastic futurist might foresee a scenario in which AI replaces human programmers altogether. We are far from this endgame – but in the short term, it’s clear that there are new ways to give ‘superpowers’ to developers and alleviate some of their capacity constraints. The founders who enable this could become a vital part of the way we build software over the coming decades.

If you’re one of them, we’d love to chat: please get in touch.

With thanks to the founders and experts who joined our ‘AI for Code’ roundtable session.
