Here's my perspective on LLMs and the future of programming.
I don't believe that the introduction of LLMs that can write code is going to obviate programming. And I don't believe that it is now pointless to develop new programming languages. Instead, I think LLMs are going to make programming, and future-of-coding (FoC) research, better, by automating one of the least interesting parts of programming: fiddling with the minutiae of syntax, language constructs, and libraries.
I think programmers will still have plenty of work to do. The profession is not doomed. But to justify this, we have to take a step back and consider all of the activities involved in programming.
Firstly, what is a "program"? A program is nothing more than:
• A formal specification of the behaviour of an interactive system...
• ...that computer hardware can execute (after translating it into machine code).
To emphasise this, I will use the term "formal spec" in place of "program" for the remainder of this discussion.
GPT-4 can understand formal specs as well as everyday English. Thus, if we can describe the functionality of a system in everyday English, GPT-4 can (attempt to) translate it into a formal spec. But writing the formal spec is just one activity of programming.
Altogether, programming (or perhaps "software development") involves several activities:
1. Determining what functionality the system being developed "should" have. This is done either by talking with relevant stakeholders (e.g. the future users) or by directly observing deficiencies in their current practices.
2. Expressing that functionality as a formal specification, i.e. "coding".
3. Verifying that the specification correctly implements all of the functionality of step 1. This includes practices such as reading and reviewing the specification, as well as testing the software.
4. Validating that the implemented functionality addresses the stakeholders' problems.
5. Repeating the first 4 steps until the stakeholders are satisfied with what has been developed.
Here's my hypothesis:
In the next 10 years, LLMs might radically reduce the amount of work required for step 2, but only step 2.
Steps 1 and 4 are very human-centred, so they can't be automated away, at least not until we have an omnipresent AGI that observes all human practices and automatically develops solutions to improve them.
Similarly, step 3 will not be automated any time soon, because:
• The plain English descriptions that we give to LLMs will often be ambiguous, underspecified, and maybe even inconsistent. Thus the LLMs will have to make educated
guesses at what we mean. (Even if they are able to ask clarifying questions, there will always be
some choices that are automatically made for us.)
• LLMs will occasionally get confused or misinterpret what we say, even if we are clear and careful. We will not have
infallible AIs any time soon.
So let's assume that LLMs can automate most of step 2. What does this mean for those of us developing tools and technologies to improve programming? Is our work obsolete now? Will the AI researchers and AI startups be taking the reins?
I don't think so! There is still a huge opportunity to develop tools that address step 3, at the very least. (Steps 1 and 4 are harder to address with technology.)
In particular,
step 3 involves the task of reading source code. When an LLM spits out 1000 lines of JavaScript, how do you know that the code implements the functionality that you wanted? You have to
verify that it does, and for large programs, that will be an enormous amount of work!
As we all know, no amount of testing can prove that a program is correct. Thus, we cannot verify AI-generated programs just by
using them. Maybe the program has a subtle bug, such as a buffer overflow, that might only be triggered 5 years after the program is deployed. Or less insidiously: maybe the program just doesn't handle certain edge cases in the way you would like it to. Either way, a human should probably read through the entire program with a keen eye, to check that all of the logic
makes sense.
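To make this concrete, here's a tiny, hypothetical illustration of the kind of thing a reviewer has to catch (the function and scenario are invented for this post, not taken from any real LLM output). The code looks right and handles the obvious input correctly, but a careful reader will notice it mishandles an edge case:

```typescript
// Hypothetical example: parse a comma-separated list of tags.
// The happy-path case below looks fine, but the empty-string case is wrong.
function parseTags(input: string): string[] {
  return input.split(",").map((tag) => tag.trim());
}

console.log(parseTags("red, green, blue")); // ["red", "green", "blue"]: looks right
console.log(parseTags(""));                 // [""]: an empty input yields one empty tag,
                                            // an edge case that casual testing easily misses
```

Multiply that by 1000 lines, and the reading burden becomes the dominant cost.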
There's clearly an opportunity for FoC researchers here: we can make languages and tools that make
reading and
verifying the behaviour of programs easier! Some examples:
• We can design programming languages that are vastly easier to read than traditional languages. How might we do that? Well, "higher-level" languages are likely easier to read, since they tend to be more concise and to focus on the end-user functionality (see the first sketch after this list). So work on higher-level programming models will continue to be valuable. To complement this, we can (and IMO, we should) invent new syntaxes that are closer to plain English, such that the specifications that LLMs produce are accessible to a wider audience.
• We can design programming languages where it is harder to write erroneous programs. For example, we can design programming languages that cannot crash or hang (i.e. Turing-incomplete languages), but which are still general-purpose (see the second sketch after this list). This reduces the kinds of errors that a human needs to consider as they verify a program.
• We can design better tools for reading and interrogating source code. (For example, better IDE support for navigating and understanding the structure of large codebases.)
• We can design better tools for exploring the space of behaviours of a running program. (Perhaps similar to the tools discussed in Bret Victor's "Up and Down the Ladder of Abstraction" essay.)
Overall, I think the future is bright! I'm going to continue my own PL research project (a very high-level language) with as much vigour as ever.