# thinking-together
v
Did anyone see the OpenAI code generation demo during Microsoft Build this week? You write a function name and a comment, and OpenAI generates Python code that does that thing and keeps context across functions.
👎 2
🤯 2
🐍 1
a
I missed it! Do you have a link or name I can search for?
v
That's the one!
s
Same problems, different day. Look, generating Python code is not exciting. The problem is not “just” writing the code. We often forget there is learning git, infra, package management, and like a million other things that this OpenAI demo does not help with for the other 99% of the world that cannot code. Regardless of an AI helping them type strange-looking Python, it will not change the fact that most people cannot code. It’s impressive, sure, but think about who benefits by writing Python faster…? Is speed really the problem?
Python, and all other languages, remain too low-level for most people, and that will not change. We need something higher-level than classes, functions, variables and importing libraries. 🤔
➕ 1
It will come as no surprise that I think this presentation by Microsoft was neither inspiring nor fully utilizing the OpenAI framework to empower more people. It may have only made it easier for people who already know how to code… 🤷
a
The first time I saw tech like this was in Scratch, to help beginners learn to code. They could write what they wanted the code to do (“make the ball bounce”) and get a handful of snippets that they could steal from. And it worked the other way, so if they were unsure what some code did, they could press a button to get potential descriptions based on how people had described similar code. Very handy for learners.
s
Right, but Scratch is a structured editor and essentially all-in-one. Python is far from that.
Now, if this was built into Darklang… now we are talking.
a
A key part of the demo is that it puts you in conversation with the computer about processes grounded in the language of your code base. It happens that Python is part of the shared language they used in this demo, but it’s really impressive that it is even conversant in verbs the user themselves defined. I can’t say I’d get any more or less out of it if they’d done it in Dark? The value of the demo isn’t in your ability to download it today, is it?
👍 1
Honestly, if they swapped out Python for system hotkeys then you’d be well on your way to being able to type “computer, do my taxes”, so I’m having trouble figuring out your disappointment.
👍 1
s
I’m sorry, explain to me how system hotkeys can do my taxes?
I think they picked Python for a reason. That reason being that their demos are search results on StackOverflow. I’m not inspired by the code they demoed because all of that is in libraries. They did the simple stuff… how is that impressive? What about the hard stuff? I don’t think it will have anything to say about it. Python and essentially every single language have constructs and design principles that require abstraction and refactoring when the domain changes or knowledge is exposed.
`this function reverses a list` is nothing. `this function fetches our customers from stripe and filters by company, then cross checks the results against our salesforce`… now that is something real
a
The same way you do if you use tax software by pressing keys. If you wrote it out in plain English, it’d look like “Open browser. Navigate to taxcut.com. Type username. Type password. Paste my 1099.” with more words, because I’m on my phone 😛 The point is, every process, transcribed, is code, so interpreting natural language into code is a big deal, if not exactly what you were calling for in the other thread.
s
It’s the unique stuff that counts. All these businesses have different constraints, services, environments, processes… OMG it’s endless. What OpenAI demoed is nothing but a search result on SO
interpreting natural language into code is a big deal
This is where I, respectfully, disagree. At least in the context of code being a programming language currently on the market today.
a
I personally look forward to instructing a computer to help me generate any code required as input to every language you ever create. 😄
s
Oh, I certainly am too. It’s what I’m building at Storyscript… but it’s NOT translating to Python
`create a form with an email and after it's submitted put the results in my excel table`
That is WAY better NLP. And… it works in Storyscript.
`upload zoom recordings to youtube and slack me the link`… Try doing this in OpenAI to Python. Sure you can… but you still have to deploy it, manage env, etc. etc. etc. The code is meaningless without being able to execute it effectively.
You see the difference here? The above is serverless, gitless, codeless, envless, deployless. It’s voice powered, context free, click-to-login and runs out of the box.
PS it’s not a programming language. It’s not compiled, not interpreted, not transpiled.
PPS I did not type that code, it was entirely generated.
a
It’s hard to tell from a screenshot, but the interaction seems structurally identical?
s
If the differences are not obvious (serverless, gitless, codeless, envless, deployless), then I’m not getting my point across.
And it’s not structurally identical. Python does not have built-in WYSIWYGs and you still have all the traditional concerns listed above.
a
It seems like you also really like to be able to generate code from natural language, and since that’s the only part of the demo I think is significant, we’re good 👍🏼
s
🤣 ❤️ you Tom! We can agree on that for sure. But I implore you to think about the implications of using Python (or other languages), because it’s not easy. It’s seriously a pain in the ass, and that’s why only 0.3% of the world can do it.
@ibdknox This conversation above sheds more light on our discussion. Thank you!
a
Absolutely! We’re in such agreement there, and I’m honestly thrilled at seeing you take this direction in Storyscript.
🙇‍♂️ 1
s
Ignoring Storyscript for a moment, as a Python developer myself, I don’t see the demo as being “that” useful in how I write code. Most of my code is unique business logic requiring constraints and very intentional abstractions that I’m not optimistic OpenAI will account for. I would, however, use this demo’s talents for generating the most basic functionality in my application so that I don’t need to search SO 😉
a
I think if it reaches GPT-level understanding of language (a high bar and a low bar), it should be able to converse using your abstractions. I think that was the point of the part of the demo where they said “print a receipt” and it did so using the helper method that retrieved all the line items.
s
`print("Data {}")` is NOT “print a receipt”… printing a receipt means formatting data and sending it to the receipt printer to be printed on paper 😉 🤣 Great, an AI can call a “print” function, but it’s light-years away from figuring out how to connect to an EPOS TM88… (which I did professionally for years, it’s a fucking nightmare to manage that complexity)
The AI would help me write 1 line of code while I, manually, have to figure out how to write thousands to abstract the stupid escape character sequence to cut the paper. Yes, it left scars HAHA
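To make that gap concrete, here is a minimal sketch of what "print a receipt" involves beyond a `print` call. It assumes the printer speaks the common ESC/POS command set (`ESC @` to initialize, `GS V 0` for a full cut); the `build_receipt` helper and the shape of the line items are hypothetical. It only builds the byte stream. Getting those bytes to the device over USB, serial, or the network is not shown, and that is where most of the real work lives.

```python
ESC_INIT = b"\x1b\x40"          # ESC @ : reset the printer to its default state
GS_CUT_FULL = b"\x1d\x56\x00"   # GS V 0 : full paper cut
LF = b"\n"                      # line feed


def build_receipt(line_items):
    """Return ESC/POS bytes for (name, price_in_cents) pairs (hypothetical shape)."""
    out = bytearray(ESC_INIT)
    total = 0
    for name, cents in line_items:
        total += cents
        row = f"{name:<20}{cents / 100:>8.2f}"
        out += row.encode("ascii", errors="replace") + LF
    out += f"{'TOTAL':<20}{total / 100:>8.2f}".encode("ascii") + LF
    out += LF * 3               # feed the paper past the cutter before cutting
    out += GS_CUT_FULL
    return bytes(out)


payload = build_receipt([("Coffee", 350), ("Bagel", 275)])
```

Even this toy version has to deal with encoding, column layout, paper feed and the cut command, none of which a bare `print` touches.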
j
One subtlety here that will probably limit the practical impact of this in the near future -- the process of verifying whether the generated code is correct or not is essentially the same as debugging somebody else's code without them around. For functions on the simpler side, that may be a simple enough task, but as complexity increases that difficulty scales superlinearly.
👍 2
a
Steve, I’m really curious where the rubber meets the road in your ideal system. If someone said to you, “Weekly, look at the forecast and adjust the number of drivers required to deliver food by comparing average orders and weather,” wouldn’t you start by looking at your tools and compiling your available actions into a procedure to achieve your goal? You might defer some of that work, and you might keep the original request in mind to evaluate your effectiveness and correct course over time, but so long as the request is made using language and not a timed series of motor commands injected as electrical impulses directly into your brain stem, it seems compilation is an essential part of the process of doing someone else’s work… yet you seem to hate compilation in every form people have discussed today, be it template matching of pseudo-English all the way down to machine code, or ML-based machine translation to Python, but somehow compilation of short English phrases to Storyscript gets it right? Like… what if PEP bottomed out at Storyscript, or the OpenAI system emitted Storyscript, allowing someone to write three Storyscript statements using only one command? Neither of those systems was written to preclude this, but you act like their existence is unjustified because they don't.
i
So they’re just throwing ML at the problem, much surprised, very wow. I agree with steve here - an ML model trained on a bunch of snippets from github/SO is 99% sure not to have or understand what I’m doing when I’m juggling the domain abstractions. Yeah, it could help with writing down the boilerplate, but wtf, why do we even write boilerplate then? Just make a template instead of making an “AI”. I mean, it’s a good idea, but so far away from reality that doing this is just picking the low-hanging fruit and saying “oooh future”.
👍 2
s
@Ian Rumac @josh 🙏 Well said. I'm happy you both see the challenges of generating and validating code that is beyond simple functions.
@alltom I believe a 1:1 chat would be more productive to answer your question. It's not easy to articulate the subtleties and UX without going into the small details.
a
I dunno, maybe we're off in weeds that are only interesting to me, but could you point to where the boilerplate is in this example? The bodies of these functions were written by the ML model and required making several design decisions without human intervention, about how values should be computed, what nearby methods could be reused, and what output should look like. I think in any other context, we'd call the parts that the human wrote "boilerplate" and the parts that the computer wrote "programming".
s
@alltom the above cannot be generated by OpenAI; there are many assumptions and decisions that it made which throw many red flags for me. However, again, I think we are on different pages here, because what I see in this screenshot is the “simple” parts of coding. The “hard” parts concern the abstractions, security, data integrity, integration and scope concerns that an AI, in the general-purpose sense, cannot manage well (and I think never will unless you narrow the scope). @josh was spot on when hinting at verifying generated code. The system may have come up with multiple potential candidates. How do you pick between them? How do you provide the extra context or params when required? @alltom I promise you that this demo was hacked… I have little confidence that the results you see are fully using OpenAI. I would love to try it out and give it just a little ambiguity and see it fail constantly. There are so many issues with the above… I don’t even know where to start. I would bet 💰 it was mostly faked.
Think about all the ambiguity. Understanding the variable scope, the class structure, the functions available, the user intent… it’s far too much and far too complex to “just generate” it… This is at best snake oil.
j
FWIW -- I think it's very plausible that this particular example was generated honestly by the team. This is a pattern that I'd expect to be well-represented in the corpus of training material available on github, and so I'd expect the AI to usually get close. Particularly when using logical constraints as part of the generation process (think techniques like SRL). Occasionally, close will end up being completely correct. The issue with trying to use this for writing production code is that even an 80% success rate on individual expressions produces a lot of bugs over time without skilled programmers constantly debugging newly generated additions. I think it's probably a wash for most programmers to try to use this for development (assuming it's in that state), but reflecting more on it, this seems like potentially a huge accessibility boost for programmers who are blind or have lost dexterity. Definitely an avenue worth exploring more on the road to AGI. 🙂
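As a rough worked example of that compounding, assume (purely for illustration) that each generated expression is correct independently with probability 0.8. The chance that a function made of n such expressions is entirely correct is then 0.8^n, which drops off quickly. The function name below is hypothetical.

```python
def p_all_correct(p_each: float, n: int) -> float:
    """Probability that n independently generated expressions are all correct."""
    return p_each ** n


# Assumed per-expression success rate of 0.8, for a few function sizes.
for n in (1, 5, 10, 20):
    print(n, round(p_all_correct(0.8, n), 3))
```

Under that independence assumption, a 10-expression function comes out fully correct only about 11% of the time, which is why someone still has to review and debug nearly everything that gets generated.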
🔥 1
➕ 1
💪 1
i
Having built systems that can take arbitrary English and write sophisticated queries with high accuracy, these examples seem entirely doable given the technologies we have. Also, knowing several of the people at OpenAI, including Sam, it's extremely unlikely the demos were "mostly faked." Are they general? Dunno. They made no claims that it was, and we'd need to see more. But either way it's probably wrong to be so convinced that it's all just lies. In general, you'll gain much more from evaluating things from the perspective of how they could work, rather than assuming they're impossible and the people behind them are just using smoke and mirrors. It's also better for the community if we assume people are showing things in good faith, instead of trying to trick you.
c
@Steve For what it's worth, when I made this comment - https://futureofcoding.slack.com/archives/CCL5VVBAN/p1587142633117200?thread_ts=1587131460.117000&cid=CCL5VVBAN - my immediate reaction to your demo had been "this must be faked"!
s
@Chris Knott It was faked, of course! That’s what marketing is all about.
I never said it was real, and it should be obvious that it was faked. The fact remains that we are building this and have made significant progress. Still, the scope of Python and the challenges that come in the GPPL world are vast, and OpenAI won’t be able to generate code beyond simple functions that got seeded with nearly perfect concepts. In their demo, if you remove just one word it would fail to produce the code, because that one missing word creates ambiguity that no AI can resolve, because the answer is in your head.
@ibdknox I’m not assuming they were intentionally tricking anyone or being malicious. I do think, however, based on experience and research in this category, that the words which were seeded to OpenAI were methodical and predefined. As if: “If you say this exact phrase, we can generate this code”. I say this with confidence because if you change even small parts of the sentence, then you lose scope/context and add ambiguity which the AI cannot understand.