# thinking-together
k
I’ve been thinking about how languages taught at the intro level are usually taught with lots of white lies about how they work, out of necessity. For example, Python’s actual underlying memory model (how everything’s a reference and isn’t copied around) is confusing (arguably), so we avoid teaching that part and oftentimes make all these distinctions between references and primitives (even though everything’s an object). Teaching something more accurate (everything’s an object with references, and oh, sometimes values are even managed via reference counting) would actually lead to confusion when trying to translate some of this to other languages like C, where everything’s actually a value (pointers are just ints purposed for addressing things). Java is also confusing with all the GC stuff and the opinions it has about memory and memory behavior. I would like a first-learner’s language that has the arguably-more-consistent semantics of something like C (“everything’s a value”), but is closer to Python. So I can feel like I’m not “lying” so much. I think the closest thing is C itself, unfortunately, or C++ with a custom allocator thing. Or maybe a hot take: Objective-C / ARC are the closest. But really I’d just like a Python with less obfuscation and inconsistency, so it looks more like C with safety on top and better file imports. Essentially, Python with proper value types for everything, even if a little more challenging. The most explainable language closest to hardware, but easier to use. Just thinking out loud. I think a Python 3 subset preprocessor that outputs raw C would look close to the language I have in mind. (I’d like `end` for blocks too).
k
Could you elaborate on where beginners might get tripped up by memory model gotchas in these languages? Basically I'm wondering how to build up to a worldview that this is a major issue. For example, if you had a list of gotcha programs in Python that would be very interesting to go through. I can think of one gotcha in Python (appending to a default arg that is initialized to an empty list, or something like that). Is it a jungle out there, or is it just a couple of things like that in each language that people hit and learn to avoid?
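For reference, the default-argument gotcha mentioned here can be sketched roughly as below. This is only a minimal illustration; the function names `append_item` and `append_item_safe` are hypothetical and not from the thread.

```python
def append_item(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits `bucket` shares the same list object.
    bucket.append(item)
    return bucket


def append_item_safe(item, bucket=None):
    # Common workaround: use None as a sentinel and build a fresh list per call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_item(1)
second = append_item(2)
print(first is second)  # True: both names refer to the one shared default list
print(second)           # [1, 2]

safe_first = append_item_safe(1)
safe_second = append_item_safe(2)
print(safe_second)      # [2]
```

The surprise comes from the same "names refer to objects" rule discussed in this thread: the default value is an object that outlives any single call.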
k
e.g. “Why you can modify a list passed to a function and not a non-container type”; “Why does setting a variable equal to a list / object type make for a reference to the list / object type, whereas ints/floats/strs don’t have this behavior?” In general you explain this in intro by describing copying of values and treating container/object types as references, but in Python that’s not actually how it works. These are not SUPER major issues, but any attempt to explain correctly leads to more complexity than an intro student can normally handle. Python’s a bit messy unless you just suspend disbelief and pretend things work a certain way without explaining why. I’d like a language that doesn’t hide things like that quite as much, even if arguably it’s not a huge issue for intro. I do think a simplified C is slightly better because the copying semantics is pretty well-defined vs Python’s reference counting stuff. Good talk and page on how Python variables work
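A small sketch of the two questions above, assuming nothing beyond plain Python; the helper names `grow`, `replace`, and `bump` are made up for illustration. The point it tries to show is that lists and ints follow the same rule (names are bound to objects), and the visible difference comes from mutating an object versus rebinding a name.

```python
def grow(items):
    # Mutates the list object that the caller's name also refers to.
    items.append(99)


def replace(items):
    # Rebinds the local name only; the caller's list is untouched.
    items = [0]
    return items


def bump(n):
    # Ints are immutable, so the only option is rebinding the local name.
    n = n + 1
    return n


nums = [1, 2]
grow(nums)
replace(nums)
print(nums)       # [1, 2, 99]

count = 5
bump(count)
print(count)      # 5

alias = nums      # no copy: two names for one list
alias.append(3)
print(nums)       # [1, 2, 99, 3]

a = 10
b = a             # also no copy: two names for one int object
b += 1            # rebinds b to a different int; a is unaffected
print(a, b)       # 10 11
```

So the intro-level story of "containers are references, numbers are copied" predicts the right output here, even though it is not what the interpreter actually does.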
m
C++ is value oriented and low level. Passing a container into a function creates a copy of that container (unless it's passed by pointer or reference). So the "reference vs value" gotcha can be mostly avoided while teaching. There is a sea of other gotchas in the language though. IMO the value vs reference may not be such a big issue if explained directly through exercises. A couple of tasks where the student has to directly experience how values and references behave should be enough to form a correct mental model. May be easier than choosing a different language altogether.
j
does https://pyret.org/ check any of your boxes?
k
When I started advocating and teaching Python to scientists, around 1996, I tried to avoid the hard parts indeed. But not for long. After a few years, I covered the memory model in the very first session, to get that out of the way. It worked well, and I think it was helpful in particular for the many participants who grew up with Fortran. That said, I never covered the more subtle gotchas of Python, such as the messy scoping rules. I am ambivalent about teaching languages that avoid thorny issues but don't aim to be usable in real life. Whatever language is used for teaching should also permit learners to solve real problems that matter to them (though not necessarily to get a job in the software industry).
t
What isn't a gotcha in Python? lol. Cyclic imports, WTF is a `__init__.py`, imports are executing. `setup.py` is executed. package resolution. Nothing can be trusted due to monkey patching (which I like BTW). `from {foo} import` vs. `import foo` import-by-reference vs import-by-value subtleties. module unloading. white space. Forgetting an f in a multi-line f string (I caused a prod outage coz of that, it is also unlintable). Can't put functions in a string interpolation. Single line lambdas only (?!). the GIL. async. multi-processing. the list is endless with that language. The default value being persistent in args is a steady cause of bugs which we can at least lint around. The memory model is not my main source of problems though.
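Two of the items in this list can be sketched in a few lines. This is only an illustration under stated assumptions: the module name `demo_settings` and its `level` attribute are invented here and built in memory so the snippet is self-contained; a real project would have a file on disk instead.

```python
import sys
import types

# Build a throwaway module in memory (hypothetical name, for illustration only).
demo_settings = types.ModuleType("demo_settings")
demo_settings.level = 1
sys.modules["demo_settings"] = demo_settings

import demo_settings as settings   # binds a name to the module object
from demo_settings import level    # binds a name to the current value of `level`

demo_settings.level = 2            # rebinding the attribute on the module

print(settings.level)  # 2: attribute lookup goes through the module each time
print(level)           # 1: this name still refers to the old object

# Forgetting the f prefix on one piece of a multi-line f-string:
user = "ada"
msg = (
    f"hello {user}, "
    "your id is {user}"   # plain string: the braces are kept literally
)
print(msg)  # hello ada, your id is {user}
```

Neither case raises an error, which is part of why they are easy to miss in practice.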
c
You can write C++ in a value-oriented way, but you have to internalize a lot of things about it to know how to do that consistently and effectively, so it’s a poor fit for the pedagogical context here I think… but folks with strong positive experiences using C++ in that value-oriented way as well as a lot of time using Swift (which obviously draws a lot on the “C + ARC” model) are behind Hylo, which might be interesting to poke at here. _Caveat emptor_: I haven’t used it, just read the docs, and it’s still quite early!
k
@Tom Larkworthy Your list contains mostly features that easily become problems in practice when developing large programs. That's not the same as features that require a precise mental model. It's easy to see why the f matters in f-strings, but that doesn't protect you from forgetting it. Good language design should of course avoid both types of issues.
k
I’m thinking also in terms of curriculum. Going from Python in intro straight to Java then C seems like the wrong order because, in a way, Python’s more complicated and unusual than Java, and Java more so than C: in Python everything’s a reference, in Java most things are references and some are values, and in C everything’s a value (and pointers are values that can refer to other values). It’s not a giant issue if you do handwaving and don’t explain how things work, but if you try to explain Python’s model for names/labels, I think it can cause friction when then trying to learn general-purpose things or other languages. The fact that Python is all references/no copying is just a little weird. I like Python for a lot of things, but not for this aspect (and not for its lack of graphics and web support). For me it’s major because stylistically, I like not to handwave.
j
Racket is a great teaching language with loads of pedagogical material available. If you hate s-expressions, then Pyret (mentioned above) and Rhombus are very good choices. These languages have all been designed by people with decades of experience teaching programming and researching what works for language learners. https://rhombus-lang.org
k
btw I don’t have a way to change the language since I don’t design the university curriculum. It’s more me thinking out-loud. I saw racket before, it’s cool.
x
I would still recommend C but without the stdlib ... perhaps an alternate stdlib library.
k
That’s what CS50 does
I’d do that and maybe provide wrappers around regular stdlib so people could check the innards if they wanted.
x
Perhaps add a defer as well. I like D .... it's just that the problem with D is C library people write horrible APIs which abuse the C preprocessor. I hate the C preprocessor .... people do some scary stuff with it, which is always intimidating for beginners.
k
Have you seen Odin?
x
Yeah all of them have the .... need C to do anything useful problem.
except go
k
It’s basically what we’re talking about. Better allocator support, better imports, built-in allocator context, defer. C’s worst offense to me is the includes and headers
Actually I think Odin’s pretty good.
x
Zig / Odin change the syntax way too much. At best I would be happy with just changes in the declaration syntax
main(argc :int, argv :char[]) :: int
but they change a whole lot. D is definitely more consistent.
objective C is really not a bad deal only issue with it is ... lack of win / linux support
k
I think C’s syntax is okay except the square brackets in declaring an array and the includes / headers are BLA. But yeah. A C with a custom course stdlib would be enough. Even if the allocation / deallocation strategy is not super efficient, having something automatic in-place could help. Just a big arena. Or a Python with less magic.
x
C + Lua is a good combo, as the World of Warcraft players might pay more attention in class I guess 😁
k
Realistically just C with classes (barebones C++ with methods and basic templates)
k
Boosting @Jack Rusher’s recommendation for Racket. It has not only extensive teaching experience behind it, but it is also very usable in real life. Not for all domains, of course. Nothing is. But it's not just a toy language that you need to move on from rapidly.
s
Why not Smalltalk?
k
We’re in university territory here. They only do python, java, and c-likes. So I’m sort of compromising. I imagine a version of python with more C-like behavior, or just using a variant of C. I think why not because it behaves too differently and at a too high level from others, but that doesn’t mean smalltalk isn’t cool. It is cool.
s
How about ES6 JavaScript then?
k
The problem feels overconstrained as stated. You want a good language but also need it to run the gauntlet of a large bureaucracy like a university. Any attempt to improve on a mainstream language will by definition not be mainstream, at least at the start. Someone has to do the work to get it approved.
k
True. This all started with me being a bit frustrated at Python for hiding too many things. Among the existing languages, to me, some kind of constrained C makes sense. If there were a new language (insert “let’s make a new standard” meme), then yes. Maybe something with a simple simple runtime and smalltalk-esque syntax.
g
Odin (very modern) or assembler for one of the early CPU architectures (PDP-11, Vax, MC6809, NS32016, etc.). I've never used Turing (nor Turing+), but it is a language designed for teaching during the era when garbage collection was considered to be a dirty subject. In particular, I like CSRI's approach towards simplicity, esp. Holt's and Cordy's. Note that Kernighan was an undergrad at UofT and Rob Pike was at least seen at CSRI (I remember being convinced that Pike was an overgrad there, but, can't confirm) (CSRI was in the same small building as Marshall McLuhan's office). FWIW, further on my feelings about Remics - the study of the fundamentals of reprogrammable machines
k
Yep, I’ve been following Odin. Pretty nice.
k
@Steve Dekorte Why not Smalltalk? From my perspective, Smalltalk would probably make a good system for starting with computing, but there isn't much teaching material for it, nor much teaching experience with it. The same argument holds against Odin (just mentioned) and many other recent languages, including languages like Hedy which are designed for teaching. If you adopt any of them, you should be doing research in computing education, not large-scale teaching. Which is fine, I think that needs to be done, but it's another objective. If you pick a language for teaching based on your gut feelings about that language being good for teaching, you are doing vibe teaching. Which is perhaps the next fashion, but it's not something I'd recommend based on my own (limited, of course) teaching experience.