# thinking-together
s
Would folks here agree that what most developers spend most of their time on is (roughly speaking) rewriting code that’s been written before instead of implementing logic truly unique to their application?
👍 8
n
That matches my experience
e
I would agree -- with the added caveat that I don't think that is unique to programming, though.
💯 1
s
If so, then are the biggest productivity gains (by a very wide margin) to be made in finding ways to increase reuse? Shouldn’t increasing reuse be the top priority of any FoC project?
w
yup.. though the problem with that 90% of code that's kinda similar to what someone else has done before is that it wasn't quite in the right form for the current application to re-use
s
@Wouter Right. Would finding good abstractions (and design constraints to adopt) solve this problem? For example, HyperCard and spreadsheets are abstractions for a class of problems. They don’t do everything, but for their domains, they eliminate a vast amount of what would otherwise be perpetually reimplemented code.
What I’m wondering about is whether FoC projects should be focused on more easily creating/debugging/verifying code, or more easily reusing code, as these may be very different (and perhaps often conflicting) goals.
w
I don't think so.. the idea of perfectly reusable code is a pipe dream: the more generic you make it to cover all possible re-use cases, the less suitable it becomes for any one use, almost by definition. The design space for even simple algorithms is vast in terms of how it would work best for a particular use.
Remember, OO was once meant to be the savior of re-use, and it didn't work well.. nowadays the best kind of re-use we have is called "github repo" 😉
people will only re-use if the benefit of doing so is really vast (large functionality gained to surface area ratio), so by definition only relatively large things make sense to reuse
👍 1
n
Yep, two independent libraries won't magically work together. You have to write glue code between them. For simple things it's just easier to do the thing yourself instead of learning two arbitrary collections of concepts. And for some reason, ~all of the nightmare-level technical challenges I have ever encountered have been related to 3rd-party code.
s
@Wouter “OO was one day meant to be the savior of re-use and it didn’t work well..” Do you feel you’d be more productive reimplementing OO UI frameworks than reusing them?
@Wouter “people will only re-use if the benefit of doing so is really vast…” Even though most of what we do is rewrite existing code, the amount of code we write is still tiny compared to the amount of code we reuse. How many projects these days start by writing their own compiler, editor, OS, network stack, windowing system, UI frameworks, etc.? Our apps are standing on top of a tower of abstraction giants.
k
Sounds like reuse is already happening, then, by your definition? So if we're still dissatisfied, maybe the problem lies elsewhere? Is there any reason to suppose that "doing more reuse" or "doing it better" is the solution? (I tend to be sour on reuse: https://futureofcoding.slack.com/archives/C5T9GPWFL/p1566593111421000?thread_ts=1566430717.339900&cid=C5T9GPWFL. But I think you may mean something different by the term than I did in that thread.)
n
Let's just say that I am much more productive with something like F# Suave, which is a fairly lightweight FP library: a web application is just a function that takes a request and returns a response. Sadly I usually have to work with C# / ASP.NET, and I honestly can't remember what that mammoth actually does that is a net positive for productivity.
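To make that concrete, here's a minimal sketch of the "app is just a function" idea, written using Python's WSGI convention (not actual Suave code; Suave expresses the same shape in F#, where a WebPart is roughly HttpContext -> Async<HttpContext option>):
```python
# A web app as "just a function": request in, response out.
# Sketched with Python's WSGI convention, which is the same idea.
def app(environ, start_response):
    # `environ` describes the request; the returned iterable is the body.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"No framework, just a function.\n"]

# Runnable with the stdlib reference server:
# from wsgiref.simple_server import make_server
# make_server("localhost", 8000, app).serve_forever()
```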
The definition of a "framework" is that you have a black box that gives random output without ever calling user code. Sorry, I just had one of those days at work again...
t
We can start by collecting the problems: "the more generic you make it to cover all possible re-use cases the less suitable it becomes for any one use"; "Yep two independent pieces of libraries won't magically work together."; "so by definition only relatively large things make sense to reuse". I will add my observations here: 1. To support reuse, we tend to add more runtime customization here and there, which makes reasoning about the system's runtime behavior even harder. 2. The static #ifdef macro approach has severe readability issues, and there is no type system there to regulate its power.
s
@Kartik Agaram “Is there any reason to suppose that “doing more reuse” or “doing it better” is the solution?” Take any app you’ve written and consider how much time it would take if you had to write the entire sw stack it sits on. How does the leverage you got from that reuse compare to the leverage you get from using your favorite language, debugger, editor, etc.? If it makes the latter look like a rounding error, why do we tend to focus on the latter?
The primary value of programming languages themselves could be viewed as enabling one to easily reuse common coding patterns. Maybe we forget this because it’s invisible, or we never experienced having to implement those patterns (e.g. stacks, calling conventions, memory management discipline, etc.) manually, so we don’t realize how much code we are effectively reusing instead of rewriting.
👍 1
k
It's not clear where the disconnect is, so 95% of this will seem obvious. I just don't know which 95%.
• Yes, I can do my work faster when more of it is done for me by someone else. Often it's not done for me. Or it's not quite done with me. Or it comes bundled with something I don't want. Garbage collection. Or ad tracking. Or just implications. It's not easy to separate out the leverage from the burdensome aspects.
• Past performance is no guarantee of future success. The current stack does a lot. But perhaps it's a local optimum. Perhaps doing better requires back-tracking: http://worrydream.com/#!/Links2013 (search for 'Reading Tip #3')
• It's not obvious that 'ease' or 'speed' is the right metric. The entire edifice of software can be seen as a monument to the cult of performance/efficiency. If that goal is incomplete, reuse becomes less valuable. In particular, some things may require more up-front investment but have less of a ceiling in capabilities.
• I'm not building a programming language. That seems more like r/ProgrammingLanguages. Most people here don't seem to be trying to plug into such a well-defined part of the jigsaw puzzle.
• Your distinction between "entire sw stack" and "favorite language, debugger, editor, etc." seems invalid, because the entire software stack is made up of languages, editors, debuggers, etc. Every new thing anybody tries to create will seem like rounding error compared to everything that exists. But if everyone gave up because of that, nothing would exist.
s
@Kartik Agaram “Every new thing anybody tries to create will seem like rounding error compared to everything that exists. But if everyone gave up because of that, nothing would exist.” My point was that we should be encouraged, not discouraged, by the power of reuse. It’s the most effective tool we have, but we don’t focus on it. I know I’ve spent a lot of time on cool programming language features, and while I think there’s some minimum of language features we need to build on, I’ve come to see frameworks focused on easy reuse to be orders of magnitude more important wrt increasing productivity.
d
Consider the hundreds of JSON libraries that exist, written in just about every active programming language. Why does every language need its own JSON library? Why does C++ need 22 JSON libraries? http://json.org/
👍 2
k
@Steve Dekorte What do you think are the highest-leverage places to work on?
@Doug Moen I actually think having lots of JSON libraries is ok. Good grief, that's one of the few places where the world is working well. The last thing we need is some ridiculous JSON library monopoly situation.
👍 1
d
There's a debate on the OpenSCAD mailing list right now about whether or not reusable libraries are a good thing. On the anti-reuse side: 'OpenSCAD is a simple, powerful language, and I know the entire language and its idioms. It's easier for me to just write the code I need, than to learn how to use somebody else's libraries.'
s
@Doug Moen @Kartik Agaram I agree with both of you on JSON libraries. Ideally, we could run components as actors in their own VMs with a messaging API so they would be more platform and language agnostic. But whatever the number of JSON libraries you can find on github, it’s worth considering how many more there would be without reuse.
@Kartik Agaram “What do you think are the highest-leverage places to work on?” On whatever we find people are spending the most time on while trying to get machines to do what they want, as seen from a high-level perspective.
d
I don't actually use a JSON library in my Curv project. I guess this was one of those cases where it was pretty easy to just write the code I needed in pure C++, and where reviewing the 22 existing JSON libraries, selecting one, and writing the necessary glue code, was probably going to take more time and result in more code bloat than the approach I actually took.
☝️ 1
s
Here’s an example of what I mean by high level: If you look at a typical programmer building a website, and watch what they do on the micro level, you might say we can increase productivity by making their editor have more shortcuts, better searching, debugging, etc. But if you look at what they spent their time on from a macro level, you might find they mostly spent it rewriting things that have been done by every other web site developer - like writing a user database. So why focus on making micro tasks faster, when it may be possible to avoid almost all of them by finding ways to stop rewriting the macro components?
👍 1
What if the number of components we need to build most applications really isn’t that large?
k
Say you come up with a minimal set of (ugh) components. And somebody then puts some of them together into a function or something. Is that a failure of your set? Something you should have anticipated? Sometimes it's faster to make it yourself rather than do a thorough scholarly survey of past work. Sometimes we make something and insufficiently market it, so others end up making the same thing. Both these outcomes seem.. fine?.. to me. It just doesn't seem very important.
The problem is that today we seem screwed whether we build it or buy it. That seems like the important thing to me.
s
Consider how things were made before industrialization. Every artifact was hand crafted, custom, ornate, and usually incompatible with any similar artifact. Not unlike the way we build various levels of sw today.
At some point, it became clear that reliability and cost were more valued than customization and ornamentation.
Software has also been following this path. Projects used to make their own hardware, then hw standards emerged but people wrote their own OSes. Then OS standards emerged but people rolled their own UIs. Then Windowing systems and UI frameworks emerged. See the pattern?
k
"Reliability and cost more valued than customization and ornamentation." How's that going? Do you feel like you live in a world where reliability is valued more than ornamentation? The Slack app on my phone ate the first iteration of this response because programmers somewhere decided they needed rounder corners than a browser textarea could provide. Yes, I do see the pattern of what mainstream tech is doing. Like Ian Malcolm in Jurassic Park, I question whether it should. As I've said elsewhere: "The programs we write are already boundless, factory-like desire amplifiers at run-time. Life can be pretty good even if the factories themselves need to be created in a bespoke, one-off manner." (https://futureofcoding.slack.com/archives/C5T9GPWFL/p1566611906422000?thread_ts=1566430717.339900&cid=C5T9GPWFL) The analogies you make with history are obviously persuasive to many people, and you're all better programmers than me and will likely have a bigger influence on the future. But I think all of you are cargo-culting the industrial revolution in a new setting where it doesn't really make sense in the long term. Premature automation, the only worse sin than premature optimization.
w
Whhat aree you talking aboout? Thhis keeybooard works greeat!
😂 2
e
Whenever the conversation moves into interchangeable parts, my ears perk up. The first big win in software reuse/parts was the FORTRAN scientific libraries for mainframes in the '60s and '70s. There were about 1000 functions you could use, open source, that ran on a broad set of compilers. These projects were typically the output of someone's master's or PhD thesis, and contributed to FORTRAN's dominance in the scientific world.
The next big re-use ecosystem was VB6. As Bill Gates' pet project inside Microsoft (BASIC was the founding product of MS), it was given significant resources and bright people, and VB6, until MS dismantled it with their Java-envy .NET project, had a thriving marketplace, with plug-ins for all sorts of user interface components, many of which were sold under license; there were stores where you could buy and download components that worked perfectly in that very uniform environment, where all the components were controlled by simple tables of parameters. I have met many a programmer who built incredibly rich user interfaces and deep products in VB6.
@Steve Dekorte makes some great points: that programs do last for decades, and that maintenance costs far outweigh the cost of construction. That was also true for the Golden Gate Bridge, by the way; the total cumulative cost of running the bridge is probably over 20 times the construction cost. I would say that software has a similar ratio over the up-to-25-year lifespan of programs used in business. If you make software super easy to build, then you don't care about re-use as much.
There are many ways to improve programming, many of which don't conflict with each other: 1) simplifying the debugging process with reversible computing, 2) creating a better standard for data interchange than the idiotic JSON, which is not a binary format and is therefore crippled and inefficient, 3) making it so client products can send a "black box" recording to the developers so that bugs are 100% repeatable, instead of the current mess where most bug reports in big companies are effectively ignored until they become statistically significant, 4) making it easier for people to understand other people's code, and 5) making the interchangeable pieces smaller so there is less duplication of work.
At present, I hardly ever use GitHub as a resource. Most of the code is of unknown quality, poorly documented, and oftentimes the dependency chain is a dealbreaker. Eliminating dependencies would help a lot with interchangeability, frankly.
f
Interesting discussion, thanks! When talking about reuse, it's important to distinguish between diversity and duplication. For example, there may be two JSON implementations where one is optimized for memory footprint and the other for execution time. It may be impossible to combine both use cases in a single implementation. Requiring a single standard would limit diversity in this case. On the other hand, there may also be implementations that only differ in non-essential details like function names. Combining these would reduce duplication without hurting diversity. IMO, we should strive for as much diversity as possible (and document this!) while reducing duplication where possible. In the JSON example this would probably reduce the number of implementations from 20 to 5 which would make choosing an impl easier and therefore shift the "reuse vs. build yourself" decision towards reuse. 3 of these impls could probably even use the same API, making switching between them trivial. Each of these impls should document alternatives and when to use them. OSS has been great at allowing diversity (e.g. by forking), but it doesn't do well on reducing duplication. We probably need tools and an attitude towards "joining" as the inverse to forking.
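To illustrate the "same API" point with a real pair: Python's stdlib json and the third-party simplejson expose the same loads/dumps interface, so switching is a one-line change (toy sketch):
```python
# Two independent implementations behind one shared API: switching
# between them costs exactly one import line.
import json as parser          # or: import simplejson as parser

data = parser.loads('{"answer": 42}')   # parse
text = parser.dumps(data)               # serialize
print(data["answer"], text)
```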
s
This all sounds so reasonable, diversifying by different goals like speed or memory usage, and yet to me it looks like the main driver for different implementations is “I’d implement/architect this differently.” Reuse is a useful concept in theory, but I think the fact that we’re not getting anywhere near it has more to do with us being unable to agree on what makes a “good” implementation, let alone what an API should look like.
f
@Stefan If it's not clear which implementation / architecture is better, it's valuable to have both. Time will probably tell which is superior. But even then, these implementations could still share other aspects such as spec, tests, API, ... Situations like these might also indicate that the abstractions used by these implementations could be improved (a.k.a. Python Zen's "There should be one-- and preferably only one --obvious way to do it.").
w
Friends, I feel optimistic tonight, like I have 119 reasons to be happy about code reuse. That's what I get when I type
bundle list | wc -l
for my current project — minus one, since the first line is "Gems included by the bundle:", but whatever. The problem with anything technical not sucking is that as soon as it doesn't suck, we can forget about it and focus on the remaining steaming pile.
s
If it's not clear which implementation / architecture is better, it's valuable to have both.
This is a very wise approach; I totally agree, and I'll take variety over monopoly. However, sometimes I myself suffer from this “Nah, it's clear to me that this piece of code is sh*t and I know what would be better, so I’d rather rewrite it myself because clearly the developer has no idea what they're doing” attitude, and it seems to me that many other developers have the same problem. And often in those moments, I realize shortly after, “oh, I wrote that myself.” ;-)
☝️ 2
k
Shameless plug for an article I wrote about reusability and its limits: [Reusable vs re-editable code](https://hal.archives-ouvertes.fr/hal-01966146) (based on a quote from Donald Knuth). One important conclusion in that article: reuse makes sense only for code that is significantly bigger or more complex than its specification, except for extremely widely used code such as the JSON reader mentioned here.
☝️ 4
c
(wish I could subthread another lvl)
👍 1
Heavy usage has certainly not prevented the existence of a severe security bug in OpenSSL for two years — Hinsen
k
Thanks @Konrad Hinsen! Now I have a citation for the next time this comes up. When we have two stores making identical pizza we don't call it duplication. We call it competition. When a child runs a lemonade stand we don't call it duplication. We call it education. Reuse and duplication are only meaningful within a zone of ownership. If you and I decide to use two distinct but functionally identical libraries for our separate projects, the world can handle it. It may even be a good thing. There's no problem here. Unless you have OCD.
f
@Konrad Hinsen But once you know the reusable component, the argument is reversed, right? I've used pd.read_csv in a couple of places already and I'd say I know how it works. So for me, it's easier to understand usages of pd.read_csv compared to some hacked-together csv reader. The pd.read_csv function applies too much magic IMO, but that's not the point. Reusable components also mean that there's less to learn and keep in mind for developers, which is a good thing. With five different csv parsers, I'd need to check all of them for details like handling of trailing commas, empty lines, ... A single reusable component that covers 80% of the common use cases only needs to be learned once. Apart from that, pd.read_csv might also contain optimizations that you'd never implement in an ad-hoc solution (at least not error-free).
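For example, one call can cover several of those details at once (these are real pandas parameters; the messy input is made up for illustration):
```python
import io
import pandas as pd

# Deliberately messy CSV: a comment line, a blank line,
# and stray whitespace after separators.
raw = io.StringIO(
    "# produce inventory\n"
    "name, qty\n"
    "\n"
    "apples, 3\n"
    "pears, 5\n"
)

df = pd.read_csv(
    raw,
    comment="#",            # drop comment lines
    skipinitialspace=True,  # strip whitespace after separators
    skip_blank_lines=True,  # the default, shown for emphasis
)
print(df)  # two rows, columns "name" and "qty"
```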
k
@Felix Kohlgrüber it sounds like you have a concrete example. Can you describe it? What are the competing libraries? What prevents you from standardizing on one for yourself? A lot of the argument I hear against duplication is that it's confusing for newcomers to have to wade through all the alternatives. But it sounds like you already have a preferred candidate. I'm not sure what the problem is in that case.
f
@Kartik Agaram I don't know if these examples are helpful. Who needs two identical restaurants (setting capacity aside, which doesn't apply to code reuse anyway)? Reuse is essential in so many areas (standardized screws, network protocols, ...) and actually enables collaboration between different "zones of ownership".
It may even be a good thing.
Can you come up with an advantage?
k
Antifragility! Anytime you try to make something more efficient you're also reducing the slack for responding to catastrophes. If one of the JSON libraries ends up publicizing a security hole (and you know they exist, right now, everywhere) it impacts a fraction of the world. If you had just one the blast radius goes way up.
Part of the problem with a word like 'reuse' is that it leads to us conflating many different things. Standardizing on a common protocol is not the same as asking the whole world to use the exact same bits of code.
f
Antifragility...
But taking this thought a step further would mean that everyone should build everything themselves... Sure, that limits the effect of errors, but it's also massively inefficient. Where would you draw the line?
k
That's a strawman. Nobody's advocating for zero code sharing. Where would I draw the line? The de facto line the world is drawing through a million decisions every day is working fine from a narrow reuse perspective.
No, that's not true. If you run an app and depend on two libraries, and they pull in two JSON libraries, that's now a problem for you. You have duplication in your zone of ownership. There's no easy fix for this today, so we end up asking others to "reuse" so that we don't have this problem. I think the problem is misattributed. The real problem is that it's hard (technically and culturally) to make changes to our dependencies to get their implementation properties in line with our local policies.
Ideas like dependency injection and thorough integration tests help here. If a library can parameterize the JSON library we can plug in the one we want. If the one we want turns out to disagree with the library in some subtle way, an integration test can catch that.
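A rough sketch of that kind of parameterization in Python (orjson below is just one example of a drop-in alternative with a compatible loads):
```python
import json

def parse_user(raw, loads=json.loads):
    """Parse a user record; the JSON parser itself is injected.

    A caller who has standardized on a different JSON library can
    pass its `loads`-compatible function instead of the default.
    """
    return loads(raw)["user"]

# Default stdlib parser:
print(parse_user('{"user": "kartik"}'))

# Plugging in an alternative with the same interface:
#   import orjson
#   parse_user(b'{"user": "kartik"}', loads=orjson.loads)
```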
k
@Felix Kohlgrüber I’d say the magic in read_csv is the key point. It’s what makes its specification so complex that few people will ever understand it in depth, to the point of knowing the right incantations to make it read a particularly weird CSV file.
g
+1 for the reusable/re-editable distinction here: https://futureofcoding.slack.com/archives/C5T9GPWFL/p1568813363251400?thread_ts=1568748321.207400&channel=C5T9GPWFL&message_ts=1568813363.251400 . It seems like the argument here is between abstractions that get general enough that they’re hard to specialize vs specialized tools that can’t get any reuse at all. If we’re spending a lot of time reimplementing old stuff and/or modifying general code to do what we actually want, then the main problem with general code is that it’s hard to edit (because of architecture, indirection, and (i’d argue) the impedance mismatch between text and structure editing). It seems like a code editing tool that made it extremely easy to do the kind of refactoring we might need, like changing something from json to xml, or a string to a list of strings, might enable us to reuse specialized code by generating a new version of it specialized to a particular use case. that’s why i find stuff like the expressions of change project exciting: https://www.expressionsofchange.org/
n
Indeed @Garth Goldwater. Concrete example: you use one of those json libraries. Then you receive a new requirement from a customer that you must keep the relation between data and its position in the json text. Let's say it's for some error messages. I think most json libraries simply lose this information during parsing. So how must reuse be done so that I can still use most of the json library?
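For instance, a quick sketch of the problem with Python's stdlib json (illustrating the loss, not fixing it):
```python
import json

doc = '{"name": "x", "qty": "oops"}'
data = json.loads(doc)
# data == {'name': 'x', 'qty': 'oops'} -- but if validation later decides
# that "qty" should have been a number, there is no way to recover the
# character offset of the bad value from `data`; the parser discarded it.
```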
w
@Niko Autio it comes back to the classic problem that what appears to be a small change in requirement is actually a large change.
d
Two approaches to API design: (1) Weak and bloated. Lots of APIs, one for each use case foreseen by the developer. If your use case is missing, you are out of luck. (2) Powerful and compositional. A small, orthogonal set of operators that can be composed in many ways to solve problems. Use cases are often addressed using idioms involving the composition of two or three operators. Solves more problems using a smaller interface, leading to more code reuse.
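A toy Python sketch of style (2): three small, orthogonal operators composed into a use case the API author never had to foresee:
```python
from functools import reduce

def compose(*fns):
    """Left-to-right function composition."""
    return reduce(lambda f, g: lambda x: g(f(x)), fns)

# Three small, orthogonal operators...
keep   = lambda pred: lambda xs: (x for x in xs if pred(x))
mapped = lambda fn:   lambda xs: (fn(x) for x in xs)
total  = sum

# ...composed into an idiom the designer never had to foresee:
sum_of_even_squares = compose(
    keep(lambda n: n % 2 == 0),
    mapped(lambda n: n * n),
    total,
)
print(sum_of_even_squares(range(10)))  # 0 + 4 + 16 + 36 + 64 = 120
```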
👍 1
s
@Doug Moen Well said. I’d add that making systems compositional tends to involve the trade-off of making some patterns easier to express in exchange for making others more difficult. I suspect the expectation that this isn’t the case is a major barrier to the development of more composable systems.
d
@Steve Dekorte Powerful and compositional APIs are more work, and require a higher level of skill, to design. They can have a higher learning curve, if there are a lot of new idioms to learn. Once you master the more general idioms, there is a lot of knowledge transfer between different domains. Once you obtain a certain level of mastery of compositional programming, that knowledge can shorten the learning curve when you encounter another compositional API for a new domain, when compared to learning another weak-and-bloated API.
s
What other barriers do folks here see to more reusable/ compositional systems?
e
The main problem in re-use is data interchange. The historical first success of interchangeable parts/re-use was the FORTRAN function libraries. In that world, one had simple scalar values, or perhaps an array of 32-bit REAL numbers. Later on, 64-bit versions appeared. So you needed only 2 variants of some function, and you could call the Bessel function, which would take you a year to get right if you wrote it yourself (assuming you knew enough math to derive it, which not one in 1000 programmers can do). But once you get past simple arrays and into data that has a less mathematically regular shape, which is almost every real-world application involving humans, the data regularity goes away. And now you have health records with incompatible coding, or inconsistent encoding. This is where reuse falls apart today. It's all about wildly varying data storage methods which prevent some identical algorithm from being reused. Hence massive code bloat and duplication of effort.
👍 1
s
Who has the incentive to write widely reusable code? What company would approve the costs to make code that would mostly be reused elsewhere? What academic community rewards code reuse as it does paper references? Which developers are willing to pay enough for libraries to cover their development costs?
e
For decades the academic community put each master's and PhD candidate on a reusable library, and gradually built up a lot of great stuff. My bachelor's thesis was on braille translation by computer, and I wrote a FORTRAN program that could be used as a subroutine to take raw text and convert it into proper braille. That code would still run today. A whole sub-category of the gaming industry can be traced back to the PhD thesis that produced the inverse kinematics library called Box2D, which allowed Angry Birds to be written by normal mortals. Inverse kinematics is something not even 1 in 100,000 programmers could build; it requires clever programming to do it efficiently. And every single modern OS relies on font rendering, and I would wager not 100 people working today know the TrueType language on which it is based. Everyone thinks fonts are pictures of letters, but they are not; each letter is a little program, and once, in my labeling software product, I encountered a crash bug in some MS fonts.
But today the academic community is for the most part not making interchangeable parts, and this is indeed a problem, because companies keep a lot of their code private, except when they are trying to de-monetize an area so as to suppress competitors who could gain a foothold with some economic viability. Most of the open source work that the big boys do is not born of altruism but of a recognition that by such practices they can keep young sprouts from growing to any size that would threaten them. JetBrains is among a handful of tool companies that has crossed the 500-employee mark, which to me is the sign of a strong, established firm with a solid customer base that can support that kind of continuity.
t
Black-box reuse is the only way to make money; however, the parametric black box might be the wrong solution to the problem.