# thinking-together
k
A story currently much debated in biology: "Scientists rename human genes to stop Microsoft Excel from misreading them as dates". The problem of Excel converting some gene names to dates (example: MARCH1) has been discussed for a while already. Biologists had adopted Excel believing it to be a general-purpose computational tool, but then discovered that Microsoft intended it to be a domain-specific tool for business data, a detail it did not stress much in its communication with the public. Several published genome studies already contained mistakes. So... what to do? The feature cannot be turned off, Microsoft didn't really care because biologists are not their main customer base, and re-training biologists around the world to use another tool wasn't doable. So they decided to change the names of the genes! This isn't an easy decision, and it will be costly as well, adding complexity to future bioinformatics software and confusion among biologists.
🤦🏻 7
👍 3
s
Are we firmly in the "then the tools shape us" phase of computing then? I saw this story and am.. amazed. Not necessarily saddened, but.. just very fascinated. It reminded me of the cuneiform scripts, which look that way (wedge-shaped straight lines) because that is how you could write on clay tablets with a stylus. Also of the many South Indian scripts which look that way (very curvy) because they were written on dried leaves, where straight lines would easily produce tears.
❀ 4
k
Staying with McLuhan: If Excel is the medium, what is then the message?
👍 1
😂 2
f
"Yes, a large number of non-programmer end-users CAN in fact write simple (or sometimes hopelessly complex) programs. But be careful what you wish for."
k
From my point of view as a computational scientist (though not in genomics, and not using Excel), I'd go for another message: don't use software for research if its authors don't explicitly support research, and in particular your research domain (unless of course you are competent enough to judge the suitability of the software yourself).
t
This is a fascinating story. It goes to show how in a world of software there is no such thing as a non-technical person. There are only people who understand that world and can navigate it, and those who cannot. It follows that it is imperative to make the world navigable by many more people.
👍 1
☝ 3
z
Excel is established enough that they can afford to ignore a huge base of users like this. I'm sure their product team is a lil embarrassed right now though. But in an alternate universe where Excel was designed as a technical tool for technical users (think of it as an IDE), this wouldn't have happened.
@Tudor Girba agreed, all users are technical to some extent. I think for Excel and many FoC tools, a big challenge is designing so that all types of user can get just as technical as they need to in order to solve the problem at hand, while keeping the next depth of technicality/complexity just around the corner so casual folks aren't overwhelmed.
s
@Konrad Hinsen In this case, if there is no product that explicitly supports research and also has the desired features of Excel, what would scientists do?
k
Have a suitable tool designed and implemented for them by specialists. Using more generic bricks (GUI etc.), why not. The main point is that the core functionality, i.e. what happens to the data, is agreed on between developers and users and written down in some form of contract. Cost is really not an issue here. Genomics requires a lot of resources for data acquisition (mainly sequencing), so it's worth spending a fraction of that money to ensure that the data is exploited to its potential.
👍 1
t
Yes!
c
We've been in the "tools shape us" stage for a while. For example, the band Chvrches spells it like that so that it is googleable. It's the same phenomenon as "AAA Plumbing Ltd" in Yellow Pages. I don't think the issue with the gene is as ridiculous as it first seems. If you are going to do this kind of helpful guessing you will sometimes get it wrong. I work with a lot of Excel files of phone numbers which are always interpreted as integers rather than text and converted to scientific notation (e.g. 447123000111 gets turned into 4.47123E+11). Drives me nuts, but from MS's point of view this is definitely the correct way to guess.
👍 4
p
I think geneticists renaming some genes is a lot less concerning than everyone doing research in systems that don't model their problems well because they're familiar, or to avoid licensing costs...
k
@Peter van Hardenberg I certainly agree - there is worse happening than this renaming issue. @Chris Knott Adapting to changes in the world around us is fine. Being shaped by tools as well - it's inevitable anyway. The issues here are: (1) This is a partial bug fix, not an adaptation. Old and new gene names will coexist for many years and cause a lot of trouble. (2) Similar issues can happen again at any time, if Microsoft decides to introduce more magic into Excel. It's OK to adapt to a changing world, but you want to have a say in how these changes happen when your core business is concerned.
i
@Chris Knott great example with plumbing and Chvrches, didn't know that about them! I feel it's like an 80/20 rule - this type of parsing satisfies 80% of the people so they made it the default - I feel like when pasting/importing, there should be an intermediate mapping/"what is this data" step. I don't like it when tools assume without confirming - always confirm before assuming. Like autocomplete doesn't immediately fill your code out but requires you to press tab. It might be annoying if you do it constantly, but a "remember for this sheet" checkbox would solve that.
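A "confirm before assuming" import step could look roughly like the sketch below: the importer refuses to guess and asks for a type per column, and the confirmed mapping can be saved and passed in again, which plays the role of the "remember for this sheet" checkbox. Everything here (`import_csv`, `UnconfirmedColumn`, the `CONVERTERS` table and its type names) is hypothetical and only illustrates the idea; it is not how any existing spreadsheet works.

```python
import csv
import io

# Hypothetical set of column types a user can confirm.
CONVERTERS = {"text": str, "integer": int, "number": float}


class UnconfirmedColumn(Exception):
    """Raised when a column has no user-confirmed type."""


def import_csv(csv_text: str, confirmed: dict) -> list:
    """Convert each column using only the types the user has confirmed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [name for name in (reader.fieldnames or []) if name not in confirmed]
    if missing:
        # Refuse to guess: the caller must ask the user what these columns are.
        raise UnconfirmedColumn(f"please confirm a type for: {', '.join(missing)}")
    return [
        {name: CONVERTERS[confirmed[name]](value) for name, value in row.items()}
        for row in reader
    ]


if __name__ == "__main__":
    sheet = "gene,count\nMARCH1,12\n"
    remembered = {"gene": "text", "count": "integer"}  # saved once per sheet
    print(import_csv(sheet, remembered))
```

The design choice is that a missing confirmation is an error rather than a silent default, so a gene symbol is never converted unless someone explicitly declared its column to be something other than text.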