As a programmer, you write for two very different kinds of readers. One is the rigid computer platform; the other is the human maintainers of your code. For the former we have quite conclusive guidelines on what works, but the latter is consistent only as a source of disagreement and uncertainty. The guideline typically quoted is “Always code as if the person who ends up maintaining your code is a violent psychopath who knows where you live.”, while most people end up writing as if the person maintaining the code is themselves.

Neither approach is particularly good: I can’t imagine code written for violent psychopaths would really be that great to maintain. At the other end, a lot of coders optimize for their own readability and argue the finer points of formatting from that point of view rather than that of the actual readers, missing the point that readability is ultimately decided by the reader, not the writer.

I certainly catch myself doing this quite often.

In the Perl world, this is probably an even bigger issue than in other languages. There-Is-More-Than-One-Way-To-Do-It is still one of Perl’s big strengths, but certainly not all the ways are equally good, which is why Perl Best Practices is now a staple of any Perl-wielding office. The contrasting Python, with its “there should be one-- and preferably only one --obvious way to do it”, seems to be able to get away with 19 simple statements to define the pythonic best practice.

Now, anyone can have an opinion, particularly since research on code readability is still quite lacking. We just don’t know for certain yet what makes people get code. However, cognitive psychologists have been interested for several decades in how people generally organize large data structures in their brains, and some of this work has produced neat practical applications for coding, such as Furnas’ paper on ‘Fish-eye’ views (1986). (This is basically about IDEs with collapsing code branches, but now you know why that’s nice, how to do it right and who thought of it first.)

Research on general language comprehension, however, is a massive field. In the August 2000 issue of the ACM Journal of Computer Documentation, George R. Klare provides some clear-minded, research-based advice for communicators. It is a good starting point for thinking about the readability of normal text, and it might apply to programming as well.

He specifies four purposes of readability:

  • Reading speed and efficiency.
  • Reader judgment.
  • Readership.
  • Comprehension, Learning and Retention.

Of these, only the last is of any big interest for programmers; you don’t typically care about how fast the maintainer reads your code. You might care about reader judgment if you code to impress your fellow programmers, but that is another chapter. Readership concerns how the size of the readership may be a function of the simplicity of the text. That could be a consideration about the skill level of your maintainers, perhaps, but few people write code with the intention that it be read by a large audience.

However, the biggest issue with both text and code is comprehension. This is even more true of programming code, as it is essential that the maintainer is able to create a precise representation of the code in his own mind. He must also be able to understand both what the code actually does and what the intention is, since these don’t always add up. For debugging purposes, it is essential that the reader can separate the soft human intentions from the hard computer operations, so any code must be understood on two levels simultaneously.

Now, how does readability affect comprehension? First, keep in mind that readability in natural language usually refers to choice of words and sentence length, and is typically measured by the level of education necessary to read the text. This might not be appropriate for code readability: level of programming skill might not map to level of code understanding in the same way, and readability in code is often treated as just a matter of indentation and syntax.

But it probably maps fairly closely. This is where research is lacking again, so it is hard to tell.
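To make “measured by level of education” concrete, here is a minimal sketch of one widely used natural-language formula, the Flesch-Kincaid grade level. It is not taken from Klare’s article, and the function name and the example counts are my own assumptions; it simply shows that such scores depend only on sentence length and word length, which is why they may not transfer directly to code.

```python
def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Estimate the US school grade needed to read a text.

    Uses the standard Flesch-Kincaid grade-level formula. The counts are
    supplied by the caller; counting syllables reliably is a separate problem.
    """
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    words_per_sentence = words / sentences
    syllables_per_word = syllables / words
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


# Hypothetical counts for a short passage (illustrative only).
grade = flesch_kincaid_grade(words=100, sentences=5, syllables=150)
print(round(grade, 2))
```

Longer sentences and longer words both push the grade up; nothing in the formula looks at what the text actually means.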

Klare’s main point, however, is that higher readability does not always convert into higher comprehension; the effect is modulated by the situation and by traits of the readers. And that is the important part. He describes four conditions under which it does not work as you would think:

  1. A reader can understand at higher levels than expected if his motivation is high. Also, skill level is a fuzzy concept.
  2. If time is not limited, an increase in readability might not make a difference to comprehension. The more time is limited, the larger the effect of readability.
  3. The greater the reader’s background knowledge on the topic, the less effect readability has. On the other hand, even an expert in one field may prefer higher levels of readability in texts outside his field.
  4. Type and level of motivation might affect comprehension.

Or, to quote his summary:

[..] more readable, written material is likely to produce greater comprehension, learning and retention than less readable only when one or more of the following factors are present: the less readable is much harder than the more readable, and clearly beyond the reader’s usual level; reading time is limited; the reader does not have a large amount of background or experience with the topic being covered; and, the reader has a relatively strong set to learn.

Now, does this particularly increase our understanding of the readability of programming code?

Perhaps not.

However, if we think of readability not just as reading code as natural language, but rather as understanding the semantic concepts, there is an interesting observation to be made. Suppose we try to make a program easier to understand by sticking to basic programming constructs and avoiding more advanced concepts. Klare’s summary of readability shows something that also adds up with other research, namely that this approach to readability doesn’t always increase comprehension.

In a very recent piece of research using Scala, Gilles Dubouchet found that using more compact and advanced functional programming methods rather than basic, typical loop constructs increased comprehension. Although a single piece of research such as Dubouchet’s is too limited to base decisions on, it becomes more interesting when it adds up with prior research on language comprehension.

Together, these indicate that using more advanced methodology can increase comprehension for both original programmers and maintainers, unless they are pressed for time or short on motivation.
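To make the contrast concrete, here is a small sketch of the two styles. It is my own illustration in Python, not material from Dubouchet’s study (which used Scala); the data and function names are hypothetical.

```python
# Hypothetical data: (name, score) pairs, invented for illustration.
scores = [("ann", 72), ("bob", 41), ("cy", 88), ("dee", 55)]


def passed_loop(records, threshold=50):
    """Basic loop style: build the result step by step."""
    result = []
    for name, score in records:
        if score >= threshold:
            result.append(name.upper())
    return result


def passed_compact(records, threshold=50):
    """Compact style: filter and transform in a single expression."""
    return [name.upper() for name, score in records if score >= threshold]


print(passed_loop(scores))     # ['ANN', 'CY', 'DEE']
print(passed_compact(scores))  # ['ANN', 'CY', 'DEE']
```

Both functions do the same thing. The loop uses only the most basic constructs, while the compact version states the intent in one line but requires the reader to know the idiom, which is exactly where background knowledge, time and motivation come in.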

Unlike many other languages, Perl allows you to increase the complexity of your programming across methodologies quite freely. You can start with simple baby-Perl, go through procedural programming, add objects or start playing with functional approaches. You can use Aspect Oriented Programming or add on your own crazy, homespun programming methodology, if you so please. For comprehension and readability purposes, the above research indicates that if you consider your audience and their situation well, going for a higher and more advanced level might not be a disadvantage.

But do keep in mind that the research is still a bit patchy, and this is mostly an argument without empirical data. I’ll make sure I report what I find, as this is just the first of many articles about readability…

4 thoughts on “What is readability, or simple != readable”

  1. Shawn says:

    Interesting, but you assume that reading and comprehension in natural languages are similar to computer languages. You can read a piece of text out loud without understanding it. Now try reading some piece of code out loud. It is very difficult.

    Computer languages express ideas differently and so comprehension would require different efforts. Using readability tools for natural languages on computer languages seems dubious, at best.

  2. admin says:

    Shawn, thanks for your feedback. I tried to be careful not to make too strong assertions about the similarity. That said, I don’t think the difference in reading code out loud means it’s not read in the same way as text, and I think there are quite a few points showing we do actually read code as text: we regard descriptive, natural-language variable names as better, function names are usually English rather than non-descriptive, we use alphanumeric symbols from natural language and so on.

    But you raise a good research question! One can build a good argument for either case, and even if it seems obvious to me that we read code in a similar way to natural language text, I can’t know that for sure. Well-designed experiments and research could help sort that out; until then it’s only going to be speculation. My discussion above is more a minor exploration or a starting point to think more about it; reading more into it is going further than the data can support.

  3. Shawn says:

    I think natural languages and computer languages are very different. For example, you could read a medical paper out loud and though you don’t understand it, a doctor listening to you would. On the other hand, if you read a program out loud, not only would you have difficulty doing so, even an experienced programmer would have trouble following you. Computer languages are more visually oriented and natural ones more aural.

  4. admin says:

    You’ve got a good point – but have you actually tried having a program read to you? Although it sounds pointless, I have to say I’ve never actually tried…

    Cognitive psychology will suggest that both reading and listening ultimately set off the same understanding in your brain anyways. The question is then why listening to programming code is so different from listening to natural languages. If you look at the nature of listening and seeing, the former is immediate (you have to hear the sound exactly when it is generated, with no memory short of echoes) while the latter is not equally time-constrained and allows revisiting, so it can be more of an external memory.

    That suggests that we actually DON’T build complete internal models of the code. Instead, our models are quite incomplete, and we only work on parts while the rest is stored and accessed externally.

    And yes, quite different from natural language comprehension! It actually implies easy, repeated access to the code is essential, and that readability is even more important for code than natural language. Do you agree with my reasoning?
