Facilitating the Spread of Knowledge and Innovation in Professional Software Development

Write for InfoQ


Choose your language

InfoQ Homepage Articles It Ain't Necessarily So: Exploring Type Systems for Verifying Musical Correctness

It Ain't Necessarily So: Exploring Type Systems for Verifying Musical Correctness

Leia em Português

Key Takeaways

  • As an increasing proportion of our culture becomes codified, we need to consider how to authentically express theory and insights from diverse fields in our software.
  • Music is an excellent domain to explore using programming, because it has a long history of notation and is governed by pattern-based rules.
  • Music theorists have been fascinated by what makes music correct for centuries. Now that we can write music using programming languages, we have the opportunity to cross-pollinate lessons in correctness between music and programming.
  • In this article Chris explores how music can be type checked, using the Mezzo Haskell library and Idris’ dependent types as examples.


I’ve given a lot of talks on representing music using programming languages. It’s a popular subject, because it tickles the engineer’s mechanical curiosity as well as plugging-in to music’s universal relatability.

What does it look like? There are musical coding environments for many different programming languages. Two popular ones are TidalCycles for Haskell and Sonic Pi for Ruby on the Raspberry Pi. Here’s an example using my Leipzig composition library. As it’s written in Clojure, there is no type checking.

(def row-row-row-your-boat
  (phrase [3/3 3/3 2/3 1/3 3/3]
          [  0   0   0   1   2]))

->> row-row-row-your-boat
  (canon (simple 4))
  (where :pitch (comp C major))
  (where :time (bpm 90))

Durations and pitches are represented as integer and ratio literals, which can be a little cumbersome. Programming really shines when it comes to musical transformations, for example above where the original melody fragment is given an accompaniment, put in the key of C major and given a tempo of 90 beats per minute.

Once programmers see musical code, one of the most common reactions is to ask whether it’s possible to use type systems to prevent bad music. This is a natural thought. If music can be seen as code, then might musical mistakes be somehow like programming errors? And if that’s the case, then maybe we can apply the techniques we use in writing programs to improve how we write music.

In particular, there is an obvious analogy between the kind of mistakes that type systems prevent and common examples of musical incorrectness. If my programming language can check that I don’t pass a String to a function that expects an Integer, then it should be able to check that I don’t play an F# in a piece written in C major, a key which includes no sharps or flats.

In this article, I will grapple with what makes music correct and how we might encode it in a type system.


Music theorists have been fascinated by what makes music correct for centuries. For most of that time, they have operated through a frame of musical prescriptivism, in other words evaluating music for conformance to a set of rules.

The historically dominant methodology for arriving at these rules can be summarised as follows:

  • Examine the canon
  • Extract regularities
  • Formulate them as sets of rules

We then can make the following claim:

Music that deviates from these rules is incorrect.

For example, a compositional rule is that a melody jumping a major seventh from a C all the way up to a B is not permitted. This approach has produced a lot of valuable insight. There are genuinely interesting regularities in classical music, and generalising our observations as rulesets helps us to discuss musical phenomena. 

What’s more, adherence to the rules is not a bad idea for students learning their craft. I’m quite glad that children learning the violin practice the C major scale and whenever they play an F# their teacher tells them they made a mistake. There are certainly contexts where prescriptive notions of correct and incorrect music are useful.

However, there are two major shortcomings of this way of defining musical correctness.

First, the rules yielded by this process are heavily dependent on what music the theorist regards as canon. A biased dataset leads to biased inferences, and Western classical music is not representative of humanity’s musical experience. Inferring rules based on Bach’s and Mozart’s music might be fascinating, but it doesn’t tell you much about music at large.

This gets especially tricky when you realise that musical genres are often heavily associated with particular cultures and ethnicities, and that by prioritising a certain kind of music you are also inadvertently prioritising a certain kind of person. In particular, academic music theory owes a lot to the study of the Euroclassical tradition and has paid relatively little attention to music deriving from the African diaspora like blues, jazz, rock ‘n’ roll and hip hop. Relying on traditional music theory as a source of correctness can lead to a cultural version of the same problem that has led to facial recognition algorithms working better for white faces than black ones.

The second major shortcoming of building rules by extracting and generalising from existing regularities is that it’s backwards-looking. Music is at its most thrilling when it’s breaking rules. Before Jimmy Page used distortion to artistic effect, engineers considered clipping of the signal during amplification to be a defect. So, even if observed regularities were a useful way to understand the music of today, they’re a terrible way of judging the music of tomorrow.

A good type system for music has to avoid the two traps of prescriptivism. It mustn’t be built on inappropriate or biased notions of what music is correct. But more subtly, in an attempt to make invalid states unrepresentable, it must not make innovative states impossible.


The alternative to prescriptivism is descriptivism. Where prescriptivism sees musical regularities as laws to be followed, descriptivism sees them as emergent patterns from actual practice. A descriptivist way to discover musical rules might look something like this:

  • Examine a body of music
  • Extract tendencies
  • Formulate them as a model

We then can make the following claim:

Music that deviates from the model is unusual.

To do that, we need a good model for musical structures that can represent common/unusual and not just right/wrong. One useful approach is that taken by David Huron in his book Sweet Anticipation. Huron lays out a descriptivist theory that for the purposes of this article can be summarised as three primary claims:

  1. Music is appreciated through statistical learning.
  2. Correct prediction delights you.
  3. Novelty keeps you from getting bored.

In our earlier prescriptivist example, jumping from a C up to a B was considered illegal, but in Huron’s theory it would instead be considered highly unusual. People who have listened to a lot of Western music would learn that when they hear a C, the next note is most likely to be another C, one note lower or one note higher. The effect of jumping up to the higher B would surprise listeners, which they might interpret negatively.

One of the fascinating consequences of Huron’s theory is that composing music is not an exercise in maximising nor in minimising invention. Rather, music dances on the border between order and chaos, and the audience’s appreciation depends on the musician’s ability to balance the shock of the new with the satisfaction of the expected. If you truly want to check a composition’s correctness, you have to make sure that it doesn’t stray too far from convention but also that it does not stick too close.

The following table is a statistical analysis of melodic tendencies. Darker squares represent outcomes that are more likely. The way to read it is to choose a note on the vertical axis, then read across to find the likelihood that your chosen note will be followed by each note on the horizontal axis. For example, 33.53% of the time in this dataset a "Re" is followed by a "Do".

[Click on the image to enlarge it]

The data is adapted from figures from Huron’s book, and is based on an analysis of more than 250,000 tone pairs from German folks songs in a major key. A different corpus, for example of hip hop or rock ‘n roll, would yield different probabilities.

These probabilities can be equivalently represented as bits of entropy, where a 50% chance corresponds to one bit of entropy and a 25% chance corresponds to two bits, etc. In order to avoid division by zero, I’ve modelled transitions that never occur in the dataset as having a 1/100,000 chance. This time, darker squares represent transitions that are more expensive in terms of entropy, or in other words that are expected to occur less commonly. For example, a "Re" followed by a "La" corresponds to approximately six bits of entropy, or a 1/64 chance.

[Click on the image to enlarge it]

The entropy representation makes it more natural to add together the surprise from a series of notes, where each transition from one pitch to the next is given an entropy count and the surprise of the melody as a whole sums each transition together.

Do -> Re -> Mi -> Fa = 2.20 + 2.71 + 5.94 = 10.85 bits
Do -> Ti -> Mi -> Fa = 2.48 + 7.43 + 5.94 = 15.85 bits

This is a very similar approach to a hidden Markov model, but rather than generating strings of notes we use the weighted probabilities to measure how likely a given melody is to occur. 
You might notice that the transition from "Do" to "Ti" is only 2.48 bits, which seems to contradict my earlier claim that a jump from C up to B is highly unusual. Unusual transitions would normally have about 10 bits of entropy. The reason for this inconsistency is that Huron’s data doesn’t distinguish between jumping up six notes from "Do" to "Ti" (very unusual) and stepping down one note from "Do" to a lower "Ti" (quite common). If our data allowed us to separate out the two cases then our model would show "Do", "Ti", "Mi", "Fa" as even more unconventional compared to the rather pedestrian "Do", "Re", "Mi", "Fa".

Typed Notes

At their best, type systems support programmers by guiding them towards valid coding decisions and eliminating the possibility of mistakes. A good example of a musical analogue is solfège. Even if you’ve never had a formal musical education, you’re probably familiar with it from the song Do-Re-Mi from the musical The Sound of Music. Each note of the octave is given its own name and any note outside the key doesn’t have a name at all:

  1. Do (the lowest note of the set)
  2. Re
  3. Mi
  4. Fa
  5. So
  6. La
  7. Ti
  8. Do (this time one octave higher)

This system guides musicians towards the correct notes, by defining a mini-language that makes it impossible to express notes outside the chosen key. There are infinitely-many frequencies between "Do" and "Re", but in solfège they are inexpressible.

This can be easily expressed in code as an algebraic data type:

data Solfege = Do | Re | Mi | Fa | So | La | Ti

This allows us to define musical functions that do not permit the possibility of off-key notes, for example this one which repeats the same note n times:

repeat : Nat -> Solfege -> List Solfege
repeat 0 s = [s]
repeat n s = s :: repeat (n - 1) s

With this definition, repeat 3 Re is an expression that can be easily understood within the musical domain while repeat 3 F# is a type checking error, because F# is not a value of type Solfege.

This gets us a certain degree of type safety. Our type system can help us avoid invalid notes, but it still cannot protect us from invalid combinations of notes. If we’re playing a C major scale we will never play an F#, but we might still play a C then jump up a major seventh to a B, which is forbidden by (conventional) composition rules. This kind of contextual type checking is possible, but it requires a more sophisticated approach.

Typed Transitions

Mezzo, a Haskell library by Dima Samoz, uses dependent types to check music for correctness. Mezzo’s README describes it as a "very strict spell-checker for music," and it’s able to check a variety of compositional rules beyond simple key-safety. In Mezzo, this compiles because the melody C, D, E, F is composed out of legal intervals:

comp = defScore $ start $ melody :| c :| d :| e :| f :>> g

However, when we violate the rule against jumping up from C to B, Mezzo steps in. The following code does not compile:

comp = defScore $ start $ melody :| c :| b :| e :| f :>> g

In a remarkable achievement, Mezzo is even able to pinpoint the problem:

Major sevenths are not permitted in melody: C and B
In the first argument of ‘(:|)’, namely ‘melody :| c :| b’
In the first argument of ‘(:|)’, namely ‘melody :| c :| b :| e’
In the first argument of ‘(:>>)’, namely
    ‘melody :| c :| b :| e :| f_’

This may appear to be a fulfillment of the dream of making incorrect music a type error, but there are still occasionally problems. When Samoz encodes Chopin’s Prelude into Mezzo, he discovered that on a few occasions the composer hadn’t followed the conventional rules:

"The piece could be transcribed in almost its entirety—however, I occasionally had to leave out a few notes as they would create forbidden intervals which Mezzo pointed out."

One might ask who is doing the forbidding here. If Chopin is not allowed to say what is allowed and what is not in Western classical music, who is?

The problem here is that even though Mezzo is able to take into account context, a type judgement is still a binary choice. The notes are either all correct and the piece compiles, or one note is out of place and the piece does not compile. So while Mezzo gives you control over which compositional rules are active, the decision whether or not to use a rule is global. If you switch a rule off because you have a particular artistic effect that breaks ordinary convention, then you abandon the check for other parts of your composition.

Typed Entropy

One way to model the breaking or bending of musical rules in types is to follow Huron and model changes between notes in terms of likelihood. Rather than a certain transition being either permitted or not, the combination can be given an entropy score that represents how surprising that pair of notes would be to a listener.

The following code is written in the dependently-typed functional programming language Idris. It checks that conventional is a melody that starts at "Do," and concludes at "So", and that the path it takes between these notes generates between 8 and 16 bits of entropy. First, we need to define a Melody type that captures our notion of a melody with a certain likelihood or entropy cost. This type has three constructors:

  • Pure, which represents creating a melody of a single note with an upper and lower entropy bound of zero.
  • (>>=), which represents composing two melodies and adding their entropy bounds along with the cost of the transition from the note at end of the first melody to the note at start of the second.
  • Relax, which represents taking a melody and relaxing its entropy bounds.
data Melody : (Solfege, Solfege) -> (Nat, Nat) -> Type where

  Pure  : x -> Melody (x, x) (0, 0)

  (>>=) : Melody (w, x) (low, high) ->
          (() -> Melody (y, z) (low2, high2)) ->
          Melody (w, z) (cost x y + low + low2, cost x y + high + high2)

  Relax : Melody (x, y) (low + dl, high) ->
          Melody (x, y) (low, high + dh)

This type definition is a little complicated, but once we have it we can use Idris’ do notation to sequence a melody and the Idris compiler will track the entropy bounds for us. Here is a conventional melody, where all the transitions have low cost:

conventional : Melody (Do, So) (8, 16)
conventional = Relax $ do
  Pure Do
  Pure Re
  Pure Mi
  Pure Fa
  Pure So

The next melody is unconventional because it jumps up a major seventh, which in Mezzo would have to either be forbidden or the rule turned off entirely. Here, we just expand the complexity bounds of the melody to have between 8 and 24 bits of entropy:

unconventional : Melody (Do, So) (8, 24)
unconventional = Relax $ do
  Pure Do
  Pure Ti
  Pure Mi
  Pure Fa
  Pure So

If instead of Melody (Do, So) (8, 24) we had assigned the unconventional melody the type Melody (Do, So) (8, 16) it would not have compiled!

The novelty of this approach is that it also catches music that is too boring. If the entropy of the melody is insufficient to meet the lower bound, a type error is generated. So if the conventional melody had been given the type Melody (Do, So) (16, 24) it too would not have compiled, as the entropy of that melody falls below the lower bound. This allows us to be faithful to Huron’s insight that listeners have a hard time listening to music that is too surprising or too predictable.

In the degenerate case of using this approach, transitions to other notes in the scale are given an entropy value of zero and transitions to notes outside of the scale are given an arbitrarily large entropy value. In that case we effectively recover the binary type judgement approach, where a note is either legal or illegal with no grey area in between.

Important but Hard to Check

Type checking music is difficult because defining the correctness of music is difficult. If we cannot model our notion of musical correctness precisely and mathematically, we cannot encode it into a type system. Even if we are able to formalise our understanding in an authentic way, it takes some effort combined with powerful type systems to get to the point where the compiler is an adequate judge of our composition. Nonetheless, with some determination and a few simplifying assumptions, it is possible.

Is this just a curiosity? Perhaps. It’s unlikely that the music industry will employ type systems any time soon as a means to prevent compositional errors. But if exploring the computational aspect of music can give us a new way to appreciate something that has so much significance in almost everyone’s life, then that itself justifies the enterprise.

But perhaps just as importantly, music is just one domain where correctness is both important and hard to check. The more that programming is used to automate social and cultural domains, the more frequently situations will occur where an answer is difficult to evaluate, yet has crucial importance to a human being. Consider using a computer program to decide how long to hold a person in prison. The correctness of a sentencing decision involves a complex trade-off of costs and probabilities, and if we cannot follow the system’s reasoning, how do we know that the outcome was just?

Any time that we automate opaque reasoning, we need to ask ourselves, how do we know that this system is correct?

About the Author

Chris Ford began to make music with code to compensate for his poor piano technique. Only later did he realise that programming offers deep insight into musical structures. Over the past few years, Chris has given many talks presenting music theory to programming audiences, covering topics including European classical music, complexity theory, jazz, central African polyrhythms and tuning systems. He is Head of Technology for ThoughtWorks Spain and lives and works in Barcelona.

Rate this Article