What you get from stacking fifths
Useful stuff, but not, strictly speaking, a circle
After a handful of my first cryptic posts in which I remember how to write and how awful the state of my current shortcomings is, let’s do different. I promised music theory, right?
So let’s lay some groundwork first. Take a piece of paper to write or draw on, a bit later, because you’ll surely need it to make additional examples or check something.
Psychoacoustics basics
In simple cases, sounds are usually classifiable as either tonal or noise. Tonal sounds have pitch, pitches being linearly ordered (some being “higher” or “lower” than others). In the simplest case of sine waves, pitch can be put in correspondence with the frequency of that wave, and this correspondence even works for some other cases (see the link above for details).
Pitch can be quantified by using the just-noticeable difference (JND): a distance between two pitches which is noticeable, say, 50% of the time by some group of observers in some circumstances. JNDs measured in different experiments vary somewhat, but still agree to an extent. It turns out that, in the middle part of the audible frequency range, the JND is pretty constant relative to the frequencies which correspond to the pitches. Now treat any two pitches separated by one JND as being at distance 1, and you get a distance function on pitches. It depends on experimental data to an extent, but the general picture is clear: in the midrange, the difference in pitch is proportional to the logarithm of the ratio of the corresponding frequencies, and outside that range, it’s still quite close to that.
It is one of two main reasons to use logarithms when working with intervals—ratios of two frequencies. Importance of intervals arises in melody—as sequential steps up or down in pitch when singing or playing (or imagining/listening)—and in harmony—as distances between notes held together, which correlate with characteristics of the sound it makes.
As it stands, there’s plenty of tonal sounds used in music which share a common timbral property: the spectrum of their sound is mainly multiples of some frequency, called a fundamental (with the overall pitch of the sound in this case almost always corresponding to the pitch of a sine wave with that same frequency). Examples of good harmonic timbres include voice and bowed strings, less so blown pipes and freely vibrating strings. (Blocks of material, plates, tubes and bells are usually quite inharmonic.)
Now, sounds with harmonic timbre do sound nice together when the ratio of their fundamentals is sufficiently close to a “simple” rational: m/n with small m and n. This marks relevance of intervals in harmony.
The above is, though, not the second reason for representing intervals by logarithms of ratios—quite contrary, rational numbers (if they are so good) are more comprehensible if used as is—it’s that in music, we often deal with a scale of pitches1 which can be used in a part of music, and we thus fix the attainable intervals, which can be stacked on each other upwards and downwards, frequency ratios being multiplied and divided. Using logarithms, one may use a simpler picture of adding and subtracting intervals. And while specifying ratios still stays important, usually one would still use additive terminology.
Octave and scales
Of simple ratios other than the trivial unison 1/1 (or 0, additively), the octave 2/1 is the most important (at least in a statistically significant portion of world music). The name, for non-musicians, may come as a misnomer, and frankly it is. We’ll get to the why of it later, if I don’t forget.
The octave is so simple that usually pitches an octave apart sound uncommonly close or even “the same”2. This is called octave equivalence. When one understands this as a mathematical equivalence relation f₁ ~ f₂ :⟺ ∃n ∈ ℤ. f₁ = 2ⁿ f₂, factorizing pitches by this relation gives rise to pitch classes—sets of pitches {…, (1/4) f, (1/2) f, f, 2 f, 4 f, …} related by octaves or, in other words, pitches indistinguishable with respect to transposition by octaves (going a whole number of octaves up or down).
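Here is a minimal sketch of octave equivalence in code. The function names and the 440 Hz reference are assumptions made purely for illustration. A frequency is reduced to its position inside one octave, and two frequencies belong to the same pitch class when their ratio is a whole power of 2.

```python
import math

def pitch_class(freq_hz, ref_hz=440.0):
    """Position within the octave above ref_hz, as a fraction of an octave in [0, 1)."""
    return math.log2(freq_hz / ref_hz) % 1.0

def same_pitch_class(f1, f2, tol=1e-9):
    """True if f1 = 2**n * f2 for some integer n, up to a small tolerance."""
    d = math.log2(f1 / f2) % 1.0
    return min(d, 1.0 - d) < tol

print(same_pitch_class(110.0, 880.0))  # three octaves apart
print(same_pitch_class(440.0, 660.0))  # a ratio of 3/2 apart
print(pitch_class(660.0))              # where 660 Hz sits inside the octave above 440 Hz
```

The tolerance is only there because floating-point logarithms are not exact; the mathematical relation itself has no fuzziness.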
Pitch classes are no longer linearly ordered, but they still have cyclic order: one can still say: pitch classes A, B, C3 go upwards one after another (there exist concrete pitches a < b < c that satisfy that and are no more than octave apart), and pitch classes A, D, C do not (any such a < d < c would be at least an octave apart).
(Now, sometimes we don’t treat octave as an equivalence interval, so in xenharmonic community, a term equave is used. For example, in music utilizing Bohlen—Pierce scales, the equave is 3/1. But in this post, be sure that 2/1 is our only equave4.)
A scale is just a sequence of pitches. Mathematically, it can be infinite in both directions, but physically-applicable scales will be of course finite due to our meager audible range. Here we’ll concern ourselves only with periodic scales: those that repeat in a sense that transposing all pitches of an (infinite) scale by a period interval, we’ll end up with the same scale5. Now, period is not usually an octave, but octave would be its multiple if we’re concerned with pitch equivalence (which we are).
Like with octave equivalence, we can generalize things by considering a scale that is invariant under transposition: say, our 0-th pitch had a frequency of 440 Hz. Now if we consider all transpositions of this scale together, their 0-th pitches would have every positive frequency in the world, but the intervals between m-th and n-th pitches would always be the same if we fix m and n. This kind of scale, which forgets frequencies but still remembers intervals, is a relative scale, the former kind being retconned to be called absolute. We’ll be concerned mostly with relative scales.
Fifth and scales
Also having a funny name with historic roots, there’s fifth 3/2. It is unambiguously the second simplest interval no wider6 than octave, after octave itself. That’s why ancient figures like Pythagoras considered having a plethora of fifths in the scales used.
Let’s look at some scales now. Some of the simplest scales are equal divisions. We take the octave and divide it into N equal parts. A step, an interval between neighboring pitches, would be equal to 2^(1/N) as a ratio; notice it will never ever be rational except for the boring case of N = 1. Such a scale is called Nedo7 (variants: N-edo, N-EDO, Ned2) or, a bit more ambiguously, N-TET (N-tone equal tuning/temperament).
The scale you’re most probably familiar with, as a reader of English, is 12edo: the octave is divided into 12 semitones. It’s an appropriate place to add that for technical and theoretic applications, each semitone is equally divided into 100 cents, denoted, as usual, ¢89. Pitch JND is about 2…20 ¢.
Now, 12edo has fifths, and very good ones: pure (also called just) fifth of 3/2 is ≈702 ¢,10 whereas 12edo contains an interval of 700 ¢ = 7 semitones, above and below of every pitch; with ≈2 ¢ of difference. We’ll call the 12edo fifth 7\12 (with a backslash to differentiate it from ratio 7/12, which sounds nowhere close)11—7 steps of 12edo.
Also let’s look at e. g. 19edo. This scale also has a pretty good fifth 11\19 ≈ 695 ¢. That’s a larger difference of ≈7 ¢. Or 17edo, which has 10\17 ≈ 706 ¢, ≈4 ¢ apart from 3/2 but now up instead of down (which has different psychoacoustic and cultural implications). But otherwise, taking larger edos we can get the error as small as we want, for a price of working with tons of small divisions: for example, a fifth of 311edo has error of ≈0.3 ¢ (the step size being ≈3.9 ¢, so that’s not just the effect of smaller available steps).
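These numbers are easy to re-check. The sketch below converts ratios to cents as 1200 log₂(x) and finds the step of an edo nearest to 3/2. The helper name `best_fifth` is an assumption made for this example, not established terminology.

```python
import math

JUST_FIFTH_CENTS = 1200 * math.log2(3 / 2)  # about 701.955

def best_fifth(n):
    """Return (steps, size in cents, error in cents) for the step of n-edo nearest to 3/2."""
    steps = round(n * math.log2(3 / 2))
    size = 1200 * steps / n
    return steps, size, size - JUST_FIFTH_CENTS

for n in (12, 17, 19, 311):
    steps, size, err = best_fifth(n)
    print(f"{steps}\\{n}: {size:.1f} cents, error {err:+.2f} cents")
```

A negative error means the edo fifth is narrower than 3/2, a positive one means it is wider.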
Pitch classes and scales
An octave-repeating scale can be visualized as its pitch classes arranged on a circle, with arcs between them having lengths proportional to intervals between corresponding pitches. Transposing by an interval here is rotation by a corresponding angle. Rotating the circle one way (usually clockwise) will make all pitch [class]es higher, rotating in the opposite direction will make them lower.
A diagram of this sort for 12edo is just a clock face: 12 points equidistant from each other. Transposing by a semitone 1\12 is rotating by 30° = π/6 = τ/12, a twelfth of a turn.
Diatonic scale
Fifths (any approximate fifths, not just 3/2) got so popular in western music as to get a special kind of scale (and its slight variations) to become very important: the diatonic scale. Its construction is also quite simple: take a pitch class, transpose it a fifth up, add to the collection, and repeat until there are seven pitch classes in total, which will constitute the whole octave-repeating scale.
Neighboring pitch classes of this scale would invariably, due to this construction, have steps of just two different sizes between them; let’s denote the larger of those intervals L and the smaller s. The chosen fifth would be equal to 3 L + s. Then the pattern of steps will be (any rotation of) LLLsLLs12. This allows us to name the pitch classes somewhat uniquely by their place in the pattern of steps, which goes as follows: F (L) G (L) A (L) B (s) C (L) D (L) E (s) F.
So, starting at A, the pattern goes LsLLsLL. Or, starting at C, it goes LLsLLLs. What we get when we single out one of these pitch classes as a thing of special interest and a kind of home is called a mode of the scale. Intervalic structure of a mode can be uniquely characterized by the pattern of steps like above, starting from this home pitch—a tonic—and going upwards.
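The stacking construction itself is short enough to try out. This is a sketch under the assumption that the fifth is given as a whole number of steps of some edo, so that everything stays in integers; `stacked_scale` and `step_pattern` are names invented for the example.

```python
def stacked_scale(fifth_steps, edo, size=7):
    """Pitch classes (in edosteps) obtained by stacking size - 1 fifths on top of 0."""
    return sorted((k * fifth_steps) % edo for k in range(size))

def step_pattern(pitch_classes, edo):
    """Steps between neighbouring pitch classes, written with the letters L and s."""
    n = len(pitch_classes)
    steps = [(pitch_classes[(i + 1) % n] - pitch_classes[i]) % edo for i in range(n)]
    large = max(steps)
    return "".join("L" if step == large else "s" for step in steps)

for fifth_steps, edo in [(7, 12), (10, 17), (11, 19)]:
    print(edo, stacked_scale(fifth_steps, edo), step_pattern(stacked_scale(fifth_steps, edo), edo))
```

The pattern is read upwards from the pitch class the stacking started on, which plays the role of F here. The lettering assumes there really are just two step sizes, which holds for these three fifths.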
Here are the seven diatonic modes and their names:
LLLsLLs — FGABCDEF — Lydian
LLsLLLs — CDEFGABC — Ionian (major)
LLsLLsL — GABCDEFG — Mixolydian
LsLLLsL — DEFGABCD — Dorian
LsLLsLL — ABCDEFGA — Aeolian (minor)
sLLLsLL — EFGABCDE — Phrygian
sLLsLLL — BCDEFGAB — Locrian
In this list, modes go from brightest to darkest, and what that means is this: Lydian has the widest intervals upwards from the tonic, Ionian has one of them narrowed: LLL to LLs, then Mixolydian has 5 L + s narrowed to 4 L + 2 s, and so on. If one tries to darken Locrian, a natural step, while remaining in diatonic world, would be to lower the tonic, and they’d find themselves with Lydian again, with all intervals being widened again.
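The list of modes can be regenerated from two orderings of the same seven letters: by pitch and by fifths. A small sketch, with all constant and function names chosen just for this example:

```python
PITCH_ORDER = "FGABCDE"   # letters in ascending pitch order, starting at F
STEPS = "LLLsLLs"         # the step above F, G, A, B, C, D, E respectively
CHAIN = "FCGDAEB"         # the same letters ordered by fifths
NAMES = ["Lydian", "Ionian", "Mixolydian", "Dorian", "Aeolian", "Phrygian", "Locrian"]

def mode_pattern(tonic):
    """Step pattern going upwards from the given tonic."""
    i = PITCH_ORDER.index(tonic)
    return STEPS[i:] + STEPS[:i]

# Walking the tonic along the chain of fifths visits the modes from brightest to darkest.
modes = {name: mode_pattern(tonic) for name, tonic in zip(NAMES, CHAIN)}
for name, pattern in modes.items():
    print(pattern, name)
```

Since "L" sorts before "s" as a character, the brightest-to-darkest order happens to coincide with plain alphabetical order of the patterns.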
We’ll return to modes after a while, but now do note how our construction of a diatonic scale ensures that there is a fifth up from every pitch except B (and a fifth down from all except F): there is an unbroken chain of fifths F—C—G—D—A—E—B. This chain of fifths is what we’ll grow shortly into the eponymous circle of fifths. Or will try to.
Naming issues
Now it’s finally the time to explain why octave and fifth are named as they are. When talking about modes and notes of pieces of music written in different modes, it ended up convenient to refer to scale degrees13—pitch classes which you get when you go a fixed number of scale steps up (or down) from the tonic.
0th degree is 0 steps of the scale from the tonic, so it is itself;
1st degree (D for C, E for D, etc.) is a step up: s or L up for different modes;
2nd degree (E for C, F for D, etc.) is a 2-step up—either L + s or 2 L up;
and so on, up until the 7th degree which is octave-equivalent to the tonic, and it all repeats.
In general, a k-step is an interval between any m-th and (m + k)-th pitches of the scale—it’s any k contiguous steps of the scale taken together. So, we established that 0-step is the unison 0; that 1-step is one of {s, L}, and 2-step is {L + s, 2 L}. Do check all the possibilities for the rest of k-steps for k from 3 to 6, for good measure.
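If you would rather check your pencil-and-paper answers mechanically, here is a small sketch that collects every k-step of the pattern as a pair (number of L, number of s). The function name is made up for this example.

```python
def k_step_sizes(pattern, k):
    """All distinct k-steps of a cyclic step pattern, as (number of L, number of s)."""
    n = len(pattern)
    sizes = set()
    for start in range(n):
        window = [pattern[(start + i) % n] for i in range(k)]
        sizes.add((window.count("L"), window.count("s")))
    return sorted(sizes, reverse=True)

for k in range(8):
    print(k, k_step_sizes("LLLsLLs", k))
```

Any rotation of the pattern gives the same answer, because the windows wrap around cyclically.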
Now, base-0 indexing is unfortunately a recent find in humanity’s history, or at least in the history of western music. At some point in time, some European musicians counted their scale degrees thus: they started at the tonic and said prima (the first [note]), then they went a step up and said secunda (the second), then tertia, quarta, quinta, sexta, septima and finally octava (the eighth)14. It was this terminology that got either borrowed or translated into many languages, hence the fifth is the 4th degree in the list above and the octave is the 7th. And yes, these degree names are also shared with k-step interval names: an interval going up from the tonic to the fifth (degree, which is the 4th if zero-indexed) is, well, the fifth.
Now, this is not the only thing of concern with names here, and the other thing stems from practicality. Having just seven notes to a scale has obvious shortcomings: if we want to play in different modes having the same tonic (or worse: different modes with different tonics15), we need to notate things and complicate sight reading for musicians. Instead, we can just add more pitches to our scale to support more modes.
In diatonic modes, each k-step interval is either a multiple of octave, or comes in two sizes, like 1-steps s and L do. Smaller variants are termed minor and larger, major. …Well, except for fifths: as they have a special role in constructing the scale, they also are “major”, of our given size, in all modes but the darkest (Locrian), so they are called perfect instead, and the defective Locrian fifth is called diminished. Likewise, a fourth is a fifth’s mirror image (a fifth up is a fourth down, modulo adding or subtracting octaves), so fourths are small in all modes but Lydian, termed perfect too—but the Lydian fourth is augmented.
Notation: it is common to use abbreviated interval names like P1 for perfect unison, m2 for minor second, M3 for major third, A4 for augmented fourth and d5 for diminished fifth.
So, to play every diatonic mode (with the same tonic) using a unified scale, one needs to have notes these intervals up from the tonic: P1, m2, M2, m3, M3, P4, A4, d5, P5, m6, M6, m7, M7 (repeated by adding octaves, of course).
Let’s digress for a bit and ensure we don’t mix up various layers of abstraction. There is a generic interval (a k-step) like third ≡ 2-step, which is a set of more specific intervals {L + s, 2 L}, minor and major thirds, and also there are the most concrete thirds when we take L and s themselves to be concrete: like, if s = 1\12 and L = 2\12—this will give us a diatonic scale embedded in 12edo,—then we have a concrete “12edo minor (or major) third”, though now these same concrete intervals of 300 ¢ and 400 ¢ can have more names.16
Another digression: s here is usually called a semitone (which agrees with the previous example of diatonic in 12edo, where 1\12 is called a semitone) and L is called a tone (again, in 12edo this makes the most sense: 2\12 is twice 1\12; and usually even if L / s is not exactly 2, it’s close enough to warrant the naming). Now also note that small and large variants of intervals always differ by the same amount c := L − s; this interval is named a chromatic semitone17 (then s is a diatonic one, to distinguish).
Back to constructing a multi-modal scale. Let’s say C is the tonic. Then adding DEFGAB gives us M2, M3, P4, P5, M6, M7; CDEFGAB is the major mode, as you most certainly know, and it has the large variants of all generic intervals except the fourth. When adding degrees m2, m3, … from tonic, it would be nice not to invent totally new names for them but to reuse D, E, … we already have. There is a way: let’s denote lowering a pitch (class) X by c as Xb18 (reads “flat”). Then we can add Db (adds m2), Eb (m3), Gb (d5), Ab (m6), Bb (m7). We’re just short of A4 which needs raising by c. We can go the same way and put that down as X#19 (“sharp”), in our case F#. These symbols b and # are called “accidentals” (don’t even ask, there will be no clarifying footnote; maybe take this link).
Now, having C, Db, D, Eb, E, F, F#, Gb, G, Ab, A, Bb, B one can play any diatonic mode with tonic C. But then we may wish to have music that starts in one mode and tonic but then changes them to others, or borrows notes from one mode temporarily. So, we allow flats and sharps generally. Being able to play any mode with any tonic makes our relative pitch class names shaky: what’s a C if there is not always an LLsLLLs pattern of intervals above it? So we make these pitch classes absolute, or as absolute as we can. Now, liberally, A is “a pitch class that contains a pitch near 440 Hz”, C is “a class that contains ≈260 Hz20”.
And also, unrelated to fixing of the pitch classes, our scale theoretically blows up. Now there’s infinitely many pitch classes which aren’t necessarily the same, like C# and Db, G and Abbb. Abbb might look weird, but music can absolutely go weird places. And at least infinitely many pitch classes that are almost surely not the same, like C# and C## and C###.
As alarming as it can feel, it’s not, because each piece of music usually contains only a finite number of notes, and so we can omit almost all of these new ones systematically if we don’t just play them at random. That is, we can extend our small ladder of fifths F—C—G—D—A—E—B which gave us a diatonic scale, and get all of those new pitch classes. That works because, as you can see, L is octave-equivalent to two fifths up (just an octave down), and s is in this manner five fifths down, so c = L − s is expressible as seven fifths up, give or take four octaves. So when we prolong the ladder up from B, we immediately get F#, then C#, G# and so on… and after B# it’s F##, C## etc. Likewise, going to the left from F we get Bb, Eb, Ab and so on.
All of our added pitch classes—in one neat chain of fifths which we can take just a finite part of if we wish. This is the eponymous “what you get from stacking fifths”: you get infinite diatonic modes.
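Naming the pitch classes along the chain is mechanical: every seven fifths, the letters repeat with one more sharp (or one more flat when going down). A sketch, assuming we count positions from F at 0; `chain_name` is a name made up for this example.

```python
CHAIN = "FCGDAEB"

def chain_name(pos):
    """Name of the pitch class pos fifths above F (negative pos means below F)."""
    letter = CHAIN[pos % 7]
    shift = pos // 7  # how many chromatic semitones c have been added (negative: subtracted)
    return letter + ("#" * shift if shift >= 0 else "b" * -shift)

print(" ".join(chain_name(p) for p in range(-7, 15)))
```

Python's floor division and modulo behave consistently for negative numbers, so the flat side of the chain needs no special casing.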
Circle of fifths
This infinite chain is just “abstractly infinite”: nothing in the construction says it won’t actually cycle when we use a concrete fifth to construct a concrete diatonic scale. But the concrete fifth itself may say so: if we take even the just fifth 3/2, we’ll never get back to where we started (modulo octaves) by stacking fifths. We can get very close… but never exactly there, because log₂(3/2) is irrational21.
This case is what’s known as Pythagorean tuning (which is usually truncated to 7 or 12 notes per octave). This happens for “almost every” choice of fifth, if we are measure theorists: stacking almost any interval we won’t get to an integer number of octaves, because almost surely that interval is an irrational multiple of an octave (speaking additively).
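To see how close stacked just fifths come to closing up, here is a numerical sketch (floating point, so it illustrates the irrationality argument and proves nothing). The function name `miss` is invented for the example.

```python
import math

FIFTH = 1200 * math.log2(3 / 2)  # just fifth in cents

def miss(k):
    """Signed distance in cents from k stacked just fifths to the nearest whole octave."""
    cents = (k * FIFTH) % 1200
    return cents if cents <= 600 else cents - 1200

# Print every k that gets closer to a whole number of octaves than all smaller k did.
best = float("inf")
for k in range(1, 54):
    if abs(miss(k)) < best:
        best = abs(miss(k))
        print(f"{k:2d} fifths miss by {miss(k):+.2f} cents")
```

The miss after 12 fifths, about 23.46 ¢, is the interval usually called the Pythagorean comma.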
But there are, of course, plenty of tamer cases which are practically important, so they often prevail. Confining ourselves to an edo and using an approximate fifth from there, one ends up with a circle of fifths. Going 12 fifths up or down in 12edo will get us to a multiple of octave, so B and Cb are enharmonic equivalents: they are the same pitch class; so are C# and Db; or all of E##, F#, Gb and Abbb.
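Whether two names coincide in a given edo can be checked by placing them on the chain of fifths and multiplying by the edo's fifth. A sketch, assuming note names are written as a letter followed by any number of `#` or `b`; all function names are invented for the example.

```python
CHAIN = "FCGDAEB"  # naturals ordered by fifths

def fifths_from_c(name):
    """Position of a note such as 'Abbb' or 'C#' on the chain of fifths, with C at 0."""
    letter, accidentals = name[0], name[1:]
    shift = accidentals.count("#") - accidentals.count("b")
    return CHAIN.index(letter) - 1 + 7 * shift

def edo_step(name, fifth_steps, edo):
    """Pitch class of the note in the given edo, counted in edosteps above C."""
    return (fifths_from_c(name) * fifth_steps) % edo

def enharmonic(a, b, fifth_steps, edo):
    """True if both names land on the same pitch class of the edo."""
    return edo_step(a, fifth_steps, edo) == edo_step(b, fifth_steps, edo)

print(enharmonic("B", "Cb", 7, 12), enharmonic("C#", "Db", 7, 12))     # 12edo, fifth 7\12
print(enharmonic("C#", "Db", 11, 19), enharmonic("G", "Abbb", 11, 19))  # 19edo, fifth 11\19
```

Which names coincide depends on the edo: two names are the same pitch class exactly when their distance along the chain is a multiple of the circle's length.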
In naming diatonic intervals between all these new notes, we use augmented when c is added to a major interval and diminished when c is subtracted from a minor interval. For example, E—F is m2, E—Fb is d2; C—C is P1, Cb—C is A1 (augmented unison)—the last one is like the case of fifths and fourths. If we add/subtract even more c’s, it goes doubly augmented/diminished and so on: B—F is d5, B#—Fb is ddd5.
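The naming rule can be phrased entirely in terms of the chain of fifths: the signed number of fifths between two notes decides both the generic interval and its quality. This is a sketch under two assumptions of mine: intervals are read upwards and reduced to within one octave, and the mapping of fifth counts to qualities (−1…1 perfect, 2…5 major, −5…−2 minor, then one more A or d per extra 7 fifths) follows from c being seven fifths.

```python
CHAIN = "FCGDAEB"

def chain_pos(name):
    """Position of a note name on the chain of fifths, with F at 0."""
    accidentals = name[1:]
    return CHAIN.index(name[0]) + 7 * (accidentals.count("#") - accidentals.count("b"))

def interval_name(lower, upper):
    """Name of the ascending interval from lower up to upper, within one octave."""
    d = chain_pos(upper) - chain_pos(lower)  # signed number of fifths
    number = (4 * d) % 7 + 1                 # 1 = unison, 2 = second, ..., 7 = seventh
    if -1 <= d <= 1:
        quality = "P"
    elif 2 <= d <= 5:
        quality = "M"
    elif -5 <= d <= -2:
        quality = "m"
    elif d >= 6:
        quality = "A" * ((d - 6) // 7 + 1)
    else:
        quality = "d" * ((-d - 6) // 7 + 1)
    return f"{quality}{number}"

for pair in [("E", "F"), ("E", "Fb"), ("C", "C"), ("Cb", "C"), ("B", "F"), ("B#", "Fb")]:
    print(pair, interval_name(*pair))
```

The factor 4 in the generic interval comes from a fifth spanning four scale steps, so d fifths span 4d steps modulo 7.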
Now, circles of fifths of 12edo, 17edo or 19edo have all the notes that edo has to offer, but that’s not always the case. Doubling, tripling etc. the number of octave divisions often gets us the same best approximation of 3/2 (like in 24edo, 36edo, … up to and including 300edo), and in those cases there are several independent circles of pitch classes related by fifths22. And, of course, circles for different edos are of different length, so that the abstract infinite chain of fifths is a notion with its place in the world.
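How many separate circles an edo has is a greatest-common-divisor question. A sketch, assuming "the fifth" of an edo means its step nearest to 3/2; the function names are made up for the example.

```python
import math

def best_fifth_steps(edo):
    """Number of edosteps in the step of the edo nearest to 3/2."""
    return round(edo * math.log2(3 / 2))

def circles_of_fifths(edo):
    """(number of disjoint circles, length of each) for the edo's best fifth."""
    count = math.gcd(best_fifth_steps(edo), edo)
    return count, edo // count

for edo in (12, 17, 19, 24, 36):
    print(edo, circles_of_fifths(edo))

# Largest multiple of 12 (searching up to 12 * 99) whose best fifth is still 700 cents.
largest = max(12 * n for n in range(1, 100) if best_fifth_steps(12 * n) == 7 * n)
print(largest)
```

When the fifth and the edo size share no common factor, there is a single circle that visits every note.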
Now that we’re finally here, what can we do with it? Because of how things are defined, it’s quite instructive.
Here it is, truncated to a healthy amount. Now, as we know, seven consecutive pitch classes in here make up a diatonic scale. Let’s add a sliding window to highlight some.
We can slide this window wherever we wish and get different diatonic scales inside a parent scale, so e. g. it’s quick to check if a subscale is diatonic.
Now, if we want a particular mode, we only need to choose its tonic here. After that, remembering that L (whole tone) is two fifths up, we can enumerate all other scale degrees. We can even forget about how many fifths span a semitone s: if we just go right 2 notes each time, wrapping around when we reach the end of the window, it all works automatically. This is C Ionian/major:
We wrapped around from E to F, effectively going 5 fifths down as is needed to get a semitone between those notes.
Another example, C Aeolian/minor:
It’d be a boon to know which mode we get. And we have it: if the first note in the window is the tonic, it’s the brightest mode, and so on and so on, ending with the darkest mode if the tonic is the last in the window:
You just imagine mode labels are glued to the window and travel with it. So, to look at F Dorian, one positions the window so that F is right in the center and gets the Dorian label, and then we can enumerate the pitch classes of the mode upwards: F, G, (wrap around), Ab, Bb, C, D, (wrap around), Eb.
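The sliding-window procedure translates into code almost word for word. In this sketch the chain position of F is taken as 0 and the mode labels sit on window slots 0 to 6; both conventions, and all the names, are assumptions of the example.

```python
CHAIN = "FCGDAEB"
MODES = ["Lydian", "Ionian", "Mixolydian", "Dorian", "Aeolian", "Phrygian", "Locrian"]

def chain_pos(name):
    """Position of a note name on the chain of fifths, with F at 0."""
    accidentals = name[1:]
    return CHAIN.index(name[0]) + 7 * (accidentals.count("#") - accidentals.count("b"))

def chain_name(pos):
    """Inverse of chain_pos: the note name at a chain position."""
    shift = pos // 7
    return CHAIN[pos % 7] + ("#" * shift if shift >= 0 else "b" * -shift)

def mode_notes(tonic, mode):
    """Pitch classes of the mode, ascending from the tonic."""
    offset = MODES.index(mode)         # the tonic's slot inside the window
    start = chain_pos(tonic) - offset  # chain position of the window's leftmost slot
    # Go right two slots at a time, wrapping around inside the 7-note window.
    return [chain_name(start + (offset + 2 * k) % 7) for k in range(7)]

print(mode_notes("C", "Ionian"))
print(mode_notes("C", "Aeolian"))
print(mode_notes("F", "Dorian"))
```

The wrap-around is the `% 7`: it silently replaces "two fifths up" by "five fifths down" whenever the next note would fall outside the window.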
Now that we’re particularly fond of 12edo, we can look for “nice” subsets of large edos which have 12 notes: we just expand the window size to 12! For example, we take Ab through C# here. What good is this, though?
This is good because in C, C#, D, Eb, E, F, F#, G, Ab, A, Bb, B23, we again get k-step intervals in no more than two sizes. 1-steps are now s and c instead of prior s and L (and c now might be either larger or smaller than s, or, in the case of (12 n)-edo, equal). Then we have 2-steps of 2 s and L, and so on. Again, fifths (now 7-steps) and fourths will be of a “perfect” size in all of the modes but one (again, an augmented fourth in the brightest “chromatic mode” and a diminished fifth in the darkest one).
You can’t pick an arbitrary number of notes and have this property, though, nor can you pick any larger number of notes and have this property without first knowing more about the size of the concrete fifth you used. For example, the first choice is whether the fifth is > 700 ¢ (and so c > s) or is it < 700 ¢ (and so c < s). In the first case, we get “nice” sets of 17 notes, in the second case the magic number is 19. We’ll talk about that in detail some time later; for now do know this is about MOS scales.
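One way to explore which note counts are "nice" is brute force: build the chain, then count how many sizes each k-step takes. A sketch, using the fifths of 29edo (17\29, wider than 700 ¢) and 31edo (18\31, narrower) as stand-ins for the two cases; these two edos and the function names are my choices for illustration.

```python
def max_variety(fifth_steps, edo, notes):
    """Largest number of distinct sizes any k-step takes in a chain of `notes` fifths."""
    pcs = sorted((k * fifth_steps) % edo for k in range(notes))
    worst = 0
    for k in range(1, notes):
        sizes = {(pcs[(i + k) % notes] - pcs[i]) % edo for i in range(notes)}
        worst = max(worst, len(sizes))
    return worst

def nice_sizes(fifth_steps, edo, limit):
    """Note counts below `limit` for which every k-step comes in at most two sizes."""
    return [n for n in range(2, limit) if max_variety(fifth_steps, edo, n) <= 2]

print(nice_sizes(17, 29, 20))  # fifth wider than 700 cents
print(nice_sizes(18, 31, 20))  # fifth narrower than 700 cents
```

Checking only the 1-steps would not be enough: a chain of 19 fifths of 29edo has just two sizes of 1-step but three sizes of 2-step.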
And another thing…
It should be mentioned that there are approximations of 3/2 that are absolutely unacceptable. That is, they don’t actually generate a diatonic scale: they don’t generate a pattern of steps LLsLLLs.
One boundary is at the fifth of 7edo, 4\7 ≈ 686 ¢. At this point, s becomes as large as L, c vanishes, and the circle of fifths becomes …—B—F—C—G—D—A—E—B—F—…, because # and b change by unison. If we make the fifth even narrower, c becomes negative and s becomes larger than L. In that case, it’s fruitful to rename things so that s and L interchange and c is again positive, but then the diatonic pattern LLsLLLs becomes an antidiatonic ssLsssL instead.
We can translate music that uses only “diatonic constructions” (and doesn’t, for example, abuse 12edo’s conflation of s and c semitones) into such a system. Then each prior # will actually flatten the sound, each b will sharpen; intervals written as minor in sheet music will sound brighter (wider) than those written as major. Examples of edos that support antidiatonic first and foremost are 9, 16 and .
Yes, actually, you can easily find edos which support diatonic or antidiatonic scales (that is, L and s intervals are always represented by the same number of edosteps): just specify how many edosteps each of s and L are to be mapped to (constrained by 0 < s < L) and calculate the size of the edo using the scale pattern. For example, diatonic with ⟦s⟧ = 3 and ⟦L⟧ = 5 is realized in 5 ⟦L⟧ + 2 ⟦s⟧ = 31 divisions of the octave. Also, knowing that the fifth is 3 L + s, you readily calculate it to be 3 ⟦L⟧ + ⟦s⟧ = 18 edosteps.24
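The arithmetic above, as a tiny sketch (function names invented for the example): the edo size is just the sum of the steps in the pattern, and the diatonic fifth is 3 L + s.

```python
def edo_for(pattern, large, small):
    """Number of edosteps per octave when L and s take the given sizes."""
    return pattern.count("L") * large + pattern.count("s") * small

def diatonic_fifth(large, small):
    """The diatonic fifth 3 L + s, in edosteps."""
    return 3 * large + small

# Diatonic: (L, s) = (5, 3), (2, 1), (3, 1), (3, 2)
for large, small in [(5, 3), (2, 1), (3, 1), (3, 2)]:
    print(edo_for("LLsLLLs", large, small), diatonic_fifth(large, small))

# Antidiatonic: the same idea with the pattern ssLsssL
print(edo_for("ssLsssL", 2, 1), edo_for("ssLsssL", 3, 2))
```

With (L, s) = (2, 1), (3, 1) and (3, 2) the diatonic pattern recovers the 12edo, 17edo and 19edo fifths mentioned earlier.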
Okay, enough diversions; another boundary is the fifth size of 5edo, 3\5 = 720 ¢ (exact). At this other point, s vanishes and c = L. Now the circle of fifths is even smaller, with E = F and B = C, and for instance F# = G. If we make the fifth wider, we just plain break the (anti)diatonic pattern entirely. What we get instead is a topic for another post.
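Both boundaries drop out of expressing L, s and c through the fifth. A sketch using exact fractions of an octave, so that the vanishing steps come out as exact zeros; the octave offsets (one down for L, three up for s) are my bookkeeping, and the function name is made up.

```python
from fractions import Fraction

def steps_from_fifth(fifth):
    """L, s and c as fractions of an octave, given the fifth as a fraction of an octave."""
    large = 2 * fifth - 1   # two fifths up, one octave down
    small = 3 - 5 * fifth   # five fifths down, three octaves up
    chroma = large - small  # equals 7 * fifth - 4
    return large, small, chroma

for fifth in (Fraction(4, 7), Fraction(7, 12), Fraction(3, 5)):
    large, small, chroma = steps_from_fifth(fifth)
    print(f"fifth {fifth}: L = {large}, s = {small}, c = {chroma}")
```

Between 4\7 and 3\5 all three come out non-negative with s ≤ L, which is the range where the diatonic pattern survives.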
Also, we can’t break antidiatonic: getting the mistuned “fifth” all the way to 1\2 = 600 ¢ = √2 gives us 2edo with s (former diatonic L) vanishing, but then going further just swaps the fifth and the fourth, and they do add up to an octave, so you see there’d be nothing new.
So, well, later.
P. S. [2023-10-22] The post is now updated because I mixed up 17 and 19 in one place.
Note that quite often, now and in the past alike, “pitch” can be used in place of “frequency”, but do still remember that they are distinct. Sometimes there is pitch that doesn’t correspond to any frequency in the signal (missing fundamental), and there are other effects.
I don’t know how much of that is nature and how much is it nurture of musical cultures out there, but it still stands.
Those same A, B, C we’ll define later in a way. But here you can also treat A, B, C (and D) as just variables (which denote some unknown pitch classes).
Substitute backwards to attain generality.
Guess why finite scales are frowned upon.
Not that wider intervals are bad for some reason, but octave-equivalent music ended up considering intervals like 3/1 as compounds of 3/2 and 2/1. More or less.
For a general equave X, this is NedX. For example, 9ed(3/2) is Alpha scale of Wendy Carlos; 13ed3 is Bohlen—Pierce equal tuning.
Or ‘c’, when ¢ is hard to obtain or [FONT ISSUES].
So the division of octave into cents is 1200edo.
To convert a ratio x to cents, calculate 1200 log₂(x). Conversely, a cent value of y as a plain ratio is 2^(y/1200). This is the direct consequence of cents being steps of 1200edo.
For a general equave X, an interval of m steps of NedX is written as m\N<X> or something like that, if precise naming is needed. It might be enough to just use m\N.
Please don’t confuse this with a product, which would be quite nonsensical in this case: only addition of intervals and their scaling by an integer (or even real number) are well-defined: intervals are naturally a vector space over ℝ.
Not “mode degrees” for some reason—though understandably (and/or not), “scale” may also quite often mean “mode”.
Or further if wider-than-octave intervals were needed. This is Latin, by the way. For some reason I thought it was Italian—well, because a large swath of musical terminology is from Italian—though it’s quite evident that these names aren’t what Italian sounds like since quite a time; many centuries ago, octava became ottava.
And not, like, modes CDEFGAB and EFGABCD that fit together nicely.
In 12edo, m3 = A2 = dd4 = AAA1 = …, and M3 = AA2 = d4 = ddd5 = … You’ll see later what ‘A’ and ‘d’ mean in this context.
In general, possibly non-diatonic setting, L is simply large step, s is small step and c is chroma.
There is a Unicode symbol for that: ♭, but it’s still not that common in fonts, and yet more, when one goes all out with symbols of that sort, there may be no Unicode alternatives at all, in which case b feels more consistent with the rest than ♭.
Likewise, there is Unicode ♯ but see above. Also note that ♯♯ is for historical reasons rendered as 𝄪 (in the notation we use here: x). If you see the character not lining up with ♭ and ♯, the font’s doing a bad job. See here on how it looks in a score.
So-called middle C. In A440-and-12edo world of modern music production, it’s ≈261.626 Hz. Look at the sixes!! Look at them! …uh, sorry, what was I about?
Equivalently, 3/2 has irrational value in cents; it starts 701.955 000 865…, so our former 702 ¢ is a pretty good approximation.
And so, for example, 24edo music necessitates new accidentals to reach all its notes from CDEFGAB: half-sharps (usually rendered as ǂ or t in text) and half-flats (d). Though that won’t be useful for 36edo, which needs to use, for example accidentals to raise (up, ^) or lower (down, v) by exactly one edostep, which are written and spoken before all others and before the note, like ^C# or ^M3 for intervals.
This time we go 5 notes to the left each time to enumerate upwards. (But why?..)






