Turning Audio Upside Down with Octave Inversion

You probably know what it’s like to hear a recording played backwards.  But have you ever heard one that’s had its octave flipped upside down?  The idea recently came to me of modifying a 1930s voice scrambling technique for this purpose, and the results have been extremely cool—even addictive.  As far as I’m aware, this is a new effect, and a decidedly fun addition to our toolkit for creatively cultivating legacy audio.

Inverted William Tell Overture (source), split at 58075 Hz.

My octave inverter can take any musical recording and scramble its melody and harmony—or even turn it into a piece of outré microtonal music—while keeping up a resemblance to the original that verges on the uncanny.  Rhythms are entirely unchanged.  Timbres get preserved to a surprising degree as well, albeit not completely: the human voice tends to undergo conspicuous transmogrification even when words remain intelligible, and piano music routinely comes out tinnier or buzzier after inversion.  These changes happen because the patterns of overtone reinforcement that contribute so much to timbre are inverted along with everything else.

Inverted Maple Leaf Rag, excerpt (source), split at 79648 Hz.

Practically speaking, I don’t think anyone would consider octave-inverted recordings to be viable substitutes for the originals (indeed, the first thing I suspect many listeners will want to do is track down the originals for comparison).  On the other hand, they’re clearly derived from other existing recordings, and they still sound remarkably like those recordings in many respects.  What I’ve done to them is surely transformative, but no more so than sampling—another technique that couples creativity with the exploitation of existing material.  That said, the identity of the musical “work” in the Western tradition is often held to inhere primarily in its melody as opposed to its rhythms per se.  Because my octave inverter scrambles those melodies, it could be argued that an inverted recording no longer manifests the same melody as its original—or, arguably, the same “work.”  Indeed, philosophers of music ought to have a field day with the relationship of inverted recordings to particular compositions, “works,” performances, and so forth.  That makes the results of octave inversion interesting from a theoretical standpoint.

Inverted Tequila, excerpt (source), split at 53160 Hz.

And they’re also aesthetically interesting.  Will inverted music dance parties be the next big thing for avant-garde DJs?  Would anyone tune in to an inverted-music radio show?  Will musicians make creative use of inverted samples as well as “straight” ones—or even just listen to inverted music for ideas they can rework?  Would there be any market for an octave inverter app that could be used to invert any recording at will, according to user-selected parameters?

Inverted Music Box Dancer, excerpt (source), split at 50175 Hz.

I was inspired to tackle this project while reading up on the history of voice inversion scrambling.  In its simplest form, which was put to commercial use shortly after the First World War, this entailed flipping the frequency spectrum of a speech signal around a given frequency and then re-flipping it after transmission for decoding.  As the graph below illustrates, the flipping could be done by modulating the speech signal (a) with a carrier wave at twice the desired inversion frequency (b) and then taking the lower sideband (the area below the yellow line at c), which yields a spectral mirror image of the original.

Back in the 1920s, the inversion was carried out electronically, but it can also be accomplished digitally, either by mimicking the same process (multiplying a×b, then filtering out everything above b) or, if you want to flip the entire frequency range of a digital recording, simply by multiplying alternate samples by negative one (equivalent to modulating a carrier wave at the Nyquist frequency).  Photosounder offers a similar “pitch inversion” effect but accomplishes it in a different way.  The program as a whole is designed to let you manipulate sound spectrograms in the image domain as a basis for additive synthesis or resynthesis, so to invert the frequency spectrum you just click the FLIP button to flip a given spectrogram image upside down.  You can see and hear the process demonstrated in a YouTube video.
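The alternate-sample trick is easy to check on a test tone.  What follows is a minimal Python sketch rather than anything from the historical systems or my own code: the function names, the 8 kHz sample rate, and the 1 kHz tone are illustrative assumptions, and the naive DFT is there only so the example needs nothing beyond the standard library.

```python
import math

def flip_spectrum(samples):
    """Negate every other sample, which mirrors the spectrum around fs/4."""
    return [s if i % 2 == 0 else -s for i, s in enumerate(samples)]

def dominant_frequency(samples, fs):
    """Return the frequency (Hz) of the strongest bin of a naive DFT."""
    n = len(samples)
    best_bin, best_power = 0, -1.0
    for k in range(n // 2 + 1):
        re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        power = re * re + im * im
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin * fs / n

fs = 8000   # illustrative sample rate, so the Nyquist frequency is 4000 Hz
n = 400     # 400 samples give DFT bins spaced 20 Hz apart
tone = [math.sin(2 * math.pi * 1000 * i / fs) for i in range(n)]
flipped = flip_spectrum(tone)

# A 1000 Hz tone should come out at 4000 - 1000 = 3000 Hz.
print(dominant_frequency(tone, fs), dominant_frequency(flipped, fs))  # 1000.0 3000.0
```

Every component at frequency f lands at the Nyquist frequency minus f, which is why this flips the whole spectrum from top to bottom.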

Now, the kind of simple frequency inversion I’ve described so far flips the whole spectrum from top to bottom, and the results don’t much sound like the originals, nor do I find them very compelling to listen to, although they’re certainly weird—witness the flipped-spectrogram examples on YouTube here and here or the following example I created using the older approach.

Simple frequency inversion of 6 kHz file: The Beatles, “I Want to Hold Your Hand” (excerpt)

Simple frequency inversion didn’t offer much privacy as a scrambling technique back in the 1920s either, since it proved relatively easy for unauthorized third parties to descramble.  There just isn’t all that much to be done with it.  You probably won’t be inspired to spend hours inverting all your favorite songs this way.

On the voice scrambling front, however, a more elaborate split-band strategy was devised in an effort to make frequency-inversion systems more secure: the signal is first split into different frequency bands, which can be shuffled out of order, and each is then inverted separately—or maybe left uninverted to mix things up even more.  This was the approach to voice scrambling used by the United States at the outbreak of the Second World War, until it was superseded by SIGSALY.

After reading about split-band frequency inversion, I started pondering the idea of using it to produce audio effects of a more aesthetically interesting sort.  What would happen, I wondered, if instead of inverting the whole frequency spectrum, we were to invert each octave separately?  A 44.1 kHz digital file can represent frequencies up to the Nyquist frequency at 22,050 Hz, so in that case we might first invert the octave from 11,025 to 22,050 Hz, and then the octave from 5,512.5 to 11,025 Hz, and then the octave from 2756.25 to 5,512.5 Hz, and so on down the scale as far as we’d care to go.  Frequencies would remain within their original octave ranges after inversion: low-frequency components would remain low, and high-frequency components would remain high.  However, the underlying octave would be flipped upside down, such that a melody would go down where it had originally gone up, and vice versa.
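In terms of frequencies alone, the idea reduces to a small mapping: find the octave band a component falls in and mirror it between that band's edges.  The Python sketch below shows only that bookkeeping, not an audio processor; `octave_invert` and its `split` parameter are hypothetical names, and the band edges are assumed to sit at successive halvings of the split frequency.

```python
def octave_invert(freq, split):
    """Mirror freq within the octave band (top/2, top] that contains it,
    where top is the split frequency divided by some power of two."""
    if not 0 < freq <= split:
        raise ValueError("freq must be positive and no higher than split")
    top = float(split)
    while freq <= top / 2:      # step down an octave until freq is in the band
        top /= 2
    low = top / 2
    return low + top - freq     # same distance from the opposite edge

nyquist = 22050                 # 44.1 kHz file, as in the text
print(octave_invert(440, nyquist))   # 593.59375
print(octave_invert(494, nyquist))   # 539.59375
```

Both notes stay inside the band from 344.53125 to 689.0625 Hz, but the rising step from 440 to 494 Hz comes out as a falling one.  Applying the function twice with the same split returns the original frequency.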

I wrote some code in MATLAB to try out this idea.  After resampling the source to 150% of its original sample rate, it inverts the spectrum, filters out everything but the middle third (which was originally the top half), and saves that result.  Then it halves the source sample rate (eliminating the top half of the frequency range), repeats the process, doubles the sample rate of the new result, and adds that to the first result.  The process up to this point is illustrated below.

Next, my code halves the source sample rate yet again for another round of inversion and repeats the same pattern as many times as desired to yield octave-width bands one by one down the spectrum.  The final result is then resampled to restore the original sample rate.
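To see why these steps amount to flipping each octave within itself, it helps to follow a single frequency component through the rounds.  This is a Python bookkeeping sketch under stated assumptions, not a port of the MATLAB code: `trace_component` is a hypothetical helper, the filter is treated as an ideal one passing exactly the middle third, and no audio is processed.

```python
def trace_component(freq, fs, rounds):
    """Follow one frequency (Hz) through the resample/flip/filter rounds.

    Returns the frequency it ends up at, or None if it lies below the
    lowest band handled in the given number of rounds.
    """
    rate = float(fs)
    for _ in range(rounds):
        work_nyquist = 1.5 * rate / 2     # Nyquist after resampling to 150%
        flipped = work_nyquist - freq     # whole-spectrum flip at that rate
        low = work_nyquist / 3            # the middle third of the working range
        high = 2 * work_nyquist / 3       # (originally the top half of the source)
        if low <= flipped <= high:
            return flipped                # this round's band keeps the component
        rate /= 2                         # halve the source rate and try again
    return None

print(trace_component(15000, 44100, 3))   # 18075.0
print(trace_component(3000, 44100, 3))    # 5268.75
```

A 15,000 Hz component is caught in the first round and lands at 11,025 + 22,050 − 15,000 = 18,075 Hz.  A 3,000 Hz component passes through two rounds untouched and is mirrored in the third, within the octave from 2,756.25 to 5,512.5 Hz.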

Inverted Gondolier and Temptation Rag (source), split at 54815 Hz.

The main variable worth playing around with is the band-splitting frequency, which corresponds to the position of the yellow lines in the above illustration.  One default, as outlined earlier, would be to divide the source into bands based on the Nyquist frequency, which at 44.1 kHz would give divisions at (1) 11,025 Hz, (2) 5,512.5 Hz, (3) 2,756.25 Hz, (4) 1,378.125 Hz, (5) 689.0625 Hz, (6) 344.53125 Hz, (7) 172.265625 Hz, (8) 86.1328125 Hz, (9) 43.06640625 Hz, (10) 21.533203125 Hz, etc.  However, the choice of band-splitting frequency has a dramatic impact on the results—much greater than I’d expected while speculating about it beforehand.
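The list of divisions is just repeated halving, which a couple of lines of Python can reproduce for any starting point.  The helper name `band_edges` is hypothetical, and treating a quoted split frequency such as 58075 Hz as the value that gets halved is an assumption about how the numbers relate.

```python
def band_edges(split, count):
    """Return the first `count` band edges below `split`, each half the last."""
    return [split / 2 ** k for k in range(1, count + 1)]

print(band_edges(22050, 4))   # [11025.0, 5512.5, 2756.25, 1378.125]
print(band_edges(58075, 3))   # [29037.5, 14518.75, 7259.375]
```

Changing the starting value moves every edge at once, so each note in a recording may fall in a different position within its band.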

One reason for this is that different band-splitting frequencies, with their different inversion frequencies, produce different intervals and harmonic relationships among inverted notes.  The inversion follows a linear scale, like this:

When an octave is flipped, every frequency moves to a position as far below the top of the octave as it originally sat above the bottom (for an octave running from f to 2f, a component at x lands at 3f − x).  At first glance, this process might seem neatly symmetrical, and in a sense it is.  But musical pitch tracks the logarithm of frequency rather than frequency itself: equal intervals correspond to equal frequency ratios, not equal frequency differences.  If we invert an equally tempered scale with respect to a linear axis, the outcome is not symmetrical in terms of musical pitch and ends up yielding a significantly different set of intervals, as illustrated below.
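The asymmetry shows up in numbers if we invert the twelve equal semitones of one octave and measure the resulting steps.  This is a small Python sketch; the octave from 220 to 440 Hz is an arbitrary illustrative choice, and `semitones` is a hypothetical helper.

```python
import math

def semitones(f_low, f_high):
    """Size of the interval between two frequencies, in equal-tempered semitones."""
    return 12 * math.log2(f_high / f_low)

base = 220.0                                        # bottom of the octave; top is 440 Hz
scale = [base * 2 ** (n / 12) for n in range(13)]   # chromatic scale, rising
inverted = [3 * base - f for f in scale]            # mirrored: low + high - f, now falling

# Sizes of the twelve descending steps after inversion.
steps = [semitones(inverted[i + 1], inverted[i]) for i in range(12)]
for n, step in enumerate(steps):
    print(f"step {n + 1:2d}: {step:.2f} semitones")
```

The twelve steps still add up to one octave, but they are no longer equal: they grow steadily from roughly half a semitone at the top of the inverted scale to nearly two semitones at the bottom.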

The patterns of tones and semitones in familiar scales, such as “major” and “minor,” introduce a further layer of asymmetry.  Thus, if we invert the notes C, D, E, F, G, A, B, C using different inversion frequencies, we get different scales, or modes, or whatever the heck they are.  I’ve illustrated below what happens if we use each note of a major scale as the inversion frequency, but there are countless other possibilities between those options as well.

Inverted opening notes of Bach’s Toccata and Fugue in D minor (source), split at twelve different points (starting at A, ending at A♭).

The bottom line is that there isn’t just one single, consistent inverted scale we get whenever we invert something.  Instead, there’s an endless variety of them, which creates all the more opportunity for exploration, experimentation, and discovery.

Meanwhile, the frequencies closest to the inversion frequency will end up separated by an entire octave through inversion, and those ruptures can cause more problems at some places than at others.   For example, if the fundamental of a human voice crosses the inversion frequency, this can have a very strange effect.

Inverted Can’t Buy Me Love, split at 58075 Hz.

To set a band-splitting frequency, practically speaking, my code resamples the source to a rate that’s a multiple of the desired frequency before processing and then restores the original sample rate afterwards.  Depending on the ratios involved, MATLAB sometimes encounters an error at this point, in which case I’ve set it up to try neighboring sample rates until it finds one that will work.

Another value that could be tinkered with is the size of the gap between bands.  If our filter passes exactly the middle third of the frequency range, the doubling-up of signal at the edges produces a kind of ringing “resonance” (especially noticeable if you invert a source recording and then re-invert it to test the result against the original for accuracy).  With this in mind, I’ve been adding a small amount (0.025-0.05) to the 1/3 frequency point and subtracting it from the 2/3 frequency point.

To reiterate: we can invert a recording and then re-invert it using the same band-splitting frequency to restore the original frequency scale.  But of course it’s also possible to invert a recording with one band-splitting frequency and then to re-invert it with a different band-splitting frequency.  When this is done with simple inversion, bass and treble just end up painfully out of tune with each other, but if we instead use octave inversion, the results are more interesting.  Widely separated values produce results in which harmonic relationships are still heavily modified, although in new ways even more theoretically complicated than before, while closer pairings slightly raise or lower the overall pitch with only minor skewing of harmonic relationships.  And then there’s one other noteworthy scenario: a sequence of two inversions at band-splitting frequencies in the ratio 2:3 or 3:2 ends up shuffling uninverted half-octave frequency bands out of order, which produces its own weird effects.
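The same kind of frequency bookkeeping can be chained to explore double inversion.  This Python sketch tracks single frequencies only, with hypothetical helper names and band edges assumed to sit at successive halvings of each split frequency; the 80000 and 60000 Hz values (a 4:3 pair) are borrowed from the Stars and Stripes example below.

```python
def octave_invert(freq, split):
    """Mirror freq within the octave band (top/2, top] that contains it."""
    top = float(split)
    while freq <= top / 2:
        top /= 2
    return top / 2 + top - freq

def double_invert(freq, first, second):
    """Invert at one band-splitting frequency, then re-invert at another."""
    return octave_invert(octave_invert(freq, first), second)

# Re-inverting at the same split restores the original frequency.
print(double_invert(440, 22050, 22050))      # 440.0

# With different splits, each stretch of spectrum is shifted intact.
for f in (6000, 7000, 8000, 9000):
    print(f, double_invert(f, 80000, 60000))
```

For this pair, 6000 and 7000 Hz come out at 13500 and 14500 Hz, while 8000 and 9000 Hz come out at 4250 and 5250 Hz.  Within each stretch the direction of motion is preserved and the spacing in hertz is unchanged, but the stretches themselves are relocated, so the intervals between notes in different stretches are altered.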

Doubly inverted Stars and Stripes Forever March, excerpt (source), first inverted at 80000 Hz, then re-inverted at 60000 Hz.

One challenge that remains is more musical than technological: namely, working out reliable guidelines for choosing band-splitting frequencies that yield aesthetically worthwhile results.  I’ve found that splitting at the fifth relative to the key of a piece of music is a good place to start, but I’ve also stumbled across prime splitting frequencies entirely by chance (as with one early programming error that fortuitously resulted in audio split at 44100²/f rather than f).  There’s much trial and error yet to come, I suspect.  But for now, I’d like to close with a sampler of excerpts from a few of my favorite inversion discoveries to date: “Bourret Chabanet” by Mervent, split at 56320 Hz; “Crystal Tears” by Scott Williams, split at 61528 Hz; “Fodeba” by Lamine Konté, split at 63217 Hz; and “Frühlingstanz” by Schandmaul, split at 56320 Hz.  Enjoy!

