Here’s another Griffonage goody for connoisseurs of the sonically strange, of the acoustically audacious, of arrant eeriness for the ear. Back in July 2017, I blogged about an audio distortion technique I called window reversal, explaining that it involved reversing every successive group of x samples throughout a source recording. Extremely short windows have little effect, while longer ones sound like exactly what they are: periodic reversals. But when the window falls into a certain intermediate range, in the ballpark of 75-125 samples, reversal can skew frequencies in interesting ways. Picture what would happen to a sine wave and you’ll start to understand why.
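As a rough illustration, the basic operation described above can be sketched in a few lines of Python. The function name `window_reverse` is made up for this example, and the handling of a leftover partial group at the end of the recording (here it simply gets reversed too) is an assumption rather than a description of how the original processing was done.

```python
def window_reverse(samples, window):
    """Reverse every successive group of `window` samples.

    Assumption: a leftover partial group at the end is reversed as well.
    """
    if window < 1:
        raise ValueError("window must be at least 1 sample")
    out = []
    for start in range(0, len(samples), window):
        # Slice out one group and append it back to front.
        out.extend(reversed(samples[start:start + window]))
    return out


# Tiny demonstration, with sample indices standing in for audio samples.
demo = window_reverse(list(range(8)), 3)
print(demo)  # [2, 1, 0, 5, 4, 3, 7, 6]
```

Applying the same function twice with the same window size restores the input, which makes for a handy sanity check.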
By way of illustration, I’d applied this technique to an excerpt from Chuck Berry’s “Johnny B. Goode,” reversing every successive 110-sample segment, but all I could think of to say about the results at the time was that they were “very weird-sounding.” I’ve since done some more experimenting and thinking, and even though I still find it hard to wrap my head around just what’s going on here, I decided I might as well share what I’ve come up with so that others can enjoy puzzling over it too.
One troublesome issue with window reversal is that it introduces lots of abrupt jumps into the waveform, as can be seen in the above illustration. These jumps produce an unpleasant buzz or rattle during eduction, so I’ve tweaked the algorithm to try to mitigate this, specifically by interpolating over one tenth of the signal at segment joins using a Piecewise Cubic Hermite Interpolating Polynomial (PCHIP). This makes the results of window reversal sound a little less jarring than before, but it has brought me no closer to understanding the effects I was hearing. Window reversal plainly distorts frequencies in certain ways, and there’s some relationship between the size of the reversal window and the specific effects on specific frequencies. But how does this actually work? What’s going on here acoustically? Why do we hear what we hear? And could there be any way to tame the effect to make it more aesthetically interesting?
I tried to think things through and came up with the following hypotheses:
- If the size of our reversal window (in samples) happens to be exactly equal to the period of an input signal (in samples), there should be no perceptible change. Each cycle will just be flipped along the time axis. The corresponding frequency, which can be calculated in Hertz as the sample rate divided by window size, should also be unchanged, and so should all multiples (i.e., harmonics) of that frequency.
- But a signal at any other frequency, with any other period, will behave differently. Peaks or troughs will be nudged closer together at one end of each segment and further apart at the other end of each segment. The closer-together peaks will introduce an additional (higher) frequency, while the further-apart peaks will introduce an additional (lower) frequency. Meanwhile, the joins will no longer coincide with the same point in each cycle, and so will introduce noise, but noise with at least some periodic character involving complex interactions between the rate of one tic per window and the periodic structure of the source wave.
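The two hypotheses above can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: it uses plain window reversal without the PCHIP smoothing, the helper names (`window_reverse`, `tone_level`, `level_after_reversal`) are invented for the example, and "no perceptible change" is approximated crudely by asking how much of a test tone's level survives at its own frequency, measured with a single-frequency DFT.

```python
import math


def window_reverse(samples, window):
    """Reverse every successive group of `window` samples."""
    out = []
    for start in range(0, len(samples), window):
        out.extend(reversed(samples[start:start + window]))
    return out


def tone_level(samples, freq_hz, sample_rate):
    """Amplitude of the component at freq_hz (a single-bin DFT).

    A steady unit-amplitude sine at freq_hz scores about 1.0.
    """
    step = 2 * math.pi * freq_hz / sample_rate
    re = sum(s * math.cos(step * n) for n, s in enumerate(samples))
    im = sum(s * math.sin(step * n) for n, s in enumerate(samples))
    return 2 * math.hypot(re, im) / len(samples)


SAMPLE_RATE = 44100
WINDOW = 30
# Frequency whose period equals the window: sample rate / window size.
matched_hz = SAMPLE_RATE / WINDOW  # 1470.0 Hz


def level_after_reversal(freq_hz, n_samples=3000):
    """Level left at freq_hz after window-reversing a sine at freq_hz."""
    tone = [math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
            for n in range(n_samples)]
    return tone_level(window_reverse(tone, WINDOW), freq_hz, SAMPLE_RATE)


# The matched frequency, its first harmonic, and an unrelated frequency.
for freq in (matched_hz, 2 * matched_hz, 1000.0):
    print(f"{freq:7.1f} Hz -> level {level_after_reversal(freq):.2f}")
```

With these settings the matched frequency and its harmonic should come through at essentially full level, while the 1000 Hz tone should keep only a fraction of its level at 1000 Hz, the rest having moved elsewhere in the spectrum.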
To examine what happens in actual practice, I applied window reversals of different sizes to a sine sweep running from 20 Hz to 20000 Hz, and then took a look at spectrograms of the results. Here’s a representative spectrogram with a 30-sample reversal window applied to a sine sweep at a 44.1 kHz sample rate, with the audio file given below (sounding a lot like a siren).
The strongest frequency components consist of diagonal “slashes” across the trajectory of the original sine sweep (a line running from lower left up to the 20000 Hz mark in the scale on the right, still present but faint except at the very beginning). These “slashes” correspond to the two frequencies introduced by window reversal, and they cross the line of the original sweep at multiples not just of the frequency corresponding to the window size (1470 Hz), but of half that (735 Hz): thus, 735 Hz, 1470 Hz, 2205 Hz, 2940 Hz, etc. The harmonics and subharmonics of both the “slashes” and the original sweep are also present, though more weakly. The original sweep itself is likewise weaker than the “slashes,” except that for part of its range it’s present just as strongly.
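One hedged way to probe this reading of the spectrogram without a sweep is to feed in fixed tones and measure a few frequencies directly. Everything specific in the sketch below is an assumption made for illustration: the helper names, the use of plain (unsmoothed) window reversal, and above all the guess that the two "slashes" for an input frequency sit at multiples of 1470 Hz minus that frequency (so 470 Hz and 1940 Hz for a 1000 Hz tone). That guess does not come from the measurements reported above; it is simply a candidate that agrees with the crossings at multiples of 735 Hz.

```python
import math

SAMPLE_RATE = 44100
WINDOW = 30
BASE_HZ = SAMPLE_RATE / WINDOW  # 1470.0 Hz


def window_reverse(samples, window):
    """Reverse every successive group of `window` samples."""
    out = []
    for start in range(0, len(samples), window):
        out.extend(reversed(samples[start:start + window]))
    return out


def tone_level(samples, freq_hz):
    """Amplitude of the component at freq_hz (a single-bin DFT)."""
    step = 2 * math.pi * freq_hz / SAMPLE_RATE
    re = sum(s * math.cos(step * n) for n, s in enumerate(samples))
    im = sum(s * math.sin(step * n) for n, s in enumerate(samples))
    return 2 * math.hypot(re, im) / len(samples)


def reversed_tone(freq_hz, n_samples=3000):
    """A unit sine at freq_hz after plain window reversal."""
    tone = [math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
            for n in range(n_samples)]
    return window_reverse(tone, WINDOW)


# Part 1: where does a 1000 Hz tone end up?
# Assumed "slash" positions (a guess): multiples of 1470 Hz minus the input.
source_hz = 1000.0
lower_hz = BASE_HZ - source_hz       # 470.0 Hz
upper_hz = 2 * BASE_HZ - source_hz   # 1940.0 Hz
processed = reversed_tone(source_hz)
levels = {hz: tone_level(processed, hz)
          for hz in (lower_hz, source_hz, upper_hz)}
for hz, level in levels.items():
    print(f"1000 Hz input, level at {hz:6.1f} Hz: {level:.2f}")

# Part 2: tones at odd multiples of 735 Hz, measured at their own frequency.
crossing_levels = {hz: tone_level(reversed_tone(hz), hz)
                   for hz in (735.0, 2205.0)}
for hz, level in crossing_levels.items():
    print(f"{hz:6.1f} Hz input, level at its own frequency: {level:.2f}")
```

If the guess is right, the levels at 470 Hz and 1940 Hz should come out stronger than what's left at 1000 Hz, and the tones at 735 Hz and 2205 Hz should come through at essentially full strength, which is what a "slash" crossing the original sweep line would look like for a single tone.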
When we apply these same processes to a more complex input signal, I imagine they must transform each frequency component in approximately the same way.
For comparison, here’s the result of applying my octave inversion algorithm to the same sine sweep.
The effects of window reversal are plainly a lot more complicated than these. Unlike octave inversion, window reversal transforms octaves differently, splitting each ascending octave into twice as many “slashes” as the preceding one. As a result, two frequencies an octave apart will often not be transformed into two other frequencies an octave apart, or even into two pairs of other frequencies an octave apart. On the other hand, the higher the frequency, the less the “slashes” will move it away from its original value. Thus, while window reversal won’t preserve octave-width relationships as octave inversion does, it may be more effective at preserving some higher-frequency components of timbre.
As a further experiment, I took some MIDI piano scales and ran them through window reversals of 10, 30, 60, 100, 130, 210, 300, and 441 samples, all applied to a 44.1 kHz sample rate. You can hear the results below. The piano scale is presented first one way, and then another starting at 2:16, and the reversals are arranged in order, starting with 0 (unchanged), then 10, then 30, etc.
The 10-sample reversal mainly affects timbre; the 30-sample reversal starts to introduce some odd frequency distortions into the mix; and the reversals with larger window sizes scramble things yet more elaborately.
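For reference, the frequency corresponding to each of these window sizes follows from the rule given earlier (sample rate divided by window size). A small sketch, with the function name `window_frequency` chosen just for this example:

```python
def window_frequency(sample_rate, window):
    """Frequency in Hz whose period is exactly `window` samples."""
    return sample_rate / window


# Window sizes used for the piano scales, all at 44.1 kHz.
piano_windows = (10, 30, 60, 100, 130, 210, 300, 441)
for window in piano_windows:
    hz = window_frequency(44100, window)
    print(f"{window:3d} samples -> {hz:7.1f} Hz")
```

The 60-sample window, for instance, works out to 735 Hz, half the 1470 Hz of the 30-sample window used for the sine sweep.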
Next, here are short snippets of several songs with an 80-sample reversal window applied at a 48 kHz sample rate, which corresponds to a frequency of 600 Hz (48,000 ÷ 80).
These particular results aren’t all that engaging (to my ear, at least), and I’ve given them here mainly just to illustrate the effects the technique has on sources I suspect you’ll recognize. But through trial and error, I’ve found that effects I’d consider musically interesting can sometimes be achieved. There may even be generalizable rules about “good” combinations of musical keys and window sizes, but if such patterns exist, I can’t say what they are and have made no effort as yet to work them out. Older recordings from the acoustic era, with a limited frequency range, tend to weather the process more acceptably, but I’ve thrown in a couple of electrically recorded examples for good measure.