Archivorithm #1: Experiment in Indeterminacy

For my 78th Griffonage-Dot-Com post, I’ve written a piece of software that randomly taps the tens of thousands of old records digitized through the Great 78 Project as a sample library; layers, sequences, and loops the clips according to a set of chance-controlled rules; and then introduces some randomized harmonic filtering and percussion.  The results can be surprisingly engaging, and they also vary widely in character: if you don’t like what you’re hearing at first, wait a minute or so and it will probably transition into something entirely different.


This is an experiment in indeterminacy, an approach to musical composition in which details that ordinarily get spelled out by a composer are instead left to chance or to the discretion of performers: think John Cage’s Imaginary Landscape No. 4 for twelve radios, or his Imaginary Landscape No. 5 for forty-two sound recordings on magnetic tape.  But instead of radio or tape, I’m pulling my source audio from the Internet.  And while Cage prepared chance-controlled but fixed scores for those two Imaginary Landscapes, I’ve structured my work only loosely in terms of rules and probabilities, except for the basic point that it consists of ten-segment cycles (each of which can, however, also be taken separately).


The name I’ve been using for this project is Archivorithm, a portmanteau of Archive and Algorithm.  I think of this as describing a whole genre, though—works that algorithmically sample preexisting collections assembled for other purposes—so let’s call the present work Archivorithm Number One.  It’s what Lee B. Brown, in his article “Phonography, Rock Records, and the Ontology of Recorded Music,” calls a “work of phonography”: a work that’s only “phono-accessible,” can’t be “performed,” and is “created by the use of recording machinery for an intrinsic aesthetic purpose, rather than for an extrinsic documentary one.”  But at the same time, I don’t mean for the work to be defined by any single phonogram (or “sound recording”); instead, many different phonograms could count as realizations of it, even though each one of them will sound more or less the same each time it in particular is played.


You might want to claim instead that I’ve written a program that lets computers compose lots of different “works of phonography” in turn.  But I think it’s more interesting to imagine the whole project in terms of one single “work of phonography” that happens to be indeterminate, just as musical works of other kinds can be indeterminate.  That would make it a “work of phonography” in which the urtext isn’t a phonogram, but a set of instructions embodied in a piece of code.  The urtext wouldn’t be the code as such: I wrote it in MATLAB, but I’d say the same instructions could be represented just as authentically in some other language (or notation).


And what are those instructions?  First, an individual segment is generated as follows:

  1. Randomly choose a duration between 0.3 and 0.6 seconds.
  2. Randomly select between 20 and 35 recordings digitized through the Great 78 Project and download one clip of the chosen duration from a randomly selected point within each recording.
  3. Randomly choose a quantity of clips to assemble into each bar (four, eight, or twelve) and a number of bar-length loops to generate (ranging from five to twelve); then create the selected number of loops by randomly choosing and sequencing downloaded clips.
  4. Select a quantity of output bars ranging from fifteen to twenty; then assemble each bar by layering loops onto it, such that each available loop has a one-in-four probability of being included in each bar, with the added provision that each bar must contain at least one loop.  If the output is in stereo, each channel contains the signal at 50% as a starting point, and another 50% is then randomly apportioned between the left and right channels.
  5. Now assemble an accompanying rhythm track using a folder of percussion samples, treating these by default as having a sample rate of 48 ksps.  (For the examples presented above, I downloaded and combined four different collections of samples: 1,000 free drum samples from SampleRadar, plus three collections from Soundpacks.com: the Live Percussion Sample Pack, the Dubstep Empire Drum Kit, and the Ultimate Boom Bap Drum Kit.  Collectively, these collections gave my algorithm a pool of over 5,000 nicely heterogeneous samples from which to draw.)  At the start, 1/4, 1/2, and 3/4 points of each bar, add a randomly selected hit with a probability of 50%, with a further 25% probability that the hit is present only in every other bar; and do this twice for each point.  Also do the same, but just once, at additional points: 1/8, 3/8, 5/8, 7/8, 3/16, 7/16, 11/16, 15/16 (with the probability of alternating bars dropping to 12.5% for the final four, for no particular reason other than that’s how I set things up).  Loop the result as needed to fill out the target duration, and then mix the rhythm track with the previous output (at 100% in mono, or at 75% in stereo).
  6. Melodize the result.  This entails applying another algorithm I developed last year to amplify the pitched character of any sound recording by detecting the strongest musical notes present at each point in time and filtering to them more or less narrowly.  The analysis and processing window is set equal to one eighth of a bar, with start and end points adjusted to coincide with maximum nearby impulses.  Other parameters are chosen randomly: the narrowness of the bandpass filter, the number of notes to pass (1-3), the minimum distance between passed notes (0-2), and the scale (F major or F minor).  Then mix the melodized and unmelodized versions of the signal at 50% each.  This has the effect of making the result more musically “pitched,” and it will sometimes synthesize melodies or chords out of filtered noise.
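The chance operations in steps 1 through 4 can be sketched in a few lines.  What follows is a minimal illustration in Python rather than the original MATLAB, and it only draws the numbers; it doesn't download or mix any audio.  The function names are hypothetical, uniform distributions are assumed throughout, and the "at least one loop" rule is assumed to be enforced by simply redrawing an empty bar.  The stereo rule is shown as a pair of gains, since the steps above don't say at what granularity the split is applied.

```python
import random


def layer_bar(n_loops, rng, p=0.25):
    """Pick the loops heard in one bar: each has probability p, at least one required."""
    while True:
        chosen = [i for i in range(n_loops) if rng.random() < p]
        if chosen:  # assumption: an empty bar is simply redrawn
            return chosen


def stereo_gains(rng):
    """Return (left, right) gains: 50% each, plus a random split of a further 50%."""
    extra_left = rng.uniform(0.0, 0.5)
    return 0.5 + extra_left, 1.0 - extra_left


def plan_segment(rng):
    """Draw the chance-controlled choices of steps 1-4 (no audio is touched)."""
    n_clips = rng.randint(20, 35)           # step 2: how many recordings to sample
    clips_per_bar = rng.choice([4, 8, 12])  # step 3: clips per bar-length loop
    n_loops = rng.randint(5, 12)            # step 3: number of loops
    n_bars = rng.randint(15, 20)            # step 4: number of output bars
    return {
        "clip_seconds": rng.uniform(0.3, 0.6),  # step 1
        "n_clips": n_clips,
        # Each loop is a sequence of clip indices (assumed drawn with repetition).
        "loops": [[rng.randrange(n_clips) for _ in range(clips_per_bar)]
                  for _ in range(n_loops)],
        # Each bar lists the indices of the loops layered onto it.
        "bars": [layer_bar(n_loops, rng) for _ in range(n_bars)],
    }


plan = plan_segment(random.Random(78))
print(len(plan["loops"]), "loops,", len(plan["bars"]), "bars")
```

Passing in a seeded `random.Random` makes a given plan reproducible, which is handy for checking the rules even though the piece itself is meant to be unrepeatable.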

This process is carried out ten times, and the resulting segments are concatenated into a single longer “cycle”—with a fade-in of 0.5 to 3 seconds and a fade-out of 2 to 5 seconds—as well as being output separately.  For successive segments within a cycle, the parameters are set to change with the following probabilities, the goal being to introduce some formal continuity from segment to segment:

  1. Choose a new duration for downloaded clips: 25%
  2. Download a new set of clips: 25%
  3. Choose a new quantity of clips for each loop, and a new quantity of loops to generate: 25%
  4. Generate new loops: 75%
  5. Generate a new rhythm track: 50%

These probabilities are each applied independently, but in some cases a change of one parameter forces other parameters to change as well; for instance, if the length of downloaded samples changes, then a new set of samples also needs to be downloaded, and new loops also need to be generated from them.  It takes about an hour and a half to generate one cycle running the code on my laptop.
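The way one change forces another can be sketched as a small dependency table.  This is an illustrative Python sketch, not the original code: the parameter names are hypothetical, and while the chain from clip duration to clips to loops comes from the description above, the link from loop layout to loops is an assumption.

```python
import random

# Probabilities from the list above, keyed by hypothetical parameter names.
CHANGE_PROB = {
    "clip_duration": 0.25,
    "clips": 0.25,
    "loop_layout": 0.25,
    "loops": 0.75,
    "rhythm": 0.50,
}

# A change to the key forces a change to each listed parameter.
FORCES = {
    "clip_duration": ["clips"],  # new clip length means new downloads
    "clips": ["loops"],          # new clips mean new loops
    "loop_layout": ["loops"],    # assumption: a new layout also means new loops
}


def propagate(changed):
    """Return the set of changes after following every forced dependency."""
    result = set(changed)
    pending = list(changed)
    while pending:
        for forced in FORCES.get(pending.pop(), []):
            if forced not in result:
                result.add(forced)
                pending.append(forced)
    return result


def choose_changes(rng):
    """Roll each probability independently, then apply the forced changes."""
    rolled = {name for name, p in CHANGE_PROB.items() if rng.random() < p}
    return propagate(rolled)


print(sorted(choose_changes(random.Random(5))))
```

Under these assumptions, a segment that rolls only a new clip duration still ends up with new clips and new loops, which is why the effective rate of loop changes is higher than the nominal 75%.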

This whole process can also be set up to batch-generate multiple cycles, for example if I want to leave the software running overnight.

Not every segment within a cycle will be equally engaging.  However, I also export the segments so that they can be strung together individually like beads on a string—say, by dropping the “best” ones into a folder and then playing the folder’s contents on shuffle.  Below are compilations of some of my favorite segments from the first couple days of experimental processing.  Note that the final ten segments in this first sequence, starting at about the seven-minute mark, were actually generated together in the order presented, as a single cycle.


Next, here are some experiments in mono, from before I’d introduced stereo separation:


And here’s a slightly earlier group, from before I’d provided for continuity between successive segments (not that this makes a difference when the segments are being independently chosen and sequenced, as they are here).  About four minutes in, there’s a segment I call “Soothing Soup”; see if you can figure out why.


Finally, here are some even earlier results, generated using an alternative pool of percussion samples (with a few pops and clicks at the interstices, which I hadn’t yet come up with a strategy for mitigating at the time).


I’ve also provided a separate “remix” algorithm that uses a previously generated “seed” segment as the basis for a new cycle.  In this case, each new segment takes the seed segment as its starting point and then applies a 1/4 chance of creating new loops, another 1/2 chance of downloading new clips to replace a randomly chosen range of source clips (and creating new loops in turn), and a 1/3 chance of generating a new percussion track (as well as resetting the number of loop-length repetitions in the segment).  The resulting cycle becomes less like a compilation of loosely linked segments and more like a coherent theme and variations.  By way of example, here’s one seed segment:


And a remix cycle generated from it:


Here’s another seed segment (which I call “Etta Elle” for reasons that will become obvious):


And a remix cycle generated from it:


But why stop at audio?  My next idea was to create video sequences out of materials chosen at random from the Great 78 Project’s motion-picture counterpart at Archive.org, the Prelinger Archives, and then to combine these with my audio cycles to produce music videos (each of which actualizes what has now become an indeterminate audiovisual work).

The algorithm I came up with for video chooses either a random year between 1920 and 1970 (with a second try on 1921 and 1923, which are poorly represented) or a random letter of the alphabet (with a second try on X or Z), searches the Prelinger Archives for all films from the chosen year or all films with a title beginning with the chosen letter, and downloads one at random.  It then selects a starting point somewhere within the middle three fifths of the film (excluding the first and last fifths in an effort to avoid title sequences and credits) and extracts a clip between 30 and 200 frames long.  The quantity of clips drawn in this way from each film is determined by dividing its duration by two minutes and rounding up to the next integer.  The process is then repeated with additional Prelinger Archives films until enough clips have been created to fill out the duration of the accompanying audio file, and these clips are assembled into a random sequence.
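The arithmetic for a single film can be sketched as follows.  This is a hedged Python illustration with hypothetical function names; it assumes that only the starting frame has to fall within the middle three fifths (so a clip may run slightly past that zone), and that the rounding is a plain ceiling.

```python
import math
import random


def clips_per_film(duration_seconds):
    """Film duration divided by two minutes, rounded up."""
    return math.ceil(duration_seconds / 120)


def pick_clip(total_frames, rng):
    """Return (start_frame, length) for one clip from a film."""
    length = rng.randint(30, 200)         # clip length in frames
    earliest = total_frames // 5          # end of the first fifth
    latest = (4 * total_frames) // 5      # start of the last fifth
    return rng.randint(earliest, latest), length


rng = random.Random(1920)
film_frames = 24 * 60 * 10                # hypothetical ten-minute film at 24 fps
clips = [pick_clip(film_frames, rng) for _ in range(clips_per_film(600))]
print(len(clips), "clips from this film")
```

A ten-minute film yields five clips under this rule, while anything of two minutes or less yields just one.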

Unlike my audio algorithm, my video algorithm doesn’t involve any superimposition of clips, and the individual clips are much longer, with multiple clips for each project drawn intentionally from the same source film, so that the repetition of clearly recognizable subject matter can help foster an illusion of coherence and continuity.  The video and audio are created entirely independently from each other, except that the duration of the audio determines the duration of the video.  But when they’re combined, they still complement each other in ways that can feel meaningful—say, by imposing the “mood” of a musical segment onto a video scene, or by emphasizing a transition point.

Here’s the first video I generated to accompany a whole ten-segment cycle (click here to download).


And here’s the second (click here to download).  I call this one What Then Is Music?—watch and see why.


If you’re running Windows 10, you’re welcome to try running Archivorithm Number One yourself.  Here is an installer for the software, which comes with no warranties of any kind, and here are some instructions for using it:

  1. You’ll need to provide a folder of percussion samples in WAV or AIF format.  The sources I listed above work well; just download and unzip the collections of samples, search in them for “wav” and “aif,” and cut and paste the search results into a single folder.  Or feel free to substitute some collection(s) of your own.  Then click the “Load percussion samples” button, navigate to the folder, and select all the sample files you want to use.
  2. Click the “Set Output Folder” button and navigate to the folder where you want your results to be stored.  Plan on needing about 1 GB free per audio cycle.  [IMPORTANT UPDATE, July 2018: I’ve discovered a bug that produces errors if you set the output folder to anything other than the default folder that first pops up when you try to set it.  I’ll fix this bug in a future release, but for now, just stick with that default directory.]
  3. Choose “Stereo” or “Mono” under “Sound Field.”  Stereo is preferable, but mono takes less processing time.
  4. Click “Run.”  If “Cycles” is set to 1, this will create a single cycle, cycle_1.wav; ten segments, segment_1.wav, segment_2.wav, etc.; and ten files containing the clips, loops, and so forth: segment_1.mat, segment_2.mat, etc.  If “Cycles” is set to a higher number, you’ll generate that many separate cycles, plus ten segments for each of them.  When creating each new file, the software assigns it the lowest number not already taken; thus, if you delete an old “segment 4,” the next cycle will create a new “segment 4.”
  5. To use one segment as the seed for a remix (or multiple remixes), click “Remix” and then navigate to and select the corresponding .mat file.  This will generate files with names following the pattern cycle(segment_5)_1.wav and segment_5_remix_1.wav.
  6. To create a randomized video accompaniment, click “Generate Video” and select the WAV file you want it to accompany.  This will generate two files, silentvideo_1.avi and soundvideo_1.avi.  It will also create, and then delete, a bunch of temporary files.
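The "lowest number not already taken" rule from step 4 can be illustrated with a short Python sketch.  The function name and the way file names are matched are assumptions made for illustration; the installer's actual logic isn't documented here.

```python
import re


def next_free_number(existing_names, prefix, ext=".wav"):
    """Return the lowest positive integer n such that prefix + n + ext is unused."""
    pattern = re.compile(re.escape(prefix) + r"(\d+)" + re.escape(ext))
    taken = set()
    for name in existing_names:
        match = pattern.fullmatch(name)  # ignores names such as segment_5_remix_1.wav
        if match:
            taken.add(int(match.group(1)))
    n = 1
    while n in taken:
        n += 1
    return n


# An old segment_4.wav has been deleted, so 4 is the next number handed out.
names = ["segment_1.wav", "segment_2.wav", "segment_3.wav", "segment_5.wav"]
print(next_free_number(names, "segment_"))
```

One practical consequence of this rule is that the segments of a single cycle won't necessarily be numbered consecutively if gaps have been left by earlier deletions.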

I did make one important change to the code with future developments in mind.  Each recording George Blood digitizes for the Great 78 Project receives a “collection catalog number” of the form GBIA0022222A (or GBIA0022222B for the other side of the same disc), where the “22222” indicates the 22,222nd disc digitized.  It’s therefore easy to grab a random recording to sample by generating a random number, assembling a corresponding “collection catalog number,” and searching for it.  But we need a range for our random numbers.  Back when I first began experimenting with this idea, there were 45,182 items available, so I designed my code to explore the range 1-45,182.  All the examples presented in this blog post were created with that limitation.  However, the number of items has since gone up to 49,600, and there’s no telling where it might end.  The code I’ve provided for download above therefore tries instead to sample items in the range 1-250,000, reflecting the project’s stated long-term goal of digitizing 250,000 “78 singles.”  Of course, most of those numbers won’t point to anything yet, but my code simply skips any items it can’t find, and although processing will take slightly longer this way, the code should now be able to accommodate new additions to the project for quite some time to come.  Naturally, Archivorithm Number One also requires the continued existence of the Great 78 Project and Archive.org in order to function.  If at some point in the future these resources were tragically to vanish from the Internet, the work could no longer be fully actualized; we could only listen to past recordings of it.
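Building a candidate catalog number and skipping the misses might look something like the Python sketch below.  The seven-digit zero padding is inferred from the single example GBIA0022222A, the random choice of side is an assumption, and `exists` is a hypothetical stand-in for whatever search is actually run against Archive.org.

```python
import random


def catalog_number(disc, side):
    """Format a catalog number such as GBIA0022222A (padding width is inferred)."""
    return f"GBIA{disc:07d}{side}"


def random_catalog_number(rng, max_disc=250_000):
    """Draw a disc number in 1..max_disc and a side at random."""
    return catalog_number(rng.randint(1, max_disc), rng.choice("AB"))


def sample_existing(rng, exists, how_many, max_disc=250_000):
    """Keep drawing candidates, skipping any that `exists` reports as missing.

    Note: this loops forever if nothing in the range exists.
    """
    found = []
    while len(found) < how_many:
        candidate = random_catalog_number(rng, max_disc)
        if exists(candidate):
            found.append(candidate)
    return found


# Toy stand-in for the real lookup: pretend only discs up to 49,600 exist so far.
def pretend_exists(number):
    return int(number[4:11]) <= 49_600


print(sample_existing(random.Random(78), pretend_exists, 3))
```

With roughly 49,600 of 250,000 numbers populated, about four out of five draws would be misses in this toy setup, which gives a feel for why processing takes somewhat longer over the wider range.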

If you succeed in running the software as instructed, you ought to get results comparable to the ones I’ve shared above, except of course that you’ll be the first person ever to hear them.  I hope you enjoy the auditory voyage of discovery as much as I have.  Please let me know how it goes.  And if you happen to generate an especially interesting cycle, segment, or remix of Archivorithm Number One, I’d welcome a link below!


