It’s been three years since I introduced the soundweft as an alternative form of “sound wave art,” and in this post I’d like to show what a new version of it makes of each entry in the American Song Contest of 2022, which is still underway as of this writing. The ASC entries present a nice test case because they force my “wefting” algorithm to contend with a variety of popular music genres while also demonstrating the range of visually distinctive results it can produce from them. I’ll have more to say below about what a soundweft is and what’s new about the new version of it, but so as not to keep you waiting, here are the first few results (in alphabetical order by state or territory; click to view at full resolution).
Alabama: Ni/Co, “The Difference” (460761, 165, 84)
Alaska: Jewel, “The Story” (288002, 102, 46)
American Samoa: Tenelle, “Full Circle” (375654, 128, 26)
So what is a soundweft? Simply put, it’s a visual display of the digital samples that make up an audio file, arranged to draw out longer-term patterns of repetition so that the eye can detect them. Somehow people have come to expect sound recordings to look like this:
A cottage industry has accordingly sprung up around visual art based on such designs, mostly intended for framing and hanging on a wall. But these waveform thumbnails don’t actually do a very good job of representing audio. Their time scale is much too compressed for the information in them to be playable—in spite of what you may think, they don’t contain the audio in that sense. They don’t even reveal all that much to the eye about what a given piece of audio is like. And in themselves they’re not very aesthetically appealing either, although artists often spruce them up with attractive colors, backgrounds, captions and such.
So I decided to try to come up with a more engaging way of displaying a sound recording visually—not one that involves any fundamental transformation of the data itself, like a sound spectrogram, but one that simply arranges the audio sample data more effectively. I’ll explain what I did below—but based on what you see, do you think I succeeded?
Arizona: Las Marías, “De La Finikera”* (551472, 181, 35)
Arkansas: Kelsey Lamb, “Never Like This” (467020, 153, 89)
California: Sweet Taboo, “Keys to the Kingdom” (470204, 167, 68)
Colorado: Riker Lynch, “Feel the Love” (257140, 89, 70)
Connecticut: Michael Bolton, “Beautiful World” (548560, 175, 65)
As originally introduced back in 2019, a soundweft:
- Displays the audio signal as a band of varying brightness rather than a wavy line of varying height. A band can be a single pixel high, so this approach lets us pack audio information into an image very efficiently.
- Displays positive values in the green channel and negative values in the blue channel. Since an audio signal oscillates between positive and negative values, this yields alternating green and blue stripes.
- Assigns the remaining red channel to the absolute value of the cumulative sum. This is superfluous as far as conveying the audio information goes, but it enhances visual contrast, and the cumulative sum corresponds to the function represented by a record groove a hundred years ago, so it wasn’t a random choice.
- Loops the display like a television signal with a cycle length that matches a repetitive pattern in the audio, running first from left to right and then, secondarily, from top to bottom. If a piece of audio features a steady beat (as most current popular songs do), and if we choose the right cycle length, stacking repetitions of the beat will produce vertical lines perpendicular to the horizontal lines representing successive cycles.
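As a rough illustration of those four rules, here’s a minimal Python sketch using only NumPy. It isn’t the code used to make the images in this post: the function name `weft_rows`, the scaling of the red channel to its own maximum, and the zero-padding of the final row are all assumptions made for the sake of the example.

```python
import numpy as np

def weft_rows(samples, cycle_length):
    """Map mono samples in [-1, 1] to RGB rows, one cycle per row (hypothetical helper)."""
    x = np.asarray(samples, dtype=np.float64)
    # Pad the tail with zeros so the signal fills a whole number of rows (assumption).
    n_rows = -(-len(x) // cycle_length)  # ceiling division
    x = np.pad(x, (0, n_rows * cycle_length - len(x)))

    green = np.clip(x, 0, None)    # positive values go to the green channel
    blue = np.clip(-x, 0, None)    # negative values go to the blue channel
    red = np.abs(np.cumsum(x))     # absolute value of the cumulative sum
    if red.max() > 0:
        red = red / red.max()      # assumed normalization to the range [0, 1]

    rgb = np.stack([red, green, blue], axis=-1)
    rgb = np.round(rgb * 255).astype(np.uint8)
    # Raster layout: left to right within a cycle, then top to bottom.
    return rgb.reshape(n_rows, cycle_length, 3)
```

Because each sample is either positive or negative, no pixel ever has both green and blue lit at once, which is what produces the alternating stripes.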
Delaware: Nitro Nitra, “Train” (246862, 86, 44)
Florida: Ale Zabala, “Flirt” (460800, 165, 45)
Georgia: Stela Cole, “DIY” (235102, 85, 8)
Guam: Jason J., “Midnight” (508236, 179, 124)
Hawaii: Bronson Varde, “4 You” (411429, 144, 123)
My original soundweft technique created images that were a lot wider than they were tall—so much so that they didn’t work practically for visual display. To get something more suitable for (hypothetically) hanging on a wall, I rescaled the raw results into squares. But that made the vertical dimension largely redundant and thereby wasted a lot of space. Meanwhile, preserving the original audio sample rate for a typical popular song would have meant creating extremely large images—say, 100,000 pixels on each side. So I ended up scaling down the soundweft images I actually published to 1000 or 3000 pixels per side.
But I’ve since refined my method to eliminate this need for rescaling. Instead of having the horizontal band representing each “cycle” run straightforwardly from left to right, I now break it into smaller chunks and display each of these as a vertical band within the horizontal band, with the vertical bands running from top to bottom and then, secondarily, from left to right. An algorithm works out how tall the individual bands need to be in order for the final image to come out approximately square. In this way, audio can be displayed at its original sample rate in square images of reasonable size. As an added bonus, some further patterning—curvy stripes and such—often emerges visibly inside the horizontal bands.
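The folding step can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions: `fold_cycle` and `square_band_height` are hypothetical names, the `fill` value is a stand-in for blank pixels, and the square-root rule is just one plausible way of getting an approximately square result (if a recording has N samples and a cycle has C samples, a band of height h is about C/h wide and the stack of bands is about (N/C)·h tall, and those are equal when h = C/√N). The actual algorithm may work differently.

```python
import math
import numpy as np

def fold_cycle(cycle, band_height, fill=0.0):
    """Lay one cycle out as columns running top to bottom, then left to right."""
    cycle = np.asarray(cycle, dtype=np.float64)
    n_cols = -(-len(cycle) // band_height)   # ceiling division
    padded = np.full(n_cols * band_height, fill, dtype=np.float64)
    padded[:len(cycle)] = cycle
    # Each run of band_height samples becomes one column of the band.
    return padded.reshape(n_cols, band_height).T

def square_band_height(total_samples, cycle_length):
    """Hypothetical heuristic: band height that makes the image roughly square."""
    return max(1, round(cycle_length / math.sqrt(total_samples)))
```

For instance, `fold_cycle(np.arange(10), 4, fill=-1)` gives a band 4 pixels tall and 3 columns wide, whose last column holds samples 8 and 9 followed by two fill values.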
Idaho: Andrew Sheppard, “Steady Machine” (575992, 170, 138)
Illinois: Justin Jesso, “Lifeline” (151579, 54, 53)
Indiana: UG skywalkin (ft. Maxie), “Love in My City” (500868, 172, 168)
Iowa: Alisabeth Von Presley, “Wonder” (265846, 96, 74)
Kansas: Broderick Jones (ft. Calio), “Tell Me” (300522, 106, 94)
Kentucky: Jordan Smith, “Sparrow” (590764, 182, 8)
Louisiana: Brittany Pfantz, “Now You Do” (471252, 167, 22)
Maine: King Kyote, “Get Out Alive” (432618, 113, 59)
Maryland: Sisqó, “It’s Up” (539996, 190, 174)
Wait a minute, I hear you objecting: based on what I’ve described so far, the images should be square, but the ones I’ve been sharing are rectangular. And you may also have noticed that they’re symmetrical (or nearly so) about their vertical center line.
That’s because of how I’ve decided to handle stereo. My current “wefting” algorithm gives me three options: (1) summing the left and right channels; (2) alternating left and right samples; or (3) presenting the right channel on the right and the left channel as a mirror image on the left.
The images in this blog post follow option number three. Instead of running from left to right, then, the audio signal runs from the vertical center line outwards, towards the left and right sides of the image simultaneously. The resulting symmetry heightens the visual impact of the images, much as folding a piece of paper over an inkblot does for a Rorschach test.
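The three stereo options are simple to express in Python with NumPy. The sketch below is illustrative only; the function names are hypothetical, and options one and two are shown operating on sample arrays while option three operates on two already-rendered single-channel images.

```python
import numpy as np

def sum_channels(left, right):
    """Option 1: sum the left and right channels into one signal."""
    return np.asarray(left) + np.asarray(right)

def interleave(left, right):
    """Option 2: alternate left and right samples (L, R, L, R, ...)."""
    left, right = np.asarray(left), np.asarray(right)
    out = np.empty(2 * len(left), dtype=np.result_type(left, right))
    out[0::2] = left
    out[1::2] = right
    return out

def mirror_stereo(left_img, right_img):
    """Option 3: right channel on the right, left channel flipped on the left."""
    left_img, right_img = np.asarray(left_img), np.asarray(right_img)
    # Reversing the column order makes the left channel run from the center outwards.
    return np.concatenate([left_img[:, ::-1], right_img], axis=1)
```

With option three, any material that is identical in both channels comes out perfectly mirrored, and only the stereo differences break the symmetry.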
Massachusetts: Jared Lee, “Shameless” (566560, 195, 110)
Michigan: Ada LeAnn, “Natalie” (432000, 147, 43)
Minnesota: Yam Haus, “Ready to Go” (576000, 211, 30)
Mississippi: Keyone Starr, “Fire” (221538, 77, 68)
Missouri: HALIE, “Better Things” (535812, 189, 3)
Montana: Jonah Prill, “Fire It Up” (411426, 144, 126)
Nebraska: Jocelyn, “Never Alone” (299220, 89, 87)
Nevada: The Crystal Method (feat. Koda and VAAAL), “Watch Me Now”* (152582, 55, 43)
The three numbers given in parentheses after each selection title represent:
- The cycle length in samples at 48 kHz.
- The height of the horizontal bands.
- A padding value.
The optimal cycle length inferred for a given piece of audio usually doesn’t divide evenly into columns of the optimal height inferred for the horizontal bands, so the padding value refers to the number of blank (white) pixels added to the end of each band to fill out the extra space. In the last image shown above (Nevada), for instance, the band height is 55 and the padding value is 43.
That means that the last column in each band, at the left or right edge of the image, will contain 12 pixels representing audio samples followed by 43 white pixels. I also add white pixels at the end of each recording as needed to pad out any remaining space in the bottom band, which is why the images you see here generally have a wider or narrower “pedestal” at the bottom.
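The relationship among the three numbers is plain modular arithmetic, and it can be checked against the captions with a few lines of Python. The function name `band_layout` is hypothetical; the figures in the usage note come from the captions in this post.

```python
def band_layout(cycle_length, band_height):
    """Return (padding, columns per band, audio samples in the last column)."""
    # Blank pixels needed to round the cycle up to a whole number of columns.
    padding = -cycle_length % band_height
    columns = (cycle_length + padding) // band_height
    samples_in_last_column = band_height - padding
    return padding, columns, samples_in_last_column
```

For the Nevada entry, `band_layout(152582, 55)` returns `(43, 2775, 12)`: 43 padding pixels, 2775 columns per band, and 12 audio samples in the last column. For New Mexico, `band_layout(345600, 120)` returns `(0, 2880, 120)`, since the cycle divides evenly and no padding is needed.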
New Hampshire: MARi, “Fly” (411464, 148, 124)
New Jersey: Brooke Alexx, “I Don’t Take Pictures Anymore” (598440, 207, 204)
New Mexico: Khalisol, “Drop” (345600, 120, 0)
New York: ENISA, “Green Light” (523636, 167, 76)
North Carolina: John Morgan, “Right in the Middle” (250434, 79, 75)
North Dakota: Chloe Fredericks, “Can’t Make You Love Me” (128000, 45, 25)
Northern Mariana Islands: Sabyu, “Sunsets and Seaturtles” (392544, 117, 108)
Ohio: Macy Gray (ft. The California Jet Club and Maino), “Every Night”* (239998, 82, 16)
Oklahoma: AleXa, “Wonderland” (576000, 203, 114)
I believe soundwefts, as images, are transformative. They convert source audio into a novel visual form, building on it for a purpose wholly different from that of the original and enabling us to reflect on it in a new way. They can’t be “read” by eye the way sheet music can, or analyzed the way sound spectrograms can, but their visual patterning is still meaningful and interesting.
Unlike most of what passes out there for “sound wave art,” though, the soundwefts presented in this post are all technically playable, in the sense that they contain enough of the right kind of information to support audio playback. In fact, here’s a Python script that can handily convert any of these soundwefts into a WAV file. All you need to do (presuming you’re comfortable running Python scripts and have SoundFile, NumPy, and OpenCV installed) is manually edit the code to enter the path to the image and the correct sample_rate (48000), band_height, and padding_value.
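To give a sense of what such a conversion involves, here’s a minimal decoding sketch in Python with NumPy. It is not the script mentioned above, and it rests on several assumptions: the array holds just one audio channel’s half of the image, its color channels are in red-green-blue order (OpenCV loads images as blue-green-red, so they would need reordering first), each sample is recovered as green minus blue, and the function name `unweft` is hypothetical. Writing the result out as a WAV file is left aside.

```python
import numpy as np

def unweft(img, band_height, padding_value):
    """Recover signed samples from one channel's half of a soundweft (sketch)."""
    img = np.asarray(img, dtype=np.float64)
    # Green holds positive values and blue holds negative ones; white padding
    # pixels have equal green and blue, so they decode to zero.
    signed = (img[..., 1] - img[..., 2]) / 255.0
    n_bands = signed.shape[0] // band_height
    cycles = []
    for b in range(n_bands):
        band = signed[b * band_height:(b + 1) * band_height]
        # Read columns top to bottom, then left to right.
        flat = band.T.reshape(-1)
        # Drop the padding pixels at the end of the band.
        cycles.append(flat[:len(flat) - padding_value])
    return np.concatenate(cycles)
```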
This is good insofar as it proves that soundwefts really contain the audio they’re supposed to represent—a claim that underlies much of the appeal of other “sound wave art,” but which is usually deceptive. Soundwefts actually do what other “sound wave art” only pretends to do.
But it could be bad for the same reason. If soundwefts aren’t just visual artworks, but also viable sound recordings in disguise, then distributing them might be deemed just as problematic on copyright grounds as distributing mp3s of the songs themselves.
That’s one of the reasons I decided to share these soundwefts as JPGs (another reason being that JPGs are comparatively small). If I’d shared them as TIFs, you’d have been able to extract pretty decent audio from them, as I know from actual experiment. If I’d shared them as PNGs, the quality might still have been acceptable. But JPG compression has thoroughly hobbled the data: if you go to the trouble of using my Python script to convert the JPGs into WAVs, you’ll be able to recognize the songs, but the quality will be poor, with a lot of distortion, as though you were listening to a radio station a bit too far away for you to be able to pick up properly.
So I feel safe in saying that these specific image files aren’t viable substitutes for the source recordings. If they were printed out and framed, the potential sound quality of the hard copies would drop even further.
But none of this reflects any limitation inherent in the soundweft as such. Soundwefts that look indistinguishable from these could contain high-quality audio. And if a band wanted to distribute a song as a soundweft in TIF or PNG format, together with a special player for it, I see no reason why that wouldn’t be feasible. (Contact me if you’re interested!)
Oregon: courtship., “Million Dollar Smoothies” (299222, 106, 16)
Pennsylvania: Bri Steves, “Plenty Love” (274286, 98, 16)
Puerto Rico: Christian Pagán, “Loko” (587760, 205, 180)
Rhode Island: Hueston, “Held On Too Long” (354232, 125, 18)
South Carolina: Jesse LeProtti, “Not Alone” (540000, 197, 174)
South Dakota: Judd Hoos, “Bad Girl” (303158, 106, 2)
Tennessee: Tyler Braden, “Seventeen” (590764, 164, 128)
Texas: Grant Knoche, “Mr. Independent” (555176, 181, 132)
I’ve found it trickier than I thought it would be to design an algorithm that reliably chooses the best cycle length for a given song. My current “wefting” script uses two passes of autocorrelation, one at lower resolution (to save time) and another at higher resolution near the result of the first pass, to find the strongest cycle with a length of 2.5 seconds plus or minus 0.7 seconds. It then tests five multiples of that cycle length (×1, ×2, ×3, ×4, ×5) to try to find the strongest among them, comparing just the “top” areas corresponding to the shortest image height, which isn’t ideal but seems to work better in practice than other things I’ve tried.
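The coarse-then-fine search can be sketched as follows in Python with NumPy. This is an illustration of the general idea, not the actual “wefting” script. The function names, the use of the rectified (absolute-value) signal, downsampling by block averaging, the normalized form of the autocorrelation, and the default `step` of 64 are all assumptions. The follow-up comparison of the ×1 to ×5 multiples is left out.

```python
import numpy as np

def autocorr_at(x, lag):
    """Normalized autocorrelation of x at a single lag."""
    a, b = x[:-lag], x[lag:]
    denom = np.sqrt(np.dot(a, a) * np.dot(b, b))
    return float(np.dot(a, b) / denom) if denom > 0 else 0.0

def find_cycle(samples, sample_rate, center=2.5, spread=0.7, step=64):
    """Find the strongest cycle length (in samples) within center +/- spread seconds."""
    x = np.abs(np.asarray(samples, dtype=np.float64))  # rectified signal (assumption)
    x = x - x.mean()
    lo = int((center - spread) * sample_rate)
    hi = int((center + spread) * sample_rate)

    # Pass 1: coarse search on a copy downsampled by averaging blocks of `step` samples.
    n = len(x) // step * step
    coarse = x[:n].reshape(-1, step).mean(axis=1)
    coarse = coarse - coarse.mean()
    coarse_lags = range(max(1, lo // step), hi // step + 1)
    best = max(coarse_lags, key=lambda k: autocorr_at(coarse, k)) * step

    # Pass 2: full-resolution search in a narrow window around the coarse result.
    fine_lags = range(max(1, lo, best - step), min(hi, best + step) + 1)
    return max(fine_lags, key=lambda k: autocorr_at(x, k))
```

The coarse pass keeps the number of expensive full-length comparisons small: only the lags within one `step` of the coarse winner are checked at full resolution.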
The automated results often seem to come out right, but not always. Symptoms of a “wrong” choice of cycle length include patterns alternating between adjacent cycle bands (see possible examples in North Carolina, Washington, and West Virginia) and patterns that repeat diagonally (see possible example in Mississippi). But those symptoms can be visually appealing in themselves.
Maine is a particularly interesting case: the tempo increases and then decreases, producing a distinctive vertical curvature.
Note that I’ve used the “official audio” of the entries whenever it’s been readily available, and not the televised performances. Sometimes there are substantive differences: for example, the official audio for the Maine entry is a lot longer than the televised performance, and the official audio for the Georgia entry contains some not-suitable-for-radio language that doesn’t fit the “DIY” theme of the song nearly as well as the clean version (“go f**k yourself” versus “go fix yourself”). In a few cases marked with asterisks I couldn’t find “official audio” from ASC and went instead with whatever studio recording I could find that most seemed to match the televised performance.
U.S. Virgin Islands: Cruz Rock, “Celebrando” (460800, 156, 24)
Utah: Savannah Keyes, “Sad Girl” (443073, 146, 37)
Vermont: Josh Panda, “Rollercoaster” (397240, 138, 62)
Virginia: Almira Zaky, “Over You” (493716, 176, 140)
Washington: Allen Stone, “A Bit of Both” (515776, 165, 14)
Washington, D.C.: NËITHER, “I Like It” (240000, 86, 26)
West Virginia: Alexis Cunningham, “Working on a Miracle” (411405, 126, 111)
Wisconsin: Jake’O, “Feel Your Love” (571240, 196, 100)
Wyoming: Ryan Charles, “New Boot Goofin'” (225882, 80, 38)
That’s it for the ASC soundwefts. Some of them have been very visually striking, with standout pieces, in my opinion, being Georgia, Indiana, Kansas, Maine, Pennsylvania, and Wyoming. Others have been less so. But I daresay that even the least compelling of them is a big improvement on this as something you might want to hang on your wall:
PS. A few stray reflections on the American Song Contest itself:
- Announcement of Results. There’s been some variation from episode to episode as to how and when results are shared. Starting with the third episode, for example, the relative jury rankings have been reported after every few performances. I’m torn as to whether I like that arrangement or not, but I’d have preferred consistency one way or the other, since the differences in reporting have probably affected the popular vote and made the playing field a little less even.
- The Jury. I’m in Indiana, and published lists identify the jury member representing my state as “Nancy Yearing, Talent Development.” But her LinkedIn profile shows that she’s been working in Los Angeles, California, for the past seventeen years, and her only Indiana connection seems to be that she attended DePauw University in Greencastle from 1998 to 2002 (her hometown was cited at the time as Ridgewood, New Jersey). It’s true that Yearing brings some impressive qualifications to the table from past work on American Idol and America’s Got Talent. But if this case is typical, it seems the ASC folks didn’t make much effort to recruit music industry professionals active in the specific music industries of the states and territories they’re supposed to represent.
- Non-Qualifier I’d Most Like To See Advance to the Semi-Finals: Alisabeth Von Presley’s “Wonder,” which has inspired my son to want a keytar.