Sonic Cafe, or how to caffeinate studio speakers

I like smart home stuff - in moderation. Some things end up too smart for my liking. So I got rid of my HomePods and got me a pair of KRK Classic 5s. These studio monitors sound great, and are fantastic value for their price, but have one annoying feature: the built-in circuitry that shuts them down when there’s no audio playing. It makes sense in a studio context, but sometimes I just like to listen to music quietly, and the shutdown threshold is just a little bit too high for that.

Some people have physically modded their speakers to permanently disable that feature, but I’d prefer to keep my warranty for now, and I actually do like auto-shutdown - just on my own terms.

Nothing reinforces one’s claim to a hacker badge more than solving hardware problems in software (or vice versa). I wanted something like macOS’s caffeinate(1) command, except for audio.

Theory

Disclaimer: while I do sometimes end up doing audio engineering work professionally, I do not have any formal training - I was a hobbyist until I started “real” work in this area by pure circumstance.

Humans can’t hear sounds outside of the range from roughly 20Hz (below that is infrasound) to 20kHz (above that is ultrasound), but it’s usually possible for various pieces of the audio chain to process and transmit such signals. My speakers’ auto-shutdown circuitry doesn’t seem to care much about the frequency, but it does care about the amplitude. So I could simply generate a sine wave outside of the audible range to keep the speakers caffeinated.

Ultrasounds are a risky path to take. First off, they get finicky as you approach the Nyquist frequency (half the sampling rate). I could crank up my DAC’s sampling rate, but $OPERATING_SYSTEM will still sometimes randomly revert it to 44.1kHz, because all software sucks.
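To put a number on “finicky”: a tone close to the Nyquist frequency is described by barely more than two samples per cycle, and a tone above it folds back (aliases) to a lower frequency. Here is a small sketch of that arithmetic, assuming an ideal sampler with no anti-aliasing filter. The function names are made up for illustration, and a real resampler would typically filter such a tone out rather than alias it, which would defeat the keep-alive signal anyway.

```python
def nyquist(sample_rate_hz: float) -> float:
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2.0


def samples_per_cycle(tone_hz: float, sample_rate_hz: float) -> float:
    """How many samples describe one period of the tone."""
    return sample_rate_hz / tone_hz


def alias_frequency(tone_hz: float, sample_rate_hz: float) -> float:
    """Apparent frequency of a pure tone after ideal sampling.

    Assumption: no anti-aliasing filter in the chain.
    """
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


if __name__ == "__main__":
    for rate in (44_100, 96_000):
        for tone in (10, 21_000, 25_000):
            print(
                f"fs={rate} Hz, tone={tone} Hz: "
                f"nyquist={nyquist(rate):.0f} Hz, "
                f"samples/cycle={samples_per_cycle(tone, rate):.2f}, "
                f"appears at {alias_frequency(tone, rate):.0f} Hz"
            )
```

Under these assumptions, a 25kHz tone is fine at 96kHz, but lands at 19.1kHz if the rate silently drops back to 44.1kHz. A 10Hz tone is unaffected either way.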

Early on in the history of CD/digital audio, there was an anti-piracy idea being tried: embed specific signals above the 20kHz threshold, which would mess up cassette tape copies. A similar technique was previously employed on vinyl records, an analogue medium. Fortunately the idea to apply it to CDs got scrapped, as it turned out there are people with hearing so perfect that they were able to perceive audible distortion during playback - with 100% accuracy in blind tests.

My own hearing is nowhere near that good - in practice, many humans can’t even hear anything above 15kHz. But importantly, animals such as cats certainly are able to hear way beyond that (up to 64kHz). Even ~~peasant~~ commodity audio hardware will often reproduce audio in the ultrasound range, let alone studio speakers. I want this signal to be inaudible to all residents of my home; that includes my kitties, Cookie and Spot.

So I’m going to go with infrasounds.

Prototype

I wanted something quick and dirty to test the hypothesis. I know no better environment for these kinds of experiments than Pure Data.

My patch looks like this:

[osc~ 10]  [loadbang]
 |          |          \
 |         [0.2 100 (  [;        (
 |          |          [pd dsp 1 (
 |         [line~]
 |         /
 |       /
 | ___ /
[*~]
 |\
[dac~]

sonic cafe.pd

#N canvas 88 59 299 231 12;
#X obj 27 169 dac~;
#X obj 27 141 *~, f 4;
#X msg 172 71 \; pd dsp 1;
#X obj 98 99 line~;
#X msg 98 71 0.2 100;
#X obj 97 11 loadbang;
#X obj 29 11 osc~ 10;
#X connect 1 0 0 0;
#X connect 1 0 0 1;
#X connect 3 0 1 1;
#X connect 4 0 3 0;
#X connect 5 0 2 0;
#X connect 5 0 4 0;
#X connect 6 0 1 0;

If you’re unfamiliar with Pd, here’s what each element in this patch does:

loadbang triggers an event (a “bang”) when the patch is loaded, which causes the two connected messages to fire.
The message pd dsp 1 turns on Pd’s signal processing engine, which is normally off at startup.
The message 0.2 100 contains the parameters to line~, which is used as a ramp generator. It means: go from 0 to 0.2 over 100ms.
osc~ 10 is a simple sinusoidal oscillator (strictly speaking, Pd’s osc~ outputs a cosine wave), running at 10Hz.
*~ applies the ramp (amplitude control) signal to the oscillator.
The output is sent to both left and right channels on the dac~ (audio output).

The only real “trick” here is the ramp generator - it serves to avoid a sharp “pop” that could occasionally happen when the oscillator’s output is first sent to the speakers. The amplitude value of 0.2 was determined during testing. I wanted it as low as possible, but high enough to trigger the speakers to turn on as soon as I turn the physical volume knob up from zero.
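For readers without Pd, the same signal can be sketched in plain Python: a 10Hz cosine (mirroring osc~) multiplied by a gain that ramps linearly from 0 to 0.2 over 100ms, written to a stereo WAV file. This is only an offline approximation of the patch, not a replacement for it. The sample rate, file name and function names are assumptions made for the example.

```python
import math
import struct
import wave

SAMPLE_RATE = 44_100      # Hz; assumed, not taken from the patch
TONE_HZ = 10.0            # same as [osc~ 10]
TARGET_AMPLITUDE = 0.2    # same as the "0.2 100" message
RAMP_MS = 100.0           # ramp time, as in the "0.2 100" message


def ramped_tone(duration_s: float, sample_rate: int = SAMPLE_RATE) -> list:
    """Cosine tone whose gain ramps linearly from 0 to TARGET_AMPLITUDE."""
    ramp_samples = int(sample_rate * RAMP_MS / 1000.0)
    total = int(duration_s * sample_rate)
    out = []
    for n in range(total):
        # Linear ramp, then hold: the equivalent of line~ feeding *~.
        gain = TARGET_AMPLITUDE * min(1.0, n / ramp_samples)
        out.append(gain * math.cos(2.0 * math.pi * TONE_HZ * n / sample_rate))
    return out


def write_stereo_wav(path: str, samples: list, sample_rate: int = SAMPLE_RATE) -> None:
    """Write the same signal to left and right channels as 16-bit PCM."""
    frames = bytearray()
    for s in samples:
        value = int(round(s * 32767))
        frames += struct.pack("<hh", value, value)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))


if __name__ == "__main__":
    # Hypothetical output file name; loop it in any player to try the idea.
    write_stereo_wav("sonic_cafe_5s.wav", ramped_tone(5.0))
```

Without the ramp, a cosine would start at full amplitude on the very first sample, which is exactly the kind of step that produces a pop.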

This prototype is good enough that I’ve been using it as-is for several months. I just double-click sonic cafe.pd on my desktop, turn up the volume knob on my DAC a tiny bit, and the speakers turn on. Pd needs to keep the patch open in a window - it’s small and doesn’t bother me, but it’s noticeable enough that I usually won’t leave it running by accident; so the power-saving feature might still engage opportunistically.

The speakers (along with the screen, and a whole bunch of other junk on my desk) will still end up unpowered unconditionally before I go to sleep or leave the house - I have various HomeKit automations set up for that.

Future

While the prototype is “good enough” for everyday usage, Pd’s DSP engine sometimes generates small, audible pops, and I hate it. It’s most likely simple buffer underruns - probably fixable; but it gives me motivation to revisit the original problem.
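A buffer underrun happens when the software fails to hand the audio hardware its next buffer before the current one has finished playing. The time budget is simply the buffer length divided by the sample rate, as the sketch below shows. Pd computes audio in blocks of 64 samples by default; the larger buffer sizes are illustrative values, not settings taken from this setup, and the function name is made up.

```python
def buffer_deadline_ms(frames: int, sample_rate_hz: float) -> float:
    """Time available to produce the next buffer before playback runs dry."""
    return 1000.0 * frames / sample_rate_hz


if __name__ == "__main__":
    # 64 is Pd's default block size; the other sizes are only examples.
    for frames in (64, 256, 1024, 4096):
        print(f"{frames:5d} frames @ 44.1kHz -> {buffer_deadline_ms(frames, 44_100):7.2f} ms")
```

Larger buffers give the software more slack at the cost of latency, which barely matters for a signal nobody is supposed to hear.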

I would like to rewrite this (Swift? Go?) to run as a daemon. I’d like to add many features and fix some annoyances, such as:

Start the thing on login;
Don’t keep any extra windows open;
Lock the output to the specific DAC;
Smoothly restart the output when the DAC is unplugged/plugged back;
Hijack/monitor the DAC output to determine if it actually makes sense to keep the speakers powered on (maybe indeed there’s no audio to be played).
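The last item could be sketched as a simple activity gate: measure the RMS level of each block of programme audio, remember when it was last above a threshold, and keep the tone running only for a while after that. Everything here is hypothetical - the class name, the threshold and the hold time are placeholders, and actually capturing the DAC’s output is platform-specific and not shown. The blocks fed in must not include the keep-alive tone itself, or the gate would never close.

```python
import math
from typing import Optional, Sequence


def rms(block: Sequence[float]) -> float:
    """Root-mean-square level of one block of samples (0.0 for an empty block)."""
    if not block:
        return 0.0
    return math.sqrt(sum(s * s for s in block) / len(block))


class ActivityGate:
    """Decides whether the keep-alive tone should be playing.

    threshold and hold_seconds are made-up defaults, to be tuned by ear.
    """

    def __init__(self, threshold: float = 0.001, hold_seconds: float = 600.0) -> None:
        self.threshold = threshold
        self.hold_seconds = hold_seconds
        self._last_active: Optional[float] = None

    def update(self, block: Sequence[float], now: float) -> bool:
        """Feed one block captured at time `now` (seconds); True means keep the tone on."""
        if rms(block) >= self.threshold:
            self._last_active = now
        if self._last_active is None:
            return False  # nothing has played yet
        return (now - self._last_active) <= self.hold_seconds


if __name__ == "__main__":
    gate = ActivityGate(threshold=0.01, hold_seconds=5.0)
    print(gate.update([0.0] * 64, now=0.0))   # silence, nothing played yet
    print(gate.update([0.1] * 64, now=1.0))   # music playing
    print(gate.update([0.0] * 64, now=6.5))   # silent for longer than the hold time
```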

…but there’s also a non-trivial chance that the Pd patch may outlive the speakers. There’s a piece of duct tape keeping a plane flying, somewhere above you, right now.

Risks

Since the DAC evidently does produce a sub-20Hz line signal, I was curious whether the speakers would actually physically respond at those frequencies (there could have been a high-pass filter to remove DC offset, which might also attenuate a 10Hz tone). I cranked up the volume way high and indeed, they do! I was concerned whether this creates any risk of physically damaging the speakers, but a more knowledgeable friend reassured me that this should be safe.
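To get a feel for how much a DC-blocking filter could eat into a 10Hz tone, here is the magnitude response of a textbook first-order high-pass filter. This is a generic model only: the actual filtering inside the speakers or the DAC is unknown, and the cutoff frequencies below are invented for the example.

```python
import math


def highpass_gain(freq_hz: float, cutoff_hz: float) -> float:
    """Linear gain of an ideal first-order high-pass filter at freq_hz."""
    ratio = freq_hz / cutoff_hz
    return ratio / math.sqrt(1.0 + ratio * ratio)


def to_db(gain: float) -> float:
    """Convert a positive linear gain to decibels."""
    return 20.0 * math.log10(gain)


if __name__ == "__main__":
    # Hypothetical cutoffs; none of these are measured values.
    for cutoff in (1.0, 5.0, 10.0, 30.0):
        gain = highpass_gain(10.0, cutoff)
        print(f"cutoff {cutoff:5.1f} Hz -> 10 Hz tone at {gain:.3f}x ({to_db(gain):+.1f} dB)")
```

With a cutoff around 1Hz the tone passes almost untouched; with a cutoff at 10Hz it loses about 3dB, which a slightly higher amplitude in the patch could compensate for.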

Still, the 10Hz sine wave that drives this hack is now very faintly, but physically manifested in the reality around me, just like actual caffeine is physically present in a human body.