if you've ever built a web app that plays audio and you've tried to ship it to iPhones, you've probably run into the same wall we did. iOS Safari blocks audio playback until it sees a user gesture, and a "user gesture" is a much narrower window than the docs imply. by the time your real audio URL arrives 30 seconds after the user tapped submit, the gesture has long expired. audio.play() rejects with NotAllowedError and you get a black box on the screen.
the workaround is a one-line trick that we use in flowy and that almost nobody talks about. this piece is the trick, the explanation of why it works, and a tiny copy-paste implementation. it's all MIT-licensed; do with it what you want.
the trick
inside your user-gesture click handler, set the audio element's src to a data URL of a tiny silent WAV (48 bytes once decoded), then call play(). that's it. WebKit will mark the audio element as user-gesture-allowed for the rest of the page session. any subsequent play() with a real URL will go through, even minutes later.
the silent WAV (the smallest one that WebKit will treat as valid):
```javascript
const SILENT_WAV =
  "data:audio/wav;base64,UklGRigAAABXQVZFZm10IBIAAAABAAEARKwAAIhYAQACABAAAABkYXRhAgAAAAEA";

function unlockAudio(audioElement) {
  if (audioElement._unlocked) return;
  // a real (non-data:) URL is already loaded; don't clobber it
  if (audioElement.src && !audioElement.src.startsWith("data:")) return;
  audioElement.src = SILENT_WAV;
  audioElement.muted = true;
  audioElement.play().then(() => {
    audioElement._unlocked = true;
    // only pause if the real URL hasn't been swapped in meanwhile
    if (audioElement.src === SILENT_WAV) {
      audioElement.pause();
      audioElement.currentTime = 0;
    }
    audioElement.muted = false;
  }).catch(() => {
    // not unlocked this attempt; un-mute and try again on the next gesture
    audioElement.muted = false;
  });
}
```

call unlockAudio() inside any click, touchend, mouseup, or keydown handler. once the play() promise resolves, the audio element is unlocked for the rest of the page session. the eventual play() with the real stream URL will work without a fresh tap.
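to make the timing concrete, here is a minimal sketch of the wiring, assuming a submit button, one audio element, and a slow backend. `wireSubmit` and `fetchStreamUrl` are names made up for illustration, and `unlock` stands in for unlockAudio so the block runs on its own. it's a sketch of the pattern, not the code we ship.

```javascript
// hypothetical wiring: `button`, `audio`, and `fetchStreamUrl` are placeholders.
// `unlock` is the unlock function, passed in so this sketch is self-contained.
function wireSubmit(button, audio, unlock, fetchStreamUrl) {
  button.addEventListener("click", () => {
    // 1. still inside the gesture: start the unlock before any async work.
    unlock(audio);

    // 2. the slow part. by the time this resolves, the gesture has expired.
    fetchStreamUrl()
      .then((url) => {
        audio.src = url;
        // 3. no fresh tap needed, provided the unlock in step 1 succeeded.
        return audio.play();
      })
      .catch((err) => {
        console.error("playback failed:", err);
      });
  });
}
```

the ordering is the whole point: the unlock call has to come first and must not sit behind an await or a timer, or it runs outside the gesture.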
why it works
WebKit's audio-gesture policy is enforced per audio element, not per page. the rule is: a play() call must happen inside a user-gesture handler at least once for the element to be unlocked. after that, the element can play whatever, whenever.
so the workaround is to give WebKit something to play immediately on the gesture, so that the play() call is made while the gesture is still live. the silent WAV is 48 bytes and embedded in the URL itself, so there's no network fetch; it's "playable" the moment its src is set. the play() resolves almost instantly. the element is now unlocked.
the catch: you have to do this on a real user-gesture event. pointerdown alone doesn't count on iOS; you need at least touchend, mouseup, or click. that's why our implementation listens to multiple event types and takes the first one that comes through.
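one practical consequence, if the policy really is per element as described above: an element you create later would start out locked, so it's safer to unlock one element and keep reusing it. a minimal sketch, with `makeSharedAudio`, `createAudio`, and `playOn` as hypothetical names (in a browser, `createAudio` would be something like `() => new Audio()`):

```javascript
// hypothetical helper: hands out one shared element so the unlock carries over.
// `createAudio` is injected so the sketch doesn't depend on a browser global.
function makeSharedAudio(createAudio) {
  let shared = null;
  return function getAudio() {
    if (shared === null) shared = createAudio();
    return shared;
  };
}

// swap tracks on the shared element instead of constructing a new one.
function playOn(audio, url) {
  audio.src = url;
  return audio.play();
}
```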
the gotchas
- don't latch on a rejected play(). if play() rejects, the unlock didn't happen. don't set your unlocked flag to true on rejection; the next gesture should try again.
- don't overwrite a real src. if the audio element already has a real (non-data:) URL loaded, don't reset it to the silent WAV. you'd interrupt playback. our implementation checks audio.src and bails if it's a real URL.
- handle the pause race. by the time the silent WAV's play() resolves, your code may have already set audio.src to the real URL. pausing then would kill real playback. check audio.src === SILENT_WAV before calling pause().
- don't rely on a single event type. pointerdown is too early on some iOS versions; click is too late on others. listen to a few and detach once unlocked.
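
the last gotcha, sketched under some assumptions: `installUnlock` is a hypothetical helper, and `tryUnlock(audio)` is assumed to be a promise-returning variant of unlockAudio that resolves to true when the silent play() succeeded and false when it was rejected (the snippet earlier doesn't return anything, so you'd need to adapt it). this is not the code in our repo, just one way to listen to several events and detach once unlocked.

```javascript
// gesture events to listen for; pointerdown is deliberately left out.
const GESTURE_EVENTS = ["touchend", "mouseup", "click", "keydown"];

// `tryUnlock(audio)` is an assumed helper: resolves to true on a successful
// silent play(), false on a rejected one.
function installUnlock(target, audio, tryUnlock) {
  let pending = false;

  function detach() {
    for (const type of GESTURE_EVENTS) {
      target.removeEventListener(type, onGesture);
    }
  }

  function onGesture() {
    // a single tap can fire several of these events back to back; only the
    // first should start an attempt, so ignore events while one is in flight.
    if (pending) return;
    pending = true;
    tryUnlock(audio).then((ok) => {
      pending = false;
      if (ok) detach(); // unlocked: stop listening
      // otherwise stay attached so the next gesture retries
    });
  }

  for (const type of GESTURE_EVENTS) {
    target.addEventListener(type, onGesture);
  }
  return detach; // lets the caller clean up manually as well
}
```

the in-flight guard matters because a second attempt would reset src while the first play() is still pending, which can abort it.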
a small standalone package
the implementation we ship in flowy is open source and lives in this repo at docs/audio-gesture-unlock/. no React deps, no framework, just two functions. copy-paste into your project, or wait for a future npm release.
if you ship something built on this, let us know. it'd be nice to have a list of products that needed the trick.
the larger point
WebKit's autoplay restrictions are reasonable from a user-experience standpoint. they exist because the open web spent years auto-playing music at people. but for tools where the user genuinely wants audio to start (after a tap, with their full consent), the restrictions are a tax. the silent WAV trick is the cleanest workaround we know.
it's not in the iOS docs anywhere. you find it by reading WebKit bug threads, copying code from open-source players, or asking the audio engineer at a podcast company who looks haunted when you bring it up. we're documenting it because nobody else does.