The best example of how vibe-coding changes what’s possible is what I’ve started calling the laughter saga. It began with a simple question: could you build a Try-Not-To-Laugh game that actually watches your face?
Building the Foundation
The first step was boring but necessary. I built a dataset pipeline that downloads and segments laughter audio clips from YouTube using AudioSet metadata, resampling everything to 16 kHz mono. Then I trained a ResNet18-based CNN on mel spectrograms to classify laughter versus not-laughter. It hit an F1 score of 76.2%, which isn’t going to win any competitions but is more than enough to know when someone cracks.
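For readers who haven't met F1 before: it's the harmonic mean of precision and recall, which is why it's a stricter summary than raw accuracy on an imbalanced laughter dataset. A minimal sketch, with hypothetical confusion counts chosen purely for illustration (not the actual evaluation run):

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Illustrative counts only: 320 true positives, 90 false positives,
# 110 false negatives would land at the same 76.2% mark.
print(round(f1_score(320, 90, 110), 3))  # → 0.762
```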
From there it spiralled. I exported the model to ONNX for browser inference. Built a Flask API with a web frontend for real-time microphone-to-prediction. Rewrote a 2019 sound event detection system to run on Apple Silicon at 60+ FPS. Each version taught me something the previous one hadn’t.
Seven Iterations Deep
The game iterations are where it gets interesting. Laughter3 was a Vue.js prototype combining face detection with video playback. Laughmachine got more ambitious: adaptive difficulty using contextual bandits to learn what makes each individual player crack, HP bars, segment-level heatmaps, all running at 15 Hz on the webcam with no frames leaving the device. The party version added Jackbox-style lobbies where players join via room codes or QR codes, submit YouTube clips, and watch them together while the system monitors everyone's face.
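The per-player adaptive difficulty boils down to a bandit loop: pick a clip category, see whether the face detector caught a laugh, update that arm's estimate. A minimal epsilon-greedy sketch of the idea — not the actual Laughmachine code, and the arm names and binary reward signal are illustrative assumptions:

```python
import random

class PerPlayerBandit:
    """Epsilon-greedy bandit that learns which clip category cracks one player."""

    def __init__(self, arms, epsilon=0.1):
        self.epsilon = epsilon
        self.counts = {a: 0 for a in arms}    # times each arm was served
        self.values = {a: 0.0 for a in arms}  # running mean reward per arm

    def select(self) -> str:
        if random.random() < self.epsilon:            # explore occasionally
            return random.choice(list(self.counts))
        return max(self.values, key=self.values.get)  # otherwise exploit

    def update(self, arm: str, reward: float) -> None:
        """reward = 1.0 if the player laughed at the clip, else 0.0."""
        self.counts[arm] += 1
        n = self.counts[arm]
        self.values[arm] += (reward - self.values[arm]) / n  # incremental mean
```

Each round the game would call `select()`, play a clip from that category, and feed the detector's verdict back through `update()`; over a session the estimates converge on whatever genre reliably breaks that particular player.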
PublicLaugh was the most polished iteration: a competitive platform with continuous Elo ratings for both players and video clips. A bandit algorithm selects clips intelligently, WebSocket sync keeps everyone in lockstep, and dual leaderboards track the most composed players alongside the most effective joke-makers.
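Rating both players and clips works because Elo doesn't care what an "opponent" is: the player wins the exchange by staying composed, the clip wins by landing a laugh. A sketch of one standard Elo update (the usual 400-point logistic scale; the K-factor and the zero-sum pairing are my assumptions about how PublicLaugh could wire it up):

```python
def elo_update(player: float, clip: float, player_laughed: bool, k: float = 32.0):
    """One Elo exchange between a player rating and a clip rating.

    The player 'wins' by not laughing; the clip 'wins' by getting a laugh.
    """
    expected_player = 1.0 / (1.0 + 10 ** ((clip - player) / 400.0))
    score = 0.0 if player_laughed else 1.0
    delta = k * (score - expected_player)
    return player + delta, clip - delta  # zero-sum: clip gains what player loses

# Two fresh 1200s meet and the player cracks:
print(elo_update(1200, 1200, player_laughed=True))  # → (1184.0, 1216.0)
```

Because upsets transfer more points than expected results, a clip that cracks highly composed players climbs the clip leaderboard fast, which is exactly the dual-ranking behaviour described above.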
And then TNTL 2.0 emerged as the grand unification of everything I’d learned. Survival modelling instead of binary pass/fail. Hazard functions and survival curves. A composure index that works like an Elo for poker faces. A hybrid recommender combining collaborative filtering with content embeddings from Whisper, CLIP, and YAMNet. Four game modes.
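The survival-modelling framing is easiest to see in discrete time: if each video segment carries a hazard h_i (the chance the player cracks during that segment, given they've held out so far), the survival curve is the running product S(t) = ∏(1 − h_i). A minimal sketch with made-up hazards, not the actual TNTL 2.0 model:

```python
def survival_curve(hazards):
    """Discrete-time survival: S(t) = prod over i <= t of (1 - h_i).

    hazards[i] is the conditional probability of cracking during segment i,
    e.g. derived from the laugh classifier. Values below are illustrative.
    """
    curve, s = [], 1.0
    for h in hazards:
        s *= 1.0 - h
        curve.append(s)
    return curve

# Hazards ramping up through a clip: survival decays to roughly 0.315,
# i.e. about a 68% chance the player has cracked by the end.
print(survival_curve([0.1, 0.3, 0.5]))
```

Unlike a binary pass/fail, the curve says *when* a player is likely to break, which is what lets a composure index behave like an Elo for poker faces.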
What It Taught Me
Seven projects deep into the same idea, each one more sophisticated than the last. A year ago, the first version would have been the only version. I wouldn’t have had the technical endurance to iterate that many times. Vibe-coding didn’t just let me build the thing. It let me build the thing seven times, learning something new each time. The iteration depth is where the real value of AI-assisted development shows up. The full laughter saga is documented alongside all my other projects in Everything I Vibe-Coded This Year.