webar / conversational ai · 2024

AR business card

A physical card that opens a real-time voice conversation with an AI version of me, anchored to a printed marker via WebAR. Six years of iteration, currently on v0.4 with ElevenLabs Conversational AI and a knowledge base of this site.

Augmented-reality business card showing a 3D avatar appearing above a printed card when viewed through a phone camera
Role: Solo Developer
Year: 2024
Category: WebAR / Conversational AI
Stack: AR.js · A-Frame · ElevenLabs · Firebase

The problem
Business cards are static. Could WebAR and conversational AI make a first impression that's actually memorable and useful, or is it always going to be a gimmick?

I’ve been making versions of this since 2019.

The original was a green-screen video of me introducing myself, mapped onto a flat plane positioned just behind the printed business card. AR.js handled the marker tracking; a small JavaScript chroma-key library I’d written handled the green-screen removal. The library was a per-frame canvas pipeline that read the video texture’s pixels, ran a colour-distance check against a sampled green value, and set alpha to 0 on anything within the tolerance. Crude, but it held roughly 30fps on a 2019 phone, and the result was a hovering Chris talking out of the card. A “help me, Obi-Wan Kenobi” scanline and edge-glow effect went on top, because of course.
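
The colour-distance pass described above can be sketched in a few lines. This is a minimal illustration, not the original library: the function name, the `RGB` type, and the choice of Euclidean distance in RGB space are all assumptions.

```typescript
// Hypothetical names throughout; a sketch of the technique, not the original library.
interface RGB { r: number; g: number; b: number; }

/**
 * Zero the alpha of every pixel whose colour lies within `tolerance`
 * (Euclidean distance in RGB space) of the sampled key colour.
 * `pixels` is RGBA byte data, such as ImageData.data from a 2D canvas.
 * Returns the number of pixels made transparent.
 */
function chromaKey(pixels: Uint8ClampedArray, key: RGB, tolerance: number): number {
  const tol2 = tolerance * tolerance; // compare squared distances, no sqrt per pixel
  let cleared = 0;
  for (let i = 0; i < pixels.length; i += 4) {
    const dr = pixels[i] - key.r;
    const dg = pixels[i + 1] - key.g;
    const db = pixels[i + 2] - key.b;
    if (dr * dr + dg * dg + db * db <= tol2) {
      pixels[i + 3] = 0; // fully transparent
      cleared++;
    }
  }
  return cleared;
}
```

In a browser this would sit between `ctx.getImageData(...)` and `ctx.putImageData(...)` on each animation frame, with the current video frame drawn to the canvas first.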

People held the card, the phone showed me hovering above it talking, and they walked away delighted enough to remember me. That was the bar.

The intermediate versions

Version two replaced the green-screen video with a photogrammetry scan of myself rigged to play a recorded monologue. Static head and shoulders, lip-sync baked from the audio waveform with a small viseme map. Better than a video plane, but the dialogue was still on rails: tap once, get the same minute-long pitch every time.

Version three added a branching dialogue tree. Touch zones on the card or voice prompts (“tell me about your work”) routed into pre-recorded clips. An A-Frame component hierarchy held the conversation state; missed branches looped politely back to a known node. It felt more conversational, but only along the paths the script had anticipated. Hit anything off-tree and the avatar smiled and steered you back.
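
A dialogue tree with a polite fallback can be sketched as a lookup table plus one routing function. The node ids, trigger phrases, and clip names below are illustrative assumptions, not the card’s real script.

```typescript
// Node ids, triggers, and clip names are illustrative, not the card's real script.
interface DialogueNode {
  clip: string;                  // pre-recorded clip played at this node
  branches: Map<string, string>; // trigger phrase or touch-zone id -> next node id
}

const tree = new Map<string, DialogueNode>([
  ["intro", {
    clip: "intro.mp3",
    branches: new Map([["tell me about your work", "work"], ["zone-contact", "contact"]]),
  }],
  ["work", { clip: "work.mp3", branches: new Map([["zone-contact", "contact"]]) }],
  ["contact", { clip: "contact.mp3", branches: new Map<string, string>() }],
]);

const FALLBACK_NODE = "intro"; // the known node that missed branches loop back to

/** Follow a matching branch, or steer back to the fallback node on a miss. */
function nextNode(current: string, input: string): string {
  const target = tree.get(current)?.branches.get(input.trim().toLowerCase());
  return target !== undefined && tree.has(target) ? target : FALLBACK_NODE;
}
```

Anything the table does not list, including an unknown current node, lands on the fallback, which is exactly the on-rails feel the paragraph above describes.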

v0.4, the current version

The script is gone. It’s a proper conversation now.

The avatar is a stylised 3D character with a small idle / talk / wave / dance animation library, driven by a state machine that watches the conversation state and triggers gestures on transitions. The voice is ElevenLabs Conversational AI, streamed over a WebSocket session minted by a small Firebase Cloud Function. The function holds the ElevenLabs API key server-side and returns a signed URL with a short TTL, so the key never touches the client and a stolen URL is dead within minutes.
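
The signed-URL exchange might look roughly like the sketch below. The endpoint path, the `xi-api-key` header, and the `signed_url` response field are stated here as assumptions about the ElevenLabs API and should be checked against its current reference; `HttpGet`, `buildSignedUrlRequest`, and `mintSignedUrl` are hypothetical names. The HTTP call is injected so the logic stays independent of the Cloud Function wrapper.

```typescript
// Assumed endpoint and response shape; verify against the current ElevenLabs API reference.
const SIGNED_URL_ENDPOINT =
  "https://api.elevenlabs.io/v1/convai/conversation/get-signed-url";

interface HttpResponse { ok: boolean; status: number; json(): Promise<unknown>; }
type HttpGet = (
  url: string,
  init: { headers: Record<string, string> },
) => Promise<HttpResponse>;

/** Build the server-side request. The API key appears only in this header. */
function buildSignedUrlRequest(agentId: string, apiKey: string) {
  return {
    url: `${SIGNED_URL_ENDPOINT}?agent_id=${encodeURIComponent(agentId)}`,
    headers: { "xi-api-key": apiKey },
  };
}

/** Exchange the secret key for a short-lived signed WebSocket URL. */
async function mintSignedUrl(get: HttpGet, agentId: string, apiKey: string): Promise<string> {
  const { url, headers } = buildSignedUrlRequest(agentId, apiKey);
  const res = await get(url, { headers });
  if (!res.ok) throw new Error(`signed URL request failed: ${res.status}`);
  const body = (await res.json()) as { signed_url?: unknown };
  if (typeof body.signed_url !== "string") throw new Error("response had no signed_url");
  return body.signed_url; // the only value handed back to the client
}
```

Inside a Firebase HTTPS function, `get` could simply be the global `fetch` available on Node 18 and later, with the key read from a function secret rather than from source.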

The agent has a knowledge base of every public page on this site. A pipeline reads the markdown content collections (blog, work, services, about, speaking), converts each entry to a clean text document, and uploads the set to the ElevenLabs agent’s KB on each site deploy. Updating a blog post on Friday means the avatar can talk about it on Saturday, without me touching the agent config.
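
The “clean text document” step could be as small as the sketch below. It is a deliberately crude, regex-based converter under stated assumptions: `markdownToText`, `toKbDocument`, and the `KbDocument` shape are hypothetical, the upload call is left out, and a real pipeline would more likely lean on a proper markdown parser.

```typescript
// Crude regex-based conversion; names and document shape are assumptions.
interface KbDocument { name: string; text: string; }

function markdownToText(markdown: string): string {
  return markdown
    .replace(/^---\n[\s\S]*?\n---\n?/, "")     // drop YAML front matter
    .replace(/`{3}[\s\S]*?`{3}/g, "")          // drop fenced code blocks
    .replace(/!\[([^\]]*)\]\([^)]*\)/g, "$1")  // images -> alt text
    .replace(/\[([^\]]+)\]\([^)]*\)/g, "$1")   // links -> link text
    .replace(/^#{1,6}[ \t]+/gm, "")            // heading markers
    .replace(/^[ \t]*[-*+][ \t]+/gm, "")       // bullet markers
    .replace(/(\*\*|__|\*|`)/g, "")            // bold, italic asterisks, inline code marks
    .replace(/\n{3,}/g, "\n\n")                // collapse runs of blank lines
    .trim();
}

/** One content-collection entry becomes one named knowledge-base document. */
function toKbDocument(collection: string, slug: string, markdown: string): KbDocument {
  return { name: `${collection}/${slug}`, text: markdownToText(markdown) };
}
```

Run over every entry at deploy time, the resulting documents are what would be uploaded to the agent’s knowledge base.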

The conversation can trigger interface moments. The agent emits a tool-use payload; the client interprets it. Open a project page on the user’s phone. Download my contact card as a vCard. Share the link with a colleague. The full state machine has nine states (loading, scanning, connecting, listening, thinking, speaking, idle, ended, error), each one with a corresponding UI panel and avatar animation, so the user never wonders what the app is doing.
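
A sketch of how the nine states and the tool-use interpreter might fit together on the client. The state names come from the paragraph above; the transition table, the tool names, the payload shapes, and the `/work/` path are assumptions made for illustration.

```typescript
// State names are from the text; transitions, tool names, and payloads are assumptions.
type AppState =
  | "loading" | "scanning" | "connecting" | "listening" | "thinking"
  | "speaking" | "idle" | "ended" | "error";

const transitions: Record<AppState, AppState[]> = {
  loading: ["scanning", "error"],
  scanning: ["connecting", "error"],
  connecting: ["listening", "error"],
  listening: ["thinking", "idle", "ended", "error"],
  thinking: ["speaking", "error"],
  speaking: ["listening", "ended", "error"],
  idle: ["listening", "ended", "error"],
  ended: [],
  error: ["loading"],
};

/** Move to `to` only if the table allows it; otherwise stay in `from`. */
function transition(from: AppState, to: AppState): AppState {
  return transitions[from].indexOf(to) !== -1 ? to : from;
}

type ToolCall =
  | { tool: "open_project"; slug: string }
  | { tool: "download_contact" }
  | { tool: "share_link"; url: string };

type ClientAction =
  | { kind: "navigate"; href: string }
  | { kind: "download"; filename: string }
  | { kind: "share"; url: string };

/** Translate an agent tool-use payload into something the UI can act on. */
function interpretToolCall(call: ToolCall): ClientAction {
  switch (call.tool) {
    case "open_project":
      return { kind: "navigate", href: `/work/${encodeURIComponent(call.slug)}` };
    case "download_contact":
      return { kind: "download", filename: "contact.vcf" };
    case "share_link":
      return { kind: "share", url: call.url };
  }
}
```

Keeping the interpreter a pure function from payload to action means the agent never touches the page directly: the client decides what each action is allowed to do.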

The marker tracking is still AR.js. The marker is a custom pattern (.patt) file generated from the printed card’s design, with smoothing parameters tuned by trial and error to keep the avatar from jittering when the user’s hand is unsteady. There’s a surface-placement fallback for when nobody has the physical card: WebXR hit-test on Android, gyroscope-plus-camera placement on iOS (iOS Safari still doesn’t ship WebXR), with a manual tap-to-place if neither fires correctly. The whole thing runs in mobile Safari and Chrome with nothing to install.
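
The placement fallback reads naturally as an ordered chain. The sketch below assumes the capability flags are detected elsewhere (for instance with `navigator.xr.isSessionSupported("immersive-ar")` where WebXR exists); the type and function names are hypothetical.

```typescript
// Capability detection is assumed to happen elsewhere; names are hypothetical.
type PlacementMode = "marker" | "webxr-hit-test" | "gyro-camera" | "tap-to-place";

interface Capabilities {
  hasPrintedCard: boolean;    // user says they are holding the card
  webxrHitTest: boolean;      // e.g. Chrome on Android
  deviceOrientation: boolean; // gyroscope events available, e.g. iOS Safari
}

/** Placement strategies to try, in order, until one succeeds. */
function placementChain(caps: Capabilities): PlacementMode[] {
  const chain: PlacementMode[] = [];
  if (caps.hasPrintedCard) chain.push("marker");
  if (caps.webxrHitTest) chain.push("webxr-hit-test");
  else if (caps.deviceOrientation) chain.push("gyro-camera");
  chain.push("tap-to-place"); // manual fallback is always last
  return chain;
}
```

For a card-less Android phone with WebXR this yields hit-test first and tap-to-place as the safety net; on iOS the gyroscope path takes the hit-test slot.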

What I always learn

The first reaction is always the same: people have never seen anything like this. Six years of making versions, and that hasn’t changed.

What that probably tells me, honestly, is that the format hasn’t taken off elsewhere because there’s a ceiling on its usefulness. It’s a memorable handshake at a networking event, not a tool anyone reaches for daily. Every time I demo it I have the same internal argument: is this a gimmick demo, or is it something people would actually opt to use?

The current version pushes hardest at the second question. Linking the avatar to the site’s real content, rather than to scripted dialogue, gives it a job. It’s a guided tour of what I do, by the person who does it, in voice rather than scrolling. That feels closer to useful than anything the earlier versions managed.

What I’m still unsure of

I could turn this into a standalone product. Let anyone configure their own avatar, voice, knowledge base, props, action triggers, theme. The plumbing is mostly there: the KB sync, the signed-URL pattern, the marker-or-surface placement logic, the tool-use interpreter on the client, the state machine. Add an admin UI and a billing layer and it ships.

What I haven’t yet figured out is whether the AR-card form is the right one. The same conversational agent works just as well as a chat on someone’s phone, without the marker dance. The card is novelty, and novelty wears off; the conversation is the thing. The question is whether the spectacle of the avatar appearing on the table helps people remember and engage, or whether it’s friction between them and the substance.

For now it’s my favourite small project, and the longest-running. There’s a longer write-up of how this fits into the wider portfolio, and a no-card surface mode you can try in your browser.

Results

  • Real-time voice conversation with ElevenLabs Conversational AI
  • Knowledge base of the website content so the avatar can speak about the work, research and services
  • Tool-use triggers from the agent (open a project, download contact, share)
  • Runs in mobile Safari and Chrome with nothing to install
  • Surface-placement fallback for people without the printed card