Calls that file themselves

A dialer makes hundreds of calls a day and records every one. The useful part, who is hot, who said call back, who is a dead end, is locked inside the audio. This is the loop I built to get it out. GoHighLevel hands off the recording, Whisper transcribes it on a machine I own, a model qualifies the transcript, and the answer gets written straight back into GHL so the right automation fires. Nobody has to listen to the calls.

The problem with call recordings

Every dialer produces the same thing at the end of the day: a long list of recordings nobody has time to play back. The signal you actually want is in there, but it is trapped in audio, and audio is slow. Someone has to listen, decide what the call was, and update the CRM by hand. That last mile is where the follow-up dies. The hot lead who asked for a quote at 2pm does not get tagged until tomorrow, if at all.

I did not want to replace the dialer or the CRM. Both already work. I wanted to close the gap between them, the part where a human listens to a call and types in what happened, and let software do that part instead.

The shape of it

It is a round trip. GoHighLevel sends the recording out, the transcription and the judgment happen on a local machine, and the result comes back into GoHighLevel as a tag the automations can see. Three hops, and the middle one never leaves my own hardware.

~ ghl out, whisper and qualify in the middle, ghl back in ~

Step 1. GHL hands off the recording

The dialer logs each call in GoHighLevel with the recording attached, which it already does. When a call wraps, a webhook fires and sends that recording to a small worker running on my own machine. Nothing about the existing setup changes. The dialer keeps dialing, GHL keeps logging, and the only new thing is that the audio now has somewhere to go.
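As a rough sketch of the receiving end, the worker's first job is to turn the webhook body into something it can act on. The payload keys `contact_id` and `recording_url` are assumptions for illustration, not GHL's documented field names. Use whatever your webhook is configured to send.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class CallJob:
    contact_id: str
    recording_url: str


def parse_call_webhook(body: bytes) -> CallJob:
    """Turn a raw webhook body into a job for the worker.

    The keys "contact_id" and "recording_url" are assumed names.
    """
    payload = json.loads(body)
    contact_id = str(payload.get("contact_id") or "").strip()
    recording_url = str(payload.get("recording_url") or "").strip()
    if not contact_id or not recording_url:
        # Reject early so a malformed event never reaches transcription.
        raise ValueError("webhook is missing contact_id or recording_url")
    return CallJob(contact_id=contact_id, recording_url=recording_url)


job = parse_call_webhook(
    b'{"contact_id": "abc123", "recording_url": "https://example.com/call.mp3"}'
)
print(job.contact_id, job.recording_url)
```

Rejecting incomplete events at the door keeps the expensive step, transcription, from running on jobs that could never be written back.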

Step 2. Whisper transcribes it locally

The worker runs Whisper on the machine itself, not through a paid API. It takes the recording and turns it into text. Running it locally matters for two reasons. The recordings never leave my environment, and the transcription cost does not scale with call volume, because there is no per-minute meter running.

Why local and not a cloud API? Transcription is the highest-volume step in the whole loop: one job per call, every call. Paying per minute for that adds up fast, and it means shipping customer call audio to a third party. Whisper on a box you own removes both problems at once.
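A minimal sketch of the transcription step, assuming the open-source `openai-whisper` Python package and ffmpeg are installed on the machine. The function name and the "base" model size are placeholders, and other local Whisper builds would need a different call.

```python
def transcribe_recording(audio_path: str, model=None) -> str:
    """Return the transcript text for one recording.

    `model` is any object with a Whisper-style `transcribe(path)` method that
    returns a dict containing "text". Passing one in lets the worker load the
    model once and reuse it for every call.
    """
    if model is None:
        # Imported lazily so the module loads even where Whisper is absent.
        # "base" is an assumed model size; larger models trade speed for accuracy.
        import whisper

        model = whisper.load_model("base")
    result = model.transcribe(audio_path)
    return result["text"].strip()
```

In a long-running worker you would load the model once at startup and pass it in, since loading it is far slower than reusing it.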

Step 3. Qualify the transcript

A transcript on its own is just text. The judgment is deciding what the call actually was. The worker hands the transcript to Claude with a short rubric, and it returns three things. A disposition: what kind of call this was. A one-line summary, so a human can scan it without listening. And the next action: what should happen to this lead now.

~ one call, read and routed in seconds ~

The disposition is the important output, because it is the thing the next step can act on. Everything else, the summary and the next action, is there so a person can trust the call without playing it back.
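The three outputs can be sketched as a small structured reply. Everything named here is an assumption for illustration: the disposition labels, the rubric wording, and `ask_model`, which stands in for whatever function sends a prompt to the model and returns its text reply.

```python
import json
from dataclasses import dataclass
from typing import Callable

# Hypothetical disposition labels; a real rubric would use your own.
DISPOSITIONS = {"quote_request", "callback", "not_interested", "no_answer"}

RUBRIC = (
    "You are qualifying a sales call transcript. Reply with JSON only, using "
    'the keys "disposition", "summary" and "next_action". '
    "disposition must be one of: {labels}. summary must be one line.\n\n"
    "Transcript:\n{transcript}"
)


@dataclass(frozen=True)
class Qualification:
    disposition: str
    summary: str
    next_action: str


def qualify(transcript: str, ask_model: Callable[[str], str]) -> Qualification:
    """Ask the model for a disposition, summary, and next action."""
    prompt = RUBRIC.format(
        labels=", ".join(sorted(DISPOSITIONS)), transcript=transcript
    )
    raw = ask_model(prompt)
    try:
        data = json.loads(raw)
        disposition = str(data["disposition"])
        summary = " ".join(str(data["summary"]).split())  # collapse to one line
        next_action = str(data["next_action"])
    except (ValueError, KeyError, TypeError):
        # An unparseable reply is routed to a person, never guessed at.
        return Qualification(
            "needs_review", "Model reply could not be parsed.", "human review"
        )
    if disposition not in DISPOSITIONS:
        # An unknown label is treated the same way.
        return Qualification("needs_review", summary, "human review")
    return Qualification(disposition, summary, next_action)
```

Validating the label against a fixed set matters because the disposition becomes a tag downstream, and a label nobody planned for would match no automation.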

Step 4. Write it back into GHL

The worker writes the result onto the contact in GoHighLevel through the API. The disposition becomes a tag, the summary goes into a field, and that is the whole trick. A tag is something GHL automations can trigger on. The moment it lands, the CRM knows what the call was, in the same place it keeps everything else.
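A hedged sketch of the tag write, built with the standard library only. The base URL, the `/contacts/{id}/tags` path, the `Version` header value, and the `call:` tag prefix are all assumptions. Check them against the current GoHighLevel API documentation before relying on them. The function builds the request without sending it, so it can be inspected first.

```python
import json
import urllib.request
from urllib.parse import quote

# Assumed values; confirm against the GoHighLevel API docs.
API_BASE = "https://services.leadconnectorhq.com"
API_VERSION = "2021-07-28"


def build_tag_request(
    contact_id: str, disposition: str, token: str
) -> urllib.request.Request:
    """Build (but do not send) the request that adds the disposition tag."""
    # The "call:" prefix is a hypothetical naming convention for these tags.
    body = json.dumps({"tags": [f"call:{disposition}"]}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/contacts/{quote(contact_id, safe='')}/tags",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Version": API_VERSION,
            "Content-Type": "application/json",
        },
    )


req = build_tag_request("abc123", "quote_request", "YOUR_TOKEN")
print(req.get_method(), req.full_url)
```

Sending it is one call to `urllib.request.urlopen(req)`. Writing the summary into a field would be a second request of the same shape against the contact record.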

~ the disposition map, transcript signal to the right workflow ~

Step 5. GHL automations take over

From here it is all GoHighLevel, doing what it is good at. The new tag drops the contact into the workflow that fits. A quote request starts the quote follow-up. A polite brush-off gets suppressed. A no-answer goes back into the retry cadence. These are the same automations you would build by hand. The difference is what triggers them. Instead of a person deciding, it is what the caller actually said.
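In GoHighLevel this routing lives in workflow triggers, not in code. Written down as data, the mapping might look like the sketch below. The disposition labels and workflow names are hypothetical.

```python
# Hypothetical labels and workflow names; the real ones live in GoHighLevel.
WORKFLOW_BY_DISPOSITION = {
    "quote_request": "quote-follow-up",
    "not_interested": "suppress",
    "no_answer": "retry-cadence",
}

# Anything the map does not recognise goes to a person.
FALLBACK_WORKFLOW = "human-review"


def workflow_for(disposition: str) -> str:
    """Return the workflow a tagged contact should enter."""
    return WORKFLOW_BY_DISPOSITION.get(disposition, FALLBACK_WORKFLOW)


print(workflow_for("quote_request"))
```

The fallback keeps an unexpected disposition from silently dropping a lead out of every workflow.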

What a day of calls looks like

Here is the throughput across one representative day. The point is the bottom bar. Every call gets transcribed, tagged, and routed without anyone touching it, and a human only gets pulled in for the handful that are worth a live follow-up.

~ a full day of calls, and you only touch the hot 48 ~

Why Whisper runs locally

The whole design leans on one decision: keep the transcription on a machine I control instead of renting it. That choice pays off both on cost and on where the recordings live.

~ why the transcription runs at home ~

None of this is exotic. A webhook, a transcription model, a qualifying prompt, and an API write. The reason it works is that it does not try to be clever. It takes the one slow human step, listening to a call and deciding what it was, and moves it onto a machine, then hands the result back to the CRM you already trust.
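Put together, the four pieces reduce to a few lines of glue. This is a sketch of the shape, not the actual worker. The three callables are hypothetical stand-ins for the transcription, qualification, and write-back steps.

```python
from typing import Callable


def process_call(
    contact_id: str,
    audio_path: str,
    transcribe: Callable[[str], str],
    qualify: Callable[[str], dict],
    write_back: Callable[[str, dict], None],
) -> dict:
    """Run one recording through the loop: audio to text to tag."""
    transcript = transcribe(audio_path)   # local transcription
    result = qualify(transcript)          # disposition, summary, next action
    write_back(contact_id, result)        # tag and field on the contact
    return result
```

Passing the steps in as functions keeps each one swappable and lets the glue be tested without a model, a GPU, or a live CRM.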

Honest notes. The numbers above are from a representative day, not a client testimonial, and this is a pattern I build, not a product I sell off a shelf. Whisper is good but not perfect on bad audio, so the worst recordings still get flagged for a human rather than guessed at. Qualification is a filter, not a verdict. And you keep handling call-recording consent exactly the way you do today. This only reads the recordings your dialer already makes.
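One way to decide which recordings count as "the worst" is to look at the per-segment confidence fields that the open-source `openai-whisper` package returns in `result["segments"]`. The thresholds and the function name below are assumptions to tune on your own audio, not settings taken from this build.

```python
def needs_human_review(
    segments,
    min_avg_logprob: float = -1.0,
    max_no_speech_prob: float = 0.6,
    max_bad_fraction: float = 0.5,
) -> bool:
    """Flag a transcript when too many segments look unreliable.

    Each segment is a dict with "avg_logprob" and "no_speech_prob", as found
    in openai-whisper's transcription output. All thresholds are assumed
    starting points.
    """
    if not segments:
        # Nothing was transcribed at all, so a person should check the audio.
        return True
    bad = sum(
        1
        for seg in segments
        if seg["avg_logprob"] < min_avg_logprob
        or seg["no_speech_prob"] > max_no_speech_prob
    )
    return bad / len(segments) > max_bad_fraction
```

A flagged call skips qualification and goes straight to a person.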

Have a pile of calls nobody listens to?

If your dialer is filling up with recordings and the follow-up keeps slipping, this is the loop I build. Tell me what your dialer and CRM are, and I will tell you honestly whether this fits.

Tell me about your call flow
