Contents
- 01 · The Command Center
- 02 · The problem it solves
- 03 · How it works
- 04 · Making measurement trustworthy
- 05 · Scoring you can trust
- 06 · Two readers, two reports
- 07 · Sample output
- 08 · The system, as engineering
- 09 · Built to not break
- 10 · Regression-tested scoring
- 11 · Content automation
- 12 · What it demonstrates
The Command Center
A production system that audits how a local business measures its marketing. It reads each site the way a real browser does, finds where the tracking is lying, scores it honestly, and writes the report two different ways for two different readers.
The tracking was firing. The audit just couldn't see it.
A business owner spends money on ads every month. Ask them which of that money actually brought in a customer and the honest answer is usually a shrug. The tools are supposedly installed, but the report says zero calls, zero form fills, zero bookings, and nobody can tell whether that's the truth or a glitch.
Most "audits" make this worse. They fetch the raw page, don't see the tracking, and declare it broken. But in plenty of cases the tracking is firing fine and the audit simply couldn't see it. The Command Center exists to give a straight answer to one question: when a visitor becomes a customer, can this business tell? And if not, where exactly does the trail go cold?
One business in, an honest report out
You hand it a business: a name, a website, a city. It confirms the business is who you think it is, loads the site in a real browser so everything that runs on the page actually runs, inspects what's measuring and what isn't, scores the result against a fixed rubric, and writes the findings up in plain language. Run one, or feed it a list and watch them complete in real time.
Each step is deliberately separate. The part that judges is isolated from the part that writes, so a number is never quietly invented by the language and the language never quietly changes a number.
Making the measurement trustworthy
The core decision behind the whole system is this. Most sites build their tracking after the page loads, with a tag manager that fires once the browser runs the page's code. Read the raw HTML and none of that exists yet, so a plain fetch reports "no tracking" on sites that are tracking perfectly well. That false negative is the single most common way an audit lies.
So the Command Center doesn't read the raw page. It loads each site in a headless browser, waits for the page's code to finish running and inject everything it's going to inject, and only then takes its snapshot, the same view a real visitor's browser would have. The zero turns into the truth.
Two more decisions follow from the same principle. The detection is vendor-neutral. It recognizes a wide catalog of measurement tools across analytics, tag management, and conversion tracking, so a shop running a non-Google stack scores on what it actually has rather than getting marked down for not being Google. And mobile coverage is verified, not assumed. The system loads the site a second time as a phone and compares, because tracking that works on desktop and silently dies on mobile is a real and common failure a single desktop pass would miss.
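A minimal sketch of those two decisions, assuming the page has already been rendered and is available as an HTML string per viewport. The catalog below is a tiny illustrative subset, and `detectTools` and `missingOnMobile` are hypothetical names, not the system's actual functions.

```typescript
// Hypothetical signature catalog. Each entry matches a script URL that the
// vendor's standard snippet loads once the page's code has run.
type Category = "analytics" | "tag_management" | "conversion";

interface ToolSignature {
  name: string;
  category: Category;
  pattern: RegExp;
}

const CATALOG: ToolSignature[] = [
  { name: "Google Tag Manager", category: "tag_management", pattern: /googletagmanager\.com\/gtm\.js/ },
  { name: "Google Analytics 4", category: "analytics", pattern: /googletagmanager\.com\/gtag\/js/ },
  { name: "Plausible", category: "analytics", pattern: /plausible\.io\/js\// },
  { name: "Matomo", category: "analytics", pattern: /matomo\.js/ },
  { name: "Meta Pixel", category: "conversion", pattern: /connect\.facebook\.net\/[^"']*fbevents\.js/ },
];

// Vendor-neutral: every tool in the catalog is checked the same way.
function detectTools(renderedHtml: string): string[] {
  return CATALOG.filter((sig) => sig.pattern.test(renderedHtml)).map((sig) => sig.name);
}

// Tools seen on the desktop load but missing from the phone-sized load.
function missingOnMobile(desktopHtml: string, mobileHtml: string): string[] {
  const mobile = detectTools(mobileHtml);
  return detectTools(desktopHtml).filter((tool) => mobile.indexOf(tool) === -1);
}
```

Run against raw HTML, `detectTools` returns an empty list for a site whose tags are injected after load, which is exactly the false negative described above. Run against the rendered snapshot, the same function finds them.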
A score you can trust, or no score at all
Scoring is fully deterministic. The same site always produces the same numbers, the rubric's weights and thresholds live in one place, and there's no model in the loop deciding what something is worth. Same inputs, same score, every time.
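As a sketch of what "weights and thresholds live in one place" can look like, the pure function below scores a set of findings against a single rubric object. The categories, weights, and the 0.8 threshold are invented for illustration and are not the real rubric.

```typescript
// Hypothetical findings shape and rubric values, for illustration only.
interface Findings {
  hasAnalytics: boolean;
  hasCallTracking: boolean;
  hasBookingEvents: boolean;
  taggedLinkRatio: number; // 0..1 share of outbound links carrying UTM tags
}

// The single place where weights and thresholds are defined.
const RUBRIC = {
  version: 1,
  weights: { analytics: 30, callTracking: 30, bookingEvents: 20, linkTagging: 20 },
  linkTaggingFullCreditAt: 0.8,
} as const;

// Pure function: no clock, no randomness, no network, no model call.
function scoreFindings(f: Findings): number {
  const w = RUBRIC.weights;
  const linkCredit = Math.min(f.taggedLinkRatio / RUBRIC.linkTaggingFullCreditAt, 1);
  const total =
    (f.hasAnalytics ? w.analytics : 0) +
    (f.hasCallTracking ? w.callTracking : 0) +
    (f.hasBookingEvents ? w.bookingEvents : 0) +
    w.linkTagging * linkCredit;
  return Math.round(total);
}
```

Because the function reads nothing but its argument and the rubric constant, calling it twice with the same findings cannot produce two different numbers.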
The decision I'm proudest of is what happens when the site can't be read, whether it's blocked by bot protection, a bad URL, or a server error. The easy thing is to score those gaps as zeros. But a zero from "we looked and there's nothing" is indistinguishable from a zero from "we couldn't look," and one of those is a number you might email to a prospect and be wrong about. So when the inspection can't run, the system refuses to emit an overall score at all. It says "couldn't scan" instead of inventing a low number. A missing answer is honest. A confident wrong one isn't.
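One way to make that refusal structural is to give "couldn't scan" its own result type, so there is no numeric field to fill in by accident. This is a sketch under assumed names (`Inspection`, `AuditResult`, `toResult`); the source does not show the system's actual types.

```typescript
// Hypothetical types. The three failure reasons mirror the cases named above.
type ScanFailure = "blocked" | "bad_url" | "server_error";

type Inspection =
  | { kind: "read"; points: number }
  | { kind: "unreadable"; reason: ScanFailure };

// A result is either a real score or an explicit non-answer, never both.
type AuditResult =
  | { status: "scored"; overall: number }
  | { status: "couldnt_scan"; reason: ScanFailure };

function toResult(inspection: Inspection): AuditResult {
  if (inspection.kind === "unreadable") {
    // No number is produced, so none can be mistaken for a measured zero.
    return { status: "couldnt_scan", reason: inspection.reason };
  }
  return { status: "scored", overall: inspection.points };
}

function describeResult(result: AuditResult): string {
  return result.status === "scored"
    ? `Score: ${result.overall}/100`
    : `Couldn't scan (${result.reason})`;
}
```

With this shape, "we looked and found nothing" is `{ status: "scored", overall: 0 }` and "we couldn't look" is a different variant entirely, so downstream code has to handle the two cases separately.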
Two readers, two reports
The same audit data produces two completely different documents, written for two people who could not be more different.
The client-facing report has to pass one test: a twelve-year-old could read it aloud and understand it. No jargon: not "tracking," not "conversions," not a single tool name. A low score isn't an insult; it's money being left on the table, explained in the words a plumber or a spa owner actually uses. Translating measurement into that register is its own craft, and it's deliberate.
That the same data renders into two audience-shaped outputs is an architecture choice. Alongside the client report there's an operator-only briefing with full technical detail, every tool named, and a confidence flag on each finding so a confirmed signal is never mistaken for a guess on a sales call. One source of truth, two renderers, zero duplication of the underlying facts.
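The "one source of truth, two renderers" shape can be sketched as two pure functions over the same findings array. The `Finding` fields and both function names are assumptions made for this example; the four confidence levels are the ones the operator view uses.

```typescript
type Confidence = "confirmed" | "likely" | "heuristic" | "absent";

// Hypothetical finding record: one fact, carrying both registers.
interface Finding {
  tool: string;          // vendor or feature name, shown to the operator only
  confidence: Confidence;
  plainMeaning: string;  // jargon-free sentence for the client
}

// Operator briefing: every tool named, every finding confidence-tagged.
function renderOperator(findings: readonly Finding[]): string {
  return findings.map((f) => `[${f.confidence.toUpperCase()}] ${f.tool}`).join("\n");
}

// Client report: same facts, no tool names, no confidence jargon.
function renderClient(findings: readonly Finding[]): string {
  return findings.map((f) => f.plainMeaning).join(" ");
}
```

Neither renderer computes or alters a fact. Each one only chooses which fields of the shared record to show, which is what keeps the two documents from drifting apart.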
One audit, three audiences
The same deterministic engine renders three ways: an operator console for me, a plain-English report for the client, and a confidence-tagged technical sheet for the sales call. Below is the real format on invented data for a made-up business. No real client, no real numbers. The views that follow show how one dataset reshapes itself for three very different readers.
The site has analytics installed and a strong review presence, but no call tracking, no booking events, and no labels on outbound links. For a service business where the phone is the cash register, that means the most valuable action — a booked call — is happening completely unmeasured.
Set up call tracking so every booked job ties back to the ad, page, or search that drove it — right now the phone is the main conversion and it's invisible.
Add labels to outbound and ad links so you can tell which channel sends paying customers, not just clicks.
Add structured information about your services and service area so you show up more often when local customers search.
- Priority: No call-tracking solution detected. Implement a dynamic number insertion provider (CallRail / WhatConverts) and fire a GA4 generate_lead event on call connect; bridge to Google Ads conversion import for ROAS.
- UTM discipline: 0 of 118 outbound links UTM-tagged. Build a source/medium/campaign taxonomy for GBP posts, paid social, and email before any ad spend scales; GA4 channel grouping is unreliable until then.
- Booking events: Embedded scheduler present but no event firing on submit. Add a GTM trigger on the confirmation step → GA4 schedule event. Layer LocalBusiness + Service schema for the service-area pages.
Customers find you and trust you — your reviews prove that. But right now there's no way to know which ads, posts, or searches actually lead to a booked job. That means some of your marketing money may be working, and some may be wasted, with no way to tell them apart.
This is the starting point. A 30-minute call walks through what's possible — and what it's worth to you.
Request a Consult
What's actually deployed, tagged by how certain the detection is. Confirmed = real script tag seen. Likely = config signal. Heuristic = inferred. Absent = looked, found nothing.
Below is how the system that produces it was built.
The system, as engineering
Under the plain-English report is a single backend service that orchestrates the whole flow: business lookup, headless rendering, inspection, scoring, narrative generation, PDF delivery, pipeline tracking, and content automation. It runs as one Node.js application of roughly six and a half thousand lines, organized into distinct domains rather than one tangled script, and I designed, built, and verified all of it.
Batch runs stream their progress back as each audit finishes instead of making you wait for the whole list, and the keys that talk to the outside services stay on the server, never in the browser. The shape of the thing is the point. Every external dependency is wrapped, every stage is replaceable, and nothing trusts input it hasn't checked.
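The batch behavior can be sketched as a runner that reports each audit the moment it settles instead of after the whole list. The transport back to the browser is left out here; the `onProgress` callback stands in for it, and all names are hypothetical.

```typescript
interface AuditOutcome {
  business: string;
  ok: boolean;
}

// Starts every audit, calls onProgress as each one finishes (in completion
// order), and resolves with all outcomes in the original input order.
async function runBatch(
  businesses: string[],
  auditOne: (business: string) => Promise<AuditOutcome>,
  onProgress: (outcome: AuditOutcome, done: number, total: number) => void
): Promise<AuditOutcome[]> {
  let done = 0;
  const tasks = businesses.map(async (business) => {
    let outcome: AuditOutcome;
    try {
      outcome = await auditOne(business);
    } catch {
      // One failed audit is reported as failed; it does not sink the batch.
      outcome = { business, ok: false };
    }
    done += 1;
    onProgress(outcome, done, businesses.length);
    return outcome;
  });
  return Promise.all(tasks);
}
```

This sketch starts all audits at once for brevity. A production runner would likely cap concurrency, which the source does not describe either way.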
Built to not break
The interesting engineering here is in the failure paths. When the headless render fails, it falls back to a plain fetch, because a partial audit beats a failed one. When the business lookup returns a weak, low-confidence match, it's rejected rather than stapled to the wrong site, so one business's reviews never end up describing another's. A blocked fetch is told apart from a genuinely empty site, because those mean opposite things.
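Those three failure paths can be sketched with the browser and HTTP client injected as plain functions. Everything here is a stand-in: the `PageIO` interface, the list of status codes treated as "blocked", and the 0.8 match threshold are assumptions for the example, not values from the real system.

```typescript
type PageRead =
  | { kind: "rendered"; html: string }        // full view: the page's code has run
  | { kind: "raw_only"; html: string }        // partial: render failed, plain fetch worked
  | { kind: "blocked"; httpStatus: number };  // could not look at all

interface PageIO {
  render: (url: string) => Promise<string>; // headless browser, injected
  fetchRaw: (url: string) => Promise<{ httpStatus: number; body: string }>;
}

// Assumption: these statuses indicate bot protection or rate limiting.
const BLOCKED_STATUSES = [401, 403, 429];

async function readPage(url: string, io: PageIO): Promise<PageRead> {
  try {
    return { kind: "rendered", html: await io.render(url) };
  } catch {
    // Render failed: fall back to the raw page, since partial beats nothing.
    const res = await io.fetchRaw(url);
    if (BLOCKED_STATUSES.indexOf(res.httpStatus) !== -1) {
      return { kind: "blocked", httpStatus: res.httpStatus };
    }
    return { kind: "raw_only", html: res.body };
  }
}

interface LookupMatch {
  businessName: string;
  confidence: number; // 0..1
}

const MIN_MATCH_CONFIDENCE = 0.8; // illustrative threshold

// A weak match is dropped rather than attached to the wrong business.
function acceptMatch(match: LookupMatch): LookupMatch | null {
  return match.confidence >= MIN_MATCH_CONFIDENCE ? match : null;
}
```

Keeping `blocked` and `raw_only` as separate variants is what lets later stages treat "we were refused" and "the page is genuinely empty" as the opposite findings they are.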
The automation that posts content is protected against the classic double-post. A draft is marked as posting before the network call returns, so a flaky or duplicated response can't cause it to fire twice, and a genuine failure rolls the status back so it can be retried on purpose. Adding a prospect to the pipeline is best-effort. If that step fails, it never takes down the thing that already succeeded. The whole system is written to fail loud and partial, not silent and total.
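The double-post guard can be sketched as a status claim that happens before the network call is issued, with a rollback on failure. This version keeps the status in memory for brevity; a real system would presumably persist it, and the `Draft` shape and `postDraft` name are hypothetical.

```typescript
type DraftStatus = "approved" | "posting" | "posted";

interface Draft {
  id: string;
  status: DraftStatus;
}

// Returns true only for the call that actually published the draft.
async function postDraft(
  draft: Draft,
  publish: (draft: Draft) => Promise<void>
): Promise<boolean> {
  if (draft.status !== "approved") {
    return false; // already in flight or already posted: do nothing
  }
  draft.status = "posting"; // claim the draft before any network traffic
  try {
    await publish(draft);
    draft.status = "posted";
    return true;
  } catch {
    draft.status = "approved"; // genuine failure: roll back so a retry is possible
    return false;
  }
}
```

Because the claim is synchronous and comes first, a second trigger that arrives while the first call is still waiting on the network sees `posting` and backs off.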
Regression-tested scoring
Because scoring is deterministic, it can be regression-tested like any other contract. A small set of real sites have their scores frozen as baselines, and a harness runs them through the live pipeline and asserts every number matches exactly. Zero tolerance, because deterministic code on fixed inputs doesn't wobble, so any drift is a real signal, not noise. Green means scoring is unchanged and safe to ship. Red means something moved and I stop to find out why.
The rubric itself is versioned. Changing a weight on purpose means bumping the version and re-freezing the baselines in the same change, so "I changed scoring deliberately" stays permanently distinguishable from "scoring drifted by accident." That distinction is the whole reason a client can trust last month's number still means what it said.
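A sketch of the comparison at the heart of such a harness, assuming each frozen baseline records the rubric version it was captured under. The `Baseline` shape, the `Verdict` variants, and `checkBaseline` are illustrative names; the live pipeline call that produces `liveScores` is out of scope here.

```typescript
interface Baseline {
  site: string;
  rubricVersion: number;
  scores: Record<string, number>;
}

type Verdict =
  | { kind: "match" }
  | { kind: "stale_baseline"; frozenAt: number; current: number }
  | { kind: "drift"; diffs: string[] };

function checkBaseline(
  baseline: Baseline,
  currentRubricVersion: number,
  liveScores: Record<string, number>
): Verdict {
  // A deliberate rubric change shows up as a version mismatch, not as drift.
  if (baseline.rubricVersion !== currentRubricVersion) {
    return { kind: "stale_baseline", frozenAt: baseline.rubricVersion, current: currentRubricVersion };
  }
  // Compare the union of keys so added or removed scores also count as drift.
  const keys = Object.keys(baseline.scores);
  for (const key of Object.keys(liveScores)) {
    if (keys.indexOf(key) === -1) keys.push(key);
  }
  const diffs: string[] = [];
  for (const key of keys) {
    const expected = baseline.scores[key];
    const actual = liveScores[key];
    // Exact equality, zero tolerance.
    if (expected !== actual) diffs.push(`${key}: expected ${expected}, got ${actual}`);
  }
  if (diffs.length > 0) return { kind: "drift", diffs };
  return { kind: "match" };
}
```

The separate `stale_baseline` verdict is what keeps an intentional weight change from being reported the same way as an accidental one.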
From audit to content, automatically
The findings don't stop at the report. A separate chain turns audit results into social posts. It extracts only real facts from the audits, never an invented number, phrases them into a post, renders a branded card, and drops the draft into a queue for human approval. Approved drafts post themselves on a set weekly cadence, one at a time, oldest first.
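The "one at a time, oldest first" rule reduces to a small selection function that a weekly scheduler could call once per run. The `QueuedDraft` fields and `nextToPost` are assumed names for this sketch.

```typescript
interface QueuedDraft {
  id: string;
  approved: boolean; // set by a human reviewer
  posted: boolean;
  createdAt: number; // epoch milliseconds
}

// Picks the single oldest draft that is approved and not yet posted.
function nextToPost(queue: readonly QueuedDraft[]): QueuedDraft | undefined {
  // filter() returns a new array, so sorting it leaves the queue untouched.
  const ready = queue.filter((d) => d.approved && !d.posted);
  ready.sort((a, b) => a.createdAt - b.createdAt);
  return ready[0];
}
```

Unapproved drafts are never candidates, so nothing reaches a public channel without the human approval step described above.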
This is the part I want to grow. The measurement work is what gets a business in the door. Automating the busywork around it, the reporting, the outreach, the content, is where it becomes an ongoing relationship instead of a one-off. The Command Center is the proof I can build that end to end.
What it demonstrates
Two things at once. The judgment to know which number is lying and refuse to report a dishonest one, and the engineering to ship the production system that produces those numbers, test it so it can't drift, and automate the work around it. Measurement integrity and a system that runs unattended, from the same person.