Rank tracking API: building agency rank tracking that scales
Once you're tracking more than a few hundred keywords across client sites, per-seat rank trackers stop making sense and the bill stops being linear. The fix is a rank tracking API: you pull SERP positions on a schedule, store them yourself, and own the data. I run a system like this in production for an SEO back-office. Here's the architecture, the polling code, and the data-integrity discipline that separates a number you can put in front of a client from a number that's quietly wrong.
What a rank tracking API is
A rank tracking API is an endpoint that returns where a domain ranks for a keyword in a specific location and device, so you can pull positions programmatically instead of reading them off a dashboard. You send a keyword, a target country or city, and desktop or mobile; it returns the ranking URL and its position. Stored on a schedule, those calls become a position history you own.
There are two kinds, and the difference matters. A dedicated rank tracking API (the kind sold by tracking tools) returns a clean position number for your domain. A raw SERP API (DataForSEO, the search-result endpoints from the larger SEO platforms) returns the whole results page, and you find your domain in it yourself. The SERP API is cheaper per call and more flexible, but you inherit the parsing and the edge cases. For agency-scale tracking I use the SERP-API route, because at volume the per-call price is what decides whether this is profitable.
Why agencies move off per-seat tools
Per-seat rank trackers price on tracked keywords and refresh frequency, so the cost climbs with every client you add. A rank tracking API decouples the bill from the seat: you pay per SERP call, store the results once, and serve them to every client report, dashboard, and alert without paying again. At a few hundred keywords the tools win on convenience. Past a few thousand, the API wins on both cost and control.
The control part gets underrated. When you own the position table, you can backfill a keyword you forgot to add, recompute a client's average position after you fix a tracking mistake, and join ranking data against your own traffic and conversion numbers in one query. None of that is possible when your history lives inside a vendor's dashboard. The agency back-office I run does exactly this: ranks land in our own database, and the same rows feed both the internal QA checks and the client-facing report.
The catch is that you're now responsible for the things the tool quietly handled. The position you see depends on location, device, personalization, and the literal minute you checked, because the SERP itself moves. Google's own documentation is blunt that ranking is dynamic and varies by context and over time (Google Search Central, 2026). Owning the API means owning that variance instead of trusting a vendor to have smoothed it for you.
Polling the API on a schedule
The core loop is small: read your list of keyword and location pairs, call the SERP API for each, find the client domain in the results, and write the position with a timestamp. The hard parts are not the call. They're concurrency limits, retries, and recording a result even when the domain doesn't rank at all.
Here's the shape of the worker that runs against our queue. It calls the SERP endpoint, scans the organic results for the tracked domain, and returns a structured result rather than a bare number, so a "not found" is a real recorded outcome and not a gap:
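import os
from urllib.parse import urlparse

import requests

# Placeholder variable name: load the credential however you manage secrets.
TOKEN = os.environ["SERP_API_TOKEN"]

def fetch_rank(keyword: str, domain: str, location: str, device: str) -> dict:
    """Call the SERP API and locate the domain in the organic results."""
    resp = requests.post(
        "https://api.example-serp.com/v3/serp/google/organic",
        json={"keyword": keyword, "location": location, "device": device},
        headers={"Authorization": TOKEN},
        timeout=30,
    )
    resp.raise_for_status()
    results = resp.json()["organic"]
    target = domain.lower()
    for item in results:
        # Compare hostnames, not substrings, so "example.com" cannot match
        # "notexample.com" or a URL that only mentions the domain in its path.
        host = (urlparse(item["url"]).hostname or "").lower()
        if host == target or host.endswith("." + target):
            return {"position": item["rank"], "url": item["url"], "found": True}
    # Checked, but not in the returned results: a real outcome, not an error.
    return {"position": None, "url": None, "found": False}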
That found: False case is the one demos drop. If a domain falls out of the top 100, most naive scripts either crash or skip the row, and the next day your chart shows a gap that looks like the job failed. Recording an explicit "not ranking" is what lets you tell the difference between "we dropped off page one" and "the tracker broke," which is the first question a client asks.
Around that function you need a scheduler and a concurrency cap. Most SERP APIs throttle hard, so firing 5,000 calls at once gets you rate-limited and charged for the failures. A small worker pool with retry-on-429 is the whole pattern. The retry half looks like this:
import time

def fetch_with_retry(job: dict, attempts: int = 3) -> dict:
    """Retry fetch_rank on HTTP 429 with exponential backoff; re-raise anything else."""
    for n in range(attempts):
        try:
            return fetch_rank(**job)
        except requests.HTTPError as e:
            if e.response.status_code == 429 and n < attempts - 1:
                time.sleep(2 ** n)  # back off: 1s, then 2s with the default 3 attempts
                continue
            raise
Exponential backoff on 429 alone removed almost all of the transient failures we used to see in the nightly run. The few that remain get logged as failed jobs and retried in the next cycle, never written as a fake zero.
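The concurrency cap can be sketched with the standard library's thread pool. This is a minimal sketch, not the production worker: the name run_batch, the default of 8 workers, and the fetch parameter are illustrative assumptions. In practice fetch would be fetch_with_retry, and the right pool size depends on your provider's rate limit.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable


def run_batch(jobs: list[dict], fetch: Callable[[dict], dict], max_workers: int = 8):
    """Run jobs through a bounded thread pool (hypothetical helper).

    Returns (results, failed). `results` pairs each job with its outcome;
    `failed` holds jobs that raised, so they can be retried in the next
    cycle instead of being written as a fake position.
    """
    results, failed = [], []
    # max_workers is the concurrency cap: at most that many calls in flight.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch, job): job for job in jobs}
        for future in as_completed(futures):
            job = futures[future]
            try:
                results.append((job, future.result()))
            except Exception as exc:  # anything that survived the retries
                failed.append((job, repr(exc)))
    return results, failed
```

Keeping failures in their own list is the point of the design: a job that errored never reaches the position table, so it cannot be mistaken for a "not ranking" result.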
The schema that makes the data trustworthy
Store every check as an immutable row, not a single "current rank" you overwrite. Each row records the keyword, domain, location, device, position, ranking URL, and the exact timestamp of the check. History is the product. A table you overwrite throws away the one thing a client is paying you to show: the trend.
The table that has held up for us is deliberately boring. One row per check, never updated after it's written:
CREATE TABLE rank_checks (
    id          BIGSERIAL PRIMARY KEY,
    keyword     TEXT NOT NULL,
    domain      TEXT NOT NULL,
    location    TEXT NOT NULL,
    device      TEXT NOT NULL,
    position    INT,                -- NULL = not in top 100
    result_url  TEXT,
    checked_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);
The position INT being nullable is the load-bearing decision. A null is "checked, not ranking," which is different from a missing row, which means "never checked." Conflating those two is how a rank chart lies. With the location and device columns, the same keyword tracked from New York on mobile and from London on desktop becomes two separate, comparable series, instead of one averaged number that means nothing.
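To make the null-versus-missing distinction concrete, here is a small sketch using Python's built-in sqlite3 module with an in-memory database. The SQLite column types and the record_check and latest_status helpers are stand-ins for illustration, not the production Postgres code, and the sample keyword is made up.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE rank_checks (
        id         INTEGER PRIMARY KEY AUTOINCREMENT,
        keyword    TEXT NOT NULL,
        domain     TEXT NOT NULL,
        location   TEXT NOT NULL,
        device     TEXT NOT NULL,
        position   INTEGER,          -- NULL = checked, not ranking
        result_url TEXT,
        checked_at TEXT NOT NULL     -- ISO 8601 UTC timestamp
    )
    """
)


def record_check(job: dict, result: dict) -> None:
    """Append one row per check; existing rows are never updated."""
    conn.execute(
        "INSERT INTO rank_checks"
        " (keyword, domain, location, device, position, result_url, checked_at)"
        " VALUES (?, ?, ?, ?, ?, ?, ?)",
        (job["keyword"], job["domain"], job["location"], job["device"],
         result["position"], result["url"],
         datetime.now(timezone.utc).isoformat()),
    )


def latest_status(keyword: str, domain: str, location: str, device: str) -> str:
    """Distinguish 'never checked' (no row) from 'not ranking' (NULL position)."""
    row = conn.execute(
        "SELECT position FROM rank_checks"
        " WHERE keyword = ? AND domain = ? AND location = ? AND device = ?"
        " ORDER BY id DESC LIMIT 1",
        (keyword, domain, location, device),
    ).fetchone()
    if row is None:
        return "never checked"
    if row[0] is None:
        return "checked, not ranking"
    return f"position {row[0]}"


job = {"keyword": "emergency plumber", "domain": "example.com",
       "location": "New York", "device": "mobile"}
record_check(job, {"position": None, "url": None, "found": False})
print(latest_status(**job))  # checked, not ranking
print(latest_status("emergency plumber", "example.com", "London", "desktop"))  # never checked
```

The New York mobile series has a row with a null position; the London desktop series has no row at all. A report built on this table can say which of the two it is looking at.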
Data integrity: the part everyone skips
The number on the client report is only as good as the discipline behind it. Always check the same keyword in the same location and device, treat a non-ranking result as data, expect day-to-day movement that isn't real change, and validate before you publish. Rankings move on their own; your job is to not present noise as a result.
Position volatility is normal even when nothing you did caused it. SERPs fluctuate constantly from personalization, location, and ongoing Google updates, and Ahrefs has documented how much day-to-day ranking movement is just noise rather than a real change in standing (Ahrefs, 2026). If you don't account for it, you'll send a panicked client an alert about a one-position drop that corrects itself the next morning. The integrity rules that keep our reports honest:
- Pin the context. Same location, same device, every check. A keyword's "rank" is meaningless without them, so they're part of the key, not metadata.
- Null is a value. A dropped keyword gets a row with a null position, not silence. The report can then say "fell out of top 100" honestly.
- Validate the batch before it lands. If a nightly run returns nulls for 80% of a domain that was ranking yesterday, that's a parser or API failure, not a mass deindexing. Flag the batch, don't write it.
- Smooth for the report, keep the raw. Show a client a 7-day view so daily noise doesn't read as a trend, but never delete the daily rows underneath.
That fourth rule is why the raw table is append-only. When we move from raw checks to a client-facing view, the smoothing happens in the query, not by mutating the data. The same rows that drive daily rank tracking internally are the ones that roll up into the weekly trend a client sees, which means the internal QA number and the reported number can never silently diverge.
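Rules three and four can be sketched in a few lines. This is an illustration under assumptions, not the production checks: the function names, the 0.8 threshold (taken from the 80% example in the list above), and the plain-mean smoothing are choices made for the sketch, and positions are passed in as simple dicts and lists rather than read from the database.

```python
from typing import Optional


def batch_looks_broken(
    yesterday: dict[str, Optional[int]],
    today: dict[str, Optional[int]],
    max_lost_share: float = 0.8,
) -> bool:
    """Flag a batch when most keywords that ranked yesterday come back null today.

    Both arguments map keyword -> position (None = not ranking) for one domain.
    A keyword missing from today's batch also counts as lost.
    """
    ranked_before = [kw for kw, pos in yesterday.items() if pos is not None]
    if not ranked_before:
        return False  # nothing was ranking, so there is nothing to compare against
    lost = sum(1 for kw in ranked_before if today.get(kw) is None)
    return lost / len(ranked_before) >= max_lost_share


def trailing_average(positions: list[Optional[int]], window: int = 7) -> Optional[float]:
    """Average the non-null positions in the last `window` daily checks.

    Reads the raw values and returns a derived number; the input list is
    not modified. Returns None when the keyword did not rank on any day
    in the window.
    """
    ranked = [p for p in positions[-window:] if p is not None]
    if not ranked:
        return None
    return round(sum(ranked) / len(ranked), 1)
```

A flagged batch would be held back for review rather than inserted. How the smoothed view should treat non-ranking days (skip them, as here, or surface them separately) is a reporting decision this sketch does not settle.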
What it actually costs to run
The economics are simple: cost is tracked keywords times checks per month times the per-call price. Tracking 3,000 keywords daily is roughly 90,000 SERP calls a month, and at SERP-API rates that's a fraction of what the same coverage costs in per-seat tools once you're past a handful of clients. The trade is engineering time and the responsibility for data quality that you now own.
The lever most agencies miss is frequency. Not every keyword needs a daily check. Money keywords and anything in active flux get tracked daily; stable long-tail terms that haven't moved in weeks drop to weekly. We tier the queue by volatility, and that one decision cut our call volume meaningfully without losing anything a client would notice, because a keyword sitting at position 4 for a month doesn't need 30 checks to confirm it's still at 4.
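The tiering decision and its effect on call volume can be sketched as follows. The 14-check window, the "has not moved at all" stability test, the 30-day month, and the example split between tiers are all assumptions for illustration; they are not the thresholds or figures from the production queue.

```python
from typing import Optional


def check_interval_days(
    recent: list[Optional[int]],
    is_money_keyword: bool = False,
    stable_checks: int = 14,
) -> int:
    """Return 1 (daily) or 7 (weekly) for a keyword's next check.

    `recent` is the keyword's position history, oldest first (None = not
    ranking). A keyword drops to weekly only if it has enough history and
    every one of the last `stable_checks` checks returned the same position.
    """
    if is_money_keyword or len(recent) < stable_checks:
        return 1
    window = recent[-stable_checks:]
    if None in window or len(set(window)) > 1:
        return 1  # moving, or not ranking for part of the window: stay daily
    return 7


def monthly_calls(daily_keywords: int, weekly_keywords: int, days: int = 30) -> int:
    """Approximate SERP calls per month for a two-tier queue."""
    return daily_keywords * days + weekly_keywords * (days // 7)


print(monthly_calls(3000, 0))     # 90000: every keyword checked daily
print(monthly_calls(1000, 2000))  # 38000: hypothetical 1,000 daily / 2,000 weekly split
```

Multiply either figure by your provider's per-call price to get the monthly bill; the price is left out here because it varies by vendor and plan.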
This is the same back-office discipline as everything else we automate. The position data doesn't live alone; it joins the same pipeline that produces the technical SEO audits and the client reporting, so one database answers "where do we rank, what's broken, and what did we ship" in a single place instead of three vendor logins.
Mistakes I made building this
- Overwriting the current rank. The first version kept one row per keyword and updated it. I threw away weeks of history before I realized the history was the entire point. Append-only from then on.
- Treating "not found" as zero. A null and a zero are not the same, and a position of zero charts as "ranking number one," which is the opposite of the truth. Null means not ranking; never coerce it.
- Ignoring location until a client noticed. A local-business client's "rank" from our server's default location was nowhere near what they saw at home. Location and device have to be pinned per keyword, not assumed.
- No batch validation. One night the API changed a field name, our parser returned nulls for everything, and the job wrote them all before anyone looked. Now a batch that swings too hard gets quarantined, not published.
None of these are protocol problems. The SERP call is the easy part, same as the MCP example I keep coming back to: the integration is trivial and the discipline is the work. A rank tracking API gives you the raw positions cheaply. Turning those into a number a client can trust is what the database, the validation, and the append-only history are for.
If you're running an agency and your rank tracking bill grows every time you sign a client, that's the symptom this architecture fixes. We'll run a free audit of your current rank tracking and reporting setup: where the per-seat costs are leaking, which keywords are over- or under-tracked, and whether the numbers your clients see would survive the integrity checks above. No pitch, just the gaps we'd find.
Pavle Lazic is the founder of Scalably, where he builds and runs multi-tenant Claude agent platforms in production for real businesses. He writes about the Claude Agent SDK, MCP servers, and what it actually takes to put AI agents to work.