TopListers just crossed one hundred thousand live job listings. Every one of them hand-checked each morning. No paid feeds, no fake roles padding the count, no "matches" the algorithm hallucinated. Just real, current openings, ranked 1 through 100,000.
I want to mark the moment, partly because milestones deserve marking and partly because the engineering choices that got us here are worth writing down before they become "obvious" in retrospect. This is a short note on what the architecture had to do, what broke along the way, and where we go from here.
What "hand-checked" actually means
When we say hand-checked, we do not mean a human is reading every single listing every morning; that is not how this scales to 100,000. We mean every listing passes through a checked pipeline before it appears on the list, and the pipeline itself is supervised by a person every day. There is a difference, and the difference is the point.
The pipeline does three things every morning:
- Fetches the day's roles from upstream feeds (Adzuna's API is the spine, with a handful of supplementary sources).
- De-dupes & verifies. Every role gets a canonical company/title/location hash, then a lightweight check that the source URL is still live and the role is still open. Dead links and expired roles are dropped before they reach the ranker.
- Ranks & flags. The freshness, source quality, and signal-richness of each listing produce a rank from 1 to N. Anything that smells like a re-post, a recruiter farm, or a duplicate gets flagged for the morning review queue.
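To make the three steps concrete, here is a simplified sketch of a morning run. It is an illustration under stated assumptions, not the production pipeline: the `Listing` fields, the injected `is_live` URL checker, and the freshness-only ranking are all hypothetical stand-ins.

```python
import hashlib
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Listing:
    # All field names here are hypothetical, chosen for illustration.
    company: str
    title: str
    location: str
    url: str
    expires: date   # date the role closes
    posted: date    # date the role was posted


def canonical_hash(listing: Listing) -> str:
    """Hash of the normalised company/title/location triple."""
    parts = (listing.company, listing.title, listing.location)
    normalised = "|".join(" ".join(p.lower().split()) for p in parts)
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def morning_run(listings, today, is_live):
    """Verify, de-dupe, and rank. Returns (ranked, review_queue)."""
    seen = set()
    kept, review_queue = [], []
    for item in listings:
        # Verify: drop expired roles and dead links before ranking.
        if item.expires < today or not is_live(item.url):
            continue
        key = canonical_hash(item)
        if key in seen:
            # Probable duplicate: flag for the human, do not drop silently.
            review_queue.append(item)
            continue
        seen.add(key)
        kept.append(item)
    # Freshness-only ranking as a stand-in; source quality and
    # signal-richness are omitted from this sketch.
    kept.sort(key=lambda l: l.posted, reverse=True)
    ranked = list(enumerate(kept, start=1))
    return ranked, review_queue
```

Passing `is_live` in as a function keeps the sketch testable without network access; a real run would plug in an HTTP check there.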
I spend ~15 minutes a morning on that review queue. That is the "hand" in hand-checked. It is not glamorous, but it is the difference between a list you trust and a list that gradually fills up with noise.
The things that broke along the way
Pagination at 10k
The first wall was the simplest one: client-side pagination of 10,000 ranked rows is fine; 100,000 is not. We moved to cursor-based pagination on the API and virtualised the list on the React side. The page now never holds more than a few hundred rows in the DOM, regardless of how deep you scroll.
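For readers who have not built one, a minimal sketch of cursor-based pagination over a ranked list follows. The cursor format (base64-encoded JSON holding the last rank seen) and the names `encode_cursor`, `decode_cursor`, and `fetch_page` are assumptions for illustration; the in-memory list stands in for a database query.

```python
import base64
import json


def encode_cursor(rank: int) -> str:
    """Pack the last-seen rank into an opaque, URL-safe token."""
    raw = json.dumps({"after_rank": rank}).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_cursor(cursor: str) -> int:
    """Recover the last-seen rank from a cursor token."""
    raw = base64.urlsafe_b64decode(cursor.encode("ascii"))
    return json.loads(raw)["after_rank"]


def fetch_page(rows, cursor=None, limit=100):
    """Return (page, next_cursor) for rows sorted by ascending rank.

    In a database this would be roughly:
    WHERE rank > :after ORDER BY rank LIMIT :limit + 1
    """
    after = decode_cursor(cursor) if cursor else 0
    # Fetch one extra row to learn whether another page exists.
    matching = [r for r in rows if r["rank"] > after][: limit + 1]
    page = matching[:limit]
    has_more = len(matching) > limit
    next_cursor = encode_cursor(page[-1]["rank"]) if has_more else None
    return page, next_cursor
```

Because each request asks for "rows after rank X" rather than "rows at offset Y", the cost of a page does not grow with how deep the reader has scrolled.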
Dedup at 50k
Around 50,000 active listings, the dumb-but-fast dedup logic (company + title + location → hash) started colliding on legitimate roles: two genuinely different Senior Frontend Engineer roles at the same company in the same city, posted by different teams. We added the role's source-issued ID to the hash to keep them distinct, with a soft-collision rule that surfaces probable duplicates to the morning queue instead of dropping them silently. Drop-silently is the kind of bug you never notice until someone complains they cannot find their listing.
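A rough sketch of that two-level scheme, assuming each role is a dict with hypothetical `company`, `title`, `location`, and `source_id` keys. The "soft" key is the old company/title/location hash; the "hard" key adds the source-issued ID. Roles that share a soft key but differ on the hard key are all kept and surfaced for review.

```python
import hashlib
from collections import defaultdict


def _norm(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variants match."""
    return " ".join(text.lower().split())


def soft_key(role: dict) -> str:
    """Original dedup key: company + title + location."""
    joined = "|".join(_norm(role[f]) for f in ("company", "title", "location"))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def hard_key(role: dict) -> str:
    """Soft key plus the source-issued ID, so distinct postings stay distinct."""
    joined = soft_key(role) + "|" + role["source_id"]
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def dedupe(roles):
    """Return (kept, review_groups).

    Identical hard keys mean the same source record seen twice, so only
    the first copy is kept. Soft collisions are kept AND flagged.
    """
    kept = {}
    by_soft = defaultdict(list)
    for role in roles:
        hk = hard_key(role)
        if hk in kept:
            continue  # same record from the same source, fetched twice
        kept[hk] = role
        by_soft[soft_key(role)].append(role)
    review_groups = [group for group in by_soft.values() if len(group) > 1]
    return list(kept.values()), review_groups
```

The design choice worth noticing is that a soft collision never removes anything from the list; it only adds work to the morning queue.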
Trust at 100k
The non-technical failure mode at this scale is trust. Once you cross six figures, the list itself becomes the marketing, but only if people believe the count is real. The temptation to pad with stale roles to keep the headline number above some threshold is real, and we have explicitly chosen not to. If the verified count drops below 100,000 tomorrow, the page will say so. That commitment is more important than the number.
What an AI PO lens would say about this
If I put my consulting hat on and look at the TopListers milestone the way I would look at a client product, three things stand out:
Evals before features. The dedup-and-verify pipeline is, in product language, an evaluation system. It runs every morning, against every candidate listing, with a defined notion of pass/fail (live URL, non-expired, not a duplicate). Most AI and data products do not have one of these. They have a "model" and a "demo" and a vague hope that things are fine. TopListers crossed 100k because the eval came before the feature.
The human-in-the-loop earns its keep. Fifteen minutes a morning to handle ambiguous cases is cheap. The mistake people make is trying to design those fifteen minutes away. They are what keep the list trusted; they are not a temporary scaffold to be removed once we have "enough data". They are the product.
Honest counts compound. Every product I have ever worked on that inflated a metric (daily actives, recommendations served, listings) eventually paid for it. It is much easier to write the post that says "we crossed 100k" when the 100k is real.
What's next
A few things on the immediate roadmap:
- Saved-search alerts. Tell TopListers what you are looking for, get a once-a-day digest when matching live roles appear. Privacy-first as always: no account required, signed-URL unsubscribe.
- Heatmaps for AI/PM/Data roles. Country-level views of where the hottest specialities are concentrated this week, this month, this quarter.
- Anonymous sightings. Anyone can drop a role they have seen in the wild, with the source link. Useful for catching roles that the official feeds miss.
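Since the alerts item mentions a signed-URL unsubscribe with no account, here is one way such a link could work, sketched with an HMAC signature. Everything specific in it is an assumption: the `SECRET` placeholder, the `example.com` base URL, and the `alert` and `sig` query parameter names are illustrative, not a description of the shipped feature.

```python
import hashlib
import hmac
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"replace-with-a-real-secret"        # placeholder, kept server-side
BASE_URL = "https://example.com/unsubscribe"  # hypothetical endpoint


def sign(alert_id: str) -> str:
    """HMAC-SHA256 of the alert ID, hex-encoded."""
    return hmac.new(SECRET, alert_id.encode("utf-8"), hashlib.sha256).hexdigest()


def unsubscribe_url(alert_id: str) -> str:
    """Build a link that proves the holder received the digest email."""
    query = urlencode({"alert": alert_id, "sig": sign(alert_id)})
    return f"{BASE_URL}?{query}"


def verify_unsubscribe(url: str) -> Optional[str]:
    """Return the alert ID if the signature checks out, else None."""
    query = parse_qs(urlparse(url).query)
    alert_id = query.get("alert", [""])[0]
    sig = query.get("sig", [""])[0]
    expected = sign(alert_id)
    # Constant-time comparison on bytes to avoid timing leaks.
    if alert_id and hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8")):
        return alert_id
    return None
```

The link itself carries the proof of ownership, which is what lets unsubscribing work without a login.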
If any of that sounds like something you would use, or you have a job-search problem you wish TopListers solved, tell me. I read everything that comes in to hello@ainika.xyz, and the roadmap genuinely shifts based on what people are stuck on.
The list is not the product. The trust that the list is real: that is the product.
Posters from the 100k drop
A few of the visuals from the announcement, courtesy of the TopListers design system. Feel free to reshare.
Thanks for being here. If you build anything related to job market data, hiring intelligence, or product analytics, I would love to hear what you are working on. → visit toplisters.xyz