Research · v1 · April 2026
The state of Shopify catalogs, measured against AI shopping agents.
FlintmereBot scans Shopify stores against the seven checks AI shopping agents run — drawn from published Shopify, GS1 UK, and Google Merchant Center specs. We publish the aggregate numbers — score, grade distribution, per-vertical gaps — and nothing else. No individual store is ever named. Early sample — 9 stores scanned so far. We publish the numbers as they come in, but don’t frame them as “the median Shopify store” until the dataset clears 100 per vertical.
Early sample
45 / 100 · 9 stores so far
Early signal — the first Shopify catalogs are landing, and the gap between visible and structured data is already loud.
The score shown here covers the 9 stores scanned so far; it is not a published median. We don’t call it “the median Shopify catalog” until the per-vertical sample clears 100. The trend is already visible, though: catalogs score high on titles and imagery and low on the structured fields agents actually filter on.
Grade distribution · early sample
How the first 9 Shopify stores land against the seven AI-readiness checks.
- A: 0% · 0 stores
- B: 0% · 0 stores
- C: 0% · 0 stores
- D: 89% · 8 stores
- F: 11% · 1 store
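For readers who want to check the arithmetic, the percentages above are each grade's count divided by the 9 stores scanned, rounded to the nearest whole percent. A minimal sketch follows; the function name `grade_shares` is illustrative and not part of any Flintmere tooling.

```python
# Counts per grade as listed in the distribution above (9 stores in total).
GRADE_COUNTS = {"A": 0, "B": 0, "C": 0, "D": 8, "F": 1}


def grade_shares(counts):
    """Return each grade's share of the sample as a whole-number percentage."""
    total = sum(counts.values())
    if total == 0:
        # No stores scanned yet: report every share as zero rather than divide by zero.
        return {grade: 0 for grade in counts}
    return {grade: round(100 * n / total) for grade, n in counts.items()}


shares = grade_shares(GRADE_COUNTS)
for grade, pct in shares.items():
    print(f"{grade}: {pct}% · {GRADE_COUNTS[grade]}")
```

With a sample this small, one store moves a grade's share by roughly 11 percentage points, which is why the figures are labelled as an early sample.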
By vertical
The gap between verticals is bigger than the gap between good and bad stores inside any one vertical.
Apparel
—
sample pending
Size, colour, material, gender — the four fields apparel catalogs most often leave unstructured.
Read the apparel breakdown →
Beauty
—
sample pending
Ingredients, shade, volume, claims — beauty agents filter on all four, and most catalogs ship none of them structured.
Read the beauty breakdown →
Food & drink
—
sample pending
Allergens, nutrition, provenance, certifications — the regulatory fields food agents depend on to answer any query safely.
Read the food & drink breakdown →
Methodology
Scanned by FlintmereBot · aggregate-published · refreshed monthly.
FlintmereBot identifies itself as FlintmereBot/1.0 (+audit.flintmere.com/bot) and rate-limits to one request per two seconds per host. Each scan fetches robots.txt, sitemap.xml, llms.txt, products.json, and a small sample of product pages. Scores are computed by the same rule-based engine that powers the public scanner.
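The crawl behaviour described above can be sketched roughly as follows. This is an illustration under stated assumptions, not FlintmereBot's actual code: the class name `PerHostRateLimiter`, the helper `scan_urls`, and the choice to look for all four files at the site root are ours. Only the user-agent string, the file names, and the two-second interval come from the text.

```python
import time
from urllib.parse import urlsplit

# User-agent string and pacing as stated in the methodology text.
USER_AGENT = "FlintmereBot/1.0 (+audit.flintmere.com/bot)"
MIN_INTERVAL_SECONDS = 2.0  # one request per two seconds per host

# Assumption: each file is requested from the site root.
SCAN_PATHS = ["/robots.txt", "/sitemap.xml", "/llms.txt", "/products.json"]


class PerHostRateLimiter:
    """Spaces out requests so each host sees at most one per interval."""

    def __init__(self, min_interval=MIN_INTERVAL_SECONDS,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = {}  # host -> earliest time the next request may start

    def wait(self, url):
        """Block until a request to url's host is allowed; return the delay applied."""
        host = urlsplit(url).netloc
        now = self._clock()
        ready_at = self._next_allowed.get(host, now)
        delay = max(0.0, ready_at - now)
        if delay > 0:
            self._sleep(delay)
        # Reserve the next slot one interval after this request starts.
        self._next_allowed[host] = max(now, ready_at) + self.min_interval
        return delay


def scan_urls(base_url):
    """Build the list of fixed files fetched for one store (hypothetical helper)."""
    return [base_url.rstrip("/") + path for path in SCAN_PATHS]
```

A caller would invoke `limiter.wait(url)` before each request and send `USER_AGENT` in the `User-Agent` header. Keeping the state per host means a slow store never delays the scan of another.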
We publish medians, means, and grade distributions. We never publish the domain of any individual store. Merchants who want to be excluded can add FlintmereBot to their robots.txt and the next scan will honour it. The underlying dataset is never shared or sold.
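As an illustration of the opt-out, the sketch below uses Python's standard `urllib.robotparser` to show how a robots.txt rule naming FlintmereBot would be read. The exact directive shown is an assumption on our part: the text only says to add FlintmereBot to robots.txt. The domain `shop.example` and the helper `is_allowed` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumed opt-out rule: a group addressed to FlintmereBot that disallows the whole site.
ROBOTS_TXT = """\
User-agent: FlintmereBot
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The rule blocks FlintmereBot but leaves other crawlers unaffected.
print(is_allowed(ROBOTS_TXT, "FlintmereBot/1.0", "https://shop.example/products.json"))
print(is_allowed(ROBOTS_TXT, "OtherBot/2.0", "https://shop.example/products.json"))
```

Because the rule targets one user agent by name, a merchant can opt out of this research without changing what other crawlers may read.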
A note on what we could reach
The v1 cohort is the stores FlintmereBot could read politely. A meaningful share of the Shopify market — mostly the larger catalogs sitting behind enterprise bot-management — returns a block before any product page loads. Those same blocks apply to ChatGPT, Perplexity, and every other AI shopping agent that comes knocking. So if a store isn’t in this sample, the agent reading its catalog today is getting the same answer: nothing. That’s the gap this research measures from both sides.
Get your store in the next edition
Run a free scan. Your score is included in the next monthly refresh.
Scans initiated by store owners are tagged separately from FlintmereBot crawls and contribute to next month’s aggregates. You keep your report; we keep the anonymised score. The more stores in the dataset, the tighter the benchmark becomes for everyone.