Flintmere · State of Shopify catalogs · v1 · July 2026

99% of 849 Shopify catalogs grade D or F.

Add my store to the next edition →

How FlintmereBot works

what we measured

The state of Shopify catalogs, measured against AI shopping agents.

FlintmereBot scans Shopify stores against the seven checks AI shopping agents run — drawn from published Shopify, GS1 UK, and Google Merchant Center specs. We publish the aggregate numbers (score, grade distribution, per-vertical gaps) and nothing else. No individual store is ever named. This is v1: 849 stores in the cohort, refreshed periodically.

Scores cluster inside a narrow band (47–50). The difference between the median catalog and the top decile is not sophistication; it is structured fields populated.

overall median

/ 100 · grade F

Most Shopify catalogs fail half the checks an AI shopping agent runs before it recommends a store.

Across 849 scanned stores, the median Shopify catalog earns a grade F: strong on visible surfaces (titles, imagery), weak on the structured fields agents depend on (barcodes, attribute metafields, category mapping).

grade distribution

How 849 Shopify stores stack up: 99% grade D or F.

[Chart: share of stores at each grade, A · B · C · D · F]

by vertical

The gap between verticals is bigger than the gap between good and bad stores inside any one vertical.

Apparel

median · grade F · 140 stores

Size, colour, material, gender — the four fields apparel catalogs most often leave unstructured.

Read the apparel breakdown →

Beauty

median · grade F · 128 stores

Ingredients, shade, volume, claims — beauty agents filter on all four, and most catalogs ship none of them structured.

Read the beauty breakdown →

Food & drink

median · grade F · 561 stores

Allergens, nutrition, provenance, certifications — the regulatory fields food agents depend on to answer any query safely.

Read the food & drink breakdown →

methodology

Scanned by FlintmereBot.

Aggregate-only. Refreshed periodically.

FlintmereBot identifies itself as FlintmereBot/1.0 (+audit.flintmere.com/bot) and rate-limits to one request per two seconds per host. Each scan fetches robots.txt, sitemap.xml, llms.txt, products.json, and a small sample of product pages. Scores are computed by the same rule-based engine that powers the public scanner.
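The crawl behaviour described above can be sketched roughly as follows. This is a minimal illustration, not FlintmereBot's actual code: the user-agent string, the two-second interval, and the four file names come from the text, while `HostRateLimiter`, `scan_plan`, and the sample product path are hypothetical names introduced here.

```python
import time
from urllib.parse import urlsplit

USER_AGENT = "FlintmereBot/1.0 (+audit.flintmere.com/bot)"  # string quoted in the text
MIN_INTERVAL_S = 2.0  # one request per two seconds per host
WELL_KNOWN_PATHS = ("/robots.txt", "/sitemap.xml", "/llms.txt", "/products.json")


class HostRateLimiter:
    """Hypothetical helper: spaces out requests to the same host."""

    def __init__(self, min_interval=MIN_INTERVAL_S, clock=time.monotonic):
        self._min_interval = min_interval
        self._clock = clock
        self._next_allowed = {}  # host -> earliest time the next request may start

    def delay_for(self, url):
        """Reserve the next slot for this URL's host; return seconds to wait."""
        host = urlsplit(url).netloc.lower()
        now = self._clock()
        start = max(now, self._next_allowed.get(host, now))
        self._next_allowed[host] = start + self._min_interval
        return start - now


def scan_plan(store_origin, sample_product_paths=()):
    """List the URLs a single scan would request, in order."""
    origin = store_origin.rstrip("/")
    return [origin + path for path in (*WELL_KNOWN_PATHS, *sample_product_paths)]
```

A caller would sleep for `limiter.delay_for(url)` seconds before each request and send `USER_AGENT` in the `User-Agent` header. Reserving slots per host keeps scans of different stores from slowing each other down.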

We publish medians, means, and grade distributions. We never publish the domain of any individual store. Merchants who want to be excluded can disallow the FlintmereBot user agent in their robots.txt, and the next scan will honour it. The underlying dataset is never shared or sold.
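As an illustration of the opt-out, the sketch below checks a robots.txt with Python's standard `urllib.robotparser`. The robots.txt content and the `is_allowed` helper are assumptions made for this example; in particular, the exact user-agent token FlintmereBot matches on is not stated in the text, so `FlintmereBot` is a guess based on the bot's name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a merchant might publish to opt out.
ROBOTS_TXT = """\
User-agent: FlintmereBot
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if this robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With the rules above, `is_allowed(ROBOTS_TXT, "FlintmereBot", "https://shop.example/products.json")` returns `False`, while a crawler with a different name is unaffected because no rule group applies to it.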

a note on what we could reach

The v1 cohort is the stores FlintmereBot could read politely. A meaningful share of the Shopify market — mostly the larger catalogs sitting behind enterprise bot-management — returns a block before any product page loads. Those same blocks apply to ChatGPT, Perplexity, and every other AI shopping agent that comes knocking. So if a store isn’t in this sample, the agent reading its catalog today is getting the same answer: nothing. That’s the gap this research measures from both sides.

Full methodology notes →

the next edition

Run a free scan. Your score sits inside the next refresh.

Scans initiated by store owners are tagged separately from FlintmereBot crawls and contribute to the next refresh's aggregates. You keep your report; we keep the anonymised score. The more stores in the dataset, the tighter the benchmark becomes for everyone.
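One way to picture how tagged, anonymised scores could roll up into published aggregates is the sketch below. Everything in it is assumed for illustration: the `ScanRecord` shape, the `"owner"` and `"bot"` tags, and especially the grade cutoffs, which the report does not state.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import mean, median

# Hypothetical grade cutoffs; the report does not publish the real ones.
GRADE_CUTOFFS = ((90, "A"), (80, "B"), (70, "C"), (60, "D"))


@dataclass(frozen=True)
class ScanRecord:
    source: str    # "owner" or "bot" (hypothetical tags)
    vertical: str
    score: float   # 0-100; no domain is kept on the record


def grade(score):
    """Map a 0-100 score to a letter using the assumed cutoffs."""
    for cutoff, letter in GRADE_CUTOFFS:
        if score >= cutoff:
            return letter
    return "F"


def aggregate(records):
    """Summarise a non-empty list of records without exposing any store."""
    scores = [r.score for r in records]
    return {
        "stores": len(scores),
        "median": median(scores),
        "mean": mean(scores),
        "grades": dict(Counter(grade(s) for s in scores)),
        "by_source": dict(Counter(r.source for r in records)),
    }
```

Because the record carries only a source tag, a vertical, and a score, nothing in the output can be traced back to a single store.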

Run the free scan, or book the concierge audit (from £197)
