Last week I connected Claude to my Shopify store. The simplest path that works against a live store right now is Claude Code. Cursor, Gemini CLI, VS Code, and Codex are all supported too, but Claude Code is the cleanest entry point if you're already in the Anthropic world.

Getting connected

Shopify released the AI Toolkit recently. It's just a GitHub repo, and what it does is open your store's Admin GraphQL API to Claude: live product data, real-time inventory, and write access to titles, descriptions, metafields, SEO fields, and Liquid files. The things you normally do through the Shopify admin, Claude can now do programmatically.

The connection process takes about ten minutes. Open Claude Code and paste these two commands into the chat:

/plugin marketplace add Shopify/shopify-ai-toolkit
/plugin install shopify-plugin@shopify-ai-toolkit

The first command adds the Shopify marketplace. The second installs the plugin itself, which bundles the full toolkit and auto-updates as new capabilities ship. Now you have access to your live store from a chat window. Ask how many products you have, update a title, pull a customer segment, query inventory, run an SEO audit. What used to take twenty minutes of clicking through the admin can happen in one prompt.

There’s a catch. What happens when a connected agent has no instructions?

The connection is the easy part. The hard part is what comes after it. Claude would update 40 product titles correctly and start drifting by product 60. A rule I'd mentioned at the start of a session would stop being applied in the middle of it, and occasionally it would suggest changes I hadn't asked for - not malicious, just an agent filling the gaps with its best guess when there's nothing telling it not to.

On a test store with 21 products and fake data, that's fine. On a live store with a real catalog, real customers, and no undo button on some operations, it's not.

That's what CLAUDE.md solves.

What CLAUDE.md is and why it matters

CLAUDE.md is a file Claude reads before every operation. It sits in your toolkit directory, loads automatically at the start of each session, and gets carried through every task. The model doesn't forget your rules halfway through a 400-product audit. It doesn't re-interpret what "read-only" means on operation 47.

Shopify's AI Toolkit handles the technical layer - the API connections, GraphQL schemas, and CLI access. CLAUDE.md handles everything the toolkit doesn't know about your specific store: what you sell, how you write, which files you never want touched, what the handle naming convention is, which native metafields you use.

Mine has five sections.

The store section is one paragraph. Name, URL, plan, product count, what we sell. Claude needs this before it can make sensible decisions about anything.

The writing section is where most people write "describe your brand voice in three words." That doesn't work. I gave it five example product titles I'd ship and five I wouldn't, plus a list of words the brand never uses. Without examples, suggestions come back close but generic. With them, the output starts landing closer to how the store actually sounds. That section improved output quality more than anything else in the file.

The catalog conventions section covers which native Shopify metafields we use, the handle naming convention, and the product type taxonomy. Before I added this, Claude was inventing its own conventions - ones that would have required cleanup to undo.

The rules section is eight hard nos. Non-negotiable, and Claude reads them before every operation even if I forget to mention them in the prompt.

The operations section has pointers to saved prompts for recurring tasks - SEO audits, customer segmentation, redirect builds. Once Claude knows these exist, a short reference invokes the full spec instead of me rebuilding it from scratch each session.

The eight rules

NEVER modify a product handle. Handles are URL identifiers and changing them creates redirects to maintain forever.
NEVER write to the live theme. If theme changes are needed, propose as a diff and I apply manually.
NEVER apply changes without showing the plan first. Default for every operation is plan-then-confirm.
NEVER run write operations on more than 50 products in a single batch without explicit confirmation per batch.
NEVER touch the checkout liquid, cart liquid, or anything in the /checkout/ section of the theme.
NEVER modify discount codes or sale prices without showing the exact before-and-after first.
ALWAYS include a "do not write" instruction in your own plan for any audit, even if I forgot to ask.
ALWAYS suggest running on a dev store clone first for operations touching over 100 products.
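
The rules live in CLAUDE.md as plain-language instructions, but writing a few of them as code is one way to see exactly what they exclude. Below is a minimal sketch, not part of the toolkit, that checks a proposed plan against rules 1, 2, and 4. The `Change` shape and its field names are hypothetical, invented here for illustration.

```python
from dataclasses import dataclass

MAX_BATCH = 50  # rule 4: more than 50 products needs explicit confirmation


@dataclass
class Change:
    """One proposed write. This shape is hypothetical, not a toolkit type."""
    target: str     # product ID, or a theme file path
    field: str      # e.g. "title", "handle", "theme_file"
    new_value: str


def violations(plan: list[Change]) -> list[str]:
    """Return a readable list of rule violations for a proposed plan."""
    problems = []

    # Rule 4: count distinct products touched by this batch.
    products = {c.target for c in plan if c.field != "theme_file"}
    if len(products) > MAX_BATCH:
        problems.append(f"batch touches {len(products)} products (limit {MAX_BATCH})")

    for c in plan:
        if c.field == "handle":
            # Rule 1: handles are URL identifiers and never change.
            problems.append(f"{c.target}: handle changes are not allowed")
        elif c.field == "theme_file":
            # Rule 2: theme edits are proposed as a diff, never written.
            problems.append(f"{c.target}: theme changes must be proposed as a diff")
    return problems
```

An empty list means the plan passes these three checks. It says nothing about the other five rules, which still depend on the model reading CLAUDE.md.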

The third rule is the one that matters most. The day I forget to type "show me the plan first" is exactly the day the agent needs to ask anyway.
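
Plan-then-confirm is easiest to picture as a default that has to be switched off on purpose. The sketch below is an illustration under that assumption, not how the toolkit or Claude implements it. The `run` function and its `confirmed` flag are hypothetical names.

```python
from typing import Callable


def run(plan: list[str], apply: Callable[[str], None], confirmed: bool = False) -> list[str]:
    """Show the plan; apply steps only when the caller explicitly confirmed."""
    for number, step in enumerate(plan, 1):
        print(f"{number}. {step}")

    if not confirmed:
        # The safe default: forgetting to ask for a plan still yields only a plan.
        print("Plan only. Nothing was written.")
        return []

    applied = []
    for step in plan:
        apply(step)
        applied.append(step)
    return applied
```

Calling `run(plan, apply)` without the flag prints the steps and writes nothing. The write path only opens with `confirmed=True`, which mirrors the point of the rule: the safe behavior should not depend on remembering to ask for it.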

Where Sidekick fits

Understanding CLAUDE.md makes the Sidekick question easier to answer. They're different tools built for different layers of the same system.

Sidekick is a GUI with a chat layer. Every action is confirmed one at a time, and the ceiling is whatever the Shopify admin exposes. It handles 80 percent of what a merchant or merchandiser needs day-to-day, like customer segments, BOGO rules, order lookups, and merchandising tasks. It's fast, safe, and doesn't need any configuration.

The AI Toolkit is the Admin GraphQL API with an agent layer - composable with other systems via MCP once you wire them up. Things like:

Bulk SEO across the full catalog in a single batched mutation
Metafield population at SKU-count scale
Raw Liquid edits with schema validation
Cross-system operations with Gorgias, Klaviyo, or a warehouse integration

Sidekick doesn't do these. They are the 20 percent of operations that are hardest to do manually and, if done well, probably drive the most revenue impact.

The catch is that the AI Toolkit's power is also what makes it require CLAUDE.md. Execution against the Shopify admin inherits the logged-in user's permissions, writes live, and has no undo on some operations. Sidekick was built with guardrails by design. The Toolkit wasn't. You bring your own: instructions, a scoped staff account, and a dev store clone for bulk operations. Those are what make the Toolkit safe to run seriously.

Why Opus 4.7 makes this more significant

On Opus 4.6, the eight rules worked most of the time, but not all of the time. A rule like "never modify a product handle" meant I was still spot-checking each operation anyway, because the model would interpret it loosely some percentage of the time. Useful, but not reliable enough to trust at scale without supervision.

Anthropic says Opus 4.7 follows instructions more literally and holds filesystem context better across long sessions. CLAUDE.md is exactly that - a file loaded once and carried through every operation. A 400-product catalog audit is a long-horizon task. 4.6 would start drifting by product 200. I am testing 4.7 this week and will share what breaks.

One honest thing

Token cost doesn't scale the way regular software does. One product in a full catalog operation already uses a significant number of tokens. At 1,000 products, the cost is real. Anyone running this in production needs to think carefully about batching, caching, and which operations justify the spend.
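
A back-of-the-envelope estimate makes the scaling concrete. The sketch below is simple arithmetic. Every number in the usage line is a placeholder, not a measured token count or a real price, so substitute your own figures.

```python
def estimate_cost_usd(
    products: int,
    input_tokens_each: int,
    output_tokens_each: int,
    usd_per_million_input: float,
    usd_per_million_output: float,
) -> float:
    """Rough cost of one pass over the catalog, ignoring caching and retries."""
    input_cost = products * input_tokens_each * usd_per_million_input / 1_000_000
    output_cost = products * output_tokens_each * usd_per_million_output / 1_000_000
    return input_cost + output_cost


# Placeholder figures only: 1,000 products, 4,000 input and 500 output tokens each,
# at hypothetical prices of $10 and $50 per million tokens.
print(estimate_cost_usd(1_000, 4_000, 500, 10.0, 50.0))  # 65.0 with these placeholders
```

The cost is linear in product count, so anything that cuts tokens per product, such as caching the shared instructions or running only the operations that earn their spend, lowers the total proportionally.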

It's still early. Even with the toolkit connected, this isn't at the point where you stop using the Shopify admin and run everything from a chat window. It's genuinely good for specific operations and worth setting up as a layer that handles time-intensive tasks.

Ankit M.
Founder @ Atomz.ai

The CLAUDE.md template

Fork this for your own store. Replace the store-specific sections with yours. The rules in section four are the ones worth keeping regardless of your setup.

template_atomz_claude.md


Running Claude on a live Shopify store