Module 3: APIs & JSON Fundamentals
In this module, you’ll learn how to connect to SEO tools via APIs, work with JSON data, and build automated workflows that pull data from multiple sources. You’re not becoming an API developer – you’re learning how to get data in and out of the tools you already use, so you can automate the boring bits and focus on the insights.
Learning Objectives
By the end of this module, you will be able to:
- Call SEO tool APIs (Google Search Console, Ahrefs, and similar) using prompts and generated code
- Parse and flatten JSON so you can use it in spreadsheets or Python
- Design multi-step workflows that combine data from more than one source
- Handle common API gotchas: rate limits, errors, and authentication
Prerequisites
- Completion of Module 2: Python for SEO Automation
- Basic familiarity with Python (or with getting an LLM to write it for you)
- Understanding of what your SEO tools export (GSC, Ahrefs, etc.)
Why it matters
APIs are how tools talk to each other. When you click “Export” in Google Search Console, you’re getting a file. When a script calls the GSC API, it gets the same kind of data without you clicking – on a schedule, for many sites, or filtered in ways the UI doesn’t offer.
That means you can combine GSC data with keyword data from another tool, or refresh a report every Monday without opening five different dashboards. You don’t need to build the integration from scratch; you need to know what to ask for (and what can go wrong) so you can prompt effectively and sanity-check the results.
With this understanding, you can:
- Specify integration requirements when you need data from multiple SEO tools in one place
- Troubleshoot “why did my export break?” when an API changes or rate limits bite
- Design repeatable workflows that run on a schedule or trigger from a single action
- Evaluate vendor claims about “API access” and know what questions to ask
Example in action
Instead of downloading the same GSC report every week, manually merging it with keyword data from another tool, and then updating a spreadsheet, you’ll use an LLM to generate a script that calls both APIs (or one API and a CSV), combines the data, and writes one clean file or dashboard.
You don’t need to memorise API documentation. You need to know that APIs exist, that they usually return JSON, and that you can ask an LLM to write the code – as long as you’re clear about which tool, which endpoints or exports, and what you want the output to look like.
Common mistakes
- Putting API keys in code or in the repo – use environment variables or a config file that’s not committed. Never paste keys into prompts you might share.
- Ignoring rate limits – many APIs allow only so many requests per minute. If you don’t ask for retries and backoff, your script will fail when you scale up.
- Assuming the API response never changes – vendors add fields, rename things, or deprecate endpoints. Ask for error handling and clear logging so you can see when something breaks.
- Asking for “call the API” without saying which one – be specific: “Google Search Console Search Analytics API” or “Ahrefs API v3 keywords endpoint” so the generated code matches what you actually use.
- Skipping the small test – run with one site, one date range, or one keyword list first. Then scale up once you’re happy with the output.
APIs and JSON in plain English
What’s an API? Think of it as a menu and a waiter. You send a request (“I want last month’s query data for this site”) in a format the tool understands, and the server sends back the data – usually in JSON – instead of you clicking through the UI.
What’s JSON? It’s a standard way to structure data: key–value pairs and lists. It looks like nested brackets and commas. The good news: you don’t have to parse it by hand. You can ask an LLM to generate Python that reads the JSON, flattens the bits you care about, and outputs CSV or a table. Your job is to say which tool, which report, and which columns or metrics you need.
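A minimal sketch of what that looks like in Python. The payload below is made up for illustration (no real tool returns exactly this), but the shape – key–value pairs and lists, nested in brackets – is what you'll see:

```python
import json

# A made-up payload in the shape many SEO APIs use: key-value pairs and lists.
raw = '''
{
  "rows": [
    {"keys": ["seo tools", "2024-01-01"], "clicks": 120, "impressions": 3400},
    {"keys": ["json tutorial", "2024-01-01"], "clicks": 45, "impressions": 900}
  ]
}
'''

data = json.loads(raw)          # JSON text -> Python dicts and lists
for row in data["rows"]:
    query, day = row["keys"]    # unpack the dimension values
    print(day, query, row["clicks"], row["impressions"])
```

That loop is the "flattening" an LLM will write for you: nested structure in, one row per item out.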
Why does this matter for SEO? Most serious SEO tools offer an API. That doesn’t mean you have to use it today – but when you need to combine data, refresh reports automatically, or build a custom view, the API is how you do it. Knowing the concepts (and the pitfalls) lets you prompt for the right thing and spot when the result is wrong.
Prompting for API and workflow tasks
Be specific about the tool and the data
❌ BAD: "Write code to get keyword data from an API"
✅ GOOD: "Write Python code that calls the Google Search Console Search Analytics API for property [PROPERTY_ID], requests query data for the last 28 days with dimensions query and date, and exports the result to a CSV with columns date, query, clicks, impressions, ctr, position. Use a service account JSON key from the path in environment variable GSC_CREDENTIALS_PATH."
If you’re combining sources, say so:
✅ GOOD: "Write Python code that (1) reads the latest GSC export CSV from folder ./exports, (2) calls the Ahrefs API endpoint for keywords (using API key from env AHREFS_KEY), (3) left-joins the two on the keyword/query field, and (4) writes the result to combined_report.csv with columns query, gsc_clicks, gsc_impressions, ahrefs_volume, ahrefs_kd."
Ask for error handling and rate limits
❌ BAD: "Call the API and save the response"
✅ GOOD: "Call the API with a 1-second delay between requests to respect rate limits. If the response status is 429 (rate limit), wait 60 seconds and retry once. If the response is not 200, log the status code and response body and exit. Save the JSON to a file with a timestamp in the filename."
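A sketch of the retry logic that prompt should produce. The `fetch` callable here is a stand-in for whatever HTTP client the generated code uses (it just returns a status code and body), so the logic can be checked without a network:

```python
import time

def call_with_retry(fetch, sleep=time.sleep):
    """Call `fetch()` once, retrying a single time after a 60s wait on HTTP 429.

    `fetch` is any function returning (status_code, body) -- a stand-in for a
    real HTTP request so the retry behaviour is easy to test.
    """
    status, body = fetch()
    if status == 429:            # rate limited: wait and retry once
        sleep(60)
        status, body = fetch()
    if status != 200:            # anything else non-OK: surface it loudly
        raise RuntimeError(f"API call failed: {status} {body}")
    return body
```

Between successive calls you would also `sleep(1)` to respect the per-minute limit, and write the returned JSON to a timestamped file, exactly as the prompt specifies.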
Specify how you want JSON turned into tables
❌ BAD: "Parse the JSON and analyse it"
✅ GOOD: "Parse the JSON response. The data I need is in response['rows']. Each item has 'keys' (dimension values) and 'metrics'. Flatten this into a table with one row per item, columns: query, date, clicks, impressions, ctr, position. Export to CSV."
Many APIs return nested structures. The more precisely you describe the path to the numbers you care about, the better the generated code will match your expectations.
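Here's what that flattening looks like for a GSC-like response (the structure below is assumed for illustration; paste a real sample response into your prompt so the generated code matches exactly):

```python
import csv, io

# A GSC-like response (shape assumed for illustration): each row carries its
# dimension values under "keys" and its metrics alongside them.
response = {
    "rows": [
        {"keys": ["seo audit", "2024-01-01"], "clicks": 30, "impressions": 800,
         "ctr": 0.0375, "position": 4.2},
        {"keys": ["api basics", "2024-01-02"], "clicks": 12, "impressions": 500,
         "ctr": 0.024, "position": 7.9},
    ]
}

out = io.StringIO()              # in-memory file; swap for open("report.csv", "w")
writer = csv.writer(out)
writer.writerow(["date", "query", "clicks", "impressions", "ctr", "position"])
for row in response["rows"]:
    query, day = row["keys"]     # dimensions come back in the order you requested
    writer.writerow([day, query, row["clicks"], row["impressions"],
                     row["ctr"], row["position"]])

print(out.getvalue())
```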
Try it yourself
These exercises give you a concrete place to start. Use the starter prompts as a first draft; refine them with your actual property IDs, date ranges, and column names.
Exercise 1: Single API to CSV
Task: Pull data from one SEO tool API (e.g. GSC Search Analytics or another you have access to) and export the result to a CSV you can open in Excel or use in Python.
LLM Prompt Type Needed: API call and JSON-to-CSV prompt
A starter example:
"Write Python code to call the Google Search Console Search Analytics API for a single property. Request the last 7 days of data with dimensions query and date, metrics clicks and impressions. Read the service account credentials from the path in environment variable GSC_CREDENTIALS_PATH. Parse the JSON response, flatten the rows into a table with columns date, query, clicks, impressions, and save to gsc_export_YYYYMMDD.csv. Include error handling for missing env var, invalid credentials, and non-200 responses. Add comments explaining each step."
Common pitfalls to watch out for:
- Not specifying where credentials come from (env vars, not hardcoded)
- Forgetting that GSC uses a service account and domain verification – the LLM can’t fix your Google Cloud setup for you
- Assuming the response structure; if you have a sample response, paste it into the prompt so the parsing matches
- Not asking for a sensible filename (e.g. with date) so you don’t overwrite the same file every run
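Two of those pitfalls – credentials and filenames – take only a few lines to get right. A sketch (the env variable name and prefix match the starter prompt; everything else is an assumption):

```python
import os
from datetime import date

def credentials_path():
    """Read the service-account key path from the environment -- never hardcode it."""
    path = os.environ.get("GSC_CREDENTIALS_PATH")
    if not path:
        raise RuntimeError("Set GSC_CREDENTIALS_PATH to your service account JSON key path")
    return path

def export_filename(prefix="gsc_export"):
    """Dated filename (e.g. gsc_export_20240115.csv) so runs never overwrite each other."""
    return f"{prefix}_{date.today():%Y%m%d}.csv"
```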
Exercise 2: Two data sources into one report
Task: Combine data from two sources (e.g. GSC export CSV + one API, or two CSVs) into a single table or report.
LLM Prompt Type Needed: Multi-source workflow and join prompt
A starter example:
"Write Python code that (1) reads gsc_queries.csv with columns query, clicks, impressions, (2) reads keyword_data.csv with columns keyword, volume, kd (or calls [NAMED_API] for that data), (3) joins them on query/keyword (lowercase, trimmed), (4) outputs combined.csv with columns query, clicks, impressions, volume, kd. Handle missing matches with empty or 0. Include brief comments for each step."
Common pitfalls to watch out for:
- Column names differing between sources (query vs keyword, etc.) – specify the join key clearly
- Case or whitespace differences – ask for normalisation (e.g. lowercase, strip) before joining
- Not saying what to do with missing data (leave blank, 0, or drop the row)
- Large files: if either source is big, mention “use pandas and avoid loading the whole file into memory if possible” or specify chunking
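The join and normalisation pitfalls above look like this in plain Python (column names match the exercise and are assumptions, not any tool's real export format):

```python
def normalise(term):
    """Lowercase and trim so 'SEO Tools ' and 'seo tools' join as one key."""
    return term.strip().lower()

def left_join(gsc_rows, kw_rows):
    """Left-join keyword data onto GSC rows; missing matches get 0s.

    Both inputs are lists of dicts, e.g. straight from csv.DictReader.
    """
    kw_by_key = {normalise(r["keyword"]): r for r in kw_rows}
    combined = []
    for row in gsc_rows:
        match = kw_by_key.get(normalise(row["query"]), {})
        combined.append({
            "query": row["query"],
            "clicks": row["clicks"],
            "impressions": row["impressions"],
            "volume": match.get("volume", 0),   # 0 when no match, per the prompt
            "kd": match.get("kd", 0),
        })
    return combined
```

At spreadsheet scale plain dicts are fine; for large files, the same operation is a pandas `merge(..., how="left")` after the same normalisation of the join columns.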
Exercise 3: Add basic error handling and retries
Task: Take an existing API script (or one generated in Exercise 1) and add rate-limit handling and retries so it doesn’t fail on the first 429 or timeout.
LLM Prompt Type Needed: Error handling and retry logic prompt
A starter example:
"Update this script to: (1) sleep 1 second between API requests, (2) on HTTP 429, wait 60 seconds and retry that request once, (3) on timeout or connection error, retry up to 3 times with 5-second gaps, (4) log each retry and any final failure. Keep the rest of the logic the same."
Common pitfalls to watch out for:
- Retrying forever – cap the number of retries
- Not logging – you need to see why a run failed (rate limit vs auth vs bad request)
- Different APIs have different rate limits – check the vendor docs and mention the limit in the prompt if you know it
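The capped-retries and logging points above look like this in practice. This is a sketch with a stand-in `fetch` callable, not tied to any particular HTTP library; a real `fetch` would raise on timeouts and connection errors:

```python
import logging
import time

log = logging.getLogger("api_retry")

def fetch_with_retries(fetch, max_attempts=3, gap=5, sleep=time.sleep):
    """Retry a flaky call up to `max_attempts` times with a fixed gap between tries.

    Each retry is logged so a failed run tells you *why* it failed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except (TimeoutError, ConnectionError) as exc:
            log.warning("attempt %d/%d failed: %s", attempt, max_attempts, exc)
            if attempt == max_attempts:   # cap reached: give up loudly, never loop forever
                raise
            sleep(gap)
```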
Key topics covered (reference)
- APIs and JSON: What they are, why they matter for SEO, and how to describe what you need when prompting
- Multi-step workflows: Combining CSV and API data, joining tables, and exporting one clean result
- Robustness: Rate limits, retries, error handling, and keeping credentials out of code
- Production-style habits: Logging, filenames with dates, and testing with small data first
For deeper detail on specific SEO tools and their APIs, see the SEO Tools Integration Guide. For prompt patterns, see the LLM Prompting Guide.
Resources
- SEO Tools Integration Guide – Integrating with major SEO tools and APIs
- LLM Prompting Guide – Prompting techniques for workflows and data tasks
- Python Quick Reference Guide – Essential Python patterns for automation
Next: Module 4: Building SEO Agents | Supporting materials
Ready to connect your SEO tools and automate the data pull? This module gives you the concepts and prompts you need to get from “export by hand” to “script does it for me”.