Module 5: Data Analysis and Insights

You’re pulling data from APIs and CSVs and combining it. This module is about what to do next: ask better questions of that data, get clear tables and charts, and turn the results into decisions. You’re not training to be a statistician – you’re learning to specify the right analysis and interpret what comes back so you can spot opportunities and explain them to others.

Learning Objectives

By the end of this module, you will be able to:

  • Define clear analysis questions (segments, comparisons, trends) before prompting
  • Ask for tables and visualisations that answer a specific SEO question
  • Prompt for common patterns: groupings, rankings, time comparisons, simple correlations
  • Avoid the main pitfalls: wrong metrics, over-interpretation, and charts that look good but say nothing

Prerequisites

  • Completion of Module 4: Building SEO Agents (or at least Modules 2 and 3)
  • Data you can analyse: e.g. GSC export, keyword list, crawl data, or combined dataset from earlier modules
  • Familiarity with prompting for Python and CSV/API output (Modules 2 and 3)

Why it matters

Data on its own doesn’t change anything. Insights do. The gap is often the question you ask. “Show me my top keywords” is fine; “Show me keywords where impressions grew but clicks didn’t, and group them by intent so I can prioritise content” is an insight that leads to action. Your job is to know what kind of question you need (comparison, trend, segment, anomaly) and to prompt so the output – table or chart – actually answers it.

That doesn’t require a PhD. It requires clarity about the decision you’re trying to make and the level of rigour you need. Sometimes a simple pivot and a bar chart are enough. Sometimes you want a clear “before vs after” or “this segment vs that segment.” Getting that right is a skill you can practise with LLMs and generated code.

With this understanding, you can:

  • Frame analysis requests so the output is actionable, not just “some stats”
  • Ask for the right visualisation – when a table is better than a chart, and when a time series or comparison chart helps
  • Sanity-check results – spot when the metric or segment doesn’t match the question
  • Present findings – clear tables and one-sentence takeaways that stakeholders can use

Example in action

Instead of opening a GSC export and staring at rows, you define the question: “Which queries gained impressions in the last 3 months but didn’t gain clicks, and what’s their average position?” You prompt for a script that filters, groups, and outputs a table plus a simple chart. You get a CSV and an image you can drop into a report, with a one-line takeaway you wrote. The analysis is repeatable next month.
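The kind of script you would prompt for might look like the sketch below. This is a minimal illustration with synthetic data, assuming pandas is available; the column names (query, period, clicks, impressions, position) are placeholders for whatever your actual export uses.

```python
import pandas as pd

# Synthetic stand-in for a GSC export covering two periods (column names are assumptions).
df = pd.DataFrame({
    "query": ["seo tools", "seo tools", "link audit", "link audit"],
    "period": ["prev", "curr", "prev", "curr"],
    "clicks": [120, 118, 40, 44],
    "impressions": [3000, 5200, 900, 950],
    "position": [8.2, 7.9, 12.1, 11.8],
})

# Pivot so each query has prev/curr values side by side, then flatten column names.
wide = df.pivot(index="query", columns="period",
                values=["clicks", "impressions", "position"])
wide.columns = [f"{metric}_{period}" for metric, period in wide.columns]

# The question: impressions grew, but clicks did not.
grew = wide["impressions_curr"] > wide["impressions_prev"]
flat = wide["clicks_curr"] <= wide["clicks_prev"]
result = wide[grew & flat].copy()
result["avg_position"] = (result["position_prev"] + result["position_curr"]) / 2
print(result[["impressions_prev", "impressions_curr", "avg_position"]])
```

Because the filter logic is explicit, you can spot-check it against a few rows of the raw export before trusting the output.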

You didn’t hand-code the stats. You specified the question, the segments, and the output format – and you verified the result against what you expect.

Common mistakes

  • Analysing without a question – “Do some analysis on this file” leads to generic summaries. Start with “I need to decide X” or “I need to show Y” and work backwards to the table or chart.
  • Picking the wrong metric – Impressions vs clicks vs CTR vs position each tell a different story. Be explicit: “rank by clicks” or “filter where position improved and impressions > 1000.”
  • Asking for too much in one go – One analysis per prompt is easier to check and iterate. You can chain later.
  • Skipping the sense-check – If the numbers look wrong (e.g. totals don’t match the raw file), say so in a follow-up prompt. Don’t present results you haven’t verified.
  • Charts that obscure the point – Fancy visuals can hide a weak question. Specify the message: “Bar chart: top 10 pages by clicks, descending” so the chart supports the insight.
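The sense-check in particular is cheap to automate. A minimal sketch, assuming pandas and made-up column names: ask the generated script to assert that the summary's totals match the raw file, so a silent filtering bug fails loudly instead of reaching your report.

```python
import pandas as pd

# Hypothetical raw export and a derived summary (data and names are illustrative).
raw = pd.DataFrame({"page": ["/a", "/b", "/c"], "clicks": [100, 50, 25]})
summary = raw.groupby("page", as_index=False)["clicks"].sum()

# Sense-check: aggregated clicks must equal the raw total.
assert summary["clicks"].sum() == raw["clicks"].sum(), "totals diverged from raw file"
print("sense-check passed:", raw["clicks"].sum(), "clicks in both")
```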

From data to decisions in practice

Good analysis starts with the decision. What do you need to decide or show? “Which pages to optimise” → you need a way to rank or segment them. “Whether the last change worked” → you need before/after or a trend. “Where we stand vs competitors” → you need a clear comparison and a small set of metrics.

Tables vs charts. Tables are better when someone needs the numbers (e.g. exact values, many categories). Charts are better when the story is about shape or order (trends over time, top N, comparison between groups). Ask for both if it helps: “Output a CSV with the full table and a bar chart of the top 10.”

Segments and filters. Most SEO questions are “for this subset.” Specify the subset: by device, by country, by query intent, by URL pattern. Prompt for the filter and the grouping so the output matches your question.
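In generated code, "for this subset" usually translates to a filter followed by a groupby. A minimal sketch with invented data, assuming pandas; the device values and column names are placeholders:

```python
import pandas as pd

gsc = pd.DataFrame({
    "query": ["buy shoes", "shoe care", "buy boots", "boot sizes"],
    "device": ["mobile", "desktop", "mobile", "mobile"],
    "clicks": [80, 20, 60, 15],
    "impressions": [2000, 900, 1500, 700],
})

# Subset first (mobile only), then group, so the output matches the question asked.
mobile = gsc[gsc["device"] == "mobile"]
by_query = mobile.groupby("query")[["clicks", "impressions"]].sum()
print(by_query.sort_values("clicks", ascending=False))
```

Filtering before grouping is the order that matters: grouping first and filtering afterwards can quietly answer a different question.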

Prompting for analysis and visualisation

Specify the question and the output

❌ BAD: "Analyse this keyword data and make some charts"
✅ GOOD: "Using keywords.csv (columns: keyword, volume, kd, cpc, intent), produce (1) a table of the top 20 keywords by volume with all columns, (2) a bar chart of top 10 by volume with keyword on x-axis and volume on y-axis, labelled. Save table as top_keywords.csv and chart as top_keywords.png. What question does this answer? One sentence."
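A script generated from a prompt like the good one above might resemble this sketch. The data here is synthetic, and pandas and matplotlib are assumed; the filenames and column names come from the prompt, not from any real export.

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # render to a file; no display needed
import matplotlib.pyplot as plt

# Synthetic stand-in for keywords.csv (columns taken from the prompt).
kw = pd.DataFrame({
    "keyword": [f"keyword {i}" for i in range(1, 26)],
    "volume": [5000 - i * 150 for i in range(25)],
    "kd": [30 + i for i in range(25)],
    "cpc": [1.2] * 25,
    "intent": ["informational"] * 25,
})

# (1) Table: top 20 by volume, all columns, saved as CSV.
top20 = kw.nlargest(20, "volume")
top20.to_csv("top_keywords.csv", index=False)

# (2) Chart: top 10 by volume, labelled, saved as PNG.
top10 = kw.nlargest(10, "volume")
fig, ax = plt.subplots()
ax.bar(top10["keyword"], top10["volume"])
ax.set_title("Top 10 keywords by volume")
ax.set_xlabel("keyword")
ax.set_ylabel("monthly volume")
plt.xticks(rotation=45, ha="right")
fig.tight_layout()
fig.savefig("top_keywords.png")
```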

Ask for a takeaway or summary

✅ GOOD: "After producing the table and chart, add a one-paragraph 'Summary' that states the main finding and one recommended next step. Do not invent data – only use the numbers in the output."

That keeps the focus on insight, not just output.

Define segments when comparing

❌ BAD: "Compare performance"
✅ GOOD: "Split the GSC data by device (mobile vs desktop). For each group, show total clicks, impressions, average CTR, average position. Output a small comparison table and one sentence on which segment is underperforming and why that might be."
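The comparison table from that prompt boils down to a groupby with named aggregations. A minimal sketch with invented numbers, assuming pandas:

```python
import pandas as pd

gsc = pd.DataFrame({
    "device": ["mobile", "mobile", "desktop", "desktop"],
    "clicks": [30, 50, 40, 60],
    "impressions": [2000, 2500, 1000, 1200],
    "position": [9.0, 8.0, 5.0, 6.0],
})
gsc["ctr"] = gsc["clicks"] / gsc["impressions"]

comparison = gsc.groupby("device").agg(
    total_clicks=("clicks", "sum"),
    total_impressions=("impressions", "sum"),
    avg_ctr=("ctr", "mean"),
    avg_position=("position", "mean"),
)
print(comparison)
```

One metric subtlety worth specifying in your prompt: "average CTR" here is a mean of row-level CTRs, while total clicks divided by total impressions is often the more honest figure. The two can disagree, so say which one you want.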

Try it yourself

Use your own data if you have it; otherwise use sample exports from GSC, a keyword tool, or a crawl. The goal is to practise asking a clear question and getting a table or chart that answers it.

Exercise 1: Segment and rank

Task: Take a single dataset (e.g. GSC queries or keyword export) and prompt for a segmented view: e.g. top 10 by clicks, top 10 by impressions, and a third segment of your choice (e.g. high impressions, low CTR). Output a table and one simple chart.

LLM Prompt Type Needed: Segmentation and ranking with visualisation prompt

A starter example:

"Using gsc_queries.csv (columns: query, clicks, impressions, ctr, position), create: (1) a table with three sections – top 10 by clicks, top 10 by impressions, and rows where impressions > 1000 and ctr < 0.02. (2) One bar chart: top 10 queries by clicks. Save table as segmented_queries.csv and chart as top_clicks.png. Add a one-sentence summary of what the low-CTR high-impression segment suggests."
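The three segments in that starter prompt map to two rankings and one filter. A minimal sketch with synthetic data, assuming pandas (on real data the rankings would be top 10, not top 3):

```python
import pandas as pd

q = pd.DataFrame({
    "query": [f"q{i}" for i in range(1, 6)],
    "clicks": [100, 5, 60, 2, 40],
    "impressions": [1500, 4000, 800, 2500, 900],
})
q["ctr"] = q["clicks"] / q["impressions"]

top_clicks = q.nlargest(3, "clicks")                         # segment 1: ranking
top_impressions = q.nlargest(3, "impressions")               # segment 2: ranking
low_ctr = q[(q["impressions"] > 1000) & (q["ctr"] < 0.02)]   # segment 3: filter
print(low_ctr)
```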

Common pitfalls to watch out for:

  • Column names that don’t match your file – list them explicitly or the script will fail or guess
  • Too many segments in one prompt – start with two or three so you can check the logic
  • Not asking for a summary – the summary forces you to interpret and spot errors

Exercise 2: Before vs after (or period comparison)

Task: You have (or can simulate) data for two time periods. Ask for a comparison: same metric, same grouping, two periods. Output a table and optionally a chart. Include a one-line takeaway.

LLM Prompt Type Needed: Time or period comparison prompt

A starter example:

"I have two files: gsc_jan.csv and gsc_feb.csv, both with columns query, clicks, impressions. For each query that appears in both files, compute change in clicks and change in impressions. Output a table: query, clicks_jan, clicks_feb, change_clicks, impressions_jan, impressions_feb, change_impressions. Sort by absolute change in clicks descending. Top 20 rows. Add one sentence: what does this comparison show?"
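The core of that comparison is a merge on query followed by two difference columns. A minimal sketch with invented two-period data, assuming pandas; an inner merge implements the "appears in both files" rule from the prompt:

```python
import pandas as pd

jan = pd.DataFrame({"query": ["a", "b", "c"], "clicks": [10, 50, 5],
                    "impressions": [100, 600, 80]})
feb = pd.DataFrame({"query": ["a", "b", "d"], "clicks": [30, 45, 7],
                    "impressions": [300, 500, 90]})

# Inner merge keeps only queries present in both periods; "c" and "d" drop out.
both = jan.merge(feb, on="query", suffixes=("_jan", "_feb"))
both["change_clicks"] = both["clicks_feb"] - both["clicks_jan"]
both["change_impressions"] = both["impressions_feb"] - both["impressions_jan"]

# Sort by absolute change in clicks, largest movers first.
both = both.sort_values("change_clicks", key=abs, ascending=False)
print(both)
```

If you instead want "new" and "gone" queries shown, that becomes an outer merge, which is exactly the kind of decision the pitfalls list says to state in the prompt.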

Common pitfalls to watch out for:

  • Different column names or date formats between files – specify them
  • Queries that appear in only one period – decide whether to drop them or show as “new” / “gone” and say so in the prompt
  • Presenting without checking – spot-check a few rows against the raw files

Exercise 3: One chart, one message

Task: Choose one clear message (e.g. “Top 10 pages by organic clicks” or “CTR by device”). Prompt for a single chart that conveys that message, with a clear title and labels. No extra charts.

LLM Prompt Type Needed: Single-message visualisation prompt

A starter example:

"Using pages.csv with columns url, clicks, impressions, ctr, position, create one horizontal bar chart: x-axis = clicks, y-axis = page URL (top 10 only). Title: 'Top 10 pages by organic clicks'. Save as top_pages.png. No other charts or tables."
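The single-chart script that prompt describes might look like this sketch, with synthetic data and pandas plus matplotlib assumed. Sorting ascending before plotting is a barh quirk: it puts the largest bar at the top of the chart.

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # render to a file; no display needed
import matplotlib.pyplot as plt

# Synthetic stand-in for pages.csv (columns from the prompt).
pages = pd.DataFrame({
    "url": [f"/page-{i}" for i in range(1, 13)],
    "clicks": [300 - i * 20 for i in range(12)],
})

# Top 10 only; ascending sort so barh places the largest value at the top.
top10 = pages.nlargest(10, "clicks").sort_values("clicks")

fig, ax = plt.subplots()
ax.barh(top10["url"], top10["clicks"])
ax.set_title("Top 10 pages by organic clicks")
ax.set_xlabel("clicks")
fig.tight_layout()
fig.savefig("top_pages.png")
```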

Common pitfalls to watch out for:

  • Too many series or categories – one message per chart keeps it readable
  • Missing title or axis labels – ask for them explicitly
  • Default colours or fonts that make the chart hard to read – you can ask for “high-contrast labels” or “font size at least 12” if needed

Key topics covered (reference)

  • Question first: Define the decision or message, then the table or chart.
  • Segments and filters: Specify subsets (device, intent, URL pattern) so the analysis answers your question.
  • Tables vs charts: Tables for numbers and detail; charts for shape, order, and comparison.
  • Sense-check and summary: Verify numbers; ask for a one-sentence or one-paragraph takeaway.

For more on statistical rigour (e.g. significance, cohorts), see the Python Quick Reference Guide and data analysis libraries (pandas, seaborn).

Resources

Next: Module 6: Deployment and Scaling

