Python Quick Reference Guide
Essential Python commands, libraries, and patterns for SEO automation. Part of the Building Agentic SEO Consultants course. Use this alongside Module 2: Python for SEO Automation, and refer back when working on Module 3: APIs & JSON or Module 5: Data Analysis and Insights.
Key Principles for SEO Python Success
1. LLM-First Approach
- Start with LLM prompts for code generation; refine with conversation
- Be specific about SEO data formats and requirements
- Request comments in all generated code
- Include requirements.txt for Streamlit apps
2. Essential LLM Prompting Tips
- Be specific: “Create a Python script to analyse Google Search Console data” vs “Write some Python code”
- Request comments: Always ask for detailed comments explaining each step
- Include requirements: Specify requirements.txt for any dependencies
- Mention data sources: Specify which SEO tools (Ahrefs, SEMrush, GSC) you’re working with
For more prompting patterns, see the LLM Prompting Guide for SEO.
3. Where to Run Your Python Scripts
- Jupyter Notebooks: Interactive development and data exploration
- Google Colab: Cloud-based Jupyter environment (free)
- Streamlit Cloud: Web app deployment (limited private repos)
- Local Development: VS Code, PyCharm, or Cursor
Essential Libraries for SEO
# Data manipulation
import pandas as pd
import numpy as np
# Data visualisation
import matplotlib.pyplot as plt
import seaborn as sns
# Web scraping and APIs
import requests
import json
# File operations
import csv
import os
Common SEO Data Operations
Reading SEO Data Files
# CSV files (Ahrefs, SEMrush exports)
df = pd.read_csv('keyword_data.csv')
# Excel files
df = pd.read_excel('gsc_data.xlsx', sheet_name='Queries')
# JSON files (API responses)
with open('api_response.json', 'r') as f:
    data = json.load(f)
Data Cleaning
# Remove duplicates
df = df.drop_duplicates()
# Handle missing values
df = df.fillna(0)
# Convert data types
df['clicks'] = pd.to_numeric(df['clicks'], errors='coerce')
Data Analysis
# Group by keyword
keyword_performance = df.groupby('keyword').agg({
    'clicks': 'sum',
    'impressions': 'sum'
}).reset_index()
# Recompute CTR from aggregated totals (a plain mean of per-row CTRs over-weights low-volume rows)
keyword_performance['ctr'] = keyword_performance['clicks'] / keyword_performance['impressions']
# Sort by performance
top_keywords = df.sort_values('clicks', ascending=False).head(20)
Streamlit Quick Start
import streamlit as st
import pandas as pd
st.title("SEO Data Analyser")
uploaded_file = st.file_uploader("Upload your SEO data", type=['csv', 'xlsx'])
if uploaded_file:
    # Excel uploads need read_excel; read_csv would fail on .xlsx files
    if uploaded_file.name.endswith('.xlsx'):
        df = pd.read_excel(uploaded_file)
    else:
        df = pd.read_csv(uploaded_file)
    st.dataframe(df)
    st.line_chart(df['clicks'])
Common Pitfalls to Avoid
- File Paths: Use relative paths, not absolute paths
- Data Types: Always check and convert data types
- Memory Usage: Use chunksize for large files
- Error Handling: Always use try-except blocks
- Dependencies: Always include requirements.txt
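The error-handling and chunking pitfalls above can be sketched together in one helper. This is a minimal sketch, not course code; the function name and the 50,000-row chunk size are illustrative choices:

```python
import pandas as pd

def load_seo_export(path: str, chunksize: int = 50_000) -> pd.DataFrame:
    """Read a large SEO export in chunks, with basic error handling."""
    try:
        # chunksize makes read_csv return an iterator of DataFrames,
        # so the whole file never has to fit in memory at once
        chunks = pd.read_csv(path, chunksize=chunksize)
        df = pd.concat(chunks, ignore_index=True)
    except FileNotFoundError:
        print(f"File not found: {path}")
        return pd.DataFrame()
    except pd.errors.ParserError as e:
        print(f"Could not parse {path}: {e}")
        return pd.DataFrame()
    return df
```

Returning an empty DataFrame on failure is one possible design; raising the exception to the caller is equally valid in a script you run by hand.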
SEO-Specific Tips
- GSC Data: Handle date columns properly
- Ahrefs Data: Check for different column names in exports
- SEMrush Data: Watch for encoding issues
- Large Datasets: Use chunking for files > 100MB
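The GSC date tip and the SEMrush encoding tip above can be sketched as follows. This is a hedged example: the column names, the inline sample data, and the UTF-16 fallback are illustrative assumptions, not guaranteed properties of every export:

```python
import pandas as pd

# GSC exports: parse the date column so sorting and resampling work on real dates
gsc = pd.DataFrame({'date': ['2024-01-01', '2024-01-02'], 'clicks': [12, 7]})
gsc['date'] = pd.to_datetime(gsc['date'])

def read_semrush_csv(path: str) -> pd.DataFrame:
    """Read a SEMrush-style CSV, falling back if the file is not UTF-8."""
    try:
        return pd.read_csv(path, encoding='utf-8')
    except UnicodeDecodeError:
        # UTF-16 is one encoding seen in tool exports; adjust to what your file uses
        return pd.read_csv(path, encoding='utf-16')
```

If neither encoding works, open the file in a text editor or run `chardet` on it to identify the actual encoding before guessing further.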
Related support guides
- LLM Prompting Guide for SEO – Prompt patterns and templates for code generation and data tasks
- SEO Tools Integration Guide – APIs, data formats, and tool-specific integration patterns
This guide supports Module 2: Python for SEO, Module 3: APIs & JSON, and Module 5: Data Analysis.