Python Quick Reference Guide
Essential Python commands, libraries, and patterns for SEO automation. Part of the Building Agentic SEO Consultants course. Use this alongside Module 2: Python for SEO Automation, and refer back when working on Module 3: APIs & JSON or Module 5: Data Analysis and Insights.

Key Principles for SEO Python Success

1. LLM-First Approach

  • Start with LLM prompts for code generation; refine with conversation
  • Be specific about SEO data formats and requirements
  • Request comments in all generated code
  • Include requirements.txt for Streamlit apps

2. Essential LLM Prompting Tips

  • Be specific: “Create a Python script to analyse Google Search Console data” vs “Write some Python code”
  • Request comments: Always ask for detailed comments explaining each step
  • Include requirements: Specify requirements.txt for any dependencies
  • Mention data sources: Specify which SEO tools (Ahrefs, SEMrush, GSC) you’re working with

For more prompting patterns, see the LLM Prompting Guide for SEO.

3. Where to Run Your Python Scripts

  • Jupyter Notebooks: Interactive development and data exploration
  • Google Colab: Cloud-based Jupyter environment (free)
  • Streamlit Cloud: Web app deployment (limited private repos)
  • Local Development: VS Code, PyCharm, or Cursor

Essential Libraries for SEO

# Data manipulation
import pandas as pd
import numpy as np

# Data visualisation
import matplotlib.pyplot as plt
import seaborn as sns

# Web scraping and APIs
import requests
import json

# File operations
import csv
import os

Common SEO Data Operations

Reading SEO Data Files

# CSV files (Ahrefs, SEMrush exports)
df = pd.read_csv('keyword_data.csv')

# Excel files
df = pd.read_excel('gsc_data.xlsx', sheet_name='Queries')

# JSON files (API responses)
with open('api_response.json', 'r') as f:
    data = json.load(f)
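
The three read patterns above can be wrapped in one small helper that dispatches on file extension. This is a sketch, not part of the course code: the `load_seo_data` name is hypothetical, and it assumes the extension reliably indicates the format.

```python
import json
import pandas as pd

def load_seo_data(path):
    # Hypothetical helper: pick the reader based on the file extension
    if path.endswith('.csv'):
        return pd.read_csv(path)       # Ahrefs / SEMrush exports
    if path.endswith('.xlsx'):
        return pd.read_excel(path)     # GSC or other Excel exports
    with open(path, 'r') as f:
        return json.load(f)            # API responses saved as JSON
```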

Data Cleaning

# Remove duplicates
df = df.drop_duplicates()

# Handle missing values (filling with 0 suits count columns like clicks;
# for text columns, fillna('') or dropna() may be more appropriate)
df = df.fillna(0)

# Convert data types
df['clicks'] = pd.to_numeric(df['clicks'], errors='coerce')
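
A common cleaning step with SEO exports is CTR stored as a percentage string (e.g. "1.2%"), which `to_numeric` alone can't parse. A minimal sketch, assuming a `ctr` column in that string format:

```python
import pandas as pd

# Sample data mimicking a CTR column from a tool export (assumed format)
df = pd.DataFrame({'ctr': ['1.2%', '0.8%', None]})

# Strip the '%' sign, convert to numeric, and scale to a 0-1 fraction;
# unparseable or missing values become NaN rather than raising an error
df['ctr'] = pd.to_numeric(df['ctr'].str.rstrip('%'), errors='coerce') / 100
```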

Data Analysis

# Group by keyword
keyword_performance = df.groupby('keyword').agg({
    'clicks': 'sum',
    'impressions': 'sum',
    'ctr': 'mean'
}).reset_index()

# Sort by performance
top_keywords = df.sort_values('clicks', ascending=False).head(20)
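
One caveat with the aggregation above: taking the `mean` of a per-row CTR column weights every row equally, regardless of impressions. Recomputing CTR from the summed clicks and impressions gives the true aggregate figure. A sketch with made-up numbers:

```python
import pandas as pd

# Toy data: two rows for keyword 'a', one for 'b' (illustrative only)
df = pd.DataFrame({
    'keyword': ['a', 'a', 'b'],
    'clicks': [10, 20, 5],
    'impressions': [100, 100, 50],
})

# Sum the raw counts first, then derive CTR from the totals
agg = df.groupby('keyword').agg({
    'clicks': 'sum',
    'impressions': 'sum',
}).reset_index()
agg['ctr'] = agg['clicks'] / agg['impressions']
```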

Streamlit Quick Start

import streamlit as st
import pandas as pd

st.title("SEO Data Analyser")
uploaded_file = st.file_uploader("Upload your SEO data", type=['csv', 'xlsx'])

if uploaded_file:
    # Read CSV or Excel depending on the uploaded file's extension
    if uploaded_file.name.endswith('.csv'):
        df = pd.read_csv(uploaded_file)
    else:
        df = pd.read_excel(uploaded_file)
    st.dataframe(df)
    st.line_chart(df['clicks'])

Common Pitfalls to Avoid

  1. File Paths: Use relative paths, not absolute paths
  2. Data Types: Always check and convert data types
  3. Memory Usage: Use chunksize for large files
  4. Error Handling: Always use try-except blocks
  5. Dependencies: Always include requirements.txt
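
Pitfalls 3 and 4 can be combined in one pattern: read a large export in chunks and wrap the I/O in try-except. A minimal sketch; the function name and `clicks` column are assumptions for illustration:

```python
import pandas as pd

def sum_clicks_in_chunks(path, chunksize=50_000):
    # Process a large CSV export without loading it all into memory
    total = 0
    try:
        for chunk in pd.read_csv(path, chunksize=chunksize):
            # Coerce bad values to NaN, then treat them as 0
            total += pd.to_numeric(chunk['clicks'], errors='coerce').fillna(0).sum()
    except FileNotFoundError:
        print(f"File not found: {path}")
        return None
    return total
```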

SEO-Specific Tips

  • GSC Data: Handle date columns properly
  • Ahrefs Data: Check for different column names in exports
  • SEMrush Data: Watch for encoding issues
  • Large Datasets: Use chunking for files > 100MB
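
For the GSC date tip above: pandas reads date columns as plain strings by default, so convert them with pd.to_datetime before any time-based analysis. A sketch with sample data (column names assumed to match a GSC export):

```python
import pandas as pd

# Sample rows mimicking a GSC export with a string date column
df = pd.DataFrame({
    'date': ['2024-01-01', '2024-01-02'],
    'clicks': [3, 7],
})

# Convert strings to proper datetime values
df['date'] = pd.to_datetime(df['date'])

# With a datetime index, time-based operations like weekly totals work
weekly = df.set_index('date')['clicks'].resample('W').sum()
```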

This guide supports Module 2: Python for SEO, Module 3: APIs & JSON, and Module 5: Data Analysis.
