SEO Tools Integration Guide
APIs, data formats, and integration patterns for major SEO tools. Part of the Building Agentic SEO Consultants course. Use this with Module 3: APIs & JSON (calling APIs and combining data), Module 4: Building SEO Agents (tools agents can use), and Module 6: Deployment and Scaling (production API usage and secrets). For prompt patterns when asking an LLM to integrate these tools, see the LLM Prompting Guide.
Google Search Console Integration
Data Export Formats
- CSV Exports: Query, page, country, device data
- API Access: Search Analytics API for real-time data
- Common Columns: query, page, clicks, impressions, ctr, position
Python Integration
# GSC CSV Processing
import pandas as pd

def process_gsc_data(file_path):
    df = pd.read_csv(file_path)
    df['date'] = pd.to_datetime(df['date'])
    df['ctr'] = (df['clicks'] / df['impressions'] * 100).round(2)
    return df
# GSC API Integration (the service-account file path is illustrative)
from google.oauth2 import service_account
from googleapiclient.discovery import build

def get_gsc_data(property_url, start_date, end_date):
    credentials = service_account.Credentials.from_service_account_file(
        'service_account.json',
        scopes=['https://www.googleapis.com/auth/webmasters.readonly'])
    service = build('searchconsole', 'v1', credentials=credentials)
    body = {'startDate': start_date, 'endDate': end_date,
            'dimensions': ['query', 'page']}
    return service.searchanalytics().query(siteUrl=property_url, body=body).execute()
Common GSC Data Issues
- Date format variations
- Missing data in exports
- Device/country filtering
- Large dataset handling
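A loader that addresses the first two issues on this list might look like the sketch below; the column names ('date', 'clicks', 'impressions') are assumptions about a typical export and should be checked against your own files:

```python
import io
import pandas as pd

def load_gsc_csv(source):
    df = pd.read_csv(source)
    df.columns = df.columns.str.strip().str.lower()
    # Parse dates whatever the export format; unparseable values become NaT
    if 'date' in df.columns:
        df['date'] = pd.to_datetime(df['date'], errors='coerce')
    # Drop rows with missing impressions, then recompute CTR from raw counts
    df = df.dropna(subset=['impressions']).copy()
    df['ctr'] = (df['clicks'] / df['impressions'] * 100).round(2)
    return df

# Invented two-row sample: the second row has no impressions value
sample = io.StringIO(
    "Date,Clicks,Impressions\n"
    "2024-01-01,10,100\n"
    "2024-01-02,5,\n"
)
gsc = load_gsc_csv(sample)
```

Recomputing CTR from clicks and impressions, rather than trusting the exported percentage column, also sidesteps locale-dependent number formatting.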
Ahrefs Integration
Data Export Formats
- Keyword Explorer: Keyword difficulty, search volume, CPC
- Site Explorer: Backlink data, referring domains
- Content Explorer: Content performance metrics
Python Processing
# Ahrefs CSV Processing
def process_ahrefs_keywords(file_path):
    df = pd.read_csv(file_path)
    # Map Ahrefs export headers to the standard schema used elsewhere
    df = df.rename(columns={
        'Keyword': 'keyword',
        'Volume': 'search_volume',
        'KD': 'keyword_difficulty'
    })
    return df
Ahrefs API Integration
# Ahrefs API Example
import requests

def get_ahrefs_data(api_key, endpoint, params):
    headers = {'Authorization': f'Bearer {api_key}'}
    response = requests.get(f'https://apiv2.ahrefs.com/{endpoint}',
                            headers=headers, params=params, timeout=30)
    response.raise_for_status()  # Surface HTTP errors instead of parsing an error body
    return response.json()
SEMrush Integration
Data Export Formats
- Keyword Analytics: Search volume, competition, trends
- Domain Analytics: Organic traffic, backlinks, competitors
- Position Tracking: Ranking data, SERP features
Python Processing
# SEMrush CSV Processing
def process_semrush_data(file_path):
    df = pd.read_csv(file_path, encoding='utf-8')
    # Drop rows where the keyword itself is missing
    df = df.dropna(subset=['Keyword'])
    return df
SEMrush API Integration
# SEMrush API Example
def get_semrush_data(api_key, report_type, domain):
    url = 'https://api.semrush.com/'
    params = {
        'key': api_key,
        'type': report_type,
        'domain': domain
    }
    response = requests.get(url, params=params, timeout=30)
    return response.text  # SEMrush analytics reports return delimited text, not JSON
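Because the response body is delimited text, it usually needs parsing before it can join the rest of your data. A minimal sketch, assuming a semicolon-delimited payload in the shape of a typical SEMrush report (the sample values are invented for illustration):

```python
import io
import pandas as pd

# Hypothetical response body in SEMrush's delimited format
raw_response = "Keyword;Search Volume;CPC\nseo tools;5400;4.20\n"

def parse_semrush_response(text):
    # Treat the response string as a file-like object for pandas
    return pd.read_csv(io.StringIO(text), sep=';')

report = parse_semrush_response(raw_response)
```

Check the delimiter in your own account's exports before relying on this; some reports use different separators.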
Other SEO Tools Integration
Screaming Frog
# Process Screaming Frog CSV exports
def process_screaming_frog(file_path):
    df = pd.read_csv(file_path)
    # Keep rows with non-200 responses (redirects, client and server errors)
    issues = df[df['Status Code'] != 200]
    return issues
Majestic
# Majestic API Integration
def get_majestic_data(api_key, domain):
    url = 'https://api.majestic.com/api/json'
    params = {
        'app_api_key': api_key,
        'cmd': 'GetBackLinkData',
        'item': domain
    }
    response = requests.get(url, params=params, timeout=30)
    return response.json()
Data Integration Patterns
1. Multi-Source Data Merging
def merge_seo_data(gsc_data, ahrefs_data, semrush_data):
    # Outer joins keep keywords that appear in only one source; rename
    # columns first so all three frames share a 'keyword' key (GSC exports
    # call it 'query', Ahrefs calls it 'Keyword')
    merged = pd.merge(gsc_data, ahrefs_data, on='keyword', how='outer')
    merged = pd.merge(merged, semrush_data, on='keyword', how='outer')
    return merged
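To see what the outer merge produces, here is the same pattern on tiny hand-made frames (the keywords and metrics are invented purely to illustrate the join behavior):

```python
import pandas as pd

# Minimal stand-ins for the three cleaned exports, already keyed on 'keyword'
gsc = pd.DataFrame({'keyword': ['seo tools', 'link building'],
                    'clicks': [120, 45]})
ahrefs = pd.DataFrame({'keyword': ['seo tools'],
                       'keyword_difficulty': [38]})
semrush = pd.DataFrame({'keyword': ['link building'],
                        'search_volume': [2400]})

# Outer joins: every keyword survives, missing metrics become NaN
merged = pd.merge(gsc, ahrefs, on='keyword', how='outer')
merged = pd.merge(merged, semrush, on='keyword', how='outer')
```

Each source contributes its own columns, and a keyword missing from one tool simply gets NaN there instead of being dropped.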
2. Data Validation
def validate_seo_data(df, required_columns):
    missing_cols = set(required_columns) - set(df.columns)
    if missing_cols:
        raise ValueError(f"Missing columns: {missing_cols}")
    return True
3. Data Normalization
def normalize_seo_data(df):
    # Standardize column names
    df.columns = df.columns.str.lower().str.replace(' ', '_')
    # Convert data types; unparseable values become NaN instead of raising
    df['clicks'] = pd.to_numeric(df['clicks'], errors='coerce')
    return df
API Rate Limiting and Best Practices
Rate Limiting
import time
from functools import wraps

def rate_limit(calls_per_minute):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Naive approach: sleep before every call, including the first
            time.sleep(60 / calls_per_minute)
            return func(*args, **kwargs)
        return wrapper
    return decorator

@rate_limit(60)  # At most 60 calls per minute
def api_call():
    pass
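A quick self-contained check that the decorator actually spaces calls out; the rate is set high here so the demo finishes fast, and the decorator is repeated so the sketch runs on its own:

```python
import time
from functools import wraps

def rate_limit(calls_per_minute):
    """Same decorator as above, repeated so this sketch is self-contained."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            time.sleep(60 / calls_per_minute)
            return func(*args, **kwargs)
        return wrapper
    return decorator

@rate_limit(600)  # 600 calls/minute -> at least 0.1 s between calls
def ping():
    return 'ok'

start = time.monotonic()
results = [ping() for _ in range(3)]
elapsed = time.monotonic() - start  # three calls: at least 0.3 s total
```

For production use, a token-bucket limiter would avoid the needless sleep before the first call.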
Error Handling
def safe_api_call(api_function, max_retries=3):
    for attempt in range(max_retries):
        try:
            return api_function()
        except Exception:
            if attempt == max_retries - 1:
                raise  # Out of retries: re-raise with the original traceback
            time.sleep(2 ** attempt)  # Exponential backoff: 1s, 2s, 4s, ...
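The retry helper can be exercised without a real API by wrapping a function that fails a fixed number of times before succeeding; the failure counter below is contrived purely for illustration:

```python
import time

def safe_api_call(api_function, max_retries=3):
    """Retry with exponential backoff (copy of the helper above)."""
    for attempt in range(max_retries):
        try:
            return api_function()
        except Exception:
            if attempt == max_retries - 1:
                raise
            time.sleep(2 ** attempt)

calls = {'count': 0}

def flaky_api():
    # Simulate two transient failures before a successful response
    calls['count'] += 1
    if calls['count'] < 3:
        raise ConnectionError('transient failure')
    return {'status': 'ok'}

result = safe_api_call(flaky_api)
```

With the default of three retries, the two simulated failures are absorbed and the third attempt's result comes back to the caller.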
Data Storage and Management
Database Integration
import sqlite3

def store_seo_data(data, table_name):
    conn = sqlite3.connect('seo_data.db')
    data.to_sql(table_name, conn, if_exists='replace', index=False)
    conn.close()
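A round trip confirms the data survives storage intact; this sketch uses an in-memory database and invented sample rows so it runs without touching disk:

```python
import sqlite3
import pandas as pd

# ':memory:' keeps the demo self-contained; a real pipeline would use a file path
conn = sqlite3.connect(':memory:')
rankings = pd.DataFrame({'keyword': ['seo tools', 'link building'],
                         'clicks': [120, 45]})
rankings.to_sql('rankings', conn, if_exists='replace', index=False)

# Read back with ordinary SQL, e.g. sorted by traffic
loaded = pd.read_sql('SELECT keyword, clicks FROM rankings ORDER BY clicks DESC', conn)
conn.close()
```

Reading back through SQL also gives you filtering and aggregation for free before the data ever reaches pandas.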
Cloud Storage
import boto3

def upload_to_s3(data, bucket, key):
    # Serialize in memory and upload via boto3; writing to an 's3://' path
    # with to_csv would require the separate s3fs dependency
    s3 = boto3.client('s3')
    body = data.to_csv(index=False).encode('utf-8')
    s3.put_object(Bucket=bucket, Key=key, Body=body)
Automation Workflows
Scheduled Data Collection
import schedule
import time

def collect_seo_data():
    # Collect data from all sources; the argument values come from your
    # own configuration (see the earlier sections for each signature)
    gsc_data = get_gsc_data(property_url, start_date, end_date)
    ahrefs_data = get_ahrefs_data(api_key, endpoint, params)
    # Process and store
    process_and_store(gsc_data, ahrefs_data)

# Schedule daily collection; run_pending() must be called in a loop
schedule.every().day.at("09:00").do(collect_seo_data)

while True:
    schedule.run_pending()
    time.sleep(60)
Real-time Monitoring
def monitor_rankings(keywords):
    # get_current_position and send_alert are placeholders for your own helpers
    for keyword in keywords:
        current_position = get_current_position(keyword)
        if current_position > 10:  # Alert when a keyword falls off page one
            send_alert(f"Ranking drop for {keyword}")
Common Integration Challenges
1. Data Format Variations
- Different CSV formats across tools
- Encoding issues (UTF-8 vs Latin-1)
- Date format inconsistencies
- Column name variations
2. API Limitations
- Rate limiting
- Data freshness
- Historical data access
- Cost considerations
3. Data Quality Issues
- Missing values
- Inconsistent metrics
- Duplicate entries
- Outdated information
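The encoding issue listed above is the one most likely to break a pipeline silently. One defensive pattern is to try a short list of candidate encodings in order; the function name and sample bytes here are illustrative:

```python
import io
import pandas as pd

def read_export(raw_bytes, encodings=('utf-8', 'latin-1')):
    # Try each encoding until one decodes cleanly
    for enc in encodings:
        try:
            return pd.read_csv(io.BytesIO(raw_bytes), encoding=enc)
        except (UnicodeDecodeError, UnicodeError):
            continue
    raise ValueError('No candidate encoding could decode this export')

# Simulate a Latin-1 export (the accented character is invalid as UTF-8 here)
raw = 'Keyword,Volume\ncafé seo,100\n'.encode('latin-1')
df = read_export(raw)
```

Order matters: put the strict encodings first, since latin-1 accepts any byte sequence and would otherwise mask a UTF-8 file as mojibake.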
Best Practices
- Standardize Data Formats: Create common data schemas
- Implement Error Handling: Handle API failures gracefully
- Cache Data: Store data locally to reduce API calls
- Monitor API Usage: Track rate limits and costs
- Validate Data: Check data quality before processing
- Document Integrations: Keep track of API changes
- Test Regularly: Verify integrations work correctly
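The caching advice above can be implemented as a small decorator that stores JSON results on disk with a time-to-live; the cache directory, key scheme, and sample function are illustrative, not a fixed convention:

```python
import json
import time
from functools import wraps
from pathlib import Path

def cached(ttl_seconds=3600, cache_dir='.seo_cache'):
    """Cache JSON-serializable results on disk to cut repeat API calls."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args):
            # Build a filesystem-safe cache key from the function name and args
            safe = '_'.join(str(a).replace(' ', '-') for a in args)
            path = Path(cache_dir) / f"{func.__name__}_{safe}.json"
            if path.exists() and time.time() - path.stat().st_mtime < ttl_seconds:
                return json.loads(path.read_text())
            result = func(*args)
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_text(json.dumps(result))
            return result
        return wrapper
    return decorator

calls = {'n': 0}

@cached(ttl_seconds=3600)
def fetch_volume(keyword):
    calls['n'] += 1  # Counts real "API" hits; cached hits skip this
    return {'keyword': keyword, 'volume': 1000}

first = fetch_volume('seo tools')
second = fetch_volume('seo tools')  # Served from disk, no second hit
```

A sketch like this suits keyword-level lookups where data is stable for hours; shorten the TTL for rank tracking, where freshness matters more than quota.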
Related support guides
- Python Quick Reference Guide – Code patterns for reading, merging, and exporting the data you pull from these tools
- LLM Prompting Guide for SEO – How to prompt for API calls, error handling, and multi-source workflows
This guide supports Module 3: APIs & JSON, Module 4: Building SEO Agents, Module 5: Data Analysis, and Module 6: Deployment and Scaling.