SEO Complete Course
Why SEO Matters and Who This Is For
Search Engine Optimisation (SEO) is the practice of improving a website so that it ranks higher in organic (non-paid) search results. When someone types a query into Google, Bing, or any other search engine, the results they see are determined by complex algorithms. SEO is the discipline of understanding those algorithms well enough to ensure your content appears prominently for the right queries.
Who this course is for:
- Website owners who want more organic traffic without paying for ads
- Marketing professionals adding SEO to their skillset
- Developers who need to understand how technical decisions affect search visibility
- Content creators who want their work to be discoverable
Why it matters: Organic search is consistently one of the highest-value traffic channels. Unlike paid advertising, rankings you earn through SEO continue to deliver traffic without ongoing spend. A page that ranks #1 for a valuable keyword can generate leads or revenue for years.
An Overview of SEO
Search Engine Optimisation Overview
SEO works across three interconnected pillars:
- Technical SEO - ensuring search engines can find, crawl, render, and index your pages correctly
- On-page SEO - optimising the content and HTML of individual pages for relevance
- Off-page SEO - building authority and trust through links and mentions from other sites
All three must work together. Great content with poor technical foundations won't rank. A technically perfect site with thin content won't rank either.
Types of Search Result
When you search Google, you don't just see ten blue links anymore. Understanding result types helps you target the right opportunities.
| Result Type | Description |
|---|---|
| Organic listings | Standard ranked results |
| Featured snippets | A box at the top quoting content directly from a page |
| Knowledge panels | Entity cards (people, places, organisations) drawn from structured data |
| Local pack | Map + business listings for local queries |
| Image results | Images surfaced inline from Google Images |
| Video results | YouTube and other video embeds |
| Shopping (PLA) | Paid product listings - not organic |
| People Also Ask | Expandable question/answer boxes |
| Sitelinks | Sub-links shown beneath a result for branded queries |
Why this matters: Each result type has different optimisation levers. A featured snippet requires structuring your content to answer a question clearly and concisely. A local pack result requires a Google Business Profile and local signals. Knowing what type of result is available for your target query shapes your strategy.
Search Engine Algorithms
Search engines use automated programs (crawlers/spiders) to discover web pages, then algorithms to decide how to rank them. The algorithm considers hundreds of signals, but the major categories are:
- Relevance - does this page actually address the query?
- Authority - do other trusted sites link to this page/domain?
- Quality - is the content well-written, accurate, and comprehensive?
- Experience - is the page fast, mobile-friendly, and safe?
- Intent match - does the format and depth match what searchers actually want?
Google's core algorithm has evolved significantly:
- Panda (2011) - targeted thin, duplicate, and low-quality content
- Penguin (2012) - targeted manipulative link schemes
- Hummingbird (2013) - moved toward understanding query intent, not just keywords
- RankBrain (2015) - machine learning layer for interpreting novel queries
- BERT (2019) - natural language understanding, context within sentences
- Helpful Content (2022) - demoted sites producing content primarily for search engines rather than people
Core principle: Google's algorithm changes are all pushing in the same direction - toward rewarding genuinely useful content for real users, and penalising manipulation.
Thinking Like Google
Google is a business. Its product is search results. If its results are bad, users switch to Bing, DuckDuckGo, or elsewhere. Every decision Google makes is aimed at keeping its results the most useful they can be.
This reframe is powerful: Google is not your enemy, it is your ally. If your page is genuinely the best answer to a query, Google wants to rank it. Your job is to remove the technical barriers that prevent Google from recognising this, and to communicate relevance clearly.
Ask yourself, for any page you're trying to rank: If I were a Google engineer reviewing this page, would I be proud to show it as the #1 result for this query?
The Golden Rule of SEO
Create the best possible resource for your target query, then make it easy for Google to find, understand, and trust it.
Everything in this course supports one or both halves of that sentence. Technical SEO is largely about the second half. On-page SEO and keyword research inform the first half. Link building (covered in more advanced courses) supports "trust."
Crawling Your Own Website
HTML, CSS and JavaScript Primer
To do technical SEO you need a working understanding of how web pages are built. You don't need to be a developer, but you do need to know what you're looking at.
HTML (HyperText Markup Language) is the structure of a page. It's a series of elements (tags) that describe content:
<html>
<head>
<title>Page Title Here</title>
<meta name="description" content="A brief description of the page." />
</head>
<body>
<h1>Main Heading</h1>
<p>A paragraph of text.</p>
<a href="/another-page">A link</a>
</body>
</html>
Key elements for SEO:
- <title> - the page title shown in search results and the browser tab
- <meta name="description"> - the description sometimes shown below a result
- <h1>, <h2>, <h3> - heading hierarchy; signals topic structure
- <a href=""> - links; how PageRank flows and how crawlers discover pages
- <img alt=""> - image alt text; how images are understood by search engines
CSS (Cascading Style Sheets) controls visual presentation. It doesn't directly affect SEO, but it does affect page experience (load time, layout shift).
JavaScript is where complexity for SEO lies. JS can:
- Render content dynamically (content may not exist in the initial HTML)
- Inject links that crawlers may or may not follow
- Delay page rendering, harming Core Web Vitals
Google can execute JavaScript, but it does so in a second wave of indexing that can lag days or weeks. Critical SEO content should not depend on JavaScript to render.
View Source and the DOM
There's an important distinction between view source and the DOM:
- View Source (Ctrl+U or right-click > View Page Source) shows you the raw HTML as delivered by the server - before any JavaScript runs.
- The DOM (Document Object Model) is the page as it exists in memory after the browser has parsed HTML and executed JavaScript. You inspect it via DevTools (F12 > Elements tab).
SEO implication: Googlebot first sees the view-source version. If your content only appears in the DOM (rendered by JS), Google may not index it on first crawl. Always check view source to confirm your important content - titles, headings, body text, links - is present in the raw HTML.
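Checking the raw HTML doesn't require a browser: it's just a string, and Python's standard-library html.parser can confirm the key SEO elements exist before any JavaScript runs. A minimal sketch (the sample markup below is illustrative):

```python
from html.parser import HTMLParser

class SEOChecker(HTMLParser):
    """Collects the title, h1 text and meta description from raw (pre-JavaScript) HTML."""

    def __init__(self):
        super().__init__()
        self._current = None
        self.title = ""
        self.h1 = ""
        self.description = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("title", "h1"):
            self._current = tag
        elif tag == "meta" and attrs.get("name") == "description":
            self.description = attrs.get("content")

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        # Accumulate text only while inside <title> or <h1>
        if self._current == "title":
            self.title += data
        elif self._current == "h1":
            self.h1 += data

raw_html = (
    "<html><head><title>Page Title Here</title>"
    '<meta name="description" content="A brief description of the page." />'
    "</head><body><h1>Main Heading</h1></body></html>"
)
checker = SEOChecker()
checker.feed(raw_html)
print(checker.title)        # Page Title Here
print(checker.h1)           # Main Heading
print(checker.description)  # A brief description of the page.
```

If any of these come back empty on your own pages, that content is probably being injected by JavaScript and may not be indexed on first crawl.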
Crawling and Indexing
These are two distinct steps:
- Crawling - Googlebot visits a URL, downloads the HTML, and follows links to discover more URLs
- Indexing - Google analyses the crawled content, processes it, and stores it in its index so it can be retrieved for queries
A page can be crawled but not indexed (if Google decided it wasn't worth indexing). A page cannot be indexed without first being crawled. You can block either step independently - which we'll cover in the robots.txt and noindex sections.
How Googlebot discovers pages:
- Following links from already-known pages
- XML sitemaps submitted via Google Search Console
- Direct URL submission via Search Console's URL Inspection Tool
The site: and intitle: Operators
Google's advanced search operators let you investigate a site's index directly.
site:example.com - shows all pages Google has indexed from that domain
site:example.com
Use it to:
- Check roughly how many pages are indexed
- Spot pages that should be indexed but aren't
- Find indexed pages that shouldn't be (staging environments, test pages)
site:example.com/folder/ - narrows to a specific directory
intitle:keyword - finds pages with that keyword in the title tag
intitle:"privacy policy" site:example.com
Useful for finding duplicate title tags or pages you forgot existed.
inurl:keyword - finds pages with the keyword in the URL
Combining operators:
site:example.com inurl:blog intitle:"2019"
This would find blog pages from 2019 still indexed on the site.
These operators are estimates. Google doesn't guarantee they're exhaustive. For a true picture of indexation, use Google Search Console.
Anatomy of a Web Address
Understanding URLs is fundamental to SEO. A full URL breaks down as:
https://        www.            example.com     :443    /blog/seo-tips  ?ref=newsletter #comments
scheme          subdomain       domain          port    path            query string    fragment
- Scheme (https://) - the protocol; always use HTTPS
- Subdomain (www) - technically a separate host; blog.example.com is different from example.com for SEO purposes
- Domain (example.com) - your root domain
- Port (:443) - usually omitted; 443 is the default for HTTPS
- Path (/blog/seo-tips) - the specific page; keep it short, descriptive, lowercase, hyphenated
- Query string (?ref=newsletter) - parameters; can cause duplicate content if not managed
- Fragment (#comments) - browser-side anchor; Google typically ignores this
URL best practices for SEO:
- Use HTTPS
- Keep paths short and descriptive
- Use hyphens (-), not underscores (_), to separate words
- Avoid unnecessary parameters where possible
- Be consistent with or without trailing slashes (and canonical accordingly)
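Python's standard urllib.parse splits a URL into exactly these components, which is useful when auditing URLs in bulk. A quick sketch:

```python
from urllib.parse import urlsplit

# Break a full URL into its anatomical parts
parts = urlsplit("https://www.example.com:443/blog/seo-tips?ref=newsletter#comments")

print(parts.scheme)    # https
print(parts.hostname)  # www.example.com
print(parts.port)      # 443
print(parts.path)      # /blog/seo-tips
print(parts.query)     # ref=newsletter
print(parts.fragment)  # comments
```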
Crawling, Indexing and Domain Basics
Screaming Frog SEO Spider
Screaming Frog is the industry-standard desktop crawler for SEO audits. It mimics what Googlebot does - visiting your site, downloading pages, following links - and gives you structured data about everything it finds.
What Screaming Frog reveals:
- All URLs on your site (crawled, linked, or both)
- HTTP status codes (200, 301, 404, 500, etc.)
- Title tags and meta descriptions (length, duplicates, missing)
- H1 and H2 tags
- Canonical tags
- Noindex directives
- Response times
- Images (missing alt text, oversized files)
- Internal and external links
Running your first crawl:
- Enter your domain in the address bar and click Start
- Wait for the crawl to complete (time depends on site size)
- Use the top tabs (Response Codes, Page Titles, etc.) to navigate results
- Export any tab to CSV for further analysis
Free vs paid: The free version is limited to 500 URLs. For larger sites you need a licence. For small sites or initial testing, free is sufficient.
Crawling and Rendering
Screaming Frog (and Googlebot) can crawl in two modes:
HTTP crawl: Downloads the raw HTML only. Fast, but won't see JavaScript-rendered content.
JavaScript rendering: The crawler loads the page in a headless browser (like Chrome), executes all JavaScript, then captures the final DOM. Slower but shows what a real user (and Google) would see.
Why this matters: If there's a significant difference between the HTTP and JavaScript-rendered versions of your pages, you have a rendering issue. Content or links that only appear post-render may not be indexed reliably.
In Screaming Frog: Configuration > Spider > Rendering > select "JavaScript"
HTTP Response Codes
Every time a browser or crawler requests a URL, the server responds with a three-digit status code. These codes tell you (and Google) what happened.
| Code | Meaning | SEO Impact |
|---|---|---|
| 200 | OK - page served successfully | Good |
| 301 | Moved Permanently | Good if used correctly |
| 302 | Found (Temporary Redirect) | Use carefully |
| 404 | Not Found | Neutral if expected; bad if important pages |
| 410 | Gone (permanent 404) | Tells Google definitively the page is removed |
| 500 | Internal Server Error | Bad - Google will back off crawling |
| 503 | Service Unavailable | Temporary; Google will retry |
The full range:
- 2xx - success
- 3xx - redirection
- 4xx - client error (the URL doesn't exist or isn't accessible)
- 5xx - server error (the server failed to fulfill the request)
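When processing crawl exports, a tiny helper makes the class of any status code explicit - a sketch using integer division on the first digit:

```python
def status_category(code: int) -> str:
    """Map an HTTP status code to its broad class (2xx/3xx/4xx/5xx)."""
    categories = {2: "success", 3: "redirection", 4: "client error", 5: "server error"}
    return categories.get(code // 100, "other")

print(status_category(200))  # success
print(status_category(301))  # redirection
print(status_category(404))  # client error
print(status_category(503))  # server error
```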
HTTP Response Codes and Redirects
Auditing Status Codes
In Screaming Frog, the Response Codes tab shows every URL and its status code. Filter by code to find problems:
- 4xx errors on internal links - you're linking to pages that don't exist; fix the links or restore the pages
- 5xx errors - server problems; needs developer attention
- 3xx chains - a redirect that points to another redirect; inefficient and loses a small amount of link equity
Export the 4xx list and cross-reference with your Google Search Console Coverage report to prioritise which broken URLs Google has actually been trying to crawl.
301 Permanent Redirects
A 301 tells Google (and browsers): "This page has permanently moved to a new URL. Update your records."
When to use 301:
- You've permanently changed a URL structure
- You're migrating a domain
- You're consolidating duplicate content
- You're merging two sites
How to implement: On Apache (.htaccess):
Redirect 301 /old-page/ https://www.example.com/new-page/
On Nginx:
return 301 https://www.example.com/new-page/;
SEO impact: 301s pass the vast majority of link equity (PageRank) from the old URL to the new one. Google consolidates signals from old to new. The old URL will eventually drop out of the index and the new URL will rank instead.
301 Redirect Depth (Redirect Chains)
A redirect chain is when A → B → C, instead of A → C directly.
Why chains are bad:
- Each hop adds latency for users
- Each hop may reduce the PageRank passed
- Screaming Frog highlights chains so you can fix them
Best practice: Redirect directly to the final destination. If you accumulate redirects over multiple migrations, clean them up so everything points directly to the canonical URL.
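If you export your redirect map (old URL → new URL), flattening chains is mechanical. A sketch - the function and example URLs are hypothetical, not a feature of any particular tool:

```python
def flatten_redirects(redirects: dict) -> dict:
    """Point every old URL directly at its final destination, skipping intermediate hops."""
    flat = {}
    for start in redirects:
        seen, url = set(), start
        while url in redirects and url not in seen:  # guard against redirect loops
            seen.add(url)
            url = redirects[url]
        flat[start] = url
    return flat

chain = {"/a": "/b", "/b": "/c"}  # the chain A -> B -> C
print(flatten_redirects(chain))   # {'/a': '/c', '/b': '/c'}
```

Applying the flattened map as your server's redirect rules turns every A → B → C chain into a single A → C hop.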
302 Temporary Redirects
A 302 tells Google: "This page has temporarily moved. Keep the original URL in your index."
When to use 302:
- A/B testing (briefly redirecting users to a variant)
- Maintenance pages (show a temporary page while you fix something)
- Login-required redirects
The critical mistake: Many developers use 302 when they mean 301, because it's the default in some frameworks. If Google sees a 302, it won't transfer link equity and will keep the old URL in the index. If the redirect is permanent, always use 301.
302 and 307 Redirects
307 Temporary Redirect is the HTTP/1.1 equivalent of 302. The key difference: a 307 guarantees the request method is preserved (e.g., a POST stays a POST). For SEO purposes, 302 and 307 behave identically - both are temporary, neither consolidates PageRank.
For SEO decisions:
- Permanent change → 301
- Temporary change → 302 or 307 (use 307 when the request method must be preserved)
Soft 404s
A soft 404 is a page that returns a 200 status code but displays "not found" content to the user. This is a mistake.
Example: A product goes out of stock. The developer removes the content but leaves the page returning 200 with just "Product not available." Google crawls it, sees a 200, tries to index it, but it has no useful content. This wastes crawl budget and can dilute your site's quality signals.
Fix: Return an actual 404 or 410 for truly gone pages, or redirect to the best alternative.
Google Search Console's Coverage report flags soft 404s it detects. Check it regularly.
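You can also screen a crawl export for likely soft 404s with a simple heuristic: a 200 response whose body is short and contains "gone" language. The phrase list and length threshold below are illustrative assumptions, not a standard:

```python
# Assumption: phrases that suggest the page's real content is gone
SOFT_404_PHRASES = ("not found", "no longer available", "product not available")

def looks_like_soft_404(status_code: int, body_text: str) -> bool:
    """Heuristic: a 200 page with little content and 'gone' language is a soft-404 candidate."""
    if status_code != 200:
        return False
    text = body_text.lower()
    return len(text) < 500 and any(phrase in text for phrase in SOFT_404_PHRASES)

print(looks_like_soft_404(200, "Product not available."))  # True
print(looks_like_soft_404(404, "Product not available."))  # False - a real 404, not a soft one
```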
503 Service Unavailable
A 503 tells crawlers: "The server is temporarily unable to handle the request. Try again later."
When Google sees a 503:
- It backs off and retries later
- If it sees 503 persistently, it may reduce crawl frequency
- If 503 continues long enough, Google may eventually drop pages from the index
Use case: Sending a 503 during planned maintenance (combined with a Retry-After header) is the correct approach. Google will wait rather than deindex.
HTTP/1.1 503 Service Unavailable
Retry-After: 3600
Controlling Crawling and Indexing
Robots.txt Introduction
robots.txt is a plain text file hosted at the root of your domain (https://www.example.com/robots.txt). It's part of the Robots Exclusion Protocol - a widely-adopted convention (not a technical enforcement mechanism) that tells crawlers which parts of your site they should and shouldn't access.
Example robots.txt:
User-agent: *
Disallow: /admin/
Disallow: /private/
Allow: /public/
Sitemap: https://www.example.com/sitemap.xml
Critical distinction: robots.txt controls crawling, not indexing. Blocking a URL in robots.txt prevents Googlebot from visiting it, but if other sites link to that URL, Google may still index it (without seeing its content). To prevent indexing, use the noindex meta tag instead.
User-Agents
The User-agent: directive in robots.txt specifies which crawler the rule applies to.
| User-agent | Crawler |
|---|---|
* | All crawlers |
Googlebot | Google's main web crawler |
Googlebot-Image | Google's image crawler |
Bingbot | Microsoft Bing |
AhrefsBot | Ahrefs' crawler |
Example - block all crawlers from a directory except Google:
User-agent: *
Disallow: /staging/
User-agent: Googlebot
Allow: /staging/
Rules are evaluated in order. More specific user-agent blocks take precedence over the * wildcard.
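You can check how this precedence plays out with Python's standard urllib.robotparser (note it implements the basic Robots Exclusion Protocol, not Google's wildcard extensions):

```python
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /staging/

User-agent: Googlebot
Allow: /staging/
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# Googlebot matches its own group, which allows /staging/
print(rp.can_fetch("Googlebot", "https://www.example.com/staging/page"))  # True
# Bingbot falls back to the * group, which disallows it
print(rp.can_fetch("Bingbot", "https://www.example.com/staging/page"))    # False
# URLs no rule matches are allowed by default
print(rp.can_fetch("Bingbot", "https://www.example.com/blog/"))           # True
```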
Crawling with Different User-Agents
Screaming Frog lets you set a custom user-agent string for your crawl. This is useful to:
- See what Googlebot sees (vs. what real browsers see)
- Test if your server is serving different content to different bots (cloaking - a violation of Google's guidelines)
- Check if your robots.txt rules apply correctly to specific bots
In Screaming Frog: Configuration > Spider > User-Agent
You can paste in Googlebot's official user-agent string to simulate its crawl precisely.
Dealing with 403 Errors When Crawling
A 403 Forbidden means the server understood the request but refused to fulfill it. When Screaming Frog returns lots of 403s, common causes include:
- The server is blocking your IP (rate limiting or bot protection)
- The server requires authentication for those pages
- Cloudflare or a WAF (Web Application Firewall) is blocking the crawler
Solutions:
- Reduce crawl speed in Screaming Frog (Configuration > Speed)
- Whitelist your IP in Cloudflare or your server firewall
- Use the "Cookie" option in Screaming Frog to pass authentication
Google Search Console
Google Search Console (GSC) is Google's free tool for monitoring how your site performs in Google Search. It's essential - no other tool gives you direct data from Google itself.
Key features:
| Feature | What it tells you |
|---|---|
| Performance report | Clicks, impressions, CTR, average position per query/page |
| Coverage report | Which pages are indexed, and why others aren't |
| URL Inspection | Detailed crawl/index status for a specific URL |
| Sitemaps | Submit and monitor XML sitemaps |
| Core Web Vitals | Field data (real-user experience) |
| Links | Top linked pages and linking domains |
| Manual Actions | Whether Google has penalised your site |
Adding and Verifying Google Search Console
To use GSC, you must verify ownership of the domain.
Verification methods:
- HTML file - upload a specific file to your server root (simplest for developers)
- HTML meta tag - add a <meta name="google-site-verification"> tag to your homepage <head>
- Google Analytics - if GA is already installed, GSC can verify automatically
- Google Tag Manager - similar to GA
- DNS record - add a TXT record to your domain's DNS (best for domain-level properties)
Domain property vs URL prefix property:
- A URL prefix property covers one protocol and subdomain (e.g., https://www.example.com)
- A Domain property covers all subdomains and protocols - requires DNS verification
Use a Domain property where possible for a complete picture.
URL Inspection Tool
The URL Inspection Tool in GSC gives you Google's current view of a specific URL:
- Is it indexed?
- Was it crawled, and when?
- What did Google see when it crawled it (the rendered screenshot)?
- Are there any issues (noindex tag, robots.txt blocked, redirect, etc.)?
- What canonical URL did Google select?
You can also request indexing directly from this tool - useful after publishing new content or fixing a page.
Important: "Request indexing" adds the URL to a priority crawl queue. It doesn't guarantee instant indexing. Google still makes the final decision.
Noindex Tag
The noindex directive tells Google: "Crawl this page if you want, but do not include it in the search index."
Implementation (meta tag in <head>):
<meta name="robots" content="noindex" />
Via HTTP header (useful for non-HTML files like PDFs):
X-Robots-Tag: noindex
Common use cases:
- Thank-you pages after form submissions
- Internal search results pages
- Pagination (though there are better approaches)
- Staging or development environments
- Thin or duplicate pages you don't want competing in search
Noindex vs robots.txt:
- noindex = "crawl it, but don't index it"
- robots.txt Disallow = "don't even crawl it"
- Never combine them - if you disallow crawling, Google can't see the noindex tag
Managing Crawl Budget
Crawl Budget
Crawl budget is the number of URLs Googlebot will crawl on your site within a given timeframe. It's determined by two factors:
- Crawl rate limit - how fast Googlebot is willing to crawl without overloading your server
- Crawl demand - how often Google wants to recrawl your pages based on perceived value and change frequency
For small sites (under a few thousand pages), crawl budget is rarely a concern. For large sites (e-commerce, news, large CMSs with millions of URLs), it matters significantly.
Signs of crawl budget problems:
- New or updated content takes weeks to be indexed
- GSC Coverage report shows large numbers of "Discovered but not yet indexed" URLs
- Crawl stats in GSC show Googlebot spending lots of time on low-value pages
Crawl-Delay and Crawl Speed
Crawl-delay is a robots.txt directive that tells crawlers to wait N seconds between requests:
User-agent: *
Crawl-delay: 2
Important: Google does not officially support Crawl-delay. To manage Google's crawl rate, use the Crawl Rate setting in Google Search Console (Settings > Crawl stats > Open crawl stats > Adjust crawl rate).
Other crawlers (Bingbot, AhrefsBot, etc.) do respect Crawl-delay.
Robots.txt vs Noindex Usage
| Goal | Use |
|---|---|
| Stop Googlebot visiting a URL | robots.txt Disallow |
| Stop a URL appearing in search results | noindex meta tag |
| Stop a page appearing AND being crawled | noindex first; once Google has deindexed it, optionally add a robots.txt Disallow |
| Protect private content from crawlers | robots.txt Disallow (but remember it's not security) |
| Keep a page crawlable but not indexed | noindex tag |
The critical mistake to avoid: Blocking a URL in robots.txt and adding a noindex tag. If you block crawling, Googlebot can't read the noindex. The page may still appear in the index (without a snippet) if other sites link to it.
Low-Value Pages and Crawl Budget
Low-value pages waste crawl budget and can dilute site quality signals. Common culprits:
- Faceted navigation - e.g., /products?colour=red&size=medium&brand=nike - can generate millions of parameter combinations
- Session IDs in URLs - creates duplicate pages for every session
- Printer-friendly pages - duplicates of main content
- Empty category pages - e.g., a category with zero products
- Paginated archives - deep pagination (page 500 of a blog archive)
- Duplicate parameter pages - ?sort=price vs ?sort=price_asc
Solutions:
- Use robots.txt to block entire parameter-based directories
- Use canonical tags to consolidate duplicate URLs
- Use noindex on pagination beyond a certain depth
- Use GSC's URL Parameters tool for Google specifically (note: Google retired this tool in 2022, so rely on the options above)
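A common companion fix is normalising parameterised URLs before they spread through internal links and sitemaps. A sketch with urllib.parse - the parameter blocklist is an assumption you would tailor to your own site:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: parameters that create duplicates without changing the content
LOW_VALUE_PARAMS = {"sessionid", "sort", "ref"}

def strip_low_value_params(url: str) -> str:
    """Remove duplicate-generating parameters, keeping those that change the content."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k.lower() not in LOW_VALUE_PARAMS]
    # Drop the fragment too - crawlers ignore it anyway
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

print(strip_low_value_params("https://www.example.com/products?colour=red&sort=price&sessionid=abc"))
# https://www.example.com/products?colour=red
```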
Auditing Crawling and Indexing
Installing Sitebulb and Running Your First Crawl
Sitebulb is an alternative to Screaming Frog with a stronger focus on visualisation and prioritised recommendations. Where Screaming Frog gives you raw data in spreadsheet-style tabs, Sitebulb interprets that data and surfaces the most important issues first.
First crawl workflow:
- Create a new project with your domain
- Select crawl type (Website Crawl is standard)
- Configure crawl settings (authentication, JavaScript rendering, crawl limits)
- Run the crawl and review the Overview dashboard
Sitebulb's Hints system assigns a severity level (Critical, Warning, Advisory) to each issue it finds, which helps prioritise remediation.
Testing Robots.txt Rules
Google provides a robots.txt Tester in Google Search Console (under Legacy Tools). It lets you:
- View your current robots.txt as Google sees it
- Test whether specific URLs are blocked for specific user-agents
- Identify syntax errors
You can also test robots.txt rules using Screaming Frog's custom robots.txt configuration, or via third-party testing tools.
Common robots.txt mistakes:
# Wrong - blocks everything
User-agent: *
Disallow: /
# Wrong - everything after # is a comment, so this Disallow is empty and blocks nothing
User-agent: *
Disallow: # this doesn't work
# Wrong - case sensitivity matters on Linux servers
User-agent: *
Disallow: /Admin/ # won't block /admin/
Noindex and Robots.txt Together
As established: never use both together on the same URL. The combination is self-defeating.
Scenario to watch for:
- A developer adds Disallow: /blog/ to robots.txt (maybe to speed up staging)
- Later, a noindex is added to blog posts to prevent duplicate content
- The robots.txt block is forgotten and never removed after launch
- Result: blog posts can't be indexed because Google can't crawl them to see the noindex
Always audit robots.txt and noindex directives together when investigating indexation issues.
Quickly Removing Pages from Google
If you need a page out of Google's index urgently (sensitive content, a leaked document, an error page that went live):
- URL Removal Tool in Google Search Console - temporarily suppresses a URL from results for ~6 months
- This is temporary. For permanent removal, you also need to add a noindex tag or return a 404/410
The URL Removal Tool does not delete the page from the internet. It only hides it from Google results temporarily. For genuine removal, the noindex or 404 must be in place before the removal expires.
Advanced Robots.txt
Wildcards:
User-agent: *
Disallow: /search?* # block /search URLs that carry a query string
Disallow: /*.pdf$ # block all PDFs
Google supports * (match any sequence) and $ (end of URL) wildcards.
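urllib.robotparser doesn't understand these wildcards, but the matching rule is simple to sketch yourself: * becomes "any sequence" and a trailing $ anchors the end of the URL. This simplified matcher checks a single rule against a path and ignores Allow/Disallow precedence:

```python
import re

def robots_rule_matches(rule: str, path: str) -> bool:
    """Google-style robots.txt pattern match: * = any sequence, trailing $ = end of URL."""
    pattern = re.escape(rule).replace(r"\*", ".*")
    if pattern.endswith(r"\$"):
        pattern = pattern[:-2] + "$"  # turn the escaped trailing $ back into an anchor
    return re.match(pattern, path) is not None

print(robots_rule_matches("/*.pdf$", "/files/report.pdf"))      # True
print(robots_rule_matches("/*.pdf$", "/files/report.pdf?x=1"))  # False - $ requires end of URL
print(robots_rule_matches("/search?*", "/search?q=seo"))        # True
```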
Allow overrides Disallow:
User-agent: *
Disallow: /private/
Allow: /private/public-document.pdf
More specific rules take precedence. If there's a tie in specificity, Allow wins.
Multiple sitemaps:
Sitemap: https://www.example.com/sitemap.xml
Sitemap: https://www.example.com/sitemap-news.xml
Sitemap: https://www.example.com/image-sitemap.xml
Auditing Robots.txt Coverage
A robots.txt coverage audit checks whether your robots.txt rules are doing what you intend:
- Crawl your site with Screaming Frog
- Cross-reference blocked URLs with your intent
- Check whether any linked URLs are blocked (Screaming Frog flags these)
- Verify that your robots.txt allows your most important pages
- Test in GSC's robots.txt tester
Screaming Frog marks blocked URLs with "Blocked by robots.txt" in the Indexability Status column.
Canonical Tags
Canonical Tags
A canonical tag is an HTML element that tells Google: "This is the preferred version of this page. If you find duplicate or near-duplicate versions, treat this URL as the original."
<link rel="canonical" href="https://www.example.com/preferred-page/" />
This tag lives in the <head> section of your HTML.
The duplicate content problem: Google regularly encounters multiple URLs with identical or near-identical content. This can happen through:
- HTTP vs HTTPS versions (http:// and https://)
- www vs non-www (www.example.com vs example.com)
- Trailing slash vs no trailing slash (/page vs /page/)
- URL parameters (?sessionid=abc, ?sort=price)
- Print versions of pages
- Syndicated content
Without direction, Google has to guess which version to index and rank. It usually guesses correctly, but you shouldn't rely on it.
When to Use Canonical Tags
Use canonical tags when:
- You have parameterised URLs that create duplicate pages (e-commerce facets, session IDs)
- You syndicate content to other sites (the original should self-canonicalise)
- You have www/non-www or HTTP/HTTPS duplicates (though redirects are better here)
- A CMS automatically creates multiple versions of pages
Self-referencing canonicals: Best practice is for every page to have a canonical tag pointing to itself, even if there are no known duplicates. This is defensive - it tells Google explicitly what the preferred URL is.
<!-- On https://www.example.com/blog/seo-tips/ -->
<link rel="canonical" href="https://www.example.com/blog/seo-tips/" />
Other Canonical Tag Uses
Cross-domain canonical: You can canonical from one domain to another. This is used when syndicating content - the syndicated version canonicalises back to the original:
<!-- On syndication site, pointing back to original -->
<link rel="canonical" href="https://original-publisher.com/article/" />
Pagination: Canonical tags are sometimes used to consolidate paginated series to the first page. However, Google's current recommendation is to let paginated pages be indexed naturally with proper internal linking, rather than canonicalising all pages to page 1.
Canonical Tag Best Practices
- Use absolute URLs - always include the full URL including protocol and domain
- Be consistent - decide on www/non-www, trailing slash or not, and stick to it everywhere
- Don't canonicalise to redirected URLs - the canonical should point to the final destination URL
- Don't canonicalise to a noindexed page - this creates a conflict; Google will likely ignore it
- One canonical per page - if multiple canonical tags are present, Google will ignore all of them
- Use 301 redirects for true duplicates - canonical is for when you need both URLs accessible; if you genuinely want users and bots to end up at one URL, redirect instead
- Canonical tags are hints, not directives - Google may choose a different canonical than you specify. If Google consistently overrides your canonical, investigate why (usually a redirect, sitemap, or internal linking inconsistency)
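Several of these checks are automatable from raw HTML. A sketch using the standard html.parser that flags missing, multiple, relative, and non-self-referencing canonicals (the audit labels are my own):

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Collects every rel=canonical href found in a page's HTML."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and attrs.get("rel") == "canonical":
            self.canonicals.append(attrs.get("href"))

def audit_canonical(page_url: str, html: str) -> str:
    finder = CanonicalFinder()
    finder.feed(html)
    if not finder.canonicals:
        return "missing"
    if len(finder.canonicals) > 1:
        return "multiple (Google will ignore all of them)"
    href = finder.canonicals[0]
    if not href.startswith(("http://", "https://")):
        return "not absolute"
    return "self-referencing" if href == page_url else "canonicalised elsewhere"

page = "https://www.example.com/blog/seo-tips/"
html = '<link rel="canonical" href="https://www.example.com/blog/seo-tips/" />'
print(audit_canonical(page, html))  # self-referencing
```

Running this across a crawl export quickly separates intentional consolidation from accidents.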
Internationalisation (Canonical Context)
When running multilingual or multi-regional sites, canonical tags interact with hreflang. The key rule: hreflang should point to canonical URLs. If your French page (/fr/) has a canonical pointing to the English version (/en/), you're telling Google to prefer the English page - which defeats the purpose of the French version.
Each language/region version should:
- Have a self-referencing canonical
- Have hreflang annotations pointing to all other versions
Auditing Canonical Tags with Sitebulb
Sitebulb's canonical audit surfaces:
- Pages missing canonical tags
- Non-self-referencing canonicals (pages pointing elsewhere)
- Canonical chains (A canonicals to B, B canonicals to C)
- Canonicals pointing to non-indexable pages (404s, noindex, redirects)
- Canonical vs redirect conflicts
- Pages where Google has chosen a different canonical than specified
The "Canonicalised" pages report shows all pages being consolidated into others - review this to ensure consolidation is intentional.
Internationalisation and Hreflang
Hreflang Tags
Hreflang is an HTML attribute that tells Google which language and/or region a page is intended for, and links it to equivalent pages in other languages/regions.
<!-- On the English (US) page -->
<link rel="alternate" hreflang="en-us" href="https://www.example.com/en-us/page/" />
<link rel="alternate" hreflang="en-gb" href="https://www.example.com/en-gb/page/" />
<link rel="alternate" hreflang="fr" href="https://www.example.com/fr/page/" />
<link rel="alternate" hreflang="x-default" href="https://www.example.com/page/" />
x-default specifies the fallback page shown when no other hreflang matches the user's language/region.
Where to implement hreflang:
- HTML `<head>` tags (shown above)
- HTTP headers (for non-HTML files like PDFs)
- XML sitemap
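The XML sitemap option uses the `xhtml:link` extension. A minimal sketch for one URL (example URLs; in a real sitemap you would repeat a `<url>` entry, with the same annotations, for each alternate):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>https://www.example.com/en-us/page/</loc>
    <xhtml:link rel="alternate" hreflang="en-us"
                href="https://www.example.com/en-us/page/"/>
    <xhtml:link rel="alternate" hreflang="fr"
                href="https://www.example.com/fr/page/"/>
  </url>
</urlset>
```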
Hreflang Tag Best Practices
- All pages in the set must reference each other - if you have 5 language versions, each page must have 5 hreflang tags (including a self-referencing one)
- Use valid language/region codes - ISO 639-1 for language (`en`, `fr`, `de`), ISO 3166-1 Alpha-2 for region (`US`, `GB`, `FR`)
- Always canonicalise before applying hreflang - hreflang signals on non-canonical pages are wasted
- Don't hreflang to noindexed pages - Google can't index them
- Machine translation is not a separate language version - Google is increasingly good at detecting this; only create language versions with genuinely translated content
Language only vs language + region:
- `en` = English, any region
- `en-gb` = English, United Kingdom
- `en-us` = English, United States
Use language+region only when the content genuinely differs between regions (pricing, date formats, culturally specific content). Don't create US/UK/AU versions of identical English content - that's duplicate content.
Auditing Hreflang Tags
Common hreflang errors to check for:
| Error | Cause |
|---|---|
| Missing return links | Page A references Page B but B doesn't reference A |
| Wrong language codes | Using `en-EN` instead of `en-GB` |
| Hreflang to redirected URL | Should point to final destination |
| Hreflang to noindexed page | Google ignores these |
| Self-referencing hreflang missing | Each page needs to include itself in the hreflang set |
Tools for auditing: Screaming Frog (Hreflang tab), Sitebulb, Ahrefs Site Audit.
Domain Selection for International SEO
How you structure URLs for international content has long-term SEO implications:
| Structure | Example | Pros | Cons |
|---|---|---|---|
| ccTLD | example.fr | Strong geo-targeting signal | Expensive, harder to build authority separately |
| Subdomain | fr.example.com | Separate crawl, easy to host separately | Authority doesn't consolidate to main domain as cleanly |
| Subdirectory | example.com/fr/ | Consolidates all authority to one domain | Requires shared hosting, slightly weaker geo-signal |
Recommendation for most sites: Subdirectories (/fr/, /de/) unless you have specific reasons to need ccTLDs (strong local brand trust requirements) or subdomains (technical infrastructure requirements).
Geo-IP Redirects
Geo-IP redirects automatically redirect users to the regional version of your site based on their IP address. For example, a user with a French IP visiting example.com is redirected to example.com/fr/.
SEO concerns:
- Googlebot crawls from US IPs. If you redirect US IPs to the English version, Google may never crawl your French pages
- Always allow Googlebot to access all versions regardless of IP
- Use the `Vary: Accept-Language` HTTP header to signal content negotiation
- Pair geo-IP redirects with hreflang so Google understands the full international structure
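One way to let crawlers through is to exempt known bot user agents before applying the geo-IP rule. A hypothetical Nginx sketch (it assumes the `ngx_http_geoip_module` is available for `$geoip_country_code`; hostnames and paths are invented):

```nginx
# Exempt known crawlers from the geo-IP redirect so every language
# version stays crawlable from any IP.
map $http_user_agent $is_crawler {
    default       0;
    ~*googlebot   1;
    ~*bingbot     1;
}

server {
    server_name www.example.com;

    location = / {
        set $redirect_fr 0;
        if ($geoip_country_code = FR) { set $redirect_fr 1; }
        if ($is_crawler)              { set $redirect_fr 0; }
        if ($redirect_fr)             { return 302 /fr/; }
        add_header Vary Accept-Language;
        # ... otherwise serve the default (x-default) homepage
    }
}
```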
Nofollow Tags
Rel Nofollow Attribute
The rel="nofollow" attribute on a link tells Google: "Don't follow this link or pass PageRank through it."
<a href="https://external-site.com" rel="nofollow">External link</a>
Originally introduced in 2005 to combat comment spam (spammers adding links to comments to gain PageRank), nofollow became widely overused - many sites nofollow all external links indiscriminately.
Google's current treatment (since 2019): Nofollow is a hint, not a directive. Google may choose to follow nofollow links if it considers them valuable. However, it generally doesn't pass PageRank through them.
Types of Rel Nofollow
Google introduced two additional link attributes in 2019 to provide more nuance:
| Attribute | Use case |
|---|---|
| `rel="nofollow"` | General purpose: don't associate this link with my endorsement |
| `rel="sponsored"` | Paid links, advertising, affiliate links |
| `rel="ugc"` | User-generated content: comments, forum posts |
These can be combined: `rel="nofollow ugc"`
When to use nofollow:
- Paid or sponsored links (using `sponsored` is preferred and more accurate)
- Links within user-generated content you haven't editorially reviewed
- Links to pages you don't want to endorse (e.g., a competitor you're mentioning)
When NOT to use nofollow:
- Ordinary editorial links to legitimate sites you're citing
- Internal links (this can disrupt internal PageRank flow)
- Links in your main navigation or footer
Page Experience
Speed, Performance and UX
Google uses page experience as a ranking factor. The rationale: a fast, usable page delivers a better experience for searchers, which serves Google's goal of surfacing high-quality results.
Key performance concepts:
- TTFB (Time to First Byte) - how quickly the server starts sending data; influenced by hosting quality and server-side processing
- LCP (Largest Contentful Paint) - how quickly the main content loads; a Core Web Vital
- INP (Interaction to Next Paint) - how quickly the page responds to user interaction; a Core Web Vital (INP replaced FID, First Input Delay, in March 2024)
- CLS (Cumulative Layout Shift) - how much the layout moves unexpectedly while loading; a Core Web Vital
Quick performance wins:
- Enable GZIP or Brotli compression on your server
- Use a CDN (Content Delivery Network) to serve assets from servers close to users
- Optimise and compress images (use WebP format)
- Minify CSS, JavaScript, and HTML
- Use browser caching (set appropriate Cache-Control headers)
- Defer non-critical JavaScript
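Several of these wins live in server configuration. A hedged Nginx sketch of compression plus long-lived caching for static assets (Brotli needs the separate `ngx_brotli` module; file extensions here are examples):

```nginx
# Compress text-based responses
gzip on;
gzip_types text/css application/javascript application/json image/svg+xml;

# Cache fingerprinted static assets aggressively
location ~* \.(css|js|webp|woff2)$ {
    add_header Cache-Control "public, max-age=31536000, immutable";
}
```

The `immutable` hint only makes sense for assets whose filenames change when their content changes (e.g. hashed build output).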
Mobile-Friendly SEO
Google uses mobile-first indexing - it primarily uses the mobile version of your content for indexing and ranking. If your mobile site has less content than desktop, the missing content may not be indexed.
Requirements:
- Responsive design (or equivalent mobile experience)
- Same content on mobile and desktop versions
- Same structured data markup on mobile
- No mobile-specific interstitials blocking content
- Touch targets large enough to tap (Google recommends 48x48px minimum)
- Readable font sizes without zooming
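A responsive setup starts with the viewport meta tag; without it, mobile browsers render the page at desktop width and scale it down:

```html
<meta name="viewport" content="width=device-width, initial-scale=1" />
```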
Testing: use Lighthouse's mobile audit in Chrome DevTools and real-device testing. (Google retired its standalone Mobile-Friendly Test tool and GSC's Mobile Usability report in late 2023.)
HTTPS
HTTPS (HyperText Transfer Protocol Secure) encrypts the connection between the browser and the server. Google confirmed HTTPS as a ranking signal in 2014, and Chrome now flags HTTP sites as "Not Secure."
Why HTTPS matters for SEO:
- Direct (minor) ranking signal
- Trust signal for users - visitors are less likely to bounce from a "Not Secure" warning
- Required for HTTP/2 (which is faster than HTTP/1.1)
- Protects integrity of content (prevents ISPs injecting ads or malware)
- Enables modern browser features required for performance (like Service Workers)
Implementing and Checking HTTPS
Implementation steps:
- Obtain an SSL certificate (Let's Encrypt provides free certificates)
- Install and configure on your server
- Update all internal links to HTTPS
- Update your canonical tags to HTTPS URLs
- Update your XML sitemap to HTTPS URLs
- Set up 301 redirects from all HTTP URLs to HTTPS
- Update Google Search Console with the HTTPS property
- Update Google Analytics to HTTPS
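Step 6 above, as a server-level sketch in Nginx (hostname is an example) - redirecting HTTP and HTTP/www straight to the HTTPS canonical host in a single hop, which also avoids the redirect-chain problem:

```nginx
server {
    listen 80;
    server_name example.com www.example.com;
    # One permanent redirect, preserving the requested path and query string
    return 301 https://www.example.com$request_uri;
}
```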
Common HTTPS issues:
- Mixed content - page is HTTPS but loads HTTP resources (images, scripts, stylesheets); browsers block or flag these and the page shows as insecure. Find them with a crawler such as Screaming Frog, then update resource URLs to HTTPS.
- Redirect chains - HTTP → HTTP/www → HTTPS/www instead of directly HTTP → HTTPS
- Expired certificate - causes browser warnings and Googlebot will back off
Intrusive Interstitials
Google penalises pages that show intrusive interstitials (popups, overlays) that block users from accessing content - particularly on mobile.
What's penalised:
- Popups that cover the main content immediately upon landing
- Standalone interstitials users must dismiss before seeing content
- Content pushed below the fold by large banners
What's allowed:
- Cookie consent banners (legal requirement)
- Age verification popups (legal requirement)
- Small, easily dismissable banners
- Login dialogs for content that requires sign-in
Safe Browsing
Google's Safe Browsing program flags sites that contain malware, phishing content, or harmful downloads. If your site is flagged:
- Chrome shows a large red warning to users
- Google may mark your listing in search results
- GSC will alert you under Security Issues
Causes of Safe Browsing flags:
- Hacked site with injected malware
- Phishing pages (created deliberately or injected by an attacker)
- Software that downloads malware
- Third-party scripts from unsafe providers
Remediation: Use GSC's Security Issues report, clean the site, then request a review. Safe Browsing issues must be resolved urgently - they kill traffic.
Real-Time Monitoring
For ongoing page experience, set up monitoring for:
- Uptime monitoring (UptimeRobot, Pingdom) - alerts when your site goes down
- Performance monitoring (SpeedCurve, Calibre) - tracks Core Web Vitals over time
- GSC alerts - subscribe to email alerts for new Coverage issues, Manual Actions
- Synthetic monitoring - scheduled Lighthouse runs to catch regressions
Core Web Vitals
Core Web Vitals (CWV) are Google's specific page experience metrics used as ranking signals:
| Metric | Measures | Good threshold |
|---|---|---|
| LCP (Largest Contentful Paint) | Loading performance | ≤ 2.5 seconds |
| INP (Interaction to Next Paint) | Interactivity | ≤ 200 milliseconds |
| CLS (Cumulative Layout Shift) | Visual stability | ≤ 0.1 |
LCP optimisation:
- Preload the LCP image with `<link rel="preload">`
- Serve images at the correct size
- Use a CDN
- Optimise your server response time
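The preload hint from the first bullet looks like this - a sketch assuming a hero image is the LCP element (the path is invented):

```html
<!-- Hint the browser to fetch the hero image early, at high priority -->
<link rel="preload" as="image" href="/images/hero.webp" fetchpriority="high" />
```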
INP optimisation:
- Minimise long JavaScript tasks
- Break up heavy scripts
- Use `requestIdleCallback` for non-critical work
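A browser-side sketch of that last point - deferring non-critical work until the main thread is idle (the setup function is a placeholder):

```html
<script>
  // Placeholder for any non-critical setup (analytics, chat widgets, etc.)
  function initNonCriticalWork() {
    /* ... */
  }
  if ("requestIdleCallback" in window) {
    // Run when the browser has spare time, keeping interactions responsive
    requestIdleCallback(initNonCriticalWork);
  } else {
    setTimeout(initNonCriticalWork, 1); // fallback for browsers without the API
  }
</script>
```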
CLS optimisation:
- Always specify `width` and `height` on images and iframes
- Reserve space for dynamically injected content
- Avoid inserting content above existing content
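The first rule in practice - explicit dimensions let the browser reserve space before the image loads, so nothing shifts (file and sizes are examples):

```html
<img src="/images/chart.webp" width="800" height="450" alt="Traffic growth chart" />
```

Modern CSS can still make the image fluid (`max-width: 100%; height: auto;`); the attributes just give the browser the aspect ratio up front.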
Lab Testing Core Web Vitals
Lab tools measure performance in a controlled environment. Use them for debugging:
- Lighthouse (built into Chrome DevTools, or PageSpeed Insights)
- WebPageTest (more detailed waterfall analysis)
Lab data is useful for diagnosing specific issues, but doesn't reflect real user experience (different devices, networks, regions).
Field Data and the CrUX Report
Field data (also called Real User Monitoring or RUM) measures performance as experienced by real users. Google's field data comes from Chrome users who have opted into sharing data - this is the Chrome User Experience Report (CrUX).
CrUX data is publicly available:
- PageSpeed Insights shows it for any URL with sufficient traffic
- The CrUX API lets you query it programmatically
- Looker Studio (formerly Google Data Studio) has a CrUX dashboard template
Why field data matters: Google uses field data, not lab data, for its Core Web Vitals ranking signal. A page might score well in Lighthouse but poorly in field data if real users on slow connections or devices have poor experiences.
Field Data in GSC
Google Search Console's Core Web Vitals report shows field data for your site, aggregated by URL group. URLs are categorised as "Good," "Needs Improvement," or "Poor."
This report tells you:
- Which pages have CWV issues as experienced by real users
- What metric is failing (LCP, INP, or CLS)
- Roughly how many URLs are affected
Prioritise pages in the "Poor" category, starting with your highest-traffic templates.
Internal Linking
PageRank and Linking
PageRank (named after Google co-founder Larry Page) is Google's algorithm for measuring the authority of a page based on the quantity and quality of links pointing to it. While Google no longer publishes PageRank scores, the underlying concept still drives how link authority flows through the web and within a site.
How PageRank works conceptually:
- A link from Page A to Page B passes some authority (PageRank) to Page B
- Pages with more links pointing to them accumulate more PageRank
- Links from high-PageRank pages pass more authority than links from low-PageRank pages
- PageRank distributes across all outbound links on a page (more links = less passed per link)
Internal linking implications: Every internal link you add is an opportunity to direct PageRank to a page that needs it. Your homepage typically has the most PageRank (it gets the most external links). Linking from the homepage to key internal pages passes that authority inward.
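The flow described above can be sketched with a toy PageRank iteration. The four-page link graph below is invented for illustration, but the arithmetic (damping factor 0.85, authority split evenly across a page's outbound links) follows the classic formulation:

```javascript
// Toy internal-link graph: each page lists the pages it links to.
const links = {
  home: ["about", "blog", "services"],
  about: ["home"],
  blog: ["home", "services"],
  services: ["home"],
};
const pages = Object.keys(links);
const d = 0.85; // damping factor

// Start with PageRank spread evenly across all pages
let pr = Object.fromEntries(pages.map((p) => [p, 1 / pages.length]));

for (let i = 0; i < 50; i++) {
  // Every page keeps a small baseline of (1 - d) / N
  const next = Object.fromEntries(pages.map((p) => [p, (1 - d) / pages.length]));
  for (const [page, outs] of Object.entries(links)) {
    for (const target of outs) {
      // A page's authority splits evenly across its outbound links
      next[target] += (d * pr[page]) / outs.length;
    }
  }
  pr = next;
}

console.log(pr); // "home" accumulates the most - every page links to it
```

Running this shows `home` ending up with far more PageRank than the other pages, which is exactly why links *from* the homepage are so valuable to pass inward.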
Internal Linking and Anchor Text
Anchor text is the visible, clickable text of a link:
<a href="/seo-tips/">SEO tips for beginners</a>
Here, "SEO tips for beginners" is the anchor text.
Anchor text tells Google what the linked page is about. Consistent, descriptive anchor text reinforces the topical relevance of the destination page for related keywords.
Best practices:
- Use descriptive, keyword-relevant anchor text (not "click here" or "read more")
- Vary anchor text naturally - exact match everywhere looks manipulative
- Match anchor text to the content of the destination page
- Use anchor text that makes sense out of context (accessibility benefit too)
Internal link anchor text is more flexible than external link anchor text - you have full control over it and Google treats it as an editorial choice rather than a potential manipulation.
Auditing Internal Anchor Text
In Screaming Frog:
- Go to the Bulk Export menu > All Inlinks
- Export to CSV
- Filter by destination URL to see all anchor text pointing to a given page
- Identify: generic anchors ("click here"), missing anchor text (image links without alt text), over-optimised exact-match anchors
Also useful: the Anchor tab in Screaming Frog, which shows all anchor text used in internal links across the site.
Linking from Important Pages
Because PageRank flows through links, links from your most important pages are the most valuable internal links you can create.
Strategies:
- Hub pages / pillar pages - comprehensive pages on broad topics that link to more specific sub-pages (topic clusters). The hub accumulates authority and distributes it to cluster pages.
- Homepage links - direct links from the homepage to key conversion pages or important content
- Navigation links - your main navigation appears on every page; what's in it receives PageRank from every page on your site
- Contextual links - links within body content from high-traffic, high-authority pages to newer or struggling pages
Audit: orphan pages - pages with no internal links pointing to them. Googlebot may never find them. Screaming Frog's Crawl source filter can help identify these.
Structured Data
Introduction to Structured Data
Structured data is code added to your HTML that helps search engines understand the meaning and context of your content - not just the words, but what those words represent.
For example: a page about a recipe contains text mentioning "30 minutes." With structured data, you can explicitly tell Google: those 30 minutes are the cookTime for a Recipe object. This unlocks rich results in Google Search - enhanced listings with star ratings, images, cooking times, etc.
Structured data doesn't directly improve rankings, but it can significantly improve click-through rates by making your listing more visually prominent and informative.
Getting Started with Schema.org
Schema.org is the vocabulary (the set of defined types and properties) used for structured data. It's a collaboration between Google, Bing, Yahoo, and Yandex.
Key concepts:
- Type - what the thing is (e.g., `Article`, `Product`, `Recipe`, `FAQPage`, `LocalBusiness`)
- Property - an attribute of that type (e.g., `name`, `price`, `ratingValue`, `author`)
Browse types and properties at schema.org.
Formats for implementing structured data:
- JSON-LD - recommended by Google; a JSON object placed in a `<script>` tag in the `<head>` or `<body>`. Easier to implement and maintain.
- Microdata - HTML attributes inline with content. More complex, tightly coupled to HTML.
- RDFa - similar to Microdata; less common.
Use JSON-LD unless you have a specific reason not to.
Google's SERP Feature Gallery
Implementing the right structured data can unlock these rich results in Google:
| Schema Type | Rich Result |
|---|---|
| `Recipe` | Image, cook time, ratings in results |
| `Product` | Price, availability, reviews |
| `FAQPage` | Expandable Q&A sections below the listing |
| `HowTo` | Step-by-step instructions with images |
| `Event` | Date, location, ticket availability |
| `Article` / `NewsArticle` | Top Stories carousel eligibility (AMP is no longer required) |
| `LocalBusiness` | Enhanced Knowledge Panel |
| `BreadcrumbList` | Breadcrumb trail shown below result URL |
| `VideoObject` | Video thumbnail and duration in results |
| `Review` / `AggregateRating` | Star ratings |
Not all schema types generate rich results. Check Google's Search Gallery (search "Google Search Gallery") for the current list of eligible types and their requirements.
Writing Structured Data
Example - FAQ structured data:
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "FAQPage",
"mainEntity": [
{
"@type": "Question",
"name": "What is SEO?",
"acceptedAnswer": {
"@type": "Answer",
"text": "SEO stands for Search Engine Optimisation. It is the practice of improving a website to increase its visibility in organic search engine results."
}
},
{
"@type": "Question",
"name": "How long does SEO take?",
"acceptedAnswer": {
"@type": "Answer",
"text": "SEO typically takes 3–6 months to show significant results, depending on competition, domain authority, and how much work is done."
}
}
]
}
</script>
Example - Article:
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "Article",
"headline": "Complete Guide to SEO in 2024",
"author": {
"@type": "Person",
"name": "Jane Smith"
},
"datePublished": "2024-01-15",
"dateModified": "2024-03-01",
"publisher": {
"@type": "Organization",
"name": "Example Blog",
"logo": {
"@type": "ImageObject",
"url": "https://www.example.com/logo.png"
}
}
}
</script>
Testing structured data: Use Google's Rich Results Test (search "rich results test") to validate your markup and preview how it might appear in search results. Also check the Enhancements section in Google Search Console for live data.
On-Page SEO
On-Page SEO
On-page SEO is the practice of optimising individual web pages to rank for specific queries. Unlike technical SEO (which focuses on how search engines access your site) and off-page SEO (which focuses on external signals), on-page SEO is about making each page the best possible match for its target keywords and user intent.
The core question for every page: What query is this page trying to rank for, and is every element of this page aligned with that goal?
Optimising Page Titles
The title tag (<title> in HTML) is arguably the most important on-page SEO element. It:
- Appears as the headline in search results (though Google may rewrite it)
- Is one of the strongest relevance signals for ranking
- Influences click-through rate from search results
Best practices:
- Include your primary keyword - towards the start of the title where possible
- Keep it under ~60 characters - longer titles get truncated in search results
- Make it click-worthy - the title must attract clicks, not just rank. Numbers, power words, and clear value propositions help.
- Brand at the end - `Primary Keyword - Secondary Modifier | Brand Name`
- Unique per page - duplicate titles dilute relevance signals and confuse users
Examples:
- Bad: `Home | Our Company`
- Bad: `We offer the best SEO services in the UK for businesses of all sizes at competitive rates`
- Good: `SEO Services UK | Affordable Search Engine Optimisation`
Auditing Page Titles and Canonical Issues
In Screaming Frog, the Page Titles tab shows:
- Missing titles (urgent - Google will generate one)
- Duplicate titles (multiple pages with the same title)
- Titles over 60 characters (will truncate)
- Titles under 30 characters (likely too short/generic)
- Pixel width (more precise than character count)
Canonical interaction: If you have pages with identical titles and a canonical tag pointing to one of them, Screaming Frog's audit will show both issues. Resolve the underlying duplicate content issue (redirect or canonical) rather than just changing the title.
Google Rewriting Page Titles
Since 2021, Google has more aggressively rewritten page titles in search results, displaying text from headings, the <h1>, navigation text, or other prominent on-page text instead of the <title> tag.
When Google rewrites:
- The title is too long and gets truncated
- The title is keyword-stuffed and doesn't match the page content
- The title is generic (e.g., "Home" or "Untitled")
- Google believes anchor text from inbound links is a better descriptor
How to minimise rewrites:
- Write concise, accurate, descriptive titles
- Ensure your `<title>` matches your `<h1>` in topic (they don't have to be identical, but they should be about the same subject)
You can monitor title rewrites in GSC's Performance report: compare the query that drove clicks against the URL, and manually check whether the displayed title in search results matches your title tag.
Meta Descriptions
The meta description is the snippet of text that sometimes appears below your title in search results:
<meta
name="description"
content="Learn everything about SEO in this comprehensive guide. Covering technical SEO, on-page optimisation, and keyword research."
/>
Important: Meta descriptions are not a direct ranking factor - Google has confirmed this. However, they influence click-through rate: a well-written description can significantly increase clicks, which has indirect SEO value.
Best practices:
- Keep under ~155–160 characters (longer gets truncated)
- Include the primary keyword (Google bolds matching terms)
- Write it as an ad - give a reason to click
- Make it unique per page
- Match the content of the page accurately
Google often ignores the meta description and pulls text from the page it thinks is more relevant to the query. This is expected behaviour.
Auditing Meta Descriptions
In Screaming Frog, the Meta Description tab shows:
- Missing descriptions
- Duplicate descriptions
- Descriptions over 155 characters
- Descriptions under 70 characters (probably too short)
Pages missing meta descriptions aren't necessarily broken, but they're a missed opportunity to control the snippet Google shows.
Meta Descriptions Best Practice
Write meta descriptions like mini-ads. Ask: why should someone click this result over the nine others on the page?
Template approach:
- Sentence 1: What does the page offer / what problem does it solve?
- Sentence 2: What's the unique angle or call to action?
Example:
"A complete guide to technical SEO for website owners and marketers. Learn how to fix crawling, indexing, and structured data issues step by step."
Header Tags
Header tags (<h1> through <h6>) define the heading hierarchy of your page. They're an important relevance signal and affect readability.
<h1>Complete Guide to Technical SEO</h1>
<h2>What is Technical SEO?</h2>
<h2>Crawling and Indexing</h2>
<h3>How Googlebot Crawls the Web</h3>
<h3>Managing Crawl Budget</h3>
<h2>Core Web Vitals</h2>
SEO rules for header tags:
- One `<h1>` per page - the main topic of the page; should include or be closely related to your primary keyword
- `<h2>` for major sections - should support and expand on the `<h1>` topic
- Use heading hierarchy properly - don't skip levels (e.g., jumping from `<h1>` to `<h4>`)
- Natural keyword inclusion - headings should describe the section, not be keyword-stuffed
`<h1>` vs `<title>`: They can and should differ. The title tag is for search results. The `<h1>` is for the reader who has landed on your page.
Keyword Research
Search Demand Curve
Not all keywords are equally worth targeting. The search demand curve describes the distribution of search volume across keyword types:
Search
Volume
|
|*
| *
| *
| * **
| ****
| *****
| ********
| **********************
+--------------------------------------------->
Head Mid Long-tail Ultra long-tail
- Head keywords - 1–2 words, very high volume, extremely competitive. E.g., "shoes", "insurance"
- Mid-tail keywords - 2–3 words, moderate volume, moderate competition. E.g., "running shoes men", "car insurance UK"
- Long-tail keywords - 4+ words, lower volume per phrase, lower competition. E.g., "best running shoes for flat feet 2024"
- Ultra long-tail / conversational - very specific queries, very low individual volume but very high collective volume
Key insight: Long-tail keywords convert better. Someone searching "best running shoes for flat feet for marathon training" knows exactly what they want. Someone searching "shoes" might want anything. Start with long-tail where competition is achievable, then build up to mid-tail and head terms.
Google Trends
Google Trends shows the relative search popularity of terms over time. It's free and provides data not available elsewhere.
Use cases:
- Seasonality - understand when searches for your topic peak (e.g., "Christmas gifts" spikes in November/December)
- Rising vs declining topics - don't build content around keywords in terminal decline
- Geographic interest - where are searches for your topic concentrated?
- Related queries - discover related rising topics before they become competitive
- Compare keywords - see which of two terms is searched more
Limitation: Google Trends shows relative interest (0–100 index), not absolute search volume. Combine with a keyword volume tool to get the full picture.
Keyword Research Primer
The keyword research process:
- Seed keywords - brainstorm the core topics your business covers
- Expand - use tools to find related terms, variations, and questions
- Prioritise - evaluate each keyword on volume, difficulty, and business value
- Map to pages - assign keywords to specific pages (or identify gaps requiring new pages)
- Cluster - group semantically related keywords to target together on single pages
Intent classification - every keyword has an underlying intent:
- Informational - "how to install solar panels" (wants information)
- Navigational - "Apple support" (wants a specific site)
- Commercial investigation - "best running shoes 2024" (researching before buying)
- Transactional - "buy running shoes size 10" (ready to buy)
Match your page format to intent. A transactional keyword needs a product/category page. An informational keyword needs a guide or article.
Head/Middle Keyword Research Tools
Google Keyword Planner (free with Google Ads account): Shows search volume ranges and competition data. Better for paid search planning than organic SEO, but useful as a baseline.
Ahrefs Keywords Explorer: Shows monthly volume, keyword difficulty (KD), click-through distribution, parent topic. Best-in-class data quality.
Semrush Keyword Magic Tool: Similar to Ahrefs; strong for finding keyword variations.
Moz Keyword Explorer: Includes "Organic CTR" and "Priority" scores.
What to look for in a head/mid keyword:
- Volume: worth the effort?
- KD/difficulty: is it realistic to rank?
- SERP features: does the result type match your content plan?
- Click potential: some high-volume queries have low clicks because Google answers them directly
Google Keyword Planner Is Not Free
Despite being labelled "free," Google Keyword Planner provides only volume ranges (e.g., "1K–10K") rather than precise numbers unless you're actively running and spending on Google Ads campaigns.
For meaningful keyword research, use a dedicated SEO tool (Ahrefs, Semrush, Moz); these provide more precise volume estimates and organic-search-specific metrics.
Mid/Long-Tail Keyword Research Tools
For long-tail and question-based keywords:
AnswerThePublic: Visualises questions, prepositions, comparisons, and alphabetical variations around a seed keyword. Great for content ideation.
AlsoAsked: Specifically maps "People Also Ask" question hierarchies - shows which questions Google associates with each other. Very useful for FAQ and content structure.
Google Search Console: Shows the actual queries users are already finding your site for. Mine this for keyword opportunities you're ranking in positions 5–20 for (with optimisation you could move to the top 3).
Google autocomplete / Related searches: The suggestions Google offers while typing and at the bottom of search results are real search queries - valuable long-tail signals.
Reddit, Quora, forums: Real language real people use when asking questions in your niche. Invaluable for matching natural language.
AnswerThePublic vs AlsoAsked
| Feature | AnswerThePublic | AlsoAsked |
|---|---|---|
| Data source | Autocomplete suggestions | People Also Ask boxes |
| Output | Questions, prepositions, comparisons | Hierarchical question maps |
| Best for | Content topic generation | Structuring FAQ content, PAA targeting |
| Free tier | Limited daily searches | Limited daily searches |
Use both: AnswerThePublic for breadth of question types, AlsoAsked for understanding how questions relate to each other (useful for structuring a long-form guide).
Effective Zero Volume Keyword Research
Zero volume keywords are keywords that show 0 searches per month in tools - but that doesn't mean nobody is searching for them. Tools have minimum thresholds; queries with very few monthly searches are grouped or excluded.
Why zero volume keywords matter:
- Almost zero competition
- Searchers with very specific intent convert extremely well
- As your domain authority grows, ranking for ultra-specific terms is very achievable
- Collectively, zero-volume long-tail queries represent a huge share of all searches
Finding zero volume keywords:
- GSC search data - real queries you're already getting impressions for, even 1–2/month
- Customer conversations - the exact phrases support tickets, sales calls, and reviews use
- Forum threads - specific technical questions in niche communities
- Competitor FAQs - questions your competitors are answering
- Combining modifiers - take a mid-tail keyword and add qualifiers: location, year, comparison, "for X" (beginner, advanced, small business)
Validation approach: If a keyword returns any results at all in Google - even one or two pages - someone is searching for it. Write a focused, high-quality page targeting it.
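The modifier-combination tactic above can be mechanised in a few lines of script - every term here is an invented example, and each generated candidate still needs the validation check described:

```javascript
// Combine a seed keyword with modifier lists to generate ultra-specific
// long-tail candidates (all terms are illustrative examples).
const seed = "invoice software";
const qualifiers = ["free", "best", "simple"];
const audiences = ["for freelancers", "for small business", "for contractors"];

const candidates = [];
for (const q of qualifiers) {
  for (const a of audiences) {
    candidates.push(`${q} ${seed} ${a}`);
  }
}

console.log(candidates.length); // 9 candidate phrases
console.log(candidates[0]); // "free invoice software for freelancers"
```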
Rank Math SEO (WordPress Plugin)
Introduction to Rank Math SEO
Rank Math is a WordPress SEO plugin (alternative to Yoast SEO) that handles on-page SEO configuration through the WordPress admin interface. It's designed for site owners and content editors who want to manage SEO without editing code directly.
Core capabilities:
- Set title tags and meta descriptions per page/post
- Manage redirects
- Add structured data (schema markup)
- Analyse on-page SEO with a built-in scorer
- Integrate with Google Search Console
- Handle XML sitemaps
Managing Redirects in Rank Math SEO
Rank Math includes a Redirect Manager that lets you create and manage 301/302 redirects without server access.
Use cases:
- Redirect old post slugs after URL changes
- Set up redirect patterns for entire URL structures
- Create temporary 302 redirects for campaigns
Limitations: Plugin-based redirects are slower than server-level redirects because WordPress must load before the redirect fires. For high-traffic sites or complex migrations, server-level redirects (.htaccess, Nginx config) are preferable.
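For comparison, the server-level equivalent of a single plugin redirect is one line in Apache's `.htaccess` (paths and hostname are examples):

```apache
# mod_alias: permanent redirect handled before WordPress even loads
Redirect 301 /old-post/ https://www.example.com/new-post/
```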
Adding Structured Data in Rank Math SEO
Rank Math provides a Schema Builder with templates for:
- Article
- Product
- Recipe
- Local Business
- Event
- FAQ
- How-To
You can add schema without writing JSON-LD manually, using a visual interface to fill in properties.
Limitation: The built-in templates cover the most common types. For complex or custom schema, you may still need to write raw JSON-LD and embed it in the page.
On-Page SEO with Rank Math SEO
Rank Math's content analysis provides a checklist-style score for each page:
- Is the focus keyword in the title?
- Is the focus keyword in the meta description?
- Is the focus keyword in the URL?
- Is the focus keyword in the `<h1>`?
- Is the content long enough?
- Are there internal and external links?
- Are images present with alt text?
Caveat: These scores are guides, not strict rules. A page can rank without ticking every box. Over-optimising for keyword density in particular can make content read unnaturally.
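Under the hood, checks like these reduce to simple string tests. A minimal sketch (illustrative only, not Rank Math's actual implementation; the threshold values are common rules of thumb, not Rank Math's):

```python
import re

def basic_onpage_checks(focus_keyword, title, meta_description, url, h1, body):
    """Rough on-page checks in the spirit of a plugin scorer (illustrative only)."""
    kw = focus_keyword.lower()
    words = re.findall(r"\w+", body.lower())
    # Crude density for a single-word keyword; real tools handle phrases too
    density = words.count(kw) / len(words) * 100 if words else 0
    return {
        "keyword_in_title": kw in title.lower(),
        "keyword_in_meta": kw in meta_description.lower(),
        "keyword_in_url": kw.replace(" ", "-") in url.lower(),
        "keyword_in_h1": kw in h1.lower(),
        "density_ok": 0.5 <= density <= 2.5,  # rule-of-thumb range
        "length_ok": len(words) >= 300,       # rule-of-thumb minimum
    }

checks = basic_onpage_checks(
    "seo",
    title="Complete SEO Guide",
    meta_description="Everything you need to know about SEO",
    url="https://example.com/seo-guide/",
    h1="Complete SEO Guide",
    body="SEO basics explained simply. " * 10,
)
```

The returned dictionary maps each checklist item to a pass/fail flag, mirroring the tick-box report the plugin shows in the editor.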
Additional Rank Math SEO Features
- Local SEO module - adds local business schema and manages location-specific SEO
- WooCommerce SEO - extends on-page features to product pages and categories
- Image SEO - automatically adds alt text to images based on filename or title
- News SEO - adds Google News sitemap support
- Google Search Console integration - shows GSC performance data within WordPress dashboard
InLinks
Overview of InLinks
InLinks is an SEO platform focused on semantic SEO, internal linking automation, and content optimisation. Its core thesis is that Google understands topics and entities, not just keywords - and that connecting related pages through semantically rich internal links helps signal topical authority to Google.
InLinks works by analysing your content, identifying entities (people, places, things, concepts), and suggesting or automating internal link connections between related pages.
Basic Internal Linking with InLinks
At its simplest, InLinks scans your site, identifies pages that share topical relevance, and suggests internal links to add between them.
Workflow:
- Connect your site to InLinks
- InLinks crawls and analyses your content
- Review suggested internal links
- Approve or customise links, which InLinks injects via a JavaScript snippet
The JavaScript approach means links can be added without editing individual pages. The trade-off is that Google only discovers JS-injected links after rendering, which is slower and less reliable than links present in the raw HTML - so where possible, baking the links into the CMS output is preferable.
Advanced Internal Linking with InLinks
Advanced InLinks features include:
- Topic wheel - visualises which topics your site covers and how they connect
- Entity coverage - shows which entities are mentioned across your site and how well they're linked
- Anchor text management - ensures anchor text is semantically relevant and varied
- Bulk link suggestions - identify the highest-priority internal link opportunities across the whole site
Internal Linking Strategy with InLinks
InLinks advocates a topic cluster approach:
- Identify a core topic (the "hub")
- Create a comprehensive hub page
- Create supporting "spoke" pages covering sub-aspects of the topic
- Link bidirectionally between hub and spokes, and between related spokes
- Ensure all related pages reference the same key entities
This structure signals to Google that your site has deep, comprehensive coverage of a topic - which supports rankings across the entire cluster.
Internal Linking for Large and E-Commerce Websites
For large sites, manual internal linking is impractical. InLinks' automation addresses this:
- Programmatic links - rules-based linking (e.g., "any page mentioning 'running shoes' links to the running shoes category page")
- E-commerce category links - automatically link product pages to their parent categories and related products
- Breadcrumb reinforcement - ensure breadcrumbs are semantically meaningful, not just navigational
For large sites, even small improvements to internal linking at scale can have significant aggregate impact.
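A rules-based linking pass of this kind can be sketched in a few lines. This is a simplified stand-in for what such automation does, not InLinks' actual code; the rule phrases and URLs are hypothetical:

```python
import re

# Hypothetical rules: phrase -> target category page
LINK_RULES = {
    "running shoes": "/category/running-shoes/",
    "trail running": "/category/trail-running/",
}

def apply_link_rules(html, rules, max_links_per_phrase=1):
    """Wrap the first occurrence of each rule phrase in a link to its target.

    Naive sketch: a production tool must also skip text that is already
    inside an <a> tag or an attribute value.
    """
    for phrase, url in rules.items():
        pattern = re.compile(re.escape(phrase), re.IGNORECASE)
        html, _ = pattern.subn(
            lambda m: f'<a href="{url}">{m.group(0)}</a>',
            html,
            count=max_links_per_phrase,  # cap to avoid over-linking one phrase
        )
    return html

page = "<p>Our running shoes are built for trail running.</p>"
linked = apply_link_rules(page, LINK_RULES)
```

Capping links per phrase matters: repeating the same anchor on every mention dilutes the signal and reads as spammy.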
Internal Link Auditing with InLinks
InLinks' audit identifies:
- Pages with no internal links pointing to them (orphan pages)
- Pages with links only from navigation (may need contextual links)
- Underlinked important pages (high business value but few incoming links)
- Over-linked pages (too many links diluting PageRank)
Cross-reference this with your GSC performance data: if an important page has poor rankings and few internal links, that's a clear action item.
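The audit logic reduces to set operations over crawl data. A hedged sketch, assuming you already have a map of contextual (non-navigation) links per page:

```python
# Hypothetical crawl data: page -> set of pages it links to contextually
contextual_links = {
    "/": {"/seo-guide/", "/services/"},
    "/seo-guide/": {"/services/"},
    "/services/": set(),
    "/old-case-study/": set(),  # nothing links to this page
}

all_pages = set(contextual_links)
linked_to = set().union(*contextual_links.values())

# Orphans: no incoming contextual links. The homepage appears here because
# navigation links are excluded; a real audit would whitelist it.
orphans = all_pages - linked_to

inbound_counts = {p: sum(p in targets for targets in contextual_links.values())
                  for p in all_pages}
underlinked = {p for p, n in inbound_counts.items() if 0 < n <= 1}
```

Feeding real crawl output (e.g. from Screaming Frog or a site crawler) into this structure reproduces the orphan and underlinked reports described above.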
Schema Markup with InLinks
InLinks includes a schema markup tool that auto-generates and injects structured data based on entity recognition. When InLinks identifies that a page is about a specific entity (a person, organisation, product), it can apply the appropriate schema type.
This reduces the manual work of writing JSON-LD for every page, particularly valuable for large sites with many similar page types.
Content Basics with InLinks
InLinks provides content analysis that goes beyond keyword density:
- Entity coverage - which relevant entities does the content mention? Missing entities may be gaps.
- Topic depth - compared to top-ranking competitors, how comprehensively does the page cover the topic?
- NLP analysis - how does Google's Natural Language API interpret your content?
Run any page you're trying to rank through InLinks' content tool and compare it against the top 10 results for your target keyword.
Advanced Content with InLinks
Advanced content features:
- Content briefs - automatically generated outlines based on what top-ranking content covers
- Competitor analysis - entity and topic gap analysis vs. ranking competitors
- Content score - composite score based on semantic coverage, not just keyword frequency
The goal: write content that covers a topic comprehensively from Google's perspective (entities, related concepts, subtopics) - not content that mentions a keyword a specific number of times.
Social Media with InLinks
InLinks includes social media optimisation features - Open Graph and Twitter Card tag management - to control how your content appears when shared on social platforms.
```html
<!-- Open Graph tags (Facebook, LinkedIn) -->
<meta property="og:title" content="Complete SEO Guide" />
<meta
  property="og:description"
  content="Everything you need to know about SEO"
/>
<meta property="og:image" content="https://example.com/seo-guide-cover.jpg" />

<!-- Twitter Card tags -->
<meta name="twitter:card" content="summary_large_image" />
<meta name="twitter:title" content="Complete SEO Guide" />
```

While social signals are not a direct Google ranking factor, social sharing increases the reach of your content, which leads to more natural backlinks - and backlinks are a ranking factor.
The Perfect Content Plan with InLinks
InLinks' content planning brings together:
- Keyword research - identify target queries with sufficient demand
- Intent analysis - understand what type of content ranks for each query
- Entity map - list the entities that must be covered to satisfy Google's understanding of the topic
- Competitor gap analysis - what are the top-ranking pages covering that yours doesn't?
- Internal link plan - which existing pages will link to the new content?
- Schema plan - which structured data types apply?
- Success metrics - what position, CTR, or traffic target will indicate success?
A structured content plan eliminates guesswork and aligns your creation effort with the signals Google actually uses to evaluate content quality.
Summary: The SEO Hierarchy
Everything in this course connects back to a single hierarchy of priorities:
- Can Google find your pages? → Technical SEO (crawling, robots.txt)
- Can Google understand your pages? → Technical SEO (rendering, structured data)
- Will Google trust your pages? → Authority (links, canonical, HTTPS)
- Are your pages the best result? → On-page SEO + keyword research + content quality
- Is the experience good? → Page experience, Core Web Vitals
Work through this hierarchy top to bottom. A site with crawl blocks has no chance of ranking regardless of content quality. A site that's technically perfect but has thin content won't outrank comprehensive competitors. Get the foundations right, then layer quality content on top.
End of course.