🔑 Layer 03

Keyword Research

Direction sets your traffic ceiling

📖 13 min read 🕑 Updated 2026-06-22

If the foundational layer teaches you how search engines work, the Keyword Research layer answers a far more practical question: what are real people actually typing into the search box, and who should you be writing for? It sits between “pouring the foundation” and “raising the walls.” Crawling and indexing decide whether a page can be seen; keywords decide whether being seen is worth anything at all.

Here is the mental model to lock in before anything else: direction sets your traffic ceiling. You can write perfectly semantic HTML, ship a flawless sitemap, and score 100 on Lighthouse — but if you aim all of that at a phrase nobody searches, or at a phrase a hundred established sites already dominate, you have built a beautiful house on a road with no traffic. Keyword research is how you choose the road before you start building. Get it right and mediocre execution still earns clicks; get it wrong and brilliant execution earns nothing.

The rest of this layer walks through four things in order: the types of keywords, the metrics that tell you if a keyword is worth chasing, how to map keywords to actual pages, and the tools that automate the grunt work.

Keyword Types

Think of keywords as queries at different levels of granularity — from broad and generic down to narrow and specific.

Seed keywords are the broad, one-or-two-word core terms of your topic: seo, python, kubernetes, email marketing. Search volume is enormous, but so is competition, and intent is hopelessly vague — someone searching python might want the language, the snake, or Monty Python. You will almost never rank for a seed keyword as a new site. Their real job is to be the starting point you expand from.

Long-tail keywords are specific queries, usually three or more words, that express a clear need: how to generate a sitemap for an astro site, fix cumulative layout shift on mobile, best free keyword tool for small blogs. Each one has modest volume on its own, but collectively long-tails account for the majority of all search traffic (the classic “long tail” of the distribution). They are easier to rank for, their intent is unambiguous, and they convert better. This is the main battlefield for anyone starting out.

LSI / semantically related keywords (the “LSI” label is borrowed loosely from “Latent Semantic Indexing”) are the words that naturally co-occur with a topic. Write a genuine article about keyword difficulty and terms like search volume, SERP, backlinks, and competitor show up on their own. Modern search engines use these co-occurrences to confirm your page is genuinely about the topic, not just repeating one phrase. You do not “insert” LSI terms mechanically; you earn them by actually covering the subject well.

Here is the progression as a quick reference:

Type | Example | Volume | Competition | Intent clarity | Use it for
Seed | seo | Very high | Brutal | Vague | Expansion starting point
Long-tail | how to add canonical tags in astro | Low–medium | Manageable | Sharp | The pages you actually rank
LSI / related | meta tags, crawl budget, SERP | n/a | n/a | Supports topic | Proving topical depth

🧑‍💻 Developer’s view: think of it like API design. A seed keyword is GET /search — broad, generic, matches everything and therefore nothing. A long-tail keyword is GET /search?q=astro&type=sitemap&platform=cloudflare — the more specific the parameters, the more precise the hit and the fewer competitors fighting for that exact route. LSI terms are the other fields in the response payload that prove the endpoint really handles this resource.

Metrics

Choosing keywords is not a gut feeling; it is reading a small dashboard of numbers and weighing them against each other. Four metrics matter most.

Search volume is the rough number of monthly searches for a term. It is the first number everyone looks at and the most overrated. High volume is meaningless on its own — you must weigh it against the three metrics below. Treat the numbers tools give you as ranges and signals, not precise truth; they are modeled estimates, and different tools disagree by a lot.

Keyword difficulty (KD) is usually a 0–100 score estimating how hard it is to reach page one, mostly based on how strong the pages already ranking are (their backlinks, authority, and content depth). KD is not standardized — Ahrefs’s KD 30 is not Semrush’s KD 30 — so use it relatively within one tool. As a new site, prioritize low KD with at least some volume. A rough starting filter:

# A reasonable first-pass filter for a brand-new site
KD < 20  AND  volume >= 100  AND  intent matches your goal
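
The same filter can be sketched in Python. This is a minimal illustration, not a prescribed workflow: the `Keyword` record, its field names, and the sample numbers are all hypothetical, and real values would come from whichever tool's export you use.

```python
from dataclasses import dataclass


@dataclass
class Keyword:
    # Hypothetical record; field names are illustrative, not any tool's export format.
    term: str
    kd: int       # keyword difficulty, 0-100, taken from a single tool
    volume: int   # estimated monthly searches
    intent: str   # e.g. "informational", "transactional"


def first_pass(candidates, goal_intent, max_kd=20, min_volume=100):
    """Keep terms with KD < max_kd, volume >= min_volume, and a matching intent."""
    return [
        k for k in candidates
        if k.kd < max_kd and k.volume >= min_volume and k.intent == goal_intent
    ]


# Made-up sample data for illustration only.
candidates = [
    Keyword("seo", 92, 500_000, "informational"),
    Keyword("how to add canonical tags in astro", 8, 150, "informational"),
    Keyword("seo audit pricing", 15, 300, "transactional"),
    Keyword("astro sitemap trailing slash", 5, 40, "informational"),
]

shortlist = first_pass(candidates, goal_intent="informational")
print([k.term for k in shortlist])
```

Keeping the thresholds as parameters makes it easy to loosen them later as the site gains authority.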

Click-through rate (CTR) is the share of searchers who actually click a result at a given position, and the share who click any organic result at all. This is the metric beginners forget. Many queries are now answered directly on the results page by a featured snippet, an AI overview, a knowledge panel, or are buried under four ads — so a query with 50,000 searches might send almost no clicks to organic results. A question-style long-tail with 800 searches and a clean SERP can out-earn it.
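
A back-of-the-envelope sketch makes the comparison concrete. Every rate below is an assumption invented for illustration, not a measured value; substitute what you actually observe on the SERPs you care about.

```python
def estimated_clicks(monthly_searches, organic_click_share, position_ctr):
    """Rough monthly clicks: searches x share of searchers who click any
    organic result x share of those clicks that go to your position."""
    return monthly_searches * organic_click_share * position_ctr


# Crowded SERP: snippet, AI answer, and ads absorb most clicks (assumed rates).
crowded = estimated_clicks(50_000, organic_click_share=0.05, position_ctr=0.10)

# Clean SERP for a question-style long-tail (assumed rates).
clean = estimated_clicks(800, organic_click_share=0.90, position_ctr=0.40)

print(round(crowded), round(clean))
```

Under these assumed rates the 800-search query edges out the 50,000-search one, which is the point of checking the SERP before trusting volume.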

Commercial value (intent value) is how close the person behind the query is to your goal — a sale, a signup, a lead. what is seo is educational traffic: huge volume, low commercial value, far from any purchase. seo audit pricing is business traffic: lower volume, high commercial value, one step from a transaction. Neither is “better” — they serve different stages of the funnel — but you must know which one you are chasing and why.

Metric | What it tells you | Common trap
Search volume | Size of the opportunity | Chasing it in isolation
Keyword difficulty (KD) | How hard page one is | Comparing KD across different tools
Click-through rate (CTR) | Whether the clicks actually exist | Ignoring snippets, ads, and AI answers
Commercial value | How close to your goal | Filling a site with zero-value info traffic

⚠️ Caution: do not chase volume alone. One term at KD 70 with 50,000 searches is far less attainable than ten long-tails at KD 15 with 500 searches each. The ten long-tails add up to only 5,000 searches a month, a tenth of the headline number, but you can realistically rank for all of them, and each one targets a sharper need.

Once you have a candidate term, sanity-check the real SERP before trusting any number. Search it yourself, or use the SERP preview tool to see how the page one results actually look — if the entire first page is dominated by huge brands, the tool’s “KD 18” is lying to you.

Mapping to Pages

You have a pile of terms. The single most important move now is group → map, not building one page per keyword. Spinning up a separate page for every variation is the classic beginner mistake; it scatters your authority and makes your own pages compete with each other (this is called keyword cannibalization, and it quietly suppresses all the cannibalizing pages at once).

The fix is to organize by search intent. Every query falls into one of four intent buckets:

  • Informational — the user wants to learn something: what is keyword difficulty, how does crawling work.
  • Navigational — the user wants a specific site or page: ahrefs login, google search console.
  • Commercial investigation — the user is comparing options before deciding: ahrefs vs semrush, best keyword research tool.
  • Transactional — the user is ready to act: buy ahrefs subscription, seo audit pricing.

The rule that follows: one intent + one topic = one page. Cluster every term that shares the same intent and overlapping meaning into a single topic cluster, and point that cluster at exactly one URL. All the long-tail variants of “how do I create a sitemap” belong on the same page — they are the same need phrased ten ways, not ten pages.
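
In code terms, the rule amounts to grouping by a composite key. A minimal sketch, assuming each term has already been labeled by hand with an intent and a topic (the labels and sample rows here are hypothetical):

```python
from collections import defaultdict

# (keyword, intent, topic) triples; labels are hand-assigned and made up here.
labeled = [
    ("how do i create a sitemap", "informational", "sitemap"),
    ("generate sitemap astro", "informational", "sitemap"),
    ("sitemap xml example", "informational", "sitemap"),
    ("ahrefs vs semrush", "commercial", "keyword-tools"),
    ("best keyword research tool", "commercial", "keyword-tools"),
]

clusters = defaultdict(list)
for keyword, intent, topic in labeled:
    # One (intent, topic) pair is one cluster, and therefore one page.
    clusters[(intent, topic)].append(keyword)

for (intent, topic), keywords in clusters.items():
    print(f"{intent}/{topic}: {len(keywords)} keywords -> 1 page")
```

Five phrasings collapse into two pages here; the labeling step is still a human judgment call, and the script only enforces the grouping.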

Finally, translate those clusters into your actual URL and directory structure. The grouping is your information architecture. Here is a worked keyword → intent → target page mapping table — this is the deliverable you are aiming for:

Keyword cluster (lead + variants) | Intent | Target page
what is seo, seo meaning, how seo works | Informational | /en/layers/foundations/
keyword difficulty, what is kd, how to read keyword difficulty | Informational | /en/layers/keyword-research/
ahrefs vs semrush, best keyword tool, semrush alternatives | Commercial | /en/compare/keyword-tools/
serp preview tool, google snippet preview, check title length | Transactional / tool | /en/tools/serp-preview/

Notice each row is one page absorbing several phrasings. That table is not busywork — it becomes the blueprint for everything downstream. When you write the page, the lead keyword guides your <title> and <h1>, the variants become your <h2> subheadings, and the intent tells you what the page must do (teach? compare? convert?).

<!-- The mapping table directly drives the on-page tags -->
<title>Keyword Difficulty (KD), Explained for Beginners</title>
<h1>Keyword Difficulty: How to Read It and When to Trust It</h1>
<!-- variants become section headings -->
<h2>What does keyword difficulty actually measure?</h2>
<h2>How to read a KD score without getting fooled</h2>

💡 Tip: keep the table simple — three columns is enough: keyword | intent | target URL. Maintain it in a spreadsheet or a CSV checked into your repo. Every new term you discover gets slotted into an existing cluster or starts a new row; you never again wonder “do I have a page for this?”
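
If the table lives in a CSV, a short script can flag any keyword that has been mapped to two different URLs before it turns into cannibalization. This is a sketch under assumptions: the column names (`keyword`, `intent`, `target_url`) and the `/en/blog/kd-explained/` row are hypothetical, and the inline string stands in for a file in your repo.

```python
import csv
import io
from collections import defaultdict

# Stand-in for a keyword-map CSV checked into the repo; contents are made up.
SAMPLE_CSV = """keyword,intent,target_url
what is seo,informational,/en/layers/foundations/
keyword difficulty,informational,/en/layers/keyword-research/
what is kd,informational,/en/layers/keyword-research/
keyword difficulty,informational,/en/blog/kd-explained/
"""


def find_conflicts(csv_text):
    """Return keywords that are mapped to more than one target URL."""
    urls_by_keyword = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        keyword = row["keyword"].strip().lower()
        urls_by_keyword[keyword].add(row["target_url"].strip())
    return {kw: sorted(urls) for kw, urls in urls_by_keyword.items() if len(urls) > 1}


conflicts = find_conflicts(SAMPLE_CSV)
print(conflicts)
```

A check like this could run in CI so that a conflicting row is caught at review time; an empty result means every keyword has exactly one target page.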

🧑‍💻 Developer’s view: this is a routing table. Keywords are incoming requests, intent is the matcher, and the target URL is the handler. Cannibalization is two routes matching the same request — the framework (Google) picks one arbitrarily and both suffer. One canonical handler per request pattern is as true in SEO as it is in your router config.

Research Tools

You do not invent keywords from your head — you mine them with tools, then validate them. Here is how the major options compare, so you can pick by budget and stage.

Tool | Best at | Cost | When to reach for it
Google Search Console | Terms you already rank for, with real clicks & impressions | Free | Always: start here, it is your own ground truth
Google Keyword Planner | Official volume ranges, straight from Google’s ad data | Free (needs an Ads account) | Validating volume; ideas around seeds
AnswerThePublic | Question / preposition long-tails around a seed | Partly free | Brainstorming informational long-tails
Ahrefs | KD, volume, and competitor reverse-lookup; deep data | Paid | Bulk KD checks; stealing competitors’ keywords
Semrush | Keyword Magic Tool, built-in intent labels, comparison | Paid | Large-scale research with intent tagging

Starting with zero budget? Run this loop:

  1. Open Search Console and find your “low-hanging fruit” — queries where you already get impressions but rank on page two (position 11–20). These are terms Google already thinks you are almost relevant for; a little dedicated content often pushes them onto page one fast.
  2. Take your best seeds into AnswerThePublic and Keyword Planner to expand into dozens of long-tails and get rough volume.
  3. Validate the survivors by searching the real SERP (or the SERP preview tool) to judge true difficulty by eye.

Have a budget? Ahrefs or Semrush collapse days of manual work into minutes: check KD on hundreds of terms at once, and — the real superpower — reverse-look-up exactly which keywords your competitors rank for so you can target the gaps they left open.

Search Console deserves special emphasis because it is free, it is your data, and it cannot be gamed. To find page-two opportunities, open Search Console → Performance, enable the Average position metric, and sort or filter for queries sitting around positions 11–20 with healthy impressions.

# Search Console (and most paid tools) expose an API, so automate the boring part.
# Pull your query data, then filter candidates by a threshold, e.g.:
#   position between 8 and 20  AND  impressions > 200  AND  ctr < 2%
# Those are pages one good edit away from real traffic.
# $SITE must be the URL-encoded property, e.g. https%3A%2F%2Fexample.com%2F
curl -s -X POST \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  "https://www.googleapis.com/webmasters/v3/sites/$SITE/searchAnalytics/query" \
  -d '{"dimensions":["query"],"startDate":"2026-05-01","endDate":"2026-06-01"}'
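
Once the response is saved, the threshold rule from the comment above takes only a few lines of Python. This is a sketch, not an official client: the sample rows are invented, and the thresholds simply mirror the example rule and are exposed as parameters. It assumes the usual response shape, where each row carries `keys`, `clicks`, `impressions`, `ctr` (a fraction between 0 and 1), and `position`.

```python
import json

# Hand-written sample shaped like a Search Analytics query response;
# the queries and numbers are invented for illustration.
SAMPLE_RESPONSE = json.loads("""
{"rows": [
  {"keys": ["astro sitemap"], "clicks": 4, "impressions": 900, "ctr": 0.0044, "position": 12.3},
  {"keys": ["what is kd"], "clicks": 60, "impressions": 1200, "ctr": 0.05, "position": 3.1},
  {"keys": ["fix cls mobile"], "clicks": 1, "impressions": 50, "ctr": 0.02, "position": 15.0}
]}
""")


def near_miss_queries(response, min_pos=8, max_pos=20, min_impressions=200, max_ctr=0.02):
    """Return rows that rank just off the top spots, are seen often, and are rarely clicked."""
    picked = [
        row for row in response.get("rows", [])
        if min_pos <= row["position"] <= max_pos
        and row["impressions"] > min_impressions
        and row["ctr"] < max_ctr
    ]
    # Most-seen queries first: they have the most traffic to gain.
    return sorted(picked, key=lambda row: row["impressions"], reverse=True)


candidates = near_miss_queries(SAMPLE_RESPONSE)
print([row["keys"][0] for row in candidates])
```

To use it on real data, replace `SAMPLE_RESPONSE` with the parsed JSON from the API call and tighten `min_pos` to 11 if you want strictly page-two queries.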

🧑‍💻 Developer’s view: nearly every tool here offers an API or CSV export. There is no reason to eyeball thousands of rows by hand. Pull the data, drop it into a script or a notebook, and auto-filter candidates against a rule like KD < 20 && volume >= 100 && position <= 20. Treat keyword selection as a small data pipeline, not a manual chore.

Summary

The heart of this layer is one sentence: direction sets your traffic ceiling. Before you optimize a single tag, decide which terms are worth the effort — and that decision is never about volume alone. It is about matching low-difficulty, real-intent keywords to the right page, grouped by intent so your own pages never fight each other. Pick the right road, and every later layer compounds on a solid foundation; pick the wrong one, and there is nothing to compound.

✅ Checklist

  • Pick 1–2 seed keywords and expand them into at least 30 long-tail keywords with a tool
  • Annotate every candidate with search volume, KD, and search intent
  • Cut terms whose KD is too high or whose intent does not match your goal
  • Cluster the survivors into topic clusters by intent — one cluster maps to exactly one page
  • Build a keyword | intent | target URL mapping table and keep it in your repo
  • Open Search Console and harvest page-two “low-hanging fruit” (position 11–20 with impressions)
  • Sanity-check your top terms against the real SERP before committing to them