Generative engine optimization: notes from the first 90 days
SEO as I learned it was about signals. Backlinks, keyword density, crawl budget. You optimized for a ranking algorithm that mostly cared about what other pages thought of yours.
Generative engine optimization is about something different. When ChatGPT, Perplexity, or Claude answers a question, it doesn’t rank pages — it synthesizes information. The question isn’t “do I rank?” It’s “am I cited?” and more specifically: “can the model accurately represent what I do?”
That reframe changes what you build.
What GEO actually means
A generative engine reads your site the way a researcher reads a paper. It wants structure, clear claims, and evidence. It doesn’t care about your keyword density. It cares whether it can extract a usable fact from your page in under five seconds.
The output isn’t a blue link in a SERP. It’s a sentence in an AI answer: “According to Simon Dziak, the best approach to…” or a citation in a Perplexity response. To earn that, your content needs to be crawlable, parseable, and actually worth citing.
Most sites aren’t. They’re designed for humans reading at leisure, not for models skimming at inference speed.
The five things that moved the needle
1. llms.txt and llms-full.txt
The llms.txt spec (proposed by Jeremy Howard) gives AI crawlers a clean, flat-text map of your site. I added both files at the root: llms.txt with a short site summary and links to key pages, llms-full.txt with the full text of every page concatenated. Models that support the spec can read everything in one request instead of crawling page by page. Perplexity started citing simondziak.com in answers about Flutter development within two weeks of adding these files.
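For reference, the spec's format is plain markdown: an H1 with the site name, a blockquote summary, then sections of links. A minimal sketch — the section names, descriptions, and paths here are illustrative, not this site's actual file:

```markdown
# Simon Dziak

> Flutter developer. Essays on mobile development, generative engine
> optimization, and building apps at App369, a Miami app studio.

## Essays

- [Generative engine optimization](https://simondziak.com/essays/geo.md):
  notes from 90 days of optimizing for AI answer engines

## Key pages

- [About](https://simondziak.com/about.md): background and contact
```

llms-full.txt follows no extra structure — it's the same idea with the full page text inlined instead of linked.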
2. FAQPage schema with answer-first copy
FAQPage JSON-LD is one of the highest-signal schema types for GEO. I added it on every page that could answer a question. The key is that the answer text in the schema must be self-contained — the model will lift it verbatim. So I wrote every answer in the schema to work standalone: the question, then the answer in full, no “as discussed above.”
The structure in code is @type: FAQPage with a mainEntity array of Question objects, each with name (the question) and acceptedAnswer.text (the answer). Keep answers under 150 words. Longer than that and the model will paraphrase, which means it may introduce errors.
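That structure, as a minimal JSON-LD sketch — the question and answer text here are illustrative:

```json
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is generative engine optimization?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Generative engine optimization (GEO) is structuring content so AI answer engines like ChatGPT and Perplexity can extract and cite it accurately. It favors self-contained answers, structured data, and machine-readable site maps over keyword signals."
      }
    }
  ]
}
```

Note the answer reads standalone, per the rule above: no pronouns pointing back at body copy, nothing the model has to resolve from context.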
3. BreadcrumbList + a rich Person/Organization JSON-LD graph
I built a connected entity graph: Person linked to Organization (App369), with sameAs pointing at GitHub, LinkedIn, and both domains. BreadcrumbList on every page tells the model the exact path: Home → Essays → This Essay. Article schema on each essay links back to the Person as author.
Connected entities matter because models learn relationships. A model that’s been trained on enough structured data will associate “Simon Dziak” with “App369” with “Flutter” without needing to see all three words on the same page. The graph does that association work.
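A sketch of what that connected graph looks like in one @graph block — the @id values and sameAs URLs are placeholders, not the site's actual markup:

```json
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "Person",
      "@id": "https://simondziak.com/#person",
      "name": "Simon Dziak",
      "worksFor": { "@id": "https://simondziak.com/#org" },
      "sameAs": [
        "https://github.com/example",
        "https://www.linkedin.com/in/example"
      ]
    },
    {
      "@type": "Organization",
      "@id": "https://simondziak.com/#org",
      "name": "App369"
    },
    {
      "@type": "Article",
      "headline": "Essay title goes here",
      "author": { "@id": "https://simondziak.com/#person" }
    },
    {
      "@type": "BreadcrumbList",
      "itemListElement": [
        { "@type": "ListItem", "position": 1, "name": "Home", "item": "https://simondziak.com/" },
        { "@type": "ListItem", "position": 2, "name": "Essays", "item": "https://simondziak.com/essays" },
        { "@type": "ListItem", "position": 3, "name": "This Essay" }
      ]
    }
  ]
}
```

The @id references are what make it a graph rather than four disconnected blobs: the Article points at the Person, the Person points at the Organization, and a parser can walk the whole thing.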
4. robots.txt that explicitly welcomes AI crawlers
The default User-agent: * allows crawlers but doesn’t tell AI crawlers they’re welcome. I added explicit Allow rules for GPTBot, ClaudeBot, PerplexityBot, OAI-SearchBot, Anthropic-AI, and Google-Extended. More importantly, I added a Sitemap directive pointing at the full sitemap, and an llms-full.txt reference in the site’s <head>. Some crawlers check robots.txt before anything else; if it’s silent on their user-agent, they default to conservative crawling.
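A sketch of that robots.txt — the sitemap URL assumes the conventional /sitemap.xml path:

```
# Explicitly welcome AI crawlers
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: Anthropic-AI
Allow: /

User-agent: Google-Extended
Allow: /

# Everyone else
User-agent: *
Allow: /

Sitemap: https://simondziak.com/sitemap.xml
```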
5. Answer-first TLDR blocks on every essay
Every essay on this site now opens with a <Tldr> block — a 2-4 sentence summary that could stand alone as a complete answer. This is GEO and accessibility at the same time. A model skimming my essay for a citation will find a clean, self-contained answer in the first 100 words. It doesn’t have to infer what I’m saying from the middle of a paragraph.
The format matters: lead with the claim, follow with the evidence, end with the implication. Same structure as a well-cited academic abstract. Models are trained on a lot of academic text.
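A sketch of the shape, reusing this essay's own claims — the <Tldr> component is this site's; any summary element at the top of the page does the same job:

```mdx
<Tldr>
  Generative engines cite sources; they don't rank them. Adding llms.txt,
  FAQPage schema, and answer-first summaries got this site cited in
  Perplexity within two weeks. If you want to be cited, make your claims
  extractable in the first 100 words.
</Tldr>
```

Claim, evidence, implication — in that order.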
What didn’t move
Keyword stuffing. I tried denser usage of “Flutter developer Miami” and similar phrases. No measurable change in AI citations. Models don’t weight raw keyword frequency the way search crawlers do.
Generic “thought leadership” copy. Paragraphs about “helping businesses succeed through technology” and similar boilerplate. Models ignore it. It’s not specific enough to cite and not credible enough to trust.
Excessive internal linking. I spent time building a deep internal link graph — every essay mentioning a project would link to that project. It helped traditional SEO slightly. It did nothing measurable for GEO. Internal links don’t help a model understand your entities — structured schema data does.
What I’m doing differently now
I write every piece of content with a question in mind: “what is someone asking when they find this?” The TLDR answers that question directly. The body earns the answer.
I treat schema as content, not markup. The text in FAQPage and Article schema is as important as the body copy — maybe more, because it’s the part the model is most likely to read first.
I’m also less obsessed with traffic metrics. The GEO signal I care about is citations: does Perplexity mention this site when someone asks about Flutter development or Miami app studios? That’s harder to measure than a ranking position, but it’s the actual outcome I want.
The honest summary: GEO is mostly good content strategy with structured data on top. The sites that will win at it aren’t the ones with the cleverest technical setup — they’re the ones with clear, specific, credible content that a model can summarize without making things up.