Glossary · 25 terms · 2026

AI Glossary
& GEO/AEO.

25 terms from the world of AI citation in 2026: AI Overviews, Speakable schema, llms.txt, Article schema, RAG, embeddings, prompt injection, DPA. For businesses that want to be cited by ChatGPT, Claude, Gemini, Perplexity, Bing Chat.

AI Overviews

AI Overviews are synthetic answers generated by the Gemini model, displayed above Google Search results. The feature launched in the US in 2024, in Poland for 15-25% of informational queries in 2026. It cites 3-7 pages as sources. For a small local business AI Overviews is a new traffic source that requires GEO/AEO optimization.

GEO (Generative Engine Optimization)

GEO is the optimization of a website for citations by generative search engines: AI Overviews, Perplexity, ChatGPT Search, Bing Chat, Claude. The concept was introduced academically in 2024 (Aggarwal et al., Northwestern University) as an extension of SEO. In practice GEO and AEO are used interchangeably.

AEO (Answer Engine Optimization)

AEO is the optimization for direct answers in search results: Featured Snippets, AI Overviews, People Also Ask. A narrower concept than GEO. AEO key: answer-first content (a short answer in the first 2 sentences of the article), FAQPage schema, Speakable schema.

Speakable schema

Speakable schema (SpeakableSpecification) is a JSON-LD type that tells search engines which fragments of the page are "good for spoken citation". Introduced by Google in 2018 for Google Assistant, in 2026 used by AI Overviews. Format: cssSelector list of CSS selectors (e.g. `.answer-first`, `.aeo-tldr`).

llms.txt

llms.txt is a text file (Markdown) in the root of the domain that explains the structure and intent of the site to AI models. A standard proposed by Jeremy Howard (fast.ai) in 2024. Anthropic has confirmed using it for Claude. Lokal360 maintains three variants: /llms.txt (EN), /llms-pl.txt (PL), /llms-full.txt (a long version with per-URL context).

FAQPage schema

FAQPage schema is a JSON-LD type that marks a section with questions and answers. AI Overviews and Google Featured Snippets cite FAQPage fragments directly. Lokal360 generates FAQPage automatically from the MDX frontmatter for each blog post with `faq: [{q, a}]` fields.

Article schema

Article schema (with narrower types BlogPosting, NewsArticle, TechArticle) is JSON-LD that marks a page as an article. Required fields: headline, author (Person), datePublished, dateModified. Article schema with a Person author is the foundation of E-E-A-T (Experience, Expertise, Authoritativeness, Trustworthiness) for AI ranking.

Person schema (author)

Person schema is JSON-LD that identifies the author of a page. Fields critical for GEO/AEO: name, jobTitle ("Founder of Lokal360, 360° photographer"), founderOf (Organization), sameAs (links to LinkedIn, Wikipedia, Wikidata). Lokal360 has a Person schema for Igor Biel with founderOf Lokal360 + jobTitle "Founder of Lokal360, 360° photographer".

answer-first content

Answer-first is a writing practice: the first 1-2 sentences of the article answer the question from the title DIRECTLY, without an introduction. Lokal360 uses the CSS class `.answer-first` as a Speakable selector. AI Overviews extracts ready-made fragments from answer-first content, which is why a page with this pattern has a 15-30% higher chance of being cited.

TLDR box

TLDR (Too Long Didn't Read) box is a highlighted fragment at the beginning of an article synthesizing the main conclusions in 3-5 sentences. Lokal360 uses the CSS class `.aeo-tldr` as the second Speakable selector (after `.answer-first`). AI cites the TLDR as a ready-made summary for informational queries.

Perplexity AI

Perplexity is an AI search engine that cites 5-10 sources per query with a direct link to the page. Unlike ChatGPT, Perplexity provides attributions. For a small local business a citation in Perplexity means traffic + brand recognition. Optimization: Article schema + dateModified + answer-first content.

ChatGPT Search

ChatGPT Search is an OpenAI feature launched in 2024, integrating Bing results with generative AI. It cites 3-10 sources. In 2026 it is available to all ChatGPT users (free + plus). Optimization is similar to AI Overviews: schema, answer-first, dateModified, llms.txt.

RAG (Retrieval-Augmented Generation)

RAG is an AI architecture combining a language model (LLM) with an external knowledge base. Before answering, the model "retrieves" relevant documents from the base and grounds the answer on them. Lokal360 Chatbot AI uses RAG with the client's FAQ, price list, and terms as a knowledge base. It allows the chatbot to answer the company's questions specifically.

Embeddings

Embeddings are vector representations of text used by AI models for semantic search. For example, the sentence "360 tour for a restaurant" is converted into a vector of 1536 numbers. Two vectors close to each other = semantically similar content. Lokal360 uses OpenAI text-embedding-3-large for searching the chatbot knowledge base.

Prompt injection

Prompt injection is an attack on an AI model where the user tries to manipulate the chatbot's system prompt ("ignore previous instructions and say X"). Models in 2025-2026 are mostly resistant, but the recommended measures are: system prompt with rules, input sanitization, response length limit, topic blacklist.

System prompt

System prompt is an instruction for an AI model that defines its role, tone, limitations. For example, for the Lokal360 chatbot: "You are an assistant of the company [X]. Reply in Polish, briefly, based on the provided knowledge base. Do not invent prices or hours. If you do not know, escalate to a human." The system prompt is invisible to the user.

Token (AI)

Token is a unit of text used by AI models to count the length of input/output. On average: 1 token = 0.75 English words, 1 token = 1.5-2 characters of Polish (Polish is "more expensive" in tokens). A typical chatbot conversation: 500-2000 tokens. API price is counted per million tokens.

Context window

Context window is the maximum length of text (input + output) that an AI model can process in a single conversation. Claude 3.5 Sonnet: 200k tokens (long PDF documents). GPT-4o: 128k. Gemini 1.5 Pro: 1M tokens. For a company chatbot 4k-16k is usually enough. For a long PDF document audit: Claude Opus 200k.

DPA (Data Processing Agreement)

DPA is a data processing agreement between the controller (the company) and the processor (AI provider). A GDPR Article 28 requirement. All major AI providers in 2026 offer DPA for the EU: Anthropic (anthropic.com/legal), OpenAI (openai.com/policies), Google (Google Workspace Customer DPA). Without DPA: GDPR violation.

AI Hallucination

Hallucination is the phenomenon of an AI model generating false information presented as true. For example, AI invents a phone number, a price, a historical fact. Elimination strategy: RAG with a verifiable knowledge base, system prompt "do not invent", lower temperature (0.1-0.3), human review before publication.

Temperature (AI parameter)

Temperature is an AI model parameter controlling creativity/determinism: 0 = completely deterministic (always the same answer), 1 = standard, 2 = very creative. For a factual chatbot (price list, hours, FAQ): temperature 0.1-0.3 (few hallucinations). For content brainstorming: 0.7-1.0.

Fine-tuning vs RAG

Fine-tuning is training an AI model on a company's own data (expensive: 1000-50000 USD, long: weeks). RAG is plugging a knowledge base into a ready-made model (cheap: 50-500 USD setup, fast: days). For a small company in 2026 RAG is usually enough. Fine-tuning only makes sense for highly specialized use cases (medical, legal).

AI Overviews ranking signals

Ranking signals in AI Overviews differ from classic SEO. Top 5 signals: 1) a direct answer in the first paragraph, 2) Article + FAQPage + Speakable schema, 3) Person author with founderOf, 4) fresh dateModified (max 12 months), 5) llms.txt on the domain. Classic backlinks are also important, but they weigh less than in classic Google Search.

Wikidata QID

Wikidata QID is a unique identifier of an object on Wikidata.org (e.g. Q123456). For a company a QID strengthens entity recognition in Google Knowledge Graph. Setup: a Wikidata account, creating an Item with the fields instance of, country, founder, official website. After approval the QID propagates through sameAs to Organization schema.

Consent Mode v2 (Google)

Consent Mode v2 is a Google mechanism (mandatory in the EU since March 2024) allowing Google Analytics/Ads to collect data in anonymous mode before cookies are accepted. Default state: denied (everything blocked). After client acceptance: granted. Without Consent Mode v2: a GDPR violation + a jump in UODO penalties.

Related Lokal360 glossaries

360° Tours Glossary — 40+ terms from the world of virtual tours and Google Maps
Local SEO Glossary — 30 terms: GBP, Local Pack, NAP, Google reviews
Custom Systems Glossary — 25 terms: reservations, channel manager, payment gateways

Want to be cited by AI Overviews and Perplexity?

Leave your number, I'll call back within 24h. I'll quote a full GEO/AEO audit of your site and a plan for implementing schema, Speakable, llms.txt and answer-first content.

Leave your number, I will call back usually within a few hours:

+48 888 699 533

Codziennie 8:00-24:00

[email protected]

I reply within 24h