AI_FOR_CYNICAL_DEVS
Module 18 // 10 minutes // Reference

The Glossary of AI Nonsense

We're building an agentic, multimodal, RAG-enabled copilot leveraging foundation models for enterprise-grade AI transformation.

So... a chatbot?

Yes.

— Every AI startup pitch, 2024-2026

The AI field has a language problem. Half the terms are legitimate technical concepts. The other half are marketing nonsense designed to make simple things sound impressive and impossible things sound achievable.

This glossary covers both. Use it to decode conversations, spot bullshit, and sound like you know what you’re talking about (because now you will).


Actual Technical Terms

These are real concepts that mean specific things. Learn them.

Agent

What it means: A system where an LLM can take actions (call functions, access tools) in a loop until it achieves a goal.

What it doesn’t mean: Magic autonomous AI that does your job.

Reality check: Most “agents” are while loops with API calls. Useful, but not sci-fi.
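
Stripped of the branding, the loop looks like this. The "LLM" below is a scripted stub and get_time is an invented tool; the point is the shape: decide, act, observe, repeat, with a hard step limit as the guardrail.

```python
# A minimal agent loop, as a sketch. fake_llm and get_time are stand-ins
# invented for illustration -- not a real model API.

def fake_llm(messages):
    # Stand-in for a real model call: ask for the tool once, then finish.
    if not any(m["role"] == "tool" for m in messages):
        return {"action": "call_tool", "tool": "get_time", "args": {}}
    return {"action": "finish", "answer": "It is 12:00."}

TOOLS = {"get_time": lambda: "12:00"}

def run_agent(goal, max_steps=5):
    messages = [{"role": "user", "content": goal}]
    for _ in range(max_steps):  # the loop everyone calls "agentic"
        decision = fake_llm(messages)
        if decision["action"] == "finish":
            return decision["answer"]
        result = TOOLS[decision["tool"]](**decision["args"])
        messages.append({"role": "tool", "content": result})
    return "Hit the step limit without finishing."

print(run_agent("What time is it?"))  # It is 12:00.
```

Swap the stub for a real model call and the dict for real tools, and that is most of what ships as an "agent."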

Attention (Mechanism)

What it means: The core innovation in transformer models. Lets the model weigh which parts of the input are relevant to each part of the output.

Why it matters: This is why LLMs can understand context and relationships in text.

You probably don’t need to know: The mathematical details, unless you’re doing ML research.

Chain of Thought (CoT)

What it means: Prompting the model to show its reasoning step by step before giving a final answer.

Why it works: Forces the model to “think” through problems rather than jumping to conclusions.

How to use it: Add “think step by step” or “show your reasoning” to prompts.

Context Window

What it means: The maximum amount of text (in tokens) a model can process at once. Includes both input and output.

Common sizes: 8K, 32K, 128K, 200K tokens depending on model.

The catch: Bigger isn’t always better. Models often perform worse at the edges of their context window.
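
In practice the context window is a budget you check before every request: input plus requested output has to fit, or the call fails or gets truncated. A sketch, assuming an 8K window (sizes vary by model):

```python
# Token budget check before a request. 8192 is an example window size.
CONTEXT_WINDOW = 8192

def fits(input_tokens, max_output_tokens):
    # Input and output share the same window.
    return input_tokens + max_output_tokens <= CONTEXT_WINDOW

print(fits(6000, 2000))  # True: 8000 <= 8192
print(fits(7000, 2000))  # False: 9000 > 8192
```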

Embedding

What it means: A vector (list of numbers) that represents the meaning of text. Similar texts have similar embeddings.

What it’s for: Semantic search, clustering, finding related content.

Key insight: Embeddings capture meaning, not just keywords. “Happy” and “joyful” have similar embeddings even though they share no letters.
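
A toy sketch of the idea: real embeddings come from a model and have hundreds or thousands of dimensions, but hand-made 3-D vectors are enough to show how cosine similarity compares meaning.

```python
# Cosine similarity over pretend 3-D "embeddings". The vectors are
# invented for illustration; a real embedding model produces them.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

happy   = [0.9, 0.8, 0.1]    # pretend embedding of "happy"
joyful  = [0.85, 0.75, 0.2]  # pretend embedding of "joyful"
invoice = [0.1, 0.2, 0.9]    # pretend embedding of "invoice"

print(cosine_similarity(happy, joyful))   # close to 1.0: similar meaning
print(cosine_similarity(happy, invoice))  # much lower: unrelated
```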

Fine-tuning

What it means: Taking a pre-trained model and training it further on your specific data.

When to use: When prompting isn’t enough and you need the model to behave differently at a fundamental level.

When not to use: Most of the time. Fine-tuning is expensive, complex, and often unnecessary. Try better prompts first.

Foundation Model

What it means: A large pre-trained model (like GPT-4, Claude, Llama) that serves as a base for various applications.

Why the term exists: To distinguish from task-specific models and emphasize that one model can do many things.

Marketing alert: Sometimes used to make any large model sound more impressive than it is.

Grounding

What it means: Connecting model outputs to real, verifiable information (like your documents or databases).

Why it matters: Reduces hallucinations by giving the model actual facts to work with.

See also: RAG (Retrieval-Augmented Generation).

Hallucination

What it means: When the model generates plausible-sounding but factually incorrect information.

Why it happens: LLMs predict likely text, not true text. Sometimes likely text is wrong.

The uncomfortable truth: You cannot eliminate hallucinations. You can only reduce and detect them.

Inference

What it means: Running a trained model to get predictions/outputs. The actual “using the model” part.

Contrast with training: Training creates the model. Inference uses it.

Cost note: Inference costs money every time. Training costs money once (but a lot more).

LLM (Large Language Model)

What it means: A neural network trained on massive text data to predict/generate text.

Examples: GPT-4, Claude, Llama, Gemini, Mistral.

Key insight: “Large” refers to parameter count (billions), not capability.

LoRA (Low-Rank Adaptation)

What it means: A technique for fine-tuning large models efficiently by only training a small number of additional parameters.

Why it matters: Makes fine-tuning accessible without needing massive compute resources.

When you’ll encounter it: If you’re customizing open-source models locally.

MCP (Model Context Protocol)

What it means: A standardized protocol for connecting LLMs to external tools and data sources.

Why it exists: So you don’t have to write custom integrations for every model-tool combination.

Status: Still emerging. Adoption growing but not universal.

Multimodal

What it means: A model that can process multiple types of input (text, images, audio, video).

Examples: GPT-4V (text + images), Gemini (text + images + audio).

Reality check: “Multimodal” doesn’t mean “good at everything.” Check capabilities for each modality.

Parameter

What it means: A trainable value in a neural network. More parameters generally means more capacity.

Common sizes: 7B, 13B, 70B, 175B parameters.

The misconception: More parameters ≠ better. A well-trained 7B model can outperform a poorly trained 70B model on specific tasks.

Prompt

What it means: The input text you give to an LLM.

Types: System prompts (instructions), user prompts (queries), few-shot examples.

Key insight: Prompt quality dramatically affects output quality. Same model, different prompts, wildly different results.

Quantization

What it means: Reducing the precision of model weights (e.g., from 16-bit to 4-bit) to make models smaller and faster.

Trade-off: Smaller size and faster inference vs. some quality loss.

When it matters: Running models locally on limited hardware.
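
The core trade-off fits in a few lines. This is a naive uniform quantizer, not a real scheme like GPTQ or AWQ; it only shows precision being traded for fewer bits.

```python
# Naive uniform quantization sketch: map floats onto 2^bits integer
# levels, then back. Restored weights are close but not exact.
def quantize(weights, bits=4):
    levels = 2 ** bits - 1
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels
    q = [round((w - lo) / scale) for w in weights]  # ints in [0, levels]
    return q, lo, scale

def dequantize(q, lo, scale):
    return [lo + n * scale for n in q]

weights = [-0.52, 0.13, 0.07, 0.48, -0.11]
q, lo, scale = quantize(weights)
restored = dequantize(q, lo, scale)
for w, r in zip(weights, restored):
    print(f"{w:+.2f} -> {r:+.3f}")  # small rounding error on each weight
```

Each 16-bit float became a 4-bit integer plus a shared scale: a quarter of the memory, at the cost of that rounding error, repeated across billions of weights.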

RAG (Retrieval-Augmented Generation)

What it means: A pattern where you retrieve relevant documents first, then include them in the prompt for the LLM to use.

Why it’s popular: Lets you add custom knowledge without fine-tuning. Reduces hallucinations by providing source material.

Components: Embedding model + vector database + LLM.
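
The whole pattern, in miniature. Retrieval here is naive word overlap and the documents are made up; a real system would use embeddings, a vector database, and an actual LLM call at the end.

```python
# RAG sketch: retrieve the most relevant document, stuff it into the
# prompt. Docs are invented; retrieval is crude word overlap.
DOCS = [
    "Refunds are processed within 14 days of the return being received.",
    "Our office is open Monday through Friday, 9am to 5pm.",
    "Shipping to EU countries takes 3-5 business days.",
]

def retrieve(query, docs):
    q_words = set(query.lower().split())
    # Pick the doc sharing the most words with the query.
    return max(docs, key=lambda d: len(q_words & set(d.lower().split())))

def build_prompt(query):
    context = retrieve(query, DOCS)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("How long do refunds take?"))
```

The prompt that comes out is what actually goes to the model, which is why RAG reduces hallucinations: the answer is sitting right there in the context.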

Temperature

What it means: A parameter controlling randomness in model outputs. Higher = more creative/random. Lower = more deterministic/focused.

Typical values: 0 (deterministic) to 1 (creative). Some models allow higher.

Rule of thumb: Use low temperature (0-0.3) for factual tasks. Use higher (0.7-1) for creative tasks.
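
Under the hood, temperature divides the model's logits before softmax. The logits below are toy numbers, but the reshaping is the real mechanism: low temperature sharpens the distribution toward the top token, high temperature flattens it.

```python
# Temperature scaling: divide logits by T before softmax.
import math

def softmax_with_temperature(logits, temperature):
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # pretend scores for three candidate tokens

print(softmax_with_temperature(logits, 0.2))  # top token dominates
print(softmax_with_temperature(logits, 1.0))  # moderate spread
print(softmax_with_temperature(logits, 2.0))  # nearly flat
```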

Token

What it means: The unit LLMs process. Roughly 4 characters or ¾ of a word in English.

Why it matters: You’re charged per token. Context windows are measured in tokens.

Gotcha: Tokenization varies by model. The same text might be different token counts in different models.
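
The ~4-characters rule makes a serviceable back-of-envelope estimator for English text; real tokenizers (tiktoken, SentencePiece) give exact, model-specific counts.

```python
# Rough token estimate from the "~4 characters per token" rule of thumb.
def rough_token_count(text):
    return max(1, round(len(text) / 4))

prompt = "Explain what a context window is in one paragraph."
print(rough_token_count(prompt))  # ballpark only; use a real tokenizer to bill
```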

Transformer

What it means: The neural network architecture behind modern LLMs. Introduced in the 2017 paper “Attention Is All You Need.”

Why it matters: This architecture enabled the current generation of capable language models.

You probably don’t need to know: The internal mechanics, unless you’re doing ML research.

Vector Database

What it means: A database optimized for storing and searching embeddings (vectors).

Examples: Pinecone, Weaviate, Qdrant, Chroma, pgvector.

Use case: Semantic search, RAG systems, finding similar content.
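
What a vector database does, in miniature: store vectors, return the nearest ones to a query. The IDs and vectors below are invented; real systems add approximate-nearest-neighbour indexes (e.g. HNSW) so search doesn't degrade into a linear scan over millions of rows.

```python
# Miniature "vector database": a dict of id -> vector plus a top-k
# nearest-neighbour search by cosine similarity. All data invented.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

store = {
    "doc_refunds":  [0.9, 0.1, 0.0],
    "doc_shipping": [0.1, 0.9, 0.1],
    "doc_hours":    [0.0, 0.1, 0.9],
}

def nearest(query_vec, k=1):
    ranked = sorted(store, key=lambda i: cosine(query_vec, store[i]), reverse=True)
    return ranked[:k]

print(nearest([0.8, 0.2, 0.1]))  # ['doc_refunds']
```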

Zero-shot / Few-shot

What it means:

  • Zero-shot: Asking the model to do something without examples
  • Few-shot: Including examples in your prompt

When to use few-shot: When zero-shot isn’t giving good results and you have examples to share.
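
The difference is just prompt assembly. The classification task and example strings here are invented:

```python
# Zero-shot vs few-shot: same task, with and without worked examples
# prepended to the prompt.
def zero_shot(task):
    return f"Classify the sentiment of: {task}"

def few_shot(task, examples):
    shots = "\n".join(f"Text: {t}\nSentiment: {s}" for t, s in examples)
    return f"{shots}\nText: {task}\nSentiment:"

examples = [("I love this!", "positive"), ("Total waste of money.", "negative")]
print(zero_shot("Works fine, nothing special."))
print(few_shot("Works fine, nothing special.", examples))
```

The few-shot version also nudges the model toward your output format: it tends to answer with a single word because the examples did.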


Marketing Buzzwords (Translated)

These terms are used more for marketing than technical precision. Here’s what they usually mean in practice.

“AI-Powered”

What they say: “Our AI-powered solution…”

What they mean: “We added an API call to GPT somewhere.”

Reality check: Everything is “AI-powered” now. The term has lost all meaning.

“Autonomous”

What they say: “Fully autonomous AI agent.”

What they mean: “A loop that runs until it finishes or crashes.”

Reality check: True autonomy doesn’t exist. These systems need guardrails, limits, and human oversight.

“Cognitive”

What they say: “Cognitive computing platform.”

What they mean: “We do some NLP.”

Origin: IBM marketing term from the 2010s. Largely meaningless today.

“Democratizing AI”

What they say: “We’re democratizing AI for everyone.”

What they mean: “We have an API or a UI.”

Reality check: Access has improved, but “democratizing” is marketing speak.

“Enterprise-Grade”

What they say: “Enterprise-grade AI solution.”

What they mean: “We have SSO and compliance checkboxes.”

Reality check: Often means “expensive” more than “better.”

“Generative AI” (GenAI)

What they say: “Leading the GenAI revolution.”

What they mean: “We use LLMs.”

Reality check: Legitimate term, but overused. Everything is GenAI now.

“Human-Level”

What they say: “Approaching human-level performance.”

What they mean: “Beat humans on one specific benchmark.”

Reality check: Benchmark performance ≠ general capability. Be skeptical.

“Intelligent”

What they say: “Intelligent automation.”

What they mean: “Has some conditional logic.”

Reality check: “Intelligent” is entirely subjective. Press for specifics.

“Next-Generation”

What they say: “Next-generation AI platform.”

What they mean: “Newer than something.”

Reality check: Meaningless without comparison. Next generation of what?

“Revolutionary”

What they say: “Revolutionary AI breakthrough.”

What they mean: “Incremental improvement we want to hype.”

Reality check: Real revolutions are rare. Most progress is evolutionary.

“Transformative”

What they say: “AI will be transformative for your business.”

What they mean: “AI might be useful for your business.”

Reality check: Maybe. Probably not as much as the sales pitch suggests.


Red Flag Phrases

When you hear these, be suspicious.

“The AI understands…”

Red flag level: 🚩🚩🚩

Why it’s concerning: LLMs don’t “understand” in any meaningful sense. They predict tokens.

What to ask: “What do you mean by ‘understands’? What happens when it’s wrong?”

“Just add AI to…”

Red flag level: 🚩🚩

Why it’s concerning: Suggests AI is a simple add-on, not a complex integration.

What to ask: “What problem is AI solving here that simpler solutions can’t?”

“AI will replace…”

Red flag level: 🚩🚩🚩

Why it’s concerning: Usually wrong or wildly premature.

What to ask: “What’s the timeline? What evidence supports this?”

“The model is 99% accurate”

Red flag level: 🚩🚩🚩

Why it’s concerning: Accuracy depends heavily on test conditions. 99% in the lab, 60% in production.

What to ask: “On what test set? What’s the error distribution? What happens in the 1%?”

“We trained our own model”

Red flag level: 🚩🚩

Why it’s concerning: Often means “we fine-tuned GPT” or “we wrote some prompts.”

What to ask: “From scratch, or fine-tuned? On what data? What’s the parameter count?”

“It’s like having a [senior role] on your team”

Red flag level: 🚩🚩🚩

Why it’s concerning: AI can assist, not replace expertise. This oversells capabilities.

What to ask: “Which parts of that role can the AI actually do? Which can’t it?”

“Hallucination-free”

Red flag level: 🚩🚩🚩🚩

Why it’s concerning: Hallucination-free LLMs don’t exist. Anyone claiming this is lying or ignorant.

What to ask: “How do you measure hallucination rate? What’s your detection method?”


Acronyms Quick Reference

Acronym | Meaning                                    | Need to Know?
AGI     | Artificial General Intelligence            | Theoretical. Doesn’t exist yet.
API     | Application Programming Interface          | Yes, how you call AI services.
CoT     | Chain of Thought                           | Yes, useful prompting technique.
CUDA    | Compute Unified Device Architecture        | Only if running local models on NVIDIA.
GPU     | Graphics Processing Unit                   | Yes, what runs AI models.
LLM     | Large Language Model                       | Yes, fundamental concept.
LoRA    | Low-Rank Adaptation                        | Only if fine-tuning.
MCP     | Model Context Protocol                     | Useful for tool integration.
ML      | Machine Learning                           | Yes, the broader field LLMs come from.
NLP     | Natural Language Processing                | Yes, what LLMs do.
RAG     | Retrieval-Augmented Generation             | Yes, very common pattern.
RLHF    | Reinforcement Learning from Human Feedback | How models are aligned. Optional knowledge.
SaaS    | Software as a Service                      | Not AI-specific.
SDK     | Software Development Kit                   | Not AI-specific.
SLM     | Small Language Model                       | Emerging term for efficient models.
TPU     | Tensor Processing Unit                     | Google’s AI chip. Optional knowledge.
VRAM    | Video RAM                                  | Matters for local model deployment.

The Translation Guide

When someone says → They probably mean:

What they say            | What they mean
“We’re building an AI”   | “We’re using someone else’s AI via API”
“Our AI learns”          | “We update prompts when things break”
“AI-native architecture” | “We put GPT in the middle of everything”
“Responsible AI”         | “We have a checkbox somewhere”
“State of the art”       | “Better than the old version”
“Industry-leading”       | “We exist in this industry”
“Seamless integration”   | “There’s an API”
“Real-time AI”           | “Latency under 5 seconds”
“Production-ready”       | “Works in demos”
“Battle-tested”          | “Used by at least one customer”

How to Sound Like You Know What You’re Talking About

Do say:

  • “What’s the latency like in production?”
  • “How do you handle hallucinations?”
  • “What’s the cost per request?”
  • “Is that zero-shot or fine-tuned?”
  • “What’s the retrieval accuracy?”

Don’t say:

  • “Can’t AI just figure it out?”
  • “Why don’t we just train our own model?”
  • “Is this like Skynet?”
  • “My nephew made a chatbot over the weekend”
  • “When will this be sentient?”

Keep this glossary handy. You’ll need it the next time someone tries to sell you an “enterprise-grade, autonomous, AI-powered cognitive platform for transformative intelligence.”

(It’s a chatbot. It’s always a chatbot.)