Chapter 2

What Are LLMs?

Predictive text at scale — training, inference, limits.

Learning objectives

  • Explain LLM in one sentence a manager understands
  • Distinguish training, fine-tuning, and inference
  • Know what an LLM cannot reliably do

LLM = autocomplete at industrial scale

A Large Language Model (LLM) reads your text and predicts the most likely next token (piece of a word), again and again, until it finishes a reply. It has no persistent memory unless you send history in each API call.

Plain definition

ChatGPT, Claude, Gemini, Llama, Mistral — all are LLM products. “GPT-4” is a model name; “ChatGPT” is the chat app built on top.

Three phases

PhaseWho runs itCost
TrainingVendor (months, thousands of GPUs)Millions — not your job
Fine-tuningOptional — custom style on your docs$$ — specialist task
InferenceYou — each API call or self-hosted GPUPer token — your budget

What LLMs are good / bad at

Good at

Summarizing, drafting, classifying intent, explaining docs, code snippets from examples

Weak at

Math without tools, live facts after training cutoff, guaranteed truth, secrets you should not share

Worked example — FAQ without hallucination risk

Workshop Co. should not ask the model “when is the next class?” from memory. Instead:

  1. API fetches class list from database (ground truth)
  2. LLM formats answer in friendly tone from that JSON only

This pattern — retrieve facts, then generate prose — is the foundation of RAG (Chapter 6).

Try it yourself

List three Workshop Co. tasks: (1) safe for LLM alone, (2) needs database first, (3) should never use LLM.

Sample
  • Safe alone: rewrite class description for social media
  • Needs DB: “seats left in March intro class”
  • Never: legal waiver interpretation, medical advice, storing credit cards in prompts

Quick quiz

Does the model “know” your server IP unless you paste it in the prompt?

Answer

No. It only sees what you send in that request (plus any retrieval/RAG you attach). Treat every prompt as potentially logged by the provider.