Chapter 2: Core Concepts and Terminology

Essential Vocabulary and Foundational Concepts

Learning Objectives

After completing this chapter, you will be able to:

Define key prompting terminology accurately
Describe the anatomy of a prompt
Explain tokens and context windows
Understand how AI models process prompts
Apply terminology correctly in discussions

The Language of Prompting

Before diving into techniques and frameworks, we need a shared vocabulary. This chapter establishes the essential terms and concepts that will be used throughout this handbook.

Why Terminology Matters

Clear terminology enables:

Precise communication about techniques
Accurate diagnosis of problems
Effective knowledge transfer between practitioners
Consistent documentation of practices

Core Definitions

Prompt

A prompt is the input provided to an AI model to elicit a response. It can range from a simple question to a complex, multi-component instruction set.

Simple Prompt:
"What is the capital of France?"

Complex Prompt:
"You are a senior financial analyst. Review the following quarterly
report and provide: 1) Key performance highlights, 2) Areas of concern,
3) Recommendations for Q3. Format as a bulleted executive summary."

Prompt Engineering

Prompt engineering is the discipline of designing, testing, and optimizing prompts to achieve desired outcomes from AI systems. It encompasses both the art (creativity, intuition) and science (systematic testing, measurement) of prompt creation.

Completion / Response / Output

The completion (also called response or output) is what the AI model generates in reply to a prompt. The quality of the completion depends heavily on the quality of the prompt.

Large Language Model (LLM)

A Large Language Model is an AI system trained on vast amounts of text data to understand and generate human-like text. Examples include GPT-4, Claude, Gemini, and Llama.

Anatomy of a Prompt

Basic Components

Every prompt can be analyzed in terms of its components:

Figure 2.1: The Anatomy of a Prompt - Every prompt consists of these four components: Context (background), Instruction (task), Input Data (content to process), and Output Specification (format requirements). Input Data and Output Specification are optional but often critical for quality results.

Component Definitions

Component	Purpose	Example
Context	Sets the stage	“You are a medical professional…”
Instruction	Defines the task	“Diagnose the following symptoms…”
Input Data	Provides content	“Patient presents with: fever, cough…”
Output Spec	Controls format	“Provide top 3 possibilities with confidence levels”

Example Breakdown

**Prompt:**
You are an experienced Python developer specializing in data analysis.  [CONTEXT]
Review the following code and identify any bugs or inefficiencies.      [INSTRUCTION]

```python
def calculate_average(numbers):
    total = 0
    for n in numbers:
        total = total + n
    return total / len(numbers)
```                                                                       [INPUT DATA]

Provide your analysis as:
1. Issues found (numbered list)
2. Corrected code
3. Brief explanation of changes                                          [OUTPUT SPEC]

Tokens and Tokenization

What Are Tokens?

Tokens are the basic units that AI models use to process text. A token can be a word, part of a word, or a character, depending on the tokenizer used.

Tokenization Examples

Text	Approximate Tokens
“Hello”	1 token
“Hello, world!”	3-4 tokens
“Artificial Intelligence”	2-3 tokens
“supercalifragilisticexpialidocious”	7-9 tokens

Why Tokens Matter

Cost: Most AI services charge per token
Limits: Models have maximum token limits
Efficiency: More tokens = more processing time
Context: Available space for information

Token Rules of Thumb

English text: ~4 characters per token (average)
1,000 tokens: ~750 words
Code: Often tokenizes less efficiently than prose
Special characters: May consume extra tokens

Flow diagram showing token economy with three input components: Prompt (500 tokens), Context (200 tokens), and Data (100 tokens) combining to Total Input of 700 tokens. A model limit bar shows 8,000 tokens capacity, with 7,300 tokens available for output after the input allocation. Figure 2.2: Token Economy - Understanding how tokens are allocated helps optimize prompts. In this example, 700 input tokens leave 7,300 tokens available for the response within an 8,000-token context window.

Context Windows

What Is a Context Window?

The context window is the maximum number of tokens an AI model can process in a single interaction. It includes both the prompt (input) and the completion (output).

Context Window Sizes

Model (Example)	Context Window
GPT-3.5	4K - 16K tokens
GPT-4	8K - 128K tokens
Claude 3	200K tokens
Gemini	32K - 1M tokens

Managing Context

Vertical container diagram showing context window structure with an 8,000 token total capacity divided into four sections: System Prompt (500 tokens) at the top, Conversation History (2,000 tokens) below that, Current Prompt (1,500 tokens), and Reserved for Response (4,000 tokens) at the bottom. Each section is proportionally sized and color-coded. Figure 2.3: Context Window Management - The context window must accommodate system prompts, conversation history, current prompts, and response space. Strategic allocation ensures sufficient room for quality outputs.

Context Window Strategies

Strategy	When to Use
Summarization	Long conversations, historical context
Truncation	Less important older content
Chunking	Processing long documents
Prioritization	Limited space, critical information

Prompt Types

Zero-Shot Prompts

Zero-shot prompts ask the model to perform a task without any examples:

Classify the following text as positive, negative, or neutral:
"The product arrived quickly but the quality was disappointing."

Few-Shot Prompts

Few-shot prompts include examples to guide the model:

Classify the sentiment:

Text: "I love this product!" → Positive
Text: "Worst purchase ever." → Negative
Text: "It's okay, nothing special." → Neutral

Text: "The product arrived quickly but the quality was disappointing." →

Chain-of-Thought Prompts

Chain-of-thought prompts encourage step-by-step reasoning:

Solve this problem step by step:
If a train travels at 60 mph for 2.5 hours, then at 80 mph for 1.5 hours,
what is the total distance traveled?

Key Terminology Glossary

Term	Definition
Prompt	Input provided to an AI model
Completion	Output generated by the model
Token	Basic unit of text processing
Context Window	Maximum tokens processable
Zero-Shot	No examples provided
Few-Shot	Examples included
Chain-of-Thought	Step-by-step reasoning
System Prompt	Persistent behavioral instructions
User Prompt	Individual request or query
Temperature	Randomness/creativity control
Top-p	Probability threshold for token selection
Max Tokens	Limit on response length
Stop Sequence	Text that terminates generation
Hallucination	Plausible but incorrect output
Grounding	Anchoring to provided facts
Prompt Injection	Malicious input manipulation

Model Parameters

Temperature

Temperature controls the randomness of outputs:

Value	Effect
0.0	Deterministic, focused
0.5	Balanced
1.0	Creative, varied
>1.0	Highly random

Top-p (Nucleus Sampling)

Top-p limits token selection to a cumulative probability:

Top-p = 0.9 means:
Consider only tokens that together account for 90% probability

Max Tokens

Max tokens limits the response length:

max_tokens = 500
Model will stop generating after ~500 tokens

Stop Sequences

Stop sequences define text that halts generation:

stop = ["###", "END"]
Model stops if it generates "###" or "END"

Understanding Model Behavior

How Models Process Prompts

Figure 2.4: Model Processing Flow - Text input is tokenized into numeric IDs, processed through the model’s attention layers and neural networks, then decoded back into readable text output.

Key Behaviors to Understand

Pattern Matching: Models respond based on patterns in training data
Context Sensitivity: Earlier context influences later generation
Instruction Following: Models try to follow explicit instructions
Recency Bias: Recent context often has more influence
Verbosity Matching: Output length often matches prompt style

Common Misconceptions

Misconception 1: “The model understands me”

Reality: Models process statistical patterns, not meaning. Clear, explicit prompts work better than implied intent.

Misconception 2: “Longer prompts are better”

Reality: Quality over quantity. Concise, relevant prompts often outperform lengthy ones.

Misconception 3: “One perfect prompt exists”

Reality: Optimal prompts vary by model, task, and context. Iteration is essential.

Misconception 4: “The model remembers previous sessions”

Reality: Each session starts fresh unless context is explicitly provided.

Key Takeaways

Prompts are inputs; completions are outputs
Tokens are the currency of AI interaction
Context windows set hard limits on what can be processed
Parameters (temperature, top-p, max_tokens) control output characteristics
Clear terminology enables precise communication about prompting
Understanding model behavior helps craft better prompts

Summary

This chapter established the foundational vocabulary and concepts of prompt engineering. Understanding these terms—prompt, token, context window, zero-shot, few-shot, and others—is essential for discussing techniques, diagnosing problems, and learning from others. With this shared language in place, we can now explore the frameworks and patterns that make structured prompting so powerful.

Review Questions

What are the four basic components of a prompt?
Approximately how many tokens is 1,000 words of English text?
What is the difference between zero-shot and few-shot prompting?
How does temperature affect model output?
Why is understanding the context window important?

Practical Exercise

Exercise 2.1: Token Estimation

Estimate the token count for the following prompt, then verify with a tokenizer:

You are a helpful assistant. Please summarize the following article
in exactly three bullet points, focusing on the main arguments:

[300-word article text here]

Format your response as:
• Point 1
• Point 2
• Point 3

Hint: System message (~10 tokens) + Instructions (~30 tokens) + Article (~400 tokens) + Format spec (~20 tokens) ≈ 460 tokens

Previous	Up	Next
← Chapter 1: Introduction	Part I: Foundations	Chapter 3: Prompting Pillars →