Glossary - EduPrompt

Prompt Engineering

The practice of designing and refining text inputs to effectively guide AI models toward desired outputs.

Why it matters: Enables precise control over AI responses and improves result quality.

Token

The basic units that language models process, typically words, subwords, or punctuation marks.

Why it matters: Understanding tokens helps optimize prompt length and efficiency.

Zero-Shot Learning

A model's ability to perform tasks it wasn't explicitly trained on, using only the prompt for guidance.

Why it matters: Allows models to handle new tasks without additional training.

Few-Shot Learning

Providing a small number of examples in the prompt to guide the model's response format and content.

Why it matters: Improves accuracy by demonstrating the desired output structure.

Context Window

The maximum amount of text (in tokens) a model can process in a single request.

Why it matters: Determines how much information can be included in a prompt.

Temperature

A parameter controlling randomness in model outputs, with lower values being more deterministic.

Why it matters: Balances creativity and consistency in generated text.

Top-p (Nucleus Sampling)

Limits token selection to the most probable options that cumulatively reach a certain probability threshold.

Why it matters: Reduces low-probability outputs while maintaining diversity.

Fine-Tuning

Training a pre-trained model on a specific dataset to specialize its capabilities for particular tasks.

Why it matters: Adapts general models to specific domains or requirements.

Embedding

Numerical representations of text that capture semantic meaning in a high-dimensional space.

Why it matters: Enables models to understand relationships between words and concepts.

Attention Mechanism

A model component that determines which parts of the input are most relevant for each output element.

Why it matters: Allows models to focus on important context when generating responses.

Transformer

The neural network architecture that underlies most modern language models, using attention mechanisms.

Why it matters: Revolutionized NLP by enabling models to handle long-range dependencies.

Hallucination

When a model generates false or misleading information that appears plausible but is factually incorrect.

Why it matters: Critical to verify model outputs, especially for factual content.

In-Context Learning

A model's ability to adapt its behavior based on examples and instructions provided within the prompt.

Why it matters: Eliminates need for retraining by leveraging prompt-based adaptation.

Chain-of-Thought

Prompting technique that encourages models to show their reasoning steps before providing an answer.

Why it matters: Improves accuracy on complex reasoning tasks and makes outputs interpretable.

Retrieval-Augmented Generation (RAG)

Combining language models with external knowledge retrieval to produce more factual responses.

Why it matters: Reduces hallucinations by grounding responses in verified information.

Bias

Systematic favoring of certain groups, perspectives, or outcomes in model outputs based on training data.

Why it matters: Can perpetuate unfairness and requires careful mitigation strategies.

Alignment

Techniques for ensuring AI systems behave in accordance with human values and intentions.

Why it matters: Essential for developing trustworthy and beneficial AI systems.

Emergent Abilities

Capabilities that appear in large models without explicit training, such as reasoning or translation.

Why it matters: Reveals unexpected potential but also unpredictable behaviors.

Scaling Laws

Predictable relationships between model size, dataset size, and performance improvements.

Why it matters: Guides resource allocation decisions in model development.

Instruction Tuning

Training models to better follow natural language instructions through specialized datasets.

Why it matters: Makes models more responsive to user intent and easier to use.

Anthropomorphism

Attributing human characteristics, emotions, or intentions to AI systems.

Why it matters: Can lead to unrealistic expectations and misunderstanding of AI capabilities.

Latent Space

High-dimensional representation space where models encode semantic relationships between concepts.

Why it matters: Enables mathematical operations on concepts and analogy formation.

Pre-training

Initial training phase where models learn general language patterns from large text corpora.

Why it matters: Provides foundational knowledge that can be specialized through fine-tuning.

Overfitting

When a model performs well on training data but poorly on new, unseen examples.

Why it matters: Indicates the model has memorized rather than learned generalizable patterns.

Transfer Learning

Applying knowledge gained from one task to improve performance on a related task.

Why it matters: Reduces training time and data requirements for new applications.

Multimodal

Models that can process and generate multiple types of data, such as text, images, and audio.

Why it matters: Enables more natural and comprehensive human-AI interaction.

Few-Shot Chain-of-Thought

Combining few-shot examples with chain-of-thought reasoning to solve complex problems.

Why it matters: Maximizes reasoning performance with minimal prompt engineering effort.

Self-Attention

Mechanism allowing models to weigh the importance of different input elements relative to each other.

Why it matters: Enables models to capture long-range dependencies in text.

Decoder-Only

Model architecture that generates text sequentially, predicting one token at a time based on previous tokens.

Why it matters: Efficient for autoregressive text generation tasks.

Prompt Injection

Malicious manipulation of prompts to override intended model behavior or extract sensitive information.

Why it matters: Security risk requiring careful input validation and prompt design.