
@rostrovsky
Created September 6, 2025 13:32
Learning materials for AI-900

AI-900 Study Guide: Foundations of Artificial Intelligence

This study guide is designed to help you review the core concepts covered in the AI-900 exam, specifically focusing on both general Azure AI services and Generative AI.

I. Core Concepts of Artificial Intelligence

A. Defining Artificial Intelligence (AI)

  • AI’s fundamental goal: Imitate some aspect of human behavior or capability (e.g., speech recognition, image classification, language translation, decision-making, prediction).

  • AI vs. Generative AI

    • Regular AI: Primarily focuses on imitating specific human behaviors or making predictions based on existing data.
    • Generative AI: Focuses on creating original content (e.g., natural language, code, images) based on patterns learned from vast datasets—it doesn’t just imitate; it generates new outputs.

B. Machine Learning (ML)

  • Definition: A subset of AI where computers learn from data without being explicitly programmed—training a “model” on past data to make future predictions.

  • Key components

    • Training data: Labeled data used to teach the model.
    • Features: Input data points.
    • Labels: Correct answers or outcomes for the features.
    • Algorithm: Rules/procedures to find relationships between features and labels (e.g., decision trees, linear regression, SVM).
    • Model: The result of training, capable of making predictions on new data.
    • Iterative process: Refine via testing and parameter tuning.
  • Types of machine learning

    • Supervised learning: Training data includes labels.

      • Regression: Predict a numerical value (e.g., house price).

      • Classification: Predict a category or class.

        • Binary classification: Two outcomes (e.g., spam/not spam).
        • Multiclass classification: Multiple possible outcomes (e.g., movie genres).
    • Unsupervised learning: Training data has no labels.

      • Clustering: Group data points based on similarities.
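The supervised/unsupervised split above can be illustrated with a toy supervised example. This is a hedged sketch in pure Python with made-up data (real projects would use a library such as scikit-learn): the features are house sizes, the labels are prices, and the "algorithm" is one-variable least squares.

```python
# Toy supervised regression: learn price = f(size) from labeled examples.
# Illustrative sketch only -- the sizes/prices below are invented.

def fit_linear(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x

# Features (house size in m^2) and labels (price in $1000s).
sizes = [50, 80, 100, 120]
prices = [150, 240, 300, 360]          # here exactly price = 3 * size

slope, intercept = fit_linear(sizes, prices)
predict = lambda size: slope * size + intercept
print(round(predict(110)))             # → 330
```

The "model" is just the learned `(slope, intercept)` pair; in unsupervised learning there would be no `prices` column, and the algorithm would instead group the sizes into clusters.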

C. Deep Learning (DL)

  • Definition: Subset of ML using multi-layer neural networks to model complex relationships—effective where traditional ML is insufficient.

  • Neural networks: Inspired by the human brain.

    • Neurons (activation functions): Units that transform inputs and pass outputs onward.

    • Layers:

      • Input layer: Receives initial data.
      • Hidden layers: Where complex processing occurs (often many layers).
      • Output layer: Produces final result.
    • Weights and biases: Trainable parameters adjusted to optimize predictions.

    • Parameters: Total number of weights and biases (can be billions/trillions in large models); more parameters often improve performance.

  • Transformer models: Neural network architecture underlying many modern Generative AI models (e.g., GPT).
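The neuron/weights/bias vocabulary above can be made concrete with a single artificial neuron. A minimal sketch (values invented for illustration):

```python
# A single artificial neuron: a weighted sum of its inputs plus a bias,
# passed through an activation function (ReLU here).

def relu(x):
    return max(0.0, x)

def neuron(inputs, weights, bias):
    # The weights and bias are the trainable parameters adjusted in training.
    return relu(sum(i * w for i, w in zip(inputs, weights)) + bias)

# 1.0*0.5 + 2.0*(-0.25) + 0.1 = 0.1, which ReLU passes through unchanged.
print(neuron([1.0, 2.0], [0.5, -0.25], 0.1))
```

A deep network is many such neurons arranged in layers, with each layer's outputs feeding the next layer's inputs.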

II. Generative AI Deep Dive

A. Core functionality

  • Original content creation: Generative AI can create:

    • Natural language: Conversation, summarization, code comments, Q&A, creative writing.
    • Images: Text-to-image (e.g., DALL·E).
    • Code: Authoring and debugging across languages.
  • Large Language Models (LLMs): Foundation of Generative AI.

    • Examples: GPT (OpenAI), LLaMA (Meta).
    • Mechanism: Predict the next token from the prompt and previously generated tokens, repeating until an end-of-sequence token is produced (this process is inference).
    • Training: Massive datasets (web, Wikipedia, books) + significant compute to learn billions/trillions of parameters.
    • Read-only at inference: Core parameters don’t change during generation.
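The token-by-token generation loop can be sketched with a toy stand-in for the model. In this hedged sketch a hand-built bigram lookup table plays the role of the trained parameters (a real LLM scores every token in a large vocabulary at each step); the words are made up:

```python
# Toy next-token prediction: a bigram table stands in for a trained model.

bigram = {  # invented "parameters"; read-only at inference time
    "<s>": "the", "the": "cat", "cat": "sat", "sat": "<end>",
}

def generate(start="<s>", max_tokens=10):
    tokens, current = [], start
    for _ in range(max_tokens):
        current = bigram[current]      # predict the most likely next token
        if current == "<end>":         # stop at the end-of-sequence token
            break
        tokens.append(current)
    return " ".join(tokens)

print(generate())   # → "the cat sat"
```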

B. Transformer model architecture (high-level)

  • Core components

    • Encoder: Processes the input sequence (prompt).
    • Decoder: Generates the output sequence (response).
  • Processing flow

    1. Text → tokens: Break input into tokens (words/sub-words/punctuation).
    2. Tokens → embeddings: Convert tokens to vectors capturing semantic meaning.
    3. Positional encoding: Add word-order information to embeddings.
    4. Multi-head (self-)attention: Model relations/dependencies among tokens; decoder-only models (e.g., GPT) use masked self-attention to look only backward.
    5. Feed-forward networks: Core transformation layers producing a context representation used to predict the next token.
  • Architectures

    • Encoder-decoder vs. decoder-only (GPT is primarily decoder-only; inputs feed the decoder).
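Step 4 of the processing flow (self-attention) can be sketched numerically. This is a heavily simplified illustration with tiny 2-dimensional "embeddings": real Transformers apply learned query/key/value projections and run many attention heads in parallel, none of which appears here.

```python
import math

# Minimal scaled dot-product self-attention over two toy token vectors.

def softmax(scores):
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    d = len(vectors[0])
    outputs = []
    for q in vectors:                                   # each token attends
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in vectors]                     # ...to every token
        weights = softmax(scores)                       # attention weights
        outputs.append([sum(w * v[i] for w, v in zip(weights, vectors))
                        for i in range(d)])             # weighted mix
    return outputs

out = self_attention([[1.0, 0.0], [0.0, 1.0]])
# Each output row is a probability-weighted blend of all input vectors.
```

Masked self-attention (as in GPT-style decoders) would additionally zero out the weights for tokens that come after the current position.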

C. OpenAI and GPT

  • OpenAI: Company developing advanced AI models.

  • GPT (Generative Pre-trained Transformer): OpenAI’s flagship LLM series.

    • Pre-trained: On vast datasets.
    • Transformer-based: Uses the Transformer architecture.
    • Versions: GPT-3.5, GPT-4, GPT-4 Turbo (newer → more parameters, larger context windows).
    • Context window: Max tokens (input + output) per interaction; larger windows allow longer prompts/responses.
  • ChatGPT: GPT fine-tuned for dialogue.

    • Fine-tuning: Supervised learning + reinforcement learning from human feedback to align with conversational expectations.

D. Prompt engineering

  • Definition: Crafting effective prompts to guide models.

  • Key techniques

    • Explicitness: Be clear and precise.
    • Role-playing: Specify the model’s role (e.g., “Act as a marketing assistant”).
    • Zero-shot learning: No examples, rely on pretraining.
    • Few-shot learning: Provide a few exemplars.
    • Grounding / RAG: Supply external, relevant data in the prompt to improve accuracy and reduce hallucinations.
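Few-shot prompting is ordinary string construction: the examples travel inside the prompt itself. A minimal sketch (the role, example reviews, and labels are all invented for illustration):

```python
# Few-shot prompt builder: the prompt carries the exemplars.

def build_few_shot_prompt(role, examples, user_input):
    lines = [f"You are {role}."]
    for text, label in examples:                # the "few shots"
        lines.append(f"Review: {text}\nSentiment: {label}")
    lines.append(f"Review: {user_input}\nSentiment:")
    return "\n\n".join(lines)

prompt = build_few_shot_prompt(
    role="a sentiment classifier",
    examples=[("Great product, works perfectly!", "positive"),
              ("Broke after two days.", "negative")],
    user_input="Arrived late but does the job.",
)
print(prompt)
```

Dropping the `examples` list turns the same prompt into a zero-shot prompt; grounding/RAG would additionally prepend retrieved reference data.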

III. Microsoft Azure AI Services

A. Microsoft’s role in AI

  • OpenAI partnership: Microsoft provides supercomputing infrastructure for training/hosting.

  • Azure OpenAI Service: Access OpenAI models (GPT, embeddings, DALL·E) in Azure.

    • Deployment: Deploy model instances (e.g., GPT-4) in your subscription.
    • Azure OpenAI Studio: Web UI for experimentation and deployment (chat/completions/image playgrounds).
    • API access: Consume via REST APIs.
    • Pricing: Usage-based, billed per token for both prompts and completions.
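The REST access pattern can be sketched as a request-building helper. The endpoint, deployment name, key placeholder, and `api-version` below are illustrative assumptions (check the current Azure OpenAI REST reference for exact values), and the sketch only constructs the request, it never sends one:

```python
import json

# Sketch of an Azure OpenAI chat-completions request. All concrete values
# (endpoint, deployment, api-version) are placeholders for illustration.

def build_chat_request(endpoint, deployment, api_key, messages,
                       api_version="2024-02-01"):
    url = (f"{endpoint}/openai/deployments/{deployment}"
           f"/chat/completions?api-version={api_version}")
    headers = {"api-key": api_key,            # or an Entra ID bearer token
               "Content-Type": "application/json"}
    body = json.dumps({"messages": messages, "max_tokens": 100})
    return url, headers, body

url, headers, body = build_chat_request(
    endpoint="https://my-resource.openai.azure.com",   # placeholder
    deployment="gpt-4",                                # placeholder name
    api_key="<from-key-vault>",                        # never hardcode keys
    messages=[{"role": "user", "content": "Hello"}],
)
```

In practice the key would come from Azure Key Vault or a managed identity, and an SDK would wrap this plumbing.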

B. Microsoft Copilots

  • Definition: GenAI assistants integrated into Microsoft products (Word, Teams, Bing, Windows 11, Dynamics, Security).
  • Orchestrator role: Take user prompts, perform grounding (e.g., via Microsoft Graph/Bing), create a meta-prompt, and send to the LLM.
  • Capabilities: Accelerate tasks, suggest next steps, generate content, integrate with services (e.g., DALL·E).
  • Model hosting: Microsoft runs its own OpenAI model instances in Azure, meeting regulatory requirements.
  • SaaS solution: Copilots are end-to-end GenAI services.

C. Other Azure AI services (non-Generative AI)

  • Azure Machine Learning Studio (ml.azure.com): Build, train, deploy, and manage custom ML models.

    • Datasets: Create/import/label data.
    • Model training: Use various algorithms.
    • Deployment: To containers (on-prem, AKS, ACI).
  • Azure AI Services (pre-built models):

    • Vision

      • Image Analysis (v4.0): Captions, tags, object detection (bounding boxes), background removal, smart cropping, OCR for small text; supports custom vision with fewer images (Transformer-based).
      • Face API: Face detection, liveness, identification/verification, head pose, masks, glasses, landmarks (no emotion/gender).
      • Custom Vision Service: Older custom image classification/object detection (CNN-based; more images needed).
    • Natural Language Processing (NLP)

      • Language Service: Language detection, sentiment, key phrases, NER, summarization.
      • Question Answering (formerly QnA Maker): Build knowledge bases from FAQs or custom pairs.
      • Language Understanding (LUIS): Detect intents and entities in utterances.
    • Speech

      • Text-to-Speech
      • Speech-to-Text
      • Speech translation
      • Translation Service: Text and document translation; supports custom glossaries.
    • Document Intelligence (formerly Form Recognizer)

      • Document analysis: Extract structured data from PDFs/images, forms, receipts, invoices.
      • Pre-built models and custom models (train with as few as five examples; no-code studio).
      • Semantic understanding: Understands meaning (e.g., recognizes an address).
    • Knowledge Mining (Azure AI Search / formerly Azure Cognitive Search)

      • Purpose: Extract insights and make diverse data searchable.
      • Data sources: Works with blobs, databases, data lakes.
      • Skill sets: Apply AI skills during ingestion (e.g., chunking, vision/language).
      • Embeddings & vector indexes: For semantic search.
      • Indexes: Keyword and vector.
      • Hybrid search & semantic ranking: Combine keyword + vector for relevance.
      • RAG: Key component for grounding LLMs with private data.
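The RAG role of a search index can be sketched end to end. This hedged toy replaces embeddings and a vector index with a naive word-overlap score (a real pipeline would use Azure AI Search with vector/hybrid queries); the documents and query are invented:

```python
# Grounding/RAG sketch: retrieve the most relevant private document,
# then prepend it to the prompt sent to the LLM.

docs = [  # stand-ins for an indexed private corpus
    "Contoso's return policy allows returns within 30 days of purchase.",
    "Contoso support hours are 9am to 5pm, monday through friday.",
]

def retrieve(query, documents):
    # Naive relevance: count shared lowercase words (real systems compare
    # embedding vectors in a vector index instead).
    q = set(query.lower().split())
    return max(documents, key=lambda d: len(q & set(d.lower().split())))

def grounded_prompt(query):
    context = retrieve(query, docs)
    return (f"Answer using only this context:\n{context}\n\n"
            f"Question: {query}")

print(grounded_prompt("What is the return policy?"))
```

The LLM then answers from the supplied context rather than from its training data alone, which is what reduces hallucinations.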

D. Azure AI service resource types & endpoints

  • Single-service accounts: Dedicated to one capability (e.g., Computer Vision, Speech). Often have free tiers; granular cost tracking.

  • Multi-service accounts: One resource for most Azure AI services (excludes Azure OpenAI, AI Search). No free tier; simpler, less granular cost.

  • Azure OpenAI resource: For OpenAI models.

  • Azure AI Search resource: For knowledge mining.

  • Endpoints: Each instance exposes a REST API endpoint.

    • Authentication

      • API keys: Store securely (e.g., Key Vault).
      • Azure AD / Entra ID: RBAC with managed identities or service principals.
  • SDKs: Language-friendly libraries for REST APIs.

IV. Responsible AI Principles

A. Importance: Essential for trustworthy, ethical AI—mitigating risks and ensuring societal benefit.

  • Risks: Bias in data, errors, data exposure, lack of inclusivity/trust, unclear accountability.

B. Microsoft’s six principles

  1. Fairness: Treat all people fairly; avoid bias via comprehensive testing/mitigation.
  2. Reliability and safety: Perform reliably, consistently, and safely; requires rigorous testing and robust deployment.
  3. Privacy and security: Protect personal/sensitive data; scrub data, use legitimate sources, respect privacy.
  4. Inclusiveness: Empower everyone regardless of background or characteristics.
  5. Transparency: Make systems’ purpose/limitations understandable to build trust.
  6. Accountability: People/organizations are responsible for ethical and legal standards.

C. Responsible AI practices in Generative AI

  • Identifying harms: Stress-testing/red teaming.
  • Measuring severity/frequency: Quantify impact and likelihood.
  • Mitigating risks: Content filters, prompt techniques, other safeguards.
  • Operating: Ongoing monitoring/maintenance.
  • Content filters: Prevent harmful content; severity settings may require permissions.
  • Jailbreaking: Attempts to bypass safeguards—necessitates robust protections.

Quiz: Foundations of AI

Instructions: Answer each question in 2–3 sentences.

  1. What is the primary distinction between “regular AI” and “Generative AI” as described in the source material?
  2. Explain the concept of “machine learning” and identify its key goal.
  3. Describe the difference between supervised and unsupervised learning in the context of machine learning.
  4. Briefly explain the role of “parameters” in a deep learning neural network.
  5. What is a “Large Language Model (LLM),” and what is its core function?
  6. How does “prompt engineering” contribute to the effectiveness of Generative AI?
  7. What is the purpose of “grounding” (or Retrieval Augmented Generation – RAG) in a Generative AI context?
  8. Identify two examples of pre-built Vision AI services offered by Azure.
  9. Explain why “Azure AI Search” is useful for knowledge mining beyond simple keyword searches.
  10. Name and briefly describe two of Microsoft’s Responsible AI principles.

Quiz Answer Key

  1. Regular AI primarily imitates specific human behaviors or makes predictions from existing data (e.g., image classification), while Generative AI focuses on creating original content (e.g., text, code, images).
  2. Machine learning is a subset of AI where computers learn from past data to train a model instead of explicit programming; the key goal is to enable predictions/decisions from learned patterns.
  3. Supervised: Labeled data teaches relationships between features and outcomes. Unsupervised: Unlabeled data—finds inherent structure (e.g., clusters) without known answers.
  4. Parameters (weights/biases) define how inputs are transformed; training adjusts them so the model captures complex relationships and improves predictive ability.
  5. An LLM is a Transformer-based Generative AI model trained on massive text corpora; its core function is next-token prediction to generate coherent, contextually relevant language.
  6. Prompt engineering improves outputs by specifying instructions, context, and examples that guide the model toward desired content/format/quality.
  7. Grounding/RAG augments prompts with relevant external data so responses are factual, up-to-date, and less prone to hallucinations.
  8. Image Analysis (captions, tags, object detection) and Face API (face detection, liveness, identification/verification).
  9. Azure AI Search supports both keyword and vector (semantic) indexes, enabling hybrid search and semantic ranking for more relevant results even with varied phrasing.
  10. Fairness: Ensure equitable treatment, mitigate bias. Reliability & safety: Ensure consistent, correct, and safe operation via testing and robust deployment.

Essay Questions (No Answers Provided)

  1. Compare and contrast training and architecture of traditional ML models vs. deep learning, focusing on algorithms, neural networks, and parameters.
  2. Discuss the significance of prompt engineering and grounding in deploying Generative AI responsibly; give examples addressing common LLM challenges.
  3. Explain Copilot orchestration—from user prompt to LLM response—and highlight Azure’s infrastructure role.
  4. Analyze ethical considerations/risks aligned with Microsoft’s Responsible AI principles; choose three and discuss challenges and mitigations.
  5. Describe diverse Azure AI services for non-GenAI tasks; pick three (Vision, NLP, Speech, Document Intelligence) and explain real-world applications.

Glossary of Key Terms

  • Activation function: Function deciding whether a neuron activates based on its inputs.
  • AI-900: Microsoft Azure AI Fundamentals exam.
  • Algorithm (ML): Rules/procedures by which models learn patterns from data.
  • Artificial General Intelligence (AGI): Hypothetical AI with human-like general cognition.
  • Artificial Intelligence (AI): Computer ability to imitate aspects of human intelligence.
  • “Attention Is All You Need”: Paper introducing the Transformer architecture.
  • Azure AI Search (formerly Azure Cognitive Search): Cloud search for private, heterogeneous content; key for knowledge mining and RAG.
  • Azure AI Services: Pre-built/customizable AI services on Azure (Vision, Speech, Language, etc.).
  • Azure Machine Learning Studio (ml.azure.com): Build/train/deploy/manage custom ML models.
  • Azure OpenAI Service: Access to OpenAI models within Azure.
  • Bias (AI): Unfair outcomes often stemming from biased training data.
  • Binary classification: Supervised learning with two outcomes.
  • Chatbot: Program simulating human conversation.
  • ChatGPT: OpenAI’s conversational GPT fine-tuned for dialogue.
  • Chunking: Break large documents/data into smaller segments for processing.
  • Classification: Supervised learning that assigns data to predefined classes.
  • Clustering: Unsupervised grouping of similar data points.
  • Copilot: Microsoft’s GenAI assistant integrated into products.
  • Computer vision: Enabling computers to interpret images/videos.
  • Context window: Max tokens (input + output) a model can process at once.
  • Convolutional Neural Network (CNN): Older image-focused architecture using filters.
  • Custom model: Model trained on specific user data for a particular task.
  • Deep Learning (DL): Multi-layer neural networks learning from large data.
  • Decoder: Transformer component generating the output sequence.
  • Document analysis: Extract structured data/insights from documents.
  • Document Intelligence (formerly Form Recognizer): Azure service for extracting data from forms, receipts, invoices.
  • DALL·E: OpenAI text-to-image model.
  • Embeddings: Vector representations capturing semantic meaning.
  • Encoder: Transformer component processing the input sequence.
  • Endpoint (AI): URL/address for consuming an AI service.
  • Entities (NLP): Specific items identified in text (people, orgs, dates, etc.).
  • Features (ML): Input variables used to train a model.
  • Few-shot learning: Prompt technique with a few examples.
  • Fine-tuning: Further training a pre-trained model on a specific dataset/task.
  • Generative AI: AI that creates new content (text, images, code).
  • GPT: OpenAI’s Generative Pre-trained Transformer models.
  • Grounding (RAG): Augment prompts with external relevant data.
  • GPU: Parallel processor crucial for deep learning training.
  • Hybrid search: Combine keyword + vector (semantic) search.
  • Inference: Using a trained model to predict/generate outputs.
  • Intent (NLP): Underlying purpose of a user’s utterance.
  • Jailbreaking (AI): Attempts to bypass model safety/ethics constraints.
  • JSON: Lightweight data-interchange format.
  • Key phrase extraction: Identify important phrases in text.
  • Knowledge base: Repository of information for Q&A systems.
  • Knowledge mining: Extract insights/structure from large un/semistructured data.
  • Labels (ML): Correct answers/targets in supervised datasets.
  • Language models: Models for understanding/generating human language.
  • Language Understanding (LUIS): Azure service for intents/entities.
  • Large Language Model (LLM): Deep model trained on massive text data.
  • Linear regression: Supervised algorithm predicting continuous values linearly.
  • Liveness check: Face-detection feature verifying a live person is present.
  • Machine Learning (ML): Systems learn from data to predict/decide.
  • Managed identity: Azure identity for resources without stored credentials.
  • Masked self-attention: Decoder mechanism restricting attention to past tokens.
  • Meta-prompt: Enhanced prompt incorporating grounding/context before LLM call.
  • Microsoft Graph: API connecting data/intelligence across Microsoft 365.
  • Multiclass classification: Supervised prediction among many classes.
  • Multi-head attention: Transformer component attending to different subspaces.
  • Multimodal (AI): Processing/generating multiple data types (text/images/audio/video).
  • Natural Language Processing (NLP): Understanding/generating human language.
  • Neural network: Interconnected layers of artificial neurons.
  • Object detection: Identify and locate objects in images (bounding boxes).
  • Optical Character Recognition (OCR): Convert images/PDFs to editable/searchable text.
  • Parameters (deep learning): Trainable weights and biases.
  • Positional encoding: Add position information to token embeddings.
  • Pre-built model: Ready-to-use model trained for general tasks via API.
  • Prompt (Generative AI): Input instruction guiding model output.
  • Prompt engineering: Designing prompts to elicit desired responses.
  • Question Answering (formerly QnA Maker): Azure service for conversational experiences built from existing content.
  • Red teaming (AI): Stress-testing to find vulnerabilities/harms.
  • Regression: Supervised prediction of continuous values.
  • Reliability and safety (AI): Principle ensuring consistent, correct, safe operation.
  • Responsible AI: Ethical framework/principles for AI development/deployment.
  • REST: Web architectural style commonly used by HTTP APIs.
  • Role-Based Access Control (RBAC): Restrict access by user roles.
  • Semantic Kernel: Microsoft’s SDK for integrating AI models with code.
  • Semantic ranking: Re-order results by semantic relevance beyond keywords.
  • Semantic search: Understand meaning/context, not just keywords.
  • Sentiment analysis: Determine emotional tone of text.
  • Skill sets (Azure AI Search): AI skills applied during data ingestion.
  • Smart cropping: Crop images to salient content automatically.
  • Softmax: Output-layer activation mapping scores to probabilities.
  • Speech-to-Text: Convert speech to written text.
  • Speech translation: Translate spoken language (to text or speech).
  • Supervised learning: Train with labeled data (input-output pairs).
  • Support Vector Machine (SVM): Supervised algorithm for classification/regression.
  • Text-to-Speech: Synthesize speech from text.
  • Token (AI): Fundamental unit of text (word/sub-word/punctuation).
  • Tokenizer: Break text into tokens.
  • Training data (ML): Dataset used to train models (features + labels).
  • Transformer model: Attention-based deep architecture foundational to LLMs.
  • Translation Service: Azure service for text translation.
  • Transparency (AI): Principle that users should understand an AI system’s purpose/limits.
  • Unsupervised learning: Train on unlabeled data to find patterns/structures.
  • Vector embedding: Numerical representation in multi-dimensional space.
  • Vector index: Structure for efficient similarity search over embeddings.
  • Weights (neural network): Values on connections determining signal strength.
  • Zero-shot learning: Prompting without examples—rely on pretraining.

Timeline of AI Evolution and Azure AI Services

This timeline details the evolution of Artificial Intelligence, from general AI to the more specialized fields of Machine Learning and Deep Learning, with a particular focus on Azure AI services and their applications.

Early AI Concepts (Prior to Machine Learning)

  • Concept: Artificial Intelligence (AI) is broadly defined as a computer's ability to imitate some aspect of human behavior, such as recognizing speech, classifying images, or translating languages.
  • Early Implementation (Example): Early chess computers, where logic might be explicitly coded.

Rise of Machine Learning (A Subset of AI)

  • Core Principle: Machine Learning enables computers to train themselves on past (labeled) data to make future predictions, rather than explicitly coding every piece of logic.

  • Process: Training data (features + labels) is fed into algorithms (e.g., decision trees, linear regression, support vector machines) to find relationships and create a generalized model.

  • Iterative Refinement: Models are trained with training data, tested with testing data, and then tweaked (parameters adjusted, more data added) until suitable for real-world use.

  • Types of Learning:

    • Supervised Learning: Uses labeled data.

      • Regression: Predicts numeric values (e.g., house prices based on square footage).

      • Classification: Predicts categories.

        • Binary Classification: Two mutually exclusive outcomes (e.g., like/dislike a video).
        • Multiclass Classification: Multiple possible categories (e.g., video genres).
    • Unsupervised Learning: Uses unlabeled data to find patterns and group similar data points (clustering).

  • Tools: Azure Machine Learning Studio (ml.azure.com) is used for custom model creation, data labeling, training, testing, and deployment (including to containers).

Emergence of Deep Learning (A Subset of Machine Learning)

  • Core Principle: Uses neural networks to address complex relationships that traditional ML algorithms struggle to model.
  • Architecture: Multiple layers of interconnected “neurons” (activation functions), including input, output, and hidden layers.
  • Weights and Biases: Each connection has a weight, and each neuron has a bias, adjusted during training to model complex behaviors.
  • Activation Functions: Neurons “activate” if input reaches a threshold (e.g., ReLU, sigmoid).
  • Computational Intensity: Training often requires massive compute (e.g., GPUs) and significant time.
  • Common Output: Frequently uses a softmax output layer to turn raw scores into a probability distribution over the possible outputs (e.g., classes or next tokens).
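The softmax output mentioned above is a short formula: exponentiate each score and normalize. A minimal sketch with invented scores:

```python
import math

# Softmax turns a layer's raw scores into a probability distribution,
# so the most likely class (or next token) can be read off directly.

def softmax(scores):
    # Subtracting the max is a standard numerical-stability trick.
    exps = [math.exp(s - max(scores)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print([round(p, 3) for p in probs])   # probabilities that sum to 1
```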

The Generative AI Revolution

  • Core Principle: Focuses on creating original content, based on patterns learned from vast datasets.
  • Key Technology: Large Language Models (LLMs), built upon deep learning and the Transformer architecture.
  • “Attention Is All You Need” Paper: Pioneered the Transformer model; many LLMs (e.g., GPT) use decoder-only variants.
  • Prompt Engineering: Crafting effective natural-language instructions to guide LLMs. Includes zero-shot, few-shot, and grounding (adding external data to the prompt).
  • Tokenization: Input is broken into tokens (words, subwords, punctuation, emojis) for processing.
  • Embeddings: Tokens mapped to high-dimensional vectors capturing semantic meaning.
  • Positional Encoding: Added to embeddings to preserve word order and positions.
  • Multi-headed Self-Attention: Lets the model weigh the importance of different words (and generated tokens) to maintain long-range context.
  • Inference: A trained model predicts the next token from a prompt, repeating until an end-of-sequence token is produced.
  • Scaling: More parameters and training data generally yield greater capability.
  • Small Language Models (SLMs): Smaller, specialized models optimized for cost and latency on specific tasks.
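The tokenization step above can be approximated with a toy tokenizer. This sketch splits on words and punctuation only; real LLM tokenizers (e.g., byte-pair encoding) also break rare words into sub-word pieces:

```python
import re

# Toy tokenizer: word and punctuation tokens only.

def tokenize(text):
    return re.findall(r"\w+|[^\w\s]", text)

print(tokenize("Transformers aren't magic!"))
# → ['Transformers', 'aren', "'", 't', 'magic', '!']
```

Each resulting token would then be mapped to an embedding vector before entering the model.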

OpenAI’s Contributions

  • GPT (Generative Pre-trained Transformer): Generative models pre-trained on massive datasets using the Transformer architecture.
  • GPT Versions: Evolving capabilities and larger context windows (e.g., 3.5, 4, 4 Turbo).
  • ChatGPT: An application built on GPT, fine-tuned for interactive dialogue via supervised training and response scoring.

Microsoft’s Role and Azure AI Services

  • Partnership with OpenAI: Microsoft provides supercomputing infrastructure and hosts OpenAI services.

  • Azure OpenAI Service: Access to OpenAI models (GPT, DALL·E, embeddings) within Azure, deployable via API with enterprise controls.

  • Azure OpenAI Studio: Manage deployments, experiment in a playground, and access APIs for integration.

  • Microsoft Copilots: Orchestrators leveraging LLMs across Microsoft products (Word, Teams, Bing, Windows 11, Dynamics, security).

    • Functionality: Take user prompts, perform grounding (e.g., via Microsoft Graph or search indexes), build a meta-prompt, call the LLM, and return a refined response.
    • Capabilities: Summarization, content generation, comparison, code generation, image generation (via DALL·E).
    • Service Model: Delivered as SaaS.
  • Other Azure AI Services (Non-Generative AI):

    • Computer Vision

      • Image Analysis (v4.0): Captions, tags, object detection (bounding boxes), background removal, smart cropping, OCR for small text; Transformer-based, less data needed than older CNNs.
    • Face API: Face detection, liveness checks, identification/verification, head pose, masks, glasses, facial landmarks (note: emotional state/gender not supported due to abuse concerns).

    • Custom Vision (Legacy): Older CNN-based custom image training requiring more data.

    • Natural Language

      • Language Service: Language detection, sentiment, key phrases, entities, summarization, question answering.
      • Question Answering (Q&A): Build knowledge bases (FAQs, chit-chat) consumable by bot services.
      • Azure Bot Service: Framework to develop, publish, and manage bots (Teams, web chat, email).
      • LUIS: Intent and entity extraction from user utterances (e.g., “turn on the lights” → intent: turn on; entity: lights).
    • Speech Services

      • Text-to-Speech: Synthesize natural-sounding speech.
      • Speech-to-Text: Transcribe spoken audio; language recognition.
      • Speech Translation: Translate spoken language to text or speech in a target language.
    • Translation Service: Translate text/documents; supports custom domains, profanity filters, selective translation.

    • Document Intelligence (formerly Form Recognizer): Extract structured data from documents (forms, receipts, invoices); pre-built and custom models with few examples (no-code).

    • Knowledge Mining — Azure AI Search (formerly Azure Cognitive Search): Index data from various sources; extract info via skill sets (e.g., chunking, OCR), build keyword and vector indexes; supports hybrid search (keyword + vector).

  • Responsible AI Principles (cross-cutting):

    • Fairness
    • Reliability & Safety
    • Privacy & Security
    • Inclusiveness
    • Transparency
    • Accountability
  • Resource Management: Services can be deployed as single-service resources (often with free tiers), multi-service accounts (shared endpoint, no free tier), Azure OpenAI, and Azure AI Search (its own resource type).

  • Authentication: Access via REST/SDKs; authenticate with API keys (store in Azure Key Vault) or Entra ID (Azure AD) with RBAC.


Cast of Characters

This list focuses on the main entities and concepts as “characters” in the narrative of AI development and application.

  1. Artificial Intelligence (AI)

    • Bio: The overarching concept of computers imitating human capabilities like speech recognition, image classification, or translation.
  2. Machine Learning (ML)

    • Bio: A subset of AI where computers learn from labeled data to make predictions without explicit rules for every scenario.
  3. Deep Learning

    • Bio: A subset of ML using multi-layer neural networks to model intricate relationships; powers many advanced capabilities, including generative AI.
  4. Generative AI

    • Bio: Focused on creating original content (text, code, images) rather than only imitating or classifying; primarily powered by LLMs and Transformers.
  5. Large Language Models (LLMs)

    • Bio: Transformer-based deep learning models trained on massive text corpora to understand and generate natural language by predicting the next token.
  6. The Transformer Model

    • Bio: Foundational architecture introducing self-attention for sequence modeling; described in “Attention Is All You Need.”
  7. OpenAI

    • Bio: AI research and deployment company behind GPT and DALL·E.
  8. GPT (Generative Pre-trained Transformer)

    • Bio: Powerful generative models pre-trained on vast datasets; used for text generation, summarization, coding; versions include GPT-3.5, GPT-4, GPT-4 Turbo.
  9. ChatGPT

    • Bio: An interactive chatbot fine-tuned from GPT for conversational dialogue via supervised training.
  10. Microsoft

    • Bio: Strategic OpenAI partner providing supercomputing infrastructure and hosting OpenAI models on Azure; integrates AI via Copilots.
  11. Microsoft Copilots

    • Bio: AI assistants embedded across Microsoft products, orchestrating grounding + LLMs to provide context-aware assistance.
  12. Azure OpenAI Service

    • Bio: Access to OpenAI models within Azure for secure, compliant enterprise deployment and integration.
  13. Azure Machine Learning Studio

    • Bio: Cloud service to build, train, deploy, and manage custom ML models (supervised and unsupervised).
  14. Azure AI Services (General)

    • Bio: Pre-built, cloud AI models/APIs across vision, language, speech, and document intelligence for easy app integration.
  15. Azure AI Search (formerly Azure Cognitive Search)

    • Bio: Cloud search enabling knowledge mining with skill sets, keyword + vector indexes, and semantic (hybrid) search.
  16. Responsible AI

    • Bio: Ethical principles and practices—Fairness, Reliability & Safety, Privacy & Security, Inclusiveness, Transparency, Accountability—guiding safe, beneficial AI.

What is the fundamental difference between traditional AI and Generative AI?

Traditional AI focuses on imitating human behaviors, such as recognizing speech, classifying images, or translating languages. Its goal is to replicate existing human-like intelligence. Generative AI, on the other hand, shifts the focus to creating original content—natural language, code, images, etc. While it’s trained on vast datasets, it generates novel outputs that didn’t exist in its training data in that specific form, much like how humans create new ideas based on learned experiences.

How do Large Language Models (LLMs) like GPT work, and what is the role of a “prompt”?

Large Language Models are the core of many generative AI applications. They are built upon a Transformer architecture and trained on massive amounts of text from the internet, books, and other sources.

When you interact with an LLM, you provide a prompt (natural-language instructions or questions). The model:

  • breaks the text into tokens,
  • maps tokens to embeddings (numerical vectors capturing semantic meaning),
  • adds positional encoding (so word order affects meaning),
  • uses self-attention to relate parts of the sequence,
  • then predicts the next most probable token repeatedly until the response is complete.

The quality and phrasing of the prompt—prompt engineering—strongly influence the output.
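The token-by-token generation loop described above can be sketched in a few lines. The "model" here is a hypothetical hard-coded bigram table, not a real LLM; it only illustrates the repeat-until-end-of-sequence mechanic.

```python
# Toy next-token loop: look up the most probable successor of the current
# token and repeat until an end-of-sequence marker appears.
BIGRAMS = {
    "<bos>": "the", "the": "cat", "cat": "sat", "sat": "<eos>",
}

def generate(start="<bos>", max_tokens=10):
    """Repeatedly predict the next token until end-of-sequence (or a cap)."""
    tokens = []
    current = start
    for _ in range(max_tokens):
        nxt = BIGRAMS.get(current, "<eos>")
        if nxt == "<eos>":  # stop at the end-of-sequence token
            break
        tokens.append(nxt)
        current = nxt
    return " ".join(tokens)

print(generate())  # → the cat sat
```

A real LLM replaces the lookup table with a neural network that outputs a probability distribution over its whole vocabulary at each step.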

What is the “Transformer model” and its key components in the context of LLMs?

The Transformer is a foundational architecture (introduced in Attention Is All You Need) with an encoder and a decoder. The encoder turns input into a rich representation; the decoder uses it to generate output.

Key components:

  • Tokenization: Split input text into tokens.
  • Embeddings: Convert tokens into vectors so similar words have similar representations.
  • Positional Encoding: Inject order information into token representations.
  • Multi-head Attention: Let the model weigh different tokens’ importance and maintain long-range context.

Note: Modern GPT-style LLMs are typically decoder-only Transformers.
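The embedding idea above ("similar words have similar representations") is usually measured with cosine similarity. A minimal sketch, using made-up 3-dimensional vectors (real models use hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings chosen so that "dog" and "puppy" point the
# same way while "car" points elsewhere.
emb = {
    "dog":   [0.90, 0.80, 0.10],
    "puppy": [0.85, 0.75, 0.15],
    "car":   [0.10, 0.20, 0.90],
}

print(cosine_similarity(emb["dog"], emb["puppy"]))  # close to 1.0
print(cosine_similarity(emb["dog"], emb["car"]))    # much smaller
```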

What is the relationship between OpenAI, GPT, ChatGPT, and Microsoft’s Azure AI services and Copilots?

  • OpenAI developed GPT (Generative Pre-trained Transformer) models that predict the next token.
  • ChatGPT is a GPT model further tuned for dialogue, making it effective for conversational tasks.
  • Microsoft partners with OpenAI, providing large-scale GPU infrastructure and hosting models in Azure OpenAI so developers can use them via Azure.
  • Microsoft also embeds these models in product Copilots (e.g., Microsoft 365, Bing, Windows). Copilots orchestrate user prompts, ground them with product data (e.g., Microsoft Graph), build a meta-prompt, and call the underlying LLM to produce a response.

What are the different types of Machine Learning, and how does Deep Learning fit in?

Machine Learning (a subset of AI) enables computers to learn from data without explicit programming. Major types include:

  • Supervised Learning: Train on labeled data (features + correct labels).

    • Regression: Predict numeric values (e.g., house prices).
    • Classification: Assign categories (e.g., spam/not spam; multiple genres).
  • Unsupervised Learning: Find patterns in unlabeled data (e.g., clustering).

Deep Learning is a subset of ML that uses neural networks with many hidden layers (neurons, weights, biases, activation functions). This depth lets models learn complex relationships, powering capabilities found in generative AI and LLMs, but typically requires significant data and compute.
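The supervised regression case above (predict a number from a feature) can be shown with ordinary least squares in plain Python. The house-size and price numbers are illustrative only:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = m*x + b, the simplest regression model."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    m = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    b = mean_y - m * mean_x
    return m, b

# Features: house size (sqm). Labels: price (made-up training data).
sizes = [50, 80, 120, 200]
prices = [150, 240, 360, 600]
m, b = fit_line(sizes, prices)
print(round(m * 100 + b))  # predicted price of a 100 sqm house → 300
```

Training here is one closed-form calculation; deep learning replaces it with iterative adjustment of millions of parameters, but the idea of fitting parameters to labeled data is the same.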

What are “Responsible AI” principles, and why are they crucial for Generative AI?

Six key principles:

  1. Fairness: Treat people equitably; mitigate dataset and outcome bias.
  2. Reliability & Safety: Test and operate systems consistently and safely.
  3. Privacy & Security: Protect sensitive data and prevent leakage.
  4. Inclusiveness: Make AI beneficial and accessible to diverse users.
  5. Transparency: Explain purpose, limits, and workings to build trust.
  6. Accountability: Assign responsibility for outcomes and impacts.

Generative AI can create new content and thus introduces risks (e.g., biased, harmful, or misleading outputs). Filters, policies, and governance are essential.

What are some key pre-built Azure AI services beyond Generative AI?

  • Computer Vision: Image captioning, tagging, object detection, background removal, smart cropping, OCR for small text; Face API for detection, liveness, and identification/verification.
  • Natural Language: Language detection, sentiment analysis, key phrases, entity recognition, summarization, question answering, and intent/entity parsing (LUIS).
  • Speech: Text-to-speech and speech-to-text, plus speech and text translation.
  • Document Intelligence (formerly Forms Recognizer): Extract structured data from documents (receipts, invoices), with custom models trainable from a few examples.
  • Knowledge Mining (Azure AI Search): Index structured/semi-structured/unstructured content; use skill sets for chunking/enrichment; support keyword and vector (semantic) search.

Many services offer per-service resources and often a free tier for experimentation.

How do developers build their own AI solutions and integrate them with Azure AI services?

  • Use Azure Machine Learning Studio to create datasets, label, train, evaluate, and deploy ML/DL models.
  • Call pre-built services via REST endpoints secured by API keys or Entra ID (Azure AD) for role-based access.
  • Leverage SDKs for popular languages to simplify integration.
  • For complex, data-aware workflows, frameworks like Semantic Kernel orchestrate tools and data sources (e.g., Azure AI Search for grounding) to build enriched meta-prompts before invoking LLMs.
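The grounding step mentioned above (enriched meta-prompts) amounts to pasting retrieved data into the prompt before calling the LLM. A minimal sketch; the exact structure is an assumption, and real orchestrators such as Semantic Kernel vary:

```python
def build_meta_prompt(user_question, retrieved_chunks):
    """Assemble a grounded (RAG-style) prompt: retrieved context first,
    then instructions, then the user's actual question."""
    context = "\n".join(f"- {chunk}" for chunk in retrieved_chunks)
    return (
        "Answer using ONLY the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {user_question}"
    )

# Hypothetical chunks, e.g. returned by an Azure AI Search query.
prompt = build_meta_prompt(
    "When is the office closed?",
    ["Office hours: Mon-Fri 9-17.", "Closed on public holidays."],
)
print(prompt)
```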

AI Fundamentals: Briefing Document

1. Introduction to Artificial Intelligence (AI)

Artificial Intelligence (AI) encompasses computer systems designed to "imitate some aspect of human behavior." This broad definition covers a wide range of capabilities, from recognizing speech and classifying images to translating languages and making predictions based on historical data.

1.1. Machine Learning (ML)

Machine Learning is a subset of AI in which computers learn from past data, training a model that can then make future predictions. This involves:

  • Training Data: Labeled datasets containing "features" (input data) and "labels" (correct answers).
  • Algorithms: Various algorithms (e.g., decision trees, linear regression, support vector machines) are used to find relationships between features and labels.
  • Model: The output of the training process, which can then be used to predict labels for new, unseen data.
  • Iterative Process: Models are often refined through testing with "testing data" and tweaking parameters until desired accuracy is achieved.

Types of Machine Learning:

  • Supervised Learning: Uses labeled data to:

    • Regression: Predicts numeric values (e.g., house prices based on square footage).

    • Classification: Categorizes data into classes:

      • Binary: Two possible outcomes (e.g., "like" or "not like" a video).
      • Multiclass: Multiple possible outcomes (e.g., classifying a video as horror, fiction, or learning).
  • Unsupervised Learning: Uses unlabeled data to identify "clustering" or natural groupings within the data based on similarities.

1.2. Deep Learning

Deep Learning is a more advanced subset of Machine Learning, essential for handling complex relationships that traditional algorithms cannot capture. It utilizes "neural networks," which are composed of multiple layers of interconnected "neurons."

  • Neural Network Structure:

    • Input Layer: Receives the initial data (e.g., tokens from text).
    • Hidden Layers: Multiple intermediate layers where complex computations occur.
    • Output Layer: Produces the final result (e.g., predicted next token).
  • Neurons (Activation Functions): Each neuron processes incoming values and decides whether to "activate" (pass a value) or pass zero to subsequent neurons, often based on a threshold.

  • Weights and Biases: Each connection between neurons has a "weight," and each neuron has a "bias." These "trillions of parameters" are adjusted during training to allow the network to model highly complex patterns.

  • Training: Can take months on tens of thousands of powerful GPUs, gradually nudging the weights and biases until the network can make accurate predictions (e.g., the next most probable token).
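A single neuron from the structure above (weighted sum, bias, activation) fits in a few lines. The weights here are hand-picked for illustration, not trained:

```python
def relu(x):
    """A common activation function: pass positive values through, else 0."""
    return max(0.0, x)

def neuron(inputs, weights, bias):
    """One neuron: weighted sum of inputs plus bias, then activation."""
    return relu(sum(i * w for i, w in zip(inputs, weights)) + bias)

# Two-input neuron: 1.0*0.5 + 2.0*(-0.25) + 0.1 = 0.1, positive, so it fires.
out = neuron([1.0, 2.0], weights=[0.5, -0.25], bias=0.1)
print(out)
```

A deep network is many such neurons stacked in layers, with training nudging every weight and bias.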

2. Generative AI

Generative AI is a newer field that shifts the focus to the ability to create original content. Unlike traditional AI, which imitates human behavior (e.g., classifying images), generative AI produces new content.

2.1. Core Concepts of Generative AI

  • Original Content Generation: Can create natural language (conversations, summaries, code), images, and other forms of media.

  • Large Language Models (LLMs): The foundation of most generative AI, such as GPT (Generative Pre-trained Transformer) and LLaMA. These models are trained on huge amounts of information (web data, Wikipedia, books) and have billions or even trillions of parameters.

  • Transformer Model: The underlying architecture for modern LLMs, pioneered by the paper Attention Is All You Need. It processes input through an encoder and generates output through a decoder, although many modern LLMs like GPT primarily use the decoder.

  • Prompt: The natural language instructions given to an LLM. The quality of the prompt drives the quality of the response.

  • Prompt Engineering: The science of crafting effective prompts. Key techniques include:

    • Explicitness: Being clear about what is desired.
    • Role-playing: Instructing the model on how it should act.
    • Zero-shot: Providing no examples.
    • Few-shot: Providing examples of user input and desired agent responses.
    • Grounding (Retrieval-Augmented Generation, RAG): Bringing in data from external sources (e.g., emails, search indexes) to add context to the prompt, compensating for the LLM's lack of real-time knowledge.
  • Token Prediction (Inference): The LLM's fundamental task is to predict the next token, one at a time, repeating the process until it produces a special end-of-sequence token.

  • Tokenization: Converts input text into "tokens" (parts of words, whole words, punctuation, emojis) that computers can process numerically.

  • Embeddings: Creates "vectors" (sequences of numbers in high-dimensional space) that represent the "semantic meaning of the words or those tokens." Words with similar meanings will have vectors that are "very close to each other."

  • Positional Encoding: Adds information about the position of words in a sequence to the embedding vectors, as "the positions of the words matter."

  • Self-Attention: A crucial mechanism that allows the model to understand the relationships between different words in a sequence, ensuring it "doesn't forget about stuff that's earlier on" and maintains context.
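The few-shot technique listed above is usually expressed as a chat message list with OpenAI-style roles, where worked examples precede the real input. The task and wording are illustrative:

```python
# Few-shot prompt as a role-tagged message list: the user/assistant example
# pairs teach the model the expected output format before the real query.
messages = [
    {"role": "system", "content": "Classify the sentiment as Positive or Negative."},
    {"role": "user", "content": "I loved this movie!"},              # example input
    {"role": "assistant", "content": "Positive"},                    # desired response
    {"role": "user", "content": "Terrible service, never again."},   # example input
    {"role": "assistant", "content": "Negative"},                    # desired response
    {"role": "user", "content": "What a wonderful day."},            # the real query
]
```

Dropping the example pairs and keeping only the system and final user messages would turn this into a zero-shot prompt.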

2.2. OpenAI and GPT

  • OpenAI: A company that has done significant work in generative AI, notably developing the GPT models.
  • GPT (Generative Pre-trained Transformer): A family of LLMs. Versions such as GPT-3.5, GPT-4, and GPT-4 Turbo differ in parameter count (more parameters generally means a more capable model) and context window (how many tokens the model can handle across input and output). GPT-4 Turbo, for example, has a 128K-token context window but an output limit of 4,096 tokens.
  • ChatGPT: Built on the GPT model but with "additional training for interaction" (supervised training with user/agent examples and result scoring) to align its behavior with typical user dialogue.

2.3. Microsoft's Role in Generative AI

Microsoft is a major partner of OpenAI, providing the datacenter infrastructure and GPU supercomputers used for training. Microsoft leverages these technologies in two primary ways:

  • Microsoft Copilots: These are orchestrators that integrate LLMs into various Microsoft products (Word, Teams, Bing, Windows 11, Dynamics).

    • Functionality: Copilots take user prompts, perform "grounding" (gathering relevant data from specific applications via APIs like Microsoft Graph or Bing search indexes), create a "meta-prompt," send it to an LLM (Microsoft's own instances, often within their regulatory boundaries), and return a refined response.
    • Capabilities: Copilots accelerate tasks, suggest next steps, and can even advertise their available APIs to the large language model so it can execute actions on behalf of the user.
    • Nature: They are offered as a SaaS-style solution: generative AI as a complete service.
    • Multimodal Interaction: Bing Chat demonstrates multimodal capabilities, allowing image input, image generation (via DALL-E 3), and code generation.
  • Azure OpenAI Service: Allows developers to deploy and use instances of OpenAI models (GPT, embedding models, DALL-E) within their own Azure cloud environment.

    • Azure OpenAI Studio: Provides a platform for creating and deploying models, experimenting in a "playground," and exposing them via an API for custom applications.
    • Pricing: Based on usage (tokens for prompts and completions).
    • Data Integration: Applications can use services like Azure AI Search (formerly Azure Cognitive Search) and Semantic Kernel (an orchestrator) to connect LLMs to proprietary data (blob storage, databases, data lakes). This involves creating "vector embeddings" of the data and user queries to find semantically similar information, enhancing the LLM's responses.

3. Other Azure AI Services

Beyond generative AI, Azure offers a comprehensive suite of pre-built AI services for various tasks.

3.1. Computer Vision

Deals with analyzing images (fundamentally pixels with values). Can be multimodal, understanding images, language, and video.

  • Image Analysis (v4.0):

    • Capabilities: Caption generation, tagging, object detection (with bounding boxes), background removal, smart cropping, Optical Character Recognition (OCR) for small amounts of text.
  • Custom Models: Can be trained with as few as 2–5 images (though 50–60 are recommended for best quality) using the Transformer architecture, offering higher performance than older convolutional neural networks.

  • Face:

    • Capabilities: Detects faces, performs liveness checks, identifies and verifies individuals against a database, detects head pose, masks, glasses, and facial landmarks.
  • Limitations: No longer supports emotional state or gender detection due to abuse potential.

3.2. Natural Language

Enables computers to understand and interact with human language, primarily through tokenization.

  • Language Service:

    • Capabilities: Text analysis (language detection, sentiment analysis, key phrase extraction, entity recognition), summarization, question answering.
  • Question Answering: Creates a "knowledge base" of Q&A pairs, often consumed by an Azure Bot Service.

  • Azure Bot Service: Provides a framework to develop, publish, and manage bots across various channels (Teams, web chat, email).

  • Language Understanding (LUIS): Detects the "intent" and "entities" within a user's "utterance" (e.g., "turn on the lights" → intent: turn on, entity: lights).

  • Speech Service:

    • Capabilities: Text-to-speech (synthesizing voices), speech-to-text (transcription), language recognition, speech translation (to text or to another language of speech).
  • Translator Service:

    • Capabilities: Translates text and documents between over 60 languages, supports custom domain/industry-specific language dictionaries, profanity filters, and selective translation of certain words.

3.3. Document Intelligence (formerly Forms Recognizer)

Focuses on extracting structured data from documents (forms, receipts, invoices, large text bodies).

  • Document Analysis: Provides a structured data version of documents.
  • Pre-built Models: Available for common document types (receipts, invoices, ID cards).
  • Semantic Understanding: Recognizes the "semantic meaning" of text within documents (e.g., "that's an address," "that's a phone number").
  • Custom Models: Can be trained with as few as five examples using a no-code approach in the Document Intelligence Studio.

3.4. Knowledge Mining (Azure AI Search)

A dedicated Azure AI Search resource for extracting insights from large datasets.

  • Data Sources: Integrates with various data sources (blob, data lake, databases).

  • Skill Sets: Defines operations to enrich and process data:

    • Chunking: Breaks large documents into smaller, manageable parts.
    • Embedding Model: Creates high-dimensional vector representations of data chunks.
    • Enrichment: Calls other services (e.g., Vision service for text extraction from images).
  • Indexes: Creates both traditional exact text indexes and "vector indexes" (for semantic meaning).

  • Hybrid Search: Combines keyword and vector-based search, with semantic ranking, to provide highly relevant results for natural language queries.
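Hybrid search can be pictured as blending two relevance scores per document. This weighted sum is only a toy illustration; Azure AI Search actually uses reciprocal-rank fusion plus semantic reranking:

```python
def hybrid_score(keyword_score, vector_score, alpha=0.5):
    """Blend keyword and vector relevance with a simple weighted sum
    (illustrative stand-in for real rank-fusion algorithms)."""
    return alpha * keyword_score + (1 - alpha) * vector_score

# Hypothetical per-document (keyword_score, vector_score) pairs:
# doc_a matches keywords well; doc_b is the better semantic match.
docs = {"doc_a": (0.9, 0.2), "doc_b": (0.3, 0.95)}
ranked = sorted(docs, key=lambda d: hybrid_score(*docs[d]), reverse=True)
print(ranked)  # doc_b ranks first: strong semantic match outweighs keywords
```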

4. Azure AI Service Management

  • Resource Types:

    • Single Service: Dedicated to one AI capability (e.g., Computer Vision, Speech). Often offers a free tier for experimentation.
    • Multi-Service: A single resource for consuming nearly all of the different service types (excluding Azure OpenAI and AI Search). No free tier.
    • Azure OpenAI Resource: For OpenAI models.
    • Azure AI Search Resource: For knowledge mining.
  • Endpoints: Each service exposes a REST endpoint (URL) for applications to interact with.

  • Authentication:

    • API Keys: Applications use keys to authenticate, which should be stored securely (e.g., Azure Key Vault).
    • Entra ID Integrated Authentication: Many services support Microsoft Entra ID (formerly Azure Active Directory) with Role-Based Access Control (RBAC) and managed identities, eliminating the need to store keys.
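Key-based authentication above works by sending the key in a request header; Azure AI services commonly use `Ocp-Apim-Subscription-Key`. A sketch that only builds (does not send) such a request; the endpoint URL and body shape are placeholder assumptions, not a specific service's contract:

```python
import json
import urllib.request

# Hypothetical endpoint and key; substitute your own resource values,
# ideally loading the key from Azure Key Vault rather than source code.
ENDPOINT = "https://my-resource.cognitiveservices.azure.com/language/:analyze-text"
API_KEY = "<your-key-from-key-vault>"

def build_request(text):
    """Build a keyed REST request (send it later with urllib.request.urlopen)."""
    body = json.dumps({"documents": [{"id": "1", "text": text}]})
    return urllib.request.Request(
        ENDPOINT,
        data=body.encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": API_KEY,  # key-based authentication
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request("Hello world")
```

With Entra ID authentication a bearer token obtained via a managed identity would replace the subscription-key header.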

5. Responsible AI

The introduction of AI systems necessitates careful consideration of ethical implications and potential risks. Microsoft outlines six key principles:

  1. Fairness: Ensuring all people are treated fairly and mitigating "bias" introduced through training data. Requires "comprehensive testing."
  2. Reliability and Safety: Ensuring models are trustworthy, especially for critical applications. Requires "rigorous testing and a good deployment process."
  3. Privacy and Security: Protecting data used for training and ensuring it's not exposed or misused. Data scrubbing and legitimate sourcing are critical.
  4. Inclusiveness: Ensuring the technology works for and includes "everyone from all parts of society."
  5. Transparency: Users should understand "how it's working, what are its limitations, what is its purpose" to foster trust.
  6. Accountability: Establishing clear responsibility for AI systems, typically with developers and companies ensuring adherence to ethical and legal standards.

Practical Application: Azure AI Services incorporate content filters and protections to prevent negative or harmful behavior. While some filter severities can be adjusted, certain functionalities require special permissions, highlighting the importance of responsible AI design.
