Azure Certification Prep Notes

Azure Solutions Architect

Module: Design authentication and authorization solutions

Notes from going through Design authentication and authorization solutions.

  • Authentication: Check that a user is recognized by the system
  • Authorization: Guard access for the authenticated user to specific resources

Identity and Access Management (IAM)

Core features of a well-designed IAM solution

  1. Unified - central location for all identities
  2. Seamless - fast user experience, keeps users productive
  3. Secure - has adaptive policies to protect access to resources
  4. Simplified identity governance

Design Considerations

  1. Consider using Entra ID
  2. Consider your B2B requirements
  3. Consider your B2C requirements

Microsoft Entra ID

  • Azure's solution for IAM
  • Can be implemented as a cloud-only solution
    • In cloud-only mode it provides identity management and RBAC
  • Can also be implemented in hybrid modes
    • In hybrid mode, extends on-premises Active Directory to cloud
    • Useful if an organization is already using Active Directory and does not want to implement a whole new solution just to access Azure resources.

Design Considerations

  1. Consider benefits of centralized identity management.
    1. For hybrid scenarios, don't create a new identity just for cloud. Use Entra ID in hybrid mode.
    2. Users will be more productive with a single identity across on-premises and cloud.
    3. Admins will be more productive if they can manage identities centrally.
  2. Consider using a single Microsoft Entra instance.
    1. Single Entra instance = single authoritative source = single source of truth = lower risks due to human error
  3. Consider limiting account synchronization.
    1. If you have accounts with high privileges in on-premises Active Directory, don't sync them to Microsoft Entra ID.
    2. This limits blast radius.
  4. Consider password hash synchronization.
    1. Described here.
  5. Consider single sign-on (SSO).
    1. Reduces likelihood of users reusing passwords or generating weak passwords.
  6. Consider overhead of managing separate identities.
    1. Separate on-premises and cloud identities = overhead of account management = extra risk

Azure Solutions Architect

Module: Design for Governance

Notes from going through https://learn.microsoft.com/en-us/training/modules/design-governance

Azure Hierarchy

Tenant Root Group contains all management groups and subscriptions. Allows global policy assignment.

| Item | Description |
| --- | --- |
| Management Group | Manage multiple subscriptions. |
| Subscription | Logical container for management and scale. Is a billing boundary. |
| Resource Group | Logical containers for resources. |
| Resource | Instance of services. |

Management Groups

  • Place policies across multiple subscriptions
  • MGs can be nested if you wish.
    • E.g. Place limits across subscriptions on which regions can be used
  • If multiple subscriptions require the same policies, group them under one MG.

💡 Don't overuse the MG hierarchy; this makes it unnecessarily complex.

💡 Don't create too many MGs if a subscription can do the job for you.

Subscriptions

  • Each subscription is one billing environment

💡 Consider placing all shared networking resources in one subscription (e.g. all WANs, ExpressRoutes).

💡 Consider placing specialized resources (e.g. IoT) under dedicated subscriptions.

💡 Consider placing all resources with a specific compliance requirement (e.g. PCI) into one subscription.

💡 Remember that VNets cannot be shared across subscriptions, so VNet peering may be needed if your resources need to communicate across subscriptions.

Resource Groups

  • All resources that are linked to each other and have the same lifecycle go into one Resource Group.
  • Resources in different RGs can talk to each other.
  • Resources can be moved between RGs.

Group by type or technology:

graph TD;
    SQLRG-->Resource1;
    SQLRG-->Resource2;
    SQLRG-->Resource3;

    WEBRG-->Resource4;
    WEBRG-->Resource5;
    WEBRG-->Resource6;

Group by application:

graph TD;
    App1RG-->Resource1;
    App1RG-->Resource2;
    App1RG-->Resource3;

    App2RG-->Resource4;
    App2RG-->Resource5;
    App2RG-->Resource6;

Resource Tags

Same as tags everywhere else. Key-value pairs.

💡 If you apply a tag to an RG, the resources in the RG don't inherit that tag.
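
Because tags are not inherited from the resource group, a cost report keyed on a tag only sees resources that carry the tag themselves. The sketch below illustrates that with an in-memory inventory; the resource names, tag keys, and costs are all hypothetical, and this is not an Azure API call.

```python
from collections import defaultdict

# Hypothetical inventory: names, tags, and costs are invented for illustration.
# Note that "sql1" sits in a tagged resource group but has no tag of its own.
resource_group_tags = {"rg-research": {"project": "rnd"}}
resources = [
    {"name": "vm1", "rg": "rg-research", "tags": {"project": "rnd"}, "cost": 120.0},
    {"name": "sql1", "rg": "rg-research", "tags": {}, "cost": 300.0},
    {"name": "web1", "rg": "rg-web", "tags": {"project": "rnd"}, "cost": 80.0},
]


def cost_by_tag(resources, key):
    """Sum cost per tag value, reading only each resource's own tags."""
    totals = defaultdict(float)
    for res in resources:
        value = res["tags"].get(key, "<untagged>")
        totals[value] += res["cost"]
    return dict(totals)


report = cost_by_tag(resources, "project")
```

In this sketch `sql1` lands in the `<untagged>` bucket even though its resource group is tagged, which is the practical consequence of the note above: tag the resources themselves if you want them in the report.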

Azure Policy

  • Prevent non-compliant resources from being created
  • Policies can be applied at different levels and are inherited down the hierarchy.

Examples

| Policy | Applied at |
| --- | --- |
| All resources should have mandatory tags | Management Group |
| Only specific SKUs for VMs | Management Group |
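
To make "inherited down the hierarchy" concrete, here is a minimal sketch that walks from a scope up to the root and collects every policy assigned along the way. The scope names and policy names are hypothetical, and the dictionaries stand in for the real hierarchy; this is not how Azure Policy is queried.

```python
# Hypothetical hierarchy: child scope -> parent scope.
PARENT = {
    "mg-prod": "tenant-root",
    "sub-payments": "mg-prod",
    "rg-api": "sub-payments",
}

# Hypothetical policy assignments per scope.
ASSIGNMENTS = {
    "tenant-root": ["require-mandatory-tags"],
    "mg-prod": ["allowed-vm-skus"],
    "rg-api": ["allowed-regions"],
}


def effective_policies(scope):
    """Collect policies assigned at the scope and at every ancestor scope."""
    policies = []
    while scope is not None:
        policies.extend(ASSIGNMENTS.get(scope, []))
        scope = PARENT.get(scope)  # None once we pass the root
    return sorted(policies)
```

For example, `effective_policies("sub-payments")` picks up the policies from `mg-prod` and `tenant-root` even though nothing is assigned on the subscription itself.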

Role Based Access Control (RBAC)

💡 Policy vs RBAC: Policy ensures that the resource is compliant. RBAC ensures the individual (user) has access to perform certain actions.

  • RBAC is an "allow" model, i.e. you specify what things to allow access to.
  • 💡 Best practices when designing RBAC:
    • Assign roles to groups, not users. (makes it more manageable)
    • Consider principle of least-privilege.
  • Custom roles are useful when the default roles don't give you the granularity you need.
  • RBAC is an "additive" model, i.e. effective permissions are the sum of your role assignments.
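
The "allow" and "additive" properties can be sketched as a set union: effective permissions are the union of everything the assigned roles allow, and anything not in that union is denied. The role names and action strings below are hypothetical, not built-in Azure roles.

```python
# Hypothetical role definitions; the action strings are illustrative only.
ROLES = {
    "vm-reader": {"vm/read"},
    "vm-operator": {"vm/read", "vm/start", "vm/stop"},
    "storage-reader": {"storage/read"},
}


def effective_permissions(assigned_roles):
    """Additive model: union the allowed actions of every assigned role."""
    perms = set()
    for role in assigned_roles:
        perms |= ROLES[role]
    return perms


def is_allowed(assigned_roles, action):
    """Allow model: an action is permitted only if some role grants it."""
    return action in effective_permissions(assigned_roles)
```

Assigning `vm-reader` to a group and later adding `vm-operator` only ever widens what the group can do, which is why least-privilege matters when choosing roles.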

Azure Landing Zones

Azure Landing Zones provide an infrastructure environment for hosting your workloads.

Knowledge Check

Tailwind Traders is planning on making some significant changes to their governance solution. They would like your assistance with recommendations and questions. Here are the specific requirements:

  1. Consistency across subscriptions. It appears each subscription has different policies for the creation of virtual machines. The IT department would like to standardize the policies across the Azure subscriptions.

  2. Ensure critical storage is highly available. There are several critical applications that use storage. The IT department wants to ensure the storage is made highly available across regions.

  3. Identify research and development costs. The CFO wants to know the cost estimates for a new project. The costs are spread out across multiple departments.

  4. ISO compliance. Tailwind Traders wants to certify that it complies with the ISO 27001 standard. The standard requires resource groups, policy assignments, and templates.

  1. How can Tailwind Traders ensure policies are implemented across multiple subscriptions?

  • Add a resource tag that includes the required policy.
  • Create a management group and place all the relevant subscriptions in the new management group.
  • Add a resource group and place all the relevant subscriptions in it.
  1. How can Tailwind Traders ensure applications use geo-redundancy to create highly available storage applications?
  • Add a resource tag to each storage account for geo-redundant storage.
  • Add a geo-redundant resource group to contain all the storage accounts.
  • Add an Azure policy that requires geo-redundant storage.
  1. How can Tailwind Traders report all the costs associated with a new product?
  • Add a resource tag to identify which resources are used for the new product.
  • Add a resource group and move all product assets into the group.
  • Add a spreadsheet and require each department to log their costs.

Azure Solutions Architect

Module: Design for Security

Defense in Depth

My notes from going through https://docs.microsoft.com/en-us/learn/modules/design-for-security-in-azure/

  • Zero trust model: Never assume trust, but continually validate it.

    • E.g. don't assume that because a request came from inside the network it can be trusted.
    • This approach enforces "Defense in Depth".
  • Azure Security Center provides a solution to implement security and mitigate threats.

Defense in Depth Principles (CIA)

  • Confidentiality: Principle of least privilege. Restrict access only to individuals explicitly granted access.
  • Integrity: Prevent unauthorized changes to data at rest or in transit.
  • Availability: Ensure services are available to authorized users.

Some of the stuff in this module is also covered in https://gist.github.com/savishy/bb6c53679aa933857c8b1a7afe66591f#design-for-security.

Each of the Defense in Depth layers can implement one or more of the CIA principles.

Check your knowledge

  1. True or false: defense in depth is a strategy aimed to protect you against attacks attempting to gain access to your information?
  • True
  • False
  1. True or false: by moving to the cloud, my architecture is fully secure and I can hand off all security responsibilities to my cloud provider?
  • True
  • False

Identity Management

Single Sign On

  • More identities = more effort to manage them by end users = greater risk of misplacing or exposing them.
  • More effort if a user is locked out, an employee leaves the organization, etc.

Azure AD

  • Provides centralized SSO
  • can also integrate with your existing on-premises Active Directory (with Azure AD Connect)
  • all applications (even on-prem) can share same credentials
  • Add rules and policies to control access to applications and data

Multi Factor Authentication

  • MFA
    • Something you have: an RSA device or an app that generates OTP
    • Something you know: passwords, security questions
    • Something you are: biometrics
  • Azure AD has MFA Support:
    • All global administrators get MFA enabled free of charge
    • Other accounts can have MFA enabled by purchasing licenses

Conditional access policies

  • E.g. block logins from IPs that are not in a certain range
  • Require MFA from IPs outside of work IP range

Securing legacy applications

Azure Application Proxy

  • Legacy on-premises applications can be accessed remotely without any code changes.
  • Users use MyApps portal to get single-sign-on both to SaaS apps and On-Prem apps.
  • remote access does not require opening any inbound connections through the firewall.
  • Application Proxy is a cloud application, so you save time and money on infrastructure and maintenance costs.

2 components of Azure Application Proxy

  • An external URL
  • An on-prem connector agent

Users -> Navigate to URL -> Authenticate with Azure AD -> Connector Agent routes them to the on-premises application

Working with consumer identities

Azure AD B2C

  • Allows users to use social identities (e.g. Google login)
  • B2C AD directories are distinct from standard Azure AD directories.

Check your knowledge

  1. Which of the following is NOT a benefit of single sign-on?
  • Increased complexity assigning permissions to users
  • Fewer IDs and passwords for users to remember
  • Lower administration effort when users change roles or leave an organization
  • Ensures a consistent password policy across applications

Infrastructure Protection

Role Based Access Control

  • Role: collection of access permissions

  • Roles can be granted at individual service level or at a higher scope (e.g. an entire subscription)

  • Roles assigned at a higher scope are inherited by child scopes

  • Management Groups allow grouping subscriptions together and apply policy at an even higher level.

💡 Users, groups and roles are all stored in Azure AD.

Privileged Identity Management (PIM)

  • Additional paid offering
  • Manage, control and monitor access to resources in Azure, AD, and other Microsoft services (Office 365)

Key features:

  • Providing just-in-time privileged access to Azure AD and Azure resources
  • Assigning time-bound access to resources by using start and end dates
  • Requiring approval to activate privileged roles
  • Enforcing Azure Multi-Factor Authentication (MFA) to activate any role
  • Using justification to understand why users activate
  • Getting notifications when privileged roles are activated
  • Conducting access reviews to ensure that users still need roles
  • Downloading an audit history for an internal or external audit

To use PIM, you need one of the following paid or trial licenses:

  • Azure AD Premium P2
  • Enterprise Mobility + Security (EMS) E5

Service principals

  • Identity: something that can be authenticated (e.g. a user account, or an application, or a server)
  • Principal: An identity that assumes a role. E.g. using sudo changes the role of your identity.
  • Service Principal: A service that uses an identity and that identity can assume certain roles.

Managed identities for Azure resources

  • Creation of service principals can be tedious
  • Maintenance of SPs is difficult
  • Managed identities can be instantly created for a supported Azure service
  • Managed identity = an account in Azure AD.
  • Azure will take care of authenticating the service and managing the AD account.
  • You can manage access for the AD account to other resources.

💡 Not all Azure services support managed identities.

Check your knowledge

  1. Azure role-based access control can be applied to all but which of the following scopes?
  • Subscription
  • Resource group
  • Files and folders within a Linux filesystem
  • Resource
  1. True or false: a managed identity for Azure resources could be assigned to a virtual machine to give it rights to start and stop other virtual machines.
  • True
  • False

Encryption

Types of encryption

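  • Symmetric: the same key is used to encrypt and decrypt data.
  • Asymmetric: uses 2 keys (a key pair).
    • Either the public or the private key can encrypt
    • Data encrypted with one key of the pair can only be decrypted with the other (e.g. data encrypted with the public key can only be decrypted with the private key)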
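
The key-pair idea can be seen in a toy RSA example with tiny textbook primes. This is purely illustrative: the numbers are far too small to be secure, there is no padding, and real systems should use a vetted cryptography library rather than anything like this.

```python
# Toy RSA with tiny primes. Illustration only; never use numbers this small
# or hand-rolled crypto for real data.
p, q = 61, 53
n = p * q                    # modulus shared by both keys
phi = (p - 1) * (q - 1)
e = 17                       # public exponent (public key is (e, n))
d = pow(e, -1, phi)          # private exponent via modular inverse (Python 3.8+)


def apply_key(value, exponent):
    """Apply one half of the key pair: value^exponent mod n."""
    return pow(value, exponent, n)


message = 65
ciphertext = apply_key(message, e)     # encrypt with the public key
recovered = apply_key(ciphertext, d)   # decrypt with the private key

# The reverse direction also round-trips, which is the basis of signatures.
signed = apply_key(message, d)
verified = apply_key(signed, e)
```

Whichever key of the pair transforms the data, only the other key reverses it; a symmetric scheme would instead use one shared key for both steps.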

Approaches of encryption

  • At rest
  • In transit

Encryption at rest

  • Refers to encrypting data at rest.
  • At rest = data stored in a DB, file storage, storage account etc.
  • This ensures that if an attacker obtains a hard drive (or VHD for e.g.) the data cannot be decrypted without the keys.

Encryption in transit

  • Refers to encrypting data in transit
  • e.g. sending data over a network

💡 Identifying and classifying data (e.g. as Restricted, Moderate, Public) helps the decision-making process. This then allows you to decide what level of encryption to apply.

Azure SSE

💡 Encryption for data in the physical disks.

  • Storage Service Encryption
  • For data at rest.
  • All Azure Storage services i.e. Azure Managed Disks, Blob Storage, Files, Queues, Tables
  • All performance tiers (Standard and Premium)
  • Both deployment models (Resource Manager and Classic)
  • AES 256-bit
  • Enabled by default, no additional code or features needed.
  • By default, Microsoft manages the encryption key - but you can provide your own key if needed.

Azure Disk Encryption (ADE)

💡 SSE won't help if someone gets access to the Azure subscription and, therefore, the VHDs attached to your VM.

  • Uses the BitLocker feature of Windows and the dm-crypt feature of Linux
  • Encryption keys can be managed in Azure Key Vault
  • Azure Security Center will alert you if you have unencrypted VMs.

Transparent Data Encryption (TDE)

💡 For databases

  • Enabled by default for all newly deployed Azure SQL databases
  • Uses a unique encryption key per logical SQL Server
  • By default, keys are Azure-managed. You can also bring your own keys and store them in Azure Key Vault.

"SQL Server Always Encrypted" feature is designed for personal information or financial data.

  • Install a client driver
  • Driver performs encryption and decryption
  • Rewrites T-SQL queries to encrypt data passed to the DB
  • DB will always work with encrypted data.

Azure Key Vault

  • Cloud based secret storage
  • Each "vault" is backed by a Hardware Security Module (HSM)
  • Vaults can handle requesting and renewing TLS certificates.
  • Azure AD identities can be given access to key vault secrets
    • i.e. applications can acquire the secrets they need without needing to hardcode them

Azure Backup

  • Azure Backup uses AES256
  • Encryption key is generated from passphrase configured by administrator

Check your knowledge

  1. True or false: only Windows virtual machines can use Azure Disk Encryption
  • True
  • False
  1. When classifying data, which of the following is a factor?
  • Level of risk posed to customers if exposed
  • Method of data transport
  • Whether the data is stored on virtual machines or in a database
  • The amount of data stored

Network Security

Internet Protection

💡 This is for the outermost, i.e. Internet-facing, layer.

  • Use Azure Security Center to identify internet-facing resources that are at risk.

    • E.g. resources without NSGs
    • resources that are not behind firewall
  • Application Gateway: a layer 7 load balancer + WAF (Web Application Firewall)

    • Has rules from OWASP 3.0 or 2.2.9 rule sets.
    • Protects against XSS / SQL Injection
  • Network Virtual Appliance (NVA)

    • Increased complexity of configuration, however additional customizability.
  • Azure DDoS Protection

    • Azure Monitor metrics will notify you within a few minutes of attack detection.

Virtual Network Security

💡 This is for the inner layer within a VNet.

  • NSGs operate at layer 3 and 4.
  • E.g. one NSG for each environment, or one NSG per tier.
  • VNet Service Endpoints: allow you to isolate Azure services so that they accept communication only from your VNets.

Network integration (ExpressRoute)

  • Connecting Azure VNet to on-prem network

VNet Peering

  • By default VNets are isolated; peering allows them to be connected.
  • Peering is not transitive; VNetA -peer- VNetB -peer- VNetC does not imply VNetA -peer- VNetC.
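
Non-transitivity can be sketched by treating each peering as a direct, unordered pair and checking only for that pair, never for a path. The VNet names follow the example above; the data structure is a hypothetical stand-in, not an Azure networking API.

```python
# Peerings from the example above, stored as unordered pairs.
PEERINGS = {
    frozenset({"VNetA", "VNetB"}),
    frozenset({"VNetB", "VNetC"}),
}


def can_communicate(a, b):
    """True only when a direct peering exists; there is no multi-hop routing via peers."""
    return frozenset({a, b}) in PEERINGS
```

With these peerings, `VNetA` and `VNetC` stay isolated until a third peering between them is added explicitly.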

Check your knowledge

  1. Azure network security groups can be used to secure communication between which of the following?
  • Communication between Azure virtual machines and the internet
  • Communication between Azure virtual machines within a VNet
  • Communication between Azure virtual machines and systems in an on-premises network
  • All of the above
  1. Which of the following is not a method for protecting internet facing services from network attacks?
  • Azure DDoS
  • Azure Application Gateway WAF
  • Azure Disk Encryption
  • A network virtual appliance

Azure Solutions Architect

Module: Pillars of a great Azure architecture

My personal notes from going through https://docs.microsoft.com/en-us/learn/modules/pillars-of-a-great-azure-architecture

  • Security: protect data, build security in from design phase onwards etc
  • Performance and scalability
  • Availability and recoverability
  • Efficiency and operations

Check your knowledge:

  1. Which of the following would be an example of something you might address in the security pillar?
  • Defining a policy for virtual machine backup
  • Enabling multi-factor authentication for all administrative accounts
  • Evaluating your cloud spend to identify areas of cost savings
  • Moving to an autoscaling service to dynamically handle fluctuations in load
  1. Which of the following would be an example of something you might address in the availability & recoverability pillar?
  • Defining a policy for virtual machine backup
  • Enabling multi-factor authentication for all administrative accounts
  • Evaluating your cloud spend to identify areas of cost savings
  • Moving to an autoscaling service to dynamically handle fluctuations in load

Design for Security

  • Aim: Create secure architecture that protects data, infrastructure that it sits on, and identities we use to access it
  • Ensure any compliance requirements for data are met (GDPR, PII, PCI, HIPAA)

Defense in depth: A layered approach to security. It creates depth of protection, so even if one layer fails, the next layer will stop or at least slow down an attacker.

Layers:

  • Data (innermost layer)
  • Applications
  • VM/compute
  • Networking
  • Perimeter
  • Policies & access
  • Physical security (outermost layer)

Notes

  • Cost is a strong consideration as each layer has different tools/techniques for security
  • There is no single solution to protect all layers.
  • Security is not just about technology; it's also about people and processes.

| Example Attack or Vulnerability | Attacks this layer |
| --- | --- |
| An encryption key used to encrypt a DB uses an obsolete encryption standard | Data |
| An encryption key used to encrypt a DB was exposed | Data |
| SQL injection | Applications |
| XSS | Applications |
| A virus is installed into a system | VM |
| A firewall rule has all ports open | Networking |
| DoS attacks | Perimeter |
| A team has given full admin privileges to all members, and one member's key was exposed. | Policies and access |
| A security badge was stolen | Physical security |

Check your knowledge

  1. Which of the following types of data may need to have security protections?
  • Customer data that contains personal information
  • Financial data supporting business operations
  • Intellectual property
  • All of the above may need security protections
  1. Which of the following is an example of an attack you might see at the policies & access layer?
  • Exposed credentials posted online
  • A SYN flood attack
  • Following an employee into a datacenter without presenting credentials
  • Ransomware that encrypts the disks of a virtual machine

Design for performance and scalability

Scale up vs Scale Out

| Scale Up | Scale Out |
| --- | --- |
| Increase the capability of a single instance | Add more instances of similar capabilities |
| a.k.a. vertical scaling | a.k.a. horizontal scaling |
| e.g. Change to a higher-spec'ed SKU | e.g. add an Auto Scaling Group |
| 💡 VMs cannot scale up infinitely (limited by the physical host) | 💡 Scale-out requires some sort of load-distribution mechanism, e.g. a load balancer. |

💡 Cost Optimization is required in both scale-up and scale-out situations (i.e. reduce resources when not needed).

Performance Optimization

  • Data Partitioning splits data into separate partitions that can be managed separately.
  • Caching ensures frequently accessed data is retrieved faster.
    • Caching between an application and DB can reduce the load on the DB.
    • Caching of static content on a website in a location close to the user, can bring down page load times.
  • Autoscaling automatically scales up as demand increases, and scales back down as demand drops.
    • This manages cost without compromising on performance.
    • This also removes the need for an operator to make such decisions manually (another cost optimization)
  • Background jobs
    • Create a Non-blocking model
    • Long running workflows can be created as background jobs
    • Decouple any resource-intensive tasks from UI, this can make UI more responsive.
  • Messaging
    • Add a message queue between services.
    • This means Service1 need not call Service2 directly. Instead, Service1 publishes to an MQ and Service2 subscribes to the MQ.
    • The pub-sub model makes the architecture asynchronous.
    • Service2 will not get overloaded by direct calls from Service1, instead it can consume messages as per its processing capacity.
  • Scale Units
    • If a web server is scaled out, app and DB servers may have to be scaled out as well.
    • Identify a scale unit, e.g. X web servers, Y app servers and Z DB servers.
    • Scale out as a unit. N * (X*web + Y*app + Z*db)
  • Performance Monitoring (self explanatory)
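
The scale-unit formula N * (X*web + Y*app + Z*db) from the list above can be sketched as a small helper. The 4:2:1 ratio is a hypothetical example, not a recommendation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScaleUnit:
    """One scale unit: how many servers of each tier are deployed together."""
    web: int
    app: int
    db: int

    def scaled(self, n):
        """Return total servers per tier when n whole units are deployed."""
        return {"web": n * self.web, "app": n * self.app, "db": n * self.db}


unit = ScaleUnit(web=4, app=2, db=1)  # hypothetical ratio
totals = unit.scaled(3)
```

Scaling by whole units keeps the tiers in proportion, so the web tier never outgrows the app and DB tiers behind it.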

Check your knowledge

  1. Which of the following is an example of scaling up?
  • Updating your application to use a queuing service
  • Adding more web servers into a web farm
  • Adding another virtual machine into a database cluster
  • Updating a virtual machine to a larger size
  1. Which of the following is an example of scaling out?
  • Updating a virtual machine to a larger size
  • Adding more storage to a virtual machine
  • Adding more web servers into a web farm
  • Replicating backups to another region

Design for availability and recoverability

  • a.k.a HA and DR, BC and DR

Design for availability

  • Availability: maintain uptime
  • Eliminate Single Points of Failure
  • Minimize impact of infrastructure maintenance

The role of SLA: ask yourself:

  • What SLA are you committing to?
  • Which areas of your application need improvement to meet this SLA? The goal is to add redundancy where required to ensure availability is not impacted (or to reduce likelihood of outage).

Examples

  1. Clustering, load-balancing
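
As a rough illustration of why redundancy helps, the sketch below estimates the availability of N redundant instances behind a load balancer. It assumes instance failures are independent and that one healthy instance is enough to serve traffic; both are simplifying assumptions, and the 99% figure is hypothetical.

```python
import math


def redundant_availability(instance_availability, instances):
    """Probability that at least one of N independent instances is up."""
    all_down = (1.0 - instance_availability) ** instances
    return 1.0 - all_down


single = redundant_availability(0.99, 1)
paired = redundant_availability(0.99, 2)
```

Under these assumptions a second instance takes a 99% service to roughly 99.99%; real figures depend on shared dependencies, which break the independence assumption.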

Design for recoverability

  • Recoverability: the ability to recover from data loss or other disasters. Aim to reduce MTTR (mean time to recovery).
  • May require manual intervention; however, automation can help.

Do an analysis of recovery strategies given various downtime scenarios, and understand the tradeoffs of each recovery strategy.

Define the following goals

  • Recovery Point Objective (RPO): The max duration of acceptable data loss. E.g. 30 minutes of data loss
  • Recovery Time Objective (RTO): The max acceptable downtime. E.g. 8 hours of downtime.

With the goals defined you can architect your application to meet those goals.
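
One way to turn the goals into a design check is sketched below, using the example figures above (30 minutes RPO, 8 hours RTO). The function names and the backup and restore durations are hypothetical, and the RPO check assumes the worst case where a failure happens just before the next backup.

```python
from datetime import timedelta


def meets_rpo(backup_interval, rpo):
    """Worst case, a failure just before the next backup loses one full interval."""
    return backup_interval <= rpo


def meets_rto(estimated_restore_time, rto):
    """The time needed to restore service must fit within the allowed downtime."""
    return estimated_restore_time <= rto


rpo = timedelta(minutes=30)
rto = timedelta(hours=8)

hourly_backups_ok = meets_rpo(timedelta(hours=1), rpo)
quarter_hour_backups_ok = meets_rpo(timedelta(minutes=15), rpo)
restore_ok = meets_rto(timedelta(hours=2), rto)
```

Here hourly backups would miss the 30-minute RPO, so the backup frequency is the first thing to change, which matches the knowledge-check question on RPO below.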

Check your knowledge

  1. Suppose you would like to increase the availability of your system to provide a better service-level agreement (SLA) to your customers. Which of the following is a guiding principle you can use?
  • Reduce your target for maximum duration of acceptable data loss.
  • Encrypt all data at rest
  • Eliminate single point of failure
  1. Which of the following would be impacted by your defined Recovery Point Objective (RPO)?
  • The frequency of database backups
  • The number of regions that data is replicated to
  • The number of instances in a database cluster
  • The type of load-balancing technology used in your application

Design for efficiency and operations

Aim: Identify waste. Examples:

  • VMs that are underutilized
  • Using SSDs for storage where HDD ought to be enough (no stringent disk IO considerations)

Waste can be identified through rigorous monitoring at every layer.

Efficiency Best Practices

  • Right-size VMs
  • Deallocate VMs that are not used
  • Consider moving from IaaS to PaaS
    • Typically PaaS means you don't need to worry about patching/maintenance
    • Typically PaaS may cost less than IaaS
    • Not everything can be moved to PaaS

Operations Best Practices

  • automate as much as possible
  • rigorous monitoring throughout the application architecture
  • Design with CI and DevOps in mind

Check your knowledge

  1. Suppose you have recently moved your application to the cloud and your monthly bill seems higher than expected. The utilization level of your VM is high enough that you are hesitant to downsize. What might be a reasonable next step you could take to help you find inefficiencies?
  • Wait a month and recheck your bill
  • Increase the amount of application testing you do before each release
  • Add monitoring and instrumentation
  1. Which of the following is an example of improving operational efficiency?
  • Automating the deployment of infrastructure
  • Moving services from PaaS to IaaS
  • Documenting manual steps for the deployment of infrastructure
  • Maintaining multiple logging systems