
How DevOps Works at Groups360

A practical guide to infrastructure, deployments, and operations for engineers joining the team or looking to understand the full picture.


Table of Contents

  • Repository Map
  • How Deployments Work
  • How Terraform is Managed
  • How Kubernetes is Managed
  • How Database Migrations Work
  • Environment Configuration
  • Secrets Management
  • Monitoring and Observability
  • Quick Reference

Repository Map

Infrastructure Repositories

| Repository | Purpose |
| --- | --- |
| devops_aws_terraform | Terraform for all AWS resources — EKS clusters, IAM, S3, ECR, Glue, Redshift, AppConfig, and more. 22 modules covering the full AWS footprint. |
| devops_github_terraform | Terraform for GitHub organization management — repository settings, branch protection rules, team membership, secrets, and environment configuration. |
| devops_terraform_initiators | GitHub Actions workflows that let you trigger Terraform plan/apply operations from the Actions UI without needing local AWS credentials. |

CI/CD Repositories

| Repository | Purpose |
| --- | --- |
| devops_pipelines | The engine. All reusable GitHub Actions workflows, build scripts, deployment scripts, Helm templates, Docker configurations, and service config files. This is where CI/CD logic lives. |
| devops_pipeline_initiators_eng | The steering wheel for engineering. User-friendly GitHub Actions workflows with dropdown menus for deploying services and web apps to lower environments. |
| devops_pipeline_initiators_prod | Same concept, but for production deployments with additional approval gates. |
| devops_pipeline_initiators_devops | DevOps-specific deployment workflows for infrastructure tools and maintenance operations. |
| devops_qa_pipelines | QA-specific CI/CD pipelines for test automation and validation workflows. |

Configuration and GitOps Repositories

| Repository | Purpose |
| --- | --- |
| env_files_all | Environment-specific configuration files for every service across every environment. The traditional Helm deployment path reads values from here. |
| appconfig_toolkit | Python CLI for managing AWS AppConfig — syncing application settings, detecting drift, rendering templates. Covers 47 services across 4 tech stacks. |
| g360_argocd | ArgoCD bootstrap configuration — cluster setup, platform apps (cert-manager, Karpenter, ALB controller, GitHub ARC runners), RBAC policies. |
| g360_env_configs | GitOps config repository. ArgoCD watches this repo; CI/CD pipelines update image tags and metadata here, triggering automatic syncs. |
| g360_helm_charts | Shared Helm library charts and service chart definitions used across deployments. |

Operations and Tooling

| Repository | Purpose |
| --- | --- |
| devops_helper_scripts | Utility scripts — workflow usage analysis, GitHub org backups, Kafka schema management, build validation tools. |
| devops_sysadmin_scripts | System administration tools and operational runbooks. |

How Deployments Work

The Marketplace Model

Deployments are triggered through initiator repositories — GitHub Actions workflows with guided dropdown menus that abstract away the underlying complexity. Engineers don't need to know Helm, Terraform, or Kubernetes to deploy.

For engineering teams: Go to devops_pipeline_initiators_eng, click Actions, and select the appropriate workflow.

Available Marketplace Workflows

| Workflow | What it Deploys | Services |
| --- | --- | --- |
| ENG \| SVC | Backend microservices | 26 services (approval, booking, payment, identity, search, etc.) |
| ENG \| WEBAPP \| MP | Marketplace web apps | 13+ apps (dashboard, search, sourcing, inventory, etc.) |
| ENG \| WEBAPP \| HILTON | Hilton private-label web apps | Brand-specific frontend builds |
| ENG \| WEBAPP \| IHG | IHG private-label web apps | Brand-specific frontend builds |
| ENG \| WEBAPP \| WYNDHAM | Wyndham private-label web apps | Brand-specific frontend builds |
| ENG \| INFRA Tools | Infrastructure utilities | eks_deploy_info |
| ENG \| ETL | ETL services | leonardo_service |
| ENG \| SVC Auto-Migration | Services with forced DB migrations | Multi-language with validation |

Deployment Input Flow

When you trigger a deployment, you provide:

  1. Repository/service — dropdown selection (no free-text, prevents typos)
  2. DB modifications — optional; whether to run database migrations
  3. Branch or SHA — toggle between deploying a branch tip or a specific commit
  4. Branch name / SHA value — the actual ref to deploy
  5. Environment — target environment (dev52, qa63, uat71, etc.)

The initiator validates your inputs (SHA format, branch existence via GitHub API), then calls a reusable workflow in devops_pipelines.
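A minimal sketch of that validation step, with illustrative function names (the real checks live in the initiator workflows and call the GitHub API):

```shell
# Illustrative only: decide whether a ref is a pinned commit SHA.
# A full SHA is exactly 40 lowercase hex characters.
is_full_sha() {
  case "$1" in
    *[!0-9a-f]*) return 1 ;;
  esac
  [ "${#1}" -eq 40 ]
}

# Branch existence would be verified against the GitHub API, e.g.:
#   gh api "repos/$ORG/$REPO/branches/$BRANCH" >/dev/null
# (requires the GitHub CLI and credentials; not run here)

if is_full_sha "9f2c1ab3de4567890f2c1ab3de4567890f2c1ab3"; then
  echo "deploying pinned commit"
fi
```

Failing fast on a malformed SHA or missing branch keeps bad refs from ever reaching the reusable workflow.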

The CI/CD Pipeline

Once triggered, the pipeline runs through these phases:

Initiator Workflow (devops_pipeline_initiators_eng)
    │
    ▼
Base Workflow (devops_pipelines/cicd-base-svc-v1.yml)
    │
    ├─► CI Phase: Build → Test → Lint → Containerize → Push to ECR
    │
    ├─► DB Phase (optional): Run database migrations
    │
    └─► CD Phase: Deploy to target environment
            │
            ├─► GitOps path (uat71): Update g360_env_configs → ArgoCD syncs
            │
            └─► Helm path (all others): Helm upgrade via EKS admin container

Two Deployment Paths

1. Traditional Helm Deployment (all environments except uat71)

The pipeline renders the shared Helm templates in devops_pipelines with per-environment values from env_files_all, then runs helm upgrade against the target cluster from inside an EKS admin container.

2. GitOps / ArgoCD Deployment (uat71, expanding to more environments)

The pipeline commits the new image tag and deployment metadata to g360_env_configs; ArgoCD detects the change and syncs the service automatically.

Service Configuration

Every service has a config file at devops_pipelines/repo_artifacts/vars/cicd-per-repo/{service}.txt. This defines:

  • codename — Kubernetes deployment name
  • codetype — Technology stack (ruby, python, dotnet, node)
  • codeversion_* — Runtime versions per environment tier
  • Container registry URLs (lower vs prod accounts)
  • Dockerfile and Helm template references
  • Health check paths, database names, AppConfig settings

The script get-env-vars-repo-svc-v1.sh parses these files and exports all values as environment variables for the pipeline.
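In simplified form, the parsing pattern looks like this (the real get-env-vars-repo-svc-v1.sh handles more cases; the service name and keys below are illustrative):

```shell
# Simplified sketch: read a flat key=value config file and export each
# pair as an environment variable for later pipeline steps.
cat > /tmp/example-service.txt <<'EOF'
codename=example-service
codetype=ruby
codeversion_lower=3.2.2
EOF

while IFS='=' read -r key value; do
  # Skip blank lines and comments.
  [ -z "$key" ] && continue
  case "$key" in \#*) continue ;; esac
  export "$key=$value"
done < /tmp/example-service.txt

echo "$codename runs on $codetype $codeversion_lower"
```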

Batch / Release Deployments

For deploying multiple services at once (e.g., cutting a release branch to CUAT environments), use the release-branch-* workflows in devops_pipeline_initiators_eng:

  • release-branch-backend-marketplace.yml — Deploy all 36 backend services or a selected subset
  • release-branch-frontend-marketplace.yml — Deploy all frontend apps
  • Brand-specific variants for Hilton, IHG, Wyndham

These generate deployment matrices dynamically and skip services where the target branch doesn't exist.
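Conceptually, the matrix generation works like this (simplified sketch; the real workflows query the GitHub API for branch existence, simulated here with a lookup table):

```shell
# Stand-in for a GitHub API call; pretend only these services
# have the release branch we want to deploy.
branch_exists() {
  case "$1" in approval|booking|payment) return 0 ;; *) return 1 ;; esac
}

# Build a JSON deployment matrix, skipping services whose
# target branch does not exist.
matrix=""
for svc in approval booking payment legacy-service; do
  if branch_exists "$svc"; then
    matrix="${matrix:+$matrix,}\"$svc\""
  fi
done
echo "{\"service\":[$matrix]}"
```

The resulting JSON feeds a GitHub Actions matrix strategy, so each included service gets its own deployment job.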


How Terraform is Managed

Repository Structure

All AWS infrastructure lives in devops_aws_terraform, organized by service:

| Module | What it Manages |
| --- | --- |
| eks-cluster/aws-eks-cluster | EKS clusters, node groups, OIDC federation, Helm addons, KMS encryption |
| eks-cluster/aws-eks-networking | VPC, subnets, Transit Gateway, VPC peering, security groups |
| eks-cluster/aws-eks-spc | Secret Provider Class for AWS Secrets Manager → K8s secrets integration |
| aws-iam-service | IAM users, groups, roles, Lambda-based credential distribution |
| aws-container-registry | ECR repositories with lifecycle policies and cross-account pull |
| aws-s3-artifacts | S3 buckets across environments |
| aws-glue-etl | Glue jobs, connections, catalogs, DynamoDB tables |
| aws-glue-etl-jobs | Modular Glue job definitions |
| aws-appconfig | AppConfig deployment strategies, IAM roles |
| aws-ec2-gh-runners-artifacts | Self-hosted GitHub Actions runners (always-on and on-demand) |
| aws-redshift | Redshift cluster and schema management |
| aws-break-glass | Emergency break-glass access and backup infrastructure |

GitHub organization management lives separately in devops_github_terraform:

| Module | What it Manages |
| --- | --- |
| repos | Repository settings, branch protection, collaborators, environments, secrets |
| teams | Team membership — Engineering, Marketplace, Core Services, QA, DevOps, Security, Architects |

State Management

All Terraform state is stored in S3 with DynamoDB locking:

Bucket: g360-tfstate-{environment}
Key:    {project}/{environment}/{cluster}/terraform.tfstate
Lock:   DynamoDB table g360-tfstate-{environment}

| Environment | AWS Account | Region | S3 Bucket |
| --- | --- | --- | --- |
| lower | 599778853101 | us-east-2 | g360-tfstate-lower |
| prod | 638757669574 | us-east-1 | g360-tfstate-prod |
| infra | (shared) | various | g360-tfstate-infra |
| breakglass | 663568... | us-west-2 | g360-tfstate-breakglass |

Cross-module references use terraform_remote_state data sources to read outputs from related modules (e.g., EKS cluster reads network state for VPC/subnet IDs).
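Sketched in HCL, with the state key assumed from the pattern above (the actual bucket/key layout may differ per module):

```hcl
# Illustrative only: the EKS cluster module reading outputs from the
# networking module's remote state.
data "terraform_remote_state" "network" {
  backend = "s3"
  config = {
    bucket = "g360-tfstate-lower"
    key    = "aws-eks-networking/lower/eks-cluster-02/terraform.tfstate"
    region = "us-east-2"
  }
}

# Downstream usage, e.g. inside the cluster module:
#   vpc_id = data.terraform_remote_state.network.outputs.vpc_id
```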

Running Terraform

Every module has a Makefile with standardized targets:

make init INFRA_ENV=lower CLUSTER_NAME=eks-cluster-02      # Initialize backend
make plan INFRA_ENV=lower ACCOUNT_ID=599778853101           # Generate plan
make apply INFRA_ENV=lower                                   # Apply changes
make destroy INFRA_ENV=lower                                 # Destroy resources
make format                                                  # Format HCL files
make validate                                                # Syntax check

The Makefile sets AWS_PROFILE={INFRA_ENV}-tf automatically and selects the correct .tfvars file.

CI/CD for Terraform

devops_aws_terraform has 24 GitHub Actions workflows in .github/workflows/, one per module. Each follows the plan → approve → apply pattern:

  1. Plan job generates a terraform.plan artifact
  2. Apply job downloads the artifact and applies (requires environment-based approval)
  3. Self-hosted runners: g360-infra for lower/infra, ci-cd-prod for production

You can trigger these from the devops_terraform_initiators repo or directly from the workflow files.

Terraform version is pinned to 1.10.4 across all workflows.
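The pattern, sketched as a minimal workflow fragment (job names, runner labels, and artifact handling are illustrative, not copied from the real workflows):

```yaml
# Illustrative sketch of the plan → approve → apply pattern
jobs:
  plan:
    runs-on: [self-hosted, g360-infra]
    steps:
      - uses: actions/checkout@v4
      - run: make plan INFRA_ENV=lower
      - uses: actions/upload-artifact@v4
        with: { name: terraform.plan, path: terraform.plan }
  apply:
    needs: plan
    environment: lower   # environment-based approval gate lives here
    runs-on: [self-hosted, g360-infra]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/download-artifact@v4
        with: { name: terraform.plan }
      - run: make apply INFRA_ENV=lower
```

Applying the saved plan artifact (rather than re-planning) guarantees that what reviewers approved is exactly what gets applied.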


How Kubernetes is Managed

Cluster Architecture

| Cluster | Environment | Region | Purpose |
| --- | --- | --- | --- |
| eks-cluster-02 | lower | us-east-2 | All lower environments (dev, qa, uat, inter, etc.) |
| g360-infra-cluster | infra | us-east-1 | Shared infrastructure and tooling |
| g360-prod-core-01 | prod | us-east-1 | Production workloads |

All clusters use:

  • Private-only API endpoints (no public access)
  • Managed node groups with CloudWatch-based autoscaling
  • Karpenter for dynamic node provisioning
  • OIDC federation for IAM role assumption (no static credentials)
  • KMS encryption for secrets at rest
  • Core addons: CoreDNS, kube-proxy, VPC CNI, EBS CSI driver

The EKS infrastructure is provisioned through Terraform in this order:

  1. aws-eks-networking — VPC, subnets, Transit Gateway
  2. aws-eks-cluster — Cluster, node groups, IAM, Helm addons
  3. aws-eks-spc — Secret Provider Class

Helm Chart Templates

Service deployments use standardized Helm templates stored in devops_pipelines/repo_artifacts/helm/v3.16.4/template/.

The primary template (g360_template_LIVEPROB_file) generates:

  • Deployment — with Datadog APM integration, liveness probes via file check (/k8spod_liveprobe/liveprobe.txt), resource limits, node selectors
  • Service — ClusterIP or LoadBalancer
  • Ingress — Kubernetes ingress rules
  • HPA — Horizontal Pod Autoscaler
  • PDB — Pod Disruption Budget
  • ServiceAccount — for OIDC/IAM role binding
  • RBAC — Role and RoleBinding (per-service opt-in)
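For example, the file-check liveness probe renders to something like this (timing values are assumed, not taken from the template):

```yaml
# Illustrative rendering of the file-based liveness probe
livenessProbe:
  exec:
    command: ["cat", "/k8spod_liveprobe/liveprobe.txt"]
  initialDelaySeconds: 30
  periodSeconds: 15
```

Deleting the probe file from inside a pod is then a simple way to force Kubernetes to restart it.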

Environment-specific deployment YAMLs live in separate, per-environment directories.

ArgoCD (GitOps)

ArgoCD is bootstrapped and managed via g360_argocd. It follows the App of Apps pattern:

  • A root application manages ArgoCD itself (self-management)
  • Platform apps are declared in platform-apps/:
    • cert-manager
    • AWS Load Balancer Controller
    • Karpenter
    • GitHub Actions Runner Controller (ARC)

Service deployments via ArgoCD (currently uat71 only):

  • ArgoCD watches g360_env_configs for changes
  • ApplicationSet definitions in appsets/services/{service}.yaml
  • Per-environment values in configs/{service}/envs/{env}.yaml
  • Automated sync with prune and self-heal enabled
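A sketch of what one of these ApplicationSet files might contain (field values are illustrative; see the repo for the real definitions):

```yaml
# Illustrative ApplicationSet for a single service
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: payment
  namespace: argocd
spec:
  generators:
    - list:
        elements:
          - env: uat71
  template:
    metadata:
      name: "payment-{{env}}"
    spec:
      project: default
      source:
        repoURL: https://github.com/<org>/g360_env_configs
        targetRevision: main
        path: "configs/payment/envs"
      destination:
        server: https://kubernetes.default.svc
        namespace: "{{env}}"
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
```

The list generator makes adding a new ArgoCD-managed environment a one-line change.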

Container Registry

Three ECR registries, one per AWS account:

| Account | Region | Use |
| --- | --- | --- |
| 599778853101 | us-east-2 | Lower environment images (builds happen here) |
| 638757669574 | us-east-1 | Production images |
| 294892597080 | us-east-1 | Shared infrastructure images |

Docker images use a three-tier strategy:

  1. Base images — Foundation with runtime environments (repo_artifacts/dockerfiles/new-base-img/)
  2. Build images — CI/CD build tooling (repo_artifacts/dockerfiles/cicd/)
  3. Runtime images — Final application containers

How Database Migrations Work

Database migrations are integrated into the deployment pipeline and run as a separate phase before the application deployment.

Migration Scripts

Technology-specific migration scripts live in devops_pipelines/workflow_scripts/cicd/db/:

| Language | Auto-Migration Script | Manual Script |
| --- | --- | --- |
| .NET | dotnet/dotnet-db-auto-migration-v1.sh | dotnet/dotnet-db-v1.sh |
| Python | python/python-db-auto-migration-v1.sh | python/python-db-v1.sh |
| Ruby | ruby/ruby-db-auto-migration-v1.sh | ruby/ruby-db-legacy-v1.sh |

Triggering Migrations

Option 1: During deployment — When triggering a service deployment via the marketplace, select the "db-mods" dropdown option (e.g., service-YES-db-mods). This runs migrations before the application deployment.

Option 2: Auto-migration workflow — The SVC_auto-migration.yml workflow in devops_pipeline_initiators_eng is dedicated to running migrations. It supports all four tech stacks and always enables DB modifications.

Option 3: Standalone migration workflows — dedicated migration workflows in devops_pipelines can be called directly.

How Migrations Execute

  1. The pipeline detects the service's codetype (dotnet/python/ruby) from the service config
  2. Database credentials are fetched from AWS AppConfig via fetch-appconfig.sh
  3. The appropriate migration script runs inside a container with database connectivity
  4. For .NET: EF Core migrations; for Python: Alembic or custom scripts; for Ruby: ActiveRecord migrations
  5. Legacy vs non-legacy detection is automatic (the auto-migration script handles both)
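In outline, the per-stack dispatch works like this (the commands shown are the conventional migration tools for each stack; the real scripts add credential fetching and legacy detection):

```shell
# Simplified dispatch on the service's codetype. Each branch echoes the
# conventional migration command for its stack rather than running it.
migration_cmd() {
  case "$1" in
    dotnet) echo "dotnet ef database update" ;;
    python) echo "alembic upgrade head" ;;
    ruby)   echo "bundle exec rails db:migrate" ;;
    *)      echo "unsupported codetype: $1" >&2; return 1 ;;
  esac
}

migration_cmd ruby
```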

Environment Database Sync

The env_db_new_sync_all.yml workflow handles database synchronization across environments — useful for refreshing lower environments with production-like data.


Environment Configuration

Environment Landscape

Groups360 maintains 30+ environments spanning development, QA, UAT, integration, brand, and production tiers:

| Tier | Environments | Purpose |
| --- | --- | --- |
| Dev | dev51, dev52, dev53, dev54, dev55 | Active development and feature testing |
| QA | qa61, qa62, qa63, qa65 | QA validation |
| UAT | uat71, uat72, uat73, uat74, uat75 | User acceptance testing |
| Integration | inter31, inter34, inter35 | Integration testing |
| Platform | plat21, plat25 | Platform-level testing |
| Sandbox | sab11, sab15 | Sandbox / experimentation |
| Search | search41-search45 | Search service testing |
| Partner | part81-part85 | Partner integration testing |
| Demo | demo01 | Demonstrations |
| Brand CUAT | hiltoncuat01, ihgcuat01, wyndcuat01, hyattcuat01, micuat01 | Brand-specific customer UAT |
| Brand Dev/QA | hiltondev1-3, hiltonqa3 | Brand-specific development |
| Production | prod01 | Live production |

Configuration Layers

  1. Service config (cicd-per-repo/{service}.txt) — Runtime versions, registry URLs, health checks, AppConfig settings. Applies across all environments.

  2. Environment files (env_files_all) — Per-environment, per-service overrides. Organized as {env}/svcs/{tech_stack}/{service}/. The Helm deployment path reads these.

  3. AWS AppConfig (managed by appconfig_toolkit) — Application-level settings like database connection strings, feature flags, API keys. Supports multi-format configs (JSON, YAML, key-value) across 47 services and 4 tech stacks.

  4. GitOps config (g360_env_configs) — For ArgoCD-managed environments. Image tags and deployment metadata stored as YAML.

AppConfig Toolkit

The appconfig_toolkit is a Python CLI that manages AWS AppConfig configuration:

# Sync all service configurations to an environment
python -m appconfig_toolkit.cli sync-all --env dev55

# Sync a specific service
python -m appconfig_toolkit.cli sync-service --service payment-service-v2 --env dev55

# Detect configuration drift
python -m appconfig_toolkit.cli reconcile --env dev55

# Preview rendered configuration
python -m appconfig_toolkit.cli preview --service payment-service-v2 --env dev55

Templates are Jinja2-based, stored in templates/services/{service}.appsettings.json.j2, with environment variables sourced from env_files_all.
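A template might look roughly like this (the settings keys and variable names are illustrative, not taken from a real service):

```jinja
{# Illustrative templates/services/<service>.appsettings.json.j2 #}
{
  "ConnectionStrings": {
    "Default": "{{ DATABASE_URL }}"
  },
  "FeatureFlags": {
    "NewSearch": {{ ENABLE_NEW_SEARCH | default("false") }}
  }
}
```

Keeping the structure in the template and the values in env_files_all means one service definition can serve every environment.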

Production sync requires the --prod flag and uses a separate AWS account with cross-account IAM roles.


Secrets Management

AWS Secrets Manager

Secrets are stored in AWS Secrets Manager and injected into Kubernetes pods via the Secret Provider Class (managed by aws-eks-spc Terraform module).

Encrypted Backups

Nightly automated backups run via backup-secrets-nightly.yml:

  • Hybrid encryption: RSA (wraps key) + AES (encrypts secrets)
  • Storage: Encrypted to S3
  • Restore: restore_secrets.py
  • Key management: RSA keypairs generated via generate_keys.py

GitHub Secrets

CI/CD pipeline authentication uses GitHub repository and organization secrets:

  • ACTIONS_PIPELINE — PAT for cross-repo workflow calls
  • AWS credentials for ECR, EKS, and Secrets Manager access
  • Datadog API keys for observability

Monitoring and Observability

Datadog Integration

All services ship with built-in Datadog APM:

  • Helm templates inject DD_ENV, DD_SERVICE, DD_VERSION environment variables
  • Prometheus scraping enabled via pod annotations
  • Log injection via DD_LOGS_INJECTION for structured logging
  • APM tracing via Unix socket mount to the Datadog DaemonSet agent
  • CI/CD tracing via datadog-ci wrapping major pipeline steps
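In a rendered pod spec, the injection shows up roughly as follows (values and the socket volume name are illustrative):

```yaml
# Illustrative Datadog env injection in a rendered Deployment
env:
  - name: DD_ENV
    value: dev52
  - name: DD_SERVICE
    value: payment
  - name: DD_VERSION
    value: "1.4.2"
  - name: DD_LOGS_INJECTION
    value: "true"
volumeMounts:
  - name: apmsocketpath
    mountPath: /var/run/datadog
```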

Pipeline-level observability comes from the datadog-ci wrappers around major pipeline steps, which emit CI/CD traces for builds and deployments to Datadog.

GitHub Organization Backups

Nightly backups of the entire GitHub organization via backup-github-nightly.yml, using github_org_backup.py.


Quick Reference

Common Operations

| Task | Where to Go |
| --- | --- |
| Deploy a service to dev/qa/uat | devops_pipeline_initiators_eng → Actions → ENG \| SVC |
| Deploy a web app | devops_pipeline_initiators_eng → Actions → ENG \| WEBAPP \| MP |
| Deploy to production | devops_pipeline_initiators_prod → Actions |
| Run database migrations | devops_pipeline_initiators_eng → Actions → ENG \| SVC Auto-Migration |
| Plan/apply Terraform | devops_terraform_initiators → Actions |
| Manage GitHub repos/teams | devops_github_terraform |
| Add a new service to CI/CD | Create config in devops_pipelines/repo_artifacts/vars/cicd-per-repo/ |
| Update environment config | env_files_all |
| Manage AppConfig settings | appconfig_toolkit |
| View ArgoCD config | g360_argocd |
