Coolify Orchestrated PostgreSQL Cluster

In this project, our goal is to establish a robust and scalable infrastructure for a PostgreSQL database with high availability, seamless security, and integrated monitoring and alerting systems.

Introduction

We'll leverage tools like Patroni, Consul, Vault, Prometheus, Grafana, and Cert-Manager to ensure a comprehensive, modern solution. Coolify will act as our orchestration platform, managing various services and simplifying deployments. We aim to not only build a highly available database cluster but also provide a learning experience for interns that demonstrates best practices in DevOps, security, and observability.

The backbone of our infrastructure will focus on a distributed, high-availability PostgreSQL cluster. To ensure reliability, we’ll introduce Patroni for automating failover, Consul for service coordination, and Vault for managing sensitive information. Monitoring will be handled by Prometheus and visualized using Grafana, while Cert-Manager will manage SSL certificates, reinforcing our focus on security. We aim to introduce the concept of ephemeral design in this project, which emphasizes solutions that avoid persistent state issues—highlighting a preventative approach to infrastructure management.

The technology stack we'll introduce in this project includes:

  1. Coolify v4.0.0-beta.355: Our management platform for deploying, monitoring, and maintaining all services and applications. Coolify streamlines orchestration by offering an easy-to-use UI, while maintaining flexibility in deploying services. It’s perfect for teaching new developers how to manage a full-stack application from database to front-end.

  2. PostgreSQL with Patroni: The database at the heart of our project, PostgreSQL is renowned for its reliability and performance. Patroni is a framework that allows us to manage a highly available PostgreSQL cluster, automatically handling leader elections and failover when necessary.

  3. pgBackRest: An advanced backup and restore tool for PostgreSQL. We'll use it to ensure our data is safely stored and easily retrievable in case of disaster. By creating automated backup schedules with daily incremental backups, we ensure efficient storage and performance while minimizing the impact on the cluster.

  4. Consul: A distributed service mesh that allows for service discovery and coordination. In this project, Consul will be used to manage our PostgreSQL cluster, ensuring high availability by assisting with Patroni’s leader election process and coordinating between nodes.

  5. Vault: HashiCorp Vault is a secret management solution that allows for secure storage and controlled access to secrets. Vault will handle all credentials and sensitive information related to our PostgreSQL cluster, as well as secure tokens for OAuth2.

  6. Cert-Manager: This tool automates the issuance and management of SSL certificates. We will use it specifically for managing SSL certificates, allowing us to focus on keeping our infrastructure ephemeral and secure. SSL certificate management is a complex process, and Cert-Manager ensures that certificates are renewed automatically, reducing the risk of downtime due to expired certificates.

  7. Prometheus and Grafana: Prometheus will handle monitoring and metrics collection across all services, while Grafana will provide visualization of these metrics. By tracking PostgreSQL performance, disk usage, and system health, we ensure early detection of issues. Grafana Loki will assist with log aggregation, allowing us to monitor logs and troubleshoot efficiently.

  8. AlertManager: Part of the Prometheus ecosystem, AlertManager will help notify us when certain thresholds are reached, enabling proactive infrastructure management.

  9. OAuth2 and Drizzle ORM: For future applications, we’ll integrate OAuth2 for authentication and Drizzle ORM for database interaction. OAuth2 will manage authentication for applications connecting to the PostgreSQL cluster, while Drizzle ORM will serve as the interface for querying the database from TypeScript-based applications.

Understanding the Problem and Our Approach

In any database-driven system, ensuring uptime and security are the top priorities. However, traditional database management is often susceptible to issues such as single points of failure, inefficient backup processes, and poor secrets management. This is where our approach with a distributed, highly available PostgreSQL cluster shines.

What Is High Availability (HA)?

High Availability means ensuring that our system is resilient to hardware and software failures. In the case of PostgreSQL, this involves creating a cluster where multiple nodes (servers) work together to ensure that even if one node fails, others can continue to operate. Patroni and Consul work hand-in-hand to achieve this. Patroni acts as the orchestrator, monitoring the health of the database nodes and deciding when a failover (a change in the database leader) is necessary. Consul coordinates this by keeping track of the state of each node, ensuring consistent and accurate information on the cluster’s health.

By distributing the database across multiple nodes, we create an infrastructure that can handle node failures without downtime. This ensures ephemeral recovery—failures are quickly rectified through automated processes that prevent prolonged downtime, reducing the need for manual intervention and offering a "soft" security layer that emphasizes prevention of large-scale issues.

The Role of Consul and Patroni

Consul is a service mesh that coordinates between services, providing a consistent state of the system across multiple nodes. In our case, Consul will track the state of each PostgreSQL instance and assist Patroni in leader election—ensuring that at any given time, one node is the primary writer while others are replicas, ready to take over if needed.

Patroni is the heart of our high availability strategy for PostgreSQL. It allows us to define and maintain a cluster of PostgreSQL nodes, ensuring they are all synchronized and can seamlessly take over the leader role if the current leader fails. It manages PostgreSQL’s built-in streaming replication mechanism and automates failover, making sure the database is always available.

Backup Strategy

Backups are crucial to any database system. We are using pgBackRest because it allows us to efficiently manage backups and restores, especially in large, distributed environments like ours. Our strategy involves daily incremental backups to ensure that data is always safe without overwhelming our storage system. The design is to perform daily backups and manage them through Coolify’s UI.

Secrets Management with Vault

A core part of our strategy is secrets management. Storing sensitive information such as passwords, OAuth tokens, and API keys in plain text is a significant security risk. Vault mitigates this by storing secrets securely and only exposing them to services when absolutely necessary. For example, PostgreSQL credentials, OAuth tokens, and any other critical secrets will be dynamically managed through Vault, ensuring that no sensitive data is ever hard-coded or exposed.

Vault also provides an additional layer of ephemeral security. Instead of permanent credentials that are vulnerable if exposed, Vault generates dynamic, time-bound credentials that expire, minimizing the risk window. If a secret is compromised, it is rendered useless after its TTL expires, reinforcing the idea of prevention over remediation.
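
As an illustration of this model, Vault's database secrets engine can mint short-lived PostgreSQL credentials on demand. A minimal sketch, assuming a reachable PostgreSQL instance and placeholder names (exo-postgres, app-read) of our own choosing:

vault secrets enable database

vault write database/config/exo-postgres \
    plugin_name=postgresql-database-plugin \
    allowed_roles="app-read" \
    connection_url="postgresql://{{username}}:{{password}}@153.92.208.222:5432/postgres" \
    username="saExoDBA" \
    password="<vault-managed-password>"

vault write database/roles/app-read \
    db_name=exo-postgres \
    creation_statements="CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}'; GRANT SELECT ON ALL TABLES IN SCHEMA public TO \"{{name}}\";" \
    default_ttl="1h" \
    max_ttl="24h"

# Each read returns a fresh username/password pair that Vault revokes when the TTL expires.
vault read database/creds/app-read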

SSL Certificates with Cert-Manager

While Vault handles secrets, we rely on Cert-Manager specifically for managing SSL certificates. SSL certificates ensure that our communication channels are encrypted, protecting data in transit. Cert-Manager simplifies this process by automating the creation, distribution, and renewal of SSL certificates. For example, Let’s Encrypt will be our certificate authority, and Cert-Manager will handle renewals automatically, reducing the need for manual intervention.

SSL certificate management is an ephemeral process, where certificates are automatically replaced before they expire. By integrating Cert-Manager, we introduce a "soft" layer of security that prevents issues like expired certificates from becoming a concern.

Monitoring with Prometheus and Grafana

Monitoring is essential to ensuring the health of our system. Prometheus is a powerful metrics collection tool that will gather real-time data from all components, including PostgreSQL, Consul, and our nodes. This data will be visualized using Grafana, enabling us to create dashboards that display critical metrics like CPU usage, memory consumption, disk I/O, and query performance.

In addition to monitoring, Grafana Loki will be deployed to aggregate logs, allowing us to track and troubleshoot issues efficiently. AlertManager will notify us whenever certain thresholds are reached, ensuring we are always aware of any potential issues before they escalate.

This monitoring system is vital to the ephemeral nature of our infrastructure. By continuously collecting and analyzing data, we can proactively address issues before they cause downtime, avoiding unnecessary risks and maintaining uptime.

Prerequisites

The Prerequisites section focuses solely on what's necessary for setting up the proper environment for installing and configuring the PostgreSQL cluster and its supporting services:

Before proceeding to Step 1, ensure the following components are in place:

1. HashiCorp Vault Installation:

  1. Download the Vault Binary:

    • Use wget to download the latest version of Vault. Replace VERSION with the latest version number from the HashiCorp releases page.
    wget https://releases.hashicorp.com/vault/VERSION/vault_VERSION_linux_amd64.zip
  2. Unzip the Downloaded File:

    • Install the unzip package if it’s not already installed.
    sudo apt update
    sudo apt install unzip
    • Unzip the downloaded file.
    unzip vault_VERSION_linux_amd64.zip
  3. Move the Binary to a Directory in Your PATH:

    sudo mv vault /usr/local/bin/
  4. Verify the Installation:

    • Check that Vault is installed correctly by running:
    vault -version

2. Initialize and Unseal Vault

  1. Create a Configuration File:

    • Create a configuration file for Vault at /etc/vault.hcl.
    sudo nano /etc/vault.hcl
    • Add the following configuration:
    storage "file" {
      path = "/opt/vault/data"
    }
    
    listener "tcp" {
      address     = "0.0.0.0:8200"
      tls_disable = 1  # Set to 1 for development; use TLS in production
    }
    
    api_addr = "http://127.0.0.1:8200"
  2. Create the Data Directory:

    sudo mkdir -p /opt/vault/data
    sudo chown -R $(whoami):$(whoami) /opt/vault/data
  3. Start Vault Server:

    • Start Vault using the configuration file. Note that this command runs in the foreground; append & or run Vault under a process manager such as systemd to keep it running in the background.
    vault server -config=/etc/vault.hcl
  4. Set the Environment Variable for Vault:

    export VAULT_ADDR='http://127.0.0.1:8200'
  5. Initialize Vault:

    • Run the following command to initialize Vault. This command will output the unseal keys and the initial root token. Store these keys securely.
    vault operator init
  6. Unseal Vault:

    • Use the unseal keys from the initialization step to unseal Vault. You will need to enter at least three unseal keys (depending on the threshold set during initialization).
    vault operator unseal <unseal_key_1>
    vault operator unseal <unseal_key_2>
    vault operator unseal <unseal_key_3>
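    • After the threshold number of keys has been entered, a quick status check confirms the server is unsealed (look for Sealed: false in the output):
    vault status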
  7. Login to Vault using the root token generated during initialization. While you can paste the root token directly into the vault login command, it's generally recommended to avoid hardcoding or pasting sensitive tokens into scripts and commands; best practice is to use environment variables or another secure method to manage them. Here's how you can handle it:

    Best Practices for Logging into Vault

    1. Using an Environment Variable:
    • You can store the root token in an environment variable and then use it in the vault login command. For example:
    export VAULT_TOKEN=<root_token>
    vault login $VAULT_TOKEN
    2. Using a File:
    • Store the token in a secure file and use it when needed:
    echo "<root_token>" > ~/vault_token.txt
    vault login $(cat ~/vault_token.txt)
    3. Prompting for Token:
    • You can also have the script prompt you for the token, which avoids showing it in command history:
    read -s -p "Enter Vault Token: " VAULT_TOKEN
    vault login $VAULT_TOKEN

3. Configuring Vault's AppRole Solution for Authentication

AppRole is one of the authentication methods offered by HashiCorp Vault, designed for machines or applications to authenticate and obtain secrets from Vault securely. It is particularly useful in scenarios where human interaction is minimal or nonexistent (e.g., automation pipelines, services, or applications running in the background).

How AppRole Works

AppRole allows clients to authenticate by combining two key elements:

  • Role ID: A static identifier (like a username) assigned to the application or service.
  • Secret ID: A dynamic, time-bound identifier (like a password) that provides an extra layer of security. It can be generated for each authentication request and has time-to-live (TTL) and usage limits.

Together, these two elements allow an application or machine to securely authenticate with Vault, obtain a token, and then use that token to access secrets or perform other operations based on the associated policies.

AppRole Use Case

AppRole is ideal for:

  • Non-interactive authentication: For services, scripts, or automated processes needing access to Vault without human interaction.
  • Secure secret distribution: Used when there is a need to distribute dynamic credentials for short-lived access.
  • Granular access control: AppRole allows you to assign specific policies that limit what secrets the authenticated entity can access.

Key Concepts

  • Role ID: This is tied to a specific role in Vault and defines what policies or permissions an AppRole has. It’s static, meaning it doesn’t change and is known to the application that wants to authenticate.
  • Secret ID: This is dynamic and can be generated on demand. It can be time-bound (TTL) and restricted by usage count, adding a layer of security to the authentication process.
  • Policies: Vault policies define what permissions (read, write, delete, etc.) are granted once the AppRole successfully authenticates.

Steps to Set Up AppRole

  1. Enable AppRole Authentication Method:

    vault auth enable approle
  2. Create an AppRole with Specific Policies: Define a role that ties an application to specific policies and operations:

    vault write auth/approle/role/my-role \
        token_policies="my-policy" \
        secret_id_ttl="24h" \
        token_ttl="1h" \
        token_max_ttl="4h"
  3. Retrieve the Role ID: The role ID will be needed by the application:

    vault read auth/approle/role/my-role/role-id
  4. Generate a Secret ID: The secret ID is generated dynamically and can have a TTL or usage limit:

    vault write -f auth/approle/role/my-role/secret-id
  5. Authenticate with Role ID and Secret ID: The application will use both the role ID and the secret ID to authenticate and get a Vault token:

    vault write auth/approle/login \
        role_id="<your_role_id>" \
        secret_id="<your_secret_id>"
  6. Vault Issues a Token: Upon successful authentication, Vault returns a token that the application can use to access secrets:

    Key                  Value
    ---                  -----
    token                s.ZqWvxGadf9lf....
    token_accessor       Bdswvhsl....
    token_duration       1h
    token_renewable      true
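
Putting these steps together, a minimal shell sketch of the full flow; the secret path secret/myapp at the end is a hypothetical example:

# Fetch the static role ID and mint a fresh secret ID
ROLE_ID=$(vault read -field=role_id auth/approle/role/my-role/role-id)
SECRET_ID=$(vault write -f -field=secret_id auth/approle/role/my-role/secret-id)

# Exchange them for a token and use it for subsequent requests
VAULT_TOKEN=$(vault write -field=token auth/approle/login \
    role_id="$ROLE_ID" secret_id="$SECRET_ID")
export VAULT_TOKEN

vault kv get secret/myapp  # hypothetical secret path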

Best Practices for AppRole

  • Limit Secret ID TTL: Keep the secret ID short-lived to reduce the risk of exposure.
  • Restrict Secret ID Use: Limit the number of uses for a secret ID (e.g., a single-use secret ID).
  • Token Renewal: Ensure the token obtained by the AppRole is renewable so that long-running applications don’t need to re-authenticate as frequently.
  • Monitoring and Auditing: Enable auditing to track when roles, secrets, and tokens are used, ensuring no unauthorized access.

Why Use AppRole?

  1. Automated Systems: AppRole is perfect for non-human systems, such as microservices, scripts, or CI/CD pipelines, where there is no user to provide credentials interactively.
  2. Fine-Grained Control: It offers precise control over permissions and authentication.
  3. Security: By combining a static identifier (role ID) with a dynamic secret (secret ID), AppRole adds an extra layer of security, making it harder to misuse.

AppRole, in essence, makes Vault access secure and manageable for non-interactive systems, ensuring secrets are distributed safely and with fine-tuned control over what resources each authenticated entity can access.

Summary of Steps

  • Installed Vault and configured it to run.
  • Initialized Vault, received unseal keys, and unsealed it for use.
  • Reviewed secure ways to supply the root token (environment variables or interactive prompts) rather than pasting it directly.
  • Configured AppRole for internal authentication.

4. Environment Variables:

  • Set up the necessary environment variables in Coolify:
    • POSTGRES_USERNAME: Username for PostgreSQL.
    • POSTGRES_PASSWORD: Password for PostgreSQL.
    • VAULT_TOKEN: Token for accessing Vault.
    • VAULT_ADDR: URL of the Vault server.
    • OAUTH2_CLIENT_ID: Client ID for OAuth2.
    • OAUTH2_CLIENT_SECRET: Client Secret for OAuth2.

5. Consul Installation:

Patroni does not strictly depend on Consul. However, it does rely on a Distributed Configuration Store (DCS) to coordinate leader election and cluster state. Consul is just one of several supported options for the DCS in Patroni. Other options include etcd, ZooKeeper, and Kubernetes.

Consul, however, is the best option for us, considering the following factors in our environment:

1. Preference for HashiCorp Products:

Since we already have a preference for HashiCorp products, and we are using Vault for secrets management, Consul aligns naturally with this stack. Both Consul and Vault are designed to work well together, sharing similar security, scaling, and configuration mechanisms. HashiCorp's toolset is known for its powerful and reliable orchestration, which suits our needs.

2. Service Discovery:

Consul offers native service discovery capabilities, which can be useful not only for Patroni, but also for Traefik (our load balancer) and other services in the future. This means it centralizes service discovery, leader election, and cluster state management across our infrastructure, making it more efficient to maintain.

3. Scalability and Flexibility:

  • Consul can easily scale with our system, which may become important as our PostgreSQL cluster grows or if we start managing additional services.
  • It also offers flexibility, as Consul is capable of being a DNS provider, so it integrates well with dynamic environments, especially ones using containers (even though we're not using Kubernetes in this setup).

4. Resource Efficiency:

Since we're operating on a VPS with limited resources (8GB RAM), Consul is relatively lightweight compared to other options like Kubernetes, which would likely introduce unnecessary complexity and resource overhead.

5. Integrated Solution:

With Coolify managing our infrastructure, Consul provides a well-integrated solution for managing high availability and leader election in Patroni, without needing a separate orchestration system like Kubernetes. This streamlines the setup while providing robust DCS functionality.

Comparisons:

  • etcd: While popular with Patroni and Kubernetes users, etcd is more commonly deployed in Kubernetes environments and doesn't provide the same level of integration with Vault or service discovery capabilities that Consul offers.
  • ZooKeeper: This is an older option, but generally considered more complex to manage compared to Consul or etcd. It also doesn't provide the same service discovery capabilities.
  • Kubernetes: While powerful, Kubernetes introduces significant complexity and overhead for a relatively small setup like ours. It’s best suited for large-scale containerized environments, which isn't necessary for our project.

Conclusion:

Given our focus on using a cohesive, efficient, and scalable solution with minimal resource overhead, Consul is the best fit for managing the DCS in Patroni. It integrates well with the existing HashiCorp tools (Vault), provides essential service discovery features for Traefik, and fits our goal of maintaining a lightweight yet reliable infrastructure. Installing Consul is crucial, not only for PostgreSQL high availability but also because it tightly integrates with Vault, providing essential service discovery and coordination for secrets management.

Installing Consul in Coolify

We will configure Consul through Coolify's management interface to serve as our distributed coordination and service discovery tool. This will ensure that Consul can integrate with Vault and Patroni efficiently.

Step 1: Install Consul through Coolify

  1. Navigate to Coolify's dashboard.
  2. From the Services tab, select the Add a New Service option.
  3. Choose Consul from the available services.
  4. Configure the following options:
    • Name: vsExoConsul
    • Port: 8500 (default port)
    • Nodes: Set the number of nodes based on our cluster (at least 3 for production-grade reliability).

Step 2: Configure Consul

We need to configure Consul to act as the service discovery backend for both Patroni and Vault. Here is a sample Consul configuration:

{
  "datacenter": "dc1",
  "data_dir": "/var/lib/consul",
  "log_level": "INFO",
  "node_name": "consul-node-1",
  "server": true,
  "bootstrap_expect": 3,
  "ui": true,
  "client_addr": "0.0.0.0",
  "bind_addr": "153.92.208.222",
  "retry_join": ["153.92.208.222"],
  "encrypt": "your_consul_encryption_key",
  "performance": {
    "raft_multiplier": 1
  }
}
  • The encrypt key is automatically managed by Coolify if we choose encrypted gossip.
  • The retry_join IP is the address of our VPS.
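
If we manage the gossip key ourselves instead, Consul's CLI generates a suitable key; the output is a base64-encoded 32-byte value to paste into the encrypt field above:

consul keygen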

Step 3: Start Consul

Once the configuration is completed, we start the Consul service:

  1. Go to the Services section in Coolify.
  2. Start the vsExoConsul service.
  3. Ensure that the service is accessible by opening the Consul Web UI (http://153.92.208.222:8500) to check the health and status of the Consul cluster.

Step 4: Integrate Consul with Vault

Now that Consul is installed and running, we need to configure Vault to use Consul as its storage backend.

In the Vault configuration (which we installed earlier):

storage "consul" {
  address = "http://153.92.208.222:8500"
  path    = "vault/"
}

listener "tcp" {
  address     = "0.0.0.0:8200"
  tls_disable = 1
}

ui = true
  • This tells Vault to store all of its data in Consul, ensuring redundancy and coordination across nodes.

Step 5: Test Consul and Vault Integration

Once the Consul service is running and Vault is integrated:

  1. Verify Vault is correctly using Consul by running:
vault status
  2. Ensure that Consul shows Vault as a healthy node in the Consul Web UI.

6. pgBackRest for Backup and Restore

Now that we have Consul and Vault set up, the next step is configuring pgBackRest for efficient backup and recovery of our PostgreSQL database. Since pgBackRest is a powerful, open-source backup and restore solution designed for PostgreSQL, it will be critical for our high-availability setup.

Step 1: Install pgBackRest on PostgreSQL Nodes

We will install pgBackRest on each of the PostgreSQL nodes to manage our backups.

Step 1.1: Install pgBackRest

On each node of the PostgreSQL cluster, we will install pgBackRest:

sudo apt-get install -y pgbackrest

Step 1.2: Create Backup Storage Directory

On the main PostgreSQL node, we need to create a dedicated directory for storing backups. We'll use the recommended industry-standard path:

sudo mkdir -p /var/lib/pgbackrest
sudo chown postgres:postgres /var/lib/pgbackrest
sudo chmod 700 /var/lib/pgbackrest

This path is where pgBackRest will store all of our backup files.

Step 1.3: Configure pgBackRest

We need to configure pgBackRest on each PostgreSQL node, specifying the backup path, repository settings, and connection details.

Here’s a sample configuration file for pgBackRest (/etc/pgbackrest.conf):

[global]
repo1-path=/var/lib/pgbackrest
repo1-retention-full=2

[global:archive-push]
compress-level=3

[main]
pg1-path=/var/lib/postgresql/16/main

Explanation:

  • repo1-path: Specifies the path where backups will be stored.
  • repo1-retention-full=2: Retains the two most recent full backups (and the incrementals that depend on them).
  • pg1-path: Path to the PostgreSQL data directory.
  • We set compression (compress-level=3) for efficient storage usage.
  • [main] is our stanza, the named backup configuration that all later commands reference with --stanza=main. Because pgBackRest runs locally on each node, pg1-path is sufficient; a pg1-host entry is only needed when the repository lives on a separate host.

We can replicate this configuration unchanged to the other nodes, since each node backs up its own local data directory.

Step 1.4: Configure PostgreSQL for WAL Archiving

To ensure we have continuous backups of Write-Ahead Log (WAL) files, we need to configure PostgreSQL to archive WAL files using pgBackRest.

In /etc/postgresql/16/main/postgresql.conf, modify or add the following parameters:

archive_mode = on
archive_command = 'pgbackrest --stanza=main archive-push %p'
archive_timeout = 300

This configures PostgreSQL to use pgBackRest for WAL archiving. Note that changing archive_mode requires a PostgreSQL restart to take effect, and archive-push will fail until the stanza is created in Step 2.1 below.
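
Once the stanza exists and PostgreSQL has been restarted, we can confirm that WAL segments are being archived; pg_stat_archiver is a standard PostgreSQL statistics view:

sudo systemctl restart postgresql   # required for the archive_mode change to take effect
sudo -u postgres psql -c "SELECT archived_count, failed_count, last_archived_wal FROM pg_stat_archiver;"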

Step 2: Create and Verify Backup Stanza

Step 2.1: Create Backup Stanza

A stanza in pgBackRest is a named configuration section for managing backups of a PostgreSQL database. We will create our stanza, main, to manage our backups.

sudo -u postgres pgbackrest --stanza=main --log-level-console=info stanza-create

Step 2.2: Verify Stanza

After creating the stanza, verify that the configuration is correct by checking the status:

sudo -u postgres pgbackrest --stanza=main --log-level-console=info check

Ensure that the output shows a healthy configuration.

Step 3: Perform a Full Backup

Now, we will perform our first full backup using pgBackRest.

sudo -u postgres pgbackrest --stanza=main --type=full --log-level-console=info backup

This command will initiate a full backup of our PostgreSQL database to the previously specified path (/var/lib/pgbackrest).

Step 4: Schedule Regular Backups

We will schedule regular backups using cron to automate the backup process.

  1. Open the postgres user's cron table:

    sudo crontab -u postgres -e
  2. Add the following entries to schedule a weekly full backup (Sunday at midnight) and daily incremental backups on the other days, matching the backup strategy outlined earlier:

    0 0 * * 0 pgbackrest --stanza=main --type=full --log-level-console=info backup
    0 0 * * 1-6 pgbackrest --stanza=main --type=incr --log-level-console=info backup

This ensures that our PostgreSQL database is backed up daily while keeping storage and I/O overhead low.

Step 5: Monitor Backups Using pgBackRest Info

We can use pgBackRest's built-in tools to monitor our backups. To check the status of backups and validate that everything is running as expected, run:

sudo -u postgres pgbackrest info

This will provide details on the latest backups, their types (full, incremental), and the size of each. Now that pgBackRest is configured and running, we have a robust backup and restore solution in place, safeguarding our data.
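
For completeness, restoring from these backups is a single command. A minimal sketch of a delta restore, which rewrites only the files that differ (on a Patroni-managed node, pause Patroni or follow its documented restore procedure before doing this):

sudo systemctl stop postgresql
sudo -u postgres pgbackrest --stanza=main --delta --log-level-console=info restore
sudo systemctl start postgresql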

7. Network Configuration:

Following all the installations and configurations, we'll configure the VPS firewall (UFW) to allow traffic on the necessary ports for the services, as shown in the sketch below:

  • PostgreSQL: 5432
  • Consul: 8500
  • Vault: 8200
  • Any additional ports required for Traefik (if used) or other services.
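
A minimal sketch of the corresponding UFW commands, assuming UFW is installed; always allow SSH before enabling the firewall so we don't lock ourselves out:

sudo ufw allow OpenSSH
sudo ufw allow 5432/tcp   # PostgreSQL
sudo ufw allow 8500/tcp   # Consul HTTP API and UI
sudo ufw allow 8200/tcp   # Vault API
sudo ufw enable
sudo ufw status verbose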

By completing these prerequisites, we establish a functioning foundation for the implementation of the PostgreSQL cluster and its supporting services.

Implementation Plan

Now, let's proceed step by step to implement the plan introduced above.

Step 1: PostgreSQL High Availability with Patroni and Consul

In this step, we will deploy a high-availability PostgreSQL cluster using Patroni for automatic failover and cluster management. Consul will handle the Distributed Configuration Store (DCS) and leader election for the PostgreSQL nodes. Additionally, Vault will be used to manage secrets such as database credentials, and Cert-Manager will be responsible for handling SSL certificates to secure the communication within our services.

Why High Availability?

For any production-grade database, downtime can lead to data loss, broken user experience, and loss of revenue. This is why ensuring that our PostgreSQL database remains available and reliable even during failures is essential. Patroni achieves this by managing PostgreSQL clusters and orchestrating automatic failover in case of a node failure, minimizing downtime. Consul serves as the backend for the cluster, ensuring that one and only one node assumes the leader role for write operations. This setup also supports our ephemeral design, where recovery from failures is automated and downtime is minimized.

Deployment of PostgreSQL Cluster in Coolify

  1. Create the PostgreSQL Cluster:

    • Using Coolify v4.0.0-beta.355, navigate to the "Applications" section and deploy a new PostgreSQL service.
    • We will configure the PostgreSQL cluster with Patroni for high availability. Ensure that we allocate enough resources for the database cluster (based on the 8GB of RAM we have available).
  2. Configure Patroni with Consul:

    • Patroni will act as our PostgreSQL cluster manager. Coolify simplifies this process by allowing us to integrate Consul for cluster coordination.
    • Patroni needs to be set up to use Consul as its Distributed Configuration Store (DCS). This is crucial because Consul will handle leader election, ensuring that only one node handles write operations at a time.

Example Patroni Configuration:

In this configuration, we will define the roles and behavior for Patroni, Consul, and PostgreSQL. This will be applied to each node in our cluster. To ensure security and avoid hardcoding sensitive information like passwords in the configuration file, we should reference the environment variables directly in the YAML file. By doing this, we allow Coolify or any environment management system to inject those values securely at runtime.

scope: exo_pg_cluster
namespace: /patroni/
name: postgresql_node_1

restapi:
  listen: 0.0.0.0:8008
  connect_address: 153.92.208.222:8008  # Replace with actual IP of the node

consul:
  host: consul.service.consul:8500  # Consul coordinates the nodes
  scheme: http

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
      parameters:
        wal_level: replica
        hot_standby: "on"
        max_wal_senders: 5
        max_replication_slots: 5
        wal_keep_size: 64MB  # replaces wal_keep_segments, removed in PostgreSQL 13

postgresql:
  listen: 0.0.0.0:5432
  connect_address: 153.92.208.222:5432  # Replace with actual IP of the node
  data_dir: /var/lib/postgresql/data  # PostgreSQL data directory
  authentication:
    replication:
      username: {{ POSTGRES_USERNAME }}  # Reference to environment variable
      password: {{ POSTGRES_PASSWORD }}  # Reference to environment variable
    superuser:
      username: {{ POSTGRES_USERNAME }}  # Reference to environment variable
      password: {{ POSTGRES_PASSWORD }}  # Reference to environment variable
  parameters:
    max_connections: 100
    shared_buffers: "512MB"
    wal_keep_size: 64MB

This Patroni configuration sets up the necessary parameters for PostgreSQL replication, including using Consul for cluster coordination, and defines how failovers are handled. In this YAML file, {{ POSTGRES_USERNAME }} and {{ POSTGRES_PASSWORD }} are placeholders for environment variables. Note that Patroni does not expand this syntax itself: the values must be substituted at deploy time, either by Coolify's environment management rendering the file before Patroni reads it, or by a small templating step such as envsubst (see the sketch below). Alternatively, Patroni can read credentials directly from its own PATRONI_SUPERUSER_* and PATRONI_REPLICATION_* environment variables. Either way, we never hardcode sensitive credentials, keeping the configuration secure and flexible.
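
A minimal rendering sketch, assuming the template is saved as patroni.yml.tpl with ${POSTGRES_USERNAME}-style placeholders and that Coolify exports the variables into the container environment:

# Render the template with values injected by Coolify's environment management
envsubst '${POSTGRES_USERNAME} ${POSTGRES_PASSWORD}' < patroni.yml.tpl > /etc/patroni.yml
chmod 600 /etc/patroni.yml   # the rendered file now contains secrets
patroni /etc/patroni.yml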

Connecting Patroni and Consul in Coolify

In Coolify's service environment settings:

  • Define Consul as the service backend by entering its service address and port (default is 8500).
  • Use Coolify's environment variable section to configure Patroni’s access to Consul and PostgreSQL credentials:
    • PATRONI_SCOPE: exo_pg_cluster
    • PATRONI_NAME: postgresql_node_1
    • PATRONI_CONSUL_HOST: consul.service.consul:8500
    • POSTGRES_USERNAME: saExoDBA
    • POSTGRES_PASSWORD: securepassword123 (placeholder only; in practice injected from Vault)

By leveraging Coolify’s UI and environment variable management, the integration of Patroni, PostgreSQL, and Consul is simplified. We will use Consul to ensure that leader elections are consistently handled without manual intervention, contributing to the ephemeral nature of our infrastructure.

Resources Management and Performance Considerations

Given that our current system has 8GB of RAM, we will carefully configure PostgreSQL to balance between memory consumption and performance. We can start with 512MB of shared buffers (as shown in the Patroni configuration above) to limit the memory usage, and max_connections set to 100 to prevent overloading the system.

If needed, we can scale up the RAM based on observed performance metrics from Prometheus (introduced later in the monitoring steps) or add additional nodes to our cluster to balance the load and further improve availability.

Step 2: Deploying Prometheus, Grafana, and Node Exporter for Monitoring

We will now focus on setting up the monitoring stack using Prometheus, Grafana, and Node Exporter to ensure that we can monitor our PostgreSQL cluster, the system metrics of the VPS, and other performance indicators. This will be integrated with Coolify's service management, and Prometheus will persist long-term metrics to our existing PostgreSQL database cluster through a remote-storage adapter, following the ephemeral nature of our deployment plan.

1. Deploying Prometheus in Coolify

  1. In Coolify’s UI, create a new Service and select Prometheus.
  2. Prometheus always keeps recent metrics in its own local time-series database (TSDB); it cannot use PostgreSQL directly as its storage backend. To persist metrics long-term in our PostgreSQL cluster, we add a remote_write section to Prometheus' configuration that ships samples to a PostgreSQL-backed remote-storage adapter (the endpoint and port below are assumptions that depend on the adapter chosen):
remote_write:
  - url: "http://153.92.208.222:9201/write"  # example endpoint of the remote-storage adapter
    basic_auth:
      username: saExoDBA
      password: <Vault-stored-password>
  • The adapter writes the samples into our existing PostgreSQL cluster, which is managed by Patroni. The ephemeral nature of this setup ensures that, in case of a node failure, the adapter can reconnect and store metrics via the failover node without interruptions.

2. Configuring Prometheus to Monitor PostgreSQL

  1. Modify the Prometheus configuration (via Coolify’s config management) to scrape PostgreSQL metrics. We will use the postgres_exporter (default port 9187) together with the pg_stat_statements extension and node_exporter to collect the relevant metrics:
scrape_configs:
  - job_name: 'postgresql'
    static_configs:
      - targets: ['153.92.208.222:9187']  # postgres_exporter endpoint
    metrics_path: '/metrics'
  • The pg_stat_statements extension provides detailed query statistics, integrated within the database for query optimization and debugging; enabling it takes two steps, shown in the sketch below.
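
pg_stat_statements ships with PostgreSQL's contrib package but must be preloaded and created explicitly. A minimal sketch for one node, assuming postgresql-contrib is installed (in a Patroni-managed cluster, prefer setting shared_preload_libraries through Patroni's configuration so the DCS stays authoritative):

sudo -u postgres psql -c "ALTER SYSTEM SET shared_preload_libraries = 'pg_stat_statements';"
sudo systemctl restart postgresql   # preload changes require a restart
sudo -u postgres psql -d dbExocracy -c "CREATE EXTENSION IF NOT EXISTS pg_stat_statements;"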

3. Deploying Grafana in Coolify

  1. In Coolify, create another Service and select Grafana.
  2. Once Grafana is deployed, add Prometheus as a data source:
    • URL: http://prometheus:9090 (assuming Coolify routes between internal services correctly).
    • Set up dashboards for PostgreSQL using prebuilt Grafana dashboards, such as the PostgreSQL dashboard which includes detailed performance metrics.
  3. Configure Grafana Loki for logging and integrate it into Grafana's data sources to centralize logs for better visibility.

4. Deploying Node Exporter in Coolify

  1. In Coolify, create another Service and select Node Exporter.
  2. This will expose system-level metrics like CPU, memory, disk usage, and network performance from the VPS.
  3. Configure Prometheus to scrape Node Exporter data:
scrape_configs:
  - job_name: 'node_exporter'
    static_configs:
      - targets: ['153.92.208.222:9100']
  • This allows us to monitor the health of the VPS itself, providing detailed system metrics, which is crucial for performance monitoring and resource allocation. We will be using this data to evaluate our current RAM usage and assess the need for future upgrades.

5. Integrating with pgBouncer for Connection Pooling

  1. The remote-storage adapter and other clients can create a large number of database connections, especially when Prometheus scrapes frequently. We will mitigate this by using pgBouncer as a connection pooler.
  2. Point the adapter's PostgreSQL connection settings at pgBouncer as the intermediary to the PostgreSQL cluster.

Example connection settings directing writes through pgBouncer:

host: 153.92.208.222
port: 6432  # pgBouncer's port
user: saExoDBA
password: <Vault-stored-password>
database: dbExocracy
sslmode: require
  • pgBouncer will efficiently manage and reduce the number of open connections, preventing any database performance degradation caused by high connection counts.
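
A minimal pgBouncer configuration sketch for this setup; the file path, auth settings, and pool sizes are assumptions to tune for our workload:

; /etc/pgbouncer/pgbouncer.ini
[databases]
dbExocracy = host=153.92.208.222 port=5432 dbname=dbExocracy

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 200
default_pool_size = 20

Transaction pooling keeps the connection count low, though it is incompatible with session-level features such as prepared statements held open across transactions.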

Summary of Step 2

In this step, we deployed and configured the core monitoring tools:

  • Prometheus collects PostgreSQL and system metrics, keeping recent data in its local TSDB and persisting it to our existing PostgreSQL cluster through the remote-storage adapter, leveraging an ephemeral configuration for resilience.
  • Grafana is used to visualize metrics from Prometheus and logs from Loki, ensuring that we have clear insights into the performance and health of our services.
  • Node Exporter collects system metrics from the VPS to monitor its resource consumption, which will help guide us when planning for potential upgrades.

Next, we will proceed to configure OAuth2 authentication for secure access, followed by secrets management with Vault, and finally, complete the system with SSL certificates handled by Cert-Manager.


This step now ties together the Prometheus, Grafana, and Node Exporter deployments under Coolify while ensuring that the PostgreSQL cluster is efficiently monitored and Prometheus data is securely stored.

Additional Reading

Patroni for PostgreSQL

@CompileWithG

CompileWithG commented Oct 21, 2024

Step 3: first command (modification/best practices)
When enabling an authentication method in Vault, it's a good practice to use a custom path or name instead of the default, especially when you have multiple roles or use cases. This can help with organizing, managing, and tracking authentication methods in larger or more complex environments. Here are some best practices for naming conventions when enabling approle:

  1. Use Descriptive Names
    Use names that describe the purpose or the application that will use the role. This helps make it clear which authentication method is being used for what purpose.
    Example:
    vault auth enable -path="pgEdge" approle

@CompileWithG

CompileWithG commented Oct 21, 2024

1. Vault Binary

  • Location: /usr/local/bin/vault
  • Purpose: This directory is intended for user-installed binaries, and it is part of the default system path, making it easy to execute the Vault binary from anywhere in the command line.

2. Configuration Files

  • Location: /etc/vault/
  • Purpose: This directory is suitable for storing all configuration files related to Vault, including main configuration files, policy files, and any other data needed for Vault to operate correctly.

Example Structure:

  • Vault Binary:

    • /usr/local/bin/vault (the executable binary)
  • Configuration Files:

    • /etc/vault/config.hcl (main Vault configuration file)
    • /etc/vault/policies/ (directory for policy files)
      • /etc/vault/policies/pgEdge-spock-policy.hcl (example policy file)
      • /etc/vault/policies/team-policy.hcl (another policy file)

Steps to Set It Up:

  1. Ensure the Vault Binary is in the Correct Location:

    • Move or verify that the Vault binary is located at /usr/local/bin/vault:
      sudo mv /path/to/your/vault /usr/local/bin/vault
  2. Create the Configuration Directory:

    • Create a directory for Vault configurations if it doesn’t already exist:
      sudo mkdir -p /etc/vault/policies
  3. Store Configuration Files:

    • Place your configuration files in /etc/vault/ and organize any policies in /etc/vault/policies/.
  4. Set Appropriate Permissions:

    • Ensure that the directories and files have the correct permissions. For example:
      sudo chown -R root:root /etc/vault
      sudo chmod -R 755 /etc/vault

Summary

By storing the Vault binary in /usr/local/bin and configuration files in /etc/vault, you maintain a clean and organized file structure that adheres to best practices. This separation makes it easier for developers to manage, update, and access the Vault application while keeping configurations secure and organized.

@CompileWithG

What I learned about Vault's configuration folders:

1. auth

  • Purpose: This folder typically contains data related to authentication backends that you have enabled in Vault, such as AppRole, Kubernetes, LDAP, etc.
  • Contents: The subfolders within auth store metadata and token information related to the specific authentication methods configured in your Vault instance.

2. core

  • Purpose: This folder stores the core data for Vault, including information about the secret engines, policies, and other essential operational data.
  • Contents: It may include data related to the internal workings of Vault, such as the secret data, tokens, leases, and other core components necessary for Vault to function.

3. logical

  • Purpose: This directory is related to the logical data storage of secrets and other vault-specific information.
  • Contents: It usually contains the logical secrets stored in Vault, such as key-value pairs or any data generated or managed by various secret engines.

4. sys

  • Purpose: This folder holds system-related information and metadata about Vault's state and operations.
  • Contents: It may include system-level data regarding the status of Vault, audit logs, and other operational metadata needed to monitor and manage the Vault server.

Summary

  • The /opt/vault/data directory is structured to organize the different types of data that Vault manages, separating authentication information, core operational data, logical secrets, and system metadata. This organization helps maintain clarity and facilitates data management within the Vault server.

@CompileWithG

Comment on Gist: Creating AppRole for pgEdge with Vault Policy

To create an AppRole for the pgEdge application running on Coolify on a VPS and assign a policy that restricts the role to read secrets from the pgEdge/connection path in the KV secrets engine, follow these steps:

  1. Enable KV Secrets Engine (if not already enabled):
    First, enable the KV secrets engine at the path pgEdge/ to store connection secrets for the pgEdge application.

    vault secrets enable -path=pgEdge kv
  2. Write pgEdge Connection Secrets:
    Store your database connection credentials in the pgEdge/connection path. Replace db_username and db_password with your actual credentials.

    vault kv put pgEdge/connection db_username="postgres" db_password="your_secure_password"
  3. Create the Policy:
    Create a policy that restricts access to only the pgEdge/connection path. This policy will grant read access to the path containing the connection secrets.

    Create a new policy file pgEdge-read-policy.hcl:

    path "pgEdge/connection" {
      capabilities = ["read"]
    }

    Write the policy to Vault:

    vault policy write pgEdge-read pgEdge-read-policy.hcl
  4. Enable the AppRole Auth Method:
    Enable the AppRole authentication method (if it is not already enabled). This will allow the pgEdge application to authenticate to Vault using the AppRole mechanism.

    vault auth enable approle
  5. Create the AppRole and Assign the Policy:
    Create the pgEdge AppRole with the policy attached. This restricts the AppRole to only have the read capability for the pgEdge/connection path.

    vault write auth/approle/role/pgEdge policies="pgEdge-read"
  6. Retrieve the AppRole Role ID and Secret ID:
    After creating the role, retrieve the Role ID and Secret ID for the pgEdge AppRole. The Role ID is typically used to authenticate the application to Vault, while the Secret ID is used for a more secure login.

    vault read auth/approle/role/pgEdge/role-id
    vault write -f auth/approle/role/pgEdge/secret-id
  7. Authenticate Using the AppRole:
    The pgEdge application can now authenticate using the Role ID and Secret ID to access the Vault and read the secrets stored at pgEdge/connection.


This setup ensures that the pgEdge application can authenticate using the AppRole and access only the database connection credentials stored under pgEdge/connection without broader access to other secrets in Vault. This is a secure way to grant the application access to specific secrets with fine-grained control over permissions.
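
For reference, here is the same flow over Vault's HTTP API, which is how an application without the vault CLI would authenticate; the use of VAULT_ADDR and jq is an assumption:

# Log in with the AppRole and extract the client token
TOKEN=$(curl -s --request POST \
    --data '{"role_id":"<role_id>","secret_id":"<secret_id>"}' \
    "$VAULT_ADDR/v1/auth/approle/login" | jq -r '.auth.client_token')

# Read the connection secrets with the token (KV v1 mounted at pgEdge/)
curl -s --header "X-Vault-Token: $TOKEN" "$VAULT_ADDR/v1/pgEdge/connection" | jq '.data'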

@CompileWithG

CompileWithG commented Nov 7, 2024



Vault Configuration with Consul Storage Backend

The following shows the contents of the /etc/vault.d/vault.hcl file before configuring Consul.

# Copyright (c) HashiCorp, Inc.
# SPDX-License-Identifier: BUSL-1.1

# Full configuration options can be found at https://developer.hashicorp.com/vault/docs/configuration

ui = true

#mlock = true
#disable_mlock = true

# Add the api_addr here, just before the storage configuration
api_addr = "https://127.0.0.1:8200"

storage "file" {
  path = "/opt/vault/data"
}

#storage "consul" {
#  address = "127.0.0.1:8500"
#  path    = "vault"
#}

# HTTP listener
#listener "tcp" {
#  address = "127.0.0.1:8200"
#  tls_disable = 1
#}

# HTTPS listener
listener "tcp" {
  address       = "127.0.0.1:8200"
  tls_cert_file = "/opt/vault/tls/tls.crt"
  tls_key_file  = "/opt/vault/tls/tls.key"
}

# Enterprise license_path
# This will be required for enterprise as of v1.8
#license_path = "/etc/vault.d/vault.hclic"

What Changes When You Install Consul

When you install Consul and configure it to work with Vault, you will need to modify the /etc/vault.d/vault.hcl file. Consul will replace the file storage backend with the Consul storage backend, and the Vault server will use Consul for its storage. You'll also configure Vault to communicate with Consul through the listener and storage settings.

Here is what the changes will look like after integrating Consul:


Updated /etc/vault.d/vault.hcl with Consul Storage Backend

# Enable the Vault UI (already set in your current config)
ui = true

# Add the API address to interact with Vault
api_addr = "https://127.0.0.1:8200"

# Consul Storage Configuration
storage "consul" {
  address = "127.0.0.1:8500"  # Consul address
  path    = "vault"            # Path in Consul to store Vault data
}

# Set up the listener for Vault's HTTPS connection
listener "tcp" {
  address       = "127.0.0.1:8200"
  tls_cert_file = "/opt/vault/tls/tls.crt"
  tls_key_file  = "/opt/vault/tls/tls.key"
}

# Optionally enable the Enterprise license if applicable
#license_path = "/etc/vault.d/vault.hclic"

Key Changes

  1. Storage Backend:
    The storage backend is switched from file (local disk) to consul. Vault will now store its data in Consul at the specified path in Consul (vault in this case).

    storage "consul" {
      address = "127.0.0.1:8500"  # Address of the Consul server
      path    = "vault"            # Path in Consul to store Vault data
    }
  2. Consul Integration:
    Vault now uses Consul as its storage backend. It is essential to ensure Consul is running before starting Vault to avoid any issues with storage.

  3. Listener Configuration:
    Vault is configured to use HTTPS with the certificates specified (tls.crt and tls.key) for secure communication.


This setup will enable Vault to use Consul for storage, providing high availability, replication, and a more resilient storage backend.


@CompileWithG

To install Consul on the same VPS where Vault is installed, you can follow these steps:

Step 1: Update the System

Make sure your system is up-to-date by running the following commands:

sudo apt update && sudo apt upgrade -y

Step 2: Install Consul

  1. Add the HashiCorp repository to get the latest version of Consul.

    First, install the required dependencies and HashiCorp's GPG signing key:

    sudo apt install -y gnupg lsb-release wget
    wget -O- https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg
  2. Add the HashiCorp repository, signed with the key above:

    echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list
  3. Update the package list:

    sudo apt update
  4. Install Consul:

    sudo apt install consul -y
  5. Verify the installation:

    To check if Consul was installed correctly, you can run:

    consul --version

Step 3: Configure Consul for Vault

Once Consul is installed, you need to configure Vault to use Consul as its storage backend.

  1. Update Vault's Configuration (vault.hcl):

    Open /etc/vault.d/vault.hcl:

    sudo nano /etc/vault.d/vault.hcl

    Replace the storage "file" block with the following storage "consul" configuration:

    # Consul Storage Configuration
    storage "consul" {
      address = "127.0.0.1:8500"  # Consul address
      path    = "vault"            # Path in Consul to store Vault data
    }
  2. Enable Consul as a backend for Vault:

    By default, Vault uses the file storage backend. To switch to Consul, you'll replace it with the above configuration for Consul as the storage backend.


Step 4: Run Consul as a Service (Optional: If not started already)

If you want Consul to run as a service, you can configure it to start on boot.

  1. Create a Consul configuration file in /etc/consul.d/:

    sudo mkdir -p /etc/consul.d

    Then create a Consul configuration file:

    sudo nano /etc/consul.d/consul.hcl

    Add the following to enable Consul to run in server mode:

    server = true
    bootstrap_expect = 1
    data_dir = "/opt/consul"
    bind_addr = "0.0.0.0"
    client_addr = "0.0.0.0"
    advertise_addr = "127.0.0.1"
    ui = true
  2. Start and Enable Consul Service:

    To start Consul, run:

    sudo systemctl start consul

    Enable it to start on boot:

    sudo systemctl enable consul
  3. Verify Consul is running:

    You can check if Consul is running with:

    sudo systemctl status consul

Step 5: Start Vault with Consul Backend

Now that Consul is installed and configured, you can start Vault with the updated configuration:

sudo systemctl restart vault

Verify that Vault is correctly using Consul by checking its status:

vault status

Vault should now be using Consul as its storage backend.


Step 6: Verify Consul UI (Optional)

Once Consul is up and running, you can access the Consul web UI on port 8500:

  • Open a browser and navigate to: http://<your_vps_ip>:8500

You should be able to see the Consul UI where you can manage key-value stores, services, and nodes.


Final Configuration Summary:

  • Vault's storage backend has been switched from file to Consul.
  • Consul is installed and configured to run as a service.
  • Vault and Consul are running on the same VPS, with Vault using Consul as its storage backend.

This setup will allow Vault to store its data in Consul and leverage its high availability and replication features.

@CompileWithG

CompileWithG commented Nov 7, 2024

Workflow: Vault, pgEdge, and PostgreSQL

This document explains how pgEdge can interact with a PostgreSQL database by authenticating with Vault to retrieve credentials and using those credentials to read from and write to PostgreSQL.

Overview

  1. Vault is used to store and manage secrets, such as PostgreSQL database credentials (username and password).
  2. pgEdge authenticates with Vault using an AppRole to retrieve the credentials required to access PostgreSQL.
  3. pgEdge connects to the PostgreSQL database using the credentials fetched from Vault and can read/write data based on the permissions assigned to it.

Workflow

Step 1: pgEdge Authenticates with Vault

  • pgEdge authenticates with Vault using the AppRole authentication method.
  • The AppRole consists of a Role ID and Secret ID, which are used to authenticate pgEdge and allow it to retrieve the required credentials from Vault.
  • After successful authentication, pgEdge gets a Vault token that grants it access to secrets.

Step 2: pgEdge Fetches PostgreSQL Credentials from Vault

  • pgEdge uses the Vault token to read secrets stored in the Vault secrets engine (typically a KV (Key-Value) engine).
  • The PostgreSQL credentials (username and password) are stored in a Vault path, e.g., pgEdge/connection.
  • These credentials are dynamically retrieved by pgEdge when it needs to connect to PostgreSQL.

Example Vault secret retrieval:

vault read pgEdge/connection

Step 3: pgEdge Connects to PostgreSQL Using the Retrieved Credentials

  • With the credentials fetched from Vault (e.g., db_username and db_password), pgEdge connects to the PostgreSQL database.
  • It uses these credentials to authenticate with PostgreSQL, establishing a connection.

Step 4: pgEdge Reads from and Writes to PostgreSQL

  • pgEdge can now interact with the PostgreSQL database:
    • Read data from PostgreSQL by executing queries such as SELECT.
    • Write data to PostgreSQL by executing queries such as INSERT, UPDATE, or DELETE.

Example of pgEdge reading from PostgreSQL:

  • Fetching configuration data stored in a table or a particular value.

Example of pgEdge writing to PostgreSQL:

  • Inserting a new record into a configuration table or updating an existing value.
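
A minimal shell sketch combining steps 2 through 4; the query is illustrative:

# Fetch credentials from Vault (KV v1 mounted at pgEdge/)
DB_USER=$(vault kv get -field=db_username pgEdge/connection)
DB_PASS=$(vault kv get -field=db_password pgEdge/connection)

# Connect to PostgreSQL with them and run a query
PGPASSWORD="$DB_PASS" psql -h 153.92.208.222 -d dbExocracy -U "$DB_USER" -c "SELECT now();"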

Step 5: Secure Access

  • Vault can be configured to dynamically generate database credentials for pgEdge, which means:
    • pgEdge gets temporary, short-lived credentials from Vault.
    • These credentials are automatically rotated and expire after a set duration, reducing the risk of credential leakage or misuse.

Considerations

  • PostgreSQL Role Permissions:

    • Ensure that pgEdge has the correct permissions in PostgreSQL to perform the necessary actions (e.g., reading, writing, modifying tables).
    • This is typically done by creating a PostgreSQL role with the necessary privileges and ensuring pgEdge uses this role.
  • Vault Policies:

    • Define specific Vault policies for pgEdge to control which secrets it can access.
    • Example: pgEdge can be granted read-only access to secrets at the path pgEdge/connection while being restricted from accessing other paths.

Example Vault Policy for pgEdge

# pgEdge-read.hcl

path "pgEdge/connection" {
  capabilities = ["read"]
}
  • This policy ensures that pgEdge can only read the secrets at the pgEdge/connection path and not modify any other secrets in Vault.

Conclusion

In this setup, pgEdge authenticates with Vault, retrieves the necessary credentials to access PostgreSQL, and then reads from and writes to PostgreSQL using these credentials. Vault ensures that the credentials are securely stored and dynamically managed, enhancing security through rotation and access controls.

@CompileWithG

📘 Documentation: 3-Node Consul Cluster Setup on VPS Using Docker Compose

Overview

This setup guides the creation of a 3-node Consul cluster on a VPS using Docker Compose. The cluster nodes are configured in server mode and connected via a custom Docker bridge network. The Consul UI is enabled, and the nodes automatically join each other to form the cluster.

Steps to Set Up the Cluster

  1. Create the docker-compose.yml file with the following content:

    version: '3'
    services:
      consul1:
        image: hashicorp/consul
        command: "agent -server -bootstrap-expect=3 -node=consul1 -client=0.0.0.0 -ui -retry-join=consul2 -retry-join=consul3"
        environment:
          CONSUL_BIND_INTERFACE: eth0
        ports:
          - "8500:8500"  # UI and HTTP API
          - "8600:8600/udp"  # DNS
          - "8301:8301"  # Serf LAN for cluster
          - "8301:8301/udp"
        networks:
          - consul-network
    
      consul2:
        image: hashicorp/consul
        command: "agent -server -node=consul2 -client=0.0.0.0 -ui -retry-join=consul1 -retry-join=consul3"
        environment:
          CONSUL_BIND_INTERFACE: eth0
        ports:
          - "8501:8500"
          - "8601:8600/udp"
          - "8302:8301"
          - "8302:8301/udp"
        networks:
          - consul-network
    
      consul3:
        image: hashicorp/consul
        command: "agent -server -node=consul3 -client=0.0.0.0 -ui -retry-join=consul1 -retry-join=consul2"
        environment:
          CONSUL_BIND_INTERFACE: eth0
        ports:
          - "8502:8500"
          - "8602:8600/udp"
          - "8303:8301"
          - "8303:8301/udp"
        networks:
          - consul-network
    
    networks:
      consul-network:
        driver: bridge
  2. Deploy the Cluster by running the following command:

    sudo docker-compose up -d
  3. Verify Cluster Status:

    • Check the status of the containers:
      sudo docker ps
    • Check the health of the Consul cluster by inspecting its members (the container name follows Compose's <project>_<service>_1 pattern, so adjust it to match the output of docker ps):
      sudo docker exec -it consul-cluster_consul1_1 consul members
    • You can access each node's Consul UI at http://<your-vps-ip>:8500, http://<your-vps-ip>:8501, or http://<your-vps-ip>:8502.

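Beyond consul members, cluster health can also be checked through Consul's standard status endpoints on the HTTP API:

# Returns the address of the current Raft leader (empty if no leader is elected)
curl http://<your-vps-ip>:8500/v1/status/leader

# Lists the Raft peers participating in consensus
curl http://<your-vps-ip>:8500/v1/status/peers
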
Explanation of Configuration

  • command: Each node runs the consul agent in server mode, joining the other nodes to form a cluster using the -retry-join flag.
  • Ports: Each node exposes the following ports:
    • 8500 for the Consul UI and HTTP API.
    • 8600 for DNS (UDP).
    • 8301 for Serf LAN communication between the nodes.
  • environment: CONSUL_BIND_INTERFACE is set to eth0 to ensure Consul binds to the correct network interface on the VPS.
  • Networks: All nodes are connected via the custom consul-network (using the bridge driver) so the containers can reach one another and resolve each other by service name.

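As a quick illustration of the DNS interface on port 8600 (assuming dig is installed on the VPS):

# Ask Consul's DNS interface for the registered Consul servers
dig @127.0.0.1 -p 8600 consul.service.consul
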
Notes

  • The cluster nodes will try to join each other using the -retry-join flag until they are successfully connected.
  • The Consul UI is available on different ports for each node (8500 for consul1, 8501 for consul2, and 8502 for consul3).
  • With 3 server nodes, the cluster keeps quorum (2 of 3) and remains available if a single node fails.

@CompileWithG

Consul Configuration and Vault Integration

This document outlines the steps taken to configure Consul and link it with Vault for storage and service discovery.


Consul Configuration File (/etc/consul.d/consul.json)

The Consul agent configuration file is located at /etc/consul.d/consul.json. It defines configuration parameters for Consul nodes, such as data storage, server setup, node communication, and encryption. Note that JSON files do not allow comments, so the file must contain only bare keys and values; in this example, bind_addr is the VPS's public IP address, retry_join lists the IPs of the other Consul nodes, and encrypt holds the gossip encryption key. Below is the content of the configuration file:

{
  "datacenter": "dc1",
  "data_dir": "/var/lib/consul",
  "log_level": "INFO",
  "node_name": "consul-node-1",
  "server": true,
  "bootstrap_expect": 3,
  "ui": true,
  "client_addr": "0.0.0.0",
  "bind_addr": "153.92.208.222",
  "retry_join": ["172.20.0.3", "172.20.0.4", "172.20.0.2"],
  "encrypt": "your_consul_encryption_key",
  "performance": {
    "raft_multiplier": 1
  }
}

Key Parameters:

  • datacenter: Specifies the name of the datacenter for the Consul cluster.
  • data_dir: The directory where Consul stores its data.
  • log_level: Defines the verbosity of logs.
  • node_name: A unique identifier for each Consul node.
  • server: Specifies that this node is a Consul server.
  • bootstrap_expect: The number of servers to wait for before bootstrapping the cluster and electing a leader.
  • ui: Enables the Consul Web UI.
  • client_addr: The address the client interfaces (HTTP API, DNS) bind to; 0.0.0.0 accepts connections on all interfaces.
  • bind_addr: The IP address the Consul agent binds to.
  • retry_join: The IP addresses of other Consul nodes to join the cluster.
  • encrypt: An encryption key used for gossip encryption between nodes.

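The gossip encryption key for the encrypt field should not be invented by hand; Consul generates a suitable base64-encoded key:

# Generate a key to paste into the "encrypt" field
consul keygen
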
Docker Configuration for Consul

To use the Consul configuration file with Docker, the docker-compose.yml file mounts the host's configuration directory (./consul/config) into each container's /etc/consul.d directory, so the Consul containers pick up the configuration file at startup.

Here is the updated Docker Compose configuration for the Consul service:

version: '3'
services:
  consul1:
    image: hashicorp/consul
    command: "agent -server -bootstrap-expect=3 -node=consul1 -client=0.0.0.0 -ui -retry-join=consul2 -retry-join=consul3 -config-dir=/etc/consul.d"
    environment:
      CONSUL_BIND_INTERFACE: eth0
    volumes:
      - ./consul/config:/etc/consul.d  # Mount the config directory from host to container
    ports:
      - "8500:8500"  # UI and HTTP API
      - "8600:8600/udp"  # DNS
      - "8301:8301"  # Serf LAN for cluster
      - "8301:8301/udp"
    networks:
      - consul-network

  consul2:
    image: hashicorp/consul
    command: "agent -server -node=consul2 -client=0.0.0.0 -ui -retry-join=consul1 -retry-join=consul3 -config-dir=/etc/consul.d"
    environment:
      CONSUL_BIND_INTERFACE: eth0
    volumes:
      - ./consul/config:/etc/consul.d  # Mount the config directory from host to container
    ports:
      - "8501:8500"
      - "8601:8600/udp"
      - "8302:8301"
      - "8302:8301/udp"
    networks:
      - consul-network

  consul3:
    image: hashicorp/consul
    command: "agent -server -node=consul3 -client=0.0.0.0 -ui -retry-join=consul1 -retry-join=consul2 -config-dir=/etc/consul.d"
    environment:
      CONSUL_BIND_INTERFACE: eth0
    volumes:
      - ./consul/config:/etc/consul.d  # Mount the config directory from host to container
    ports:
      - "8502:8500"
      - "8602:8600/udp"
      - "8303:8301"
      - "8303:8301/udp"
    networks:
      - consul-network

networks:
  consul-network:
    driver: bridge

Key Docker Compose Changes:

  • Volumes: The ./consul/config directory on the host is mounted to /etc/consul.d inside the container to ensure Consul uses the correct configuration file.
  • Ports: The necessary ports are exposed for Consul's UI and for communication between nodes in the cluster.
  • Networks: All Consul containers are placed on the same custom network (consul-network) to allow internal communication between nodes.

Vault Integration with Consul

Vault is now configured to use Consul as the storage backend. This setup enables Vault to securely store its data in Consul, leveraging Consul’s high availability and distributed nature.

Steps to Integrate Vault with Consul:

  1. Set the storage backend for Vault to Consul by modifying Vault's configuration file (e.g., vault.hcl). The address 127.0.0.1:8500 assumes Vault runs on the same host where consul1's HTTP API port is published:

    storage "consul" {
      address = "127.0.0.1:8500"
      path    = "vault/"
    }
  2. After configuring Vault to use Consul, start Vault and verify that it’s using Consul as the storage backend.

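A minimal way to perform that verification (assuming Vault runs directly on the host, as the configuration above implies):

# Start Vault with the Consul-backed configuration
vault server -config=vault.hcl

# In another shell: the "Storage Type" field should report "consul"
vault status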

Conclusion

With these changes:

  • Consul is configured and running on a cluster of three nodes.
  • The Consul configuration is mounted into the containers, and the necessary ports are exposed.
  • Vault is integrated with Consul as its storage backend, giving it secure, highly available data storage.

This setup ensures that both Consul and Vault are configured correctly for high availability, secure communication, and data management.
