@vdparikh
Last active November 15, 2024 19:21

Building a Key Management System (KMS) on top of an on-premises Hardware Security Module (HSM) involves creating a software layer that interacts with the HSM to provide key lifecycle management and cryptographic operations. This KMS can then expose APIs or SDKs for other services to use. Here’s what it takes to build such a system:


1. Understand HSM Capabilities

HSMs provide hardware-backed secure key storage and cryptographic operations. Before building the KMS, understand the following features of your HSM:

  • Key Storage: Maximum number of keys, supported key types (e.g., RSA, AES, ECC).
  • Cryptographic Operations: Supported operations and algorithms (e.g., AES encryption, RSA or ECDSA signing, HMAC).
  • APIs or SDKs: Interfaces provided for interacting with the HSM (e.g., PKCS#11, JCE, KMIP, proprietary APIs).
  • Access Control: User roles, permissions, and multi-tenancy support.

2. Define KMS Features

Your KMS should expose a high-level abstraction over the HSM, offering features like:

  • Key Lifecycle Management:
    • Generate and import keys.
    • Rotate, revoke, delete, and archive keys.
    • Define key expiration and policies.
  • Cryptographic Operations:
    • Encryption/decryption.
    • Signing/verification.
    • Key derivation and wrapping.
  • Access Management:
    • Policies for users, roles, and services accessing keys.
    • Integration with authentication systems (e.g., LDAP, IAM, Keycloak).
  • Audit Logging:
    • Record all operations for compliance and debugging.
  • Multi-Tenancy:
    • Segregate keys and policies for different services or clients.

3. Choose or Build APIs

Design APIs for your KMS to make it consumable by other services:

  • REST APIs or gRPC for client communication.
  • Examples:
    • POST /keys: Create a new key.
    • POST /encrypt: Encrypt data.
    • POST /decrypt: Decrypt data.
  • Consider using OpenAPI/Swagger for API documentation.

4. HSM Integration Layer

Implement the integration layer to interact with the HSM:

  • Use the HSM's SDK or APIs (e.g., PKCS#11, JCE).
  • Abstract low-level HSM commands into reusable functions (e.g., CreateKey, EncryptData).
  • Ensure efficient connection pooling and session management.

Example in Go using PKCS#11:

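package hsm

import (
	"fmt"

	"github.com/miekg/pkcs11"
)

// HSMClient wraps a PKCS#11 context loaded from the vendor's library.
type HSMClient struct {
	ctx *pkcs11.Ctx
}

// NewHSMClient loads the PKCS#11 module at libraryPath and initializes it.
func NewHSMClient(libraryPath string) (*HSMClient, error) {
	ctx := pkcs11.New(libraryPath)
	if ctx == nil {
		// pkcs11.New returns nil when the library cannot be loaded.
		return nil, fmt.Errorf("failed to load PKCS#11 library %q", libraryPath)
	}
	if err := ctx.Initialize(); err != nil {
		ctx.Destroy()
		return nil, err
	}
	return &HSMClient{ctx: ctx}, nil
}

// Example: Create Key
// CreateKey generates a non-extractable AES key of keySize bytes (AES only, as a sketch).
// session must be an open, logged-in read-write session.
func (c *HSMClient) CreateKey(session pkcs11.SessionHandle, label string, keySize int) (pkcs11.ObjectHandle, error) {
	template := []*pkcs11.Attribute{
		pkcs11.NewAttribute(pkcs11.CKA_LABEL, label),
		pkcs11.NewAttribute(pkcs11.CKA_VALUE_LEN, keySize),
		pkcs11.NewAttribute(pkcs11.CKA_TOKEN, true),        // persist on the token
		pkcs11.NewAttribute(pkcs11.CKA_SENSITIVE, true),    // never reveal in plaintext
		pkcs11.NewAttribute(pkcs11.CKA_EXTRACTABLE, false), // cannot be wrapped out
		pkcs11.NewAttribute(pkcs11.CKA_ENCRYPT, true),
		pkcs11.NewAttribute(pkcs11.CKA_DECRYPT, true),
	}
	mech := []*pkcs11.Mechanism{pkcs11.NewMechanism(pkcs11.CKM_AES_KEY_GEN, nil)}
	return c.ctx.GenerateKey(session, mech, template)
}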
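
The session management point above can be sketched as a fixed-size pool. This is a minimal illustration using only the standard library: Session, SessionPool, and DefaultAcquireTimeout are hypothetical names, and the open callback stands in for whatever call actually opens and logs in an HSM session.

```go
package main

import (
	"errors"
	"time"
)

// Session stands in for an HSM session handle (hypothetical type).
type Session struct{ ID int }

// DefaultAcquireTimeout is an assumed value; tune it to your HSM's latency.
const DefaultAcquireTimeout = 50 * time.Millisecond

// ErrPoolTimeout is returned when no session frees up in time.
var ErrPoolTimeout = errors.New("timed out waiting for an HSM session")

// SessionPool hands out a fixed number of pre-opened sessions.
type SessionPool struct {
	sessions chan *Session
}

// NewSessionPool opens size sessions via the supplied open function.
func NewSessionPool(size int, open func(id int) (*Session, error)) (*SessionPool, error) {
	p := &SessionPool{sessions: make(chan *Session, size)}
	for i := 0; i < size; i++ {
		s, err := open(i)
		if err != nil {
			return nil, err
		}
		p.sessions <- s
	}
	return p, nil
}

// Acquire blocks until a session is free or the timeout elapses.
func (p *SessionPool) Acquire(timeout time.Duration) (*Session, error) {
	select {
	case s := <-p.sessions:
		return s, nil
	case <-time.After(timeout):
		return nil, ErrPoolTimeout
	}
}

// Release returns a session to the pool for reuse.
func (p *SessionPool) Release(s *Session) { p.sessions <- s }
```

A caller would Acquire a session, run one HSM operation, and Release it (typically with defer). Bounding the pool also caps concurrent load on the HSM, which usually has a limited number of sessions.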

5. Implement Key Policy and Access Control

  • Define policies for key usage (e.g., "Key X can only decrypt data for Service Y").
  • Implement fine-grained access control, ensuring that services and users have the minimum necessary permissions.
  • Integrate with your organization’s identity and access management (IAM) system for user authentication.

6. Design for High Availability

  • HSM Cluster: If supported, configure your HSM in a high-availability cluster.
  • Failover Logic: Implement logic to detect and failover to a backup HSM.
  • Database for Metadata: Use a database (e.g., PostgreSQL) to store metadata like key policies, usage logs, and mappings.
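
The failover logic above might look like the following sketch. Backend, FailoverBackend, and ErrUnavailable are hypothetical names, and the sketch assumes every HSM in the list already holds a replica of the same key. Only availability errors trigger a retry on the next HSM.

```go
package main

import "errors"

// Backend is a hypothetical interface over one HSM (or HSM partition).
type Backend interface {
	Encrypt(keyID string, plaintext []byte) ([]byte, error)
}

// ErrUnavailable marks errors that justify trying the next HSM.
var ErrUnavailable = errors.New("hsm unavailable")

// FailoverBackend tries each backend in order until one is reachable.
type FailoverBackend struct {
	Backends []Backend
}

// Encrypt returns the first successful result, or the last availability error.
func (f *FailoverBackend) Encrypt(keyID string, plaintext []byte) ([]byte, error) {
	lastErr := ErrUnavailable
	for _, b := range f.Backends {
		out, err := b.Encrypt(keyID, plaintext)
		if err == nil {
			return out, nil
		}
		if !errors.Is(err, ErrUnavailable) {
			// A real failure (unknown key, policy denial): retrying elsewhere won't help.
			return nil, err
		}
		lastErr = err
	}
	return nil, lastErr
}

// stubBackend is a test double; a real backend would wrap an HSM session.
type stubBackend struct {
	name string
	down bool
}

// Encrypt tags the output with the backend name so the caller can see who served it.
func (s stubBackend) Encrypt(keyID string, plaintext []byte) ([]byte, error) {
	if s.down {
		return nil, ErrUnavailable
	}
	return append([]byte(s.name+":"), plaintext...), nil
}
```

With a primary that is down and a healthy backup, the request is served by the backup; a production version would also add health checks so a dead primary is not retried on every call.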

7. Security Considerations

  • HSM Access Control: Ensure only the KMS service can access the HSM directly.
  • Secure API: Use HTTPS and mutual TLS for KMS APIs.
  • Data Handling: Ensure that plaintext data and key material are never logged or exposed.
  • Audit Logs: Keep tamper-evident audit logs (e.g., append-only storage or hash-chained entries) for compliance.
  • Key Encryption: Use HSM-backed master keys to encrypt data encryption keys (DEKs).
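
The last point (envelope encryption) can be illustrated with a short sketch. It uses AES-256-GCM from the standard library, and masterKey is a software stand-in: with a real HSM the wrap and unwrap steps would be calls into the HSM, so the master key would never appear in process memory. The function and type names are illustrative.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
)

// sealGCM encrypts data with AES-GCM and prepends the random nonce.
func sealGCM(key, data []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, data, nil), nil
}

// openGCM reverses sealGCM.
func openGCM(key, sealed []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ct := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, nil)
}

// Envelope stores the wrapped DEK next to the data it protects.
type Envelope struct {
	WrappedDEK []byte
	Ciphertext []byte
}

// EncryptEnvelope generates a fresh DEK, encrypts the data, then wraps the DEK.
func EncryptEnvelope(masterKey, plaintext []byte) (*Envelope, error) {
	dek := make([]byte, 32)
	if _, err := rand.Read(dek); err != nil {
		return nil, err
	}
	ct, err := sealGCM(dek, plaintext)
	if err != nil {
		return nil, err
	}
	wrapped, err := sealGCM(masterKey, dek) // in production: an HSM wrap call
	if err != nil {
		return nil, err
	}
	return &Envelope{WrappedDEK: wrapped, Ciphertext: ct}, nil
}

// DecryptEnvelope unwraps the DEK and decrypts the data.
func DecryptEnvelope(masterKey []byte, e *Envelope) ([]byte, error) {
	dek, err := openGCM(masterKey, e.WrappedDEK) // in production: an HSM unwrap call
	if err != nil {
		return nil, err
	}
	return openGCM(dek, e.Ciphertext)
}
```

Because each payload gets its own DEK, bulk encryption stays in software while the HSM only handles small wrap and unwrap operations.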

8. Additional Enhancements

  • Service-Level Policies: Allow services to define encryption policies, like preferred algorithms or key rotation frequency.
  • Key Caching: For performance, cache frequently used keys securely in memory with automatic expiration.
  • Token-Based Access: Issue short-lived access tokens for services to interact with the KMS.
  • SDKs: Provide client SDKs in languages like Go, Java, and Python to simplify integration.
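
As a rough illustration of the key caching idea, the sketch below keeps key material in memory for a bounded time and wipes it on expiry. KeyCache, DefaultKeyTTL, and the load callback (which would typically unwrap a key via the HSM) are assumptions for this example, not part of any library.

```go
package main

import (
	"sync"
	"time"
)

// DefaultKeyTTL is an assumed value; shorter TTLs reduce exposure, longer ones reduce HSM load.
const DefaultKeyTTL = 5 * time.Minute

type cacheEntry struct {
	key     []byte
	expires time.Time
}

// KeyCache keeps key material in memory for a bounded time.
type KeyCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	entries map[string]cacheEntry
}

// NewKeyCache creates a cache whose entries expire after ttl.
func NewKeyCache(ttl time.Duration) *KeyCache {
	return &KeyCache{ttl: ttl, entries: make(map[string]cacheEntry)}
}

// Get returns a copy of the cached key, calling load on a miss or after expiry.
// The lock is held during load, which keeps the sketch simple but serializes misses.
func (c *KeyCache) Get(keyID string, load func(string) ([]byte, error)) ([]byte, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[keyID]; ok {
		if time.Now().Before(e.expires) {
			return append([]byte(nil), e.key...), nil
		}
		zero(e.key) // best-effort wipe of the expired material
		delete(c.entries, keyID)
	}
	key, err := load(keyID)
	if err != nil {
		return nil, err
	}
	c.entries[keyID] = cacheEntry{key: key, expires: time.Now().Add(c.ttl)}
	return append([]byte(nil), key...), nil
}

// zero overwrites a byte slice in place.
func zero(b []byte) {
	for i := range b {
		b[i] = 0
	}
}
```

Get hands out copies so that wiping an expired entry cannot corrupt a key a caller is still using; callers should wipe their own copy when done. Go gives no hard guarantee that memory is scrubbed, so this is a mitigation rather than a substitute for keeping keys in the HSM.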

9. Example System Architecture

High-Level Architecture:

  1. Services/Applications → Consume KMS APIs for cryptographic operations.
  2. KMS → Abstracts HSM operations and provides policies, logging, and multi-tenancy.
  3. HSM → Secure storage and execution of cryptographic operations.

10. Tools & Frameworks

  • Go Libraries for HSMs:
    • miekg/pkcs11: PKCS#11 bindings for Go.
    • Proprietary SDKs (e.g., Thales, Utimaco, Marvell LiquidSecurity).
  • Database Options:
    • PostgreSQL for key metadata and policy storage.
  • API Frameworks:
    • Use frameworks like gin or echo for RESTful APIs in Go.

Implementing Key Policy and Access Control in a Key Management System (KMS) involves establishing mechanisms to define, enforce, and audit permissions and restrictions on how cryptographic keys are used. Below is a breakdown of the components and considerations involved:


1. Key Policy Design

Key policies define who can do what with a cryptographic key. These policies are stored as JSON or a similar format and attached to keys or key groups.

Elements of a Key Policy

  • Key ID or Key Group: Specifies the scope of the policy (individual key or group of keys).
  • Principals: Users, services, or roles allowed to interact with the key (e.g., user:alice, role:serviceA).
  • Actions:
    • encrypt
    • decrypt
    • sign
    • verify
    • rotate
    • delete
  • Conditions:
    • Time-bound restrictions (e.g., key usage is allowed only during business hours).
    • IP-based restrictions (e.g., access from specific network ranges).
    • Tags or metadata conditions (e.g., environment: production).
  • Effect:
    • allow or deny actions for specified principals.

Example Policy (JSON):

{
  "keyId": "key123",
  "statements": [
    {
      "principal": "user:alice",
      "actions": ["encrypt", "decrypt"],
      "effect": "allow",
      "conditions": {
        "ipAddress": "192.168.1.0/24"
      }
    },
    {
      "principal": "role:admin",
      "actions": ["rotate", "delete"],
      "effect": "allow"
    }
  ]
}
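
One way to load the policy above into Go is sketched below. The struct names and the IPAllowed helper are illustrative, and the sketch assumes conditions are simple string values as in the example. The condition check fails closed when the CIDR range or client address cannot be parsed.

```go
package main

import (
	"encoding/json"
	"net"
)

// Statement is one allow/deny rule inside a policy.
type Statement struct {
	Principal  string            `json:"principal"`
	Actions    []string          `json:"actions"`
	Effect     string            `json:"effect"`
	Conditions map[string]string `json:"conditions,omitempty"`
}

// Policy is the document attached to a key.
type Policy struct {
	KeyID      string      `json:"keyId"`
	Statements []Statement `json:"statements"`
}

// ParsePolicy decodes a JSON policy document.
func ParsePolicy(doc []byte) (*Policy, error) {
	var p Policy
	if err := json.Unmarshal(doc, &p); err != nil {
		return nil, err
	}
	return &p, nil
}

// IPAllowed reports whether clientIP satisfies the statement's ipAddress
// condition. A statement without that condition accepts any address.
func (s Statement) IPAllowed(clientIP string) bool {
	cidr, ok := s.Conditions["ipAddress"]
	if !ok {
		return true
	}
	_, network, err := net.ParseCIDR(cidr)
	ip := net.ParseIP(clientIP)
	if err != nil || ip == nil {
		return false // fail closed on malformed input
	}
	return network.Contains(ip)
}

// examplePolicy is the policy document from the text.
const examplePolicy = `{
  "keyId": "key123",
  "statements": [
    {
      "principal": "user:alice",
      "actions": ["encrypt", "decrypt"],
      "effect": "allow",
      "conditions": {"ipAddress": "192.168.1.0/24"}
    },
    {
      "principal": "role:admin",
      "actions": ["rotate", "delete"],
      "effect": "allow"
    }
  ]
}`
```

For the example document, a request from 192.168.1.42 satisfies alice's statement while one from 10.0.0.1 does not.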

2. Authentication

The system must verify who is making the request. This is typically achieved via:

  • Identity Providers (IdP):
    • Use an IdP such as Keycloak, speaking OpenID Connect (built on OAuth 2.0) or SAML, for user authentication.
  • API Tokens:
    • JWTs issued to applications or users.
  • Mutual TLS (mTLS):
    • Authenticate requests using client certificates.

3. Authorization

Authorization ensures that the authenticated user can perform the requested action on the specified key.

Authorization Workflow:

  1. Principal Identification:
    • Extract principal details from the authentication token (e.g., user ID, roles).
  2. Policy Evaluation:
    • Match the request against applicable policies.
    • Evaluate conditions (time, IP, etc.).
  3. Decision:
    • Return allow or deny.

Enforcement Strategies:

  • Role-Based Access Control (RBAC):
    • Assign permissions based on user roles (e.g., admin, operator, viewer).
  • Attribute-Based Access Control (ABAC):
    • Use attributes like environment, department, or time for fine-grained control.
  • Policy-Based Access Control (PBAC):
    • Use explicit policies to define granular permissions.

4. Policy Assignment and Management

Policies can be assigned at:

  • Key Level:
    • Attach policies directly to individual keys.
  • Key Group Level:
    • Assign policies to a group of keys for easier management.
  • User Level:
    • Assign policies to users or roles across multiple keys.

APIs for policy management:

  • Create Policy:
    • POST /policies (Define a new policy).
  • Attach Policy:
    • POST /keys/{keyId}/policy (Attach policy to a specific key).
  • Update Policy:
    • PUT /policies/{policyId} (Modify an existing policy).
  • Retrieve Policy:
    • GET /policies/{policyId} (View policy details).

5. Access Enforcement

The KMS must enforce access restrictions at runtime during cryptographic operations.

Runtime Steps:

  1. Verify Token or Certificate: Ensure the request is from an authenticated source.
  2. Evaluate Policies: Check if the request matches any allow or deny policies.
  3. Log and Monitor: Record the action for audit and compliance purposes.
  4. Respond to Request:
    • Grant or reject the requested operation.

6. Logging and Audit Trails

Tracking access and key usage is crucial for compliance and debugging.

  • Log Entries Include:
    • Key ID.
    • Action performed (e.g., encrypt, rotate).
    • Principal (user/service) initiating the action.
    • Timestamp.
    • Result (success or failure).
  • APIs for Audit Logs:
    • GET /logs (Filter by key ID, user, or action).
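
A common way to make such log entries tamper-evident is to chain them by hash, so that editing or removing an earlier entry breaks every later one. The sketch below is one possible shape, not a prescribed format: the field names follow the list above, timestamps are kept as RFC 3339 strings for simplicity, and AuditLog is an in-memory stand-in for real storage.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
)

// AuditEntry is one log record, linked to its predecessor by hash.
type AuditEntry struct {
	KeyID     string
	Action    string
	Principal string
	Timestamp string // RFC 3339 text
	Result    string
	PrevHash  string
	Hash      string
}

// entryHash covers every field except Hash itself.
func entryHash(e AuditEntry) string {
	h := sha256.New()
	for _, f := range []string{e.PrevHash, e.KeyID, e.Action, e.Principal, e.Timestamp, e.Result} {
		h.Write([]byte(f))
		h.Write([]byte{0}) // separator so adjacent fields cannot blur together
	}
	return hex.EncodeToString(h.Sum(nil))
}

// AuditLog is an append-only, hash-chained list of entries.
type AuditLog struct {
	Entries []AuditEntry
}

// Append links the entry to the current tail and stores it.
func (l *AuditLog) Append(e AuditEntry) {
	e.PrevHash = ""
	if n := len(l.Entries); n > 0 {
		e.PrevHash = l.Entries[n-1].Hash
	}
	e.Hash = entryHash(e)
	l.Entries = append(l.Entries, e)
}

// Verify returns the index of the first inconsistent entry, or -1 if the chain is intact.
func (l *AuditLog) Verify() int {
	prev := ""
	for i, e := range l.Entries {
		if e.PrevHash != prev || e.Hash != entryHash(e) {
			return i
		}
		prev = e.Hash
	}
	return -1
}
```

The chain only detects tampering if the latest hash is also recorded somewhere the attacker cannot rewrite (for example, signed periodically by an HSM key or shipped to a separate system).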

7. Monitoring and Alerting

  • Monitor policy violations or unauthorized access attempts.
  • Trigger alerts for:
    • Excessive key usage.
    • Access from unauthorized locations.
    • Expired or inactive keys being used.

8. Handling Policy Updates

  • Versioning:
    • Keep track of policy changes and allow rollbacks.
  • Grace Period:
    • Apply new policies with a delay to allow affected systems to adjust.
  • Policy Conflict Resolution:
    • If multiple policies apply, determine precedence (e.g., deny overrides allow).
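
The "deny overrides allow" rule can be made concrete with a small evaluator. Rule and Decide are illustrative names, and matching is simplified to exact principal and action strings. Any matching deny wins, a matching allow is otherwise required, and the default is deny.

```go
package main

// Rule is a flattened policy statement: one principal, one action, one effect.
type Rule struct {
	Principal string
	Action    string
	Effect    string // "allow" or "deny"
}

// Decide applies deny-overrides conflict resolution with a default of deny.
func Decide(rules []Rule, principal, action string) string {
	decision := "deny" // nothing matched yet
	for _, r := range rules {
		if r.Principal != principal || r.Action != action {
			continue
		}
		if r.Effect == "deny" {
			return "deny" // an explicit deny always wins
		}
		if r.Effect == "allow" {
			decision = "allow"
		}
	}
	return decision
}
```

With both an allow and a deny rule for user:alice on decrypt, the result is "deny"; a principal with no matching rule is also denied.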

9. Testing and Validation

  • Test policies with different users and actions to ensure correct enforcement.
  • Use simulation APIs to validate policies:
    • POST /simulate
      • Input: Policy and action details.
      • Output: Evaluation result (allow or deny).

End-to-End Flow

  1. User or Service: Authenticates with the KMS.
  2. Request: Sends a request to perform an action (e.g., encrypt).
  3. KMS:
    • Validates the request using attached policies.
    • Logs the request for audit purposes.
    • Performs the action if authorized.
  4. Response: Returns success or error to the user/service.

A Key Management System (KMS) exposes APIs that enable applications to perform secure cryptographic operations and manage keys. These APIs typically cover key lifecycle management, cryptographic operations, policy enforcement, access control, and audit logging. Below is a categorized list of APIs you might expose in your KMS:


1. Key Management APIs

APIs for creating, managing, and rotating keys.

  • Key Creation

    • POST /keys
      • Input: Key type (AES, RSA, etc.), key size, purpose (encrypt/decrypt, sign/verify).
      • Output: Key ID, metadata.
    • Example: Create an AES-256 encryption key.
  • Key Retrieval

    • GET /keys/{keyId}
      • Input: Key ID.
      • Output: Key metadata (type, size, status, expiration).
  • Key Rotation

    • POST /keys/{keyId}/rotate
      • Input: Key ID.
      • Output: New key version, metadata.
    • Rotates the key while maintaining key continuity for decrypting older data.
  • Key Deletion

    • DELETE /keys/{keyId}
      • Input: Key ID.
      • Output: Confirmation of deletion.
    • Often, keys are "soft deleted" for a retention period before permanent deletion.
  • Key Import/Export (if allowed)

    • POST /keys/import
      • Input: Encrypted key material, metadata.
      • Output: Key ID.
    • GET /keys/{keyId}/export
      • Input: Key ID.
      • Output: Encrypted key material.
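
To show how rotation can keep older data decryptable, here is a minimal sketch in which every ciphertext carries the key version that produced it. VersionedKey and the 4-byte version header are assumptions made for illustration; the key material lives in software here, whereas an HSM-backed KMS would keep each version inside the HSM.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/binary"
	"errors"
)

// VersionedKey keeps every version so older ciphertexts stay decryptable.
type VersionedKey struct {
	Current  uint32
	Versions map[uint32][]byte // version -> 32-byte AES key
}

// NewVersionedKey creates a key with an initial version 1.
func NewVersionedKey() (*VersionedKey, error) {
	k := &VersionedKey{Versions: make(map[uint32][]byte)}
	return k, k.Rotate()
}

// Rotate adds a new version and makes it current; old versions are retained.
func (k *VersionedKey) Rotate() error {
	material := make([]byte, 32)
	if _, err := rand.Read(material); err != nil {
		return err
	}
	k.Current++
	k.Versions[k.Current] = material
	return nil
}

func gcmFor(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// Encrypt output layout: 4-byte big-endian version | nonce | ciphertext+tag.
func (k *VersionedKey) Encrypt(plaintext []byte) ([]byte, error) {
	gcm, err := gcmFor(k.Versions[k.Current])
	if err != nil {
		return nil, err
	}
	header := make([]byte, 4)
	binary.BigEndian.PutUint32(header, k.Current)
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	sealed := gcm.Seal(nil, nonce, plaintext, header) // header is authenticated
	return append(append(header, nonce...), sealed...), nil
}

// Decrypt reads the version header and uses the matching key version.
func (k *VersionedKey) Decrypt(ct []byte) ([]byte, error) {
	if len(ct) < 4 {
		return nil, errors.New("ciphertext too short")
	}
	material, ok := k.Versions[binary.BigEndian.Uint32(ct[:4])]
	if !ok {
		return nil, errors.New("unknown key version")
	}
	gcm, err := gcmFor(material)
	if err != nil {
		return nil, err
	}
	if len(ct) < 4+gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce := ct[4 : 4+gcm.NonceSize()]
	return gcm.Open(nil, nonce, ct[4+gcm.NonceSize():], ct[:4])
}
```

After a rotation, new data is encrypted under version 2 while data encrypted under version 1 still decrypts, which is the continuity POST /keys/{keyId}/rotate is meant to preserve.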

2. Cryptographic APIs

APIs for using keys to perform cryptographic operations.

  • Encryption

    • POST /encrypt
      • Input: Key ID, plaintext, associated data (optional).
      • Output: Ciphertext.
    • Example: Encrypt sensitive data using an AES key.
  • Decryption

    • POST /decrypt
      • Input: Key ID, ciphertext, associated data (optional).
      • Output: Plaintext.
  • Signing

    • POST /sign
      • Input: Key ID, message to sign.
      • Output: Signature.
    • Example: Generate a digital signature for a document.
  • Verification

    • POST /verify
      • Input: Key ID, message, signature.
      • Output: Verification result (valid/invalid).
  • Key Wrapping/Unwrapping

    • POST /wrap
      • Input: Wrapping key ID, key material to wrap.
      • Output: Wrapped key material.
    • POST /unwrap
      • Input: Wrapping key ID, wrapped key material.
      • Output: Original key material.
  • Random Number Generation

    • GET /random
      • Output: Cryptographically secure random bytes.

3. Policy and Access Control APIs

APIs for managing access control and usage policies.

  • Policy Creation

    • POST /policies
      • Input: Policy JSON (key-level permissions, access restrictions).
      • Output: Policy ID.
    • Example: Define which users/services can use a key for encryption or signing.
  • Policy Assignment

    • POST /keys/{keyId}/policy
      • Input: Policy ID.
      • Output: Confirmation.
    • Example: Attach a policy to a specific key.
  • Policy Retrieval

    • GET /policies/{policyId}
      • Input: Policy ID.
      • Output: Policy JSON.

4. Audit Logging APIs

APIs for retrieving logs of operations for compliance and debugging.

  • Log Retrieval
    • GET /logs
      • Input: Filters (e.g., time range, key ID, operation type).
      • Output: List of logs.
    • Example: Retrieve all encryption and decryption operations for a specific key.

5. Monitoring and Metadata APIs

APIs to track system health, key usage, and status.

  • Key Usage Statistics

    • GET /keys/{keyId}/usage
      • Input: Key ID.
      • Output: Metrics (e.g., number of encryptions, decryptions).
  • System Health

    • GET /health
      • Output: System status (e.g., HSM connection, API availability).
  • Metadata Management

    • POST /keys/{keyId}/metadata
      • Input: Metadata (e.g., tags, description).
      • Output: Confirmation.
    • GET /keys/{keyId}/metadata
      • Output: Metadata JSON.

6. Authentication and Authorization APIs

If the KMS includes its own authentication/authorization layer:

  • Token Issuance
    • POST /auth/token
      • Input: User/service credentials.
      • Output: Access token.
  • Access Verification
    • POST /auth/verify
      • Input: Token, requested operation.
      • Output: Permission status (allowed/denied).

7. Multi-Tenancy APIs (if applicable)

For environments with multiple clients or services:

  • Tenant Management
    • POST /tenants
      • Input: Tenant information (e.g., name, admin).
      • Output: Tenant ID.
  • Tenant Keys
    • GET /tenants/{tenantId}/keys
      • Output: List of keys scoped to the tenant.

8. Example API Workflow

Use Case: Encrypt Data

  1. Create Key:
    POST /keys
    Response: { "keyId": "abc123" }

  2. Encrypt Data:
    POST /encrypt
    Input: { "keyId": "abc123", "plaintext": "Hello, World!" }
    Response: { "ciphertext": "encryptedData" }

  3. Decrypt Data:
    POST /decrypt
    Input: { "keyId": "abc123", "ciphertext": "encryptedData" }
    Response: { "plaintext": "Hello, World!" }
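
The request and response bodies in this workflow could be modeled in Go as below. This is only a sketch: the type and function names are illustrative, and it deviates from the literal example by declaring plaintext and ciphertext as []byte, which encoding/json carries as base64 so that arbitrary binary data survives the trip through JSON.

```go
package main

import "encoding/json"

// EncryptRequest mirrors the POST /encrypt body.
type EncryptRequest struct {
	KeyID     string `json:"keyId"`
	Plaintext []byte `json:"plaintext"` // base64 on the wire
}

// EncryptResponse mirrors the POST /encrypt response.
type EncryptResponse struct {
	Ciphertext []byte `json:"ciphertext"` // base64 on the wire
}

// EncodeEncryptRequest builds the JSON body a client would send.
func EncodeEncryptRequest(keyID string, plaintext []byte) (string, error) {
	b, err := json.Marshal(EncryptRequest{KeyID: keyID, Plaintext: plaintext})
	return string(b), err
}

// DecodeEncryptRequest parses the JSON body on the server side.
func DecodeEncryptRequest(body string) (EncryptRequest, error) {
	var req EncryptRequest
	err := json.Unmarshal([]byte(body), &req)
	return req, err
}
```

Encoding key "abc123" with the plaintext "Hello, World!" produces {"keyId":"abc123","plaintext":"SGVsbG8sIFdvcmxkIQ=="}.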


9. Advanced Features (Optional)

  • Key Labeling and Tagging:
    • To organize and search for keys.
  • Key Search:
    • GET /keys?filter=label:backup_key
  • Geo-Restrictions:
    • Define geographical usage policies for keys.
  • Custom Key Attributes:
    • Allow clients to add custom attributes to keys.

Does the KMS call the HSM for every cryptographic operation? Not necessarily. It depends on how the KMS is architected and the desired balance between performance and security. Here's a breakdown:


1. Direct HSM Calls for Every Operation

In this model, the KMS forwards cryptographic operations (like encrypt, decrypt, sign, or verify) to the HSM based on the key ID.

  • Advantages:

    • Maximum security: The key material never leaves the HSM, ensuring a hardware-rooted trust.
    • Simplified compliance: Many regulations require key operations to occur inside certified HSMs.
  • Disadvantages:

    • Performance bottleneck: HSMs, while secure, have limited throughput compared to software-based operations.
    • Latency: Every API call involving a cryptographic operation incurs HSM communication overhead.
  • Use Cases:

    • High-security environments requiring strong guarantees (e.g., financial transactions, government systems).

2. Cached Keys in Memory (Hybrid Model)

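In this model, the KMS obtains key material through the HSM on first use (typically by unwrapping a data key under an HSM-held master key, since HSM-resident keys are usually non-extractable) and caches it in memory (or a secure enclave) for subsequent operations. The cache can expire based on time-to-live (TTL) or specific usage policies.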

  • Advantages:

    • Improved performance: Avoids repeated HSM calls for the same key.
    • Reduced HSM load: Ideal for high-throughput environments.
    • Flexible trade-offs: The cache can be encrypted and isolated in secure hardware like Intel SGX or AWS Nitro Enclaves.
  • Disadvantages:

    • Slightly reduced security: The key material, even if encrypted, is temporarily outside the HSM.
    • Complexity: Implementing and securing the cache adds design overhead.
  • Use Cases:

    • Scenarios requiring a balance between performance and security (e.g., cloud-native applications with high transaction volumes).

3. Delegation to HSM for Sensitive Operations Only

In this model, only certain sensitive operations (like key generation, signing, or decryption) are delegated to the HSM. Less critical or symmetric encryption operations might use software cryptographic libraries with keys securely derived or cached.

  • Advantages:

    • Optimized for performance: Non-critical operations don’t involve the HSM.
    • Security for critical operations: Keeps the most sensitive operations in the HSM.
    • Cost efficiency: Reduces reliance on high-cost HSM capacity.
  • Disadvantages:

    • Partitioning logic: Developers must clearly delineate sensitive operations to route to the HSM.
  • Use Cases:

    • Systems where a mix of security and performance is necessary (e.g., API gateways or IoT platforms).

4. All Software-Based with HSM Key Root

In this model, the HSM is used only for generating and storing the root key(s). Derived keys are used for actual cryptographic operations and are handled by the KMS or software libraries like Google Tink.

  • Advantages:

    • Maximum performance: All cryptographic operations occur in software.
    • Scalability: Independent of HSM throughput limits.
    • HSM provides root-of-trust: Ensures keys are securely seeded and rotated.
  • Disadvantages:

    • Lower security: Keys or derived key material must be carefully managed outside the HSM.
    • Regulatory challenges: May not meet requirements for on-HSM operations.
  • Use Cases:

    • Low-security environments or use cases like tokenization, where direct HSM calls are unnecessary.

Decision Factors for HSM Involvement

  1. Security Requirements:
    • If key material must never leave the HSM, all operations should be performed within the HSM.
  2. Performance Needs:
    • For high-throughput systems, caching or hybrid models may be required.
  3. Cost Considerations:
    • HSMs are expensive and throughput-limited, so reducing dependency on HSMs can save costs.
  4. Regulatory Compliance:
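    • Standards such as PCI DSS may dictate when hardware-protected keys are required. GDPR does not mandate HSMs, but HSM-backed keys can support its requirement for appropriate technical safeguards.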
  5. Operation Types:
    • Asymmetric operations (e.g., signing) are more suited to direct HSM involvement, while symmetric operations (e.g., AES encryption) can often be offloaded.

Recommended Best Practices

  • Use HSM for Key Management Operations:
    • Generate, store, and rotate keys inside the HSM.
  • Limit HSM Use for Performance-Critical Operations:
    • Cache keys or use derived keys for non-sensitive, high-throughput tasks.
  • Audit All HSM Interactions:
    • Keep a detailed log of every interaction with the HSM.
  • Encrypt and Isolate Key Material Outside the HSM:
    • Use secure memory or enclave technologies when caching keys.

In the context of a high-availability Key Management System (KMS), the Database for Metadata is a crucial component for ensuring that all instances of the KMS can operate seamlessly, maintain consistency, and recover from failures. Below is a detailed explanation of what the metadata database entails:


What is Metadata in a KMS?

Metadata is the information, other than the key material itself, required for managing cryptographic keys and operations. It includes everything necessary to identify, manage, and control access to keys, and it still deserves protection even though it contains no keys. Examples include:

  • Key Identifiers:
    • Unique IDs or aliases for keys (e.g., key123, payment-key-01).
  • Key Attributes:
    • Creation date, expiration date, last rotation date, and algorithm type.
  • Policies:
    • Access control policies attached to keys (e.g., roles, conditions).
  • Usage Stats:
    • Number of operations performed with the key (e.g., encryptions, decryptions).
  • Key State:
    • Whether a key is active, disabled, or deleted (soft-deleted for recovery).
  • Auditing Information:
    • Logs or pointers to detailed logs of key operations.
  • Replication Metadata (in multi-region setups):
    • Information about where the key is replicated and its consistency state.
  • Tags/Labels:
    • Custom metadata for categorization (e.g., env:prod, team:finance).

Metadata Database Design

Database Schema

The schema design depends on the specific features of your KMS but might look like this:

1. Key Table

Stores metadata about each cryptographic key.

Column Name | Data Type   | Description
------------|-------------|-------------------------------------------
key_id      | String (PK) | Unique identifier for the key.
alias       | String      | User-friendly alias for the key.
type        | String      | Key type (e.g., AES, RSA, HMAC).
state       | String      | Current state (e.g., active, disabled).
created_at  | Timestamp   | Date and time of creation.
rotated_at  | Timestamp   | Last rotation date.
expiry_date | Timestamp   | Expiry date for the key.
policy_id   | String (FK) | Reference to the policy governing the key.
region      | String      | Region where the key is primarily stored.
tags        | JSON        | Key-value pairs for categorization.

2. Policy Table

Stores policies that define access control for keys.

Column Name     | Data Type   | Description
----------------|-------------|-------------------------------------
policy_id       | String (PK) | Unique identifier for the policy.
policy_document | JSON        | Access control rules in JSON format.
created_at      | Timestamp   | Date and time of creation.
updated_at      | Timestamp   | Date and time of the last update.

3. Audit Table

Tracks operations performed on the keys.

Column Name | Data Type   | Description
------------|-------------|----------------------------------------
audit_id    | String (PK) | Unique identifier for the log entry.
key_id      | String (FK) | The key involved in the operation.
action      | String      | The operation (e.g., encrypt, decrypt).
principal   | String      | User/service performing the operation.
timestamp   | Timestamp   | When the operation was performed.
result      | String      | Success or failure of the operation.

Database Technology

For high availability, the database should support:

  • Distributed Architecture:
    • Use databases like PostgreSQL (with replication), CockroachDB, or Amazon Aurora.
  • Horizontal Scalability:
    • Ensure that the database can scale out as the KMS grows.
  • Failover Capabilities:
    • Automatic failover to secondary instances in case of primary failure.
  • Multi-Region Support:
    • Sync metadata across regions for global redundancy.
  • Encryption-at-Rest:
    • Encrypt metadata so it stays confidential even if the underlying storage or backups are compromised.

Example Queries

Here are some example queries that KMS might run against the metadata database:

  1. Retrieve Key Metadata:

    SELECT * FROM keys WHERE key_id = 'key123';
  2. List All Active Keys:

    SELECT key_id, alias FROM keys WHERE state = 'active';
  3. Log an Audit Entry:

    INSERT INTO audits (audit_id, key_id, action, principal, timestamp, result)
    VALUES ('audit001', 'key123', 'encrypt', 'user:alice', NOW(), 'success');
  4. Get Policy for a Key:

    SELECT p.policy_document
    FROM policies p
    JOIN keys k ON p.policy_id = k.policy_id
    WHERE k.key_id = 'key123';

Metadata Management Considerations

Consistency Across Instances

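  • Rely on the database's transactional guarantees when multiple KMS nodes read and write metadata. Distributed databases such as CockroachDB use a consensus protocol (Raft) internally, so the KMS nodes themselves do not need to implement Raft or Paxos.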

Caching for Performance

  • Frequently accessed metadata (e.g., active keys and policies) can be cached in an in-memory store like Redis for low-latency access.

Backup and Recovery

  • Regular backups of the database are essential to recover metadata in case of a disaster.
  • Implement Point-in-Time Recovery (PITR) for quick restoration.

Versioning

  • Maintain versions of metadata to track changes over time, which aids in audits and troubleshooting.

Security Requirements

  1. Database Encryption:

    • Use Transparent Data Encryption (TDE) to protect data at rest, and add column- or application-level encryption for sensitive fields like policy documents.
  2. Access Control:

    • Limit who can read/write to the database using strict RBAC.
  3. Audit Logging:

    • Log every query or update made to the database for compliance.
  4. Integration with HSM:

    • Store only metadata in the database. Keys themselves reside in the HSM.

How does the root key come from the HSM? This is a crucial part of the process.

In a typical HSM-based Key Management System (KMS) workflow, the root key never leaves the HSM in plaintext. Instead, the KMS calls the HSM's secure APIs to derive keys, encrypt, or sign data, and the HSM is responsible for handling the key securely. Key derivation therefore does not load the root key into application memory: the root key stays inside the HSM, and the derivation runs within the HSM itself, invoked by software that talks to the HSM over a secure channel.

Here’s how you might go about using the root key from an HSM to derive a key:


Key Management Workflow with HSM

  1. Key Generation in HSM:

    • A root key (e.g., AES-256) is generated and stored securely in the HSM.
    • The root key remains within the HSM, and you don’t pull it out directly into the application or memory.
  2. Key Derivation in HSM:

    • Instead of using the root key in software (as a software-only HKDF call such as hkdf.New would), the KMS calls the HSM to perform the key derivation internally, using the root key and context as input.

    The key derivation process can be carried out by leveraging the HSM’s API, which supports operations like HMAC or HKDF internally, using the root key. This is done by passing the necessary parameters to the HSM through its secure interface.


How the Interaction Works with an HSM:

1. Deriving Keys within the HSM (Example with HMAC or HKDF)

When you need to derive a key based on the root key stored in the HSM, you would perform something like this:

  • Step 1: Provide Context and Parameters You provide the necessary input parameters (context, salt, etc.) to the HSM, which could include things like the operation type ("encryption", "signing"), a unique identifier (such as a user ID or session), or additional randomness.

  • Step 2: Perform Key Derivation Using HSM APIs You use the HSM's API to perform key derivation (e.g., using HMAC, HKDF, or another key derivation function) directly within the HSM.

    • The HSM would use the root key to perform the HMAC or HKDF operation and return a derived key.
  • Step 3: Use Derived Key for Cryptographic Operations The derived key is then used for cryptographic operations, such as encryption, signing, etc.


2. Example HSM API Call for Key Derivation:

Let’s say you are using an HSM (such as Thales Luna, formerly SafeNet, or AWS CloudHSM) that supports HMAC- or HKDF-based derivation. The request format below is illustrative, not a specific vendor API. The steps in your application might look like this:

  1. Send a Request to the HSM for Key Derivation:

    • The application sends a request to the HSM to derive a key from the root key stored in the HSM. The request includes the context and any other parameters needed for the derivation (e.g., salt, label).

    Request:

    DeriveKeyRequest {
        rootKeyId: "aes256-root-key",
        context: "encrypt-user-123",
        salt: "randomSalt123",
        algorithm: "HKDF",
        outputLength: 32 // (256-bit key)
    }
    
  2. HSM Performs Key Derivation:

    • The HSM internally uses its root key (aes256-root-key) to perform the key derivation, such as with HMAC or HKDF, and then returns the derived key.

    HSM Response:

    DerivedKey: "derived-256-bit-key"
    
  3. Use the Derived Key for Cryptographic Operations:

    • The derived key can now be used for encryption, decryption, signing, or other cryptographic operations.

3. Workflow with Google Tink and HSM:

Now, assuming you're using Google Tink and the root key is stored securely in the HSM, you would rely on the HSM to perform key derivation using the root key, then use the derived key for Tink operations.

import (
    "fmt"

    "github.com/google/tink/go/aead/subtle"
)

// KeyDeriver is a hypothetical interface for an HSM client that can run a KDF
// over a root key held inside the HSM. DeriveKey is not part of any specific
// vendor SDK; map it to whatever derivation call your HSM exposes.
type KeyDeriver interface {
    DeriveKey(rootKeyID string, context []byte) ([]byte, error)
}

// DeriveKeyFromHSM asks the HSM to derive a key from the root key. The root
// key itself never enters application memory.
func DeriveKeyFromHSM(hsmClient KeyDeriver, context []byte) ([]byte, error) {
    derivedKey, err := hsmClient.DeriveKey("root-key-id", context)
    if err != nil {
        return nil, fmt.Errorf("failed to derive key: %w", err)
    }
    return derivedKey, nil
}

// EncryptWithDerivedKey encrypts with AES-GCM using the raw derived key
// (16 or 32 bytes). Tink's aead package has no NewKeysetFromBytes; keyset
// handles come from key templates or serialized keysets, so raw key bytes
// go through the lower-level aead/subtle package here.
func EncryptWithDerivedKey(derivedKey, plaintext []byte) ([]byte, error) {
    a, err := subtle.NewAESGCM(derivedKey)
    if err != nil {
        return nil, fmt.Errorf("failed to create AEAD: %w", err)
    }
    return a.Encrypt(plaintext, nil)
}

In this workflow, the derivation from the root key is done securely by the HSM itself, and you don’t need to manually manage the root key in the application.


Summary

In practice, the root key stored in the HSM is not directly accessible. Instead, you send a request to the HSM to derive keys using that root key. The HSM performs the cryptographic operation (e.g., HKDF or HMAC) internally and returns the derived key. This derived key is then used for cryptographic operations in your KMS or with libraries like Google Tink.

This approach ensures that:

  • The root key never leaves the HSM in plaintext.
  • You can perform key derivation in a secure, scalable, and efficient manner.

The concept of derived keys involves creating secondary keys from a root key (stored in a secure location, such as an HSM) using a deterministic key derivation process. These derived keys are then used for cryptographic operations like encryption, decryption, or signing. This approach provides a scalable and efficient way to perform operations without frequent interactions with the HSM. Here's how it works:


1. Key Derivation Process

The KMS or software cryptographic library derives keys using a Key Derivation Function (KDF). Common KDFs include:

  • HKDF (HMAC-based Key Derivation Function):

    • Generates cryptographic subkeys using a root key, input parameters (context), and a salt value.
    • Example: DerivedKey = HKDF(RootKey, Context, Salt, OutputLength)
  • PBKDF2 (Password-Based Key Derivation Function 2):

    • Suitable for creating keys from passwords or low-entropy inputs.
  • Google Tink:

    • A cryptographic library rather than a KDF in itself; its PRF primitives (HKDF, HMAC, AES-CMAC) can be used with deterministic inputs to derive subkeys.

The derived keys depend on:

  • Root Key (Master Key): Stored securely in the HSM.
  • Context Information: Unique identifiers, such as application names, user IDs, or key usage types.
  • Salt (Optional): Random data to prevent predictable keys.
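
To make the HKDF formula above concrete, here is a standard-library sketch of the extract-and-expand steps from RFC 5869 with SHA-256. It is for illustration only: rootKey is held in software here, whereas in an HSM-backed design the same computation would run inside the HSM. DeriveKey, SameKey, and ErrLength are names chosen for this example.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"errors"
)

// ErrLength is returned when the requested output length is not supported.
var ErrLength = errors.New("hkdf: requested length out of range")

// hkdfExtract computes PRK = HMAC-SHA256(salt, inputKey).
func hkdfExtract(salt, inputKey []byte) []byte {
	if len(salt) == 0 {
		salt = make([]byte, sha256.Size) // RFC 5869 default: HashLen zero bytes
	}
	mac := hmac.New(sha256.New, salt)
	mac.Write(inputKey)
	return mac.Sum(nil)
}

// hkdfExpand computes T(i) = HMAC(PRK, T(i-1) | info | i) and concatenates the blocks.
func hkdfExpand(prk, info []byte, length int) []byte {
	var okm, t []byte
	for counter := byte(1); len(okm) < length; counter++ {
		mac := hmac.New(sha256.New, prk)
		mac.Write(t)
		mac.Write(info)
		mac.Write([]byte{counter})
		t = mac.Sum(nil)
		okm = append(okm, t...)
	}
	return okm[:length]
}

// DeriveKey derives length bytes from rootKey, bound to the given context.
func DeriveKey(rootKey, salt []byte, context string, length int) ([]byte, error) {
	if length <= 0 || length > 255*sha256.Size {
		return nil, ErrLength
	}
	return hkdfExpand(hkdfExtract(salt, rootKey), []byte(context), length), nil
}

// SameKey compares two keys in constant time.
func SameKey(a, b []byte) bool { return hmac.Equal(a, b) }
```

The same root key and context always yield the same derived key, while changing the context (for example "encrypt-user-123" versus "encrypt-user-456") yields an unrelated key. In production code, a maintained implementation such as golang.org/x/crypto/hkdf is preferable to hand-rolled HKDF.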

2. Key Derivation in Action

Step-by-Step Workflow:

  1. Root Key in HSM:

    • A high-entropy master key is generated and stored securely in the HSM. For example, a 256-bit AES key.
  2. Key Derivation in KMS:

    • When an application requests a key for specific operations, the KMS derives the key from the root key and a unique context (ideally by having the HSM run the KDF, so the root key stays inside it).
    • Example:
      DerivedKey = HKDF(RootKey, "encrypt-user-123", Salt, 32)  // 32 bytes = 256 bits
      
  3. Key Usage in Software:

    • The derived key is passed to a cryptographic library like Google Tink to perform operations like encryption or signing.
  4. No Root Key Exposure:

    • The root key remains inside the HSM, ensuring that even if the derived key is compromised, other keys or the root key remain secure.
  5. Efficient Cryptographic Operations:

    • Derived keys are cached in memory (securely) for performance and used for repeated operations.

Advantages of Using Derived Keys

  1. Improved Performance:

    • Cryptographic operations use derived keys locally in software, avoiding the latency of frequent HSM interactions.
  2. Granularity:

    • Derived keys can be unique to users, sessions, or operations, offering fine-grained security control.
  3. Scalability:

    • The system can handle a large number of cryptographic operations without overloading the HSM.
  4. Security Isolation:

    • The root key never leaves the HSM, and derived keys are limited in scope and validity.

Example: Using Derived Keys in Google Tink

Here's an example workflow using Google Tink:

Key Derivation

  1. Root Key Generation (in HSM):

    • Store a root key securely in the HSM (e.g., AES-256).
  2. Context for Derivation:

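    • Combine contextual data (e.g., user_id, operation, timestamp) to derive a unique key. Record any input that cannot be recomputed later (such as the timestamp) alongside the ciphertext, or the same key cannot be re-derived for decryption.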
  3. Key Derivation Code (Using HKDF):

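    import (
        "crypto/sha256"
        "io"

        "golang.org/x/crypto/hkdf"
    )

    // DeriveKey runs HKDF-SHA256 in software. Note that rootKey is in process
    // memory here; with an HSM-held root key, the derivation would run inside
    // the HSM instead.
    func DeriveKey(rootKey []byte, context []byte) ([]byte, error) {
        kdf := hkdf.New(sha256.New, rootKey, nil, context) // nil salt; context is the HKDF "info"
        derivedKey := make([]byte, 32)                     // 256-bit key
        if _, err := io.ReadFull(kdf, derivedKey); err != nil {
            return nil, err
        }
        return derivedKey, nil
    }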

Using Derived Key for Encryption

After deriving the key, use it with an AEAD (authenticated encryption with associated data) primitive. Tink's keyset-based aead package has no constructor that takes raw key bytes, so this example uses the lower-level aead/subtle package:

import (
	"fmt"

	"github.com/google/tink/go/aead/subtle"
)

func EncryptWithDerivedKey(derivedKey, plaintext []byte) ([]byte, error) {
	// Build an AES-GCM AEAD directly from the raw derived key (16 or 32 bytes)
	a, err := subtle.NewAESGCM(derivedKey)
	if err != nil {
		return nil, fmt.Errorf("failed to create AEAD: %w", err)
	}

	// Encrypt the plaintext (no associated data)
	return a.Encrypt(plaintext, nil)
}

Best Practices for Derived Keys

  1. Limit Lifetime and Scope:
    • Use short-lived derived keys and regenerate them frequently to minimize risk.
  2. Ensure Unique Context:
    • Always include unique identifiers in the derivation context to prevent key reuse.
  3. Secure Derived Keys:
    • Store derived keys only in secure memory and clear them after use.
  4. Audit HSM Interactions:
    • Maintain logs for all root key usage to detect anomalies.