Notes and Comments on Modeling Secrets in Pydantic

Basic Usage and Example

from pydantic import SecretStr

s1 = SecretStr("abc")
s2 = SecretStr("")
s3 = SecretStr("abc")

sxs = (s1, s2, s3)
print(sxs)

def demo(sx: SecretStr) -> None:
    print(repr(sx))
    # The "box" has bool/non-zero support?
    if sx:
        print(f"Box'ed Secret is non-zero. {sx}")
    
    # is this different from `if sx`:
    if sx.get_secret_value():
        print("Value is non-empty")

_ = list(map(demo, sxs))
    
# Comparing Box'ed values?         
msg = "Equal" if s1 == s3 else "non-Equal"
print(f"Secrets are {msg}")

print(("Hashes", list(map(hash, sxs))))

Outs:

(SecretStr('**********'), SecretStr(''), SecretStr('**********'))
SecretStr('**********')
Box'ed Secret is non-zero. **********
Value is non-empty
SecretStr('')
SecretStr('**********')
Box'ed Secret is non-zero. **********
Value is non-empty
Secrets are Equal
('Hashes', [5699326906497411070, 0, 5699326906497411070])

Strictness

from pydantic import BaseModel, ConfigDict, Field, SecretStr

class A(BaseModel):
    model_config = ConfigDict(strict=True)
    x: SecretStr
    y: SecretStr


# Even with strict=True, a plain str is accepted: the value doesn't have to be a SecretStr instance, which is a bit surprising.
a = A(x="1234", y=SecretStr("abc"))

Outs:

A(x=SecretStr('**********'), y=SecretStr('**********'))
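
If the goal is to accept only values that are already wrapped, one option is Pydantic's InstanceOf annotation. The sketch below is an assumption about how the model could be tightened, not something these notes rely on elsewhere; StrictA is a hypothetical name.

```python
from pydantic import BaseModel, InstanceOf, SecretStr, ValidationError


class StrictA(BaseModel):
    # InstanceOf adds an isinstance() check when validating Python input,
    # so a bare str is no longer accepted for this field.
    x: InstanceOf[SecretStr]


ok = StrictA(x=SecretStr("1234"))
print(ok)

try:
    StrictA(x="1234")
    rejected = False
except ValidationError:
    rejected = True

print(f"Plain str rejected: {rejected}")
```

This only covers Python input; how JSON input should be treated is a separate decision.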

Comments

Why does the "Box'ed" container understand (or leak) information about the secret it wraps?
Mixing the Box's own behavior with .get_secret_value() undermines the point of using .get_secret_value().
Internally, the "Box" shouldn't use .get_secret_value() to implement its own methods. Otherwise, it's leaking info.
Why is the hash value leaking? Comparing two secrets should require an explicit s1.get_secret_value() == s2.get_secret_value() call.
Documenting this functionality is confusing because the "Box" knows something about the secret.
Why do repr/str distinguish empty from non-empty values? This encourages or enables an anti-pattern.
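
To make these points concrete, here is a small sketch that collects what can be observed about a SecretStr without ever calling .get_secret_value(). The helper secrets_equal is hypothetical and only illustrates what an explicit comparison could look like.

```python
from pydantic import SecretStr

s_empty = SecretStr("")
s_abc = SecretStr("abc")

# None of these calls .get_secret_value() at the call site.
observations = {
    "len": len(s_abc),                 # length of the wrapped value
    "bool_empty": bool(s_empty),       # truthiness of the box
    "str_empty": str(s_empty),         # display of an empty secret
    "str_abc": str(s_abc),             # display of a non-empty secret
    "eq": s_abc == SecretStr("abc"),   # equality between two boxes
    "hash_eq": hash(s_abc) == hash(SecretStr("abc")),
}
print(observations)


def secrets_equal(a: SecretStr, b: SecretStr) -> bool:
    """Hypothetical helper: unwrapping is explicit at the comparison site."""
    return a.get_secret_value() == b.get_secret_value()


print(secrets_equal(s_abc, SecretStr("abc")))
```

Every entry in observations tells the caller something about the wrapped value, which is the leak the comments above object to.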

Anti-Pattern of "Empty" Secrets

Using MyModel(secret="") to represent a secret that was not provided or not set (often loaded from ENV or JSON) looks like an anti-pattern.

For "non-provided" secrets, it's better to model the field as None | Secret[T], where T only admits non-empty values.

For example, Pydantic makes it easy to build up these pieces. First, assemble a non-empty str type, then a non-empty Secret type.

from typing import Annotated
from pydantic import StringConstraints, BaseModel
from pydantic.types import Secret

StrNonEmpty = Annotated[str, StringConstraints(min_length=1)]

class SecretStrNonEmpty(Secret[StrNonEmpty]):

    def _display(self) -> str:
        return "*" * 5

    def __eq__(self, other: object) -> bool:
        # Deliberately never equal. To compare secrets,
        # explicitly call .get_secret_value() on both sides.
        return False

    def __hash__(self) -> int:
        # Same reasoning as __eq__: the hash must not depend on the secret.
        return id(self)

class A(BaseModel):
    x: SecretStrNonEmpty


class B(BaseModel):
    x: SecretStrNonEmpty | None


def example():
    axs = tuple(map(lambda x: A(x=x), ("a", "aa", "aaa")))
    print(axs)

    bxs = tuple(map(lambda x: B(x=x), ('b', "b", None)))
    print(bxs)


example()

mpkocher/PydanticSecretModeling.md
