PRELIMINARY: When an adversary uses possibly non-optimal classification methods, the privacy risk to users of OSPEAD deployment without a hard fork is lower than the status quo of all users using the current decoy selection algorithm.
Below are some very preliminary results on the privacy risk of deploying a new OSPEAD-derived decoy selection algorithm (DSA) in the wallet2
reference software without a hard fork.
Having two DSAs used in the wild can potentially split the Monero anonymity pool into two anonymity puddles. Roughly, the tasks of an adversary are:
- Classify transactions by their DSA being used. A given DSA will leave probabilistic evidence of its true form: the empirical temporal distribution of 15 of the 16 ring members. For now, we assume that the adversary will use a neural net (NN) classifier for this task
- Given the classifications (allocations) made in step 1, try to guess the real spend. They can do this with the nonfungibility (NF) classifier described in Rucknium (2023), which guesses that the real spend is the one whose antecedent transaction shares the same DSA as the transaction of interest.
- Alternatively, implement the MAP Decoder attack against the actual ring member distribution, which guesses that the real spend is the one with the highest relative probability of belonging to the real spend age distribution, compared to the decoy distribution.
- The adversary can dynamically choose between the NF classifier and the MAP Decoder, depending on which one has greatest accuracy in a given circumstance.
Some initial parameters are needed to set up the simulation:
The share of users using the new DSA,
I assume only 1-input (therefore a single ring) transactions. Transactions with more than one input would be easier to classify into a specific DSA.
The simulated real spend distribution, new DSA, and old DSA need to be specified. The old DSA in the simulations is the actual wallet2 DSA. The real spend distribution and new DSA were chosen to be compatible with the contributions of @spackle-xmr in his neural net simulations. The real spend distribution and new DSA are exactly the same, so the MAP Decoder does no better than uniform-random guessing. The distribution is the OSPEAD-fitted log-gamma distribution with shape parameter equal to 4.315 and rate parameter equal to 0.3751. In this simulation, the average MAP Decoder attack success probability against users using the old DSA is 27%. A more realistic scenario would set the real spend distribution to its actual nonparametric distribution estimate, which would allow the adversary to attack new-DSA users with the MAP Decoder with greater success than random guessing.
A dataset for training the NN classifier was created. New-DSA rings and old-DSA rings were generated by drawing 15 ring members from the DSA distribution and one from the real spend. The number of new/old-DSA rings was proportional to the
A separate dataset for guessing the real spend was created. Each observation consists of a "transaction of interest". The transaction of interest has a ring distribution that is either new-DSA or old-DSA. Each ring member (one of the "antecedent transaction" of the "transaction of interest") has a probability of being constructed with the new-DSA or old-DSA, depending on
The adversary follows this procedure:
- Train the NN classifier with the training dataset
- Apply the NN classifier to to the transactions of interest and the antecedent transactions. Store the classifications.
- Given these classifications, which will not be 100% correct, apply the MAP Decoder and NF classifier.
- Choose to use the classifier (MAP Decoder or NF) that has the highest probability of guessing the real spend in a given situation. The "situation" depends on whether the trasnaction of interest was classified as new- or old-DSA. Also, from step (2), the adversary will know how many antecedent transactions have been classified as new/old-DSA in a given transaction of interest and can adapt its rules according to the DSA classification count.
When the adversary needs to use NN to clasify both antecedent tx and tx of interest (feasible for adversary)
Share of txs using new DSA | Share of new-DSA txs with correctly-guessed real spend | Share of old-DSA txs with correctly-guessed real spend |
---|---|---|
2% | 20.7% | 27.1% |
5% | 20.2% | 26.7% |
10% | 17.8% | 26.6% |
When the adversary knows with 100% accuracy the DSA of antecedent txs, but needs to use NN to clasify tx of interest (NOT feasible for adversary)
Share of txs using new DSA | Share of new-DSA txs with correctly-guessed real spend | Share of old-DSA txs with correctly-guessed real spend |
---|---|---|
2% | 25.6% | 27.1% |
5% | 24.8% | 26.8% |
10% | 20.6% | 26.7% |
When the adversary knows with 100% accuracy the DSA of antecedent txs and the tx of interest (NOT feasible for adversary)
These are theoretical results from Table 1 of Rucknium (2023) and the MAP Decoder average attack success probability.
Share of txs using new DSA | Share of new-DSA txs with correctly-guessed real spend | Share of old-DSA txs with correctly-guessed real spend |
---|---|---|
2% | 38.3% | 27.0% |
5% | 31.7% | 27.0% |
10% | 24.1% | 27.0% |
# Uncomment to install necessary packages
# install.packages(c("keras3", "actuar", "data.table"))
# May need additional steps to install keras:
# https://rstudio.github.io/cheatsheets/html/keras.html#installation
library(keras3)
library(data.table)
####################################
# Parameters below can be changed
####################################
beta <- 0.10 # Share of txs that are new-DSA
n <- 16 # Ring size
C <- 0.4 # Share of inputs that spend change
n.rings.nn.training.base <- 10000
# Base value for number of rings. Will multiply this by 100. This is like percent.
adversary.clairvoyant.antecedent.txs.dsa <- FALSE
# Change to TRUE if the adversary knows exactly which antecedent txs
# are new-DSA and old-DSA, i.e. the adversary doesn't have to
# implement a probabalistic classifier
adversary.clairvoyant.txs.of.interest.dsa <- FALSE
# Change to TRUE if the adversary knows exactly which txs of interest
# are new-DSA and old-DSA, i.e. the adversary doesn't have to
# implement a probabalistic classifier
####################################
# Parameters above can be changed
####################################
v <- 1.4
z <- 117464545
GAMMA_SHAPE = 19.28
GAMMA_RATE = 1.61
G <- function(x) {
actuar::plgamma(x, shapelog = GAMMA_SHAPE, ratelog = GAMMA_RATE)
}
G_star <- function(x) {
(0 <= x*v & x*v <= 1800) *
(G(x*v + 1200) - G(1200) +
( (x*v)/(1800) ) * G(1200)
)/G(z*v) +
(1800 < x*v & x*v <= z*v) * G(x*v + 1200)/G(z*v) +
(z*v < x*v) * 1
}
get.decoy.pmf <- function(cdf, v, z, sub.supp, log.trans = FALSE, ...) {
G <- function(x) {
cdf(x, ...)
}
if (log.trans) {
G_star <- function(x) {
G(log1p(x*v))/G(log1p(z*v))
}
} else {
G_star <- function(x) {
G(x*v)/G(z*v)
}
}
G_star(sub.supp) - G_star(sub.supp - 1)
}
f_D.lgamma <- function(param, v, z, sub.supp, get.decoy.pmf) {
list(decoy.pmf = get.decoy.pmf(actuar::plgamma, v, z, sub.supp = sub.supp, shapelog = exp(param[1]), ratelog = exp(param[2])),
tail.beyond.support = 1 - actuar::plgamma(z*v, shapelog = exp(param[1]), ratelog = exp(param[2])))
}
f_D.fun <- f_D.lgamma
fitted.par <- log(c(4.315, 0.3751))
new_decoy <- stepfun(1:z, cumsum(c(0, f_D.fun(fitted.par, v, z, sub.supp = 1:z, get.decoy.pmf)$decoy.pmf)))
new_decoy.pmf <- diff(c(0, new_decoy(1:z)))
G_star.pmf <- diff(c(0, G_star(1:z)))
new_decoy.pmf <- new_decoy.pmf/sum(new_decoy.pmf)
G_star.pmf <- G_star.pmf/sum(G_star.pmf)
new_decoy.G_star.pmf.ratio <- new_decoy.pmf/G_star.pmf
# Used for MAP Decoder
# Uncomment this if you want to see the baseline Map Decoder attack sucess
# probability against the old DSA. Should be X for current parameters.
# Must also install the decoyanalysis R package:
# https://github.com/rucknium/OSPEAD?tab=readme-ov-file#installing-the-decoyanalysis-r-package
# MEAN.MAPD <- decoyanalysis::map.decoder.success.prob(new_decoy.pmf, G_star.pmf, n.decoys = 15)
# print(MEAN.MAPD$mean.success.probability)
draw.real.spend <- function(n) {
sample(1:z, size = n, replace = TRUE, prob = new_decoy.pmf)
}
draw.decoy.new.dsa <- function(n) {
sample(1:z, size = n, replace = TRUE, prob = new_decoy.pmf)
}
draw.decoy.old.dsa <- function(n) {
sample(1:z, size = n, replace = TRUE, prob = G_star.pmf)
}
set.seed(314)
n.rings.nn.training.new <- ceiling(n.rings.nn.training.base * beta * 100)
n.rings.nn.training.old <- ceiling(n.rings.nn.training.base * (1 - beta) * 100)
new_decoy.draws <- cbind(
matrix(draw.decoy.new.dsa(n.rings.nn.training.new * (n - 1)), nrow = n.rings.nn.training.new),
matrix(draw.real.spend(n.rings.nn.training.new), nrow = n.rings.nn.training.new)
)
new_decoy.draws <- t(apply(new_decoy.draws, MARGIN = 1, FUN = sample))
G_star.draws <- cbind(
matrix(draw.decoy.old.dsa(n.rings.nn.training.old * (n - 1)), nrow = n.rings.nn.training.old),
matrix(draw.real.spend(n.rings.nn.training.old), nrow = n.rings.nn.training.old)
)
G_star.draws <- t(apply(G_star.draws, MARGIN = 1, FUN = sample))
new_decoy.draws <- log(new_decoy.draws)
G_star.draws <- log(G_star.draws)
standardize.mean <- mean(c(c(new_decoy.draws), c(G_star.draws)))
standardize.sd <- sd(c(c(new_decoy.draws), c(G_star.draws)))
scale.fixed <- function(x) {
scale(x, center = rep(standardize.mean, ncol(x)), scale = rep(standardize.sd, ncol(x)))
}
new_decoy.draws <- scale.fixed(new_decoy.draws)
G_star.draws <- scale.fixed(G_star.draws)
X <- rbind(new_decoy.draws, G_star.draws)
y <- rep(c(0, 1), times = c(nrow(new_decoy.draws), nrow(G_star.draws)))
set.seed(314)
splitter <- sample(c("train", "test"), size = length(y), replace = TRUE, prob = c(0.2, 0.8))
X_train <- X[splitter == "train", ]
X_test <- X[splitter == "test", ]
y_train <- y[splitter == "train"]
y_test <- y[splitter == "test"]
model <- keras_model_sequential(input_shape = 16) |>
layer_dense(128, activation='relu') |>
layer_dense(64, activation='relu')|>
layer_dense(32, activation='relu') |>
layer_dense(1, activation='sigmoid')
# compile:
compile(object = model, optimizer = "adam", loss = "binary_crossentropy", metrics = list("accuracy"))
# Fit:
fit(object = model, x = X_train, y = y_train, batch_size = 32, epochs = 10, validation_data = list(X_test, y_test))
# Evaluate:
evaluate(object = model, x = X_test, y = y_test)
y_prediction <- predict(model, X_test)
y_prediction <- ifelse(y_prediction > 0.5, 1, 0)
prop.table(table(y_test, y_prediction))
gen.rings <- function(draw.my.dsa, draw.their.dsa, draw.real.spend, nn.model,
n, beta, C, n.rings, is.old, standardize.scale = TRUE) {
non.change.real.spend <- sample(c(1L, 0L), size = n.rings, replace = TRUE, prob = c(beta, 1 - beta))
# Draw n.rings elements with replacement from the set of {1, 0} with probability beta of drawing 1 and
# probability 1-beta of drawing 0. The 1 is a real spend output that has the defect.
real.spend <- ifelse(
sample(c(TRUE, FALSE), size = n.rings, replace = TRUE, prob = c(C, 1 - C)),
1L,
non.change.real.spend)
# With probability C, the ring spends change. The change has the defect by assumption. With probability
# 1-C, the user spend a non-change output that has beta probability (above) of having the defect.
rings <- matrix(c(
sample(c(1L, 0L), size = n.rings * (n - 1), replace = TRUE, prob = c(beta, 1 - beta)),
real.spend),
nrow = n.rings, ncol = n)
# Create a matrix n.rings x n in size. The first n-1 columns are filled with decoys. With probability
# beta these decoys have the defect. The last column is the real spend, created above.
# Final column will always be the real spend
ring.distribution <- cbind(
matrix(draw.my.dsa(n.rings * (n - 1)), nrow = n.rings),
matrix(draw.real.spend(n.rings), nrow = n.rings)
)
my.dsa.rings.to.produce <- sum(c(rings) == 1)
their.dsa.rings.to.produce <- sum(c(rings) == 0)
antecedent.ring.distribution.mine <- rbind(
matrix(draw.my.dsa(my.dsa.rings.to.produce * (n - 1)), ncol = my.dsa.rings.to.produce),
matrix(draw.real.spend(my.dsa.rings.to.produce), ncol = my.dsa.rings.to.produce)
)
antecedent.ring.distribution.theirs <- rbind(
matrix(draw.their.dsa(their.dsa.rings.to.produce * (n - 1)), ncol = their.dsa.rings.to.produce),
matrix(draw.real.spend(their.dsa.rings.to.produce), ncol = their.dsa.rings.to.produce)
)
antecedent.ring.distribution.index <- replicate(n, rings)
# replicate() will bind the duplicate matrices together along the third dimension
# Don't need Reduce(abind, list(rings, rings,...))
antecedent.ring.distribution.dims <- dim(antecedent.ring.distribution.index)
antecedent.ring.distribution.index <- c(antecedent.ring.distribution.index)
antecedent.ring.distribution <- vector("integer", length(antecedent.ring.distribution.index))
stopifnot(length(antecedent.ring.distribution) ==
length(antecedent.ring.distribution.mine) + length(antecedent.ring.distribution.theirs))
antecedent.ring.distribution[antecedent.ring.distribution.index == 1L] <- antecedent.ring.distribution.mine
antecedent.ring.distribution[antecedent.ring.distribution.index == 0L] <- antecedent.ring.distribution.theirs
stopifnot( ! any(is.na(antecedent.ring.distribution)))
antecedent.ring.distribution <- array(antecedent.ring.distribution, dim = antecedent.ring.distribution.dims)
antecedent.ring.distribution.for.classification <- apply(antecedent.ring.distribution, 3, "c")
# Help from https://stackoverflow.com/questions/4022195/transform-a-3d-array-into-a-matrix-in-r
# table(rowSums( apply(replicate(n, rings), 3, "c") ))
# Should have only 0 and 16
antecedent.ring.distribution.for.classification <-
t(apply(antecedent.ring.distribution.for.classification, MARGIN = 1, FUN = sample))
antecedent.ring.distribution.for.classification.trans <- log(antecedent.ring.distribution.for.classification)
if (standardize.scale) {
antecedent.ring.distribution.for.classification.trans <- scale.fixed(antecedent.ring.distribution.for.classification.trans)
}
antecedent.ring.distribution.classified <-
predict(model, antecedent.ring.distribution.for.classification.trans)
# antecedent.ring.distribution.classified <- ifelse(c(antecedent.ring.distribution.classified) > 0.5, 0, 1)
antecedent.ring.distribution.classified <- ifelse(c(antecedent.ring.distribution.classified) > 0.5, 1, 0)
if (is.old) {
antecedent.ring.distribution.classified <- abs(antecedent.ring.distribution.classified - 1)
}
# TODO: Have a more elegant way to do this
antecedent.ring.distribution.classified <- matrix(antecedent.ring.distribution.classified, ncol = n)
# Fill by columns
ring.distribution.for.classification <-
t(apply(ring.distribution, MARGIN = 1, FUN = sample))
ring.distribution.for.classification.trans <- log(ring.distribution.for.classification)
if (standardize.scale) {
ring.distribution.for.classification.trans <- scale.fixed(ring.distribution.for.classification.trans)
}
ring.distribution.classified <-
predict(model, ring.distribution.for.classification.trans)
# WARNING: assumes a log transformation of the training set for the NN classifier
ring.distribution.classified <- ifelse(ring.distribution.classified > 0.5, 1, 0)
# Be careful of what is zero and 1
if (is.old) {
ring.distribution.classified <- abs(ring.distribution.classified - 1)
}
# TODO: Have a more elegant way to do this
list(
nn.classified.txs = ring.distribution.classified,
ring.distribution = ring.distribution,
defect.matrix = rings,
nn.classified.antecedent.txs = antecedent.ring.distribution.classified
)
# NN-Classified tx vector: ring.distribution.classified
# Actual ring defect matrix: rings
# NN-Classified ring defect matrix: antecedent.ring.distribution.classified
}
set.seed(314)
new.dsa.ring.data <- gen.rings(
draw.my.dsa = draw.decoy.new.dsa,
draw.their.dsa = draw.decoy.old.dsa,
draw.real.spend = draw.real.spend,
nn.model = model,
n = n, beta = beta, C = C, n.rings = n.rings.nn.training.new, is.old = FALSE,
standardize.scale = TRUE)
old.dsa.ring.data <- gen.rings(
draw.my.dsa = draw.decoy.old.dsa,
draw.their.dsa = draw.decoy.new.dsa,
draw.real.spend = draw.real.spend,
nn.model = model,
n = n, beta = 1 - beta, C = C, n.rings = n.rings.nn.training.old, is.old = TRUE,
standardize.scale = TRUE)
if (adversary.clairvoyant.antecedent.txs.dsa) {
new.dsa.ring.data$nn.classified.antecedent.txs <- new.dsa.ring.data$defect.matrix
old.dsa.ring.data$nn.classified.antecedent.txs <- old.dsa.ring.data$defect.matrix
}
if (adversary.clairvoyant.txs.of.interest.dsa) {
new.dsa.ring.data$nn.classified.txs <- c(rep(0, 10), rep(1, length(c(new.dsa.ring.data$nn.classified.txs)) - 10))
old.dsa.ring.data$nn.classified.txs <- c(rep(0, 10), rep(1, length(c(old.dsa.ring.data$nn.classified.txs)) - 10))
# Have some misclassified so that the code below works
}
# The operation below explicitly follows the rules of the classiffier to select
# one of the ring members as teh guessed real spend. It is much slower than the
# formula above
nf.classifier <- function(x) {
if (sum(x) == 0L) {
return(sample(1:n, size = 1))
}
# If none of the ring members have the defect, randomly guess that one of them
# is the real spend, with equal probability.
if (sum(x) == 1L) {
return(which(x == 1L))
}
# If just one of the ring members has the defect, guess that the ring member
# with the defect is the real spend
return(sample(which(x == 1L), size = 1))
# If more than one ring members have the defect, randomly guess that one of the
# ring members with the defect is the real spend, with equal probability.
}
map.decoder <- function(x) {
which.max(new_decoy.G_star.pmf.ratio[x])
# TODO: Must change this once we get the actual real spend distribution
}
best.classification <- function(ring.data, correct, do.correct = TRUE) {
if (!do.correct) {
correct <- !correct
ring.data$nn.classified.antecedent.txs <- abs(ring.data$nn.classified.antecedent.txs - 1)
}
# Get the incorrectly classified ones
mapd.guesses <- apply(ring.data$ring.distribution[correct, ], 1, FUN = map.decoder )
nf.guesses <- apply(ring.data$nn.classified.antecedent.txs[correct, ], 1, FUN = nf.classifier)
n.classified.defects.in.ring <- rowSums(ring.data$nn.classified.antecedent.txs[correct, ])
list(mapd.guesses = mapd.guesses, nf.guesses = nf.guesses,
n.classified.defects.in.ring = n.classified.defects.in.ring)
}
new.correct <- best.classification(new.dsa.ring.data, new.dsa.ring.data$nn.classified.txs == 1)
new.incorrect <- best.classification(new.dsa.ring.data, new.dsa.ring.data$nn.classified.txs == 1, do.correct = FALSE)
old.correct <- best.classification(old.dsa.ring.data, old.dsa.ring.data$nn.classified.txs == 1)
old.incorrect <- best.classification(old.dsa.ring.data, old.dsa.ring.data$nn.classified.txs == 1, do.correct = FALSE)
best.rule <- function(first.correct, second.incorrect) {
combined <- list(
mapd.guesses = c(first.correct$mapd.guesses, second.incorrect$mapd.guesses),
nf.guesses = c(first.correct$nf.guesses, second.incorrect$nf.guesses),
n.classified.defects.in.ring = c(first.correct$n.classified.defects.in.ring, second.incorrect$n.classified.defects.in.ring)
)
combined$mapd.guesses.correct <- combined$mapd.guesses == n
combined$nf.guesses.correct <- combined$nf.guesses == n
setDT(combined)
classification.comparison <- combined[, .(mapd.guesses.correct.mean = mean(mapd.guesses.correct),
nf.guesses.correct.mean = mean(nf.guesses.correct), N = .N), by = n.classified.defects.in.ring]
which.better <- classification.comparison[, mapd.guesses.correct.mean > nf.guesses.correct.mean]
total.correct <- classification.comparison[,
sum(ifelse(which.better, mapd.guesses.correct.mean * N, nf.guesses.correct.mean * N))]
setDT(first.correct)
first.classification.comparison <- first.correct[, .(mapd.guesses.mean = mean(mapd.guesses == n),
nf.guesses.mean = mean(nf.guesses == n), N = .N), by = n.classified.defects.in.ring]
first.correct.guessed <- first.classification.comparison[,
sum(ifelse(which.better, mapd.guesses.mean * N, nf.guesses.mean * N))]
setDT(second.incorrect)
second.classification.comparison <- second.incorrect[, .(mapd.guesses.mean = mean(mapd.guesses == n),
nf.guesses.mean = mean(nf.guesses == n), N = .N), by = n.classified.defects.in.ring]
second.correct.guessed <- second.classification.comparison[,
sum(ifelse(which.better, mapd.guesses.mean * N, nf.guesses.mean * N))]
list(
total = nrow(combined), total.correct = total.correct, total.share.correct = total.correct / nrow(combined),
first = nrow(first.correct), first.correct = first.correct.guessed,
first.share.correct = first.correct.guessed / nrow(first.correct),
second = nrow(second.incorrect), second.correct = second.correct.guessed,
second.share.correct = second.correct.guessed / nrow(second.incorrect)
)
}
best.rule.new <- best.rule(new.correct, old.incorrect)
best.rule.old <- best.rule(old.correct, new.incorrect)
share.new.dsa.real.spends.correctly.guessed <-
(best.rule.new$first.correct + best.rule.old$second.correct)/
(best.rule.new$first + best.rule.old$second)
share.old.dsa.real.spends.correctly.guessed <-
(best.rule.old$first.correct + best.rule.new$second.correct)/
(best.rule.old$first + best.rule.new$second)
cat("Share of new-DSA txs in the dataset:", beta, "\n")
cat("Share of new-DSA txs whose real spends are correctly guessed:",
round(share.new.dsa.real.spends.correctly.guessed, 5), "\n")
cat("Share of old-DSA txs in the dataset:", 1 - beta, "\n")
cat("Share of old-DSA txs whose real spends are correctly guessed:",
round(share.old.dsa.real.spends.correctly.guessed, 5), "\n")