Skip to content

Instantly share code, notes, and snippets.

@soodoku
Last active December 27, 2024 01:13
Show Gist options
  • Save soodoku/e50fd61ba55cee8a8543e65e8fbbad0f to your computer and use it in GitHub Desktop.
Save soodoku/e50fd61ba55cee8a8543e65e8fbbad0f to your computer and use it in GitHub Desktop.
demand/supply misleading beta
# Set random seed for reproducibility
set.seed(123)
# Time t parameters
n_novels_x_t <- 500 # 50% of 1000 novels are type x at time t
n_novels_y_t <- 500
k <- 30 # 30% more likely to read type x
total_readings <- 800 # Fixed total reading capacity
# Time t simulation
prob_x_t <- (1 + k/100)/(2 + k/100)
prob_y_t <- 1/(2 + k/100)
# Simulate readings at time t
readings_t <- numeric(total_readings)
for(i in 1:total_readings) {
if(runif(1) < prob_x_t) {
readings_t[i] <- 1 # Read type x
} else {
readings_t[i] <- 0 # Read type y
}
}
# Time t regression
data_t <- data.frame(
reading = readings_t,
type = factor(ifelse(readings_t == 1, "x", "y"))
)
model_t <- lm(reading ~ type, data = data_t)
# Time t+1 parameters
n_novels_x_t1 <- 700 # 70% of 1000 novels at t+1
n_novels_y_t1 <- 300
# Calculate new reading probabilities
# Due to fixed capacity, probability is affected by supply
scaling_factor <- (n_novels_x_t/n_novels_x_t1) # Adjust for increased supply
prob_x_t1 <- prob_x_t * scaling_factor
prob_y_t1 <- 1 - prob_x_t1
# Simulate readings at t+1
readings_t1 <- numeric(total_readings)
for(i in 1:total_readings) {
if(runif(1) < prob_x_t1) {
readings_t1[i] <- 1
} else {
readings_t1[i] <- 0
}
}
# Time t+1 regression
data_t1 <- data.frame(
reading = readings_t1,
type = factor(ifelse(readings_t1 == 1, "x", "y"))
)
model_t1 <- lm(reading ~ type, data = data_t1)
# Results for time t
cat("Time t results (50-50 split):\n")
print(summary(model_t))
prop_x_t <- mean(readings_t)
prop_y_t <- 1 - prop_x_t
k_recovered_t <- (prop_x_t/prop_y_t - 1) * 100
cat("\nRecovered k at time t:", k_recovered_t, "%\n")
# Results for time t+1
cat("\nTime t+1 results (70-30 split):\n")
print(summary(model_t1))
prop_x_t1 <- mean(readings_t1)
prop_y_t1 <- 1 - prop_x_t1
k_recovered_t1 <- (prop_x_t1/prop_y_t1 - 1) * 100
cat("\nRecovered k at time t+1:", k_recovered_t1, "%\n")
# Set random seed for reproducibility
set.seed(123)
# Parameters
k <- 30 # 30% more likely to read type x
total_readings <- 800 # Fixed total reading capacity
# Time t: 50-50 split
n_novels_x_t <- 500
n_novels_y_t <- 500
# Base probability of reading x (incorporating preference k)
prob_x_t <- (1 + k/100)/(2 + k/100)
prob_y_t <- 1/(2 + k/100)
# Simulate readings at time t
readings_t <- numeric(total_readings)
for(i in 1:total_readings) {
if(runif(1) < prob_x_t) {
readings_t[i] <- 1 # Read type x
} else {
readings_t[i] <- 0 # Read type y
}
}
# Time t+1: 70-30 split
n_novels_x_t1 <- 700
n_novels_y_t1 <- 300
# Adjust probabilities for supply change
prob_x_t1 <- prob_x_t * (n_novels_x_t/n_novels_x_t1) # Adjust down due to oversupply
prob_y_t1 <- 1 - prob_x_t1
# Simulate readings at t+1
readings_t1 <- numeric(total_readings)
for(i in 1:total_readings) {
if(runif(1) < prob_x_t1) {
readings_t1[i] <- 1
} else {
readings_t1[i] <- 0
}
}
# Create data frames for regressions
data_t <- data.frame(
read_x = readings_t,
period = "t",
type = factor(ifelse(readings_t == 1, "x", "y"))
)
data_t1 <- data.frame(
read_x = readings_t1,
period = "t+1",
type = factor(ifelse(readings_t1 == 1, "x", "y"))
)
# Combine data for both periods
all_data <- rbind(data_t, data_t1)
# Create period-type interaction for regression
all_data$period_t1 <- ifelse(all_data$period == "t+1", 1, 0)
all_data$type_x <- ifelse(all_data$type == "x", 1, 0)
all_data$period_type <- all_data$period_t1 * all_data$type_x
# Run regression
model <- lm(read_x ~ type_x + period_t1 + period_type, data = all_data)
# Print results
cat("Regression Results:\n")
print(summary(model))
# Calculate and display proportions
prop_x_t <- mean(readings_t)
prop_y_t <- 1 - prop_x_t
ratio_t <- prop_x_t/prop_y_t
prop_x_t1 <- mean(readings_t1)
prop_y_t1 <- 1 - prop_x_t1
ratio_t1 <- prop_x_t1/prop_y_t1
cat("\nTime t (50-50 split):\n")
cat("Proportion reading x:", prop_x_t, "\n")
cat("Proportion reading y:", prop_y_t, "\n")
cat("Ratio x/y:", ratio_t, "\n")
cat("\nTime t+1 (70-30 split):\n")
cat("Proportion reading x:", prop_x_t1, "\n")
cat("Proportion reading y:", prop_y_t1, "\n")
cat("Ratio x/y:", ratio_t1, "\n")

Say that there are two types of novels: type x and y. Let's say that people prefer reading novels of type x. They are k% more likely to read a novel of type x than y.

At time t, the total number of novels is 1000 with novels of type x being 50%. And we can recover k% by regressing the number of people who read a novel on the type of novel.

Reader preference for type x novels leads writers to produce more novels of type x. At time t+1, the percentage of type of novels x being produced is 70%. But say that the total number of novels of type x that people can read is fixed at n_t. When we regress the number of people who read a novel on the type of novel, we can get that people are less likely to read novel of type x than y because we have more novels of type x at t+1. (For proof, see here.)

@soodoku
Copy link
Author

soodoku commented Dec 26, 2024

cat("\nRecovered k at time t:", k_recovered_t, "%\n")

Recovered k at time t: 33.91813 %

cat("\nRecovered k at time t+1:", k_recovered_t1, "%\n")

Recovered k at time t+1: -34.36853 %

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment