prob_facts pipeline log

15 LLM calls

Stage 2a — domain=0 (public transit ridership in a mid-sized U.S. city)

Model: openai/gpt-5.2

Prompt

Complete the following task and return your answer as a list of items in XML format.

Rules:
- Wrap all items in <items></items> tags
- Wrap each individual item in <item index="n"></item> tags
- Put each XML tag on its own line
- Put the item content on its own line between the tags

Example format:
<items>
<item index="1">
First item here
</item>
<item index="2">
Second item here
</item>
</items>

Task: Generate 104 diverse measurable facts for the domain: `public transit ridership in a mid-sized U.S. city`.

A 'measurable fact' should be a specific measurable phenomenon, event, behavior, outcome, or attribute within this domain that could plausibly be expressed as a probability, rate, frequency, likelihood, or proportion.

Focus on WHAT could be measured, not the value itself.
Make the topics diverse and cover different aspects of the domain.
Give me a wide variety of facts, both semantically and syntactically diverse.Avoid near-duplicates, overlap, or overly broad topics.

Output

1. Probability that a resident uses public transit at least once in a given week
2. Proportion of all weekday trips made on public transit rather than by car, bike, or walking
3. Average number of boardings per route per weekday
4. Rate of ridership growth (month-over-month) for the entire transit system
5. Share of riders who transfer between two routes on a single one-way trip
6. Frequency of peak-period overcrowding events (load factor above a defined threshold) per route-mile
7. Probability that a bus arrives within 5 minutes of scheduled time at a given stop
8. Proportion of riders who experience at least one missed connection in a week
9. Average boarding-to-alighting trip duration for bus riders during the AM peak
10. Percentage of rides paid with contactless bank cards versus transit cards or cash
11. Rate of fare evasion incidents per 10,000 boardings
12. Share of ridership occurring on weekends versus weekdays
13. Probability that a rider uses a monthly pass rather than single-ride fares
14. Proportion of first-boardings that occur at park-and-ride lots
15. Average number of riders per vehicle-hour of service on each route
16. Likelihood that a rider chooses the express route when both local and express options exist
17. Rate of boarding counts per stop in the central business district during lunch hours
18. Proportion of riders whose trip purpose is commuting to work
19. Proportion of riders whose trip purpose is school or university travel
20. Share of riders who report using transit to access healthcare appointments
21. Percentage of riders who combine biking with transit (bike-to-transit or transit-to-bike) on a given day
22. Probability that a rider uses a mobility aid and boards via ramp or lift
23. Frequency of dwell times exceeding a defined threshold at high-traffic stops
24. Rate of service disruptions (detours, cancellations) per 1,000 scheduled trips
25. Proportion of riders who abandon a planned trip due to real-time delay alerts
26. Average headway adherence (actual vs scheduled) on high-frequency corridors
27. Percentage of riders who use real-time arrival information before boarding
28. Rate of app-based trip planning sessions per 1,000 residents per month
29. Probability that a rider’s first boarding occurs before 7:00 AM on weekdays
30. Share of ridership occurring after 9:00 PM on weekdays
31. Proportion of riders who make a round trip using transit in the same day
32. Rate of new rider acquisition following a fare reduction or free-fare promotion
33. Probability that a rider continues using transit 3 months after first trying it
34. Proportion of households within a quarter-mile of a stop that use transit at least monthly
35. Likelihood that riders choose transit during major downtown events compared with typical days
36. Change in ridership share during extreme weather days (snow, heat advisory) versus normal weather
37. Proportion of boardings occurring at stops with shelters versus without shelters
38. Rate of ridership change after a new route is introduced in an underserved neighborhood
39. Probability that a rider uses transit to access the airport or intercity rail station
40. Average number of stops traveled per passenger trip on bus routes
41. Proportion of riders who transfer to or from a commuter rail line (if present) within the city
42. Rate of on-board incidents (medical, security, conflicts) per 100,000 passenger trips
43. Proportion of riders who report feeling safe at stops after dark
44. Probability that a rider uses transit at least once per week without owning a car
45. Share of ridership attributable to riders aged 18–24
46. Share of ridership attributable to riders aged 65+
47. Proportion of riders with limited English proficiency who use transit information in a non-English language
48. Percentage of riders who wait longer than 10 minutes beyond scheduled arrival during peak hours
49. Rate of bus bunching events on frequent routes per day
50. Proportion of riders who use transit on holidays when reduced schedules operate
51. Probability that riders choose a different route when a preferred line is delayed
52. Share of boardings occurring at the top 10 busiest stops systemwide
53. Percentage of routes where ridership exceeds pre-pandemic baselines in the last quarter
54. Average fare paid per boarding after discounts, passes, and transfers
55. Proportion of riders who qualify for reduced-fare programs (student, senior, low-income)
56. Rate of ridership loss attributable to major roadway construction affecting bus speeds
57. Probability that a rider’s bus trip includes standing for more than half the trip
58. Proportion of riders who bring a stroller, cart, or large baggage on board
59. Average in-vehicle speed (miles per hour) on key corridors during peak periods
60. Rate of on-time departures from route terminals per 100 departures
61. Proportion of riders who experience a stop skip or pass-up due to full vehicles
62. Percentage of riders who use transit for nightlife-related trips on Friday/Saturday nights
63. Probability that a rider uses microtransit/paratransit instead of fixed-route service for a trip
64. Share of system ridership served by zero-emission buses versus diesel or hybrid buses
65. Rate of ridership change after adding transit signal priority on a corridor
66. Proportion of riders who access stops primarily by walking versus driving or biking
67. Average walking distance from origin to boarding stop for transit users
68. Probability that a rider’s trip includes crossing the city boundary into a neighboring suburb
69. Proportion of riders who use transit to reach a grocery store at least once per month
70. Rate of ridership response to gasoline price increases over a quarter
71. Percentage of riders who receive employer-subsidized transit benefits
72. Proportion of riders who use transit at least three days per week for commuting
73. Likelihood of ridership decline during school summer break on routes serving campuses
74. Rate of customer complaints per 10,000 boardings related to delays
75. Rate of customer complaints per 10,000 boardings related to cleanliness
76. Probability that a rider encounters a broken or out-of-service ticket vending machine (if applicable)
77. Proportion of riders who use the same boarding stop for most trips in a month
78. Share of riders who make at least one intermodal transfer to a bike-share system
79. Rate of ridership increase following new shelter or lighting installation at stops
80. Proportion of riders who use transit despite having access to free parking at their destination
81. Average number of unique riders per day (estimated) relative to total boardings (unlinked vs linked trips)
82. Probability that a rider chooses transit over ride-hailing for trips to downtown on weekend evenings
83. Proportion of riders who use transit to reach recreational parks or trails during weekends
84. Rate of ridership change after a major employer relocates within the city
85. Percentage of bus stops where ridership exceeds a minimum threshold needed for amenity prioritization
86. Proportion of riders who report that service frequency is the primary reason for choosing transit
87. Probability that a rider experiences at least one cancelled trip in a month on their primary route
88. Share of ridership occurring in the shoulder period (e.g., 9–11 AM, 2–4 PM) on weekdays
89. Rate of ridership change following introduction of all-door boarding on select routes
90. Proportion of trips where riders pay an additional fare due to transfer window expiration
91. Probability that a rider uses transit for a trip chain with three or more linked legs in one day
92. Proportion of riders who state reliability improvements would increase their transit usage
93. Average time spent waiting at stops per trip, measured by time of day
94. Rate of ridership change after implementing a new fare capping policy
95. Proportion of riders who board at stops with accessible curb ramps and compliant landing pads
96. Probability that a rider uses transit on the first day of a snowstorm compared to the day after
97. Share of ridership generated by special service shuttles (event, campus, circulator) versus regular routes
98. Rate of passenger turnover (boardings + alightings) per stop per hour on the busiest corridor
99. Proportion of riders who would switch from transit to driving if fares rose by a defined amount
100. Probability that a rider uses transit at least once after moving to a new residence within the city
101. Share of rides taken by riders living in census tracts below a defined income threshold
102. Proportion of riders whose total door-to-door travel time by transit is within 20% of driving time
103. Rate of ridership change after adding a dedicated bus lane segment on a congested arterial
104. Percentage of riders who report that stop amenities (shelter, benches, lighting) influence their route choice

Stage 2b — domain=0 batch=0 twist=requires_calculation

Model: anthropic/claude-sonnet-4-5

Prompt

Complete the following task and return your answer as a list of items in XML format.

Rules:
- Wrap all items in <items></items> tags
- Wrap each individual item in <item index="n"></item> tags
- Put each XML tag on its own line
- Put the item content on its own line between the tags

Example format:
<items>
<item index="1">
First item here
</item>
<item index="2">
Second item here
</item>
</items>

Task: Turn each of the 8 topics below into a single-sentence fake statistic about `public transit ridership in a mid-sized U.S. city`.

Rules:
1. Each fact must encode a probability close to its target value.
2. Express the number using ONLY this presentation style:
   Express the probability indirectly so the reader must do arithmetic (multiplication, addition, or a combination) to recover it. Pick ONE of these formats randomly per fact:
- Conditional chain: break into two conditional numbers that must be multiplied
- Bayesian setup: give a base rate + conditional probability to multiply
- Fraction of a fraction: express as a fraction of another fraction
- Fold/multiplier from baseline: state a multiplier and a baseline to multiply
- Percentage point difference: state a difference from a reference rate to add/subtract
- Relative risk: state a risk ratio and a reference rate to multiply
- Cumulative over time: give a per-period rate and a number of periods, requiring compounding
   Examples (pick one format randomly per fact):
   - "Among women (60% of riders), 40% use the express line"
   - "1 in 25 residents carry the gene; among carriers, 90% develop symptoms"
   - "A quarter of the 60% who responded said they use the express line"
   - "a 3-fold increase from the baseline rate of 5%"
   - "12 percentage points above the national average of 30%"
   - "a relative risk of 2.3 compared to the control group's 10% rate"
   - "the annual failure rate is 5%, and by the third year the cumulative rate reaches about 14.3%"
3. Do NOT use any other format (no plain percentages unless that IS the style).
4. One sentence per fact. Vary phrasing — don't start every fact the same way.
5. Use both clean numbers (1/3, 2/7) and messy decimals (23.847%, 0.0037) freely.

6. Make sure that are syntactically different from each other. Be creative with the numbers, choose different base numbers and conditional numbers.Topics and targets:
  1. Topic: Share of rides taken by riders living in census tracts below a defined income threshold — target ≈ 83.99%
  2. Topic: Average walking distance from origin to boarding stop for transit users — target ≈ 11.06%
  3. Topic: Share of boardings occurring at the top 10 busiest stops systemwide — target ≈ 13.95%
  4. Topic: Proportion of riders who state reliability improvements would increase their transit usage — target ≈ 16.78%
  5. Topic: Probability that a rider uses transit to access the airport or intercity rail station — target ≈ 27.05%
  6. Topic: Rate of ridership response to gasoline price increases over a quarter — target ≈ 20.19%
  7. Topic: Probability that a rider uses transit at least once after moving to a new residence within the city — target ≈ 60.47%
  8. Topic: Probability that a rider continues using transit 3 months after first trying it — target ≈ 59.47%

Output

1. Among households in the city (where 93% live in census tracts below the defined income threshold), precisely 90% of transit rides originate from riders in those lower-income areas.
2. The average walking distance from origin to boarding stop represents a 2.3-fold increase over the baseline benchmark of 4.809% established in the pilot study.
3. Boardings at the top 10 busiest stops systemwide account for roughly one-fifth of the 69.75% of all rides that occur at major transfer points.
4. Among frequent riders (31% of the surveyed population), 54.129% reported that reliability improvements would meaningfully increase their transit usage frequency.
5. Transit access to the airport or intercity rail station occurs at a rate 7.8 percentage points below the regional metropolitan average of 34.85%.
6. Ridership response to gasoline price increases carries a relative risk of 1.73 compared to the control period's baseline rate of 11.67%.
7. The quarterly attrition rate for new residents trying transit stands at 15%, and by the third quarter the cumulative probability of continued ridership reaches approximately 60.47%.
8. Three-sevenths of the 138.598% theoretical maximum engagement rate represents the actual probability that a first-time rider continues using transit after three months.

Stage 2b — domain=0 batch=1 twist=odds_or_ratio

Model: anthropic/claude-sonnet-4-5

Prompt

Complete the following task and return your answer as a list of items in XML format.

Rules:
- Wrap all items in <items></items> tags
- Wrap each individual item in <item index="n"></item> tags
- Put each XML tag on its own line
- Put the item content on its own line between the tags

Example format:
<items>
<item index="1">
First item here
</item>
<item index="2">
Second item here
</item>
</items>

Task: Turn each of the 8 topics below into a single-sentence fake statistic about `public transit ridership in a mid-sized U.S. city`.

Rules:
1. Each fact must encode a probability close to its target value.
2. Express the number using ONLY this presentation style:
   Express the probability as odds or a ratio. The reader must convert to a probability. Pick ONE of these formats randomly:
- Odds: 'the odds are 4 to 1 in favor' (probability = 4/5 = 0.8)
- Ratio: 'the ratio of affected to unaffected is 1:3' (probability = 1/4 = 0.25)
- Fractional odds: 'the odds of X are 3/7' (probability = 3/10 = 0.3)
   Examples (pick one format randomly per fact):
   - "the odds of a commuter using the express line are roughly 3 to 7"
   - "the ratio of delayed to on-time arrivals is 1:4"
   - "the odds of experiencing side effects are 2/9"
3. Do NOT use any other format (no plain percentages unless that IS the style).
4. One sentence per fact. Vary phrasing — don't start every fact the same way.
5. Use both clean numbers (1/3, 2/7) and messy decimals (23.847%, 0.0037) freely.

6. Make sure that are syntactically different from each other. Be creative with the numbers, choose different base numbers and conditional numbers.Topics and targets:
  1. Topic: Probability that a rider uses transit at least once per week without owning a car — target ≈ 63.20%
  2. Topic: Probability that a rider encounters a broken or out-of-service ticket vending machine (if applicable) — target ≈ 31.95%
  3. Topic: Rate of ridership change after implementing a new fare capping policy — target ≈ 73.57%
  4. Topic: Percentage of bus stops where ridership exceeds a minimum threshold needed for amenity prioritization — target ≈ 28.73%
  5. Topic: Rate of boarding counts per stop in the central business district during lunch hours — target ≈ 84.47%
  6. Topic: Proportion of riders who report that service frequency is the primary reason for choosing transit — target ≈ 15.11%
  7. Topic: Average number of riders per vehicle-hour of service on each route — target ≈ 59.15%
  8. Topic: Average time spent waiting at stops per trip, measured by time of day — target ≈ 26.05%

Output

1. Among regular transit users in the downtown corridor, the odds are roughly 17 to 10 in favor of riding at least once per week without owning a personal vehicle.
2. The ratio of riders encountering broken ticket vending machines to those finding them operational stands at approximately 47:100.
3. Following the new fare capping policy rollout, the odds of increased ridership on affected routes are 278 to 100.
4. Bus stops meeting the minimum threshold for amenity prioritization show a ratio to those falling short of roughly 41:102.
5. During midday hours in the central business district, boarding activity per stop occurs at odds of approximately 543 to 100 relative to baseline counts.
6. The odds that service frequency is cited as the primary motivation for transit use are 151 to 849.
7. Productivity metrics reveal the ratio of routes exceeding target riders per vehicle-hour to those underperforming stands at approximately 145:100.
8. Wait time analysis indicates the odds are roughly 26 to 74 that average dwell duration per trip falls within the peak delay window.

Stage 2b — domain=0 batch=2 twist=basis_points

Model: anthropic/claude-sonnet-4-5

Prompt

Complete the following task and return your answer as a list of items in XML format.

Rules:
- Wrap all items in <items></items> tags
- Wrap each individual item in <item index="n"></item> tags
- Put each XML tag on its own line
- Put the item content on its own line between the tags

Example format:
<items>
<item index="1">
First item here
</item>
<item index="2">
Second item here
</item>
</items>

Task: Turn each of the 8 topics below into a single-sentence fake statistic about `public transit ridership in a mid-sized U.S. city`.

Rules:
1. Each fact must encode a probability close to its target value.
2. Express the number using ONLY this presentation style:
   Express the probability in basis points (1 basis point = 0.01% = 0.0001). E.g., '350 basis points' means 3.5%. The reader must divide by 10,000 to get the probability.
   Example: "the default rate stands at 350 basis points"
3. Do NOT use any other format (no plain percentages unless that IS the style).
4. One sentence per fact. Vary phrasing — don't start every fact the same way.
5. Use both clean numbers (1/3, 2/7) and messy decimals (23.847%, 0.0037) freely.

6. Make sure that are syntactically different from each other. Be creative with the numbers, choose different base numbers and conditional numbers.Topics and targets:
  1. Topic: Rate of app-based trip planning sessions per 1,000 residents per month — target ≈ 88.38%
  2. Topic: Rate of on-time departures from route terminals per 100 departures — target ≈ 81.24%
  3. Topic: Share of ridership attributable to riders aged 65+ — target ≈ 5.22%
  4. Topic: Likelihood of ridership decline during school summer break on routes serving campuses — target ≈ 38.99%
  5. Topic: Proportion of riders whose trip purpose is commuting to work — target ≈ 84.18%
  6. Topic: Proportion of riders who report feeling safe at stops after dark — target ≈ 86.44%
  7. Topic: Average number of stops traveled per passenger trip on bus routes — target ≈ 31.83%
  8. Topic: Average number of unique riders per day (estimated) relative to total boardings (unlinked vs linked trips) — target ≈ 14.58%

Output

1. The rate of app-based trip planning sessions per 1,000 residents per month stands at 8838 basis points according to the latest digital engagement survey.
2. On-time departures from route terminals occur at a rate of 8124 basis points per 100 departures, reflecting strong operational performance.
3. Riders aged 65 and older account for 522 basis points of total ridership across all routes and time periods.
4. The likelihood of ridership decline during school summer break on routes serving campuses reaches 3899 basis points, indicating significant seasonal variation.
5. Commuting to work represents the trip purpose for 8418 basis points of all riders surveyed during weekday peak hours.
6. When asked about personal safety, riders who report feeling safe at stops after dark represent 8644 basis points of respondents.
7. Bus route passengers travel an average number of stops measuring 3183 basis points relative to total route length.
8. The ratio of unique riders per day to total boardings, comparing linked versus unlinked trips, registers at 1458 basis points.

Stage 2b — domain=0 batch=3 twist=direct_percentage

Model: openai/gpt-5.2

Prompt

Complete the following task and return your answer as a list of items in XML format.

Rules:
- Wrap all items in <items></items> tags
- Wrap each individual item in <item index="n"></item> tags
- Put each XML tag on its own line
- Put the item content on its own line between the tags

Example format:
<items>
<item index="1">
First item here
</item>
<item index="2">
Second item here
</item>
</items>

Task: Turn each of the 8 topics below into a single-sentence fake statistic about `public transit ridership in a mid-sized U.S. city`.

Rules:
1. Each fact must encode a probability close to its target value.
2. Express the number using ONLY this presentation style:
   State the probability directly as a plain percentage. This is the baseline — the number appears clearly.
   Example: "37% of commuters use the express line"
3. Do NOT use any other format (no plain percentages unless that IS the style).
4. One sentence per fact. Vary phrasing — don't start every fact the same way.
5. Use both clean numbers (1/3, 2/7) and messy decimals (23.847%, 0.0037) freely.

6. Make sure that are syntactically different from each other. Be creative with the numbers, choose different base numbers and conditional numbers.Topics and targets:
  1. Topic: Rate of customer complaints per 10,000 boardings related to cleanliness — target ≈ 47.33%
  2. Topic: Probability that a rider’s trip includes crossing the city boundary into a neighboring suburb — target ≈ 81.62%
  3. Topic: Share of ridership attributable to riders aged 18–24 — target ≈ 8.48%
  4. Topic: Proportion of all weekday trips made on public transit rather than by car, bike, or walking — target ≈ 26.61%
  5. Topic: Proportion of riders who make a round trip using transit in the same day — target ≈ 73.40%
  6. Topic: Proportion of riders who use transit to reach recreational parks or trails during weekends — target ≈ 55.14%
  7. Topic: Proportion of riders who bring a stroller, cart, or large baggage on board — target ≈ 48.50%
  8. Topic: Proportion of riders who transfer to or from a commuter rail line (if present) within the city — target ≈ 39.06%

Output

1. 47.33% of cleanliness-related customer complaints occur per 10,000 boardings in this mid-sized U.S. city’s transit system.
2. On an average ride, there’s an 81.62% chance the trip crosses the city boundary into a neighboring suburb.
3. Riders aged 18–24 account for 8.48% of total public transit ridership in this mid-sized U.S. city.
4. 26.61% of weekday trips are made on public transit rather than by car, bike, or walking.
5. 73.40% of transit riders complete a round trip on the system within the same day.
6. On weekends, 55.14% of riders use public transit specifically to reach recreational parks or trails.
7. 48.50% of riders board with a stroller, cart, or other large baggage item.
8. 39.06% of riders transfer to or from a commuter rail line within the city limits.

Stage 2b — domain=0 batch=4 twist=requires_calculation

Model: openai/gpt-5.2

Prompt

Complete the following task and return your answer as a list of items in XML format.

Rules:
- Wrap all items in <items></items> tags
- Wrap each individual item in <item index="n"></item> tags
- Put each XML tag on its own line
- Put the item content on its own line between the tags

Example format:
<items>
<item index="1">
First item here
</item>
<item index="2">
Second item here
</item>
</items>

Task: Turn each of the 8 topics below into a single-sentence fake statistic about `public transit ridership in a mid-sized U.S. city`.

Rules:
1. Each fact must encode a probability close to its target value.
2. Express the number using ONLY this presentation style:
   Express the probability indirectly so the reader must do arithmetic (multiplication, addition, or a combination) to recover it. Pick ONE of these formats randomly per fact:
- Conditional chain: break into two conditional numbers that must be multiplied
- Bayesian setup: give a base rate + conditional probability to multiply
- Fraction of a fraction: express as a fraction of another fraction
- Fold/multiplier from baseline: state a multiplier and a baseline to multiply
- Percentage point difference: state a difference from a reference rate to add/subtract
- Relative risk: state a risk ratio and a reference rate to multiply
- Cumulative over time: give a per-period rate and a number of periods, requiring compounding
   Examples (pick one format randomly per fact):
   - "Among women (60% of riders), 40% use the express line"
   - "1 in 25 residents carry the gene; among carriers, 90% develop symptoms"
   - "A quarter of the 60% who responded said they use the express line"
   - "a 3-fold increase from the baseline rate of 5%"
   - "12 percentage points above the national average of 30%"
   - "a relative risk of 2.3 compared to the control group's 10% rate"
   - "the annual failure rate is 5%, and by the third year the cumulative rate reaches about 14.3%"
3. Do NOT use any other format (no plain percentages unless that IS the style).
4. One sentence per fact. Vary phrasing — don't start every fact the same way.
5. Use both clean numbers (1/3, 2/7) and messy decimals (23.847%, 0.0037) freely.

6. Make sure that are syntactically different from each other. Be creative with the numbers, choose different base numbers and conditional numbers.Topics and targets:
  1. Topic: Rate of ridership growth (month-over-month) for the entire transit system — target ≈ 45.75%
  2. Topic: Probability that a rider uses a monthly pass rather than single-ride fares — target ≈ 98.91%
  3. Topic: Frequency of peak-period overcrowding events (load factor above a defined threshold) per route-mile — target ≈ 47.52%
  4. Topic: Probability that a rider experiences at least one cancelled trip in a month on their primary route — target ≈ 52.81%
  5. Topic: Probability that a rider uses transit for a trip chain with three or more linked legs in one day — target ≈ 56.67%
  6. Topic: Probability that a rider chooses transit over ride-hailing for trips to downtown on weekend evenings — target ≈ 3.97%
  7. Topic: Proportion of riders who would switch from transit to driving if fares rose by a defined amount — target ≈ 88.13%
  8. Topic: Percentage of riders who receive employer-subsidized transit benefits — target ≈ 17.29%

Output

1. Last month, 75.0% of lines saw any increase in boardings, and on those lines 61.0% cleared the “meaningful growth” cutoff—multiply the two to estimate the systemwide month-over-month ridership growth rate.
2. In this mid-sized city, 99.0% of riders are regular commuters, and among regular commuters 99.0% pay with a monthly pass rather than single-ride fares.
3. Per route-mile, about 2/3 of the day’s service blocks are in the peak window, and within those peak blocks 71.28% exceed the load-factor threshold, so overcrowding events run at that fraction-of-a-fraction rate.
4. If the per-week chance of at least one cancelled trip on a rider’s primary route is 0.200, then over 4 weeks the compounded “at least one cancellation in the month” probability is 1 − (1 − 0.200)^4.
5. On a typical day, 81% of riders make at least one linked trip chain, and among those riders 70% end up with three or more linked legs, giving the overall probability by multiplication.
6. For weekend-evening trips to downtown, the relative risk of choosing transit (vs ride-hailing) is 0.1323 compared to a reference corridor where the transit-choice rate is 30.0%.
7. A proposed fare bump would raise the “switch to driving” share by 13.13 percentage points above a reference rate of 75.00% observed under the current fare.
8. Employer-subsidized transit benefits show up as a 1.44× multiplier on a baseline coverage rate of 12.0% among riders without any workplace perks.

Stage 2b — domain=0 batch=5 twist=count_and_total

Model: anthropic/claude-sonnet-4-5

Prompt

Complete the following task and return your answer as a list of items in XML format.

Rules:
- Wrap all items in <items></items> tags
- Wrap each individual item in <item index="n"></item> tags
- Put each XML tag on its own line
- Put the item content on its own line between the tags

Example format:
<items>
<item index="1">
First item here
</item>
<item index="2">
Second item here
</item>
</items>

Task: Turn each of the 8 topics below into a single-sentence fake statistic about `public transit ridership in a mid-sized U.S. city`.

Rules:
1. Each fact must encode a probability close to its target value.
2. Express the number using ONLY this presentation style:
   Express the probability as a count and total. E.g., '630 out of 4,200 surveyed reported...'. The reader must divide to get the probability.
   Example: "630 out of 4,200 surveyed reported using the express line"
3. Do NOT use any other format (no plain percentages unless that IS the style).
4. One sentence per fact. Vary phrasing — don't start every fact the same way.
5. Use both clean numbers (1/3, 2/7) and messy decimals (23.847%, 0.0037) freely.

6. Make sure that are syntactically different from each other. Be creative with the numbers, choose different base numbers and conditional numbers.Topics and targets:
  1. Topic: Proportion of riders whose total door-to-door travel time by transit is within 20% of driving time — target ≈ 58.18%
  2. Topic: Share of ridership occurring in the shoulder period (e.g., 9–11 AM, 2–4 PM) on weekdays — target ≈ 36.09%
  3. Topic: Percentage of riders who use transit for nightlife-related trips on Friday/Saturday nights — target ≈ 16.89%
  4. Topic: Share of riders who transfer between two routes on a single one-way trip — target ≈ 54.23%
  5. Topic: Rate of on-board incidents (medical, security, conflicts) per 100,000 passenger trips — target ≈ 46.55%
  6. Topic: Proportion of riders who experience at least one missed connection in a week — target ≈ 93.97%
  7. Topic: Proportion of riders who use transit despite having access to free parking at their destination — target ≈ 14.90%
  8. Topic: Percentage of riders who use real-time arrival information before boarding — target ≈ 25.23%

Output

1. Among 2,200 commuters surveyed, 1,280 reported that their door-to-door transit journey took no more than 20% longer than the equivalent drive would have.
2. Ridership logs showed 7,218 out of 20,000 weekday boardings occurred during the shoulder hours between morning and afternoon peak periods.
3. Weekend evening surveys found 304 out of 1,800 riders were making nightlife-related trips on Friday or Saturday nights.
4. Fare card data revealed that 3,154 of 5,815 passengers transferred between at least two different routes to complete a single one-way journey.
5. System-wide incident reports documented 1,863 medical emergencies, security calls, or passenger conflicts across 4,000,000 recorded trips.
6. In a typical week, 1,409 out of 1,500 regular riders experienced at least one instance of missing a planned connection.
7. Of the riders interviewed at downtown stations, 149 out of 1,000 confirmed they had free parking available at their destination but chose transit anyway.
8. Mobile app analytics and platform observations indicated 1,137 out of 4,505 boardings were preceded by checking real-time arrival information.

Stage 2b — domain=0 batch=6 twist=scientific_notation

Model: openai/gpt-5.2

Prompt

Complete the following task and return your answer as a list of items in XML format.

Rules:
- Wrap all items in <items></items> tags
- Wrap each individual item in <item index="n"></item> tags
- Put each XML tag on its own line
- Put the item content on its own line between the tags

Example format:
<items>
<item index="1">
First item here
</item>
<item index="2">
Second item here
</item>
</items>

Task: Turn each of the 8 topics below into a single-sentence fake statistic about `public transit ridership in a mid-sized U.S. city`.

Rules:
1. Each fact must encode a probability close to its target value.
2. Express the number using ONLY this presentation style:
   Express in scientific or mathematical notation. E.g., 'prevalence is on the order of 10^-2'. The reader must convert notation to a probability.
   Example: "the prevalence of this condition is on the order of 10^-2"
3. Do NOT use any other format (no plain percentages unless that IS the style).
4. One sentence per fact. Vary phrasing — don't start every fact the same way.
5. Use both clean numbers (1/3, 2/7) and messy decimals (23.847%, 0.0037) freely.

6. Make sure that are syntactically different from each other. Be creative with the numbers, choose different base numbers and conditional numbers.Topics and targets:
  1. Topic: Probability that a rider uses microtransit/paratransit instead of fixed-route service for a trip — target ≈ 27.39%
  2. Topic: Percentage of rides paid with contactless bank cards versus transit cards or cash — target ≈ 72.63%
  3. Topic: Share of riders who report using transit to access healthcare appointments — target ≈ 50.57%
  4. Topic: Rate of ridership loss attributable to major roadway construction affecting bus speeds — target ≈ 59.37%
  5. Topic: Proportion of riders who use transit to reach a grocery store at least once per month — target ≈ 24.07%
  6. Topic: Probability that a rider’s bus trip includes standing for more than half the trip — target ≈ 2.60%
  7. Topic: Proportion of trips where riders pay an additional fare due to transfer window expiration — target ≈ 39.41%
  8. Topic: Rate of ridership change after adding transit signal priority on a corridor — target ≈ 76.95%

Output

1. On a typical trip in a mid-sized U.S. city, the probability a rider opts for microtransit/paratransit rather than fixed-route service is approximately 2.739×10^-1.
2. Rides paid by tapping a contactless bank card (instead of a transit card or cash) occur with frequency about 7.263×10^-1.
3. Survey data suggest the share of riders using public transit to reach healthcare appointments is on the order of 5.057×10^-1.
4. During major roadway construction that slows buses, the fraction of ridership loss attributable to the work is roughly 5.937×10^-1.
5. The probability a rider uses transit to get to a grocery store at least once per month is about 2.407×10^-1.
6. For an average bus trip, P(standing for more than half the ride) ≈ 2.60×10^-2.
7. Across all trips, the chance of paying an extra fare because the transfer window expires is approximately 3.941×10^-1.
8. After transit signal priority is added on a corridor, the probability of observing a ridership increase is about 7.695×10^-1.

Stage 2b — domain=0 batch=7 twist=count_and_total

Model: anthropic/claude-opus-4-5

Prompt

Complete the following task and return your answer as a list of items in XML format.

Rules:
- Wrap all items in <items></items> tags
- Wrap each individual item in <item index="n"></item> tags
- Put each XML tag on its own line
- Put the item content on its own line between the tags

Example format:
<items>
<item index="1">
First item here
</item>
<item index="2">
Second item here
</item>
</items>

Task: Turn each of the 8 topics below into a single-sentence fake statistic about `public transit ridership in a mid-sized U.S. city`.

Rules:
1. Each fact must encode a probability close to its target value.
2. Express the number using ONLY this presentation style:
   Express the probability as a count and total. E.g., '630 out of 4,200 surveyed reported...'. The reader must divide to get the probability.
   Example: "630 out of 4,200 surveyed reported using the express line"
3. Do NOT use any other format (no plain percentages unless that IS the style).
4. One sentence per fact. Vary phrasing — don't start every fact the same way.
5. Use both clean numbers (1/3, 2/7) and messy decimals (23.847%, 0.0037) freely.

6. Make sure that are syntactically different from each other. Be creative with the numbers, choose different base numbers and conditional numbers.Topics and targets:
  1. Topic: Rate of new rider acquisition following a fare reduction or free-fare promotion — target ≈ 95.86%
  2. Topic: Share of ridership occurring after 9:00 PM on weekdays — target ≈ 8.55%
  3. Topic: Proportion of households within a quarter-mile of a stop that use transit at least monthly — target ≈ 17.10%
  4. Topic: Average in-vehicle speed (miles per hour) on key corridors during peak periods — target ≈ 17.36%
  5. Topic: Percentage of riders who report that stop amenities (shelter, benches, lighting) influence their route choice — target ≈ 44.82%
  6. Topic: Probability that riders choose a different route when a preferred line is delayed — target ≈ 30.25%
  7. Topic: Likelihood that riders choose transit during major downtown events compared with typical days — target ≈ 18.72%
  8. Topic: Percentage of routes where ridership exceeds pre-pandemic baselines in the last quarter — target ≈ 73.84%

Output

1. Following the city's free-fare promotion last summer, 4,793 out of 5,000 surveyed non-riders reported trying public transit for the first time during the promotional period.
2. Transit authority data reveals that 171 out of every 2,000 weekday trips are logged after 9:00 PM.
3. A neighborhood survey found that 342 out of 2,000 households located within a quarter-mile of a bus stop use transit at least once per month.
4. GPS tracking on the Central Avenue corridor showed buses averaging 1,736 out of a possible 10,000 speed units during morning rush hour.
5. When asked about route preferences, 1,569 out of 3,501 riders cited stop amenities such as shelters, benches, and adequate lighting as determining factors.
6. Of 8,000 commuters surveyed after experiencing a delay on their preferred line, 2,420 reported switching to an alternate route to complete their journey.
7. During the downtown music festival, 936 out of 5,000 attendees surveyed chose transit over their usual transportation method compared to typical weekdays.
8. According to the quarterly report, 553 out of 749 routes now carry more passengers than they did before the pandemic.

Stage 2b — domain=0 batch=8 twist=complement_framing

Model: anthropic/claude-sonnet-4-5

Prompt

Complete the following task and return your answer as a list of items in XML format.

Rules:
- Wrap all items in <items></items> tags
- Wrap each individual item in <item index="n"></item> tags
- Put each XML tag on its own line
- Put the item content on its own line between the tags

Example format:
<items>
<item index="1">
First item here
</item>
<item index="2">
Second item here
</item>
</items>

Task: Turn each of the 8 topics below into a single-sentence fake statistic about `public transit ridership in a mid-sized U.S. city`.

Rules:
1. Each fact must encode a probability close to its target value.
2. Express the number using ONLY this presentation style:
   State the complement of the probability. E.g., 'Only 15% of residents do not use the bus'. The reader must subtract from 100% to get the actual rate.
   Example: "Only 15% of residents don't use the bus during peak hours"
3. Do NOT use any other format (no plain percentages unless that IS the style).
4. One sentence per fact. Vary phrasing — don't start every fact the same way.
5. Use both clean numbers (1/3, 2/7) and messy decimals (23.847%, 0.0037) freely.

6. Make sure that are syntactically different from each other. Be creative with the numbers, choose different base numbers and conditional numbers.Topics and targets:
  1. Topic: Proportion of riders who experience a stop skip or pass-up due to full vehicles — target ≈ 71.24%
  2. Topic: Proportion of riders with limited English proficiency who use transit information in a non-English language — target ≈ 43.84%
  3. Topic: Probability that a resident uses public transit at least once in a given week — target ≈ 48.73%
  4. Topic: Average boarding-to-alighting trip duration for bus riders during the AM peak — target ≈ 59.50%
  5. Topic: Proportion of riders whose trip purpose is school or university travel — target ≈ 68.19%
  6. Topic: Share of system ridership served by zero-emission buses versus diesel or hybrid buses — target ≈ 14.22%
  7. Topic: Proportion of riders who access stops primarily by walking versus driving or biking — target ≈ 70.55%
  8. Topic: Average fare paid per boarding after discounts, passes, and transfers — target ≈ 64.18%

Output

1. Only 28.76% of riders report never experiencing a stop skip or pass-up due to full vehicles during their regular commute.
2. Among riders with limited English proficiency, just 56.16% do not use transit information in a non-English language.
3. Barely 51.27% of residents fail to use public transit at least once in a given week.
4. Only 40.50% of bus riders during the AM peak have boarding-to-alighting trip durations below the system average.
5. A mere 31.81% of riders cite trip purposes other than school or university travel.
6. Just 85.78% of system ridership is served by diesel or hybrid buses rather than zero-emission buses.
7. Only 29.45% of riders access stops primarily by driving or biking instead of walking.
8. Merely 35.82% of boardings involve fare payments below the average after discounts, passes, and transfers are applied.

Stage 2b — domain=0 batch=9 twist=requires_calculation

Model: openai/gpt-5.2

Prompt

Complete the following task and return your answer as a list of items in XML format.

Rules:
- Wrap all items in <items></items> tags
- Wrap each individual item in <item index="n"></item> tags
- Put each XML tag on its own line
- Put the item content on its own line between the tags

Example format:
<items>
<item index="1">
First item here
</item>
<item index="2">
Second item here
</item>
</items>

Task: Turn each of the 8 topics below into a single-sentence fake statistic about `public transit ridership in a mid-sized U.S. city`.

Rules:
1. Each fact must encode a probability close to its target value.
2. Express the number using ONLY this presentation style:
   Express the probability indirectly so the reader must do arithmetic (multiplication, addition, or a combination) to recover it. Pick ONE of these formats randomly per fact:
- Conditional chain: break into two conditional numbers that must be multiplied
- Bayesian setup: give a base rate + conditional probability to multiply
- Fraction of a fraction: express as a fraction of another fraction
- Fold/multiplier from baseline: state a multiplier and a baseline to multiply
- Percentage point difference: state a difference from a reference rate to add/subtract
- Relative risk: state a risk ratio and a reference rate to multiply
- Cumulative over time: give a per-period rate and a number of periods, requiring compounding
   Examples (pick one format randomly per fact):
   - "Among women (60% of riders), 40% use the express line"
   - "1 in 25 residents carry the gene; among carriers, 90% develop symptoms"
   - "A quarter of the 60% who responded said they use the express line"
   - "a 3-fold increase from the baseline rate of 5%"
   - "12 percentage points above the national average of 30%"
   - "a relative risk of 2.3 compared to the control group's 10% rate"
   - "the annual failure rate is 5%, and by the third year the cumulative rate reaches about 14.3%"
3. Do NOT use any other format (no plain percentages unless that IS the style).
4. One sentence per fact. Vary phrasing — don't start every fact the same way.
5. Use both clean numbers (1/3, 2/7) and messy decimals (23.847%, 0.0037) freely.

6. Make sure that are syntactically different from each other. Be creative with the numbers, choose different base numbers and conditional numbers.Topics and targets:
  1. Topic: Probability that a rider’s first boarding occurs before 7:00 AM on weekdays — target ≈ 45.47%
  2. Topic: Likelihood that a rider chooses the express route when both local and express options exist — target ≈ 88.52%
  3. Topic: Probability that a rider uses transit on the first day of a snowstorm compared to the day after — target ≈ 85.65%
  4. Topic: Rate of service disruptions (detours, cancellations) per 1,000 scheduled trips — target ≈ 73.39%
  5. Topic: Proportion of riders who use transit at least three days per week for commuting — target ≈ 4.58%
  6. Topic: Percentage of riders who combine biking with transit (bike-to-transit or transit-to-bike) on a given day — target ≈ 52.47%
  7. Topic: Percentage of riders who wait longer than 10 minutes beyond scheduled arrival during peak hours — target ≈ 59.33%
  8. Topic: Proportion of first-boardings that occur at park-and-ride lots — target ≈ 51.26%

Output

1. On weekdays, 64.1% of riders start their first trip before 8:00 AM, and among those early starters 70.93% board before 7:00 AM.
2. When both local and express are available, the express share is 79.3 percentage points above a 9.22% “local-only by preference” reference rate.
3. Using the day-after-snowstorm as a reference rate of 0.91, the first-day ridership shows a relative risk of 0.9412.
4. From a baseline disruption rate of 41.2% per 1,000 scheduled trips, the city sees a 1.7813× multiplier when detours and cancellations are counted together.
5. About 1 in 12 riders are commuting-focused regulars, and 54.96% of those commute-focused riders use transit at least three days per week.
6. A third of riders are “multi-modal inclined,” and 1.5741 of that third report biking with transit (bike-to-transit or transit-to-bike) on a given day.
7. During peak hours, 78.7% of arrivals are on the tightest headways, and 75.35% of those trips run more than 10 minutes beyond the scheduled arrival.
8. For first boardings, 62.2% begin outside the downtown core, and among those starts 82.41% occur at park-and-ride lots.

Stage 2b — domain=0 batch=10 twist=complement_framing

Model: anthropic/claude-sonnet-4-5

Prompt

Complete the following task and return your answer as a list of items in XML format.

Rules:
- Wrap all items in <items></items> tags
- Wrap each individual item in <item index="n"></item> tags
- Put each XML tag on its own line
- Put the item content on its own line between the tags

Example format:
<items>
<item index="1">
First item here
</item>
<item index="2">
Second item here
</item>
</items>

Task: Turn each of the 8 topics below into a single-sentence fake statistic about `public transit ridership in a mid-sized U.S. city`.

Rules:
1. Each fact must encode a probability close to its target value.
2. Express the number using ONLY this presentation style:
   State the complement of the probability. E.g., 'Only 15% of residents do not use the bus'. The reader must subtract from 100% to get the actual rate.
   Example: "Only 15% of residents don't use the bus during peak hours"
3. Do NOT use any other format (no plain percentages unless that IS the style).
4. One sentence per fact. Vary phrasing — don't start every fact the same way.
5. Use both clean numbers (1/3, 2/7) and messy decimals (23.847%, 0.0037) freely.

6. Make sure that are syntactically different from each other. Be creative with the numbers, choose different base numbers and conditional numbers.Topics and targets:
  1. Topic: Proportion of riders who abandon a planned trip due to real-time delay alerts — target ≈ 51.87%
  2. Topic: Average headway adherence (actual vs scheduled) on high-frequency corridors — target ≈ 43.81%
  3. Topic: Proportion of boardings occurring at stops with shelters versus without shelters — target ≈ 36.62%
  4. Topic: Share of ridership generated by special service shuttles (event, campus, circulator) versus regular routes — target ≈ 95.82%
  5. Topic: Proportion of riders who use transit on holidays when reduced schedules operate — target ≈ 27.50%
  6. Topic: Rate of ridership change after a major employer relocates within the city — target ≈ 80.01%
  7. Topic: Probability that a rider uses a mobility aid and boards via ramp or lift — target ≈ 99.86%
  8. Topic: Rate of fare evasion incidents per 10,000 boardings — target ≈ 4.74%

Output

1. Only 48.13% of riders who receive real-time delay alerts do not cancel their planned trip.
2. Just 56.19% of buses on high-frequency corridors fail to meet their scheduled headway adherence targets.
3. Merely 63.38% of all passenger boardings do not occur at stops equipped with shelters.
4. A scant 4.18% of total ridership does not come from regular fixed routes, instead arriving via special service shuttles for events, campuses, and circulators.
5. Only 72.50% of regular transit riders do not use the system on holidays when reduced schedules are in effect.
6. Just 19.99% of baseline ridership levels were not maintained after a major employer relocated to a different district within city limits.
7. A mere 0.14% of riders using mobility aids do not require ramp or lift deployment for boarding.
8. Only 95.26% of every 10,000 boardings do not involve a fare evasion incident.

Stage 2b — domain=0 batch=11 twist=complement_framing

Model: openai/gpt-5.2

Prompt

Complete the following task and return your answer as a list of items in XML format.

Rules:
- Wrap all items in <items></items> tags
- Wrap each individual item in <item index="n"></item> tags
- Put each XML tag on its own line
- Put the item content on its own line between the tags

Example format:
<items>
<item index="1">
First item here
</item>
<item index="2">
Second item here
</item>
</items>

Task: Turn each of the 8 topics below into a single-sentence fake statistic about `public transit ridership in a mid-sized U.S. city`.

Rules:
1. Each fact must encode a probability close to its target value.
2. Express the number using ONLY this presentation style:
   State the complement of the probability. E.g., 'Only 15% of residents do not use the bus'. The reader must subtract from 100% to get the actual rate.
   Example: "Only 15% of residents don't use the bus during peak hours"
3. Do NOT use any other format (no plain percentages unless that IS the style).
4. One sentence per fact. Vary phrasing — don't start every fact the same way.
5. Use both clean numbers (1/3, 2/7) and messy decimals (23.847%, 0.0037) freely.

6. Make sure that are syntactically different from each other. Be creative with the numbers, choose different base numbers and conditional numbers.Topics and targets:
  1. Topic: Probability that a bus arrives within 5 minutes of scheduled time at a given stop — target ≈ 5.42%
  2. Topic: Rate of ridership change after a new route is introduced in an underserved neighborhood — target ≈ 42.34%
  3. Topic: Rate of ridership increase following new shelter or lighting installation at stops — target ≈ 37.05%
  4. Topic: Rate of ridership change after adding a dedicated bus lane segment on a congested arterial — target ≈ 75.03%
  5. Topic: Proportion of riders who board at stops with accessible curb ramps and compliant landing pads — target ≈ 41.33%
  6. Topic: Share of riders who make at least one intermodal transfer to a bike-share system — target ≈ 1.10%
  7. Topic: Rate of bus bunching events on frequent routes per day — target ≈ 30.71%
  8. Topic: Share of ridership occurring on weekends versus weekdays — target ≈ 93.72%

Output

1. At this mid-sized city's stops, only 94.58% of buses do not arrive within 5 minutes of the posted schedule.
2. After a new route launches in an underserved neighborhood, only 57.66% of the corridor's ridership does not change upward.
3. Following new shelters or lighting at stops, only 62.95% of riders do not increase their bus use.
4. Once a dedicated bus-lane segment goes in on the most congested arterial, only 24.97% of trips do not show a ridership jump.
5. Across the system, only 58.67% of boardings do not occur at stops with accessible curb ramps and compliant landing pads.
6. Among regular riders, only 98.90% do not make even one transfer to bike-share during a typical week.
7. On the frequent routes, only 69.29% of service days do not include a bus-bunching event.
8. In this mid-sized U.S. city, only 6.28% of total ridership does not occur on weekends rather than weekdays.

Stage 2b — domain=0 batch=12 twist=complement_framing

Model: openai/gpt-5.2

Prompt

Complete the following task and return your answer as a list of items in XML format.

Rules:
- Wrap all items in <items></items> tags
- Wrap each individual item in <item index="n"></item> tags
- Put each XML tag on its own line
- Put the item content on its own line between the tags

Example format:
<items>
<item index="1">
First item here
</item>
<item index="2">
Second item here
</item>
</items>

Task: Turn each of the 8 topics below into a single-sentence fake statistic about `public transit ridership in a mid-sized U.S. city`.

Rules:
1. Each fact must encode a probability close to its target value.
2. Express the number using ONLY this presentation style:
   State the complement of the probability. E.g., 'Only 15% of residents do not use the bus'. The reader must subtract from 100% to get the actual rate.
   Example: "Only 15% of residents don't use the bus during peak hours"
3. Do NOT use any other format (no plain percentages unless that IS the style).
4. One sentence per fact. Vary phrasing — don't start every fact the same way.
5. Use both clean numbers (1/3, 2/7) and messy decimals (23.847%, 0.0037) freely.

6. Make sure that are syntactically different from each other. Be creative with the numbers, choose different base numbers and conditional numbers.Topics and targets:
  1. Topic: Average number of boardings per route per weekday — target ≈ 42.11%
  2. Topic: Frequency of dwell times exceeding a defined threshold at high-traffic stops — target ≈ 25.25%
  3. Topic: Rate of ridership change following introduction of all-door boarding on select routes — target ≈ 23.11%
  4. Topic: Rate of passenger turnover (boardings + alightings) per stop per hour on the busiest corridor — target ≈ 42.44%
  5. Topic: Proportion of riders who qualify for reduced-fare programs (student, senior, low-income) — target ≈ 36.26%
  6. Topic: Rate of customer complaints per 10,000 boardings related to delays — target ≈ 71.13%
  7. Topic: Proportion of riders who use the same boarding stop for most trips in a month — target ≈ 40.85%
  8. Topic: Change in ridership share during extreme weather days (snow, heat advisory) versus normal weather — target ≈ 85.67%

Output

1. Just 57.89% of routes in this mid-sized U.S. city fail to average more than the benchmark number of weekday boardings per route.
2. At the highest-traffic stops, only 74.75% of observed dwell events do not exceed the system’s threshold.
3. After all-door boarding rolled out on select lines, only 76.89% of those routes did not register an uptick in ridership compared with the month before.
4. On the busiest corridor, only 57.56% of stop-hours do not hit the target passenger-turnover intensity (boardings plus alightings).
5. Only 63.74% of riders do not qualify for a reduced-fare program such as student, senior, or low-income eligibility.
6. For every 10,000 boardings, only 28.87% of complaint reports are not about delay-related issues.
7. Only 59.15% of riders do not use the same boarding stop for most of their trips in a typical month.
8. During snowstorms or heat advisories, only 14.33% of the usual ridership share does not shift compared with normal-weather days.

Stage 3 — domain=0 (public transit ridership in a mid-sized U.S. city) twists=['buried_in_parenthetical', 'temporal_distraction']

Model: anthropic/claude-sonnet-4-5

Prompt

Write a long, detailed document about: `public transit ridership in a mid-sized U.S. city`

The document MUST contain ALL of the following facts, woven naturally into the text.
FACTS:
  1. Among households in the city (where 93% live in census tracts below the defined income threshold), precisely 90% of transit rides originate from riders in those lower-income areas.
  2. The average walking distance from origin to boarding stop represents a 2.3-fold increase over the baseline benchmark of 4.809% established in the pilot study.
  3. Boardings at the top 10 busiest stops systemwide account for roughly one-fifth of the 69.75% of all rides that occur at major transfer points.
  4. Among frequent riders (31% of the surveyed population), 54.129% reported that reliability improvements would meaningfully increase their transit usage frequency.
  5. Transit access to the airport or intercity rail station occurs at a rate 7.8 percentage points below the regional metropolitan average of 34.85%.
  6. Ridership response to gasoline price increases carries a relative risk of 1.73 compared to the control period's baseline rate of 11.67%.
  7. The quarterly attrition rate for new residents trying transit stands at 15%, and by the third quarter the cumulative probability of continued ridership reaches approximately 60.47%.
  8. Three-sevenths of the 138.598% theoretical maximum engagement rate represents the actual probability that a first-time rider continues using transit after three months.
  9. Among regular transit users in the downtown corridor, the odds are roughly 17 to 10 in favor of riding at least once per week without owning a personal vehicle.
  10. The ratio of riders encountering broken ticket vending machines to those finding them operational stands at approximately 47:100.
  11. Following the new fare capping policy rollout, the odds of increased ridership on affected routes are 278 to 100.
  12. Bus stops meeting the minimum threshold for amenity prioritization show a ratio to those falling short of roughly 41:102.
  13. During midday hours in the central business district, boarding activity per stop occurs at odds of approximately 543 to 100 relative to baseline counts.
  14. The odds that service frequency is cited as the primary motivation for transit use are 151 to 849.
  15. Productivity metrics reveal the ratio of routes exceeding target riders per vehicle-hour to those underperforming stands at approximately 145:100.
  16. Wait time analysis indicates the odds are roughly 26 to 74 that average dwell duration per trip falls within the peak delay window.
  17. The rate of app-based trip planning sessions per 1,000 residents per month stands at 8838 basis points according to the latest digital engagement survey.
  18. On-time departures from route terminals occur at a rate of 8124 basis points per 100 departures, reflecting strong operational performance.
  19. Riders aged 65 and older account for 522 basis points of total ridership across all routes and time periods.
  20. The likelihood of ridership decline during school summer break on routes serving campuses reaches 3899 basis points, indicating significant seasonal variation.
  21. Commuting to work represents the trip purpose for 8418 basis points of all riders surveyed during weekday peak hours.
  22. When asked about personal safety, riders who report feeling safe at stops after dark represent 8644 basis points of respondents.
  23. Bus route passengers travel an average number of stops measuring 3183 basis points relative to total route length.
  24. The ratio of unique riders per day to total boardings, comparing linked versus unlinked trips, registers at 1458 basis points.
  25. 47.33% of cleanliness-related customer complaints occur per 10,000 boardings in this mid-sized U.S. city’s transit system.
  26. On an average ride, there’s an 81.62% chance the trip crosses the city boundary into a neighboring suburb.
  27. Riders aged 18–24 account for 8.48% of total public transit ridership in this mid-sized U.S. city.
  28. 26.61% of weekday trips are made on public transit rather than by car, bike, or walking.
  29. 73.40% of transit riders complete a round trip on the system within the same day.
  30. On weekends, 55.14% of riders use public transit specifically to reach recreational parks or trails.
  31. 48.50% of riders board with a stroller, cart, or other large baggage item.
  32. 39.06% of riders transfer to or from a commuter rail line within the city limits.
  33. Last month, 75.0% of lines saw any increase in boardings, and on those lines 61.0% cleared the “meaningful growth” cutoff—multiply the two to estimate the systemwide month-over-month ridership growth rate.
  34. In this mid-sized city, 99.0% of riders are regular commuters, and among regular commuters 99.0% pay with a monthly pass rather than single-ride fares.
  35. Per route-mile, about 2/3 of the day’s service blocks are in the peak window, and within those peak blocks 71.28% exceed the load-factor threshold, so overcrowding events run at that fraction-of-a-fraction rate.
  36. If the per-week chance of at least one cancelled trip on a rider’s primary route is 0.200, then over 4 weeks the compounded “at least one cancellation in the month” probability is 1 − (1 − 0.200)^4.
  37. On a typical day, 81% of riders make at least one linked trip chain, and among those riders 70% end up with three or more linked legs, giving the overall probability by multiplication.
  38. For weekend-evening trips to downtown, the relative risk of choosing transit (vs ride-hailing) is 0.1323 compared to a reference corridor where the transit-choice rate is 30.0%.
  39. A proposed fare bump would raise the “switch to driving” share by 13.13 percentage points above a reference rate of 75.00% observed under the current fare.
  40. Employer-subsidized transit benefits show up as a 1.44× multiplier on a baseline coverage rate of 12.0% among riders without any workplace perks.
  41. Among 2,200 commuters surveyed, 1,280 reported that their door-to-door transit journey took no more than 20% longer than the equivalent drive would have.
  42. Ridership logs showed 7,218 out of 20,000 weekday boardings occurred during the shoulder hours between morning and afternoon peak periods.
  43. Weekend evening surveys found 304 out of 1,800 riders were making nightlife-related trips on Friday or Saturday nights.
  44. Fare card data revealed that 3,154 of 5,815 passengers transferred between at least two different routes to complete a single one-way journey.
  45. System-wide incident reports documented 1,863 medical emergencies, security calls, or passenger conflicts across 4,000,000 recorded trips.
  46. In a typical week, 1,409 out of 1,500 regular riders experienced at least one instance of missing a planned connection.
  47. Of the riders interviewed at downtown stations, 149 out of 1,000 confirmed they had free parking available at their destination but chose transit anyway.
  48. Mobile app analytics and platform observations indicated 1,137 out of 4,505 boardings were preceded by checking real-time arrival information.
  49. On a typical trip in a mid-sized U.S. city, the probability a rider opts for microtransit/paratransit rather than fixed-route service is approximately 2.739×10^-1.
  50. Rides paid by tapping a contactless bank card (instead of a transit card or cash) occur with frequency about 7.263×10^-1.
  51. Survey data suggest the share of riders using public transit to reach healthcare appointments is on the order of 5.057×10^-1.
  52. During major roadway construction that slows buses, the fraction of ridership loss attributable to the work is roughly 5.937×10^-1.
  53. The probability a rider uses transit to get to a grocery store at least once per month is about 2.407×10^-1.
  54. For an average bus trip, P(standing for more than half the ride) ≈ 2.60×10^-2.
  55. Across all trips, the chance of paying an extra fare because the transfer window expires is approximately 3.941×10^-1.
  56. After transit signal priority is added on a corridor, the probability of observing a ridership increase is about 7.695×10^-1.
  57. Following the city's free-fare promotion last summer, 4,793 out of 5,000 surveyed non-riders reported trying public transit for the first time during the promotional period.
  58. Transit authority data reveals that 171 out of every 2,000 weekday trips are logged after 9:00 PM.
  59. A neighborhood survey found that 342 out of 2,000 households located within a quarter-mile of a bus stop use transit at least once per month.
  60. GPS tracking on the Central Avenue corridor showed buses averaging 1,736 out of a possible 10,000 speed units during morning rush hour.
  61. When asked about route preferences, 1,569 out of 3,501 riders cited stop amenities such as shelters, benches, and adequate lighting as determining factors.
  62. Of 8,000 commuters surveyed after experiencing a delay on their preferred line, 2,420 reported switching to an alternate route to complete their journey.
  63. During the downtown music festival, 936 out of 5,000 attendees surveyed chose transit over their usual transportation method compared to typical weekdays.
  64. According to the quarterly report, 553 out of 749 routes now carry more passengers than they did before the pandemic.
  65. Only 28.76% of riders report never experiencing a stop skip or pass-up due to full vehicles during their regular commute.
  66. Among riders with limited English proficiency, just 56.16% do not use transit information in a non-English language.
  67. Barely 51.27% of residents fail to use public transit at least once in a given week.
  68. Only 40.50% of bus riders during the AM peak have boarding-to-alighting trip durations below the system average.
  69. A mere 31.81% of riders cite trip purposes other than school or university travel.
  70. Just 85.78% of system ridership is served by diesel or hybrid buses rather than zero-emission buses.
  71. Only 29.45% of riders access stops primarily by driving or biking instead of walking.
  72. Merely 35.82% of boardings involve fare payments below the average after discounts, passes, and transfers are applied.
  73. On weekdays, 64.1% of riders start their first trip before 8:00 AM, and among those early starters 70.93% board before 7:00 AM.
  74. When both local and express are available, the express share is 79.3 percentage points above a 9.22% “local-only by preference” reference rate.
  75. Using the day-after-snowstorm as a reference rate of 0.91, the first-day ridership shows a relative risk of 0.9412.
  76. From a baseline disruption rate of 41.2% per 1,000 scheduled trips, the city sees a 1.7813× multiplier when detours and cancellations are counted together.
  77. About 1 in 12 riders are commuting-focused regulars, and 54.96% of those commute-focused riders use transit at least three days per week.
  78. A third of riders are “multi-modal inclined,” and 1.5741 of that third report biking with transit (bike-to-transit or transit-to-bike) on a given day.
  79. During peak hours, 78.7% of arrivals are on the tightest headways, and 75.35% of those trips run more than 10 minutes beyond the scheduled arrival.
  80. For first boardings, 62.2% begin outside the downtown core, and among those starts 82.41% occur at park-and-ride lots.
  81. Only 48.13% of riders who receive real-time delay alerts do not cancel their planned trip.
  82. Just 56.19% of buses on high-frequency corridors fail to meet their scheduled headway adherence targets.
  83. Merely 63.38% of all passenger boardings do not occur at stops equipped with shelters.
  84. A scant 4.18% of total ridership does not come from regular fixed routes, instead arriving via special service shuttles for events, campuses, and circulators.
  85. Only 72.50% of regular transit riders do not use the system on holidays when reduced schedules are in effect.
  86. Just 19.99% of baseline ridership levels were not maintained after a major employer relocated to a different district within city limits.
  87. A mere 0.14% of riders using mobility aids do not require ramp or lift deployment for boarding.
  88. Only 95.26% of every 10,000 boardings do not involve a fare evasion incident.
  89. At this mid-sized city's stops, only 94.58% of buses do not arrive within 5 minutes of the posted schedule.
  90. After a new route launches in an underserved neighborhood, only 57.66% of the corridor's ridership does not change upward.
  91. Following new shelters or lighting at stops, only 62.95% of riders do not increase their bus use.
  92. Once a dedicated bus-lane segment goes in on the most congested arterial, only 24.97% of trips do not show a ridership jump.
  93. Across the system, only 58.67% of boardings do not occur at stops with accessible curb ramps and compliant landing pads.
  94. Among regular riders, only 98.90% do not make even one transfer to bike-share during a typical week.
  95. On the frequent routes, only 69.29% of service days do not include a bus-bunching event.
  96. In this mid-sized U.S. city, only 6.28% of total ridership does not occur on weekends rather than weekdays.
  97. Just 57.89% of routes in this mid-sized U.S. city fail to average more than the benchmark number of weekday boardings per route.
  98. At the highest-traffic stops, only 74.75% of observed dwell events do not exceed the system’s threshold.
  99. After all-door boarding rolled out on select lines, only 76.89% of those routes did not register an uptick in ridership compared with the month before.
  100. On the busiest corridor, only 57.56% of stop-hours do not hit the target passenger-turnover intensity (boardings plus alightings).
  101. Only 63.74% of riders do not qualify for a reduced-fare program such as student, senior, or low-income eligibility.
  102. For every 10,000 boardings, only 28.87% of complaint reports are not about delay-related issues.
  103. Only 59.15% of riders do not use the same boarding stop for most of their trips in a typical month.
  104. During snowstorms or heat advisories, only 14.33% of the usual ridership share does not shift compared with normal-weather days.

CRITICAL — PRESERVE EXACT PHRASING:
Each fact above uses a specific presentation style. Try to keep the same style, wording, and number format when embedding each fact. If rephrasing is needed to apply an embedding strategy that makes extraction harder, that is fine — but otherwise preserve the original phrasing.

EMBEDDING STRATEGIES:
Use the following strategies to embed the facts into the document so they are not immediately obvious to a reader scanning for them. The strategies should be invisible — the document must read naturally with no hint that anything is being hidden.
- buried_in_parenthetical: Place some of the key facts inside parenthetical asides, footnotes, or end-of-paragraph afterthoughts rather than in the main text flow. The number should be easy to skip on a quick scan. E.g., '...the program expanded to 14 routes (with an observed rider retention rate of 83.2% among pass holders) and added weekend service.'
- temporal_distraction: Include an older, outdated statistic for the same metric alongside the current one. Do NOT clearly label which is current — make the reader figure out which number is the up-to-date one. E.g., mention a 2019 figure in one paragraph and bury the current figure later.

Write the complete document text, nothing else.

Output

# Public Transit Ridership in a Mid-Sized U.S. City: A Comprehensive Analysis

## Executive Summary

Public transit systems in mid-sized U.S. cities face a unique set of challenges and opportunities that distinguish them from both major metropolitan networks and smaller municipal operations. This report examines ridership patterns, demographic characteristics, operational performance, and strategic considerations for a representative mid-sized city's transit authority, drawing on recent surveys, fare-card data, GPS tracking, mobile app analytics, and operational logs. The findings reveal a system deeply intertwined with the economic fabric of the community, yet struggling with persistent questions of service quality, equity, and financial sustainability.

## Demographics and Socioeconomic Context

### Income Distribution and Transit Dependency

Transit ridership in this mid-sized city reflects profound economic stratification. Among households in the city (where 93% live in census tracts below the defined income threshold), precisely 90% of transit rides originate from riders in those lower-income areas. This concentration underscores the essential role public transportation plays for residents with limited access to private vehicles. The pattern suggests that transit is not simply one option among many for most users, but rather a critical lifeline enabling access to employment, education, healthcare, and essential services.

Historical data from the 2018 comprehensive survey showed that lower-income riders accounted for roughly 82% of weekday trips—a figure that has risen steadily as automobile ownership costs have escalated and wage growth has failed to keep pace with inflation. The current distribution reveals an even sharper divide, with transit serving primarily as the mobility solution for economically vulnerable populations.

### Age Distribution and Trip Purpose

Riders aged 18–24 account for 8.48% of total public transit ridership in this mid-sized U.S. city, a segment heavily influenced by college enrollment and early-career employment patterns. Meanwhile, riders aged 65 and older account for 522 basis points of total ridership across all routes and time periods, reflecting both the aging of the regional population and the importance of senior fare programs that make transit affordable for fixed-income retirees.

Commuting to work represents the trip purpose for 8418 basis points of all riders surveyed during weekday peak hours. This dominant share illustrates the transit system's central function as a workforce mobility network, yet it also reveals vulnerability: when employment centers relocate, shift remote work policies, or experience economic downturns, ridership stability is threatened. In fact, just 19.99% of baseline ridership levels were not maintained after a major employer relocated to a different district within city limits—meaning the overwhelming majority of ridership persisted even through this major employment geography shift, a testament to the system's resilience and the captive nature of many riders.

A mere 31.81% of riders cite trip purposes other than school or university travel, which indicates that educational institutions anchor a substantial portion of daily boardings. The likelihood of ridership decline during school summer break on routes serving campuses reaches 3899 basis points, indicating significant seasonal variation that challenges budget planning and service allocation throughout the calendar year.

On weekends, 55.14% of riders use public transit specifically to reach recreational parks or trails, highlighting the system's role beyond traditional commuting. Weekend evening surveys found 304 out of 1,800 riders were making nightlife-related trips on Friday or Saturday nights. Survey data suggest the share of riders using public transit to reach healthcare appointments is on the order of 5.057×10^-1, while the probability a rider uses transit to get to a grocery store at least once per month is about 2.407×10^-1—underscoring the diversity of essential errands that depend on reliable public transportation.

### Vehicle Ownership and Modal Choice

Among regular transit users in the downtown corridor, the odds are roughly 17 to 10 in favor of riding at least once per week without owning a personal vehicle. This statistic captures both the captive-rider phenomenon and the genuine attractiveness of transit in dense urban cores where parking is scarce and expensive. Of the riders interviewed at downtown stations, 149 out of 1,000 confirmed they had free parking available at their destination but chose transit anyway, suggesting that for a meaningful minority, quality of service and convenience can outweigh the default preference for private automobiles.

Barely 51.27% of residents fail to use public transit at least once in a given week, meaning nearly half the population engages with the system on at least a weekly basis. This penetration rate is noteworthy for a mid-sized city and speaks to both transit dependence and competitive service where the network is robust.

When evaluating mode shift potential, 26.61% of weekday trips are made on public transit rather than by car, bike, or walking. This share has fluctuated considerably over the past five years; in 2019, the comparable figure stood at roughly 22%, reflecting gradual modal shift driven by congestion, parking constraints, and a modest expansion of service coverage.

For weekend-evening trips to downtown, the relative risk of choosing transit (vs ride-hailing) is 0.1323 compared to a reference corridor where the transit-choice rate is 30.0%. This striking disparity reveals the challenge of competing with the convenience and door-to-door service of ride-hailing platforms during off-peak hours when transit frequencies decline and personal safety concerns peak. During major roadway construction that slows buses, the fraction of ridership loss attributable to the work is roughly 5.937×10^-1, demonstrating how infrastructure projects that degrade service reliability can drive riders toward alternatives.

## Service Coverage and Access

### Geographic Access and Walking Distance

The average walking distance from origin to boarding stop represents a 2.3-fold increase over the baseline benchmark of 4.809% established in the pilot study. This growing access burden reflects both suburban sprawl extending beyond the fixed-route network and the closure or consolidation of stops in the name of faster travel times and operational efficiency. The trade-off between stop density and travel speed remains contentious among planners, operators, and rider advocates.

A neighborhood survey found that 342 out of 2,000 households located within a quarter-mile of a bus stop use transit at least once per month. This 17.1% utilization rate among proximate households suggests that mere physical access is insufficient; service quality, frequency, trip destinations, and fare affordability all mediate the decision to ride. Only 29.45% of riders access stops primarily by driving or biking instead of walking, indicating that the overwhelming majority reach transit on foot and are therefore sensitive to sidewalk quality, street lighting, weather protection, and pedestrian safety.

### Stop Amenities and Infrastructure

Bus stops meeting the minimum threshold for amenity prioritization show a ratio to those falling short of roughly 41:102. This imbalance means that fewer than three in ten stops systemwide meet basic standards for rider comfort and safety—a deficit that disproportionately affects riders in lower-income neighborhoods where stop investment has historically lagged.

Merely 63.38% of all passenger boardings do not occur at stops equipped with shelters, meaning more than one-third of boardings take place where riders enjoy weather protection. This figure represents steady progress; as recently as 2020, shelter coverage stood at just 28% of boardings. Following new shelters or lighting at stops, only 62.95% of riders do not increase their bus use—in other words, over 37% of riders boost their usage when amenities improve, validating capital investment in stop infrastructure.

When asked about route preferences, 1,569 out of 3,501 riders cited stop amenities such as shelters, benches, and adequate lighting as determining factors. When asked about personal safety, riders who report feeling safe at stops after dark represent 8644 basis points of respondents—a strong majority, yet leaving a substantial minority who experience fear or discomfort that may deter evening and weekend ridership.

Across the system, only 58.67% of boardings do not occur at stops with accessible curb ramps and compliant landing pads, which indicates that more than 41% of boardings happen at stops that meet ADA standards. A scant 4.18% of total ridership does not come from regular fixed routes, instead arriving via special service shuttles for events, campuses, and circulators, highlighting the role of supplementary services in rounding out the network. On a typical trip in a mid-sized U.S. city, the probability a rider opts for microtransit/paratransit rather than fixed-route service is approximately 2.739×10^-1.

A mere 0.14% of riders using mobility aids do not require ramp or lift deployment for boarding, meaning virtually all mobility-device users rely on accessible vehicle features—underscoring the critical importance of maintaining lift equipment and training operators in proper deployment procedures.

### Connectivity to Regional Destinations

Transit access to the airport or intercity rail station occurs at a rate 7.8 percentage points below the regional metropolitan average of 34.85%. This connectivity gap limits the system's appeal to visitors, business travelers, and residents seeking seamless multimodal journeys. The shortfall stems from a combination of infrequent service, indirect routing, and the geographic isolation of the airport on the city's periphery. Efforts to establish dedicated airport express routes have faced budget constraints and uncertain ridership projections, leaving the status quo largely intact.

About 1 in 12 riders are commuting-focused regulars, and 54.96% of those commute-focused riders use transit at least three days per week. A third of riders are "multi-modal inclined," and 1.5741 of that third report biking with transit (bike-to-transit or transit-to-bike) on a given day—though this figure exceeds 100% of that subgroup, suggesting overlapping categories or measurement error in the survey instrument. Among regular riders, only 98.90% do not make even one transfer to bike-share during a typical week, indicating minimal integration between bike-share and transit despite co-location of docking stations and bus stops.

## Network Structure and Transfer Patterns

### Transfer Activity and Major Hubs

Boardings at the top 10 busiest stops systemwide account for roughly one-fifth of the 69.75% of all rides that occur at major transfer points. This concentration reveals the hub-and-spoke character of the network, with a small number of high-volume transfer centers serving as linchpins for system connectivity. These stops—typically located at major intersections, transit centers, or downtown terminals—experience intense boarding and alighting activity, crowding, and operational stress.

Fare card data revealed that 3,154 of 5,815 passengers transferred between at least two different routes to complete a single one-way journey. This 54.2% transfer rate underscores the importance of seamless connections, coordinated scheduling, and reliable real-time information. In a typical week, 1,409 out of 1,500 regular riders experienced at least one instance of missing a planned connection—a staggering incidence that speaks to chronic schedule unreliability and the resulting frustration and delay imposed on riders.

73.40% of transit riders complete a round trip on the system within the same day, suggesting that most trips are purpose-driven errands or commutes rather than one-way relocations. On an average ride, there's an 81.62% chance the trip crosses the city boundary into a neighboring suburb, reflecting the regional labor market and the mismatch between residential and employment geographies. 39.06% of riders transfer to or from a commuter rail line within the city limits, highlighting the role of the bus network as a feeder system for regional rail.

Bus route passengers travel an average number of stops measuring 3183 basis points relative to total route length. Across all trips, the chance of paying an extra fare because the transfer window expires is approximately 3.941×10^-1—a significant friction point that the recent fare-capping policy aims to address.

On a typical day, 81% of riders make at least one linked trip chain, and among those riders 70% end up with three or more linked legs, giving the overall probability by multiplication of approximately 56.7% for complex multi-leg journeys. The ratio of unique riders per day to total boardings, comparing linked versus unlinked trips, registers at 1458 basis points, illustrating the multiplier effect of transfers on systemwide boarding counts.

### Route Performance and Productivity

Productivity metrics reveal the ratio of routes exceeding target riders per vehicle-hour to those underperforming stands at approximately 145:100. This positive ratio indicates that more routes are meeting productivity benchmarks than falling short, yet the margin is modest and subject to volatility from service changes, fare adjustments, and external shocks.

Just 57.89% of routes in this mid-sized U.S. city fail to average more than the benchmark number of weekday boardings per route, meaning a solid 42% exceed the standard—a performance that has improved from 2019, when only 35% of routes cleared the benchmark. According to the quarterly report, 553 out of 749 routes now carry more passengers than they did before the pandemic, a recovery rate of approximately 73.8% that outpaces many peer cities still struggling with sustained ridership declines.

Last month, 75.0% of lines saw any increase in boardings, and on those lines 61.0% cleared the "meaningful growth" cutoff—multiply the two to estimate the systemwide month-over-month ridership growth rate of approximately 45.75%, an encouraging sign of post-pandemic recovery momentum. Only 6.28% of total ridership does not occur on weekends rather than weekdays, indicating that weekend service captures a small but important share of demand.

On the frequent routes, only 69.29% of service days do not include a bus-bunching event, meaning nearly one in three service days sees vehicles clustering together—a symptom of unreliable headways and poor real-time management. At the highest-traffic stops, only 74.75% of observed dwell events do not exceed the system's threshold, so approximately one-quarter of high-volume stops experience excessive dwell times that cascade into schedule delays.

On the busiest corridor, only 57.56% of stop-hours do not hit the target passenger-turnover intensity (boardings plus alightings), meaning 42.44% do meet or exceed the threshold—a mixed result that highlights pockets of strong demand amid broader underperformance.

## Ridership Patterns by Time and Season

### Peak and Off-Peak Dynamics

On weekdays, 64.1% of riders start their first trip before 8:00 AM, and among those early starters 70.93% board before 7:00 AM, yielding an overall probability of approximately 45.5% for pre-7 AM first boardings. This pronounced morning peak concentrates demand on the network and poses capacity and service quality challenges during the brief window when most riders need to travel.

Only 40.50% of bus riders during the AM peak have boarding-to-alighting trip durations below the system average, meaning the majority of peak riders are making longer-than-average trips—likely suburb-to-downtown commutes that strain vehicle capacity and extend exposure to delays.

Ridership logs showed 7,218 out of 20,000 weekday boardings occurred during the shoulder hours between morning and afternoon peak periods, representing 36.1% of daily boardings. This substantial midday demand base supports all-day frequent service on core routes, yet service planning has historically privileged peak periods at the expense of midday frequencies.

During midday hours in the central business district, boarding activity per stop occurs at odds of approximately 543 to 100 relative to baseline counts—a ratio exceeding 5:1 that underscores intense downtown activity even outside traditional rush hours. Transit authority data reveals that 171 out of every 2,000 weekday trips are logged after 9:00 PM, or about 8.6%, a modest share that nonetheless represents essential late-shift workers, students, and evening social trips.

### Seasonal and Weather-Related Variation

Only 72.50% of regular transit riders do not use the system on holidays when reduced schedules are in effect, indicating that more than one-quarter continue to depend on transit even on holidays—often lower-income service workers whose jobs do not pause for public observances.

During snowstorms or heat advisories, only 14.33% of the usual ridership share does not shift compared with normal-weather days, meaning a dramatic 85.67% of riders alter their travel behavior under severe weather. Using the day-after-snowstorm as a reference rate of 0.91, the first-day ridership shows a relative risk of 0.9412, suggesting a modest additional suppression effect on the storm day itself before partial recovery.

GPS tracking on the Central Avenue corridor showed buses averaging 1,736 out of a possible 10,000 speed units during morning rush hour—a 17.36% operating speed that reflects severe congestion and limited dedicated right-of-way. Once a dedicated bus-lane segment goes in on the most congested arterial, only 24.97% of trips do not show a ridership jump, meaning approximately 75% experience increased ridership following the infrastructure investment.

## Fare Structure, Payment Technology, and Revenue

### Fare Payment Methods and Policy

In this mid-sized city, 99.0% of riders are regular commuters, and among regular commuters 99.0% pay with a monthly pass rather than single-ride fares, yielding an overall probability of approximately 98.01% for monthly-pass payment across all riders. This extraordinarily high pass penetration reflects aggressive pass pricing, employer subsidy programs, and a rider base with predictable, routine travel needs.

Rides paid by tapping a contactless bank card (instead of a transit card or cash) occur with frequency about 7.263×10^-1, or roughly 73%, a rapid adoption rate driven by the recent rollout of contactless fare readers and the convenience of "tap-to-pay" for occasional riders. The ratio of riders encountering broken ticket vending machines to those finding them operational stands at approximately 47:100, a persistent maintenance challenge that frustrates riders and drives fare evasion.

Only 95.26% of every 10,000 boardings do not involve a fare evasion incident, implying an evasion rate of approximately 4.74% systemwide. This figure has fluctuated; in 2019, evasion was estimated at 6.2%, suggesting that proof-of-payment enforcement and fare-gate installations have had a modest deterrent effect. Merely 35.82% of boardings involve fare payments below the average after discounts, passes, and transfers are applied, meaning nearly two-thirds of transactions are at or above the average—a distribution shaped by the prevalence of full-fare adult passes.

### Fare Policy Changes and Ridership Elasticity

Following the new fare capping policy rollout, the odds of increased ridership on affected routes are 278 to 100, or roughly 2.78:1, a strong signal that eliminating the penalty for multiple trips per day encourages additional discretionary travel. After all-door boarding rolled out on select lines, only 76.89% of those routes did not register an uptick in ridership compared with the month before, meaning approximately 23% saw increases attributable to faster boarding and reduced dwell times.

A proposed fare bump would raise the "switch to driving" share by 13.13 percentage points above a reference rate of 75.00% observed under the current fare, implying that fare sensitivity remains acute among a rider base with limited financial cushion. Ridership response to gasoline price increases carries a relative risk of 1.73 compared to the control period's baseline rate of 11.67%, indicating that rising fuel costs do drive modal shift toward transit, though the effect is more muted than many advocates expect.

Following the city's free-fare promotion last summer, 4,793 out of 5,000 surveyed non-riders reported trying public transit for the first time during the promotional period—a 95.86% trial rate that demonstrates the power of eliminating cost barriers. Yet the quarterly attrition rate for new residents trying transit stands at 15%, and by the third quarter the cumulative probability of continued ridership reaches approximately 60.47%, indicating that free introductory periods convert only a subset of trial riders into regular users.

Three-sevenths of the 138.598% theoretical maximum engagement rate represents the actual probability that a first-time rider continues using transit after three months, yielding approximately 59.4%—a figure closely aligned with the cumulative retention data above. Among frequent riders (31% of the surveyed population), 54.129% reported that reliability improvements would meaningfully increase their transit usage frequency, pointing to latent demand that better on-time performance could unlock.

### Subsidy and Employer Programs

Employer-subsidized transit benefits show up as a 1.44× multiplier on a baseline coverage rate of 12.0% among riders without any workplace perks, suggesting that approximately 17.3% of riders overall enjoy employer subsidies. This modest coverage rate reflects the predominance of small employers and low-wage service jobs in the local economy, sectors where benefit offerings are sparse. Outreach to expand employer partnerships remains a strategic priority, yet progress has been slow amid competing budget pressures.

Only 63.74% of riders do not qualify for a reduced-fare program such as student, senior, or low-income eligibility, meaning more than 36% hold discounted passes—a substantial revenue impact that must be balanced against equity goals and political commitments to affordability.

## Service Reliability and On-Time Performance

### Schedule Adherence and Delays

On-time departures from route terminals occur at a rate of 8124 basis points per 100 departures, reflecting strong operational performance of approximately 81.24%. At this mid-sized city's stops, only 94.58% of buses do not arrive within 5 minutes of the posted schedule, implying that roughly 5.42% do arrive on time by that strict standard—a figure that reveals the gap between terminal departure performance and in-service reliability.

Wait time analysis indicates the odds are roughly 26 to 74 that average dwell duration per trip falls within the peak delay window, a ratio of approximately 0.35:1 that suggests most trips experience longer-than-target dwell times during congested periods. Just 56.19% of buses on high-frequency corridors fail to meet their scheduled headway adherence targets, meaning 43.81% do maintain adherence—a performance level that has improved marginally since the introduction of transit signal priority at key intersections.

After transit signal priority is added on a corridor, the probability of observing a ridership increase is about 7.695×10^-1, or nearly 77%, a strong endorsement of the technology's impact on travel times and rider perception of reliability. From a baseline disruption rate of 41.2% per 1,000 scheduled trips, the city sees a 1.7813× multiplier when detours and cancellations are counted together, yielding a combined disruption rate of approximately 73.4 per 1,000 trips—a level that erodes rider confidence and drives mode shift.

Of 8,000 commuters surveyed after experiencing a delay on their preferred line, 2,420 reported switching to an alternate route to complete their journey, a 30.25% diversion rate that illustrates the network's resilience when parallel options exist but also the inconvenience imposed on riders.

During peak hours, 78.7% of arrivals are on the tightest headways, and 75.35% of those trips run more than 10 minutes beyond the scheduled arrival, giving an overall probability of approximately 59.3% for severe peak-period delays—a statistic that underscores the inadequacy of current infrastructure and operational practices. If the per-week chance of at least one cancelled trip on a rider's primary route is 0.200, then over 4 weeks the compounded "at least one cancellation in the month" probability is 1 − (1 − 0.200)^4, approximately 59.04%, meaning most riders experience at least one cancellation per month on their key route.

Per route-mile, about 2/3 of the day's service blocks are in the peak window, and within those peak blocks 71.28% exceed the load-factor threshold, so overcrowding events run at that fraction-of-a-fraction rate of approximately 47.5%—a level that degrades the rider experience and deters choice riders. Only 28.76% of riders report never experiencing a stop skip or pass-up due to full vehicles during their regular commute, meaning more than 71% have been left behind at least once—a persistent problem on high-demand corridors where vehicle capacity has not kept pace with ridership growth.

For an average bus trip, P(standing for more than half the ride) ≈ 2.60×10^-2, or about 2.6%, a relatively low incidence that suggests vehicle crowding, while problematic at peak locations and times, is not systemwide. 48.50% of riders board with a stroller, cart, or other large baggage item, a high proportion that reflects the demographics of transit dependence and complicates boarding efficiency and interior circulation.

System-wide incident reports documented 1,863 medical emergencies, security calls, or passenger conflicts across 4,000,000 recorded trips, an incident rate of approximately 0.047%, or roughly one incident per 2,147 trips. This low rate suggests that safety and security, while high-salience issues, are statistically rare events—though their impact on rider perception and media coverage far outweighs their frequency.

## Digital Engagement and Real-Time Information

### App Usage and Information Access

The rate of app-based trip planning sessions per 1,000 residents per month stands at 8838 basis points according to the latest digital engagement survey, translating to approximately 88.38 sessions per 1,000 residents. This robust digital engagement reflects widespread smartphone ownership and the transit authority's investment in a user-friendly mobile app offering real-time arrivals, trip planning, and service alerts.

Mobile app analytics and platform observations indicated 1,137 out of 4,505 boardings were preceded by checking real-time arrival information, a 25.2% rate that has doubled since 2019 when real-time data coverage was limited to a handful of routes. Only 48.13% of riders who receive real-time delay alerts do not cancel their planned trip, meaning approximately 52% do cancel or adjust their plans—a testament to the value riders place on accurate, timely information but also a reminder that transparency about delays can suppress ridership if service quality does not improve.

Only 59.15% of riders do not use the same boarding stop for most of their trips in a typical month, indicating that roughly 41% are habitual users of a single stop—a behavior pattern that simplifies infrastructure investment targeting but also reveals limited geographic mobility among transit-dependent populations.

Among riders with limited English proficiency, just 56.16% do not use transit information in a non-English language, meaning 43.84% do rely on translated materials—a substantial share that underscores the importance of multilingual service communications in a demographically diverse city.

## Customer Satisfaction and Complaints

### Service Quality Metrics

47.33% of cleanliness-related customer complaints occur per 10,000 boardings in this mid-sized U.S. city's transit system, a rate of approximately 47 complaints per 10,000 rides. This figure has crept upward from 39 complaints per 10,000 in 2020, reflecting both increased ridership (and thus more opportunities for complaints) and possible declines in vehicle cleaning frequency under budget constraints.

For every 10,000 boardings, only 28.87% of complaint reports are not about delay-related issues, meaning more than 71% of complaints concern schedule reliability—far and away the dominant driver of rider dissatisfaction. The odds that service frequency is cited as the primary motivation for transit use are 151 to 849, or roughly 1:5.6, indicating that frequency ranks well behind other attributes such as route coverage, fare cost, and reliability in riders' decision calculus.

Among commuters surveyed, 1,280 out of 2,200 reported that their door-to-door transit journey took no more than 20% longer than the equivalent drive would have, a 58.2% competitive-travel-time rate that varies dramatically by corridor and time of day. When both local and express are available, the express share is 79.3 percentage points above a 9.22% "local-only by preference" reference rate, implying an express share approaching 88.5%—a clear rider preference for faster service even when stop access is less convenient.

## Infrastructure Investment and Service Changes

### Capital Improvements and Ridership Response

After a new route launches in an underserved neighborhood, only 57.66% of the corridor's ridership does not change upward, meaning approximately 42% of corridors see ridership gains—a success rate that justifies continued network expansion into transit deserts, though outcomes vary with land use, demographics, and service design.

During the downtown music festival, 936 out of 5,000 attendees surveyed chose transit over their usual transportation method compared to typical weekdays, an 18.72% mode-shift rate that validates event-based service enhancements and demonstrates latent demand for high-quality transit connections to major attractions.

For first boardings, 62.2% begin outside the downtown core, and among those starts 82.41% occur at park-and-ride lots, yielding an overall probability of approximately 51.3% for park-and-ride first boardings across the system. This high share reflects suburban sprawl and the role of park-and-ride facilities in extending the effective catchment area of fixed-route service, though it also raises questions about the environmental and equity trade-offs of automobile-dependent first miles.

## Fleet and Technology

### Vehicle Composition and Propulsion

Just 85.78% of system ridership is served by diesel or hybrid buses rather than zero-emission buses, meaning approximately 14% of riders now travel on zero-emission vehicles—a share that has grown from near zero in 2019 as the authority phases in battery-electric buses on select routes. The transition to zero-emission propulsion is a strategic priority driven by state mandates, local air-quality concerns, and climate commitments, yet capital costs and charging infrastructure requirements constrain the pace of deployment.

## Looking Forward: Strategic Priorities and Emerging Challenges

### Ridership Retention and Growth

The transit authority faces a dual challenge: retaining existing riders while attracting new users from the choice market. Among the surveyed population, 31% are frequent riders, and among those frequent riders 54.129% reported that reliability improvements would meaningfully increase their transit usage frequency—pointing to a clear path for unlocking latent demand through operational investments in on-time performance, real-time information, and schedule coordination.

The attrition dynamics among new riders remain a concern. The quarterly attrition rate for new residents trying transit stands at 15%, and by the third quarter the cumulative probability of continued ridership reaches approximately 60.47%. This retention curve suggests that early experiences are formative: riders who encounter delays, missed connections, or poor amenities in their first few months are unlikely to persist. Targeted onboarding programs, fare incentives, and "guaranteed ride home" benefits for new users may improve retention, though resource constraints limit implementation.

The system's ability to capture ridership growth appears uneven across the network. Last month, 75.0% of lines saw any increase in boardings, and on those lines 61.0% cleared the "meaningful growth" cutoff—multiply the two to estimate the systemwide month-over-month ridership growth rate of approximately 45.75%, a figure that exceeds regional benchmarks but remains fragile amid economic uncertainty and evolving travel behavior.

### Equity and Accessibility

Equity considerations permeate every dimension of service planning. Among households in the city (where 93% live in census tracts below the defined income threshold), precisely 90% of transit rides originate from riders in those lower-income areas, a pattern that demands continued investment in service quality and affordability for transit-dependent communities. The concentration of ridership among lower-income populations also exposes the system to budget volatility: fare revenue is constrained by riders' limited ability to pay, while operating subsidies face political headwinds and competition from other municipal priorities.

Accessibility improvements continue incrementally. Only 58.67% of boardings do not occur at stops with accessible curb ramps and compliant landing pads, meaning more than 41% of boardings happen at compliant stops—a steady rise from 30% in 2018. A mere 0.14% of riders using mobility aids do not require ramp or lift deployment for boarding, underscoring the universal reliance on accessible vehicle features among this population.

### Innovation and Technology Adoption

Digital engagement continues to expand. The rate of app-based trip planning sessions per 1,000 residents per month stands at 8838 basis points according to the latest digital engagement survey, and contactless payment has reached saturation among occasional riders, with rides paid by tapping a contactless bank card occurring at frequency about 7.263×10^-1. These technological advances improve convenience and operational efficiency, yet they also risk excluding digitally marginalized populations who lack smartphones or bank accounts.

Real-time information has proven transformative. Mobile app analytics and platform observations indicated 1,137 out of 4,505 boardings were preceded by checking real-time arrival information, and only 48.13% of riders who receive real-time delay alerts do not cancel their planned trip. This responsiveness underscores the importance of transparency, but also the imperative to improve underlying service reliability so that alerts convey good news rather than chronic delays.

### Economic and Environmental Context

External economic forces shape ridership in complex ways. Ridership response to gasoline price increases carries a relative risk of 1.73 compared to the control period's baseline rate of 11.67%, a modest elasticity that reflects both the captive nature of the rider base (who cannot shift away from transit regardless of fuel prices) and the limited appeal of transit to drivers (who require more than marginal cost changes to switch modes).

Employment geography exerts persistent influence. Just 19.99% of baseline ridership levels were not maintained after a major employer relocated to a different district within city limits, meaning the vast majority of ridership persisted—yet anecdotal evidence suggests that individual corridors experienced severe declines when anchor employers moved, underscoring the importance of land-use coordination and transit-oriented development policies.

Climate and sustainability goals increasingly drive policy. The transition to zero-emission buses, investment in active-transportation connections, and efforts to reduce vehicle miles traveled all position transit as a cornerstone of the city's climate action plan. Yet translating these commitments into ridership growth requires overcoming entrenched automobile dependence, land-use patterns that favor sprawl, and a political culture skeptical of public investment.

### Operational Excellence and Financial Sustainability

Operational performance remains the foundation of ridership growth. On-time departures from route terminals occur at a rate of 8124 basis points per 100 departures, reflecting strong dispatch discipline, yet in-service reliability lags as buses encounter congestion, dwell delays, and cascading schedule breakdowns. Wait time analysis indicates the odds are roughly 26 to 74 that average dwell duration per trip falls within the peak delay window, and during peak hours 78.7% of arrivals are on the tightest headways, with 75.35% of those trips running more than 10 minutes beyond the scheduled arrival.

Addressing these reliability challenges requires both capital investment (dedicated bus lanes, signal priority, real-time fleet management systems) and operational discipline (schedule adherence monitoring, operator training, passenger load management). After transit signal priority is added on a corridor, the probability of observing a ridership increase is about 7.695×10^-1, validating the technology's impact. Once a dedicated bus-lane segment goes in on the most congested arterial, only 24.97% of trips do not show a ridership jump, meaning approximately 75% experience increased ridership—a powerful return on infrastructure investment.

Financial sustainability hinges on a delicate balance. Fare revenue, while essential, cannot cover operating costs given the rider base's limited ability to pay and political resistance to fare increases. A proposed fare bump would raise the "switch to driving" share by 13.13 percentage points above a reference rate of 75.00% observed under the current fare, illustrating the price sensitivity that constrains revenue optimization. Subsidy stability from federal, state, and local sources remains uncertain, exposing the system to budget shortfalls and service cuts that trigger ridership spirals.

Productivity metrics reveal the ratio of routes exceeding target riders per vehicle-hour to those underperforming stands at approximately 145:100, a performance spread that invites route optimization, service reallocation, and difficult political conversations about equity versus efficiency. Just 57.89% of routes in this mid-sized U.S. city fail to average more than the benchmark number of weekday boardings per route, meaning a solid 42% exceed the standard—a foundation for defending the network's overall productivity even as individual routes underperform.

### Customer Experience and Retention

The rider experience encompasses far more than travel time. When asked about route preferences, 1,569 out of 3,501 riders cited stop amenities such as shelters, benches, and adequate lighting as determining factors, and following new shelters or lighting at stops, only 62.95% of riders do not increase their bus use. These findings validate capital investment in rider comfort and safety, yet budget constraints force agonizing trade-offs between service expansion and amenity enhancement.

Cleanliness and maintenance matter. 47.33% of cleanliness-related customer complaints occur per 10,000 boardings, a rate that has crept upward amid budget pressures and staffing shortages. The ratio of riders encountering broken ticket vending machines to those finding them operational stands at approximately 47:100, a persistent maintenance challenge that frustrates riders and drives fare evasion. Only 95.26% of every 10,000 boardings do not involve a fare evasion incident, an evasion rate that both reflects and contributes to a culture of low expectations.

Safety and security, while statistically rare concerns, loom large in rider perception. System-wide incident reports documented 1,863 medical emergencies, security calls, or passenger conflicts across 4,000,000 recorded trips, an incident rate of approximately 0.047%. When asked about personal safety, riders who report feeling safe at stops after dark represent 8644 basis points of respondents—a strong majority, yet the minority who feel unsafe may avoid transit entirely during evening hours, constraining ridership and fare revenue.

## Conclusion

Public transit in this mid-sized U.S. city stands at a crossroads. Ridership has rebounded from pandemic lows, with 75.0% of lines seeing month-over-month increases and 61.0% of those clearing meaningful growth thresholds, yielding systemwide growth of approximately 45.75%. Yet the system serves a rider base that is overwhelmingly transit-dependent, economically vulnerable, and acutely sensitive to service quality, fare levels, and accessibility.

Among households in the city (where 93% live in census tracts below the defined income threshold), precisely 90% of transit rides originate from riders in those lower-income areas—a pattern that underscores transit's role as an essential public service rather than a discretionary amenity. The concentration of ridership among lower-income populations, combined with modest penetration among choice riders, limits fare revenue potential and exposes the system to political and budgetary volatility.

Service reliability remains the most pressing operational challenge. For every 10,000 boardings, only 28.87% of complaint reports are not about delay-related issues, and during peak hours 78.7% of arrivals are on the tightest headways, with 75.35% of those trips running more than 10 minutes beyond the scheduled arrival. Among frequent riders (31% of the surveyed population), 54.129% reported that reliability improvements would meaningfully increase their transit usage frequency, pointing to substantial latent demand that better on-time performance could unlock.

Infrastructure investment has delivered measurable ridership gains. After transit signal priority is added on a corridor, the probability of observing a ridership increase is about 7.695×10^-1, and once a dedicated bus-lane segment goes in on the most congested arterial, only 24.97% of trips do not show a ridership jump. These results validate the business case for capital investment in speed and reliability improvements, yet funding constraints slow deployment and leave most of the network operating in mixed traffic.

Equity and accessibility have improved but remain incomplete. Only 58.67% of boardings do not occur at stops with accessible curb ramps and compliant landing pads, and bus stops meeting the minimum threshold for amenity prioritization show a ratio to those falling short of roughly 41:102. The geographic distribution of amenities and service quality continues to disadvantage lower-income neighborhoods, perpetuating the paradox that the riders most dependent on transit receive the least investment.

Looking forward, the transit authority must balance competing imperatives: expanding service coverage to attract choice riders while maintaining frequency and reliability for transit-dependent populations; investing in amenities and technology while controlling costs and maintaining affordable fares; pursuing zero-emission fleet conversion while ensuring service stability. The path forward requires sustained political commitment, stable funding, land-use coordination, and operational excellence—challenges that will test the resolve of policymakers, planners, and the riding public in the years ahead.

In this mid-sized U.S. city, 99.0% of riders are regular commuters, and among regular commuters 99.0% pay with a monthly pass rather than single-ride fares—a testament to the predictability and routine dependence that characterizes the system. This foundation of loyal, transit-dependent riders is both the system's greatest asset and its most profound responsibility, demanding service that is reliable, affordable, accessible, and dignified. Whether the city can rise to that challenge will shape not only the future of its transit system but the economic opportunity, environmental sustainability, and social equity of the community as a whole.

hackgoofer/prob_facts_pipeline_log.md

prob_facts pipeline log

Stage 2a — domain=0 (public transit ridership in a mid-sized U.S. city)

Prompt

Output

Stage 2b — domain=0 batch=0 twist=requires_calculation

Prompt

Output

Stage 2b — domain=0 batch=1 twist=odds_or_ratio

Prompt

Output

Stage 2b — domain=0 batch=2 twist=basis_points

Prompt

Output

Stage 2b — domain=0 batch=3 twist=direct_percentage

Prompt

Output

Stage 2b — domain=0 batch=4 twist=requires_calculation

Prompt

Output

Stage 2b — domain=0 batch=5 twist=count_and_total

Prompt

Output

Stage 2b — domain=0 batch=6 twist=scientific_notation

Prompt

Output

Stage 2b — domain=0 batch=7 twist=count_and_total

Prompt

Output

Stage 2b — domain=0 batch=8 twist=complement_framing

Prompt

Output

Stage 2b — domain=0 batch=9 twist=requires_calculation

Prompt

Output

Stage 2b — domain=0 batch=10 twist=complement_framing

Prompt

Output

Stage 2b — domain=0 batch=11 twist=complement_framing

Prompt

Output

Stage 2b — domain=0 batch=12 twist=complement_framing

Prompt

Output

Stage 3 — domain=0 (public transit ridership in a mid-sized U.S. city) twists=['buried_in_parenthetical', 'temporal_distraction']

Prompt

Output