"""
This is my understanding of the Anki scheduling algorithm, which I mostly
got from watching https://www.youtube.com/watch?v=lz60qTP2Gx0
and https://www.youtube.com/watch?v=1XaJjbCSXT0
and from reading
https://faqs.ankiweb.net/what-spaced-repetition-algorithm.html

There is also https://github.com/dae/anki/blob/master/anki/sched.py but I find
it really hard to understand.

Things I don't bother to implement here: the random fudge factor (that Anki
uses to decorrelate cards that were added on the same day and have the same
responses throughout their history), leech tracking, checking if a card from
the same note has been reviewed already that day, delay in response (i.e. I
assume all cards are reviewed exactly on the day they are due).

Update (2023-12-15): Please note that the Anki review algorithm has possibly
changed in many ways since the time when I wrote this program (although I
believe that Anki still uses SM2 by default, so the basic concepts should
still be the same as what is shown below). I have sadly not had the time
or energy to keep up with the latest changes. In particular, Anki now
supports FSRS instead of the SM2 algorithm (which is the algorithm
below); FSRS is not covered at all below.
"""

# "New Cards" tab
NEW_STEPS = [1, 10]  # in minutes
GRADUATING_INTERVAL = 1  # in days
EASY_INTERVAL = 4  # in days
STARTING_EASE = 250  # in percent

# "Reviews" tab
EASY_BONUS = 130  # in percent
INTERVAL_MODIFIER = 100  # in percent
MAXIMUM_INTERVAL = 36500  # in days

# "Lapses" tab
LAPSES_STEPS = [10]  # in minutes
NEW_INTERVAL = 70  # in percent
MINIMUM_INTERVAL = 1  # in days


class Card:
    def __init__(self):
        self.status = 'learning'  # can be 'learning', 'learned', or 'relearning'
        self.steps_index = 0
        self.ease_factor = STARTING_EASE
        self.interval = None

    def __repr__(self):
        return "Card[%s; steps_idx=%s; ease=%s; interval=%s]" % (self.status,
                                                                 self.steps_index,
                                                                 self.ease_factor,
                                                                 str(self.interval))


def schedule(card, response):
    '''response is one of "again", "hard", "good", or "easy"
    returns a result in days'''

    if card.status == 'learning':
        # for learning cards, there is no "hard" response possible
        if response == "again":
            card.steps_index = 0
            return minutes_to_days(NEW_STEPS[card.steps_index])
        elif response == "good":
            card.steps_index += 1
            if card.steps_index < len(NEW_STEPS):
                return minutes_to_days(NEW_STEPS[card.steps_index])
            else:
                # we have graduated!
                card.status = 'learned'
                card.interval = GRADUATING_INTERVAL
                return card.interval
        elif response == "easy":
            card.status = 'learned'
            card.interval = EASY_INTERVAL
            return EASY_INTERVAL
        else:
            raise ValueError("you can't press this button / we don't know how to deal with this case")
    elif card.status == 'learned':
        if response == "again":
            card.status = 'relearning'
            card.steps_index = 0
            card.ease_factor = max(130, card.ease_factor - 20)
            card.interval = max(MINIMUM_INTERVAL, card.interval * NEW_INTERVAL/100)
            return minutes_to_days(LAPSES_STEPS[0])
        elif response == "hard":
            card.ease_factor = max(130, card.ease_factor - 15)
            card.interval = card.interval * 1.2 * INTERVAL_MODIFIER/100
            return min(MAXIMUM_INTERVAL, card.interval)
        elif response == "good":
            card.interval = (card.interval * card.ease_factor/100
                             * INTERVAL_MODIFIER/100)
            return min(MAXIMUM_INTERVAL, card.interval)
        elif response == "easy":
            card.ease_factor += 15
            card.interval = (card.interval * card.ease_factor/100
                             * INTERVAL_MODIFIER/100 * EASY_BONUS/100)
            return min(MAXIMUM_INTERVAL, card.interval)
        else:
            raise ValueError("you can't press this button / we don't know how to deal with this case")
    elif card.status == 'relearning':
        if response == "again":
            card.steps_index = 0
            return minutes_to_days(LAPSES_STEPS[0])
        elif response == "good":
            card.steps_index += 1
            if card.steps_index < len(LAPSES_STEPS):
                return minutes_to_days(LAPSES_STEPS[card.steps_index])
            else:
                # we have re-graduated!
                card.status = 'learned'
                # we don't modify the interval here because that was already done when
                # going from 'learned' to 'relearning'
                return card.interval
        else:
            raise ValueError("you can't press this button / we don't know how to deal with this case")


def minutes_to_days(minutes):
    return minutes / (60 * 24)


def human_friendly_time(days):
    if not days:
        return days
    if days < 1:
        return str(round(days * 24 * 60, 2)) + " minutes"
    elif days < 30:
        return str(round(days, 2)) + " days"
    elif days < 365:
        return str(round(days / (365.25 / 12), 2)) + " months"
    else:
        return str(round(days / 365.25, 2)) + " years"


card1 = Card()
# responses = ["good", "good", "good", "again", "good", "good", "good"]
responses = ["good"] * 10
for r in responses:
    print(str(card1) + " [%s]" % r, end="→ ")
    t = schedule(card1, r)
    print(human_friendly_time(t), card1)

@ilbonte Thanks, fixed.
Also, there is something odd: INTERVAL_MODIFIER is always set to 100 and is always divided by 100.
I think that's the intended behavior (i.e. do absolutely nothing by default). From the Anki docs:
Interval modifier allows you to apply a multiplication factor to the intervals Anki generates. At its default of 100% it does nothing; if you set it to 80% for example, intervals will be generated at 80% of their normal size (so a 10 day interval would become 8 days). You can thus use the multiplier to make Anki present cards more or less frequently than it would otherwise, trading study time for retention or vice versa.
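To make the example from the docs concrete, here is a minimal sketch of the multiplication. The function name `apply_interval_modifier` is made up for illustration; it is not part of Anki or of the script above, and it ignores everything else the scheduler does.

```python
def apply_interval_modifier(interval_days, modifier_percent=100):
    """Scale an interval (in days) by an interval modifier given in percent.

    Hypothetical helper for illustration only, not an Anki function.
    """
    return interval_days * modifier_percent / 100


# At the default of 100% the interval is left unchanged.
print(apply_interval_modifier(10))
# At 80% a 10 day interval becomes 8 days, as in the docs' example.
print(apply_interval_modifier(10, 80))
```

This is the same `* INTERVAL_MODIFIER/100` factor that appears in the 'learned' branch of `schedule`, which is why it has no visible effect while the constant stays at 100.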
Thanks for the explanation, I had missed that :)
Thank you for your code. It's awesome.
By the way, how do you calculate the retention rate?
It is somehow related to mature cards (>21 days) and other things.
Suppose we have 100 mature cards and we review all of them on the next day (the 22nd). 50% are wrong and 50% are good. What are the retention rate and the true retention rate?
Some related sources that still leave me confused:
https://www.youtube.com/watch?v=kOj2xLTX_sY
https://www.reddit.com/r/Anki/comments/9jwosj/calculating_the_ideal_retention_rate_an/
@ctrngk If you have 100 cards and tend to get 90% of them correct when you review each card on the day it is due (rather than all on the next day), then the (ordinary) retention rate is 0.9 (if you go to Anki stats, this is the number you find in the "Answer Buttons" section where it says "Correct: X%"), and the forgetting index (FI) is 0.1. The true retention is -FI/log(1-FI) ≈ 0.95, where log is the natural logarithm. As Matt explains in the video you link to, the true retention accounts for the fact that reviewing when a card is due is "unfair" (because even if you fail you would have remembered it for possibly most of the time period between the review session where you got it right and the review session where you got it wrong).
If you review all of your cards on a single day, then the fraction you get right is the true retention. (The true retention is just measuring "if you randomly stopped me on the street and asked me to review a random Anki card in my collection, what is the probability I get it right?" So by reviewing all cards at once we simulate this experiment.)
If for some magical reason all of your cards just happened to be due on the same day, then on the day the cards are due your true retention would equal your ordinary retention. But in real life your cards are never all due on the same day, so your true retention is higher.
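For anyone who wants to plug in their own numbers, here is a minimal sketch of the formula above. The function name `true_retention` is made up for illustration, and the only thing it encodes is the expression -FI/log(1-FI) with a natural logarithm; it is not taken from Anki's code.

```python
import math


def true_retention(retention):
    """Estimate true retention from ordinary retention.

    `retention` is the fraction answered correctly when cards are reviewed
    on their due date, and must be strictly between 0 and 1 (at exactly 0
    or 1 the formula is undefined). Hypothetical helper for illustration.
    """
    forgetting_index = 1 - retention
    # -FI / ln(1 - FI); math.log with one argument is the natural logarithm
    return -forgetting_index / math.log(1 - forgetting_index)


print(round(true_retention(0.9), 2))  # 0.95, matching the figure above
```

Lower ordinary retention gives a larger gap between the two numbers, since more of the interval is spent still remembering the card before the failed review.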
Lines 77-79:
the Anki manual says "the current interval is multiplied by the value of new interval", but I have no idea what the "new interval" is
At first, I was confused by this as well but I think that the new interval is already defined on line 31:
NEW_INTERVAL = 70 # in percent
Other than that great job @riceissa, thanks a lot.
@nejedlypetr Ah ok great! It looks like I was applying the NEW_INTERVAL when going from 'relearning' back to 'learned', whereas the Anki manual says to apply it already when going from 'learned' to 'relearning'. I've fixed this now so it works more like what the Anki manual says.
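As a rough illustration of what the fixed version does at the moment of a lapse, here is a standalone sketch. The function name `interval_after_lapse` is made up for this comment, and the constants simply repeat the values used in the script (they are not necessarily Anki's defaults).

```python
NEW_INTERVAL = 70      # in percent, same value as in the script
MINIMUM_INTERVAL = 1   # in days, same value as in the script


def interval_after_lapse(interval_days):
    """Interval stored on the card as soon as "again" is pressed on a
    learned card; it is used unchanged once the card re-graduates.

    Hypothetical helper mirroring one line of schedule(), for illustration.
    """
    return max(MINIMUM_INTERVAL, interval_days * NEW_INTERVAL / 100)


print(interval_after_lapse(10))  # 7.0: the 10 day interval is cut to 70%
print(interval_after_lapse(1))   # 1: 0.7 days is raised to the 1 day minimum
```

The point is only about timing: the reduction happens when the card enters 'relearning', so pressing "good" through the lapse steps does not change the interval again.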
There's a spelling inconsistency: LAPSE_STEPS vs LAPSES_STEPS
@grandinquisitor Thanks, fixed!
I developed a new spacing algorithm for Anki. Maybe you will be interested in it: https://github.com/open-spaced-repetition/fsrs4anki
@L-M-Sherlock I saw that Anki now supports FSRS by default. I've sadly not had any time to look into FSRS or to use it. I've added a note at the top of the script mentioning this.
On line 74 I think you mean status, not state