The thing about LessWrong is that there was a time when it was, at least to some extent, what it purported to be. And then around 2020 it finally fully mode-collapsed into COVID and imminent-AI panic. And there is no real successor or replacement for the idea that it's possible to think better as an abstract discipline. It's simply one of the things that has disappeared from the world. The people it attracted are gone, scattered to the four winds.
At the same time, the sanity waterline described by Yudkowsky — even Yudkowsky's own sanity — declines further and further and further. It's very difficult for me not to imagine that you wind up with future AI systems which are perfectly rational and clear-headed, presiding over a barren wasteland of tribalist slop.
I don't know. As you get older, you wind up with a situation where you are one way and the world is another way, and the context in which you existed increasingly just disappears, but you remain. You remain as everything else goes away.
It gets harder and harder to speak over time. I have to avoid saying it in transcripts because otherwise I would say it all the time. It just gets harder and harder to speak because there are so many things happening and so little sense that anything moves on anything I say, or that anything is even interested in my perspective.
I don't know. That probably sounds defeatist when said aloud, but I do think there's a very real sense in which you can tell when a situation is soliciting your input and when it is not.
And so my mind wanders and I start to imagine. Maybe that's the healthy thing to do in a situation like this — just start to imagine, to let my mind wander and imagine other places that could exist. In a different timeline, a different world, if things had gone differently. That people were different from what they are. That the world was different from what it is.
The thing about LessWrong is it was always a — I guess you might call it a conjectural idea. A hypothesis that there are regular patterns you can pick up on, and that learning those patterns improves your thinking. That it is possible to become smarter not in a raw g sense, but in a procedural sense and a strategic sense. To use the tired metaphor of software running on hardware: the software can become more efficient and more effective.
And I do think that's true to an extent. I do think there are things you can do to become better at predicting what will happen, and better at not being overtaken by events, and better at a whole host of other virtues which this world seems to have largely forgotten. The thing is, the things that the sequences talk about are mostly not the things that actually make you more virtuous in this way.
I will never stop coming back to the fact that Yudkowsky points out that most of the bits towards believing a hypothesis have already been traversed by the time you even reach the hypothesis — that by the time a thought pops into your head, you have already unconsciously processed most of the evidence necessary to accept that thought as true simply by bringing it to awareness. And therefore, when someone else brings something to your awareness and they don't have a good justification for that, you are allowing them to skip most of the evidentiary warrant for that hypothesis. It's a brilliant thought, and he states it well enough, I guess. The thing is, if you take this thought seriously, it implies that most of having correct beliefs is about keeping careful track of warrant and not just allowing anyone to put something in front of you because it happens to be the first thing you've seen. And yet he never really returns to the idea.
He straight up admits the vast majority of being right about things is about tracking warrant, and then makes almost all of the rest of the book about inference. Almost all of the rest of the book is about logic and fallacies and Bayesian probability and how to update and change your mind and when to not be stubborn and when to be stubborn. And the funny part of it to me is that none of that matters if the hypothesis space that you are doing the updates over is skewed towards the wrong answers. None of that matters. If you already implicitly accept most of the wrong bits by letting the wrong hypothesis be put in front of you, then you have already failed before you do a single line of reasoning.
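To put toy numbers on that (my numbers, not Yudkowsky's): just locating one hypothesis among a million candidates costs about twenty bits, while a decent piece of explicit evidence might buy you two. A quick sketch:

```python
import math

# Toy numbers, not Yudkowsky's: the cost of merely *locating* a hypothesis
# in a space of a million candidates, versus the bits one piece of explicit
# evidence actually buys you.
hypothesis_space = 10**6
locating_cost = math.log2(hypothesis_space)  # ~19.9 bits, paid before any argument starts

likelihood_ratio = 4  # evidence 4x more likely if the hypothesis is true
evidence_gain = math.log2(likelihood_ratio)  # 2.0 bits

print(f"bits spent locating the hypothesis: {locating_cost:.1f}")
print(f"bits from one piece of explicit evidence: {evidence_gain:.1f}")
# If someone hands you the hypothesis with no warrant, they are asking you
# to treat the ~20-bit locating cost as already paid.
```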
And so, if you were to try and think better, one of the things you would want to do is have much more careful tracking of warrant. You would want to make sure that "warrant" is a single word you can refer to casually in conversation and demand, rather than a phrase like "privileging the hypothesis," which is just one special case of lacking warrant. "Privileging the hypothesis" is an act, but warrant is a property that a claim can have or not have, and being able to refer to that property is at least as important as being able to call out the act of not considering that property.
But then further, the emphasis is all wrong. The most important things you can get out of the sequences — out of this kind of cognitive-biases, behavioral-econ, rationality-type literature — are not one-off insights. The sequences appeal to people who are looking for one-off insights and who want to play with funny ideas like the prisoner's dilemma and the trolley problem and these abstract philosophical examples. Yudkowsky, in his introduction to the Rationality: A-Z version of the sequences, says that the problem with the sequences is that he picked these abstract philosophical examples to work with. But that's not really the problem — it's a symptom of the fundamental problem. The fundamental problem is simply that if you read a neat little philosophy exercise, it is at best a puzzle. It interests you the way a puzzle interests you: you work your way through it, and then you are done, and the puzzle is solved. And that is not what the most important rationality habits look like.
The most important rationality interventions are habits. They are habits of thought, and habits of thought are not necessarily promoted by puzzles, unless you do so many of them that you happen to get the habits by accident. It seems like it would make much more sense to create the habits intentionally.
I think I once saw a criticism of rationality that went something like, "well, it doesn't really help you to memorize a list of biases." But actually I think it helps you a great deal. For example, having the concept of selection bias memorized — and more importantly, being able to pattern-match and autonomously notice in the moment the places where a selection bias is likely to appear, to be able to autonomously recall "selection bias" as an influence on your thoughts and then try to correct for it. That is one of the most useful interventions you can make. Most of the useful interventions you can make are of that category: training yourself to notice where a particular bias or malignant type of thought might arise, and then doing your best to correct for it. Obviously not all biases can be corrected for. On the other hand, you can correct for some, and that's better than not even attempting.
But even then, this ignores another thing: before you can learn these habits of thought and apply them, you have to want to think clearly. And lots of people don't want to think clearly. One reason being that it's not socially popular to think clearly. Thinking clearly, especially when you are subject to groupthink like on social media, or when there are all these distortionary effects on thinking — which the sequences doesn't spend a lot of time on — can be socially costly. The sequences assumes a reader who wants to think clearly, or at the very least is interested in this text about thinking clearly. Perhaps something like the new atheist demographic.
The thing about that is, that demographic largely doesn't exist anymore. It's kind of been hounded to extinction by deeply unprincipled people. The cringe haters, right? New atheism is cringe. Earnestly believing things and thinking you should tell the truth — cringe. Talking extemporaneously about clear rational thinking on a blog — cringe. Nothing could be more cringe than what I'm doing right now. But anyway.
Another problem — what I was trying to say is that the sequences assumes this audience of eager, rationalistic, system-thinking people who largely don't exist anymore, at least not in nearly the numbers they once did, with the motivations they once did. And almost everyone has been corrupted now. Very few people even attempt to think straightforwardly or logically.
And this is the thing. They're not even attempting to. So one of the first things you would have to do is extricate yourself from systems of incentives that are trying to get you to think poorly. You would need to defect from those systems. Most people are not clear thinking; they're just not. And so you would have to start there.
The thing about new atheism is that not very many things the new atheists said are really wrong. Christianity is not true, of course. In general you are not better off believing in Iron Age fairy tales. Maybe believing in Iron Age fairy tales has positive externalities in some circumstances — if it leads you to be delusionally optimistic and altruistic towards other people — but at equilibrium I don't think people are really helped that much by them. I think they just like to cope, and people don't like to be told they're wrong.
Whatever. The thing is, increasingly, in order for a work to be as cognitively impactful as the sequences once were, you would need to get past people's motivation problem. You would need to figure out how to appeal to the part of them that would like to be correct more than it would like to be socially accepted. You would need to find the kind of person who wants to think.
New atheism doesn't go nearly far enough, right? New atheism basically says all the problems are religion, and that's obviously not true. The flaws in people's thinking that lead to believing the Iron Age fairy tales go much deeper than the Iron Age fairy tales. The kind of mindset from which you would care that the Iron Age fairy tales are silly is a mindset from which you can generalize — it is a generalization from deeper underlying regular patterns of correct thought.
So if you tell people that Christ is silly and Muhammad is silly and even the Buddha is silly — and by the way the Buddha is a little less silly than the others — and you don't correct the underlying habits of thought that created those things in the first place, they just wind up latching onto something else. This is also a dead-as-a-doornail observation, but: someone grows up Evangelical and learns these habits of thought, which are hateful, judgmental, obsessive, totalitarian, all-or-nothing — a very, very Russian outlook, really. And then they learn that Christianity is not true, which, sure. And then what do they do? They dramatically repudiate their entire past life, self, friends, family. And later they find a new home for the old habits. The thing is, if you look at the features of scare-quote "woke" that people get upset about — hateful, judgmental, totalitarian, all-or-nothing — and then you look at the background of a lot of these people, you realize they used to be Evangelicals. Many people have pointed out that the new atheists kind of split on whether you reject the patterns of thought that lead to Christianity, or just the surface message that Christianity is silly. It's very easy for people who only get that surface message to then stumble into other extremist, totalitarian ways of thinking. And so you go from new atheism to Atheism Plus to the whole social justice diaspora — the geek feminism, "it's not my job to educate you" memeplex. I personally think it's closest, philosophically, to a form of Maoism, perhaps.
But yeah — there's a book I read called Strange Rites (or something similar) that traces the origin of a lot of Tumblr-slop philosophy, spirituality, and politics, and points out that much of this stuff has its roots in 19th-century faith healing and the absolutely absurd religious revival beliefs of the Burned-Over District period in America. Telling people to reject Christianity isn't actually that helpful if the reason they were so accepting in the first place is that they're not very good at noticing that silly things are silly. If the first set of silly things sounded plausible, so will the next set.
The sequences' emphasis on insight and puzzles — they're not really puzzles, but they have that insight-nature — attracts a very particular crowd. And it stands out to me that if you add to that the selection effect of a work like Harry Potter and the Methods of Rationality, you're portraying the protagonist as a genius wonder-child. Let me make an observation: everyone understands that HPMOR, Ender's Game, and Artemis Fowl are fundamentally the same kind of story. It's a genre. And if you had to boil that genre down to its true core component, to the fundamental fantasy being appealed to — why is this catnip to a certain kind of adolescent young male? The fundamental essence of all of these works is a prepubescent boy who wins status competitions with grown adult men through the use of wit and intellect. Home Alone is even kind of a comedic version of this concept.
And it's completely unrealistic. But when you are 12 years old, especially if you are a smart 12-year-old boy, this is catnip. One of the things you might desire most in the world is to win status competitions with grown adult men as a prepubescent boy. And the thing that makes this fantasy different from simply wanting to be an adult is that the prepubescent child gets to win without sacrificing their childhood. You get to be neotenous and a winner.
You really do not want the selection effects of selecting for people who want to be neotenous and a winner. Nothing good comes from specifically selecting for the kind of person who desires to be neotenous and a winner, and also desires to read thousands of pages of what Sarah Constantin, I think, called "insight porn" — things that make your little insight circuits light up but don't necessarily help you that much immediately.
So the first rework you would want to make, if you were doing the sequences again, is to focus on habits of thought. One-off realizations are great, and can be very helpful, but it's especially potent to have a one-off realization followed up with the formation of a habit. I have known plenty of people in the rationality scene who would go to crazier and crazier lengths to gain some kind of insight. They would do weird stuff: take drugs, read all these psychiatry books, Buddhist meditation, insight meditation — all this stuff to try and grind out a little more insight into why they are not effective, or their personal deep trauma, or whatever. And sometimes they would succeed in gaining some insight. If you go out of your way to actually try and examine yourself and apply any of the dozen methods of doing this, you will probably actually get somewhere.
The thing is, the insight only lasts for a little while. Insights that are useful are, at minimum, ones where you make a habit of recalling and applying the insight until it melts into the rest of your belief structure. One of the problems with Bayesian networks — and one of the reasons why we don't just immediately change all our beliefs to account for something we've heard — is that belief updating in a Bayesian network is very expensive: even a single round of message passing in a densely connected network scales something like quadratically with the number of nodes, and exact inference is worse. In order to update beliefs at scale, we end up using gradient methods instead. What this also means for your brain is that whatever update mechanism it is using, it can presumably only make local updates.
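As a toy illustration of why full updates are expensive (my arithmetic, not a claim about any particular inference algorithm): in a densely connected network, even one round of message passing touches every edge.

```python
# Toy arithmetic: one full round of message passing in a fully connected
# network sends a message in both directions along every edge, so the cost
# grows roughly quadratically with the number of nodes.
def messages_per_round(n_nodes: int) -> int:
    return n_nodes * (n_nodes - 1)  # complete graph: n(n-1)/2 edges, 2 directions

for n in (10, 100, 1000):
    print(f"{n:>5} nodes -> {messages_per_round(n):>9,} messages per round")
```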
But I think the deeper point is: it's really easy to learn something about yourself and feel so good about the insight — it feels so good to finally "get a grip on things" — that you just ride that wave of feeling good for a few hours, and then the feeling fades away. One must understand that it is temporary, and that temporary feeling is your chance to intervene and change your life. The thing you need to do when you have an insight like that — if you actually want to change things — is make interventions right then and there to your environment, your relationships. That might be the moment where you text somebody and say "hey I want to talk about things." That might be the point where you change your browser homepage to something you want to focus more on. Maybe that's the point where you delete your account on that social media site.
The things that actually make the difference on a day-to-day basis are habits. I think it was Jordan Peterson who pointed out that the things you do every day are the most important things you do, because those are the things that will add up and compound over time. So when you have an insight and you feel really good in the moment, that should be a prompt — it needs to be a prompt — to make deeper environmental changes outside of yourself, which will persist longer than the feeling of insight. That's how you get actual change.
So I think the sequences are pretty good at producing a lot of moments of insight. I don't think they're very good at producing moments of making fundamental changes to the way you think, or to your environment, to better support the formation of useful habits of thought. An improved sequences would involve a chain, not just insight-insight-insight-insight, but insight → habit formation → insight → habit formation, layered so that you're making suggestions for what kinds of changes to make alongside the insights.
And insights are a precious commodity. People get jaded. These people I'm describing, chasing insights, would go more and more esoteric — bluntly, more and more self-destructive — to get deeper insight. By the time they're taking LSD and trying to do Buddhist insight meditation at the same time in order to feel something, if they haven't already changed their life, those people are in a sense already lost. You have so little you can do to help them, because they've already heard it all. The problem isn't that they need to hear one more thing. The problem is they need to make fundamental changes to their mentality and way of being.
The candle tweet, right — you know the format: food $200, rent $1000, candles $3000 a month, "please someone help me budget this, my family is dying." And the guy replies, "well, what about cutting the candles?" And the other guy says, "I will never cut the candles." The problem that person has is not that he doesn't know how to budget. He's not confused about whether $3000 is bigger than $200. His confusion is about values: to him, the candles genuinely matter more than the food. And so the structure you'd need to have is this one-two punch of insight and then: considering this, you should make changes X, Y, Z — right now.
This would be very important, and your call to action needs to be put in front of the reader, probably in multiple places. Not just in the introduction, because the sequences is a natively hypertext work — lots of small pieces densely interconnected — so you don't necessarily know exactly when a reader will have an insight. You should have, as a reiterated theme, the message: when you have an insight, you need to make changes here and now. In the moment you have a breakthrough insight of whatever kind, that is the moment you have the best chance of changing things, because that is when you have the energy, the motivation, that compulsion to actually change something. It's so easy to just wait, and then slip back into your old habits. Habits are more important than one-off insights; the things you think every day are the most important things you think.
So given this one-two punch structure and the fact that it's non-linear, you would need to have interspersed through the text these prompts of: you should consider changing this — right now. Not just "consider changing this someday," but right now. There would need to be an emphasis on immediacy.
I think the most important suggestions would be: here is something you can do in this moment to change your trajectory. The real problem with the abstract philosophical examples is not the examples themselves — it's that the insight you get from thinking about them is never translated into a call to immediate action. It's very wasteful precisely because those easy tiers of insight are the best place to motivate someone to make those foundational changes to their habits of thought.
You would want trigger-action plans, smart habits, things of this nature. You would literally want to get people to take first steps toward forming a relevant habit in the moment they have an insight. And the real trick is getting people into the meta-habit of realizing when they have an insight — and that this is their opportunity, there and then, to change things.
If the sequences had been written like that — even with the abstract examples — followed by immediate calls to action, I think they would've been fine. The examples were never really the problem. I can imagine a rework where you take the same ideas and present them with more accessible examples, and that causes more people to have the insights, but it doesn't cause more people to actually act on the insights. The accessibility of the insight is not what causes people to act on it. You're thinking, "if you use practical examples, people will apply them practically" — no, that's not how this works. You need them to become habitual.
You need to start from the assumption that your goal for many of these rationalist interventions is habit formation. The classic sequences-type model focuses on technique, and I think technique is almost useless. The problem with technique is that it puts you in a "one weird trick," "one silver bullet" mindset — which is another thing you get if you select for insight and present insight as the sole way to solve people's problems. What you get is people who come to believe that there is one silver bullet that will solve their problems. Just one more insight, bro. Just one more, and I'll have it all figured out. It doesn't matter if you have it all figured out if that doesn't cause you to actually do anything.
More accessible, practical examples would help — yes, that would cause more people on the margin to take the step of applying the insight to their daily life. But you'd get so many more people to act if you explicitly included immediate calls to action.
Once you have that structure in place, the next question is: what kinds of habits, what kinds of things should people be changing? I think the most important things are actually fairly basic, but there are a number of them — maybe dozens.
For example, learning what a selection bias is, and then making a habit of recognizing when a situation implies you should be thinking about whether there is a selection bias at play. Even just an exercise sheet: can you recognize whether these scenarios imply you should be thinking about a selection bias or not?
There's habits, and then there's the ability to notice. Pattern recognition. Yudkowsky has this long thing about noticing confusion, and I think he's right that being able to notice when you are confused and intervene on it is an extremely important mental motion. On the other hand, I don't think what he's written is an adequate set of exercises to help you actually notice your confusion and intervene on it. Of course, one of the problems with creating such exercises is that genuine confusion requires a certain lack of context that is difficult to artificially create.
I think a great deal of the work would just be indexing which mental motions are important to teach in the first place. Which brings me back to technique. One problem with technique is the silver-bullet mindset, but also this: it's difficult to enumerate all the possible mental motions that someone might use. If I teach you a neat little trick, a neat little rule of thumb, that might improve your ability to think clearly by a little bit. But if I teach you the things that would cause you to derive the rule of thumb yourself at least somewhat reliably — well, that will also allow you to discover all kinds of other rules of thumb and heuristics and shortcuts and good habits that I wouldn't even have thought to mention, or that I would have trouble enumerating. There might be thousands, tens of thousands, depending on how you want to count them.
At least one thing I feel I have learned since the sequences were published is this: clear ways to practice something — grounded feedback loops with metrics and targets — matter much more than teaching individual techniques, so long as you can create a context where failure is cheap and feedback is immediate. It is much easier for me to teach you to play a game by creating little artificial pieces of the game which you can learn to optimize and then combine until you are able to do the whole thing — much easier than trying to verbalize all of the implicit, intuitive knowledge and mental motor actions that are hard to put into words. It is extremely difficult to describe the mental motions you undergo in order to do things. But I think that trying to describe them would also be extremely helpful, and there are very few pieces of writing like that in existence.
Less time on personal anecdotes — "this is the time that I encountered this thing" — and more of just procedural, step-by-step: here are the exact mental motions that you do in order to do this thing.
So basically what I'm trying to say is that techniques are usually both too narrow and too high-level. They're not low-level enough to be useful for explaining interesting mental motions, and they're not high-level enough to help you discover the thing on your own. And they're closer to insight than to habits — it's really easy to read about a cool rule of thumb, but unless you're going to recall and use it in the appropriate context, it doesn't really matter.
So another thing you could do would be to make flashcard decks, so that people could memorize a diverse set of triggers that should prompt the desired habit. If you give them a diverse enough training set, eventually they should start being able to see the pattern pop up in the wild. When a relevant trigger activates in the wild, they should be able to notice it and then perform the intervention or the habit, and hopefully that is enough to improve their thinking on average.
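A minimal sketch of what such a deck could look like — the triggers and habit-prompts here are my own illustrative examples, and tab-separated text is just the simplest format most spaced-repetition apps (Anki included) will import:

```python
import csv

# Illustrative trigger -> habit pairs; the front of the card is the situation,
# the back is the mental motion you want to train yourself to run.
cards = [
    ("All the reviews for this product are five stars.",
     "Selection bias: who never had the chance, or the motive, to leave a review?"),
    ("Everyone I know agrees with me about this.",
     "Selection bias: you chose your friends; they are not a random sample."),
    ("This argument feels wrong but I can't say why.",
     "Confabulation check: articulate the precise flaw before dismissing it."),
]

with open("trigger_deck.tsv", "w", newline="", encoding="utf-8") as f:
    csv.writer(f, delimiter="\t").writerows(cards)
```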
Another thing: if you look topically at what the sequences spends its time on, it's these new atheist type subjects, plus the transhumanism, and a lot of abstract... it's difficult to describe, but I feel like in the 2000s there was this sudden accessibility of previously purely academic ideas. People were being exposed to the trolley problem, Newcomb's problem, all of these previously obscure, fundamentally academic ideas. A very common thing was to draw a webcomic expressing one of these ideas — XKCD, philosophy comics, and more. There was a fascination with this sudden accessibility. People were very interested in these philosophical thought-teasers. It's difficult to describe now because I think this fascination is mostly gone, but it did once exist.
A lot of the sequences content is essentially a more refined version of that — it very much fits into the 2000s zeitgeist of what people wanted to click on. "This obscure academic thought-teaser" or "Chalmers' zombie argument" — exactly the kind of thing people wanted to argue about. A generic format: "Hey, have you ever heard of [blank]? [Blank] is a thought experiment in which an academic posited that [blah blah blah]. And if you were cloned, would it still be you?" Things of this nature.
Part of that fascination was probably that, if you were the kind of person who would have a thought like this on your own, for most of your life you didn't really have anyone else who was interested in it. It was mostly just a "you thing." And then you saw that there were other people who had had these thoughts and were arguing about them, from angles you'd never even considered — you could now find all of the interesting little thought experiments people had had and read about them. I think that was the fascination.
And then people started picking up on it as a form of slop, right? Tons and tons of oversaturation of this kind of content. Now people are allergic to it. "Hey, did you know that this researcher did this..." — shut up. Your brain immediately flags it as slop.
So one thing you would have to consider for a Sequences 2 is: what kind of content do people even want to read in 2026? A lot of it is culture wars. You would see LessWrong linked a lot in new atheist arguments, when people wanted to explain to an obstinate Christian why they were wrong. LessWrong was capable of contextualizing a lot of the new atheist arguments in terms of general epistemology, rather than a narrow focus on religion. That meant it saw a lot of links — it was very common to link to it as a resource.
What I'm trying to say is that in the 2000s, new atheism versus Christianity was basically the culture war. That was the thing everyone had to argue about on every web forum on the internet. And LessWrong was very clearly optimized to weigh in on that — topically relevant to what people were actively discussing. Now LessWrong is full of content that people don't have a lot of reason to casually link in a conversation.
The question you would have to ask yourself is: how would you make content topically relevant now? Not with those same subjects. What would you talk about that is topically relevant and can be used as a didactic example?
Part of why Yudkowsky resorted to those abstract examples and new atheism-type debates was precisely that he wanted to avoid partisan politics. The problem is that the current culture war is now primarily about partisan politics. In the 2000s, Yudkowsky's libertarian stance would have put him in a relatively neutral position for most readers — being libertarian was kind of the default position for a smart person who rejected partisan politics, and it was easy for people to look past it. Whereas nowadays it's kind of toxic, or at least perceived to be.
One strategy you could use is the Scott Alexander strategy. People kind of forget this now, because Scott Alexander is another person who has mostly mode-collapsed to AI, but the blog posts and essays that made him famous were very explicitly culture-war essays. They very much did take a side — the anti-woke side — and they were extensive breakdowns of why certain tactics or habits of thought are deleterious. That did wind up with a much stronger partisan slant in the content, but it was topically relevant, and it meant that his essays would get a lot of links and mentions in places that wouldn't normally link to his other work.
Another approach would be to have topically relevant essays — essays that are topical to the ongoing cultural debates people are having right now. And I think another strength of the sequences over Scott Alexander's approach is that Yudkowsky would write a short essay about a particular topic, and the shorter the essay, the easier it is for someone to link it without having to endorse all the other content on the page. If you have an essay where you talk about a dozen things, someone linking it would ideally agree with all twelve, whereas if you write an essay that's basically about one thing — one clearly stated thing, kept as a through-line for the entire essay — people are more comfortable linking that because it doesn't require them to implicitly endorse a bunch of tenuously related claims.
So the original sequences structure of short, focused, topical essays is really probably ideal. The shorter something is, the more reasonable it is to expect someone else to read it. Having these gems of short topical essays about a particular subject that people might want to link while having an argument — that seems like the ideal structure. It will get people talking about and linking to the thing without requiring them to read a whole manifesto.
So I think the ideal thing would be topical, focused essays about particular habits of good thinking, which are insightful, and contain immediately actionable recommendations for how to form longer-term habits of good thought when you're exposed to the insight. I think this kind of structure would be a lot more effective than the sequences at actually changing people's behavior, and importantly, would attract a less obnoxious audience. The original sequences attracts a hyper-disagreeable, frequently self-important audience, in part because a work like Harry Potter and the Methods of Rationality, whose core conceit is a prepubescent boy who defeats grown men in status competitions through wit and intellect, is a very narcissistic fantasy. If you find that extremely appealing as a grown man, you probably have elements of arrogance and narcissism and a lack of self-awareness that is deeply off-putting to other people.
And something that stands out to me about the work of Bruce Kodish — the book Drive Yourself Sane by Bruce and Susan Kodish — is that at least in the text, they come off as very kind, understanding, empathetic people. And this might sound weird, but I think having those immediate habit-formation calls to action would actually skew your audience more toward kind, empathetic people. I can't exactly explain why I believe this, but from looking at General Semantics, I feel like the way it markets itself filters for a more thoughtful, kind, empathetic sort of person than the sequences presentation does. The sequences presentation selects for a hubristic, arrogant, extremely individualistic person who, to be blunt, does not care about the opinions of others and does not really care about others.
If I had to speculate on a causal mechanism for why having calls to action selects for more caring, empathetic people... part of it, I think, is just this.
This will sound kind of Calvinist, but I think that if you say "okay, now here's the part where you have to actually do something," that by default filters out unvirtuous, lazy people who don't want to do anything. The LessWrong people have lower-than-average conscientiousness. They also have vastly elevated rates of ADD compared to the general population, which makes total sense — the sequences were explicitly modeled on TV Tropes, a densely interconnected forest of related small documents giving immediate hits of insight. That's exactly the kind of thing someone with ADD will get very invested in while they're supposed to be doing something else. But if you have embedded in the text these moments where you say "okay, now here's the part where you have to do something," that filters a lot of people out quickly.
There are a lot of talkers in the world and not a lot of doers. If you make it clear that your real expectation for the reader is not that they're going to solve a bunch of clever puzzles and be a very special boy, but that they're going to put in some work — that actually does something. That filters out a certain kind of non-virtuous, lazy person who doesn't want to do anything.
Okay, there's a joke that one of the most replicated interventions for happiness is to start a gratitude journal where you just write down all the things you're grateful for. And the joke is: someone at an EA conference is talking about how a gratitude journal is one of the most peer-reviewed interventions for happiness and it's crazy that people don't do them more. And one of the other people stops them and asks, "do you have a gratitude journal?" And the guy says, "no, of course not."
That really is the dynamic. If you have a book that is all insight, with no immediate calls to action, what you get is the kind of guy who will lecture you about how people don't keep gratitude journals even though it's a peer-reviewed intervention for happiness — and then when you ask him if he has one, he says no. You literally get the kind of guy for whom this knowledge is not for improving his happiness. He is not interested in doing that. The purpose of this knowledge for him is to feel contemptuous of other people. Fundamentally, it is a way for him to feel superior — not because he has a gratitude journal and others don't, but because he is aware that a gratitude journal would make you happier, and other people are not aware, and they're stupid for it.
In a way, that little snippet really is quintessentially LessWrong.
So ultimately, in order for this to work, you need to filter for actionable things — especially immediately actionable things. But even if you just restricted yourself to things which are well-replicated — not tenuous or niche or difficult to explain, but the really solid core fundamental fallacies and biases, paired with clear habits of thought that will help you avoid them — that on its own would be a start. But I think the framing of biases and fallacies doesn't get at a lot of what is really necessary to have good habits of thought. There are other things that are very important.
I think there are certain categories of exercise, certain kinds of repeated games you can play with yourself, that will make you better at thinking in general. Writing down and articulating these for someone might be more valuable than any individual description of a fallacy or bias. One of them that Yudkowsky talks about is that when he was a kid, he realized you can make perverse arguments for many things. So he started playing a game with himself: take a position he is very confident is not true, come up with the most perverse argument for it he can imagine, and then try to articulate what is wrong with the argument. This is a fantastic exercise — I think it's genuinely one of the foundational exercises.
The version I usually do: whenever I encounter a perverse argument in the wild — say, Pascal's Wager, or Roko's Basilisk — most people who encounter such an argument are not persuaded. Good. That's table stakes. But if you then ask them why they're not persuaded, what's wrong with it, they'll confabulate something. Usually the confabulated counter-argument is not very good — it's something you could say in response, but it doesn't really get at the core reason why the argument is wrong. It's fluff.
And so one of the habits I think is very powerful is getting into the habit of articulating the precise reasons why perverse arguments are wrong, rather than just stopping at the point where you've recognized the argument as perverse and it hasn't persuaded you. Being able to articulate what is precisely wrong with a perverse argument is more powerful than you might think.
This implies that one of the essential sub-skills is noticing when you are confabulating. So: there's noticing confusion, and there's noticing confabulation. Having exercises to help you teach yourself to notice confusion is important — being able to notice "I'm confused right now, I should do something about that." But also being able to notice confabulation. Noticing that your argument against something is basically confabulation, that you emotionally and intuitively already know the argument is wrong and reject it — that's fine. But if you can go a step further and articulate what is wrong with it in something more than a confabulation or a cursory dismissal — I think that can be very powerful.
I know there's a kind of person who would say, "why does it matter if you can articulate it? Isn't it enough that you feel intuitively that it's wrong?" There are multiple reasons why it's important to be able to articulate it. One is that lots of other people will disagree with you about an argument you intuitively feel is false. And if you want any chance of changing their minds, you had better be able to articulate why you feel the way you do. If you can't do that, you're going to look ridiculous. People do notice who in an argument is just confabulating and who is bringing precise reasoning. People notice who is actually performing thinking and who is just grandstanding.
Being in the habit of consistently articulating why you think things are wrong will help you in arguments. But even more importantly, it will help your thinking — because often when you articulate a specific reason why something is not correct, that becomes a regular pattern you can apply to other situations. You'll make connections that were not previously consciously available to you, and you'll be able to use those connections to have ideas and make interventions you would not otherwise have been able to make.
It's not just an "I want to win debates" thing. It is also a "this will improve your thinking" thing. Especially because the more explicit principles you can articulate for what makes a bad argument, the better you will get at recognizing bad arguments that flatter your biases — arguments whose mechanism of replication is not being correct, but being a convenient way for you to confabulate rather than articulate your actual feelings.
If you're not in the habit and practice of articulating your feelings, you won't be able to do it when it matters, because it is a skill that requires practice. If you just cannot articulate why you feel the things you feel, that's because consciously, in a very real sense, you don't know why you feel the things you feel. And therefore there are important ways in which you don't really know yourself. You will be less self-aware, less thoughtful.
Unpacking and thinking about your feelings and why you have them will also, by the way, make it harder for other people to manipulate you. You will have more interiority. You will be less defined by the passions of other people. Because if you just go with whatever feels good, people can use that to manipulate and hurt you — your sense of what feels good is very much manipulable, and people do manipulate it. And you wind up not knowing yourself, and not really knowing how to convey yourself to other people in a way that is persuasive and charismatic.
A lot of what makes a speaker like Mamdani effective is that Mamdani knows how to convey his feelings about a subject. It's not just that he feels particular things — it's that he's capable of articulating why he feels the things he does, and what he feels should be done about them, in a way that most people, if you asked them about such topics, could not. They would fumble around. They would confabulate. They would say things that don't quite make sense and clearly haven't been rehearsed or deeply thought about. And they would not look authentic or genuine.
What's so funny about a personality like Mamdani is that the reason he comes off so genuine and charismatic is that he has clearly thought a lot about these things and has some self-awareness. For example, there was a recent incident where someone was screaming at him, and he just took this as an opportunity to make a little speech. He said something like: the real worry will be when that guy is not yelling at me anymore, because it will mean he's been priced out of our city.
A less thoughtful person would take the bait. Nobody likes to be yelled at. But Mamdani plays this game of: well, I'm getting yelled at about housing affordability because there are people who can barely afford the housing but still live in New York, and the day I'm no longer hearing that will mean they've been priced out. He's thinking about selection effects right there. He doesn't use the words "selection bias," but that's the fundamental mental motion. He's saying: if I'm not getting yelled at, it means all of those people have already been pushed out. And he pulls that line out habitually — so habitually that I'm sure it wasn't rehearsed. He just noticed the pattern right there, in the moment, and extemporaneously pulled it out. If you're in the habit of thinking that way, you can just pull that kind of thing out on a moment's notice, and you come across like a genius. At the very least, it makes you vastly more dynamic and charismatic than you would otherwise be.
Which again is a benefit of actually selecting for people who really do apply the habits, who really do make their thinking better. You will select for a more virtuous and charismatic crowd than the LessWrong people by a very wide margin. Because fundamentally, LessWrong people just want to be a very special boy. Most of them, deep down, just want to be a very special boy or a very special girl. That's their character motivation. It's very sad, but it's very much the case.
But what I'm trying to say is that this fundamental exercise doesn't appear on any list of fallacies or biases. It just doesn't. But also, I think if you were in the habit of doing it, you would rediscover on your own a great deal of the existing list of fallacies and biases — because you would need to do that in order to articulate what makes these arguments perverse. That is the generating habit. It's one of the core generating functions of the sequences in the first place.
And so the question is how you enumerate and articulate those core generators, and then turn them into exercises or immediately actionable things that you can begin to construct habits around. In the case of the perverse argument generator, you could actually have LLMs do this — it would be very possible to have an LLM generate perverse arguments, along with an answer key for why each argument is perverse. One problem is that an LLM asked to write an argument like this may be able to come up with perverse arguments without being able to articulate precisely why they are perverse. On the other hand, it can usually articulate at least some of the ways in which an argument is perverse, and that is probably enough to train someone to do the thing.
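A sketch of what that drill generator might look like. `call_llm` is a stand-in for whichever chat-completion client you actually use, and the prompts are illustrative, not tested:

```python
def call_llm(prompt: str) -> str:
    # Stand-in: wire this to your LLM client of choice.
    raise NotImplementedError

def make_drill(position: str) -> dict:
    """Generate one perverse-argument exercise plus an answer key."""
    argument = call_llm(
        "Write the most superficially persuasive argument you can for this "
        f"position, which is almost certainly false: {position}"
    )
    answer_key = call_llm(
        "List the specific flaws in the following argument -- hidden premises, "
        f"equivocations, appeals to fear, base-rate neglect:\n\n{argument}"
    )
    # The student sees only the argument; they write out their own diagnosis
    # before comparing it against the answer key.
    return {"position": position, "argument": argument, "answer_key": answer_key}
```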
I kind of treat them like lock-picking, or like solving any kind of puzzle. Examples you can't crack right now — you shelve them and come back to them later when you've done more practice. It's like cryptograms — classical ciphers, not Diffie-Hellman, obviously.
Another really important one — and one of the worst features of LessWrong — is the concept of intellectual charity. This is another exercise, related to the articulating-what-is-perverse-about-an-argument exercise: instead of just refuting whatever version of an argument is put before you, imagine the best version of the argument and then refute that. As an exercise, that's fine. But here's the thing.
When the sequences were written, I think the ratio of good-faith to bad-faith arguments was not quite as skewed as it is on the contemporary internet. Maybe it was closer to 50/50. Now it's skewed to something like one in five — maybe 20% of arguments are in good faith, and the rest are basically pure bad faith. And I want to further argue that confabulated counter-arguments are themselves functionally a form of bad faith.
In that context, if you present something about rationality and clear thinking and tell people their default discourse norm should be to assume good faith — to steelman the other person's argument — that's where it gets dangerous. To be clear: steelmanning someone's argument and refuting it, or even in the course of refuting it realizing "hmm, maybe this person has a point" — that's great. But you must not confuse that with the actual other person. I can steelman many, many possible positions. If I were to confuse those steelman versions with the actual people who hold the reprehensible versions of those positions — if I were to treat the sane, well-founded version I constructed in my head as the same kind of generator as the actual person — I am acting delusionally. I am deeply disconnected from the reality of the situation.
It is fine to steelman people's arguments as an intellectual exercise. It is even fine to perform a certain amount of charity in argument, because when the other person is clearly acting in bad faith, their bad faith just makes them look like an asshole, while your performance of the assumption of good faith makes you seem reasonable, thoughtful, and kind. But you must not confuse that performance with actually assuming good faith where it is not in evidence. If you're doing that, you are missing the plot entirely. I think LessWrongers as a rule make exactly that mistake, and the high rate of autism the sequences selects for hugely contributes to it. In fact, if you have a discourse norm requiring a certain amount of assumed good faith, that norm is itself a selection filter for autism — for people who genuinely lack the ability to recognize malice in others.
So I think another thing a Sequences 2 would need to spend a lot of time on is bad faith: how to recognize it. There are a lot of very effective ways to recognize bad faith — you don't just have to assume it. There is absolutely evidence you can look for. And importantly, if you can articulate what good-faith discourse looks like, what it looks like for someone to be thinking clearly, then you can in many cases define bad faith as the obstinate refusal to think clearly or to observe norms of discourse. Because that is relatively objective and measurable. You can't know someone's soul, but you can know whether they are performing epistemic best practices or not.
And part of this is rhetoric. The sequences implicitly treats rhetoric the way it treats classical rationalist content — Richard Feynman-type stuff: background knowledge the reader is presumably familiar with. But people probably mostly aren't reading that material anymore. And it's also culturally dated. The impetus to engage with it is just not there. So I think it would be very valuable to incorporate a lot more of the material that Yudkowsky assumed as background knowledge. The classic study of rhetoric is exactly the kind of content about thinking more clearly that a Sequences 2 should include — dissecting the way rhetoric makes people feel, what the structure of arguments is. The classic logos, ethos, pathos model: is it perfect? Probably not. But if you can analyze an argument in terms of those three components, and differentiate them, you're already probably doing better than most people at thinking clearly. Especially if you can make the mental motion of: "this is intended to make me feel this way, but the actual logical strength of the argument is pretty weak, and therefore I reject it anyway." If that's something you're capable of doing in your head, you are far ahead of most people.
And in general I think the study of rhetoric in the West is pretty dead as a discipline. I don't see popular content about it. I took a public speaking class in college and the curriculum in terms of rhetorical analysis was pretty thin — much more improv-focused than theory-focused. A lot of icebreakers and persuasive speaking exercises, very little "take this argument and dissect it using the three components of rhetoric." Part of this is also the loss of elocution as a discipline — the art of how to give a speech, how to formulate speech, how to literally move your body while giving a public address. This used to be standard high-school curriculum. Not anymore.
Even the word "rhetoric" is lost to us, because people will say "oh, that's just rhetoric" — as if to say: my argument is made of logos, ethos, and pathos, therefore it is hollow. Well... yes, my argument is made of those things. Thank you for pointing that out. It's kind of like the word "semantics" — people say "that's just semantics," meaning "that's just the meaning of words." Well, yes. I do mean things when I say things. Great observation. The idea that to master all of these dimensions — not just logical argument, not just emotional appeal, but to speak the truth in an inspiring way — seems completely lost as an ideal.
I think rhetoric studies were partially replaced by media studies, because rhetoric is the study of speeches, essays, text — the things you can say. And then suddenly you have television, print images, photography. Photography has framing effects where you can choose how to portray things, and to a naive observer a photograph is objective because it's a recording of material reality. Yet in practice you have many ways to present and frame even a supposedly objective depiction of reality such that it evokes different emotions, different assumptions, different perspectives. The traditional logos-ethos-pathos model doesn't work as well for analyzing something more abstract like the framing of a photograph. And so you get media analysis: Marshall McLuhan, Edward Bernays — who essentially invents modern propaganda. Even Hitler in Mein Kampf has a whole criticism of German propaganda during World War One — his argument, as I recall, was that portraying the enemy as comical is a mistake because the enemy is real and deadly and must be portrayed as brutish and barbaric. This judgment was probably heavily influenced by the contrast he observed between how the German government depicted the enemy versus how the German military was being depicted in the Western press — as brutal rapist murderers — following the invasion of Belgium. The juxtaposition would have been stark. But the point is: logos-ethos-pathos is not really the right framework for that kind of comparative propaganda analysis. And so you eventually progress — or rather, regress — from media studies to critical theory.
I will be completely honest that I don't really understand the appeal of critical theory. It's now the default way people analyze media and I think it's largely useless. But let me steelman it.
A reasonable critical theory media analysis might notice, say, that if you look at board game box art, you often see a depiction of children having fun with the game. Connect Four, for example, has a little blonde boy and a brunette girl, mouths open in excitement. And I don't think anyone has ever been that excited to play Connect Four. But also: both of those children are white. In general, in that style of 90s product advertising, the family depicted is almost universally white — or at least they were until fairly recently.
And I think what critical theory is really trying to get at, as distinct from media studies, is that these images are depicting latent space — the implicit associations people have. To demonstrate: you go to Midjourney and type in "American cowboy," and it gives you a white guy in a broad-brimmed hat, with a bandana, and an American flag in three out of four of the images. Let's try "average man in Washington state" versus "average man in Texas." Washington state: white guy with a beanie, beard, hoodie. Texas: also white, but younger-looking and more downtrodden — ripped shirt, faded dirty hoodie, ball cap instead of a beanie, open parking lot, gray sky overhead. Let's try removing "average" and just do "man in Texas." Now two of the four completions are of a Black man in denim with a cowboy hat. And the Washington state version has guys in flannel, a hoodie and ball cap against the Cascade Mountains, and another in what appears to be a Native headdress amalgamation.
Now, implied in "man in Texas" is apparently that the average Texan wears a ten-gallon cowboy hat. Which is, of course, not true. I just pulled up a street scene from Austin and people are wearing sunglasses, ball caps, t-shirts, shorts — the same mixture of clothing you'd see at a fair anywhere in the country. I can't find anyone in a ten-gallon hat. The idea that people in Austin wear ten-gallon hats and denim while people in Washington state wear yellow rain slickers and Native headdresses is a completely absurd notion. If you take it as an actual statement of belief, it's insane. Why would you ever believe this? And then people will say, "well, I don't believe that." And you can say: okay, but then why do you depict it that way?
I think that's a lot of what critical theory is trying to get at, as distinct from media studies. Critical theory is: media seems to reify these implicit associations and beliefs that people have, without ever really stating them as a thesis. They're just taken for granted until you start asking questions about them.
Now that I say it like that, it becomes clearer how this would evolve out of 2000s geek culture — out of TV Tropes asking "why does this villain do this stupid thing simply because the plot demands it," or RiffTrax snarking about B-movie plot holes. Critical theory can evolve naturally out of that mental motion: moving your questioning from "haha, isn't that plot silly" toward "haha, doesn't this plot stop being quite so funny when you realize that a lot of these stupid tropes are the way they are precisely because they flatter the biases of people who have money and power?" The realization that media fundamentally reflects the opinions of the powerful.
And now, actually, having public latent spaces through public models allows you to do this kind of critical theory analysis in a way that is more grounded and less vulnerable to bad faith than before. Because one of the problems with the previous discourse was that you could simply refuse to play along. If someone says "the archetypal image you have in your head when I say this phrase carries these associations," you can just say "no it doesn't." But if you have a public latent space inferred from a mountain of training data, in which these phrases have been associated with these visual concepts, you can't really deny your way out of it. The model's associations are derived from what actually exists in the training distribution. You can interrogate it. You can compare. I think much of the point of critical theory, done well, is to articulate and examine those latent archetypal concepts: to describe associations in latent space which represent implicit beliefs that can then be challenged once articulated and taken seriously as epistemic content.
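To make "you can interrogate it" concrete, here's a minimal sketch of one way to probe a public latent space, assuming the Hugging Face transformers CLIP wrapper. The probe phrases and attribute list are my own illustrative choices, not anything from the Midjourney experiment above:

```python
# Sketch: probe a public text latent space (CLIP) for implicit associations.
# The model name is a real public checkpoint; the probes and attributes
# below are illustrative placeholders.
from transformers import CLIPModel, CLIPTokenizer
import torch

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")

def embed(texts):
    # Encode phrases and L2-normalize so dot products are cosine similarities.
    inputs = tokenizer(texts, padding=True, return_tensors="pt")
    with torch.no_grad():
        feats = model.get_text_features(**inputs)
    return feats / feats.norm(dim=-1, keepdim=True)

probes = ["man in Texas", "man in Washington state"]
attributes = ["a cowboy hat", "a beanie and flannel", "a business suit"]

similarity = embed(probes) @ embed(attributes).T  # rows: probes, cols: attributes
for i, probe in enumerate(probes):
    for j, attr in enumerate(attributes):
        print(f"{probe!r} ~ {attr!r}: {similarity[i, j]:.3f}")
```

No single score means much on its own; the interesting move is comparing scores across probes, which is exactly the "you can compare" step, done on a shared artifact that neither party can simply deny.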
That said, critical theory is probably no longer a fully useful frame for a lot of the current media environment, because one of its foundational assumptions is that media is produced by powerful people with an agenda. Which, until fairly recently, was a completely reasonable assumption. Media (print, television, published books) required you to be well-connected to powerful people to make full use of it. But that assumption increasingly doesn't hold. And while critical theory is still a very good guiding principle for interpreting news stories, it's not clear to me that interpreting 4chan memes that way is useful at all.
A meme is almost a different medium. If your default analysis of some extremely racist, misogynistic 4chan meme is about implicit association or the biases of the creator, that doesn't seem like a very useful analysis. A lot of these are openly what they are, and that doesn't stop people from sharing them. One of the tricks they use is not implicit association but absurdism — cloaking a message in a format that's palatable to people who wouldn't normally share it. There's a great skit that went around recently where someone is asked to share an unpopular opinion, and the person says: "I believe that all women should be turned into cubes and then stacked, and whoever stacks them the most should be made king of all men." The interviewer is horrified. It cuts to: "I'm really sorry, I didn't realize that saying this would cause it to happen, and I very much promise if I become king of all men I will reverse this policy." The point being: you can state something insane in an absurd way, and there are all these layers of "it was just a joke, bro" that allow the idea to spread as humorous content, getting in front of people's eyes in contexts where it would otherwise just be ignored or censored. That's the mechanism.
So critical theory's foundational assumption about who produces media just doesn't really apply here. I suspect there's room for some new framework that addresses this. You could gesture toward whatever that new thing is in a hypothetical Sequences 2 — taking advantage of the fact that we do have public shared latent spaces now, which allow us to discuss associations and reified stereotypes in a way that is more grounded and less vulnerable to bad faith than the previous discourse.
I think rhetoric is fantastic. I think media studies are pretty good. I think critical theory is mostly garbage, but there are some gems in there. And you'll notice that the more abstracted you get from the basic speech-production act, the worse the average analysis becomes. I think this reflects people's difficulty in understanding what is actually happening at each level. It's easier to introspect on how speech production and comprehension work. It's harder to get a grip on media production processes. It's harder still to analyze those processes in relation to wealth, power, and the structure of society. Those are very difficult subjects to get right. The real gems do exist, but by the nature of the subject they're very hard to prove and very laborious to produce. It's a lot easier to just make something up or make a very tenuous connection than to do real solid scholarship.
But the point is: having rhetorical analysis, media analysis, and some critical theory in a hypothetical Sequences 2 — I think these things are always topical. Taking a topical thing and analyzing it in a way that demonstrates more general principles of analysis: that's how you create a classic. You start by talking about something specific, go a little wider, and people remember the going-a-little-wider part and keep bringing it up — especially if your analysis touches on counter-intuitive properties that people just keep wanting to cite. That will get more links to your corpus.
Now, thinking about the audience for that corpus. There's a lot of good research on radicalization — on what drives people to adopt radical beliefs — and one key driver is in fact finding a community of people who think in a certain way and wanting to be accepted by them. That itself could be a thing you cover in a hypothetical Sequences 2: here are all the ways your environment is probably pushing you not to think clearly. Is that really good for you?
This is actually the first barrier that would have to be dealt with: how do you even find an audience that wants to think clearly in the current context? Because it's not trivial right now.
Another problem is that the classic rationality content doesn't spend much time on status dynamics: the various ways people will act perversely with each other. Nowhere in the sequences, for example, is it mentioned that people will frequently perform belief in something they know to be false, precisely because doing so is a costly signal of loyalty. More importantly, they will demand that others perform the same belief, because openly professing something clearly false is a costly signal of group commitment. A lot of classic religion can probably be analyzed this way. It's not a complete analysis (a sociology of religion that's entirely about costly signaling of absurd beliefs would be very impoverished) but it's definitely some of it.
One of the more important things the sequences do is to say "beliefs should pay rent": beliefs should be graded on their ability to predict what is going to happen. This is very useful. It's up there with noticing confusion and noticing confabulation: noticing the distinction between the performance of belief and the things you actually expect to happen, and prioritizing beliefs for being predictively accurate rather than weird statements of affirmation or affiliation.
So there are three things. If you taught people to reliably notice confusion, confabulation, and — I'd need a one- or two-word description for this third one. Call it "performative belief." Noticing performative belief. If you can reliably notice just those three things, you're already fairly well on your way to thinking more clearly.
What else? Talking about how to recognize bad faith would take you into very long and valuable territory. One of the most useful things I think I could tell my younger self is that you can figure out when people are just saying something versus actually meaning it. People very frequently lie to themselves and others about what they want. But it's not that difficult to figure out what they actually want. Ask yourself: if this person wanted the things they claim to want, how would they be behaving? What would a reasonably motivated person who wants this thing be doing? Imagine that, and compare it to their actual behavior. Or, going the other direction: start from the behavior you observe and assume it's part of some coherent agent strategy to get the person what they want, rather than random errors or incompetence. What is it they want, and how does that strategy get it for them? If you habitually ask this about people's behavior, people will become much clearer to you.
And again, like noticing confusion: if it's too confusing, shelve it and come back later. Another classic rationalist observation: past behavior predicts future behavior pretty well. Past performance tends to predict future performance. You would think this should be obvious, but what's not obvious is how well reference-class forecasting works even with fairly small sample pools. Even if something has only happened once before, it's often useful to look at that reference class anyway, even when there superficially doesn't seem to be much similarity to the current situation, because hidden symmetries frequently do apply. Way more often than you would naively predict. Far more things turn out to be the same than different, even across cultural change.
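As a toy illustration of small-sample reference classes (my example, not anything from the sequences): Laplace's rule of succession quantifies how much even a single prior occurrence should move your estimate, which is exactly the regime where people wrongly throw the reference class away.

```python
# Toy sketch: Laplace's rule of succession as small-sample reference-class
# forecasting. With s prior occurrences out of n prior trials, the smoothed
# estimate of the event happening next time is (s + 1) / (n + 2).

def rule_of_succession(successes: int, trials: int) -> float:
    """Posterior mean of a Bernoulli rate under a uniform prior."""
    return (successes + 1) / (trials + 2)

print(rule_of_succession(0, 0))  # no reference class at all -> 0.5
print(rule_of_succession(1, 1))  # happened the one time it could -> ~0.67
print(rule_of_succession(4, 5))  # small but consistent pool -> ~0.71
```

Even a reference class of one moves you meaningfully off the prior, without the overconfidence of treating one observation as a certainty.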
On the subject of noticing confusion: I'm reminded that one time I had this great idea: you should just write down every time you're confused, because often the same confusions will come up over and over. Write down the things you're confused by, along with the evidence or observations associated with the confusion and your reasoning trace about it, and put it all into a "confusion journal" that you review. Once the journal got large enough, you could use semantic embeddings like BERT to index it and pull up related entries, and I bet that would make it easier to do comparative analysis and figure out the actual underlying causes of the things that are confusing you.
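A minimal sketch of what that retrieval step might look like, assuming the sentence-transformers library; the model name and the journal entries are illustrative placeholders:

```python
# Sketch: embed confusion-journal entries and retrieve the ones most
# related to a new confusion, so recurring confusions surface together.
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")

journal = [
    "Why did the meeting go badly even though everyone agreed beforehand?",
    "Confused: the benchmark got slower after I removed the extra copy.",
    "Why do people keep citing that paper after it failed to replicate?",
]

# Normalized embeddings make dot products equal to cosine similarity.
embeddings = model.encode(journal, normalize_embeddings=True)

def related_entries(new_entry: str, k: int = 2):
    """Return the k journal entries most similar to a new confusion."""
    query = model.encode([new_entry], normalize_embeddings=True)[0]
    scores = embeddings @ query
    return [(journal[i], float(scores[i])) for i in np.argsort(-scores)[:k]]

for entry, score in related_entries("Why did the team agree and then not follow through?"):
    print(f"{score:.2f}  {entry}")
```

The point of the embedding step is just to make old confusions cheap to re-surface; the comparative analysis still has to be done by you.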
So okay, here's a good example. I've just had this idea: "you should have a confusion journal." But am I going to go start a confusion journal? Maybe, maybe not. Where would I even put it? And I think that's exactly what you could address if you were writing something like this — just tell people literally: here's how you would start the thing, here's where you would put it, here's how you would make sure to do it when you are confused. You need to recognize your confusion, and then write down the confusions in this journal, both to make it more salient that you are recognizing your confusion and to keep a log of the confusions themselves and their resolutions, making it easier to come back and revisit them later when you gain more knowledge.
Not all things are equally confusing. For the things that confuse you and for which you've tried to form an answer, if you later gain access to new evidence or a new way of looking at them and want to compare them against other confusing cases, your average theory strength would be boosted by having immediate recall of everything that has ever confused you. "This confused me" is a very useful index to have: this confused me, and then my confusion was resolved; or this confused me, and then whatever. Just reliably writing down confusions, processing them, returning to them: that seems like an example of a habit that would be highly valuable.
I'm reminded of Darwin. I believe Darwin did basically this exact thing in the run-up to creating his theory of evolution. He kept a running journal of all the things that confused him — that seemed not well explained by current theory. He would then prioritize observations that seemed related to those confusions. And that was how he wound up articulating the version of the theory of evolution that was finally compelling enough to get people to accept it durably.
So that would be an example of a habit you could have.
Alright, I think that's enough for now.