Self-Transcript for Scraping Bits Podcast #107: Proof Writing, Discovering Math, Expert Systems, Learning Math Like a Language

by Justin Skycak (@justinskycak) on

Why aspiring math majors need to come into university with proof-writing skills. My own journey into learning math. Math as a gigantic tree of knowledge with a trunk that is tall relative to other subjects, but short relative to the length of its branches. The experience of reaching the edge of a subfield (the end of a branch): as the branch gets thinner, the learning resources get sh*ttier, and making further progress feels like trudging through tar (so you have to find an area where you just love the tar). How to fall in love with a subject. How to get started with a hard subject that you don't love: start with small, easy things and continually compound the volume of work until you're making serious progress. How to maintain focus and avoid distractions. The characteristics of a math prodigy that I've tutored/mentored for 6 years and the extent to which these characteristics can be replicated. How Math Academy's AI system works at a high level, the story behind how/why we created it, and the stages in its evolution into what it is now. How Math Academy's AI is different from today's conventional AI approach: expert systems, not machine learning. How to "train" an expert system by observing and rectifying its shortcomings. How to think about spaced repetition in hierarchical bodies of knowledge where partial repetition credit trickles down through the hierarchy and different topics move through the spaced repetition process at different speeds based on student performance and topic difficulty. Areas for improvement in how Math Academy can help learners get back on the workout wagon after falling off. Why you need to be fully automatic on your times tables, but you don't need to know how to do three-digit by three-digit multiplication in your head. Analogy between building fluency in math and languages. #1 piece of advice for aspiring math majors.

Cross-posted from here.

Link to Podcast

The transcript below is provided with the following caveats:

  1. There may be occasional typos and light rephrasings. Typos can be introduced by the process of converting audio to a raw word-for-word transcript, and light rephrasings can be introduced by the process of smoothing out natural speech patterns to be more readable via text.
  2. The transcript has been filtered to include my responses only. I do not wish to infringe on another speaker's content or quote them with the possibility of occasional typos and light rephrasings.
∗     ∗     ∗

Justin: Yeah, sure, happy to be on again. Great conversation last time. My name’s Justin Skycak, as you mentioned. I’m the Chief Quant, Director of Analytics at mathacademy.com. I built all of the quantitative software that drives the Math Academy learning engine for an automated online learning system.

Justin: Yeah, I would agree with comparing it to a program. When I taught proofs in the past, I’d introduce them as a program that runs in the mind of someone with mathematical knowledge. That makes it difficult for a novice to debug proofs sometimes.

When you write a program, the computer compiles it, runs it, and spits out errors. But when you write a proof, the “compiler” is a more advanced mathematician, like a teacher, who reads it. If you’re a novice, it can be difficult to debug your own proof because you need someone with more advanced math knowledge to parse it and be your compiler.

Justin: Totally. We see that in coding all the time. There’s a joke that the best way to understand what’s going on in someone’s head is to read their code.

Justin: One interesting thing about proof writing is that, at first glance, some proofs don’t have many symbols in them. They might look approachable, but then you start reading, and your mind gets bent and demolished by all the logic in the proof.

If you haven’t trained yourself in proof writing and you just open a math textbook to read proofs without many symbols, it can still feel like a rollercoaster. Proofs are like a coding language—a very esoteric one. They often use language in subtle, specific ways with precise mathematical meaning.

Sometimes, these same forms of language are used more loosely in conversation. If you read a proof with conversational language in mind, it can be confusing. This is even more justification for learning proofs instead of jumping into the deep end and hoping to swim.

Justin: Totally agree. If you’re trying to read math and keep getting interrupted, wondering, “What does this symbol mean?” or “What does this phrasing mean?” there’s no way you’ll be able to see the forest for the trees or understand the whole proof. It’s just like reading. Someone who doesn’t know the definitions of various words, or who is struggling to sound them out, can’t read a Shakespeare play because they’re stuck on individual words.

Justin: That’s something I’ve seen over and over. People come out of high school solid on the mechanics of calculus, thinking they’re all set for college. They’re excited, maybe planning to major in math, and they get into a serious university, signing up for the hardest math courses.

They were at the top of their class in high school, did well on AP or IB calculus exams, or other college-credit calculus tests. Then they get to university and just get their ass handed to them. There’s a level of mathematical maturity often assumed for students coming into a math major, especially those pushing hard to take advanced classes.

You might find yourself in a class where proof writing ability is baked into the prerequisites, but you’ve never actually gone through it. You don’t know what a proof is, you’re not fluent with unions, intersections, or other proof mechanics, and you feel like you’re drowning.

Last summer, I talked to someone who was in this situation at the University of Chicago. They entered as an aspiring math major but got their ass handed to them because they lacked proof-writing foundations. Their classmates often came in with a wealth of knowledge, having done extra reading outside of the normal school curriculum.

These classmates not only knew calculus but also topics like groups, linear algebra, and multivariable calculus—very advanced material. University courses are often built for that kind of student. They don’t slow down for students who still need to learn what a union or intersection symbol is.

This person ended up switching majors to physics. It wasn’t because they enjoyed physics more but because the playing field felt more level. Beginning physics courses required less background knowledge. Many physics derivations are more mechanical—you start with a formula and rearrange symbols to get to the result.

In pure math, proofs often involve not just symbolic manipulations but logical rules of inference, which can get complex. Their reaction was, “I wish someone had told me in high school that this game was being played, and proof knowledge is part of it.” Just because you excel in high school math doesn’t mean you’ll be at the top of your class in college. The people sitting next to you often have additional background knowledge, largely from proof writing.

It’s something super important to learn before university. Graduates from a serious math program will have proof-writing experience—or at least they really should. Many people get filtered out of math majors simply because they didn’t know proof writing beforehand and found it hard to catch up on their own.

Justin: Right, those set manipulations you were talking about—they show up in pure math, applied math, and even other fields. It doesn’t matter; they’re just part of it.

Justin: Actually, before computer science, I came from a math background. In high school, I was super into math. You know how I described those students in high school who are crazy about math—learning linear algebra, multivariable calculus, proof writing—going beyond calculus? That was me. I self-studied about three-quarters of an undergrad math major during high school using MIT OpenCourseWare.

When I got to college, I majored in math. At the same time, I started getting into coding more seriously and began working in data science. About a year after graduating, I found Math Academy. That’s when I got involved.

Initially, in 2018, Math Academy was experimenting with videos. Our first concept of guided lessons involved video content. I made hundreds of educational videos and wrote a lot of content. By 2019, I started working with Jason on the code side, building the automated system.

For me, the transition was easier compared to others who start in computer science and want to move into math. I came from a math background and transitioned into computer science.

Justin: It’s a great question. Initially, a highly structured curriculum is vital. Math Academy covers everything from fourth grade up to university-level math. But what do you do when you reach the other end? It all comes down to what you’re trying to accomplish.

Math is a very branching subject. As you work through the foundations—set manipulation, proof writing, calculus, algebra, and so on—that core knowledge forms the trunk of the tree. Once you have a solid undergraduate math education, you face an existential question: what’s next?

At this point, the trunk has branched out so much that, even if you aim to be the world’s greatest mathematician, you can’t be the best in every single area. There are too many branches. You have to start deciding what you actually want to do.

This can be difficult for people who followed the standard curriculum without considering what interests them beyond it or why they’re learning math. You face a fork in the road—really more like 20 different forks—and have to make a decision.

I recommend building up your foundations while keeping in mind your goals. What is your goal in learning math? What are you trying to achieve? Is there a specific field in computer science you want to explore, such as building autonomous agents? Do you want to be an academic mathematician working on theorems about prime numbers? You need to decide what you want your future to look like.

Once you know your goal, you can start picking up books and working on projects in that direction.

Justin: I’d say loosely, yes. Loosely, that’s correct, but even with the trunk of knowledge, there are books and resources that may be too specialized. It’s still a gradual process. You often have to scaffold yourself into a subfield, starting with some introductory material.

Justin: Right. Having the mathematical foundations puts you in a place where you can read introductory material in a niche field of math.

Justin: Exactly. The tricky part is that, at the lower levels of math—elementary school through calculus, even the university-level math covered in Math Academy—it’s scaffolded really well. You don’t have to worry about what resources to use, what order to follow, or how to review. Everything is structured for you.

You just get on your computer, do your XP, solve the problems we tell you to solve, and you’re all set. But when you finish all the math in the system and it’s time to specialize in a subfield, you have to start managing your own learning process.

That’s a skill in itself. It’s not about math knowledge anymore; it’s about keeping yourself accountable, solving problems, doing projects, and scaffolding them in difficulty. You can’t start with something too hard—you need something at the right level to build from.

It gets more challenging as you approach the edge of a subfield. The resources available to teach yourself get worse because, at the cutting edge, people are just publishing research papers. That’s what you have to learn from. It takes time for cutting-edge knowledge to become well-scaffolded and accessible.

Justin: Exactly. Thinner, sparser, and shittier.

Justin: Right. In any field, once you reach the edge, it becomes about creative production. That’s related to acquiring new skills, but it’s very different because you’re creating new things.

Justin: It’s interesting. The first time this happened to me was when I learned a lot of math and got really into a subfield of machine learning, particularly neural nets. This was back in 2014, and convolutional neural nets were pretty much the cutting edge of machine learning. Now there are tons of courses on how to do convolutional nets, and it’s very well-structured. If you want to learn about them today, it’s easy to do so.

But just ten years ago, the resources for learning that kind of stuff were so sparse. I got really interested in machine learning and neural nets and hit the edge. It felt like trudging through tar. It was so hard. Before that, I had been learning from MIT OpenCourseWare (calculus, differential equations, probability, statistics, and other topics), where a lot of information was available online. Those resources were well-organized enough that you could make good progress if you were motivated. After that, it felt like progress slowed to a crawl.

But once I hit the edge, it became really hard to find textbooks or structured material. It was like walking through tar. I had an existential crisis, wondering if this was really what I was interested in. It turned out the answer was no. I got tired of trudging through that tar. I just wasn’t committed enough to the neural net stuff.

Justin: Interest becomes very important because that’s what motivates you to keep going despite the tar. You have to say, “Okay, I’m trudging through tar, but I love this thing. I love the tar.” It has to be worth it. It’s not just about moving fast and acquiring knowledge—it’s about really caring about the subject.

Justin: I would totally agree that often, to really fall in love with a subject and develop passion, it’s not love at first sight. You have to spend time with it and develop a habit of engaging with it.

How do you get started with something when you don’t have that passion at the beginning? Especially when it’s hard and taxing on top of that? It’s asking a lot. How do you get yourself to start when the reward is unknown, it’s very hard, and you don’t love it at all?

The answer, I think, is to start small. To use math terminology, you do an epsilon amount of it. Find ways to reduce the barriers. If it’s very hard, make it easier. You don’t have to do it for a long time at first.

For example, with fitness, if you haven’t been exercising for months and have been sitting all day, day one isn’t about going to the gym or running five miles. It’s about going for a walk—a 20-minute walk, 10 minutes, or even five minutes. If even five minutes feels too hard, go for a one-minute walk. If that’s too much, walk to the end of the room and back. Do 10 jumping jacks if that feels manageable. Whatever the lowest bar is, set it there.

Then, each day, raise the bar slightly. For example, if walking to the end of the room and back is your starting point, do it twice the next day. Then four times the day after. Soon, you’re walking for a few minutes. After the first week, you might find yourself thinking, “This is boring. Why not walk outside?” Then, you’re walking around the block.

A week later, you’ve built up to a 10-minute walk. Eventually, you might think, “This is getting boring—why not jog instead?” Now you’re jogging a couple of minutes, then scaling it up until you’re running for 10 or 20 minutes, covering miles. If you tire of running, maybe you switch to push-ups, pull-ups (or hangs if you can’t do full pull-ups), or other calisthenics.

This approach works for anything, not just fitness. For math, for instance, if you’re trying to get started, don’t jump straight to problems at the edge of your ability. Solve a few problems in areas you already know. If it’s arithmetic, start with multiplication or times tables, even fourth-grade material.

It might feel silly at first, but you’ll get into a rhythm. Once that feels easy, you’ll naturally do more next time. Over time, this compounding effect leads to exponential growth in what you can accomplish. You trick yourself into starting with something easy, develop comfort with it, and then build up to serious work.

Justin: That’s a really good point. It’s not just about how easy something is in an absolute sense—it’s also about what it’s competing against. What other actions are competing for your attention?

You can try to make the task you’re trying to do easier, but if you’re in an environment where it’s even easier to do something you know you shouldn’t, you’re sabotaging yourself. Part of it is setting up your environment to minimize your choices or manipulate yourself into taking the right action. It’s like your past self is manipulating your present self.

Justin: That reminds me of how social study clubs often work well for people. For instance, you join an online Discord server where they host study sessions. Everyone does math for 20 minutes, chats for five minutes, then goes back to math for another 20 minutes.

There’s a social norm in place that funnels you into productive behavior, even if it’s not something you would have done on your own.

Justin: That’s a good question. Unfortunately, the answer comes down to individual differences. The kid I’ve been mentoring has some traits that are advantageous for math. He’s loved math from an early age and is super interested in it.

The way I met him was back when I was doing more tutoring, about six years ago. His parents were looking for a tutor to keep him occupied for a few hours each week. He was constantly doing math experiments and mini-projects on the properties of numbers. He would talk non-stop to his parents about math, and they felt someone needed to engage with him mathematically.

His parents described him as being “amped up on math,” like how some kids get with sugar or dessert. When I started working with him, I’d say, “That’s a cool experiment. Did you know this relates to a property in arithmetic?” I’d give him problems to explore further, which naturally led to algebra and calculus. Over time, we covered a wide range of math fields.

He benefited from being very intrinsically interested in math from a young age. Neither he nor I know exactly why he fell in love with it. That’s the big question—why did he love math so much? I’m not sure if that process can be replicated in someone who doesn’t already love the subject. Some people fall in love with math later, but the process that led this kid to immerse himself so deeply might come down to how his brain was “initialized,” so to speak.

Another advantage he has is cognitive differences. I know some people feel uncomfortable discussing things like IQ, but differences in working memory capacity, generalization ability, and forgetting rates exist. These can also be domain-specific.

Justin: Exactly. Not everyone’s brain is the same. People have advantages, just like some have physical advantages that make them better suited for certain sports—different phenotypes. It’s the same cognitively.

This kid has very high generalization ability, particularly in mathematical and quantitative reasoning. His forgetting rate is shockingly low. Working with him feels different compared to other bright students who aren’t at the same prodigy level.

When you go through problems or introduce new material to him, it’s like reminding someone who’s just rusty on something they’ve learned before. It’s the strangest thing. Imagine you’ve forgotten some math from high school, but when you review it in a textbook, you say, “Oh, yeah, I remember this.” Now, imagine you’ve never seen that material before, yet your internalized and generalized knowledge base integrates it as if you’re just refreshing an old memory. That’s what it’s like with this kid.

This speeds up knowledge acquisition. Typically, when working with a student, you might cover material in a session, but if you wait a week or two, much of that knowledge has decayed. You have to reteach it. It goes faster the second time, but you still have to repeat the process.

With really bright students, they retain more, so less time is spent reteaching. But with this student, his memory fades so slowly that it’s almost negligible. After a week or two, you can pick up right where you left off with almost no friction.

All these factors compound, accelerating his progress through math.

Justin: It’s a lucky roll combined with leaning into his interests and a lot of hard work. None of this would happen without the hard work. But it’s a lucky roll where the hard work feels natural because he’s built the habit, he’s good at it, and he’s so in love with it that it’s part of his identity. This is who he is.

There are helpful factors that can definitely be replicated, but there are two things that are hard to replicate if you take any random person and try to get them to the same level. First, how do you instill a love of mathematics that early? Second, they likely don’t have the same advantageous cognitive differences for math.

The hard truth is that most math learners experience a lot of friction during the learning process. There are ways to reduce that friction—developing more interest in the material, doing review, mastery-based learning, spaced repetition, and developing a habit. These approaches can accelerate math abilities well beyond the status quo. But to reach that prodigy level, I think it comes down to finding an area where you have a real competitive advantage.

Justin: I should say that even if a person may not reach the level of prodigiousness of someone else in the field, paying attention to what fuels great people in that field—consistent, efficient practice habits, not messing around during practice, and a deep love for what they do—can still get you very far. If you replicate those things in your life, you can go much further than anyone would expect.

If you don’t have any other competitive advantages, such as cognitive differences, the question remains: how good can you get? Probably not as good as the best in the field, but you’ll still get pretty damn good.

Justin: Right. It’s the saying, “Hard work beats talent when talent doesn’t work hard.”

Justin: Sure. The ultimate goal of the system is to get you learning material in the most efficient order possible. Some things are obvious, like doing prerequisites before more difficult topics, but there are other things that are less obvious, like how often you should review material. That depends on how well you did on it initially, how strong your knowledge is, and how much review you’re getting from other tasks.

For example, if you’re solving a linear equation, like 2x + 3 = 4, you’re implicitly reviewing something simpler, like 2x = 4. A two-step equation encompasses a one-step equation as a sub-skill. You have to take that into account when deciding how much you need to review. If you do a two-step equation lesson, that counts as your review for one-step equations because you’re reinforcing that sub-skill.

It also comes down to tracking nuances with review, which you have to monitor through the knowledge graph and the prerequisite structure of mathematical topics. You also need to pick tasks that leverage all that information to ensure an efficient path through the material.

So, not only do you track that information, but you make active decisions. For example, this student has five topics they need to review soon because they are in their spaced repetition cycle. These reviews are coming up, and there’s a lesson they know the prerequisites for, but they haven’t done it yet. However, this lesson encompasses those five reviews they need, or maybe it covers three of them fully and the other two partially. This allows those reviews to be pushed into the future, optimizing time.

The goal of the system is to minimize the amount of work the student has to do, while ensuring they receive all the spaced reviews they need and are in a position to successfully acquire new knowledge. It’s an optimization problem—minimizing effort while adhering to the constraints of review cycles and successful learning. Essentially, it solves the math problem of optimizing the learning experience.
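To make the selection idea concrete, here is a minimal Python sketch, not Math Academy's actual code. It assumes a hypothetical `covers` table that records what fraction of each due review a lesson implicitly practices, and it simply picks the unlocked lesson that knocks out the most due review. The topic names, the weights, and the `pick_lesson` helper are all invented for illustration.

```python
# Illustrative sketch only: topic names, weights, and helper are hypothetical.

def pick_lesson(unlocked_lessons, due_reviews, covers):
    """Return the unlocked lesson that implicitly covers the most due review.

    covers[lesson][topic] is the fraction (0 to 1) of `topic` that gets
    practiced as a sub-skill while doing `lesson`.
    """
    def coverage(lesson):
        lesson_covers = covers.get(lesson, {})
        return sum(lesson_covers.get(topic, 0.0) for topic in due_reviews)

    return max(unlocked_lessons, key=coverage)


due_reviews = ["one_step_equations", "simplifying_fractions", "multiplying_fractions"]
covers = {
    "solving_proportions": {"one_step_equations": 1.0, "simplifying_fractions": 1.0},
    "area_of_triangles": {"multiplying_fractions": 0.5},
}
best = pick_lesson(["solving_proportions", "area_of_triangles"], due_reviews, covers)
print(best)  # solving_proportions
```

A real scheduler would also have to weigh things like how overdue each review is and how likely the student is to pass the new lesson; this sketch only shows the "one task covers several reviews" part.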

Justin: This started back in the summer of 2019. Jason, the founder, and his wife, Sandy, founded Math Academy, and I started working with Jason that summer. He pulled me into the coding side because he knew I had experience in data science, and this seemed like a data science problem or at least a quantitative programming problem.

Previously, I had been doing content-related work for Math Academy, but Jason had this idea that Math Academy needed to become an automated learning system. The goal was to emulate the behaviors of an expert tutor. It didn’t need to emulate every single behavior, like asking how the student’s day was, but there were specific decisions made during the learning process that made a tutor so much more effective than a teacher who had to teach a class of 30 students with different knowledge profiles.

How do we replicate those decisions made on an individual basis? Jason had done a lot of reading about spaced repetition, mastery learning, Bloom’s two-sigma problem, and other techniques. At the time, I wasn’t as familiar with these learning techniques, but I had come across similar principles during my self-study, like knowing I needed to review things occasionally and not jumping straight to the end of the textbook. I didn’t have the specific terminology for it, but I understood the concepts.

He pulled me in that summer, and that’s when we started building the system. Initially, it was very simple—a raw spaced repetition system. There was no trickling repetition credit through the knowledge graph, and we didn’t have encompassing prerequisites. It was just a basic system: you either passed a lesson or you failed it. If you passed, it was added to your review queue for spaced repetition, and you could do higher-level lessons as long as you passed the prerequisites. There were no diagnostics, and we were starting from scratch.
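A bare-bones system of the kind described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tiny prerequisite graph and the interval schedule are hypothetical, not the values the early system used.

```python
# Illustrative sketch only: the prerequisite graph and intervals are hypothetical.

PREREQS = {
    "limits": [],
    "derivatives": ["limits"],
    "chain_rule": ["derivatives"],
}
REVIEW_INTERVALS_DAYS = [1, 3, 7, 14, 30]  # assumed schedule, for illustration


def available_lessons(passed):
    """Lessons not yet passed whose prerequisites have all been passed."""
    return [
        topic for topic, prereqs in PREREQS.items()
        if topic not in passed and all(p in passed for p in prereqs)
    ]


def next_review_day(today, repetitions_so_far):
    """Day the next review falls on, given how many reviews were passed so far."""
    index = min(repetitions_so_far, len(REVIEW_INTERVALS_DAYS) - 1)
    return today + REVIEW_INTERVALS_DAYS[index]


print(available_lessons({"limits"}))  # ['derivatives']
print(next_review_day(0, 0))          # 1
print(next_review_day(10, 2))         # 17
```

Note what is missing, matching the description above: every topic is reviewed on its own schedule, with no credit flowing between related topics.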

We put one student on the system, an original student from Math Academy’s in-school program who had moved out of state and was sad about not being able to take the in-person classes anymore. We asked if she wanted to try out the automated system. She was an eighth grader taking Calculus BC on our system.

She was our initial test student. She worked through it, and it was rough at first. Occasionally, things would break, and she was getting more review than was necessary, so she did more work than needed. However, it was still very efficient compared to a typical in-person class with 30 other students. At least, it was doing mastery learning and spaced repetition. It wasn’t perfect with spaced repetition, but it was doing things that wouldn’t have been done well otherwise.

She came out of that year with a 5 on the exam—no teacher, just the automated system. That’s when we realized we could lean into this more. We could improve the efficiency of spaced repetition by tracking the flow of credit through the graph. We began focusing more seriously on that in the following year.

The 2019-2020 school year coincided with the start of the pandemic. Right as the student was getting ready for the AP Calc test, the pandemic hit. This forced our hand. We had already been thinking about adding more optimizations for efficient learning, but now we had to figure out how to continue teaching students remotely.

The natural answer was to scale up our system so it could support our in-person classes, which were now going remote due to the pandemic. This was a huge undertaking. I actually moved in with Jason and Sandy during the pandemic and worked with them every day, including weekends, to get the system ready for the fall. We needed it to run the in-person classes remotely.

We had to make the system more efficient and more robust. Previously, we had been storing spaced repetition records as separate database records, which worked fine until we needed to make changes to the model. Then, everything had to be recalculated, and it became unwieldy. That’s when Jason told me to create a model class that dynamically calculates the student’s knowledge profile on the spot, rather than storing everything in the database every time.
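The design change described here, deriving the knowledge profile on demand instead of persisting it, might look roughly like the following Python sketch. The `Attempt` record, the `StudentModel` class, and the scoring rule are hypothetical stand-ins chosen for illustration.

```python
# Illustrative sketch only: the event format and scoring rule are hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class Attempt:
    day: int
    topic: str
    passed: bool


class StudentModel:
    """Derives the knowledge profile from raw attempts each time it is asked."""

    def __init__(self, attempts):
        self.attempts = sorted(attempts, key=lambda a: a.day)

    def knowledge_profile(self):
        # Replay the history from scratch. Nothing derived is stored, so a
        # change to this rule takes effect without migrating any records.
        repetitions = {}
        for attempt in self.attempts:
            current = repetitions.get(attempt.topic, 0)
            if attempt.passed:
                repetitions[attempt.topic] = current + 1
            else:
                repetitions[attempt.topic] = max(current - 1, 0)
        return repetitions


history = [
    Attempt(day=1, topic="limits", passed=True),
    Attempt(day=3, topic="limits", passed=True),
    Attempt(day=4, topic="derivatives", passed=False),
    Attempt(day=5, topic="derivatives", passed=True),
]
print(StudentModel(history).knowledge_profile())  # {'limits': 2, 'derivatives': 1}
```

The trade-off is recomputation cost on every request in exchange for never having stale derived records when the model changes.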

I started taking over the modeling side, and by fall 2020, the automated system was ready for the in-person classes, which were now going fully remote. What ended up happening was that the automated system worked so well that students were learning even more than before. During the fully remote instruction, students using the automated system covered more material and achieved a greater degree of mastery than they did in the in-person classes.

Justin: Actually, the AI we use is an expert system. Math Academy is a company where we’re domain experts in math and in teaching math. Everything we want the model to do, we pull out of our heads. We think, “This is what a good tutor would do. This is what an expert would do, and this is the information they would use during their decision-making process.” Even something like the knowledge graph, with its prerequisites and encompassing records, is all manually set by us.

When you do that, you don’t have to train a model to figure it out. You just give it the parameters it needs because you already have those parameters in your head. This depends on the problem being human-scale. For something like a computer vision problem, where you have to detect objects in images, you can’t manually set all the parameters. It would require immense domain expertise. In our case, the amount of information that needs to be encoded into the model is large, but it can still be manually encoded in this knowledge structure.

That’s how we got around the need for model training. We don’t have to train the model or have it learn information. We just set up the model’s structure and initialize it with baseline, reasonable parameters that we know from experience will work. We observe the behavior, make sure it’s going well, and calibrate those parameters over time as we get data.

So, we don’t actually do machine learning in the conventional sense. We don’t use large language models or anything like that. It’s more like doing physics—having some intuition about how parameters should be, then conducting experiments to refine that intuition. The specific type of AI we’re using, an expert system, was more popular in the 80s and 90s. Today, AI is generally associated with large language models, neural nets, and machine learning, but there are other subfields of AI, and we’re using one of those.

Justin: For us, it started with our one test student who went through the course and did well. Then we just scaled up with more students, paying very close attention to the tasks they were being served, how well they were doing, and using quizzes. We had the AP Calc BC exam as our external validation metric.

I also created various validations to run against the database to ensure things were logically consistent. For example, a student shouldn’t be served a lesson again if they’ve already passed it, unless they failed it at some later point. We check these kinds of logical conditions, and I have hundreds of validations to make sure everything is on track.
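One such consistency check could be sketched in Python as follows. This is a simplified illustration, not one of the actual validations: it assumes a hypothetical chronological log of lesson attempts and treats a repeat as legitimate only when the most recent attempt at that lesson was a failure.

```python
# Illustrative sketch only: the task-log format is hypothetical.

def find_repeated_lessons(task_log):
    """Return (student, topic) pairs served a lesson they had already passed.

    task_log is a chronological list of (student, topic, passed) lesson
    attempts. Serving a lesson again is treated as legitimate only if the
    most recent attempt at that lesson was a failure.
    """
    last_result = {}
    violations = []
    for student, topic, passed in task_log:
        key = (student, topic)
        if last_result.get(key) is True:
            violations.append(key)
        last_result[key] = passed
    return violations


task_log = [
    ("ana", "matrices", True),
    ("ana", "matrices", True),   # served again after a pass: flagged
    ("ben", "matrices", False),
    ("ben", "matrices", True),   # retry after a failure: fine
]
print(find_repeated_lessons(task_log))  # [('ana', 'matrices')]
```

Checks like this do not say how to fix the problem; they only surface records that contradict a rule the system is supposed to obey.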

Justin: It’s various things like that. To give a concrete example of how we refine the model, we were always looking at student tasks and hearing feedback from students and teachers. One thing we noticed early on, I think it was in the 2020-2021 school year, was that we had a student doing pre-calculus level work, I want to say Integrated Math 3, or maybe Algebra 1. They had a task that involved matrices, learning the elements of a matrix and how to multiply matrices.

We noticed that they kept getting reviews on identifying the elements of a matrix, even after they had learned how to multiply matrices. We were looking at this student’s task and thought, “This is dumb.” Most things were working as intended, but there were too many easy reviews. A tutor wouldn’t go through something like that. After learning how to manipulate matrices, a tutor wouldn’t ask, “Can you point to an element of this matrix?” The student already knows that. They’ve layered so much knowledge on top of it.

That was the moment when we realized we needed encompassing topics in the database. We needed to know what topics encompass others, and the model had to reason about that. The spaced repetition needed to trickle down the graph. This realization came from just looking at the student’s tasks. Jason pointed it out to me. He said, “This kid is wasting time on these reviews. Spending five minutes on this review is a waste of time.”

It came down to what you were saying earlier: don’t do the bad things. If you want to be good, to perform well, just don’t do the bad things. One of those bad things is giving students silly reviews that are too easy, making them ask, “Why am I doing this?” Then they ace it and just keep repeating it.

Justin: It’s a little similar to that, but more fine-grained. The way we encode these encompassings is that an encompassing is a relationship between two topics. One particular topic encompasses another particular topic. But yeah, it’s the same idea. If there is an encompassing, you can knock out the review. We also do partial encompassing. Part of the topic is encompassed, but not the whole thing. In that case, it’s counted as maybe 25% of an encompassing. It doesn’t fully knock out the review, but it postpones it and puts you a little further in the spaced repetition process.
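The trickle-down idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the graph contents, the fractions, the `trickle_down` function, and the rule for combining multiple paths (keep the largest credit) are all hypothetical, not the actual implementation.

```python
# Hypothetical encompassing graph: topic -> {encompassed topic: fraction of credit}.
# A fraction of 1.0 is a full encompassing; 0.25 is a partial one.
# The topics and numbers here are illustrative only.
ENCOMPASSINGS = {
    "multiplying matrices": {
        "elements of a matrix": 1.0,
        "scalar multiplication of matrices": 0.25,
    },
}

def trickle_down(topic, credit=1.0, graph=ENCOMPASSINGS, min_credit=0.01):
    """Return {topic: repetition credit} implied by one repetition on `topic`."""
    credits = {topic: credit}
    stack = [(topic, credit)]
    while stack:
        current, current_credit = stack.pop()
        for child, fraction in graph.get(current, {}).items():
            child_credit = current_credit * fraction
            # Assumption: when a topic is reachable by several paths,
            # keep the largest credit rather than summing.
            if child_credit >= min_credit and child_credit > credits.get(child, 0.0):
                credits[child] = child_credit
                stack.append((child, child_credit))
    return credits
```

Under these assumptions, a full repetition on multiplying matrices would give a full repetition to identifying the elements of a matrix (knocking out that review) and a quarter of a repetition to the partially encompassed topic (postponing its review).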

Justin: The review questions are actually selected randomly from the pool of questions at the end of the lesson—the harder questions that are more representative of the real thing the lesson is scaffolding up to. If you pass the review but don’t get everything right, it still puts you further in the spaced repetition process. It counts as a repetition, but you don’t get as much credit as if you had aced the full review.

It’s a kind of fuzzy spaced repetition. Normally, spaced repetition is thought of as discrete intervals: review one, review two, and review three. But with trickle-down credit, partial encompassing, and partial accuracy on the task, it all feeds into this fuzzy metric of how much of a repetition the performance is worth. If you barely pass a review, you won’t get a full repetition. It might be like half a repetition or less. If you ace a review, you get more than a full repetition, maybe one and a half repetitions.

In addition to the specific performance at a given moment, we also measure the student’s overall accuracy and how they’re doing in general. This affects the speed of the spaced repetition process. The amount of credit you get for how well you performed on the topic is multiplied by your learning speed based on your overall accuracy. We also measure how difficult topics are for students in general because some topics are just intrinsically easier than others. We use that to adjust the process.

Each spaced repetition is calibrated to how well you performed on the topic, how well you’re doing overall, how hard the topic is, and your learning speed. That’s the model to keep in mind when thinking about spaced repetition.
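To make that mental model concrete, here is a minimal sketch of a fuzzy repetition credit. The numbers 0.5 and 1.5 come from the rough figures mentioned above; everything else (the linear interpolation, the `pass_threshold` of 0.6, dividing by `topic_difficulty`, and the function name) is an assumption made for illustration, not the real formula.

```python
def repetition_credit(accuracy, learning_speed=1.0, topic_difficulty=1.0,
                      pass_threshold=0.6):
    """Estimate how many repetitions a passed review is worth.

    accuracy         -- fraction of review questions answered correctly
    learning_speed   -- multiplier based on the student's overall accuracy
    topic_difficulty -- multiplier above 1.0 for intrinsically harder topics
    pass_threshold   -- hypothetical minimum accuracy needed to pass
    """
    if not pass_threshold <= accuracy <= 1.0:
        raise ValueError("this sketch only covers passed reviews")
    # Assumed linear scale: barely passing is worth 0.5, acing is worth 1.5.
    raw = 0.5 + (accuracy - pass_threshold) / (1.0 - pass_threshold)
    # Assumed adjustment: faster learners gain more, harder topics gain less.
    return raw * learning_speed / topic_difficulty
```

For example, with these assumed settings, acing a review at normal speed on a topic of average difficulty would be worth 1.5 repetitions, while barely passing would be worth 0.5.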

Justin: Yeah. Our question selection during a task or review is just random. It’s not that we’re intentionally giving you a harder question; it’s just the luck of the draw. Slight variations in difficulty happen. We initially looked into whether we should create a dynamic system, where if a student gets a question wrong, we select a slightly easier question. It wouldn’t be much easier since the pool of questions is similar, but we considered whether it was worth the effort. If the student gets it right, we’d use a slightly harder question.

The tricky part is that we can’t lower the bar for success. If you can only do the easiest problems in the pool of questions, we’d unintentionally lower the bar for success. That’s not ideal.

Justin: Exactly. The benefit of random selection is that we don’t have to worry about that. We ran an experiment where we looked at students going through lessons with randomly selected questions. In some cases, the random selection emulated adaptive difficulty by chance, while in others, it did not. We analyzed the success rates in both scenarios, comparing when random selection happened to emulate more nuanced difficulty adaptation versus when it didn’t.

It turned out it didn’t really make a difference in terms of passing the lesson. The pass rates were nearly identical. This suggests that the variation in difficulty is small relative to the knowledge a student needs to acquire to complete the question. Even if they get the question wrong, they still acquire enough knowledge from the experience to be successful with future questions.

Justin: Yeah, we’ve actually thought about something similar, but that’s a good point. The way we thought about it was for people who have fallen off the wagon, maybe skipped a few days of scheduled XP, and haven’t done anything. When they come back into the system, they have the same tasks as before. It’s like the trainer wants you to continue lifting the same level of weights. But if you’re a personal trainer, you’re probably going to say, “Why don’t we do an easier workout to get you back on track?”

We know how difficult various topics are because that’s one of the metrics we measure for adaptive spaced repetition and learning speed. The idea was, if someone has fallen off, we could refresh their tasks and intentionally choose the easiest lessons available. We could tell them something like, “Hey, if you do one of these, you’ll get a 50% XP boost.” That would help them get back into the flow. But, as you mentioned, a rapid-fire review system would be worth thinking about too.
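A rough sketch of that idea, assuming a hypothetical mapping from lesson to measured difficulty; the function name, the default of three lessons, and the data shapes are invented for illustration, and this describes an idea under consideration rather than an existing feature.

```python
def comeback_tasks(available_lessons, difficulty, n=3, xp_boost=0.5):
    """Pick the n easiest available lessons and attach an XP boost.

    available_lessons -- lessons the student is currently eligible for
    difficulty        -- hypothetical dict: lesson -> measured difficulty
    xp_boost          -- 0.5 means a 50% XP boost
    """
    easiest = sorted(available_lessons, key=lambda lesson: difficulty[lesson])[:n]
    return [{"lesson": lesson, "xp_multiplier": 1 + xp_boost} for lesson in easiest]
```

The point of the design is that the difficulty measurements already exist for spaced repetition, so reusing them to ease a returning student back in would need no new data.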

Justin: I think that was a Mirage Cut from Learn to Be.

Justin: I like that quick five minutes of jump rope or whatever in the morning.

Justin: Yeah. I haven’t spent a lot of time on it, but I’ve definitely, like most people who are interested in that, gone down the rabbit hole of how big of numbers I can multiply. It’s interesting to look at those algorithms. But the limit is working memory. You have to come up with tricks to compress information to open up more room.

Justin: I wouldn’t say that the ability to multiply huge numbers was causal in von Neumann’s success. I think the factors leading to his success also predisposed him to be interested in and capable of multiplying these numbers. Most professional mathematicians do not know how to multiply large numbers.

Justin: Exactly. There are things that come up often enough that you need to know how to do quickly. Times tables, for example, come up all the time. Basic multiplication, like within 12, or multiplying by 10, those things come up frequently. It’s important to have those automatic. But when it comes to something like 456 times 872, you never really need to do that by hand. It’s unnecessary. Even if you could do it in your head, a calculator would actually be faster. For single-digit multiplication, doing it in your head is definitely faster, but for really long computations with big numbers, even the fastest mental calculator wouldn’t stand a chance against a new electronic calculator.

Justin: Totally. You don’t realize the importance of it until you start doing high-level math and have to do the computations yourself. Then you realize how important it is to know your logarithms and roots.

Justin: I think the language analogy is very good. Especially in comparing the logarithm operation to a particular character or word that gets used often. That’s exactly the right way to think about it. I think you’re spot on with that.

Justin: Math is ultimately kind of like a language.

Justin: It was great. I had a lot of fun.

Justin: For aspiring math students, I’d say, in addition to being strong on your foundations, if anyone is hoping to do a math degree and thinks all they need is AP calculus, it’s important to learn proof writing. Learn set operations and get some proof writing experience. Otherwise, it’s going to be a rough ride in a serious math program. There’s often a gap of missing knowledge that’s not filled enough by whatever onboarding resources these programs provide. Some programs have summer work to prep for courses, but it doesn’t always make up for not having a serious course in proof writing. I’d recommend that aspiring math majors focus not only on their foundations but also on getting automaticity with things like logarithms, roots, and algebra. It should feel as easy as reading a paragraph of text. Absolutely get to that level with basic proofs as well.

Justin: That sounds good.

Justin: I’d be happy to.

Prompt

The following prompt was used to generate this transcript.

You are a grammar cleaner. All you do is clean grammar, remove single filler words such as “yeah” and “like” and “so”, remove any phrases that are repeated consecutively verbatim, and make short paragraphs separated by empty lines. Do not change any word choice, or leave any information out. Do not summarize or change phrasing. Please clean the attached text. It should be almost exactly verbatim. Keep all the original phrasing. Do not censor.

I manually ran this on each segment of a couple thousand characters of text from the original transcript.

