
Ultralearning


by Scott Young


  Memory Mechanism 1—Spacing: Repeat to Remember

  One of the pieces of studying advice that is best supported by research is that if you care about long-term retention, don’t cram. Spreading learning sessions across more intervals and longer periods of time tends to cause somewhat lower performance in the short run (because there is a chance for forgetting between intervals) but much better performance in the long run. This was something I needed to be careful about during the MIT Challenge. After my first few classes, I switched from doing one class at a time to doing a few in parallel, to minimize the impact that the crammed study time would have on my memory.

  If you have ten hours to learn something, therefore, it makes more sense to spend ten days studying one hour each than to spend ten hours studying in one burst. Obviously, however, if the amount of time between study intervals gets longer and longer, the short-term effects start to outweigh the long-term ones. If you learn something with a decade separating study intervals, it’s quite possible that you’ll completely forget whatever you had learned before you reach the second session.

  Finding the exact trade-off point between too long and too short has been a minor obsession for some ultralearners. Space your study sessions too closely, and you lose efficiency; space them too far apart, and you forget what you’ve already learned. This has led many ultralearners to apply what are known as spaced-repetition systems (SRS) as a tool for trying to retain the most knowledge with the least effort. SRS was a major force behind Roger Craig’s Jeopardy! trivia memorization, and I used the systems extensively when learning Chinese and Korean. Although you may not have heard of this term, the general principle is the backbone of many language-learning products, including Pimsleur, Memrise, and Duolingo. These programs tend to hide the spacing algorithm in the background, so you don’t need to bother yourself with it. However, other programs, such as the open-source Anki, are the preferred tool of more extreme ultralearners who want to squeeze out a little more performance.
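  To make the spacing idea concrete, here is a minimal sketch in Python of an expanding-interval scheduler. It is an illustration only, not the algorithm used by Anki or any of the products named above: the `Card` class, the `review` function, the doubling rule, and the dates are all assumptions made for this example. The idea is that each successful recall pushes the next review further away, while a failed recall pulls it back to the next day.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Card:
    """One flash card: a question with a single answer."""
    question: str
    answer: str
    interval_days: int = 1        # current gap between reviews
    due: date = date(2024, 1, 1)  # hypothetical first review date


def review(card: Card, today: date, recalled: bool) -> Card:
    """Reschedule a card after a review.

    Assumed rule for illustration: double the gap after a successful
    recall, reset it to one day after a failed one.
    """
    if recalled:
        card.interval_days *= 2
    else:
        card.interval_days = 1
    card.due = today + timedelta(days=card.interval_days)
    return card


# Usage: review the card on each due date and watch the gaps change.
card = Card("chavirer", "to capsize")
day = card.due
for recalled in [True, True, False, True]:
    review(card, day, recalled)
    print(day, "recalled" if recalled else "forgot", "-> next review", card.due)
    day = card.due
```

  Real systems tune the growth rate per card and per learner. The fixed doubling here only shows the shape of the trade-off: gaps that are too short waste reviews, and gaps that are too long end in a reset.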

  SRS is an amazing tool, but it tends to have quite focused applications. Learning facts, trivia, vocabulary words, or definitions is ideally suited for flash card software, which presents knowledge in terms of a question with a single answer. It’s more difficult to apply to more complicated domains of knowledge, which rely on complex information associations that are built up only through real-world practice. Still, for some tasks, the bottleneck of memory is so tight that SRS is a powerful tool for widening it, even if there are some drawbacks. The authors of a popular study guide for medical students center their approach around SRS, because a medical student must remember so many things and the default strategy of forgetting and relearning is quite costly in terms of time.10

  Spacing does not require complex software, however. As Richards’s story clearly demonstrates, simply printing lists of words, reading them over, and then rehearsing them mentally without having them in front of you is an incredibly powerful technique. Similarly, semiregular practice of a skill is often quite helpful. After my year of learning languages, I wanted to ensure that I didn’t forget them. My approach was fairly simple: schedule thirty minutes of conversation practice once a week, to be done over Skype using italki, an online service for tutoring and language exchange partners all over the world. I maintained this for one year, after which I dropped to once-per-month practice for another two years. I don’t know whether this practice schedule was ideal, and I had other opportunities to practice that came up spontaneously in that time period that also helped, but I believe it was much better than doing nothing and letting the skills atrophy. When it comes to retention, don’t let perfect become the enemy of good enough.

  Another strategy for applying spacing, which can work better for more elaborate skills that are harder to integrate into your daily habits, is to semiregularly do refresher projects. I leaned toward this approach for the things I learned during the MIT Challenge, since the skill I wanted most to retain was writing code, which is tricky to do on only half an hour per week. This approach has the disadvantage of sometimes deviating quite a lot from optimal spacing; however, if you’re prepared to do a little bit of relearning to compensate, it can still be a better approach than completely giving up practice. Scheduling this kind of maintenance in advance can also be helpful, as it will remind you that learning isn’t something done once and then ignored but a process that continues for your entire life.

  Memory Mechanism 2—Proceduralization: Automatic Will Endure

  Why do people say it’s “like riding a bicycle” and not “like remembering trigonometry”? This common expression may be rooted in deeper neurological realities than it first appears. There’s evidence that procedural skills, such as riding a bicycle, are stored in a different way from declarative knowledge, such as knowing the Pythagorean Theorem or the Sine Rule for triangles. This difference between knowing how and knowing that may also have different implications for long-term memory. Procedural skills, such as the ever-remembered bicycling, are much less susceptible to being forgotten than knowledge that requires explicit recall to retrieve.11

  This finding can actually be used to our advantage. One dominant theory of learning suggests that most skills proceed through stages, starting declarative but ending up procedural as you practice more. A perfect example of this declarative-to-procedural transition is typing. When you start typing on a keyboard, you must memorize the positions of the letters. Each time you want to type a word, you have to think in terms of its letters, recall each one’s position on the keyboard, and then move your finger to that spot to press it. This process may fail; you may forget where a key is and need to look down to type it. However, if you practice more and more, you stop having to look down. Eventually you stop having to think about the letters’ positions or how to move your fingers to meet them. You may even reach a point where you don’t think of letters at all and whole words come out at a time. Such procedural knowledge is quite robust and tends to be retained much longer than declarative knowledge. A quick observation is enough to verify this: when you’ve gotten really good at typing and someone asks you to quickly say where on a keyboard the letter w is, you might need to actually put your hands in the keyboard position (or imagine you’re doing so) and pretend to type the w to say definitively. This is exactly what happened to me as I was typing out this paragraph. What has happened is that what was originally the primary access point to knowledge, your explicit memory of the key location, has faded away and now needs to be recalled with the more durable procedural knowledge encoded in your motor movements. If you’ve ever had to enter a password or PIN that you use often, you may be in a similar position, where you remember it by feel and not by its explicit combination of numbers and letters.

  Because procedural knowledge is stored for longer, this suggests a useful heuristic. Instead of learning a large volume of knowledge or skills evenly, you may emphasize a core set of information much more frequently, so that it becomes procedural and is stored far longer. This was an unintentional side effect of my friend’s and my language-learning project. Being forced to speak a language constantly meant that a core set of phrases and patterns was repeated so often that neither of us will ever forget them. This may not hold true for a bunch of less frequently used words or phrases, but the starting points of conversations are nearly impossible to forget. The classic approach to language studies, in which students “move on” from beginner words and grammatical patterns to more complicated ones, may sidestep this, so that those core patterns aren’t sticky enough to last for years without repeated practice.

  Failing to fully proceduralize core skills was a major flaw of my first major self-education effort, the MIT Challenge, which I was able to improve upon in my subsequent language-learning and portrait-drawing projects. Whereas the MIT Challenge did have core mathematical and programming skills that were often repeated, what ended up being proceduralized was more haphazard rather than reflecting a conscious decision to automate the most essential skills of applying computer science.

  Most skills we learn are incompletely proceduralized. We may be able to do some of them automatically, but other parts require us to actively search our minds. You might, for instance, be able to easily move variables from one side of an equation to the other in algebra without thinking. But you may have to think a bit more when exponents or trigonometry is involved. Perhaps, owing to their nature, some skills cannot be completely automated and will always require some conscious thought. This creates an interesting mix of knowledge, with some things retained quite stably over longer periods of time and others susceptible to being forgotten. One strategy for applying this concept might be to ensure that a certain amount of knowledge is completely proceduralized before practice concludes. Another approach might be to spend extra effort to proceduralize some skills, which will serve as cues or access points for other knowledge. You may aim to completely proceduralize the process you use to start working on a new programming project, for example, so that you can get over that hump in the process of writing a new program. These strategies are somewhat speculative, but I think there are lots of potential ways the declarative-to-procedural transition of knowledge might be applied by clever ultralearners in the future.

  Memory Mechanism 3—Overlearning: Practice Beyond Perfect

  Overlearning is a well-studied psychological phenomenon that’s fairly easy to understand: additional practice, beyond what is required to perform adequately, can increase the length of time that memories are stored.12 The typical experimental setup is to give subjects a task, such as assembling a rifle or going through an emergency checklist, allowing them enough time to practice that they can do it correctly once. The time from zero to this point is considered the “learning” phase. Next, allow the subjects different amounts of “overlearning,” or practice that continues after the first correct application. Since subjects are already doing the skill correctly, performance doesn’t improve past this point. However, the overlearning can extend the durability of the skill.

  In the typical setting in which overlearning has been studied, the duration of the overlearning effects tends to be quite short; practicing a little longer in one session produces an additional week or two of recall. This may imply that overlearning is primarily a short-term phenomenon: something useful for skills like first aid or emergency response protocols, which are rarely practiced but need to be kept fresh in between regular training sessions. I suspect, however, that overlearning might have longer-term implications if it is combined with spacing and proceduralization over much longer projects. In my own personal experience drawing portraits, for instance, the thought process used for mapping out the facial features I learned from Vitruvian Studio was repeated so many times that it’s hard to forget, even though my major practice time was only during one month. Similarly, certain reflexes of programming or mathematics I can still easily recall from my MIT Challenge days, even without practice in the interim, because they happened to be patterns that were repeated far more than was necessary to perform them adequately at the time (because they were components of more elaborate problems).

  Overlearning dovetails nicely with the principle of directness. Because direct use of a skill frequently involves overpracticing certain core abilities, that kernel is usually quite resistant to forgetting, even years later. In contrast, academically learned subjects tend to distribute practice more evenly to cover the entire curriculum to a minimum level of competency in each area, regardless of the centrality of subtopics to practical applications. Many people I’ve known who have learned a language that I also speak but who learned it through years of formal schooling have much more impressive vocabularies or knowledge of grammatical nuances than I do. However, those same people may trip over fairly basic phrases, because they learned every fact and skill evenly, rather than overlearning the smaller subset of very common patterns.

  There seem to be two main methods I’ve encountered for applying overlearning. The first is core practice, continually practicing and refining the core elements of a skill. This approach often works well paired with some kind of immersion or working on extensive (as opposed to intensive) projects, after the initial ultralearning phase has been completed. The shift from learning to doing here may actually involve a deeper, subtler form of learning, which shouldn’t be discounted as simply applying previously learned knowledge.

  The second strategy is advanced practice, going one level above a certain set of skills so that core parts of the lower-level skills are overlearned as one applies them in a more difficult domain. One study of algebra students demonstrated this second strategy.13 Most students who had taken an algebra class and were retested years later had forgotten huge amounts of what they had learned. This could have been either because the information was truly lost or simply because forgotten cues rendered the majority of it inaccessible. Interestingly, this rate of forgetting was the same for better- and poorer-performing students; better students retained more than weaker ones, but the rate at which they had forgotten was the same. One group, however, did not show such a steep decline: those who had taken calculus. This suggests that moving up a level to a more advanced skill enabled the earlier skill to be overlearned, thus preventing some forgetting.

  Memory Mechanism 4—Mnemonics: A Picture Retains a Thousand Words

  The final tool common to many ultralearners I encountered was mnemonics. There are many mnemonic strategies, and covering them all is outside the scope of this book. They have two things in common. First, they tend to be hyperspecific: they are designed to remember very specific patterns of information. Second, they usually involve translating abstract or arbitrary information into vivid pictures or spatial maps. When mnemonics work, the results can be almost difficult to believe. Rajveer Meena, the Guinness World Record holder for memorizing digits of the mathematical constant pi, knows the number to 70,000 decimal places.14 Master mnemonicists, who compete in championships of memory, can memorize the order of a deck of cards in under sixty seconds and can repeat a poem verbatim after only a minute or two of studying. These feats are quite impressive, and even better, they can be learned by anyone patient enough to apply them. How do they work?

  One common, and useful, mnemonic is known as the keyword method. The method works by first taking a foreign-language word and converting it into something it sounds like in your native language. If I were doing this with French, for example, I might take the word chavirer (to capsize) and convert it into “shave an ear,” to which it is close enough in sound for the latter to serve as an effective cue for recalling the original word. Next I create a mental image that combines the sounds-like version of the foreign word and an image of its translation in a fantastical and vivid setting that is bizarre and hard to forget. In this case, I could imagine a giant ear shaving a long beard while sitting in a boat that capsizes. Then, whenever I need to remember what “capsize” is in French, I think of capsizing, recall my elaborate picture, which links to “shaving an ear” and thus . . . chavirer. This process sounds needlessly complicated and elaborate at first, but it benefits from converting a difficult association (between arbitrary sounds and a new meaning) into a few links that are much easier to associate and remember. With practice, each conversion of this type may take only fifteen to twenty seconds, and it really does help with remembering foreign-language words. This particular kind of mnemonic works for this purpose, but there are others that work for remembering lists, numbers, maps, or sequences of steps in a procedure. For a good introduction to this topic, I highly recommend Joshua Foer’s book Moonwalking with Einstein: The Art and Science of Remembering Everything.

  Mnemonics work well, and with practice, anyone can do them. Why, then, are they not front and center in this chapter, instead of at the end? I believe that mnemonics, like SRS, are incredibly powerful tools. And as tools, they can open new possibilities for people who are not familiar with them. However, as someone who has spent much time exploring them and applying them to real-world learning, I’ve found that their applications are quite a bit narrower than they first appear, and in many real-world settings they simply aren’t worth the hassle.

  I believe there are two disadvantages to mnemonics. The first is that the most impressive mnemonic systems (like the one for memorizing thousands of digits of the mathematical constant pi) require a considerable up-front investment. After you’re done, you can memorize digits easily, but this isn’t actually a very useful task. Most of our society adapts around the fact that people generally cannot memorize digits, so we have paper and computers do it for us. The second disadvantage is that recalling from mnemonics is often not as automatic as directly remembering something. Knowing a mnemonic for a foreign-language word is better than failing to remember it entirely, but it’s still too slow to allow you to fluently form sentences out of mnemonically remembered words. Thus mnemonics can act as a bridge for difficult-to-remember information, but they’re usually not the final step in creating memories that will endure forever.

  Mnemonics, therefore, are an incredibly powerful if somewhat brittle tool. If you are doing a task that requires memorizing highly dense information in a very specific format, especially if the information is going to be used over a few weeks or months, they can enable you to do things with your mind that you might not have thought possible. Alternatively, they can be used as an intermediate strategy to smooth initial information acquisition when the information is quite dense. I’ve found them useful for language learning and terminology, and, paired with SRS, they can form an effective bridge from feeling as though there’s no way you can possibly remember everything to remembering it so deeply that you can’t possibly forget. Indeed, in a world before paper, computers, and other externalized memories, mnemonics were the main game in town. However, in the modern world, which has developed excellent coping mechanisms for the fact that most people cannot remember things as a computer can, I feel that mnemonics tend to serve more as cool tricks than as a foundation you should base your learning efforts on. Still, there is a devoted subset of ultralearners who are fiercely committed to applying these techniques, so my word shouldn’t be the final verdict.

 
