
Rationality: From AI to Zombies


by Eliezer Yudkowsky


  But so long as you don’t happen to be near a plucked chicken, saying “Look for featherless bipeds” may serve to pick out a few dozen of the particular things that are humans, as opposed to houses, vases, sandwiches, cats, colors, or mathematical theorems.

  Once the definition “featherless biped” has been bound to some particular featherless bipeds, you can look over the group, and begin harvesting some of the other characteristics—beyond mere featherfree twolegginess—that the “featherless bipeds” seem to have in common. The particular featherless bipeds that you see also seem to speak combinatorial language with syntax, build complex tools, bleed red blood if poked, and die when they drink hemlock.

  Thus the category “human” grows richer, and adds more and more characteristics; and when Diogenes finally presents his plucked chicken, we are not fooled: This plucked chicken is obviously not similar to the other “featherless bipeds.”

  (If Aristotelian logic were a good model of human psychology, the Platonists would have looked at the plucked chicken and said, “Yes, that’s a human; what’s your point?”)

  If the first featherless biped you see is a plucked chicken, then you may end up thinking that the verbal label “human” denotes a plucked chicken; so I can modify my treasure map to point to “featherless bipeds with broad nails,” and if I am wise, go on to say, “See Diogenes over there? That’s a human, and I’m a human, and you’re a human; and that chimpanzee is not a human, though fairly close.”

  The initial clue only has to lead the user to the similarity cluster—the group of things that have many characteristics in common. After that, the initial clue has served its purpose, and I can go on to convey the new information “humans are currently mortal,” or whatever else I want to say about us featherless bipeds.

  A dictionary is best thought of, not as a book of Aristotelian class definitions, but as a book of hints for matching verbal labels to similarity clusters, or matching labels to properties that are useful in distinguishing similarity clusters.

  *

  158

  Typicality and Asymmetrical Similarity

  Birds fly. Well, except ostriches don’t. But which is a more typical bird—a robin, or an ostrich?

  Which is a more typical chair: a desk chair, a rocking chair, or a beanbag chair?

  Most people would say that a robin is a more typical bird, and a desk chair is a more typical chair. The cognitive psychologists who study this sort of thing experimentally do so under the heading of “typicality effects” or “prototype effects.”1 For example, if you ask subjects to press a button to indicate “true” or “false” in response to statements like “A robin is a bird” or “A penguin is a bird,” reaction times are faster for more central examples.2 Typicality measures correlate well across different investigative methods—reaction times are one; you can also ask people to rate directly, on a scale of 1 to 10, how well an example (like a specific robin) fits a category (like “bird”).

  So we have a mental measure of typicality—which might, perhaps, function as a heuristic—but is there a corresponding bias we can use to pin it down?

  Well, which of these statements strikes you as more natural: “98 is approximately 100,” or “100 is approximately 98”? If you’re like most people, the first statement seems to make more sense.3 For similar reasons, people asked to rate how similar Mexico is to the United States gave consistently higher ratings than people asked to rate how similar the United States is to Mexico.4

  And if that still seems harmless, a study by Rips showed that people were more likely to expect a disease to spread from robins to ducks on an island than from ducks to robins.5 Now this is not a logical impossibility. But in a pragmatic sense, whatever difference separates a duck from a robin, and would make a disease less likely to spread from a duck to a robin, must also be a difference between a robin and a duck, and would make a disease less likely to spread from a robin to a duck.

  Yes, you can come up with rationalizations, like “Well, there could be more neighboring species of the robins, which would make the disease more likely to spread initially, etc.,” but be careful not to try too hard to rationalize the probability ratings of subjects who didn’t even realize there was a comparison going on. And don’t forget that Mexico is more similar to the United States than the United States is to Mexico, and that 98 is closer to 100 than 100 is to 98. A simpler interpretation is that people are using the (demonstrated) similarity heuristic as a proxy for the probability that a disease spreads, and this heuristic is (demonstrably) asymmetrical.
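  The asymmetric-similarity result has a standard formalization in Tversky’s feature-contrast model (the same line of work as the Tversky and Gati study cited here): similarity of a to b is the measure of their common features, minus separately weighted measures of each one’s distinctive features. A minimal sketch, with invented feature sets and invented weights:

```python
# Sketch of Tversky's feature-contrast model: sim(a, b) is common features
# minus distinctive ones, with the subject's distinctive features (a - b)
# weighted more heavily than the referent's (b - a). Feature sets invented.
def tversky_similarity(a, b, theta=1.0, alpha=0.8, beta=0.2):
    common = len(a & b)
    return theta * common - alpha * len(a - b) - beta * len(b - a)

mexico = {"north_american", "spanish_speaking", "peso"}
usa = {"north_american", "english_speaking", "dollar", "superpower", "hollywood"}

print(tversky_similarity(mexico, usa))  # "How similar is Mexico to the US?"
print(tversky_similarity(usa, mexico))  # lower: "How similar is the US to Mexico?"
```

  The asymmetry falls out because the United States has more distinctive features than Mexico, and those features are penalized more heavily when the United States is the subject of the comparison rather than the referent.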

  Kansas is unusually close to the center of the United States, and Alaska is unusually far from the center of the United States; so Kansas is probably closer to most places in the US and Alaska is probably farther. It does not follow, however, that Kansas is closer to Alaska than is Alaska to Kansas. But people seem to reason (metaphorically speaking) as if closeness is an inherent property of Kansas and distance is an inherent property of Alaska; so that Kansas is still close, even to Alaska; and Alaska is still distant, even from Kansas.

  So once again we see that Aristotle’s notion of categories—logical classes with membership determined by a collection of properties that are individually strictly necessary, and together strictly sufficient—is not a good model of human cognitive psychology. (Science’s view has changed somewhat over the last 2,350 years. Who would’ve thought?) We don’t even reason as if set membership is a true-or-false property: statements of set membership can be more or less true. (Note: This is not the same thing as being more or less probable.)

  One more reason not to pretend that you, or anyone else, is really going to treat words as Aristotelian logical classes.

  *

  1. Eleanor Rosch, “Principles of Categorization,” in Cognition and Categorization, ed. Eleanor Rosch and Barbara B. Lloyd (Hillsdale, NJ: Lawrence Erlbaum, 1978).

  2. George Lakoff, Women, Fire, and Dangerous Things: What Categories Reveal about the Mind (Chicago: University of Chicago Press, 1987).

  3. Jerrold Sadock, “Truth and Approximations,” Papers from the Third Annual Meeting of the Berkeley Linguistics Society (1977): 430–439.

  4. Amos Tversky and Itamar Gati, “Studies of Similarity,” in Cognition and Categorization, ed. Eleanor Rosch and Barbara B. Lloyd (Hillsdale, NJ: Lawrence Erlbaum, 1978), 79–98.

  5. Lance J. Rips, “Inductive Judgments about Natural Categories,” Journal of Verbal Learning and Verbal Behavior 14 (1975): 665–681.

  159

  The Cluster Structure of Thingspace

  The notion of a “configuration space” is a way of translating object descriptions into object positions. It may seem like blue is “closer” to blue-green than to red, but how much closer? It’s hard to answer that question by just staring at the colors. But it helps to know that the (proportional) color coordinates in RGB are blue 0:0:5, blue-green 0:3:2, and red 5:0:0. It would be even clearer if plotted on a 3D graph.
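  To make “how much closer” concrete, here is a minimal sketch that treats those three coordinates as points and measures the straight-line distance between them:

```python
# Euclidean distance between colors treated as points in RGB
# configuration space, using the proportional coordinates from the text.
import math

def distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

blue, blue_green, red = (0, 0, 5), (0, 3, 2), (5, 0, 0)

print(distance(blue, blue_green))  # ~4.24: blue to blue-green
print(distance(blue, red))         # ~7.07: blue is farther from red
```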

  In the same way, you can see a robin as a robin—brown tail, red breast, standard robin shape, maximum flying speed when unladen, its species-typical DNA and individual alleles. Or you could see a robin as a single point in a configuration space whose dimensions described everything we knew, or could know, about the robin.

  A robin is bigger than a virus, and smaller than an aircraft carrier—that might be the “volume” dimension. Likewise a robin weighs more than a hydrogen atom, and less than a galaxy; that might be the “mass” dimension. Different robins will have strong correlations between “volume” and “mass,” so the robin-points will be lined up in a fairly linear string, in those two dimensions—but the correlation won’t be exact, so we do need two separate dimensions.

  This is the benefit of viewing robins as points in space: You couldn’t see the linear lineup as easily if you were just imagining the robins as cute little wing-flapping creatures.

  A robin’s DNA is a highly multidimensional variable, but you can still think of it as part of a robin’s location in thingspace—millions of quaternary coordinates, one coordinate for each DNA base—or maybe a more sophisticated view than that. The shape of the robin, and its color (surface reflectance), you can likewise think of as part of the robin’s position in thingspace, even though they aren’t single dimensions.

  Just as the coordinate point 0:0:5 contains the same information as the actual HTML color blue, we shouldn’t actually lose information when we see robins as points in space. We believe the same statement about the robin’s mass whether we visualize a robin balancing the scales opposite a 0.07-kilogram weight, or a robin-point with a mass-coordinate of +70 grams.

  We can even imagine a configuration space with one or more dimensions for every distinct characteristic of an object, so that the position of an object’s point in this space corresponds to all the information in the real object itself. Rather redundantly represented, too—dimensions would include the mass, the volume, and the density.

  If you think that’s extravagant, quantum physicists use an infinite-dimensional configuration space, and a single point in that space describes the location of every particle in the universe. So we’re actually being comparatively conservative in our visualization of thingspace—a point in thingspace describes just one object, not the entire universe.

  If we’re not sure of the robin’s exact mass and volume, then we can think of a little cloud in thingspace, a volume of uncertainty, within which the robin might be. The density of the cloud is the density of our belief that the robin has that particular mass and volume. If you’re more sure of the robin’s density than of its mass and volume, your probability-cloud will be highly concentrated in the density dimension, and concentrated around a slanting line in the subspace of mass/volume. (Indeed, the cloud here is actually a surface, because of the relation V × D = M.)
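  Here is a toy sketch of that collapse (all numbers invented): when belief about density is sharp and belief about volume is broad, sampled (volume, mass) points line up along the slanting line M = D × V instead of filling out a diffuse blob:

```python
# If density D is known almost exactly but volume isn't, samples from the
# belief-cloud over (volume, mass) hug the surface V * D = M.
# Every number below is made up for illustration.
import random

D = 1.05  # believed density, nearly certain
for _ in range(5):
    volume = random.uniform(50, 90)      # broad uncertainty about volume
    density = random.gauss(D, 0.01)      # narrow uncertainty about density
    mass = volume * density              # the constraint V * D = M
    print(f"volume={volume:5.1f}  mass={mass:5.1f}  ratio={mass/volume:.3f}")
```

  Every printed ratio comes out near 1.05: the cloud has collapsed onto the slanted surface.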

  “Radial categories” are how cognitive psychologists describe the non-Aristotelian boundaries of words. The central “mother” conceives her child, gives birth to it, and supports it. Is an egg donor who never sees her child a mother? She is the “genetic mother.” What about a woman who is implanted with a foreign embryo and bears it to term? She is a “surrogate mother.” And the woman who raises a child that isn’t hers genetically? Why, she’s an “adoptive mother.” The Aristotelian syllogism would run, “Humans have ten fingers, Fred has nine fingers, therefore Fred is not a human,” but the way we actually think is “Humans have ten fingers, Fred is a human, therefore Fred is a ‘nine-fingered human.’”

  We can think about the radial-ness of categories in intensional terms, as described above—properties that are usually present, but optionally absent. If we thought about the intension of the word “mother,” it might be like a distributed glow in thingspace, a glow whose intensity matches the degree to which that volume of thingspace matches the category “mother.” The glow is concentrated in the center of genetics and birth and child-raising; the volume of egg donors would also glow, but less brightly.

  Or we can think about the radial-ness of categories extensionally. Suppose we mapped all the birds in the world into thingspace, using a distance metric that corresponds as well as possible to perceived similarity in humans: A robin is more similar to another robin than either is to a pigeon, but robins and pigeons are more similar to each other than either is to a penguin, et cetera.

  Then the center of all birdness would be densely populated by many neighboring tight clusters, robins and sparrows and canaries and pigeons and many other species. Eagles and falcons and other large predatory birds would occupy a nearby cluster. Penguins would be in a more distant cluster, and likewise chickens and ostriches.

  The result might look, indeed, something like an astronomical cluster: many galaxies orbiting the center, and a few outliers.

  Or we could think simultaneously about both the intension of the cognitive category “bird,” and its extension in real-world birds: The central clusters of robins and sparrows glowing brightly with highly typical birdness; satellite clusters of ostriches and penguins glowing more dimly with atypical birdness, and Abraham Lincoln a few megaparsecs away and glowing not at all.
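  As a toy rendering of that combined picture—with invented features and invented coordinates—you can score the “glow” at each bird-point as its average similarity to the other observed bird-points:

```python
# Typicality as "glow": each bird's brightness is its mean similarity
# (decaying exponentially with distance) to every other bird-point in a
# made-up two-feature thingspace. All coordinates are invented.
import math

birds = {
    "robin":   (1.0, 1.0),
    "sparrow": (0.9, 1.1),
    "pigeon":  (1.2, 0.9),
    "penguin": (2.0, 0.0),
    "ostrich": (5.0, 0.0),
}

def glow(name):
    x, y = birds[name]
    dists = [math.hypot(x - ox, y - oy)
             for other, (ox, oy) in birds.items() if other != name]
    return sum(math.exp(-d) for d in dists) / len(dists)

for name in birds:
    print(f"{name:8s} birdness ~ {glow(name):.2f}")
# robin, sparrow, and pigeon glow brightly; penguin dimmer; ostrich dimmest.
```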

  I prefer that last visualization—the glowing points—because as I see it, the structure of the cognitive intension followed from the extensional cluster structure. First came the structure-in-the-world, the empirical distribution of birds over thingspace; then, by observing it, we formed a category whose intensional glow roughly overlays this structure.

  This gives us yet another view of why words are not Aristotelian classes: the empirical clustered structure of the real universe is not so crystalline. A natural cluster, a group of things highly similar to each other, may have no set of necessary and sufficient properties—no set of characteristics that all group members have, and no non-members have.

  But even if a category is irrecoverably blurry and bumpy, there’s no need to panic. I would not object if someone said that birds are “feathered flying things.” But penguins don’t fly!—well, fine. The usual rule has an exception; it’s not the end of the world. Definitions can’t be expected to exactly match the empirical structure of thingspace in any event, because the map is smaller and much less complicated than the territory. The point of the definition “feathered flying things” is to lead the listener to the bird cluster, not to give a total description of every existing bird down to the molecular level.

  When you draw a boundary around a group of extensional points empirically clustered in thingspace, you may find at least one exception to every simple intensional rule you can invent.

  But if a definition works well enough in practice to point out the intended empirical cluster, objecting to it may justly be called “nitpicking.”

  *

  160

  Disguised Queries

  Imagine that you have a peculiar job in a peculiar factory: Your task is to take objects from a mysterious conveyor belt, and sort the objects into two bins. When you first arrive, Susan the Senior Sorter explains to you that blue egg-shaped objects are called “bleggs” and go in the “blegg bin,” while red cubes are called “rubes” and go in the “rube bin.”

  Once you start working, you notice that bleggs and rubes differ in ways besides color and shape. Bleggs have fur on their surface, while rubes are smooth. Bleggs flex slightly to the touch; rubes are hard. Bleggs are opaque; the surface of a rube is slightly translucent.

  Soon after you begin working, you encounter a blegg shaded an unusually dark blue—in fact, on closer examination, the color proves to be purple, halfway between red and blue.

  Yet wait! Why are you calling this object a “blegg”? A “blegg” was originally defined as blue and egg-shaped—the qualification of blueness appears in the very name “blegg,” in fact. This object is not blue. One of the necessary qualifications is missing; you should call this a “purple egg-shaped object,” not a “blegg.”

  But it so happens that, in addition to being purple and egg-shaped, the object is also furred, flexible, and opaque. So when you saw the object, you thought, “Oh, a strangely colored blegg.” It certainly isn’t a rube . . . right?

  Still, you aren’t quite sure what to do next. So you call over Susan the Senior Sorter.

  “Oh, yes, it’s a blegg,” Susan says, “you can put it in the blegg bin.”

  You start to toss the purple blegg into the blegg bin, but pause for a moment. “Susan,” you say, “how do you know this is a blegg?”

  Susan looks at you oddly. “Isn’t it obvious? This object may be purple, but it’s still egg-shaped, furred, flexible, and opaque, like all the other bleggs. You’ve got to expect a few color defects. Or is this one of those philosophical conundrums, like ‘How do you know the world wasn’t created five minutes ago complete with false memories?’ In a philosophical sense I’m not absolutely certain that this is a blegg, but it seems like a good guess.”

  “No, I mean . . .” You pause, searching for words. “Why is there a blegg bin and a rube bin? What’s the difference between bleggs and rubes?”

  “Bleggs are blue and egg-shaped, rubes are red and cube-shaped,” Susan says patiently. “You got the standard orientation lecture, right?”

  “Why do bleggs and rubes need to be sorted?”

  “Er . . . because otherwise they’d be all mixed up?” says Susan. “Because nobody will pay us to sit around all day and not sort bleggs and rubes?”

  “Who originally determined that the first blue egg-shaped object was a ‘blegg,’ and how did they determine that?”

  Susan shrugs. “I suppose you could just as easily call the red cube-shaped objects ‘bleggs’ and the blue egg-shaped objects ‘rubes,’ but it seems easier to remember this way.”

  You think for a moment. “Suppose a completely mixed-up object came off the conveyor. Like, an orange sphere-shaped furred translucent object with writhing green tentacles. How could I tell whether it was a blegg or a rube?”

  “Wow, no one’s ever found an object that mixed up,” says Susan, “but I guess we’d take it to the sorting scanner.”

  “How does the sorting scanner work?” you inquire. “X-rays? Magnetic resonance imaging? Fast neutron transmission spectroscopy?”

  “I’m told it works by Bayes’s Rule, but I don’t quite understand how,” says Susan. “I like to say it, though. Bayes Bayes Bayes Bayes Bayes.”
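  The story leaves the scanner’s insides mysterious, but a minimal sketch of Bayes’s Rule applied to this sorting problem would be a naive Bayes classifier over the observable features. Everything below—the prior and all the feature probabilities—is invented for illustration:

```python
# Naive Bayes sketch of a "sorting scanner": multiply a prior by the
# likelihood of each observed feature, then normalize. All numbers invented.
P_BLEGG = 0.5  # prior probability that a belt object is a blegg

LIKELIHOOD = {  # P(feature present | category)
    "blegg": {"blue": 0.98, "egg": 0.98, "furred": 0.95,
              "flexible": 0.95, "opaque": 0.95},
    "rube":  {"blue": 0.02, "egg": 0.02, "furred": 0.05,
              "flexible": 0.05, "opaque": 0.05},
}

def p_blegg(observed):
    """Posterior P(blegg | observed features), features assumed independent."""
    pb, pr = P_BLEGG, 1 - P_BLEGG
    for feature, present in observed.items():
        lb = LIKELIHOOD["blegg"][feature]
        lr = LIKELIHOOD["rube"][feature]
        pb *= lb if present else 1 - lb
        pr *= lr if present else 1 - lr
    return pb / (pb + pr)

# The purple blegg: not blue, but egg-shaped, furred, flexible, and opaque.
print(p_blegg({"blue": False, "egg": True, "furred": True,
               "flexible": True, "opaque": True}))  # ~0.9999
```

  With these made-up numbers, the purple object still comes out a blegg with near certainty: four typical features outweigh one anomalous color, which is Susan’s argument in probabilistic dress.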

 
