by Paul Scharre
You can imagine anti-submarine warfare pickets, you can imagine anti-submarine warfare wolfpacks, you can imagine mine warfare flotillas, you can imagine distributive anti-surface warfare surface action groups . . . We might be able to put a six pack or a four pack of missiles on them. Now imagine 50 of these distributed and operating together under the hands of a flotilla commander, and this is really something.
Like many other robotic systems, the Sea Hunter can navigate autonomously and might someday be armed. There is no indication that DoD has any intention of authorizing autonomous weapons engagements. Nevertheless, the video on DARPA’s lobby wall is a reminder that the robotics revolution continues at a breakneck pace.
BEHIND THE CURTAIN: INSIDE DARPA’S TACTICAL TECHNOLOGY OFFICE
DARPA is organized into six departments focusing on different technology areas: biology, information science, microelectronics, basic sciences, strategic technologies, and tactical technologies. CODE, FLA, LRASM, and the Sea Hunter fall into DARPA’s Tactical Technology Office (TTO), the division that builds experimental vehicles, ships, airplanes, and spacecraft. Other TTO projects include the XS-1 Experimental Spaceplane, designed to fly to the edge of space and back; the Blue Wolf undersea robotic vehicle; an R2-D2-like robotic copilot for aircraft called ALIAS; the Mach 20 Falcon Hypersonic Technology Vehicle, which flies fast enough to zip from New York to Los Angeles in 12 minutes; and the Vulture program to build an ultra-long endurance drone that can stay in the air for up to five years without refueling. Mad science, indeed.
TTO’s offices look like a child’s dream toy room. Littered around the offices are models and even some actual prototype pieces of hardware from past TTO projects—missiles, robots, and stealth aircraft. I can’t help but wonder what TTO is building today that will be the stealth of tomorrow.
Bradford Tousley, TTO’s director, graciously agreed to meet with me to discuss CODE and other projects. Tousley began his government career as an Army armor officer during the Cold War. His first tour was in an armored cavalry unit on the German border, ready for a Soviet invasion that might kick off World War III. Later in his career, when the Army sent him back to school, Tousley earned a doctorate in electrical engineering. His career shifted from frontline combat units to research and development in lasers and optics, working to ensure the U.S. military had the best possible technology. Tousley’s career has included multiple stints at DARPA as well as time in the intelligence community working on classified satellite payloads, giving him a breadth of technical understanding that extends well beyond robotics.
Tousley pointed out that DARPA was founded in response to the strategic surprise of Sputnik: “DARPA’s fundamental mission is unchanged: Enabling pivotal early investments for breakthrough capabilities for national security to achieve or prevent strategic surprise.” Inside DARPA, questions about how far to push these technologies are weighed heavily. “Within the agency, we talk about every single program we begin and we have spirited discussions. We talk about the pros and cons. Why? Why not? . . . How far are we willing to go?” Tousley made clear, however, that answering those questions isn’t DARPA’s job. “Those are fundamental policy and concept and military employment considerations” for others to decide. “Our fundamental job is to take that technical question off the table. It’s our job to make the investments to show the capabilities can exist” to give the warfighter options. In other words, to prevent another Sputnik.
If machines improved enough to reliably take out targets on their own, what would the role be for humans in warfare? Despite his willingness to push the boundaries of technology, Tousley still saw humans in command of the mission: “That final decision is with humans, period.” That might not mean requiring human authorization for every single target, but autonomous weapons would still operate under human direction, hunting and attacking targets at the direction of a human commander. At least for the foreseeable future, Tousley explained, humans were better than machines at identifying anomalies and reacting to unforeseen events. That made keeping humans involved at the mission level critical for understanding the broader context and making decisions. “Until the machine processors equal or surpass humans at making abstract decisions, there’s always going to be mission command. There’s always going to be humans in the loop, on the loop—whatever you want to call it.”
Tousley painted a picture for me of what this might look like in a future conflict: “Groups of platforms that are unmanned that you are willing to attrit [accept some losses] may do extremely well in an anti-access air defense environment . . . How do I take those platforms and a bunch of others and knit them together in architectures that have manned and unmanned systems striking targets in a congested and contested environment? You need that knitted system because you’re going to be GPS-jammed; communications are going to be going in and out; you’re going to have air defenses shooting down assets, manned and unmanned. In order to get in and strike critical targets, to control that [anti-access] environment, you’re going to have to have a system-of-systems architecture that takes advantage of manned and unmanned systems at different ranges with some amount of fidelity in the ability of the munition by itself to identify the target—could be electronically, could be optically, could be infrared, could be [signals intelligence], could be different ways to identify the target. So that system-of-systems architecture is going to be necessary to knit it all together.”
Militaries especially need autonomy in electronic warfare. “We’re using physical machines and electronics, and the electronics themselves are becoming machines that operate at machine speed. . . . I need the cognitive electronic warfare to adapt in microseconds. . . . If I have radars trying to jam other radars but they’re frequency hopping [rapidly changing radio frequencies] back and forth, I’ve got to track with it. So [DARPA’s Microsystems Technology Office] is thinking about, how do I operate at machine speed to allow these machines to conduct their functions?”
Tousley compared the challenge of cognitive electronic warfare to Google’s go-playing AlphaGo program. What happens when that program plays another version of AlphaGo at “machine speed”? He explained, “As humans ascend to the higher-level mission command and I’ve got machines doing more of that targeting function, those machines are going to be challenged by machines on the adversary’s side and a human can’t respond to that. It’s got to be machines responding to machines. . . . That’s one of the trends of the Third Offset, that machine on machine.” Humans, therefore, shift into a “monitoring” role, watching these systems and intervening if necessary. In fact, Tousley argued that a difficult question will be whether humans should intervene in these machine-on-machine contests at all, particularly in cyberspace and electronic warfare, where the pace of interactions will far exceed human reaction times.
I pointed out that having a human involved in a monitoring role still implies some degree of connectivity, which might be difficult in a contested environment with jamming. Tousley was unconcerned. “We expect that there will be jamming and communications denial going on, but it won’t be necessarily everywhere, all the time,” he said. “It’s one thing to jam my communication link over 1,000 miles, it’s another thing to jam two missiles that are talking in flight that may be three hundred meters apart flying in formation.” Reliable communications in contested areas, even short range, would still permit a human being to be involved, at least in some capacity.
So, what role would that person play? Would this person need to authorize every target before engagement, or would human control sit at a higher level? “I think that will be a rule of engagement-dependent decision,” Tousley said. “In an extremely hot peer-on-peer conflict, the rules of engagement may be more relaxed. . . . If things are really hot and heavy, you’re going to rely on the fact that you built some of that autonomous capability in there.” Still, even in this intense battlefield environment, he maintained, the human plays the important role of overseeing the combat action. “But you still want some low data rate” to keep a person involved.
It took me a while to realize that Tousley wasn’t shrugging off my questions about whether the human would be required to authorize each target because he was being evasive or concealing some secret program; it was because he genuinely didn’t see the issue the same way. Automation had been increasing in weapons for decades—from Tousley’s perspective, programs like CODE were merely the next step. Humans would remain involved in lethal decision-making, albeit at a higher level, overseeing and directing the combat action. The precise details of how much freedom an autonomous system might be granted to choose its own targets, and in which situations, weren’t his primary concern. Those were questions for military commanders to address. His job as a researcher was, as he put it, to “take that technical question off the table.” His job was to build the options. That meant building swarms of autonomous systems that could go into a contested area and conduct a mission with as little human supervision as possible. It also meant building in resilient communications so that humans could have as much bandwidth and connectivity as possible to oversee and direct the autonomous systems. How exactly those technologies were implemented—which specific decisions were retained for the human and which were delegated to the machine—wasn’t his call to make.
Tousley acknowledged that delegating lethal decision-making came with risks. “If [CODE] enables software that can enable a swarm to execute a mission, would that same swarm be able to execute a mission against the wrong target? Yeah, that is a possibility. We don’t want that to happen. We want to build in all the fail-safe systems possible.” For this reason, his number-one concern with autonomous systems was actually test and evaluation: “What I worry about the most is our ability to effectively test these systems to the point that we can quantify that we trust them.” Trust is essential if commanders are to be willing to employ autonomous systems. “Unless the combatant commander feels that that autonomous system is going to execute the mission with the trust that he or she expects, they’ll never deploy it in the first place.” Establishing that trust was all about test and evaluation, which could mean putting an autonomous system through millions of computer simulations to test its behavior. Even so, testing all of the possible situations an autonomous system might encounter, and its potential behaviors in response, could be very difficult. “One of the concerns I have,” he said, “is that the technology for autonomy and the technology for human-machine integration and understanding is going to far surpass our ability to test it. . . . That worries me.”
TARGET RECOGNITION AND ADAPTION IN CONTESTED ENVIRONMENTS (TRACE)
Tousley declined to comment on another DARPA program, Target Recognition and Adaption in Contested Environments (TRACE), because it fell under a different department he wasn’t responsible for. And although DARPA was incredibly open and helpful throughout the research for this book, the agency declined to comment on TRACE beyond publicly available information. If there’s one program that seems to be a linchpin for enabling autonomous weapons, it’s TRACE. The CODE project aims to compensate for poor automatic target recognition (ATR) algorithms by leveraging cooperative autonomy. TRACE aims to improve ATR algorithms directly.
TRACE’s project description explains the problem:
In a target-dense environment, the adversary has the advantage of using sophisticated decoys and background traffic to degrade the effectiveness of existing automatic target recognition (ATR) solutions. . . . the false-alarm rate of both human and machine-based radar image recognition is unacceptably high. Existing ATR algorithms also require impractically large computing resources for airborne applications.
TRACE’s aim is to overcome these problems and “develop algorithms and techniques that rapidly and accurately identify military targets using radar sensors on manned and unmanned tactical platforms.” In short, TRACE’s goal is to solve the ATR problem.
To understand just how difficult ATR is—and how game-changing TRACE would be if successful—a brief survey of sensing technologies is in order. Broadly speaking, military targets can be grouped into two categories: “cooperative” and “non-cooperative” targets. Cooperative targets are those that are actively emitting a signal, which makes them easier to detect. For example, radars, when turned on, emit energy in the electromagnetic spectrum. Radars “see” by observing the reflected energy from their signal. This also means the radar is broadcasting its own position, however. Enemies looking to target and destroy the radar can simply home in on the source of the electromagnetic energy. This is how simple autonomous weapons like the Harpy find radars. They can use passive sensors to simply wait and listen for the cooperative target (the enemy radar) to broadcast its position, and then home in on the signal to destroy the radar.
Non-cooperative targets are those that aren’t broadcasting their location. Examples of non-cooperative targets could be ships, radars, or aircraft operating with their radars turned off; submarines running silently; or ground vehicles such as tanks, artillery, or mobile missile launchers. Finding non-cooperative targets requires active sensors that send signals out into the environment. Radar and sonar are examples of active sensors; radar sends out electromagnetic energy and sonar sends out sound waves. Active sensors then observe the reflected energy and attempt to discern potential targets from the random noise of background clutter in the environment. Radar “sees” reflected electromagnetic energy and sonar “hears” reflected sound waves.
Militaries are therefore like two adversaries stumbling around in the dark, each listening and peering intently into the darkness to hear and see the other while remaining hidden themselves. Our eyes are passive sensors; they simply receive light. In the darkness, however, an external source of light like a flashlight is needed. Using a flashlight gives away one’s own position, though, making one a “cooperative target” for the enemy. In this contest of hiding and finding, zeroing in on the enemy’s cooperative targets is like finding a person waving a flashlight around in the darkness. It isn’t hard; the person waving the flashlight is going to stand out. Finding the non-cooperative targets who keep their flashlights turned off can be very, very tricky.
When there is little background clutter, objects can be found relatively easily through active sensing. Ships and aircraft stand out against their background—a flat ocean and an empty sky. They stand out like a person standing in an open field. A quick scan with even a dim light will pick out a person standing in the open, although discerning friend from foe can be difficult. In cluttered environments, however, even finding targets in the first place can be hard. Moving targets can be discerned via Doppler shifting—essentially the same method that police use to detect speeding vehicles. Moving objects shift the frequency of the returning radar signal, making them stand out against a stationary background. Stationary targets in cluttered environments, though, can be as hard to see as a deer hiding in the woods. Even with a light shined directly on them, they might not be noticed.
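As a rough illustration of the physics behind this (a standard radar relation, not drawn from any of the programs described here), the frequency shift a radar observes is proportional to the target’s speed along the line of sight, which is why even a slowly moving vehicle stands out sharply against stationary clutter:

```latex
% Doppler shift seen by a monostatic radar (standard textbook relation),
% where v_r is the target's radial speed and \lambda is the radar wavelength:
f_d = \frac{2 v_r}{\lambda}
% Example: v_r = 15 m/s and \lambda = 3 cm give f_d = 2(15)/0.03 = 1000 Hz,
% while stationary clutter returns essentially zero shift.
```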
Even humans, whose visual cognitive processing is incredibly complex, have trouble seeing stationary, camouflaged objects. We take for granted how computationally difficult it is to pick out objects that blend into the background. And while radars and sonars can “see” and “hear” in frequencies that humans cannot, military ATR is nowhere near as good as humans at identifying objects amid clutter.
Militaries currently sense many non-cooperative targets using a technique called synthetic aperture radar, or SAR. A vehicle, typically an aircraft, flies in a line past a target and sends out a burst of radar pulses as it moves. This creates the same effect as having a long array of sensors, a powerful technique that sharpens image resolution. The resulting images are often grainy, composed of small dots like a black-and-white pointillist painting. While SAR images are generally not as sharp as images from electro-optical or infrared cameras, SAR is a powerful tool because radar can penetrate clouds, allowing all-weather surveillance. Building algorithms that can automatically identify objects in SAR images is extremely difficult, however. Grainy SAR images of tanks, artillery, or airplanes parked on a runway often push the limits of human abilities to recognize objects, and historically ATR algorithms have fallen far short of human performance.
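As an aside on why flying past the target sharpens the picture (a textbook approximation, not taken from TRACE documentation or the sources above): the flight path itself acts as one long synthetic antenna, and for a simple stripmap SAR the achievable along-track resolution works out to roughly half the length of the physical antenna, regardless of how far away the target is.

```latex
% Stripmap SAR azimuth (along-track) resolution, textbook approximation.
% L_a = physical antenna length, R = range, \lambda = wavelength:
%   synthetic aperture length:  L_{sa} \approx R\lambda / L_a
%   azimuth resolution:         \delta_{az} \approx R \cdot \frac{\lambda}{2 L_{sa}} = \frac{L_a}{2}
% e.g. a 2 m antenna yields roughly 1 m azimuth resolution, at any range.
\delta_{az} \approx \frac{L_a}{2}
```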
The poor performance of military ATR stands in stark contrast to recent advances in computer vision. Artificial intelligence has historically struggled with object recognition and perception, but the field has made rapid gains recently thanks to deep learning. Deep learning uses neural networks, an AI approach loosely inspired by the networks of biological neurons in animal brains. Artificial neural networks don’t directly mimic biology, but they borrow from it. Rather than follow a script of if-then steps for how to perform a task, a neural network operates based on the strength of connections within the network. Thousands or even millions of data samples are fed into the network, and the weights of the connections between its nodes are continually adjusted to “train” the network on the data. In this way, neural networks “learn.” The network’s settings are refined until it produces the correct output, such as the correct image category (for example, cat, lamp, car).
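As a minimal sketch of that training process (a generic toy example in Python, not any military or commercial system), a tiny network’s connection weights are nudged, pass after pass, until its outputs match the labels on a handful of examples:

```python
# Toy sketch of how a neural network "learns": weights on the connections
# between nodes are adjusted repeatedly until the network's outputs match
# the labeled examples. Generic illustration only.
import numpy as np

rng = np.random.default_rng(0)

# Four labeled examples: two inputs each, and the desired output (an XOR pattern).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# One hidden layer of 8 nodes between input and output.
W1, b1 = rng.normal(0, 1, (2, 8)), np.zeros((1, 8))
W2, b2 = rng.normal(0, 1, (8, 1)), np.zeros((1, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(10000):
    # Forward pass: the network's current answers.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backward pass: nudge every weight to shrink the error slightly.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= h.T @ d_out
    b2 -= d_out.sum(axis=0, keepdims=True)
    W1 -= X.T @ d_h
    b1 -= d_h.sum(axis=0, keepdims=True)

# After training, the outputs are typically close to the target labels 0, 1, 1, 0.
print(sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2).round(2))
```

Real image-recognition networks work the same way in principle, just with millions of weights and vastly more training data.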
[Figure: Deep Neural Network]
Deep neural networks are those that have multiple “hidden” layers between the input and output, and they have proven to be a very powerful tool for machine learning. Adding more layers between the input data and the output allows for much greater complexity, enabling the network to handle more demanding tasks. Some deep neural nets have over a hundred layers.
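For illustration only (using the open-source PyTorch library, which is not mentioned in the text, and with arbitrary, made-up layer sizes), a “deep” network is simply several hidden layers stacked between the input and the output:

```python
# Illustrative only: a small "deep" network with three hidden layers,
# built with the open-source PyTorch library. Layer sizes are arbitrary.
import torch.nn as nn

deep_net = nn.Sequential(
    nn.Linear(4096, 1024), nn.ReLU(),  # input -> hidden layer 1
    nn.Linear(1024, 512), nn.ReLU(),   # hidden layer 2
    nn.Linear(512, 256), nn.ReLU(),    # hidden layer 3
    nn.Linear(256, 10),                # output: scores for 10 categories
)
print(deep_net)
```

State-of-the-art image-recognition networks stack convolutional layers rather than simple fully connected ones, which is how they reach depths of a hundred layers or more.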
This complexity is, it turns out, essential for image recognition, and deep neural nets have made tremendous progress. In 2015, a team of researchers from Microsoft announced that they had created a deep neural network that, for the first time, surpassed human performance in visual object identification. Using a standard test dataset of 150,000 images, Microsoft’s network achieved an error rate of only 4.94 percent, narrowly edging out humans, who have an estimated 5.1 percent error rate. A few months later, the team bettered its own result, reaching a 3.57 percent error rate with a 152-layer neural net.
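For context on what those percentages measure (the benchmark in question conventionally scores “top-5” error, meaning the classifier is counted wrong only if the correct label is not among its five most confident guesses), here is a minimal, purely illustrative sketch of the computation using random data:

```python
# Minimal sketch of computing a "top-5" error rate: a prediction counts as
# correct if the true label is among the model's five highest-scoring guesses.
import numpy as np

def top5_error(scores, true_labels):
    """scores: (n_images, n_classes) array of classifier scores.
    true_labels: (n_images,) array of correct class indices."""
    top5 = np.argsort(scores, axis=1)[:, -5:]           # five best guesses per image
    hits = (top5 == true_labels[:, None]).any(axis=1)   # true label among them?
    return 1.0 - hits.mean()

# Toy example with random scores over 1,000 classes for 100 "images".
rng = np.random.default_rng(0)
scores = rng.normal(size=(100, 1000))
labels = rng.integers(0, 1000, size=100)
print(f"top-5 error: {top5_error(scores, labels):.2%}")  # ~99.5% for random guessing
```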