by Gene Kim
Kurt shrugs, continuing, “If it doesn’t work, we all pretend like it never happened and promise not to bug you again.”
“You’ve got a deal,” Maggie says after thinking a moment. “And here’s the good news. I don’t need approval from anyone—it’s my call. Sarah is already onboard. It’s my belief that the survival of the company depends on this.”
Almost on cue, Maggie looks down at her phone, saying, “Hang on a second, it’s Sarah,” as she taps out a reply. “Uh, she’s getting heat from some people because Data Hub was down earlier today, and she wants to know what’s going on, who caused the problem, and if we need to make an example of them.”
Oh great, Maxine thinks. By working with Maggie, they’re falling even deeper into Sarah’s orbit.
The next morning, Kurt, Maxine, Kirsten, and Maggie are once again in front of Chris. When Kurt proposes temporarily swarming the Promotions efforts, not surprisingly Chris seems exasperated.
“You want my job, Kurt? Cause you’re sure acting like it,” he grumbles. But Maggie impresses on him the importance of accelerating the work to support the Black Friday holiday promotions and how it could generate very visible and quick wins, and Kirsten reassures him that the other efforts can absorb the temporary reassignments.
Chris furrows his brow. Just like last time, he turns to Maxine. “What do you think, Maxine? Do we really need to do this?”
Maxine studies him, realizing how uncomfortable he is with the constantly changing plans, very different from the static plans that characterized the Phoenix Project.
“Without a doubt, Chris. This is clearly where the company needs developers most. We can’t be hamstrung by our org chart or, for that matter, the annual plan we made last year,” Maxine says, reassuringly.
He looks at her for another moment, grunts his approval, and again briskly shoos them out of his office.
Maxine and Kurt give each other discreet thumbs-up as they walk out.
Despite Sarah’s loud demands to find someone to blame for the temporary Data Hub outage yesterday, Kurt refuses to do anything along those lines. Instead, he gathers everyone in a conference room.
Kurt starts the meeting, saying, “Every time we have an outage, we’ll be conducting a blameless post-mortem like this one. The spirit and intent of these sessions are to learn from them, chronicling what happened before memories fade. Prevention requires honesty, and honesty requires the absence of fear. Just like Norm Kerth says in the Agile Prime Directive, ‘Regardless of what we discover, we understand and truly believe that everyone did the best job they could, given what they knew at the time, their skills and abilities, the resources available, and the situation at hand.’
“Let’s first start by assembling a timeline and gathering details on what happened. To help with the process, Maxine pulled together our production telemetry and logs, as well as our chat rooms, just to provide a framework for discussion. The goal is to enable the people closest to the problem to share what they saw, so we can make our systems safer. The only rule is that you can’t say ‘I should have done X’ or ‘If I had known about that, I would have done Y.’ Hindsight is always perfect. In crises, we never actually know what’s really going on, and we need to prepare for a future where we have an equally imperfect understanding of the world.”
He looks over at Maxine, indicating for her to proceed. Maxine is impressed and wonders briefly if Kurt has been coached by Erik before this meeting. If so, she’s glad. But despite Kurt’s strenuous declaration that people shouldn’t be afraid to talk, everyone seems reticent to react … even the members of the Rebellion. Given the ever-growing culture of fear and blame, Maxine has been prepared to model the behaviors you’d see where there’s real psychological safety—Erik’s Fourth Ideal.
But before Maxine can start, Brent blurts out, “I’m so sorry, everyone. It was all my fault. I can’t believe I missed the database connection string. I never make that type of mistake, but I was in such a hurry …”
Brent looks so distraught, as if he’d been wanting to make this confession for days. Kurt puts his hand on Brent’s shoulder and says, “Brent, let’s go back to the Agile Prime Directive. No one is at fault. Everyone did the best they could, given what they knew. Let’s just stick with assembling the timeline. Maxine, please lead the way.”
“My pleasure,” Maxine says, winking at Brent, projecting her laptop on the TV. “I’m choosing to start our timeline at 6:37 p.m., which is after Tom initiated the deployment, after all the tests passed but the app failed to start. The health indicators went red, and Tom was the first to notice. Tom, what exactly did you see?”
“I was watching the logs scroll by in the deployment tool, and I saw the startup messages, as you’d expect, and then I saw a bunch of error messages and a stack trace,” he says, his face darkening, reliving the crisis.
“Got it,” she says, adding to her notes that everyone can see on the TV in the front of the room. “What happened next? I remember feeling almost borderline panic, because despite all our preparation, we were clearly in uncharted waters.” With a wry smile, she adds, “Umm, that’s code for ‘I was so scared I was crapping in my pants.’”
People around the table laugh, and Tom says, “Yeah, me too. I’ve spent decades looking at stack traces, but I’ve never seen them in our deployment tool. I couldn’t stop the window from scrolling on me, and I couldn’t see anything long enough to read it.”
Maxine had no idea, because Tom had seemed so calm and was so effective at making sense of the logs. She is typing when Tom says, “You know, I should have rehearsed looking at logs in this new tool.”
“I totally get that, and I’ve been there … and it totally sucks to feel that way,” Kurt responds. “But remember, we’re doing this so that we can be better prepared for the next crisis, when we will be equally ignorant of entirely new things that are just as important and will be just as obvious in hindsight … This is great stuff, Tom. Keep going. What happened next?”
Over the next hour, Maxine and the group assemble an amazingly detailed and vivid timeline of what actually happened. Once again, she marvels that anything can run in production at all given all the imperfections and sharp edges present in their daily work. Log files scrolling by too fast to read, configuration settings scattered across scores of locations, potential failure points hiding in almost every nook and cranny, surprises lurking around every corner … Given all this, it’s amazing that Data Hub has worked mostly without incident for over a decade, she thinks.
Maxine is certain that everyone has learned something about how Data Hub actually works, in stark contrast with their mental models of how they thought it works. She records a list of five things people will change right away that will likely prevent future outages and will certainly make fixing certain problems faster in the future.
As they adjourn, Maxine smiles and says to Kurt, “Nice job running the meeting.” She means it. It was a pitch-perfect example of improving daily work and fostering a culture of psychological safety, as Erik described in the Third and Fourth Ideals.
Reflecting on the meeting, Maxine now appreciates how tenuous and fleeting the conditions that enable psychological safety can be. It depends on the behavior of leaders, one’s peers, their moods, their sense of self-worth, wounds from their pasts … Given all this, it’s amazing that psychological safety can be created at all, she thinks.
Later that day, Kurt, Maxine, and the rest of the newly selected team are gathered in a conference room to meet with Maggie and the rest of the Promotions team leads.
During introductions, Maxine notes that most of the twenty people in the existing Promotions team are front-end developers—they own the mobile application, the product landing pages for the e-commerce site, in-store applications, and all the applications that Marketing staff use to manage the product promotions lifecycle.
Maggie is presenting. “Thank you to Kurt, Maxine, and the rest of the engineers from the Data Hub team who have volunteered to help us achieve some badly needed near-term wins. I put together some slides to frame some of the higher-level business outcomes that this team was created to make happen.
“Our market share is declining, primarily because we have little presence in the e-commerce market, the fastest growing part of the broader market,” she says. “This is where our competitors and the e-commerce giants are taking share from us. The good news is that we have fiercely loyal customers … the bad news is that the average age of our customer base continues to increase. Our competitors are clearly winning younger customers, a real market segment, despite declining car ownership because of the rise of ride-sharing, such as Uber and Lyft. But the number of car-miles driven per year keeps growing, although who’s doing the driving is definitely changing. But without doubt, demand for car maintenance should grow, not shrink.”
Maggie continues, “For our loyal customers, we know what they buy and how frequently they buy it. We’re focused on enabling personalization and knowledge of current inventory to drive promotion. Until recently, we’ve never been able to use this information to create compelling offers for them.
“We know from our customer research that our core market uses mobile phone apps extensively—in fact,” she points to the projected slide, “here is a picture of Tomas, a customer we interviewed during our market research. He’s a fifty-two-year-old public school teacher. For decades, he’s done all his own car maintenance. It’s something he did with his dad and now it’s something he does with his two teenage daughters and son. He wants his kids to focus on STEM, but he insists that they understand mechanical basics and learn self-reliance.
“He also maintains his wife’s car, and when he has time, his parents’ cars too,” Maggie says. “Tomas doesn’t consider himself to be very technical, but he has six computers at home that he supports for his entire family.
“Right now, he uses a spiral notebook and these file folders to keep records for each car he maintains. He uses his mobile phone all the time, primarily for messaging but also Amazon. He would love to have more of the maintenance routine codified. He loves using Parts Unlimited, but he says he would far rather look in the app for parts than have to call the store. He says he likes the in-store employees and knows many of them by name, but complains about our terrible automated phone system and hates having to listen for which button to push to get to a real person.”
Maxine laughs. No one likes those things.
“For the Thanksgiving and Christmas season, we want to find inventory that we have too much of, combine it with our personalization data, create compelling promotions, and deliver them through our e-commerce site, email, and our mobile apps. We want to drive real revenue through these promotions and increase the average monthly usage of the app to prove that we’re actually building something that they value.
“The Phoenix teams have already identified all the needed interfaces to all the various systems where this data is stored: the customer and orders database, the POS transactions, fulfillment systems, the e-commerce website, and the ad campaign data from the Marketing team.
“One of the most critical sources of data is the in-store inventory systems. We want to promote overstocked items, but we’ve got to be very careful that we don’t promote items that we don’t have on hand in that region.
“We finally rolled out a customer relationship management or CRM system a couple of years ago. But as I described last night, connecting data about the customers, such as which automobiles they own and some demographic information, with the vast wealth of other data we have is a real struggle.
“You can see what we’re trying to achieve, right? If only we had a single view of the customer: top of funnel, bottom of funnel, as well as their complete history with the company. Not only what they purchased, but also what they did on our site, what they browsed, searched for, their credit card transactions, repair history … There’s so much potential!
“If we could combine all this information, we would know so much about what they need, and we’d be so much better able to help them,” she says, almost wistfully.
Maxine nods, impressed. Maggie continues, “After analyzing even the small amount of data that we’ve been able to combine, we’ve built some customer profiles based on their behavior. The archetypes we’ve created so far are: Racing Enthusiast, Frugal Maintainer, Meticulous Maintainer, Catastrophic Late Maintainer, and Happy Hobbyist.”
“For now, we are focusing on the Meticulous Maintainers and the Catastrophic Late Maintainers, because we think these groups have the highest probability of buying for the types of campaigns we’re thinking of,” Maggie says. “We know the Meticulous Maintainers purchase things like oil change products every month without fail. On the other hand, the purchase history of Catastrophic Late Maintainers suggests that they are constantly accumulating more expensive tools and engine parts, just needing a nudge to complete their work.
“On the screen, you can see a bunch of hypotheses we have. These are offers we think will be a big hit with these customer segments. And this report shows the attributes of customers that we’re at risk of losing entirely,” Maggie says. “The problem is that executing on any of these ideas requires months. Anytime we want to do something, we’ve got to make a million changes across all of Phoenix. Phoenix has been rolling for three years, and we haven’t made one targeted promotion yet. And if we can’t experiment, we can’t learn!”
“You haven’t been able to execute even one of these promotion ideas?” Maxine asks, surprised. “How is that possible?!”
Maxine hears grumbles from around the room from the Promotions team. They start describing why.
“We’re still waiting for access to all those back-end systems. The only one we have access to is the inventory management system,” someone complains. “We already have all the data from the TOFU—top of the funnel. We need information about the customer lifetime value from BOFU—bottom of the funnel.”
“The integration teams take six to nine months to create any new integration for us,” says someone else.
“When we do query the inventory management systems, they often shut us down because of the CPU load we generate or the amount of data we copy,” says a third person.
“The APIs from many of the back-end systems don’t deliver the data we need. We’ve been waiting for months for those teams to implement the necessary API changes.”
“We’re still waiting for correct data from the Data Warehouse team, because their reports are always wrong. Last time we found people’s last names in the zip code field.”
“We’re still waiting on getting new database instances created for us.” And on and on.
There are twenty developers on the Promotions team, with a ton of good ideas that could deliver on so many of the Phoenix promises, but they are all bottlenecked on the back-end systems.
Suddenly, Maxine is very sure that they can help. But another part of her is aghast at how helpless these developers have been, unable to complete their work.
Maxine and the rest of the engineers from the Data Hub team smile at each other. Seeing this, Kurt folds his hands in front of himself, smiling. “I think we can help.”
After nearly ninety minutes of excited discussion and brainstorming, everyone adjourns. Maggie takes Kurt and Maxine aside. “That was amazing. We’ve been crying out for help for so long, but this was the first time that we’ve been able to engage like this with anyone.”
“Well, we haven’t done anything yet,” Kurt says. “But hopefully by the end of next week, we’ll have something that we can point to as progress.”
Maxine nods emphatically. She looks at Maggie for several moments and then asks the question she’s been wanting to ask since last night. “Just what does it take to build great products? And how can we as developers help?”
“Where do I start?” Maggie says. “It usually begins with understanding who our customers are, both current and desired. Then we typically segment the customer base, so we know what set of problems each faces. Once we know that, we can understand which of those problems we want to solve, based on market size, ease of reaching them, and so forth. Once we know that, we think about pricing and packaging, offer development, and more strategic issues, such as the overall profitability of the product portfolio and how it affects the achievement of strategic goals.
“I need each of my product managers to be able to live in this domain,” Maggie continues. “Almost all great product organizations create customer personas so that everyone can better understand and relate to the people who you’re building products for. That’s the reason behind so much of the UX and ethnographic research we do. For these personas, we articulate their goals and aspirations, figure out what causes problems for them during a typical day, and describe how they do their daily work. If we do things well, we end up building a set of user stories, framed inside of the business outcomes we want. We should be testing and validating all these assumptions in the marketplace and learning all the time.”
Maxine says, “I love this relentless focus on understanding the customer—it reminds me of the Fifth Ideal.”
Maggie looks at her quizzically.
“I’ll explain later,” Maxine says.
“You know, if you’re that interested in the customer, you could do the same in-store training that all employees and managers do. Two weeks ago, all the new sales managers flew here to headquarters to spend a week with our local store. You missed that, but if you want, there’s a new employee training happening on Saturday. Want to join them?” Maggie asks.
Maxine’s jaw drops. She’s always been envious of people who have been able to do this program, and Maggie just offered her the chance to take part in it. “Holy cow, I’d love to. Frankly, I’m a little bummed that I’ve been here for almost seven years and have never been offered this.”
“I require all my product owners and everyone at the manager level and above to go through it,” Maggie says. “I’d be happy to get it arranged.”