GM’s Cruise Dug Itself A Deep Hole; They Want To Show They See It
General Motors’ Cruise robotaxi unit today announced a series of first steps they are taking during the partly voluntary, partly compelled shutdown that followed an incident in which their vehicle dragged a pedestrian who had been flung in front of it by another impact. Their problems have not ended with the shutdown: a series of recent leaks revealed more issues, as well as morale problems. GM confirmed that they will pause production of the custom “Origin” robotaxi during this period. Leaks published in The Intercept alleged that Cruise internal memos expressed concern about how well their system distinguished between adults and children. Another leak alleged the vehicles have had trouble identifying holes in the road (such as a construction pit), leading to the metaphor headlining this article.
There has also been confirmation that layoffs will come to Cruise, suggesting the shutdown will not be short and morale will get worse before it gets better.
The biggest hole, however, continues to be their lack of disclosure of the details of the pedestrian-dragging incident. Cruise worked hard to get out the message that the event was primarily the fault of a hit-and-run driver who struck a jaywalking pedestrian and flung her in front of their vehicle, but avoided talking, even when prompted, about their own mistake in handling the event after the fact. As such, Cruise faces the problem that the public, press and regulators will wonder, whenever they make any statement, whether they are telling the full story. That suspicion is always there for all companies, since companies rarely tell the truly full story of their troubles, but Cruise crossed a line.
Cruise wants to understand how it got in this hole, and how it can get out. They know that they must regain trust to be a viable business. Indeed, all robocar companies already have a hard job of creating trust in the first place. They all started with suspicion from a decent segment of the public, but Cruise has managed to go further into the red. Their challenge has moved from “Show us you’re safe” to “Show us you’re safe and that you’re telling the whole truth about it.”
The first challenge of showing safety was already very hard. Here are some of the steps Cruise announced, with analysis:
Safety Recall
Cruise has deployed a fix to the algorithms that triggered the car’s attempt to pull over after the crash, a maneuver it began too quickly and almost surely while “unaware” of the pedestrian under the vehicle. They have issued this fix as a “voluntary safety recall.”
While it is good that they have applied this fix, I can’t give it super high marks, as it only fixes the proximate cause of the problem. Fixing proximate causes is good, and improves safety, but the real fix is to be aware that there is somebody under the vehicle. Cruise has not released the source of this error (having somebody thrown in front of your car out of the blue is a fairly rare and difficult situation), but improving on that is the real goal. Having fixed the proximate cause (and judged the chance of recurrence to be very low), they have time to work on the real cause. I would have hoped they would say they are doing that, but being under investigation will make them more closed about these things just when they need to be more open.
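To illustrate the difference between the proximate fix and the deeper one, here is a minimal sketch of the kind of guard the deeper fix implies. All names, types and thresholds here are hypothetical, for illustration only; this is not Cruise’s actual logic:

```python
from dataclasses import dataclass
from enum import Enum, auto


class PostCrashAction(Enum):
    STOP_IN_LANE = auto()  # stay put and summon help
    PULL_OVER = auto()     # clear the travel lane


@dataclass
class CollisionEvent:
    involved_vru: bool        # was a pedestrian or cyclist involved?
    vru_position_known: bool  # do we still have a track on where they ended up?


def choose_post_crash_action(event: CollisionEvent,
                             undercarriage_clear_confidence: float) -> PostCrashAction:
    """Pull over only when confident that nobody is under the vehicle.

    `undercarriage_clear_confidence` stands in for a fused estimate (tracking
    history plus any dedicated sensing) that the space under the car is empty.
    """
    if event.involved_vru and not event.vru_position_known:
        # We hit someone and lost track of them: assume the worst and stay put.
        return PostCrashAction.STOP_IN_LANE
    if undercarriage_clear_confidence < 0.999:
        return PostCrashAction.STOP_IN_LANE
    return PostCrashAction.PULL_OVER
```

The proximate fix tunes when the pull-over maneuver triggers; the deeper fix is making the “is anyone under the car?” estimate reliable enough that a guard like this can be trusted.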
This is not the first time. When one of their cars hit the back of a bus, they quickly fixed the stupid bug that made it happen, but didn’t announce a fix for the larger problem: even a stupid bug that creates a perception error should not be able to drive you into the back of a bus.
The use of the recall mechanism has never made sense to me. Cruise and all other teams release software updates with safety fixes very frequently. Declaring one of them to be an official voluntary recall seems to be mostly for show, and to make nice with NHTSA, which regulates recalls (and is investigating them). NHTSA should work out a way to move this process into the modern world, where constant software updates are a fact of life.
Chief Safety Officer
Cruise’s VP of Safety, Louise Zhang, gets promoted to the C-suite, at least temporarily. They should already have been paying attention to her at that level; perhaps this will improve that. Managing safety is much more complex and subtle than people imagine. All companies make strong declarations that safety is their top priority, and it certainly had better get a lot of attention. But it’s never quite that simple: all companies have other priorities, such as functionality, cost and time to market, which live in balance with safety. The true art of a Chief Safety Officer (and CEO) is to find that delicate balance. When your product is intended to improve road safety, slowing down deployment in the interest of perfection in safety is the wrong choice.
3rd Party Legal and Engineering Reviews
Cruise retained a law firm to examine their procedures and how they interacted with other parties. It’s always good to get an outsider’s view to avoid the internal reality distortion field. They also retained Exponent (a failure-engineering consultancy) to look at their engineering mistakes. Cruise should already know many of these, but can still learn more. Everything a robocar does is logged in extreme detail, so the forensics are fairly straightforward. The harder issues are around the procedures that allowed the mistakes to happen. Some will always happen, but you try to design your processes to catch as many as you can, particularly the scary ones.
Cruise doesn’t say whether the reports from these third parties will be published. Publication is the other big value of third-party analysis, particularly if you use a name-brand firm like Exponent. They won’t have particular self-driving expertise, just general failure-engineering expertise, but they do have a reputation for objectivity to protect.
I would recommend Cruise also bring in (at lower cost) some people with self-driving and robotics expertise to get a reality check on their thinking, alongside the experts in engineering failure.
New Procedures
They will, and should, look at their procedures around safety management, safety engineering, transparency and their wounded relationship with the community and press. In a previous article, I outlined some steps they could take in these directions, because being truly open with the public is very difficult.
In particular, the general public’s intuitions about safety are flawed, in that they will not lead to the best reduction of risk and improvement of safety. It’s not that the public is stupid, but rather that there are complex interactions of issues from moral philosophy, risk analysis, statistics and emotion. The public’s innumeracy about the risks involved in driving is legendary, even though driving is by far our riskiest common activity. Many studies have shown that people naturally misjudge road risks in relation to other risks, and it shows in their words and actions. That goes both ways: we are unafraid of certain high risks and too afraid of lower ones. We are greatly affected by how personal the risks and events feel. How well the public will embrace a technology that reduces risk and harm, but does so by introducing lesser but different risks and harms, is not a well understood problem.
More Simulation
Not having access to internal engineering at Cruise, it’s hard to say too much, but from the outside I get the sense they need to do more and better simulation, though I know they already do a fair bit.
Every team should have a very extensive library of simulation scenarios covering every risky situation they have ever seen, thought of, or heard that somebody else has thought of. Each scenario should be built to generate vast numbers of minor variations on the core situation, including ridiculous ones that push the limits.
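To make that concrete, here is a minimal sketch of what generating variations (“fuzzing”) of one core scenario can look like. The scenario class and its parameters are made up for illustration; they are not any company’s actual format:

```python
import random
from dataclasses import dataclass


@dataclass
class ThrownPedestrianScenario:
    """One core scenario: a pedestrian is flung into our path by another impact."""
    impact_speed_mps: float   # speed of the striking vehicle
    landing_offset_m: float   # lateral offset of where the pedestrian lands
    warning_time_s: float     # how much warning our car gets before they land
    night: bool


def fuzz(core: ThrownPedestrianScenario, n: int, seed: int = 0):
    """Yield n minor variations of the core scenario for the simulator to run."""
    rng = random.Random(seed)
    for _ in range(n):
        yield ThrownPedestrianScenario(
            impact_speed_mps=core.impact_speed_mps * rng.uniform(0.5, 1.5),
            landing_offset_m=core.landing_offset_m + rng.uniform(-2.0, 2.0),
            warning_time_s=core.warning_time_s * rng.uniform(0.3, 2.0),
            night=rng.random() < 0.5,
        )


# Each variant is run against every new software build; any run where the
# planner keeps moving with the pedestrian under the car gets flagged.
```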
Teams have driven millions of miles to see problems and add them to these test suites. But Cruise has had several incidents which show their own suites are incomplete, missing some that don’t seem that obscure:
- Driving into caution tape
- Driving into downed power lines (though simulation of very thin things is challenging.)
- Detecting a vulnerable road user underneath or being dragged by the car
- Post-incident movements
- A corner with occluded views and an emergency vehicle running its siren
- Red light runners in general (they’ve been hit by two; Waymo seems to do better and has set a bar on that.)
- Cell phone outages and overloads
- Wet concrete in construction zones
- Hitting the back of a bus when main perception has it in the wrong place
- Unprotected left with aggressive oncoming driver in wrong lane
- Pits and holes in the road
I will argue that most of these should be present in a test suite. And I bet they all are in everybody’s test suites now that they have been made famous. Making a collection that complete is expensive and time consuming. To that end, 15 years ago I proposed an open simulation library which would encourage everybody, including academics, to create and share scenarios. Much later, through Deepen.AI (in which I am an investor and advisor) we created a project called the “Safety Pool,” which is now managed by the University of Warwick with initial participation from the World Economic Forum (WEF). This pool allows companies to contribute the good scenarios they create, and to receive back many more scenarios from the other members. It’s a win-win if the companies don’t treat their scenario libraries as too proprietary. In fact, it’s a win even then, if you can get back much more than you put in.
This is not trivial, because it’s hard to translate scenarios made by others for your own systems, but that effort is worth it, and it needs more funding. I would call for companies to fund tools that make translation easier, and that make it easier for independent parties to contribute. I would like to see the many academic courses on self-driving and robotics challenge their students to come up with useful test scenarios not already in the pool. Indeed, it would be cool to see a sort of “bug bounty,” so that anybody who generates a realistic scenario that makes a vendor’s system do something wrong gets paid. It’s definitely worth it for the paying company if they find and fix a bug.
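As a small illustration of why translation takes work, here is a sketch of importing a shared scenario into one company’s internal types. The pool entry format below is completely made up (real exchange efforts, such as ASAM OpenSCENARIO, define far richer formats):

```python
from dataclasses import dataclass


@dataclass
class InternalActor:
    kind: str            # our simulator's own taxonomy, e.g. "PEDESTRIAN"
    speed_mps: float
    heading_deg: float


@dataclass
class InternalScenario:
    map_name: str
    actors: list[InternalActor]


# Hypothetical mapping from a shared pool's actor taxonomy to ours. Each company
# must maintain and extend tables like this as the pool grows, which is part of
# why translation is not trivial.
POOL_TO_INTERNAL_KIND = {
    "vru/pedestrian": "PEDESTRIAN",
    "vru/cyclist": "CYCLIST",
    "vehicle/car": "CAR",
}


def import_pool_scenario(entry: dict) -> InternalScenario:
    """Translate a (hypothetical) vendor-neutral pool entry into our format."""
    actors = [
        InternalActor(
            kind=POOL_TO_INTERNAL_KIND[actor["type"]],
            speed_mps=actor["speed_kph"] / 3.6,   # pool uses km/h; we use m/s
            heading_deg=actor["heading_deg"],
        )
        for actor in entry["actors"]
    ]
    return InternalScenario(map_name=entry["map"], actors=actors)
```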
In time, almost every crash recorded on the road by dashcams, security cams or sensor-equipped cars should be turned, ideally in an automated way, into a scenario for testing, with proper fuzzing (exploring variations), so that companies and the public can know that’s one more thing every car is less likely to get wrong.
Cruise should have simulated 10,000 ways a pedestrian could get thrown under their car, and what the car would do. With every new build. Full-sensor simulations (which try to duplicate all sensors and perception) are expensive, but most simulation is done post-perception, along the lines of “If we see a pedestrian right in front of us, what do we do?” Then you flag and fix it when the car does the wrong thing, ideally fixing not just the proximate cause but the deeper ones.
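For readers unfamiliar with the term, here is a stripped-down sketch of a post-perception test: instead of rendering sensor data, you feed the planner a perceived world state directly and check its response. Everything here (the types, the thresholds, the toy planner) is illustrative rather than any company’s real code:

```python
from dataclasses import dataclass


@dataclass
class PerceivedObject:
    kind: str          # e.g. "PEDESTRIAN"
    distance_m: float  # ahead of our front bumper
    lateral_m: float   # offset from our centerline


def test_pedestrian_dead_ahead(planner):
    """Inject a perceived pedestrian directly in our path and check the response."""
    world = [PerceivedObject(kind="PEDESTRIAN", distance_m=3.0, lateral_m=0.0)]
    commanded_speed = planner(world, current_speed_mps=8.0)
    # The only acceptable answer is a full stop.
    assert commanded_speed == 0.0, f"planner commanded {commanded_speed} m/s"


def toy_planner(objects, current_speed_mps):
    """Trivial stand-in planner: stop if anything is close and dead ahead."""
    for obj in objects:
        if obj.distance_m < 10.0 and abs(obj.lateral_m) < 1.5:
            return 0.0
    return current_speed_mps


test_pedestrian_dead_ahead(toy_planner)  # a real suite sweeps thousands of variants
```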
Ideally, if Cruise does some of this work, they and others will join the pool, not competing on safety but making it better for everybody.
In addition, it’s worth sharing the work to simulate things like the coming California “big one” earthquake, or fires like those that struck Maui and Paradise, CA. Today, I fear robotaxis could make some disasters worse, particularly if data networks go out, as they did in those events. They must be sure they don’t make things worse, and ideally make them better. With thought, they actually have the ability to make disasters much less severe. Imagine robotaxis that can immediately map all road hazards and know the ways out. Robotaxis that drive back into the evacuation zone empty, with oxygen tanks on board, thermal cameras that avoid hot spots, and imaging radar that can drive through thick smoke. Imagine special fireproof robotaxis that can rescue people surrounded by fire. These will happen some day, and they will have been tested only in simulation before they are called upon to save lives.