Andrej Karpathy is the Director of Artificial Intelligence at Tesla. He just gave the keynote address at the CVPR 2021 Workshop on Autonomous Vehicles and it is truly fascinating to watch.
Many of us might assume that, when it comes to creating a system that allows a car to drive autonomously, more data from more kinds of sensors is always better. That may not actually be the case. At least, not according to Karpathy. In the video above, he lays out why Tesla has moved to a pure vision-based approach and how it works.
This is quite a departure from the way other companies have tried to solve autonomous driving – they typically incorporate LIDAR, high-definition maps, radar, and cameras. Karpathy argues that the vision-based approach is superior because it can "scale" much better. That is, it can be made to work in a wide variety of environments, not just those that have been thoroughly mapped out.
Tesla has already started moving forward with this approach in its production vehicles. The cars being produced today no longer have radar sensors in them. This may have caused some concern and, at least temporarily, cost the company the IIHS Safety Award: Auto Emergency Braking (AEB) and Forward Collision Warning (FCW) are not currently functional, though we understand the company expects to have them re-enabled in a few weeks. Still, Karpathy makes a convincing case that the gamble will pay off.
His confidence stems from the advances the company has made with vision. The data returned by the cameras is now so superior to that of the radar that the latter is not especially useful. Karpathy points to a tweet from his boss to underline his point: "When radar and vision disagree, which one do you believe? Vision has much more precision, so better to double down on vision than do sensor fusion."
Karpathy explains that the eight cameras in a Tesla provide a lot more information about a vehicle's surroundings compared with any other type of sensor. And contrary to what people might expect, this extends to areas like depth and velocity of objects. This is where radar falls down.
Generally, radar provides pretty good information on depth and velocity, but it doesn't have great vertical resolution, so it can have issues with a manhole cover or a shadow created by an overpass and give false readings. This may be why a vehicle's AEB system sometimes brakes suddenly for these false positives. Karpathy explains that neural networks trained on data sets that are huge, clean, and diverse can deliver better overall quality on these metrics than radar.
It's interesting to note that the huge fleet of Tesla vehicles on the road is helping produce these data sets. Karpathy mentions that while a car is operating on its Advanced Driver Assistance System (ADAS), such as Autopilot, a "shadow" program runs in the background and compares what it sees with what the active ADAS sees.
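To make the idea of that background comparison concrete, here is a minimal Python sketch. It is an illustration only, not Tesla's implementation: the `Estimate` type, the function names, and the thresholds are all hypothetical. It assumes each system reports a distance and a relative velocity for the same object, and it flags the frames where the two disagree enough to be worth a closer look.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    """Hypothetical per-frame output for one tracked object."""
    distance_m: float    # estimated distance to the object, in metres
    velocity_mps: float  # estimated relative velocity, in metres per second


def disagrees(active: Estimate, shadow: Estimate,
              max_distance_gap_m: float = 2.0,
              max_velocity_gap_mps: float = 1.0) -> bool:
    """True if the two estimates differ by more than either threshold.

    The thresholds are made-up values chosen for illustration.
    """
    distance_gap = abs(active.distance_m - shadow.distance_m)
    velocity_gap = abs(active.velocity_mps - shadow.velocity_mps)
    return distance_gap > max_distance_gap_m or velocity_gap > max_velocity_gap_mps


def frames_to_review(frames: list[tuple[Estimate, Estimate]]) -> list[int]:
    """Return the indices of frames where the active and shadow outputs diverge."""
    return [i for i, (active, shadow) in enumerate(frames)
            if disagrees(active, shadow)]


# Each pair is (active system's estimate, shadow system's estimate).
example_frames = [
    (Estimate(30.0, -1.0), Estimate(30.5, -1.2)),  # close agreement
    (Estimate(30.0, -1.0), Estimate(45.0, -1.0)),  # large distance gap
    (Estimate(10.0, 0.0), Estimate(10.0, 3.0)),    # large velocity gap
]
flagged = frames_to_review(example_frames)
```

In a scheme like this, the flagged frames would be the candidates for adding to a training set, since they mark the situations where one of the two systems is likely wrong.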
There is a lot more to his presentation, of course, so if you haven't already, check out the video above. It will be interesting to see in the coming months whether his enthusiasm for neural networks solving for self-driving is well placed.
Autonomous driving is the area that has most frustrated Musk's optimistic predictions in the past, and there is no shortage of critics and naysayers, perhaps especially in the autonomous-driving community, standing by to trumpet any failure of this unique approach.
Success in achieving a high level of autonomous function will go a long way toward helping fulfill Tesla's automaking ambitions. If the company can develop a safe system ahead of its competitors, that could prove to be a large advantage in driving demand further. Certainly, we'll be watching closely to see how it all works out. Or doesn't.