Tesla says it will spend more than $1 billion on its Dojo supercomputer by the end of 2024 to help develop autonomous vehicle software.
Dojo was first mentioned by CEO Elon Musk at a Tesla Investor Day in 2019. It was specifically designed for training the machine learning models needed for video processing and recognition to enable vehicles to be self-driving.
During Tesla’s second-quarter earnings call this week, Musk said Tesla wasn’t going to be “open-loop” on its Dojo spending, but the sum at stake would certainly be “north of a billion through the end of next year.”
“To copy us, you’d also need to spend billions of dollars on drive computing,” Musk said, adding that developing a reliable self-driving system is “one of the hottest issues of all time.”
“You need the data and you need the training computers, the things to get there at scale towards a widespread autonomy solution.”
Musk pointed out that training complex machine learning models requires huge volumes of data, the more the better, and that’s what Tesla has access to, thanks to all of its vehicle telemetry.
“When it comes to Autopilot and the Dojo, in order to develop autonomy we obviously need to train our neural network with data from millions of vehicles. It’s been proven time and time again, the more training data you have, the better the results,” he said.
“It barely works at 2 million [training examples]. At 3 million, it’s like, wow, OK, we see something. But then you get to, like, 10 million training examples, it gets unbelievable. So there’s simply no substitute for a massive amount of data. And obviously, Tesla has more vehicles on the road collecting this data than all the other companies combined. I think maybe even an order of magnitude,” Musk said.
Regarding the Dojo system itself, Musk said it was designed to drastically reduce the cost of neural network training and was “optimized somewhat” for the type of training Tesla needs, which is video training.
“We’re seeing a demand for really vast training resources. And we think we could hit 100 exaFLOPS internal neural network training capacity by the end of next year,” Musk said, which is a lot of computing power, to say the least.
Musk believes that with all the training data and a “high-efficiency inference computer” in the car, Tesla’s self-driving system will soon make its vehicles not only as proficient as a human driver, but ultimately much better. When? He did not say, and he has form for making big claims.
“To date, more than 300 million miles have been covered with FSD [Full Self-Driving] Beta. This number of 300 million miles is going to seem very small, very quickly. And FSD will go from being as good as a human to then being vastly better than a human. We see a clear path to fully autonomous driving 10 times safer than the average human driver,” he said.
That’s important, Musk explained, because “Right now, I believe there’s something on the order of a million automotive deaths a year. And if you’re 10 times better than a human, that would still mean 100,000 deaths, so, it’s like, we’d rather be a hundred times better, and we want to achieve as perfect a safety as possible.”
Dojo isn’t the only supercomputer Tesla has for video training. The company has also built a computing cluster equipped with 5,760 Nvidia A100 GPUs, but Musk said it just couldn’t get enough GPUs for this task.
“We’re actually going to take the hardware as fast as Nvidia gets it to us,” he said, adding, “If they could get us enough GPUs, we might not need Dojo, but they can’t because they have so many customers.” ®