Autonomous Driving
Assembly Line
Introducing Waymo's Research on an End-to-End Multimodal Model for Autonomous Driving
At Waymo, we have been at the forefront of AI and ML in autonomous driving for over 15 years, and we continue to contribute to research in the field. Now, we are sharing our latest research paper on an End-to-End Multimodal Model for Autonomous Driving (EMMA).
Powered by Gemini, a multimodal large language model developed by Google, EMMA employs a unified, end-to-end trained model to generate future trajectories for autonomous vehicles directly from sensor data. Trained and fine-tuned specifically for autonomous driving, EMMA leverages Gemini’s extensive world knowledge to better understand complex scenarios on the road.
Our research demonstrates how multimodal models, such as Gemini, can be applied to autonomous driving and explores the pros and cons of the pure end-to-end approach. It highlights the benefit of incorporating multimodal world knowledge, even when the model is fine-tuned for autonomous driving tasks that require good spatial understanding and reasoning skills. Notably, EMMA demonstrates positive task transfer across several key autonomous driving tasks: training it jointly on planner trajectory prediction, object detection, and road graph understanding leads to improved performance compared to training individual models for each task. This suggests a promising avenue for future research, where even more core autonomous driving tasks could be combined in a similar, scaled-up setup.
Komatsu achieves major autonomous milestones
With a fleet of more than 750 autonomous haul trucks commissioned worldwide, Komatsu customers have hauled more than 10 billion metric tons of material and are adding to that milestone at a rate of over 6 million metric tons per day. Additionally, 10 Komatsu autonomous trucks have each reached 100,000 autonomous hours, a first in the mining industry.
Komatsu launched the FrontRunner Autonomous Haulage System (AHS) in 2008, marking the world’s first commercial application of an AHS. In the years since, Komatsu has continued to innovate alongside customers to meet their evolving needs and offer tailored autonomous solutions to promote enhanced operations on a mine-by-mine basis.
Solving the Last Mile of Autonomous Farming
SparkAI is the first to generalize a solution we originally developed to help self-driving cars overcome unexpected driving scenarios and to make it available to the wider universe of automation applications. We combine people and technology in a lightweight API that resolves machine learning exceptions in real time.
Here’s how it works: in moments of low confidence, the autonomous tractor automatically calls SparkAI’s service, passing imagery and other metadata via a REST API. SparkAI’s objective is to resolve difficult-to-discern details about the scene to support a real-time decision. We do this by combining two key components in real time: (1) cognitive input from multiple human mission specialists trained for the use case, and (2) results from our own proprietary software-based decision systems. SparkAI returns this resolution directly to the robot. The robot then combines this resolution with its pre-existing knowledge of the world to decide on a safe and confident action.
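The flow described above can be sketched roughly as follows. This is only an illustration of the low-confidence fallback pattern, not SparkAI's actual API: the confidence threshold, the payload fields, and every name here (`Perception`, `resolve_scene`, `choose_action`, `fake_service`) are assumptions made for the example. The REST call itself is replaced by a plain callable so the sketch runs without a network.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set

# Hypothetical cutoff; a real system would tune this per use case.
CONFIDENCE_THRESHOLD = 0.8


@dataclass
class Perception:
    label: str         # what the onboard model thinks it sees
    confidence: float  # onboard model confidence in [0, 1]
    image_ref: str     # reference to the captured imagery


@dataclass
class Resolution:
    label: str
    source: str  # "onboard" or "remote"


def resolve_scene(
    perception: Perception,
    request_resolution: Callable[[Dict], str],
    threshold: float = CONFIDENCE_THRESHOLD,
) -> Resolution:
    """Use the onboard result when confident, otherwise ask the remote service."""
    if perception.confidence >= threshold:
        return Resolution(perception.label, "onboard")

    # Low confidence: send imagery and metadata to the resolution service.
    # In practice this would be an HTTP request; the payload shape is assumed.
    payload = {
        "image": perception.image_ref,
        "metadata": {
            "onboard_label": perception.label,
            "onboard_confidence": perception.confidence,
        },
    }
    return Resolution(request_resolution(payload), "remote")


def choose_action(resolution: Resolution, known_obstacles: Set[str]) -> str:
    """Combine the resolution with the robot's own knowledge to pick an action."""
    return "stop" if resolution.label in known_obstacles else "proceed"


def fake_service(payload: Dict) -> str:
    """Stand-in for the remote service; always reports a person in the scene."""
    return "person"


if __name__ == "__main__":
    unsure = Perception(label="unknown", confidence=0.35, image_ref="frame_0042.jpg")
    resolution = resolve_scene(unsure, fake_service)
    print(resolution.source, choose_action(resolution, {"person", "animal"}))
```

Passing the service in as a callable keeps the decision logic testable offline; swapping `fake_service` for a function that performs the real request would not change the rest of the sketch.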