On September 30th, 2022, Tesla exhibited the first prototypes of its humanoid robot ‹Optimus›. According to the company, this robot could become its most important product; Elon Musk says it is ambitious enough that it could one day add considerably to the size of the real global economy. Is this to be taken seriously? And why, in particular, a robot with a human form? We take a look here at some aspects of this phenomenon, without any claim to completeness or to a final judgment.
Many commentators are skeptical of Tesla’s decision to choose the humanoid form. Shouldn’t the simplest and most effective form be sought, instead of going to the trouble of recreating each of the ten human fingers and choosing a shape that is so difficult to keep in balance? Tesla justifies its decision like this: the world we have built and live in as humans is made and intended for humans. So in order to interact with this world and participate in it, a robot needs a human form. Optimus could then theoretically do everything we humans do – first and foremost, according to Tesla, the «dangerous, repetitive and boring» tasks. With a human form, the robot could, for example, move around a building, carry food, use tools or drive a car without any problem. A human form would also make it easier for the robot to interact with us. We could easily show Optimus something, teach him our practical knowledge, and – if we wanted to – integrate him into our everyday life as a new employee, cook or gardener.
The idea that we humans would have to teach robots things is central to this model. Optimus is supposed to be able to combine physical activity with a flexible ‹understanding› of the world, which means that his artificial intelligence (AI) is supposed to improve day by day. It is precisely the interaction with the real world, with people and objects, that is meant to enable this. According to Tesla, it has already demonstrated the ability to learn effectively from interaction with the real world in the development of the ‹Full Self-Driving› system (FSD for short) in its cars: this artificial intelligence has been undergoing training for about two years. It is based on the one hand on visual perception of the world (cameras) and on the other hand on precise observation and imitation of human driving behavior. With Optimus, the same kind of interaction is supposed to drive the learning process, so that the robot comes to understand how to deal with the world.
Whether Tesla succeeds in making something useful out of this robot and producing millions of them, as is planned, remains to be seen. However, recent developments in the field of artificial intelligence show that these systems can, in certain cases, assess a real situation or comprehend a task. For example, the ‹Stable Diffusion› model was developed in 2022 by researchers at the Ludwig Maximilian University of Munich together with the companies Stability AI and Runway. Models of this kind, which are now being trained all over the world, are capable, among other things, of creating images from a simple text description. As a demonstration of this process, we have created an illustration for this article using a comparable system, the ‹DALL-E› network from the company OpenAI. The image was generated from the following description: «An expressive oil painting of a robot painting a robot.» It is important to understand here that the illustration is not a compilation of existing images, but a completely new, artificially created image. We did not post-process it.
Today, neural networks are not only able to comprehend language and text information, as ‹DALL-E› does, but can also create solutions for problems that we humans, even with the help of computers, find difficult or impossible to solve ourselves. For example, the AI program ‹AlphaFold›, developed by DeepMind (a sister company of Google under the parent company Alphabet), can predict the structure of a protein in a few minutes, a task that without the help of artificial intelligence can represent a scientific project lasting several years.
What the robot perspective ultimately brings about is an amalgamation of these different artificial abilities: comprehension of language and the physical world, as well as creative action in response to concrete situations and tasks. This would no longer happen only on an informational level in the virtual world, as has been the case until now, but physically in the external world. So the new possibilities and challenges we have observed in the area of data processing and data exchange for about 80 years could from now on also become apparent on a physical level.
Is this perspective desirable, and will it be possible to realize it in this way? We will leave these questions open here. Many point to dangers, for example that these robots might start defending interests of their own or private interests. On the other hand, our humanness is also challenged by this perspective. With every technological development, new questions arise. Now, these are, among others: What actually is intelligence? What is consciousness? What is the relationship between human beings and technology? How, from the perspective of increasingly robotized production and a growing economy, could redistribution be envisaged – for example in the form of a basic income? And as an important practical question: How can we support and help shape the development of these technologies so that they are realized and directed in the spirit of human ethics and an aesthetic that leaves people free?
Sources and related material
Video: Tesla AI Day 2022
Image generation by OpenAI
Video: How Marcelo and Megan solved a ten-year problem in minutes
Video: Interview by Lex Fridman with Demis Hassabis
Title image: Prototype of Tesla’s humanoid robot. Photo: Tesla, Inc.
Translation: Christian von Arnim