ChatGPT for Robotics: A Promising Future According to Microsoft

06-03-2023 | By Robin Mitchell

As ChatGPT, the natural language chatbot, continues to make headlines around the world, Microsoft has recently published a new article describing how it could be used to program complex robotic systems, giving humans the ability to use natural language to command robotic arms and legs, as well as drones. This programmability would allow for more control over robotic devices, reducing flawed logic and arbitrary behaviour and making them more general-purpose and efficient. What challenges does robotics face concerning programmability, what did Microsoft publish, and could ChatGPT be the start of futuristic robotic systems?

What challenges does robotics face with programmability?

Any engineer who has worked in the field of robotics understands the immense complexities involved, especially when it comes to programming. Basic numerical control systems such as CNC machines are comparatively easy to understand because the positioning system is Cartesian, with independent X, Y, and Z coordinates. For example, moving the head of a CNC machine from one place to another can be achieved by raising the Z axis (if needed), setting new values for X and Y, and then lowering the Z axis.
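The lift, traverse, lower sequence described above can be sketched in a few lines of Python. This is only an illustration: the function name rapid_move and the safe_z clearance height are assumptions made for the example, not part of any real CNC controller.

```python
def rapid_move(position, target_xy, safe_z=5.0):
    """Return the waypoints needed to move the head to target_xy.

    position is the current (x, y, z) of the head. safe_z is a
    hypothetical clearance height above the workpiece.
    """
    x, y, z = position
    tx, ty = target_xy
    travel_z = max(z, safe_z)
    waypoints = []
    if travel_z != z:
        waypoints.append((x, y, travel_z))   # raise Z only if needed
    waypoints.append((tx, ty, travel_z))     # set new X and Y
    if travel_z != z:
        waypoints.append((tx, ty, z))        # lower Z to the working height
    return waypoints


for point in rapid_move((0, 0, 0), (10, 20)):
    print(point)
```

Each axis is handled independently, which is exactly what makes Cartesian machines simple to reason about.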

However, in the case of robotic arms and legs, the use of angular positions and axes of rotation that affect one another results in highly complex kinematics. For example, getting a robotic arm with 4 degrees of freedom to draw a flat square requires all joints to move in a very particular way, and the equations that describe such motion are not for the faint of heart. This is why many robotic systems provide inverse kinematics interfaces that allow users to describe a specific motion, leaving the robotic system to calculate how it should move to achieve that motion.
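To give a feel for what an inverse kinematics interface has to do, here is a minimal sketch for a much simpler case than the 4-degree-of-freedom arm mentioned above: a planar arm with two links. The function names and the unit link lengths are assumptions made for this example, and only one of the two possible elbow configurations is returned.

```python
import math


def two_link_ik(x, y, l1=1.0, l2=1.0):
    """Return (shoulder, elbow) angles in radians that place the tip at (x, y)."""
    d2 = x * x + y * y
    # Law of cosines gives the elbow angle from the distance to the target.
    cos_elbow = (d2 - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    if not -1.0 <= cos_elbow <= 1.0:
        raise ValueError("target is out of reach")
    elbow = math.acos(cos_elbow)
    # Shoulder angle: direction to the target minus the offset caused by link 2.
    shoulder = math.atan2(y, x) - math.atan2(
        l2 * math.sin(elbow), l1 + l2 * math.cos(elbow)
    )
    return shoulder, elbow


def two_link_fk(shoulder, elbow, l1=1.0, l2=1.0):
    """Forward kinematics, used here to check the inverse solution."""
    x = l1 * math.cos(shoulder) + l2 * math.cos(shoulder + elbow)
    y = l1 * math.sin(shoulder) + l2 * math.sin(shoulder + elbow)
    return x, y


# Corners of a flat square that the tip should visit.
square = [(0.5, 0.5), (1.0, 0.5), (1.0, 1.0), (0.5, 1.0)]
for corner in square:
    shoulder, elbow = two_link_ik(*corner)
    print(corner, round(math.degrees(shoulder), 1), round(math.degrees(elbow), 1))
```

Even in this reduced case, both joint angles change at every corner of the square, and an arm with more joints only adds to the coupling.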

Another option that robotic systems can use for programming motion is programming by example. In the case of robotic arms, it is possible for a very large arm used in heavy industrial processes to mimic the actions of a smaller arm that a human moves by hand. While this is ideal for simple, repetitive motion, it does not allow for real-time changes that may be needed when conditions vary (such as picking up a different object or accounting for objects positioned somewhere else).

Overall, the complex nature of robots and their programming makes it difficult to deploy them at scale. A factory that installs 100 robots can only keep them working if engineers are on hand to program and maintain them, even though the actions those robots need to complete could easily be described by anyone working at the factory (e.g., place box here, pick up these objects, weld that line).

Microsoft explores ChatGPT as a robotics control platform

Microsoft's recently published article describes how ChatGPT could be used to program complex robotic systems, letting people give commands in natural language rather than in code.

Simply put, engineers would start by developing an underlying API that enables other software to easily control robotic motions, such as setting the position of an axis or joint. The function calls in the API would be given easy-to-understand names and made accessible to a ChatGPT application. From there, a spoken command, such as “Arrange these blocks in order of size”, could be fed into the ChatGPT engine, which then determines how that action should be carried out and calls the relevant functions in the API.
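The arrangement described above can be sketched roughly as follows. Everything here is hypothetical: the RobotAPI class, its method names, and the hard-coded plan (which stands in for what a language model might return) are assumptions made for illustration, and no real ChatGPT or robot interface is called.

```python
import inspect


class RobotAPI:
    """Hypothetical high-level wrapper; a real one would drive the motors."""

    def __init__(self):
        self.log = []

    def move_to(self, x, y, z):
        """Move the gripper to position (x, y, z) in metres."""
        self.log.append(("move_to", x, y, z))

    def grab(self):
        """Close the gripper."""
        self.log.append(("grab",))

    def release(self):
        """Open the gripper."""
        self.log.append(("release",))


def public_commands(api):
    """Names of the functions the language model is allowed to call."""
    return [
        name for name in sorted(dir(api))
        if not name.startswith("_") and callable(getattr(api, name))
    ]


def describe_api(api):
    """Build the function list that would be included in the prompt."""
    lines = []
    for name in public_commands(api):
        func = getattr(api, name)
        lines.append(f"{name}{inspect.signature(func)}: {func.__doc__}")
    return "\n".join(lines)


def run_plan(api, plan):
    """Execute (name, args) pairs, rejecting anything outside the API."""
    allowed = set(public_commands(api))
    for name, args in plan:
        if name not in allowed:
            raise ValueError(f"unknown command: {name}")
        getattr(api, name)(*args)


api = RobotAPI()
print(describe_api(api))

# Stand-in for the model's answer to "move the block 20 cm to the right".
plan = [
    ("move_to", (0.1, 0.2, 0.0)),
    ("grab", ()),
    ("move_to", (0.3, 0.2, 0.0)),
    ("release", ()),
]
run_plan(api, plan)
print(api.log)
```

Checking every requested call against the list of allowed names is one simple way to keep generated commands inside the boundaries the engineers have defined.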

In their paper, the researchers from Microsoft provided OpenAI's ChatGPT with a series of functions related to real-world, low-level capabilities of a robotic system that could be combined to perform more complex tasks. One specific example shows how the researchers presented ChatGPT with a few basic functions, such as locate_object(), go_to_location(), pick_up(), and use_item(), and asked ChatGPT to use those functions to write a piece of code that would make an omelette. The resulting output, albeit simple, was able to determine all actions needed to make an omelette from the basic list of functions, something that a low-level API could easily translate into motion.
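To show the general shape of such a program, here is an illustrative sketch built on the four function names mentioned above. It is not the output ChatGPT produced in Microsoft's experiment: the function signatures, the stub bodies, the KITCHEN lookup table, and the order of steps are all assumptions made for this example.

```python
actions = []  # record of every low-level call, in order

# Hypothetical map of where each item is kept.
KITCHEN = {"eggs": "fridge", "pan": "cupboard", "stove": "counter"}


def locate_object(name):
    """Stub: report where an object is."""
    actions.append(f"locate {name}")
    return KITCHEN[name]


def go_to_location(location):
    """Stub: move the robot to a location."""
    actions.append(f"go to {location}")


def pick_up(name):
    """Stub: pick up an object."""
    actions.append(f"pick up {name}")


def use_item(name):
    """Stub: use an object."""
    actions.append(f"use {name}")


def fetch(name):
    """Find an item, walk to it, and pick it up."""
    go_to_location(locate_object(name))
    pick_up(name)


def make_omelette():
    """Combine the basic functions into one higher-level task."""
    for item in ("eggs", "pan"):
        fetch(item)
    go_to_location(locate_object("stove"))
    use_item("pan")
    use_item("eggs")
    use_item("stove")


make_omelette()
print("\n".join(actions))
```

The high-level task contains no motion code at all; it only sequences the named functions, which is the part a language model is being asked to do.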

Another example tested how ChatGPT would program a ball catcher that would utilise OpenCV libraries to read data from a camera, determine the position of an orange ball, and then try to catch that ball by positioning a robotic system for interception. The system was further tested with real-world examples, including a drone that was able to translate instructions from ChatGPT to look for physical objects and hover around them, including a can of coke and a microwave.
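The core idea behind the ball catcher, finding the orange pixels in a camera frame and steering towards their centre, can be sketched as follows. To keep the example self-contained it uses plain Python lists and the standard colorsys module in place of OpenCV, and the hue thresholds, function names, and tiny test frame are all assumptions made for illustration rather than anything taken from Microsoft's example.

```python
import colorsys


def is_orange(rgb):
    """Rough colour test in HSV space; the thresholds are illustrative guesses."""
    r, g, b = (channel / 255.0 for channel in rgb)
    hue, sat, val = colorsys.rgb_to_hsv(r, g, b)
    return 0.03 <= hue <= 0.12 and sat > 0.5 and val > 0.5


def find_ball(frame):
    """Return the (column, row) centroid of orange pixels, or None if absent.

    frame is a list of rows, each row a list of (r, g, b) tuples.
    """
    col_sum = row_sum = count = 0
    for row_idx, row in enumerate(frame):
        for col_idx, pixel in enumerate(row):
            if is_orange(pixel):
                col_sum += col_idx
                row_sum += row_idx
                count += 1
    if count == 0:
        return None
    return col_sum / count, row_sum / count


def offset_from_centre(centroid, width, height):
    """How far, in pixels, the catcher must move to centre the ball."""
    col, row = centroid
    return col - (width - 1) / 2, row - (height - 1) / 2


BLACK, ORANGE = (0, 0, 0), (255, 128, 0)
frame = [[BLACK] * 5 for _ in range(5)]
frame[1][3] = ORANGE
frame[1][4] = ORANGE

ball = find_ball(frame)
print(ball, offset_from_centre(ball, 5, 5))
```

A real system would repeat this on every camera frame and feed the offset into the motion commands.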

Example of ChatGPT's output for making an omelette

Could ChatGPT be the start of futuristic robotic systems?

While engineers continue to develop robotic systems that will be capable of walking about city centres and helping people with their daily tasks, it is highly likely that ChatGPT will play a critical role in robot/human relations. 

The ability of ChatGPT to understand natural language, as well as to develop complex programs and logic that perform arbitrary actions, will enable anyone to direct robotic systems without needing to understand how they work. Instead, such an automated platform would behave like any human, requiring only a brief explanation of its task, such as “take out the trash”, “search for this missing item”, or “deliver this parcel for me to this address”.

Of course, engineers still need to write the low-level APIs that break down complex tasks into simple function calls, and this is easier said than done. But considering the many advances that robotics has seen in the last decade, this is more likely to become reality than to remain fantasy.


By Robin Mitchell

Robin Mitchell is an electronic engineer who has been involved in electronics since the age of 13. After completing a BEng at the University of Warwick, Robin moved into the field of online content creation, developing articles, news pieces, and projects aimed at professionals and makers alike. Currently, Robin runs a small electronics business, MitchElectronics, which produces educational kits and resources.