Learn how you can teach an AI agent to balance a pole – no AI background needed

YouTube Video

Building and sharing this simulation has been a completely new experience for me, from creating the video with just a webcam and my computer to sharing this blog remotely from my home office. Of course, we are all learning how to navigate under new circumstances, and are trying different approaches to solving problems.

Simulations have been used to help solve problems since the late 1940s, with broad industrial usage starting in the late 1980s.

Engineers use simulations to design systems virtually and answer “What If?” questions before building a physical system or device. Such simulations have become an integral part of almost any manufacturing or production process or new product development.

In recent years, we’ve learned to use simulations to provide data to advanced AI systems, which helps engineers solve complex control and optimization problems. In this particular simulation, we are teaching an AI system a policy for balancing a pole, using a simulation model of a pole attached to a frictionless cart.
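
To make the simulation model concrete, here is a minimal Python sketch of one time step of a pole on a frictionless cart. It is not the Simulink model used in the video. The masses, pole length, time step, and the names `CartPoleState` and `step` are assumptions chosen for illustration, and the equations are the textbook frictionless cart-pole dynamics integrated with a simple Euler step.

```python
import math
from dataclasses import dataclass

# Assumed physical parameters (illustrative values, not taken from the demo).
GRAVITY = 9.8           # m/s^2
CART_MASS = 1.0         # kg
POLE_MASS = 0.1         # kg
POLE_HALF_LENGTH = 0.5  # m, distance from the pivot to the pole's center of mass
TIME_STEP = 0.02        # s


@dataclass
class CartPoleState:
    x: float          # cart position (m)
    x_dot: float      # cart velocity (m/s)
    theta: float      # pole angle from vertical (rad), positive when tilted toward +x
    theta_dot: float  # pole angular velocity (rad/s)


def step(state: CartPoleState, force: float) -> CartPoleState:
    """Advance the frictionless cart-pole by one Euler step under a horizontal force."""
    total_mass = CART_MASS + POLE_MASS
    pole_mass_length = POLE_MASS * POLE_HALF_LENGTH
    sin_t, cos_t = math.sin(state.theta), math.cos(state.theta)

    # Intermediate term shared by both accelerations.
    temp = (force + pole_mass_length * state.theta_dot ** 2 * sin_t) / total_mass
    theta_acc = (GRAVITY * sin_t - cos_t * temp) / (
        POLE_HALF_LENGTH * (4.0 / 3.0 - POLE_MASS * cos_t ** 2 / total_mass)
    )
    x_acc = temp - pole_mass_length * theta_acc * cos_t / total_mass

    return CartPoleState(
        x=state.x + TIME_STEP * state.x_dot,
        x_dot=state.x_dot + TIME_STEP * x_acc,
        theta=state.theta + TIME_STEP * state.theta_dot,
        theta_dot=state.theta_dot + TIME_STEP * theta_acc,
    )
```

Calling `step` repeatedly with a chosen force plays the role of the simulation: the four state values are what the AI system observes, and the force is the action it controls.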

Balancing a pole is a simple mechanics problem that engineers around the globe understand well, which makes it an ideal example for showing a very different approach. Microsoft has built a powerful and easy-to-use service that enables engineers of all backgrounds to add intelligent control to their systems. We’re providing a complete end-to-end toolchain that does not require a data science background or specialized IT skills for managing large clusters of simulation instances. The system makes complex AI algorithm decisions and tunes hyperparameters automatically, with no further input required from the user.

Engineers use a three-step workflow to quickly train their AI agent.

  1. The engineer teaches the system what the AI should learn using a simple domain-specific programming language called Inkling, which allows engineers to describe the characteristics of their specific simulation model and the optimal behavior of the intelligent controller.
  2. The engineer connects the model with the service to establish the reinforcement learning loop by using the Microsoft Simulink Toolbox. The Microsoft AI system provides input to the model, executes the simulation, and reads back the output to assess the quality of the input against the expected optimal control behavior. This is a one-time step that users execute in their local installation of Simulink.
  3. Microsoft’s Project Bonsai automatically scales simulation instances to reduce training times substantially. AI systems require large numbers of data samples, and running simulation models in parallel at scale on the Azure cloud makes the system learn faster. Users just upload their model files and Project Bonsai does the rest.
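
The loop described in step 2 (provide input, execute the simulation, read back the output, assess the result) can be sketched in a few lines of Python. This is an illustration only, not how Project Bonsai works internally: plain random search over a four-weight linear policy stands in for the reinforcement learning algorithm, and the function names, physical parameters, force magnitude, and failure limits are all assumptions.

```python
import math
import random
from typing import List, Sequence, Tuple

State = Tuple[float, float, float, float]  # x, x_dot, theta, theta_dot


def simulate_step(state: State, force: float, dt: float = 0.02) -> State:
    """One Euler step of a frictionless cart-pole (assumed parameters)."""
    x, x_dot, theta, theta_dot = state
    cart_mass, pole_mass, half_len, g = 1.0, 0.1, 0.5, 9.8
    total = cart_mass + pole_mass
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    temp = (force + pole_mass * half_len * theta_dot ** 2 * sin_t) / total
    theta_acc = (g * sin_t - cos_t * temp) / (
        half_len * (4.0 / 3.0 - pole_mass * cos_t ** 2 / total)
    )
    x_acc = temp - pole_mass * half_len * theta_acc * cos_t / total
    return (x + dt * x_dot, x_dot + dt * x_acc,
            theta + dt * theta_dot, theta_dot + dt * theta_acc)


def run_episode(weights: Sequence[float], max_steps: int = 200) -> int:
    """Run one simulated episode and return how many steps the pole stayed up."""
    state: State = (0.0, 0.0, 0.05, 0.0)  # start with a slight tilt
    for t in range(max_steps):
        # 1. The policy provides an input: push the cart left or right.
        signal = sum(w * s for w, s in zip(weights, state))
        force = 10.0 if signal > 0 else -10.0
        # 2. Execute the simulation for one step.
        state = simulate_step(state, force)
        # 3. Read back the output and assess it against the goal.
        if abs(state[2]) > 0.21 or abs(state[0]) > 2.4:
            return t  # pole fell over or cart left the track
    return max_steps


def train(num_candidates: int = 50, seed: int = 0) -> Tuple[List[float], int]:
    """Random search: try candidate policies and keep the best-scoring one."""
    rng = random.Random(seed)
    best_weights: List[float] = []
    best_score = -1
    for _ in range(num_candidates):
        weights = [rng.uniform(-1.0, 1.0) for _ in range(4)]
        score = run_episode(weights)
        if score > best_score:
            best_weights, best_score = weights, score
    return best_weights, best_score
```

Each call to `run_episode` is independent of the others, which is why this kind of workload parallelizes well: many simulation instances can evaluate candidates at the same time, as step 3 describes.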

The video provides a walk-through of the system and an overview of these steps. We will end up with a trained AI agent that has learned a policy: for any given state, it provides the action that keeps the pole upright.

While this experience of working remotely has been new and challenging, it has also been a great reminder of how much we can achieve with the tools that are available to us in all sorts of situations, including a global pandemic. We understand a lot of you may be working remotely and looking for solutions to support your teams in this new way of working, so we’re interested to hear your thoughts and perspective on this demo.

If you’d like to replicate the example or add intelligent control to your own simulation model, you can sign up for the public preview of our new service. Project Bonsai gives you access to all the tools that you need to connect your model and train it successfully using the Azure cloud.