Unity Technologies AI和机器人学习副总裁，前Uber和Amazon机器学习负责人Dr. Danny Lange
Unity机器学习代理（Unity Machine Learning Agents）是Unity推出的第一款机器学习类产品。这款代理通过简单的Python API，使用强化学习和演化法训练智能代理，其主要用途如下：
ML-Agents Community Challenge Winners
Metal Warfare is a Real-time strategy (RTS) game designed for both Entertainment and Education purpose. The education perspective will be on Artificial Intelligence (AI) and Machine Learning (ML), that allow player to design or script AI and may do ML on it.
Inspired by the stealth genre, the project was designed to train an ML Agent to successfully run and hide from a traditional AI which patrols from room to room. A second variation was created to train against a faster AI by utilizing curriculum learning.
It is a Unity AI version, trained with ml-agents, of a little game which is recently popular on WeChat App in China, aiming to make the player jump to next box and never fall down to the ground. It only took two hours to train the AI to play very well.
Autocar with only camera learns to drive using pure RL.
- Process the camera image by detecting lane edges
- Build a model using PPO taking the edge image as input
- Train on a short track for 200 eps
- Train on a long track for another 200 eps
- Upon testing, act deterministically
SpaceY project demonstrates how Unity Machine Learning Agents can be used to teach a rocket to take off from one planet, then fly and land on another one.
This is a vehicle environment with static objects. The vehicle should evade tire piles (- reward) and get stars (+ reward), and it can only move in the lateral direction (3 actions -> left, right, keep).
The sea waves should change so as to bring the drifting ship to the target, overcoming obstacles. However, control over sea waves proved to be very difficult even for people. It seems that wave changes are unpredictable, but AI has learned how to control them!
The submission aims to automate breakfast, with one Machine Learning Agent flipping a pancake from a pan to a plate, and the “Pass the Butter Robot” from Rick and Morty dodging obstacles to deliver butter.
This project uses ML-Agents to stabilize satellite rotating in one axis. AI controls two satellite engines which can be turned on or off. Session starts with rotating satellite. It took two hours of ML-Agents learning to achieve a goal to stabilize satellite.
根据代理、人脑和奖励功能之间的连系，Unity机器学习代理让很多之前不可能实现的训练情景成为可能。这些包括：Single Agent、Simultaneous Single Agent、Cooperative和 Competitive Multi-Agent以及Ecosystem。
这个网球示例显示了Adversarial Self-Play奖励功能。两个相互作用的代理具，都使用逆向奖励功能，并且与同一个大脑相关联。 在双人游戏中，Adversarial Self-Play可以让代理变得越来越熟练，同时始终拥有完美匹配的对手：它自己。