Universe

We're releasing Universe, a software platform for measuring and training an AI's general intelligence across the world's supply of games, websites and other applications. Universe allows an AI agent to use a computer like a human does: by looking at screen pixels and operating a virtual keyboard and mouse. We must train AI systems on the full range of tasks we expect them to solve, and Universe lets us train a single agent on any task a human can complete with a computer.

A sample of Universe game environments played by human demonstrators.

In April, we launched Gym, a toolkit for developing and comparing reinforcement learning (RL) algorithms. With Universe, any program can be turned into a Gym environment. Universe works by automatically launching the program behind a VNC remote desktop; it doesn't need special access to program internals, source code, or bot APIs. Today's release consists of a thousand environments, including Flash games, browser tasks, and games like slither.io and GTA V. Hundreds of these are ready for reinforcement learning, and almost all can be freely run with the universe Python library as follows:

import gym
import universe  # register the Universe environments

env = gym.make('flashgames.DuskDrive-v0')  # any Universe environment ID here
observation_n = env.reset()

while True:
    # agent that presses the Up arrow 60 times per second
    action_n = [[('KeyEvent', 'ArrowUp', True)] for _ in observation_n]
    observation_n, reward_n, done_n, info = env.step(action_n)
    env.render()

The sample code above will start your AI playing the Dusk Drive Flash game. Your AI will be given frames like the one above, 60 times per second. You'll need to have Docker and universe installed.
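Because a single agent process drives a batch of environments, each step takes a list of event lists, one per environment. Here is a minimal stdlib-only sketch of building such a batched action; the `('KeyEvent', key, is_down)` tuple mirrors the event format in the sample above, while `make_action_batch` is a hypothetical helper, not part of the universe library:

```python
# Hedged sketch: constructing a batched, Universe-style action without the
# universe library installed. Each environment in the batch receives its own
# list of VNC input events.

def make_action_batch(num_envs, key='ArrowUp', down=True):
    """Return one action per environment: press (or release) `key`."""
    return [[('KeyEvent', key, down)] for _ in range(num_envs)]

actions = make_action_batch(3)
print(actions[0])  # [('KeyEvent', 'ArrowUp', True)]
```

To release the key on a later step, the same event is sent with `down=False`; holding and releasing keys is how mouse-and-keyboard programs expect input to arrive.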
Our goal is to develop a single AI agent that can flexibly apply its past experience on Universe environments to quickly master unfamiliar, difficult environments, which would be a major step towards general intelligence. There are many ways to help: giving us permission to run your games, training agents across Universe tasks, (soon) integrating new games, or (soon) playing the games.

With support from EA, Microsoft Studios, Valve, Wolfram, and many others, we've already secured permission for Universe AI agents to freely access games and applications such as Portal, Fable Anniversary, World of Goo, RimWorld, Slime Rancher, Shovel Knight, SpaceChem, Wing Commander III, Command & Conquer: Red Alert 2, Syndicate, Magic Carpet, Mirror's Edge, Sid Meier's Alpha Centauri, and Wolfram Mathematica. We look forward to integrating these and many more.

Background

The field of artificial intelligence has seen rapid progress over the last few years. Computers can now see, hear, and translate languages with unprecedented accuracy. They are also learning to generate images, sound, and text. A reinforcement learning system, AlphaGo, defeated the world champion at Go. However, despite all of these advances, the systems we're building still fall into the category of "narrow AI": they can achieve superhuman performance in a specific domain, but lack the ability to do anything sensible outside of it. For instance, AlphaGo can easily defeat you at Go, but you can't explain the rules of a different board game to it and expect it to play with you. Systems with general problem-solving ability, something akin to the human common sense that lets an agent rapidly solve a new hard task, remain out of reach. One apparent challenge is that our agents don't carry their experience along with them to new tasks.
In a standard training regime, we initialize agents from scratch and let them twitch randomly through tens of millions of trials as they learn to repeat actions that happen to lead to rewarding outcomes. If we are to make progress towards generally intelligent agents, we must allow them to experience a wide repertoire of tasks so they can develop world knowledge and problem-solving strategies that can be efficiently reused in a new task.

The Atari 2600 game "Montezuma's Revenge," which is notoriously difficult to learn with reinforcement learning. A human player can immediately see that they control the person, that the skull is probably bad to touch, and that it is probably a good idea to collect the key. An AI agent starting from scratch, without any transfer from past experience, is forced to discover the solution through trial and error, which may require millions of attempts.

Universe Infrastructure

Universe exposes a wide range of environments through a common interface: the agent operates a remote desktop by observing the pixels of a screen and producing keyboard and mouse commands. The environment exposes a VNC server, and the universe library turns the agent into a VNC client. Our design goal for universe was to support a single Python process driving many environments in parallel; at full frame rate, the screen buffers alone consume memory bandwidth on the order of GB/s. We wrote a batch-oriented VNC client in Go, which is loaded as a shared library in Python and incrementally updates a pair of buffers for each environment. After experimenting with many combinations of VNC servers, encodings, and undocumented protocol options, we now routinely drive dozens of environments at 60 frames per second. Here are some important properties of our current implementation:

General. An agent can use this interface (which was originally designed for humans) to interact with any existing computer program without requiring an emulator or access to the program's internals.
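To make the memory-bandwidth concern concrete, here is a back-of-the-envelope calculation. The 1024x768 RGB resolution and 60 fps figures are illustrative assumptions, not numbers quoted from the release:

```python
# Rough memory bandwidth needed to push raw screen pixels for N parallel
# VNC environments. Assumed (illustrative) figures: 1024x768 RGB frames
# refreshed at 60 frames per second.
WIDTH, HEIGHT, BYTES_PER_PIXEL = 1024, 768, 3
FPS = 60

def bandwidth_gb_per_s(num_envs):
    bytes_per_frame = WIDTH * HEIGHT * BYTES_PER_PIXEL  # ~2.36 MB per frame
    return num_envs * bytes_per_frame * FPS / 1e9

print(f"{bandwidth_gb_per_s(20):.2f} GB/s for 20 environments")
```

Under these assumptions a single environment needs roughly 0.14 GB/s, so a batch of twenty is already a multi-GB/s stream, which is why a batch-oriented client that updates buffers incrementally (rather than copying full frames) pays off.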
For instance, it can play any computer game, interact with a terminal, browse the web, design buildings in CAD software, operate a photo editing program, or edit a spreadsheet.

Familiar to humans. Since people are already well versed in the pixels/keyboard/mouse interface, humans can easily operate any of our environments. We can use human performance as a meaningful baseline, and record human demonstrations by simply saving VNC traffic. We've found demonstrations to be extremely useful for initializing agents with sensible policies via behavioral cloning (i.e. supervised learning), followed by RL to optimize for the given reward function.

VNC as a standard. Many implementations of VNC are available online, and some are packaged by default into the most common operating systems, including OS X. There are even VNC implementations in JavaScript, which allow humans to provide demonstrations without installing any new software; this is important for services like Amazon Mechanical Turk.

Easy to debug. We can observe our agent while it is training or being evaluated: we just attach a VNC client to the environment's (shared) VNC desktop. We can also save the VNC traffic for future analysis.

We were all quite surprised that we could make VNC work so well. As we scale to larger games, there's a decent chance we'll start using additional backend technologies. But preliminary signs indicate we can push the existing implementation far: with the right settings, our client can coax GTA V to run at 20 frames per second.

Environments

We have already integrated a large number of environments into Universe, and view these as just the start. Each environment is packaged as a Docker image and hosts two servers that communicate with the outside world: a VNC server, which sends pixels and receives keyboard/mouse commands, and a WebSocket server, which sends the reward signal for reinforcement learning tasks (as well as any auxiliary information such as text or diagnostics) and accepts control messages (such as the specific environment ID to run).

Atari games. Universe includes the Atari 2600 games from the Arcade Learning Environment. These environments now run asynchronously inside a quay.io Docker image and allow the agent to connect over the network, which means the agent must handle lag and low frame rates. Running over a local network or in the cloud, we usually see 60 frames per second.

Human demonstrators playing Atari games over VNC.

Flash games. We turned to Flash games as a starting point for scaling Universe: they are pervasive on the Internet, generally feature richer graphics than Atari, but are still individually simple. We've sifted through a large pool of candidates, and our initial Universe release includes 1,000 Flash games (100 with reward functions), which we distribute in a quay.io Docker image with consent from the rightsholders. This image starts a TigerVNC server and boots a Python control server, which uses Selenium to open a Chrome browser to an in-container page with the desired game, and automatically clicks through any menus needed to start the game.

Human demonstrators playing Flash games over VNC.

Extracting rewards. While environments without reward functions can be used for unsupervised learning or to generate human demonstrations, RL needs a reward function.
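The reward channel is just structured messages over a socket. As a hedged illustration (the field names below are hypothetical, not the actual Universe wire format), a per-step reward message and its round trip might look like:

```python
import json

# Hedged sketch: encoding/decoding the kind of per-step message a
# WebSocket reward server could send alongside the pixel stream.
# Field names ("method", "body", etc.) are illustrative only.

def encode_reward(reward, done, info=None):
    return json.dumps({"method": "env.reward",
                       "body": {"reward": reward, "done": done,
                                "info": info or {}}})

def decode_reward(message):
    body = json.loads(message)["body"]
    return body["reward"], body["done"], body["info"]

msg = encode_reward(1.5, False, {"frame": 42})
print(decode_reward(msg))  # (1.5, False, {'frame': 42})
```

Keeping rewards on a separate, low-bandwidth channel like this lets the same VNC pixel stream serve both reward-bearing RL tasks and reward-free demonstration recording.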