Pitfall
September 1st, 2007
Most of machine learning, when it comes to complex worlds, is pretty much useless. Problems quickly become intractable, and agents often spend valuable time attempting to learn things that humans would never even consider. It is for this reason that Atari’s Pitfall presents an interesting and complicated environment for learning. With a seemingly infinite state space (imagine learning about the color of each pixel of each frame of the game), a human element had to be added to make the task possible.
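To get a feel for how large the pixel-level state space is, here is a rough back-of-the-envelope count. The screen size and palette are assumptions (the commonly quoted 160x192 visible pixels and 128 colors for an NTSC Atari 2600), not numbers taken from the project itself, and the count ignores that most pixel combinations never appear in the game.

```python
import math

# Assumed figures for an NTSC Atari 2600 display (not from the project itself).
WIDTH, HEIGHT, COLORS = 160, 192, 128

pixels = WIDTH * HEIGHT

# The number of distinct frames is COLORS ** pixels. That number is far too
# large to print usefully, so report how many decimal digits it has instead.
digits = pixels * math.log10(COLORS)

print(f"{pixels} pixels per frame")
print(f"roughly 10^{digits:.0f} possible frames")
```

Even as a crude upper bound, an exponent in the tens of thousands shows why learning over raw pixels is hopeless here.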
When coming into the game, a player may not know what the rules, goals, or rewards are, but they do know a few things about objects. This notion of objects allows the player to understand that one can fall down holes, climb ladders, and should not attempt to walk through walls. Using this same concept, the agent was no longer interested in learning about pixels or impossible situations, but rather in how each object interacted with its immediate environment.
The algorithm attempted to learn how each object functioned (deterministically) on its own and later, where possible, how the object functioned when it touched other objects. This can be clearly seen with Harry (the main character of the game): the agent first learns his basic movements and only then attempts to guide him to other objects where new insights might be gained (for instance the log, hole, ladder, and wall).
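The idea can be sketched in a few lines of Python. This is only an illustration under assumptions, not the actual algorithm: the `RuleLearner` class, the action names, and the (dx, dy) effects are all hypothetical. Each rule maps an object, an action, and whatever the object is touching to one observed effect, and untried contexts are listed with basic movement first and contact with other objects afterwards.

```python
from typing import Dict, List, Optional, Tuple

# A rule condition: (object, action, touched object or None for "touching nothing").
Condition = Tuple[str, str, Optional[str]]
# A rule effect: the observed change in (x, y). Values below are made up.
Effect = Tuple[int, int]


class RuleLearner:
    """Hypothetical deterministic rule table keyed on object-level context."""

    def __init__(self) -> None:
        self.rules: Dict[Condition, Effect] = {}

    def observe(self, obj: str, action: str,
                touching: Optional[str], effect: Effect) -> bool:
        """Record one transition; return True if it produced a new rule."""
        key = (obj, action, touching)
        if key in self.rules:
            # Determinism assumption: the same context always gives the same effect.
            if self.rules[key] != effect:
                raise ValueError(f"non-deterministic outcome for {key}")
            return False
        self.rules[key] = effect
        return True

    def predict(self, obj: str, action: str,
                touching: Optional[str] = None) -> Optional[Effect]:
        """Return the learned effect, or None if this context is still unknown."""
        return self.rules.get((obj, action, touching))

    def unexplored(self, obj: str, actions: List[str],
                   others: List[str]) -> List[Tuple[str, Optional[str]]]:
        """List untried (action, touching) pairs, basic movement (None) first."""
        return [(a, t)
                for t in [None, *others]
                for a in actions
                if (obj, a, t) not in self.rules]


learner = RuleLearner()
learner.observe("harry", "right", None, (1, 0))      # basic movement
learner.observe("harry", "right", "wall", (0, 0))    # walls block movement
learner.observe("harry", "down", "ladder", (0, 1))   # ladders allow descent

todo = learner.unexplored("harry", ["left", "right"], ["wall"])
print(todo)
```

Because rules are keyed on objects and contact rather than on pixels, the table stays small: one entry per distinct object-level situation the agent has actually seen.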
In this manner, learning Pitfall, which would have been impossible with normal AI algorithms, became doable within seconds. The first screen took less than 5 seconds to learn, and only 500 rules were generated about how all the objects on the screen functioned.