For April Fool’s Day, Google turned its popular Maps navigation program into a playable version of Ms. Pac-Man, transforming your local city streets (or any collection of streets around the globe) into a ghost-chasing playground. While that was a nice break from the norm for the traditionally staid Maps application, Microsoft is taking a completely different approach with Ms. Pac-Man.
It all starts with Maluuba, a deep learning startup that Microsoft acquired earlier this year. Using reinforcement learning algorithms, the company taught an AI to thoroughly master the Atari 2600 version of Ms. Pac-Man. We’re not talking about "simply" getting a score of a few hundred thousand points (like top human players); instead, Maluuba’s AI achieved the highest score possible in the game: 999,990.
So, why did Maluuba choose Ms. Pac-Man, and why select such an ancient game in the first place to give its AI a challenge? For starters, Ms. Pac-Man was a much harder game than the original Pac-Man, as it was designed for arcades that first and foremost wanted to extract as many quarters as possible from your pocket.
“You want (players to think), ‘Oh, oh, I almost got it! I’m going to try again,’” said Steve Golson, a co-creator of the arcade version of Ms. Pac-Man. “Ka-ching! Another quarter.”
But while it may seem on the surface like a simple game to pick up and master, the scenarios and paths to outright victory are incredibly complex. In addition, the game is not as predictable as its predecessor.
“A lot of companies working on AI use games to build intelligent algorithms because there’s a lot of human-like intelligence capabilities that you need to beat the games,” added Maluuba program manager Rahul Mehrotra.
These complexities gave Maluuba researchers the perfect testbed for their reinforcement learning, in which an agent receives positive or negative feedback for each action it takes. The system then uses trial and error to find the best route to victory by maximizing its total reward. The positive rewards, from smallest to largest, are the pellets, the fruit, and a blue ghost. A large negative reward is given when Ms. Pac-Man is eaten by a ghost.
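To make the trial-and-error idea concrete, here is a minimal sketch using tabular Q-learning, a standard textbook technique, on a made-up five-cell corridor. It is not Maluuba's method. The reward values, the corridor layout, and every name in the code are assumptions for illustration; only the ordering of the rewards comes from the description above. For brevity, the toy corridor uses just the two extremes: a ghost at the left end and a blue ghost at the right end.

```python
import random

# Hypothetical reward values; only their ordering (pellet < fruit < blue ghost,
# plus a large penalty for being eaten) is taken from the article.
REWARDS = {"pellet": 10, "fruit": 50, "blue_ghost": 200, "eaten": -500}

N_CELLS = 5          # toy corridor with cells 0..4 (an assumption)
ACTIONS = (-1, +1)   # move left or move right


def step(pos, action):
    """Move one cell and return (new_pos, reward, done)."""
    new_pos = pos + action
    if new_pos == 0:                      # a ghost waits at the left end
        return new_pos, REWARDS["eaten"], True
    if new_pos == N_CELLS - 1:            # a blue ghost sits at the right end
        return new_pos, REWARDS["blue_ghost"], True
    return new_pos, 0, False


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values by trial and error (epsilon-greedy Q-learning)."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_CELLS) for a in ACTIONS}
    for _ in range(episodes):
        pos, done = N_CELLS // 2, False
        while not done:
            if rng.random() < epsilon:    # occasionally try a random move
                action = rng.choice(ACTIONS)
            else:                         # otherwise take the best known move
                action = max(ACTIONS, key=lambda a: q[(pos, a)])
            new_pos, reward, done = step(pos, action)
            best_next = 0.0 if done else max(q[(new_pos, a)] for a in ACTIONS)
            # Nudge the estimate toward reward + discounted future value.
            q[(pos, action)] += alpha * (reward + gamma * best_next - q[(pos, action)])
            pos = new_pos
    return q


def greedy_policy(q):
    """Best learned action for each non-terminal cell."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(1, N_CELLS - 1)}


q_table = train()
policy = greedy_policy(q_table)
```

After training, the learned policy steers away from the ghost and toward the blue ghost from every interior cell, which is the behavior the reward scheme is meant to encourage.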
While beating a game like Ms. Pac-Man might seem silly for a company like Microsoft, the real-world implications of reinforcement learning are much broader in scope, including robotics and financial models:
Mehrotra said the method they developed to beat Ms. Pac-Man could be used to help a company’s sales organization make precise predictions about which potential customers to target at a particular time or on a particular day. The system could use multiple agents, each representing one client, with a top agent weighing factors such as which clients are up for contract renewal, which contracts are worth the most to the company and whether the potential customer is typically in the office that day or available at that time.
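The layered arrangement Mehrotra describes can be pictured with a small sketch: one agent per client reports a few signals, and a top agent weighs them to choose whom to contact. This is only an illustration of that structure, with a plain weighted sum standing in for whatever learned weighting a real system would use. The client names, the weights, and all function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Client:
    name: str
    contract_value: float   # arbitrary units
    up_for_renewal: bool
    available_now: bool


# Hypothetical weights the top agent applies to each factor.
WEIGHTS = {"renewal": 2.0, "value": 1.0, "available": 3.0}


def client_agent(client, max_value):
    """One agent per client: report each factor as a signal in [0, 1]."""
    return {
        "renewal": 1.0 if client.up_for_renewal else 0.0,
        "value": client.contract_value / max_value,
        "available": 1.0 if client.available_now else 0.0,
    }


def top_agent(clients):
    """Weigh every client agent's signals and pick the best client to target."""
    max_value = max(c.contract_value for c in clients)
    scores = {}
    for c in clients:
        signals = client_agent(c, max_value)
        scores[c.name] = sum(WEIGHTS[k] * v for k, v in signals.items())
    best = max(scores, key=scores.get)
    return best, scores


clients = [
    Client("Acme", 100.0, up_for_renewal=True, available_now=False),
    Client("Birch", 50.0, up_for_renewal=False, available_now=True),
    Client("Cedar", 80.0, up_for_renewal=True, available_now=True),
]
best, scores = top_agent(clients)
```

In this made-up example the top agent picks Cedar, the only client that is both up for renewal and available, even though Acme holds the larger contract.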
Going back to Ms. Pac-Man, the high score of nearly 1 million points achieved by Maluuba researchers is quite the feat, but how does the top human player stack up? According to HighScore.com, the top Ms. Pac-Man score for a human belongs to a Brazilian player, who racked up 266,330 points.