After 50,000 hours, this AI can play Pokemon Red

About ten years ago, the online phenomenon “Twitch Plays Pokémon” drew more than one million people to play Pokémon Red at the same time, with each player’s keystrokes registering as commands to a pixelated avatar. Now, just as Magikarp evolves into Gyarados, the evolution of technology raises a new question: Can AI play Pokémon?

For the past few years, Seattle-based software engineer Peter Whidden has been training a reinforcement learning algorithm to navigate the first game in the Pokémon series. In that time, the AI has played more than 50,000 hours of the game. Whidden posted a 33-minute YouTube video telling the story of the AI’s development, and after nine days, the video had 2.2 million views.

“It’s been fun to see how many people are connecting with it,” Whidden told TechCrunch. He uploaded the code to GitHub along with instructions on how to run and train the AI. “There are a lot of people who are really interested in doing this process of construction or design.” One fan was able to apply the code to Pokémon Crystal, another retro Game Boy installment.

The AI’s reinforcement model is Pavlovian: it gives the AI point-based incentives for leveling up Pokémon, exploring new areas, winning battles, and defeating gym leaders. These incentives don’t always match up with actual progress in the game, yet the AI’s failures are strangely fascinating, which is probably why Whidden’s video has gone viral.
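To make the idea of point-based incentives concrete, here is a minimal sketch in Python. It is not Whidden's code: the `GameState` fields, the `WEIGHTS` values, and the function names are all hypothetical, chosen only to mirror the four incentives named above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GameState:
    """Hypothetical snapshot of the quantities the article says are rewarded."""
    party_levels: List[int]   # level of each Pokémon currently in the party
    areas_explored: int       # count of distinct areas visited so far
    battles_won: int
    gym_leaders_defeated: int


# Assumed weights, for illustration only; the real values are not in the article.
WEIGHTS = {"levels": 1.0, "explore": 0.5, "battles": 2.0, "gyms": 10.0}


def score(state: GameState) -> float:
    """Total points for a state: a weighted sum of the four incentives."""
    return (
        WEIGHTS["levels"] * sum(state.party_levels)
        + WEIGHTS["explore"] * state.areas_explored
        + WEIGHTS["battles"] * state.battles_won
        + WEIGHTS["gyms"] * state.gym_leaders_defeated
    )


def step_reward(prev: GameState, curr: GameState) -> float:
    """Reward for one step: how much the total score changed."""
    return score(curr) - score(prev)


before = GameState(party_levels=[5], areas_explored=3, battles_won=0, gym_leaders_defeated=0)
after = GameState(party_levels=[6], areas_explored=4, battles_won=1, gym_leaders_defeated=0)
print(step_reward(before, after))  # one level, one new area, one battle won
```

Because the reward is the change in score, anything that raises a counted quantity is encouraged and anything that lowers one is discouraged, whether or not it corresponds to real progress in the game.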

In one attempt, the AI simply stops to stare at the water in Pallet Town (the first location you visit in the game) and never moves on. It’s stuck in an area with animated water, grass and NPCs that move back and forth, meaning that each individual frame feels like a new experience for the AI, even though it is sitting motionless and has not yet obtained its first Pokémon. But this AI is in no rush to “catch ’em all.” It’s just enjoying the beauty of the Kanto region (or maybe it’s taking a moral stance against forcing these cute little animals to fight each other… who can say).

“So, according to our own objective, just hanging out and admiring the scenery is more beneficial than exploring the rest of the world,” Whidden explains in the video. “It’s a paradox we face in real life: Curiosity leads us to our most important discoveries, but at the same time, it makes us vulnerable to distractions and gets us into trouble.”
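The Pallet Town behavior can be reproduced with a toy novelty reward. The sketch below assumes a simple scheme in which a frame earns a point when it differs enough from every frame seen so far; the `NoveltyTracker` class, the distance measure, and the threshold are illustrative assumptions, not details taken from Whidden's project.

```python
def frame_distance(a, b):
    """Sum of absolute pixel differences between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b))


class NoveltyTracker:
    """Rewards frames that differ from everything seen so far (assumed scheme)."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.seen = []

    def reward(self, frame):
        # A frame is novel if it is farther than the threshold from every stored frame.
        if all(frame_distance(frame, old) > self.threshold for old in self.seen):
            self.seen.append(tuple(frame))
            return 1.0
        return 0.0


# Four tiny 4-pixel "screens". The agent never moves in either case.
static_wall = [(0, 0, 0, 0)] * 4
animated_water = [(0, 0, 0, 0), (9, 9, 0, 0), (0, 0, 9, 9), (9, 9, 9, 9)]

wall_tracker = NoveltyTracker(threshold=5)
water_tracker = NoveltyTracker(threshold=5)
wall_total = sum(wall_tracker.reward(f) for f in static_wall)
water_total = sum(water_tracker.reward(f) for f in animated_water)
print(wall_total, water_total)
```

Standing in front of the unchanging wall pays out once, while standing in front of the animated water pays out on every distinct animation frame, so an agent optimizing this signal has a reason to stay put.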

The AI somehow continues to tug at our heartstrings: Later, it experiences something akin to a traumatic event at the Pokémon Center. The AI’s success is partly measured by the combined levels of all the Pokémon in its party. But when the AI goes to a Pokémon Center and mashes buttons long enough to deposit a Pokémon into storage, the sum of all levels drops significantly, giving the AI a strong negative signal. With both a Pidgey and an unknown creature nicknamed “AAAAAAAAAA” in its party, the sum of all levels was 25, but once the Pidgey is deposited in the PC, the sum is only 12.

“It doesn’t have emotions like humans do, but a single event with extreme reward value can still leave a lasting impact on its behavior,” Whidden explains. “In this case, losing your Pokémon just once is enough to create a negative association with the entire Pokémon Center, and the AI will avoid it completely in all future games.”
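The size of that negative signal follows directly from the totals quoted above. The short sketch below works through the arithmetic; the individual levels (13 and 12) are inferred from the two sums in the article, and the dictionary keys and function name are hypothetical.

```python
def level_sum(party):
    """Sum of the levels of every Pokémon currently in the party."""
    return sum(party.values())


# Levels inferred from the article's totals: 25 before the deposit, 12 after.
party_before = {"deposited_pokemon": 13, "AAAAAAAAAA": 12}
party_after = {"AAAAAAAAAA": 12}

# If reward tracks the change in the level sum, the deposit is a large penalty.
signal = level_sum(party_after) - level_sum(party_before)
print(signal)
```

A single step that costs 13 points outweighs many steps of ordinary progress, which is consistent with the lasting avoidance Whidden describes.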

Image Credit: Peter Whidden on YouTube

Despite the AI’s apparent ability to experience trauma and admire the beautiful pixels of Pallet Town, it’s still just a computer program. The AI cannot read and interpret dialogue in the game, so in early iterations, it would get stuck on an errand near the start of the game. When you reach the second town in Pokémon Red, you are given a parcel to bring back to the Pokémon Professor in Pallet Town. The AI had difficulty delivering the parcel, which made it impossible to proceed. Whidden therefore started each run from a point after the parcel had been delivered, with Squirtle as the AI’s starter Pokémon, since the early game is generally easier with a Water-type Pokémon on your team.

“In the video, the farthest [the AI] gets is Mount Moon, between the first and second gyms,” Whidden told TechCrunch. Navigating caves in the early Pokémon games is extremely frustrating, even if you have an actual human brain. But Whidden recently changed some of his code and tried a different learning algorithm, and the AI finally managed to get out of the cave and reach Cerulean City.

Other researchers have used reinforcement learning to study AI in games; DeepMind’s AlphaGo, for example, was the first computer program to defeat a professional Go player. But Whidden’s videos have garnered so much attention because he is adept at explaining unfamiliar concepts through a familiar medium: Pokémon.
