January 31, 2007

Grand Strategy Game AI as Evolving Society – Part 1

Tonight while reading the inspiring book “Blondie24”, I came up with some preliminary ideas for applying evolutionary computation and emergent behavior to strategy game AI.

In a grand strategy game, unlike traditional board games such as chess or checkers, we need a multi-layered approach to writing game AI in order to tame the high degree of computational complexity, which can easily become intractable with traditional methods. The difficulty arises because, in each turn, a player has far more choices/moves to make than in traditional board games (e.g., move multiple units of her military forces; develop and invest in economy and research; create and then deploy troops in various locations; negotiate or break alliances). Here, conventional board-game tree-search techniques or reinforcement learning algorithms simply cannot deal with the exponentially growing number of potential states in the computational space.

One way to tame the complexity of the decision space is an evolving hierarchical structure of agents that mirrors, to some extent, how a society does things. Humans and other social animals (e.g., ants, bees, wolf packs) evolve with individual members choosing (or being forced into) specific roles to play in a society. A collective society exhibits its adaptive/emergent behaviors through the distributed interactions of its individual members. In essence, its members are selected from a common gene pool to compete and cooperate among themselves, and ultimately co-evolve together to ensure the survival of the species (or, more precisely, its gene pool). Similarly, in nature and in history, societies and civilizations come and go. Failing societies and civilizations are annihilated when their members, collectively, cannot adapt to ever-changing environments or to real threats posed by competing societies and civilizations.

Instead of programming a grand strategy game using traditional AI approaches (like finite state machines) to create a top-down, centralized AI that controls every little detail, we can minimize the required game-playing domain expertise/knowledge by using evolutionary computing to evolve a hierarchical society of specialized AI agents. These AI agents are simple and specialized, so their decision functions can be computed and evaluated relatively quickly.

In this posting I will describe the first part of the evolutionary computing approach, focusing on creating military units for grand strategy games (like Axis & Allies, Colonial Conquest, or RISK, which are played on game boards). The following paragraphs describe how to create a distributed network of army AI agents that can coordinate their military movements on the game board.

Assume we have a pool of genes from which to draw when specifying how an army AI agent ought to behave tactically. We use the usual method in which a gene with a higher fitness value has a higher probability of being chosen to create a new army AI agent for deployment. Furthermore, the gene pool is saved and passed on through repeated game turns.
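As an illustrative sketch (the gene representation, IDs, and fitness values below are made up), the usual fitness-proportionate, or roulette-wheel, selection could draw genes for new units like this:

```python
import random

random.seed(1)  # reproducible draws for this sketch

def select_gene(gene_pool):
    """Pick one gene with probability proportional to its fitness."""
    total = sum(g["fitness"] for g in gene_pool)
    pick = random.uniform(0.0, total)
    running = 0.0
    for gene in gene_pool:
        running += gene["fitness"]
        if running >= pick:
            return gene
    return gene_pool[-1]  # guard against floating-point rounding

pool = [
    {"id": "aggressive", "fitness": 3.0},
    {"id": "defensive",  "fitness": 1.0},
    {"id": "flanker",    "fitness": 2.0},
]

# Over many deployments, higher-fitness genes are drawn more often.
counts = {g["id"]: 0 for g in pool}
for _ in range(6000):
    counts[select_gene(pool)["id"]] += 1
```

Because the pool itself persists across game turns, whatever fitness-update rule we settle on would gradually shift these draw frequencies from one turn to the next.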

We can think of a gene in the army gene pool very much like a classifier rule in learning classifier systems (e.g., John Holland’s Echo, Stewart Wilson’s XCS).
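To make the analogy concrete, here is a minimal sketch of an army gene as a Holland-style classifier rule, where “#” is the usual wildcard symbol; the condition bits and action names are invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class ArmyGene:
    condition: str   # e.g. "1#0" over bits: outnumbered? / supplied? / terrain bonus?
    action: str      # tactical behavior triggered when the condition matches
    fitness: float   # drives selection pressure in the gene pool

    def matches(self, situation: str) -> bool:
        """A '#' matches either bit, as in classic classifier systems."""
        return all(c in ("#", s) for c, s in zip(self.condition, situation))

gene = ArmyGene(condition="1#0", action="withdraw", fitness=0.8)
```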

Now, we adapt a voting-with-their-feet approach for the individual armies. In this scheme, army units are free (actually, genetically programmed) to join any favored army group led by a commander to their liking. For instance, during the deployment phase of the game, newly formed/generated army units would be drawn from the army gene pool. Each gene contains a specific genetic segment for deciding the unit’s degree of preference for a specific type of commander with certain trait characteristics.
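A minimal sketch of that preference segment, assuming both the unit’s preferences and each commander’s advertised traits are simple weight vectors (the trait and commander names here are hypothetical):

```python
def pick_commander(preferences, commanders):
    """preferences: {trait: weight}; commanders: {name: {trait: value}}.
    The unit 'votes with its feet' by joining the commander whose traits
    best match its genetically encoded preferences (dot-product affinity)."""
    def affinity(traits):
        return sum(preferences.get(t, 0.0) * v for t, v in traits.items())
    return max(commanders, key=lambda name: affinity(commanders[name]))

unit_prefs = {"aggression": 0.9, "caution": 0.1, "generosity": 0.5}
commanders = {
    "Bold":    {"aggression": 0.8, "caution": 0.2, "generosity": 0.4},
    "Careful": {"aggression": 0.2, "caution": 0.9, "generosity": 0.6},
}
chosen = pick_commander(unit_prefs, commanders)
```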

A commander is basically a strategic-level decision agent (using either a neural network or a rule-based learning classifier system) that codifies a set of heuristics from high-level military strategies drawn from Sun Tzu and other military classics. These handcrafted heuristics are often contradictory to one another. The heuristics’ weights are adaptable and subject to evolution.
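One way this could look, sketched with two deliberately contradictory toy heuristics and evolvable weights (all names and numbers are placeholders, not a real commander implementation):

```python
def score_order(order, situation, heuristics, weights):
    """Combine possibly contradictory heuristic opinions via evolvable weights."""
    return sum(w * h(order, situation) for h, w in zip(heuristics, weights))

# Two handcrafted heuristics that pull in opposite directions:
def prefer_attack(order, situation):
    return 1.0 if order == "attack" and situation["strength_ratio"] > 1.0 else 0.0

def prefer_caution(order, situation):
    return 1.0 if order == "hold" else 0.0

heuristics = [prefer_attack, prefer_caution]
weights = [0.7, 0.3]  # evolution adjusts these weights over generations
situation = {"strength_ratio": 1.4}
best = max(["attack", "hold"],
           key=lambda o: score_order(o, situation, heuristics, weights))
```

With a different evolved weighting (say, 0.2/0.8), the same contradictory rule set would favor holding instead, which is exactly the adaptability we want evolution to tune.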

Each commander is responsible for directing its subordinate army units with strategic coordination (e.g., attack focus, movement path, tactical withdrawal); if needed, it will request troop reinforcements by appealing to the national policy maker (to be described in a later posting).

Each army unit receives its rewards and punishments from its commander. These values are administered according to a set of rules of engagement, and each commander has its own set of pay rules that it publicly advertises for recruitment purposes. These advertised rules are part of the commander’s public trait characteristics.

We can think of these rewards as being like currency in a human society. They can be thought of as payouts rewarding positive behaviors (a job well done) or extortions punishing negative behaviors. A commander therefore pays its subordinates well when they comply with its pre-set rules of engagement and extorts from them when they misbehave.
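A possible payout rule, assuming compliance with the rules of engagement is scored in [0, 1] (the constants and the compliance threshold are arbitrary placeholders):

```python
def settle_pay(compliance, base_pay=10.0, fine=5.0, threshold=0.5):
    """Return the net payment to a unit for this turn.
    Positive = reward (payout), negative = punishment (extortion)."""
    if compliance >= threshold:
        return base_pay * compliance       # pay well for good behavior
    return -fine * (threshold - compliance) / threshold  # extort for misbehavior
```

Since each commander advertises its own `base_pay`/`fine` parameters as part of its public traits, units with matching preference genes would tend to flock to commanders whose pay rules suit their behavior.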

At each game turn, the national policy maker pays out stipends to its commanders based on their achievements; they in turn pay their subordinates accordingly. The national policy maker amasses and allocates the military budget from many sources (e.g., taxes from provinces, tributes and trade with other nations).
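The per-turn money flow might be sketched like this, assuming stipends are split in proportion to achievement scores and each commander passes a fixed share through to its units (all figures and the pass-through fraction are hypothetical):

```python
def pay_stipends(budget, achievements, pass_through=0.6):
    """budget: military funds amassed this turn.
    achievements: {commander: achievement score for the turn}.
    Returns each commander's stipend and the pool it passes on to units."""
    total = sum(achievements.values())
    stipends = {c: budget * a / total for c, a in achievements.items()}
    unit_pools = {c: s * pass_through for c, s in stipends.items()}
    return stipends, unit_pools

stipends, unit_pools = pay_stipends(100.0, {"Bold": 3.0, "Careful": 1.0})
```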

The national policy maker is in charge of the grand strategy that ensures its nation’s survival. It directs its commanders toward where the national interest lies in the current geopolitical situation. It therefore pays higher stipends to commanders who successfully defend critical areas, and it signals its desire to attack certain foreign territories. Commanders then place bids to take the new commissions when they are confident of a successful invasion.
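The commission-bidding step could be sketched as follows, under the assumptions that each commander bids only when its self-estimated chance of success clears a confidence threshold and that the policy maker awards the commission to the most confident bidder (names and numbers are illustrative):

```python
def award_commission(estimates, confidence_threshold=0.6):
    """estimates: {commander: estimated probability of a successful invasion}.
    Commanders bid only when their confidence clears the threshold; the
    policy maker awards the commission to the most confident bidder."""
    bids = {c: p for c, p in estimates.items() if p >= confidence_threshold}
    if not bids:
        return None  # no commander is confident enough; the target is deferred
    return max(bids, key=bids.get)

winner = award_commission({"North": 0.45, "Center": 0.75, "South": 0.70})
```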

Now, I shall stop right here. These preliminary ideas are a bit sketchy at the moment. For instance, how do we determine the fitness value of each army gene? Do we use accuracy as the key measure of a gene’s fitness, as XCS does? How can commanders become adaptive? Ideally, they would themselves be units created from another, commander-level gene pool. If so, how do we set up the reward environment and mechanisms that apply evolutionary selection pressure to commanders and army units? I have some vague ideas, but more work needs to be done. There are many more questions to answer and to think through carefully.

Lee Chen
