Chapter 5 of Stuart Russell and Peter Norvig, Artificial Intelligence: A Modern Approach (1995)
The basic idea behind this modification to the minimax search algorithm is the following. During the search for the next move, not every move (i.e. every node in the search tree) needs to be considered in order to reach a correct decision. If the move being considered can only lead to a worse outcome than our current best choice, then the first reply the opposition could make that is worse for us than our best move is the last node we need to examine in that branch, since the opposition is guaranteed at least that reply. For example, if MAX already has a move guaranteeing a value of 5, and while examining another move we find that MIN can reply with a value of 3, there is no need to examine MIN's remaining replies: MAX will never choose that branch. See Figure 1 below.
Figure 1: [game tree diagram illustrating where Alpha-Beta pruning cuts off search]
Now that we have gained a basic understanding of how Alpha-Beta pruning works, we can examine the actual algorithm in more detail. First I'm going to introduce some definitions used by the algorithm, then we will see some pseudocode for it, and finally we will step through a generalized walk-through of the algorithm.
Let Alpha be the value of the best (i.e. highest-value) choice found so far at any choice point along the path for MAX.
Let Beta be the value of the best (i.e. lowest-value) choice found so far at any choice point along the path for MIN.
It is important to note that we now have a Max-Value function and a Min-Value function, as opposed to the single minimax function given previously. Both return the minimax value of a node, except for nodes that are pruned, which are simply ignored. Refer below for the actual algorithm.
function Max-Value(state, game, α, β) returns the minimax value of state
    if CUTOFF-TEST(state) then return EVAL(state)
    for each s in SUCCESSORS(state) do
        α ← MAX(α, MIN-VALUE(s, game, α, β))
        if α ≥ β then return β
    return α

function Min-Value(state, game, α, β) returns the minimax value of state
    if CUTOFF-TEST(state) then return EVAL(state)
    for each s in SUCCESSORS(state) do
        β ← MIN(β, MAX-VALUE(s, game, α, β))
        if β ≤ α then return α
    return β
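To make this concrete, here is a minimal runnable sketch of the same two functions in Python. The game object with cutoff_test, evaluate and successors methods is an illustrative assumption standing in for CUTOFF-TEST, EVAL and SUCCESSORS above; it is not code from the original text.

import math

def max_value(state, game, alpha, beta):
    # Return the minimax value of state for MAX, pruning hopeless branches.
    if game.cutoff_test(state):
        return game.evaluate(state)
    value = -math.inf
    for s in game.successors(state):
        value = max(value, min_value(s, game, alpha, beta))
        if value >= beta:        # MIN already has a better option elsewhere
            return value
        alpha = max(alpha, value)
    return value

def min_value(state, game, alpha, beta):
    # Return the minimax value of state for MIN, pruning hopeless branches.
    if game.cutoff_test(state):
        return game.evaluate(state)
    value = math.inf
    for s in game.successors(state):
        value = min(value, max_value(s, game, alpha, beta))
        if value <= alpha:       # MAX already has a better option elsewhere
            return value
        beta = min(beta, value)
    return value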
Here we are going to go through a generalized walk-through of the algorithm. It may help your understanding to refer to a diagram such as Figure 1, shown earlier in the document.
For each node visited we will assign storage for the path taken to reach that node (so that we can backtrack), a value called Alpha, a value called Beta, and the current score for that node.
Set the value of Alpha at the initial node to -Limit (i.e. minus infinity) and Beta to +Limit (plus infinity), because these are initially the most extreme values that Alpha or Beta could possibly take; any real evaluation will improve on them.
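As a concrete trace of this initialization, the Python sketch above can be run on a small hand-built game tree. The ToyGame class and the leaf values below are illustrative assumptions only; note that the top-level call starts with Alpha at minus infinity and Beta at plus infinity, exactly as described.

import math

class ToyGame:
    # Internal nodes are lists of children; leaves are plain numbers.
    def cutoff_test(self, state):
        return not isinstance(state, list)
    def evaluate(self, state):
        return state
    def successors(self, state):
        return state

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(max_value(tree, ToyGame(), -math.inf, math.inf))   # prints 3

While evaluating the second subtree, the first leaf (value 2) already falls below Alpha = 3, so the leaves 4 and 6 are never examined; that early return is the pruning step described above.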
Now that we understand how the algorithm works, we are going to examine what the possible gains of this method are over the old minimax algorithm. Before we do that, we must note that we are making a number of assumptions about the game. Namely that:
The game tree has a uniform branching factor b and is searched to a fixed depth d.
Successors are examined in the best possible order, i.e. the strongest move is always considered first.
Evaluation of Alpha-Beta Efficiency
With the previously listed assumptions taken into account, the following gains/improvements can be calculated.
With Alpha-Beta pruning the number of nodes that needs to be examined is O(b^(d/2)), as opposed to the minimax algorithm, which must examine O(b^d) nodes to find the best move. In the worst case Alpha-Beta will have to examine all nodes, just as the original minimax algorithm does. But assuming a best-case result, this means that the effective branching factor is b^(1/2) instead of b.
For a chess game, which normally has a branching factor of about 35, this means the branching factor is effectively reduced to about 6 (since 35^(1/2) ≈ 5.9). Simply put, a chess program running Alpha-Beta could look ahead twice as far in the same amount of time, improving the skill level of our chess program from a novice to an expert-level player.
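As a quick back-of-the-envelope check of these figures in Python (the search depth d = 8 is an arbitrary value chosen for illustration, not from the text):

import math

b, d = 35, 8
print(math.sqrt(b))      # effective branching factor: ~5.92, rounded to 6 above
print(b ** d)            # minimax examines about 2.3e12 nodes
print(b ** (d // 2))     # best-case Alpha-Beta examines about 1.5e6 nodes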
This is a two-player game where each opponent picks a symbol to represent themselves and places it on a 3-by-3 board. The first player to complete a line wins. This example uses a Shockwave animation to show the steps that would be performed if the computer player were using Alpha-Beta search to make its decisions.
The player MAX is the computer player; in these examples it plays O.
The opposition player MIN is the human player; in these examples it plays X.
The computer scores board positions using the following rules:
If the player has two pieces in a line and the third square in that line is empty, award the player 200 points.
If the player has two such nearly complete lines, award the player 300 points.
If the player has a complete line, award the player 600 points.
Award the player one additional point for each line that could be completed from his/her current position, i.e. each line the player could finish on the next move (a sketch of such an evaluation function follows this list).
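Here is a minimal sketch of that scoring heuristic in Python. The exact interpretation of the rules (in particular, that the +1 bonus counts lines finishable on the next move) and the names LINES and score are assumptions chosen to reproduce the example numbers given below; this is not code from the original source.

# Each line is a triple of (row, column) coordinates on the 3-by-3 board.
LINES = [
    [(0, 0), (0, 1), (0, 2)],  # rows
    [(1, 0), (1, 1), (1, 2)],
    [(2, 0), (2, 1), (2, 2)],
    [(0, 0), (1, 0), (2, 0)],  # columns
    [(0, 1), (1, 1), (2, 1)],
    [(0, 2), (1, 2), (2, 2)],
    [(0, 0), (1, 1), (2, 2)],  # diagonals
    [(0, 2), (1, 1), (2, 0)],
]

def score(board, player):
    # Score one side's position using the rules listed above.
    complete = 0   # lines fully owned by the player
    near = 0       # lines with two of the player's pieces and one empty square
    for line in LINES:
        cells = [board[r][c] for r, c in line]
        if cells.count(player) == 3:
            complete += 1
        elif cells.count(player) == 2 and cells.count(' ') == 1:
            near += 1
    if complete:
        base = 600
    elif near >= 2:
        base = 300
    elif near == 1:
        base = 200
    else:
        base = 0
    return base + near   # +1 per line the player could finish next move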
The evaluation function calculates the next move by assuming that the opposition will take the best possible move in its current situation. The utility, or worth, of a position is calculated by taking the difference between the player's score and the opponent's score.
|---|---|---|
| O | O |   |
|---|---|---|
|   | O | X |
|---|---|---|
|   |   | X |
|---|---|---|
For example, using the board above in its current state, the utility values are 302 for O and 201 for X.
So, following the aforementioned rules, the utility of this game state is 302 - 201 = 101.
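We can check these numbers with the score sketch given earlier (again assuming that interpretation of the rules):

board = [
    ['O', 'O', ' '],
    [' ', 'O', 'X'],
    [' ', ' ', 'X'],
]
print(score(board, 'O'))                       # 302: two near lines (300) + 2 bonus
print(score(board, 'X'))                       # 201: one near line (200) + 1 bonus
print(score(board, 'O') - score(board, 'X'))   # utility for O: 101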
The animation ends when all possible decisions have been evaluated and the best move has been selected.
You will need to have the Shockwave plugin installed in order to view this presentation. There are versions available for Linux, Solaris and Windows machines.
If you don't have the plugin, go here: Shockwave Web site
If you want to know how to install the plugin, go here: Shockwave help
To view the animation from the lectures, go here: Tic Tac Toe Animation
So using Alpha-Beta pruning has a number of advantages in terms of space/time complexity over the original minimax algorithm. You may be wondering whether this is the best that can be done; in fact, it may not be. In addition, it does not solve all of the problems associated with the original minimax algorithm. Below is a list of some disadvantages and some suggestions for better ways to achieve the goal of choosing the best move.
Russell, S.J. and Norvig, P. Artificial Intelligence: A Modern Approach. Englewood Cliffs, N.J.: Prentice Hall, 1995.
Belleguelle, Steve. "AIM: Game Playing - 3." Alpha - Beta Pruning reference for the Tic Tac Toe example. Jan 30, 1999. <http://tawny.cs.nott.ac.uk/~sbx/winnie/aim/game/game3.htm>
Woodcock, Steven M. "Classic Games and AI: What's Been Solved." A site containing lots of links to classic AI games like chess, Othello and Go. Nov. 26, 1998. Jan 30, 1999. <http://www.cris.com/~swoodcoc/clagames.html>
Finin, Tim. "Game Playing." A site containing some information on game playing. Feb 22, 1998. Jan 30, 1999. <http://www.cs.umbc.edu/471/notes/games>
Belleguelle, Steve. "AIM: List of Topics" A site containing some good notes on various AI topics. Jan 30, 1999. <http://tawny.cs.nott.ac.uk/~sbx/winnie/aim/topics.htm>
Last updated Feb. 5, 1999