I put together a little tic-tac-toe game that uses the minimax AI algorithm and thought I'd write a tutorial on how it works. (This code is based on work by Clederson Cruz; I've also created a Python version.) Feel free to fork the REPL and add to it!

The game is only a couple hundred lines of Ruby in a single main.py file, so we'll walk through it in this tutorial step-by-step. (This will be almost exactly the same as my tutorial for the Python version.)

The Board

The nice thing about tic-tac-toe is that the game is easy to model! While we could build out an entire Board class (and that's something you could add if you wanted to fork this REPL and update it), but for simplicity's sake, we'll just model our board is a two-dimensional array:

@board = [
[0, 0, 0],
[0, 0, 0],
[0, 0, 0]
]

The board-related functions in our game will be empty_cells (which returns open spots on the board), set_move (which places an X or an O in the provided spot, assuming it's available), and render (which draws the board). Let's look at each one in turn.

def empty_cells(state)
[].tap do |cells|
state.each_with_index do |row, x|
row.each_with_index do |cell, y|
cells << [x, y] if cell.zero?
end
end
end
end

Here, we iterate through all the cells on the board and check to see if the value is still 0 (which is how we initialized the board, as you can see above). If so, we add that cell's coordinates to the cells array, then return that array so we know which spots are still available for a player to choose. If you're curious about the [].tap magic, all that does is let us modify our array inside the block, then automatically return the completed array; you can read more about Object#taphere.

def set_move(x, y, player)
if empty_cells(@board).include? [x, y]
@board[x][y] = player
return true
end
false
end

This is also pretty straightforward: we check to see if our chosen move is valid (that is, it's in the empty_cells array), and if so, we set that square on the board to X or O (depending on which player is moving), returning true on success and false if the move can't be made because the square is already occupied.

def render(state, c_choice, h_choice)
state.each do |row|
puts "\n---------------"
row.each do |cell|
if cell == COMP
print "| #{c_choice} |"
elsif cell == HUMAN
print "| #{h_choice} |"
else
print "| |"
end
end
end
puts "\n---------------"
end

Our last board method just prints out the current state of the board, printing out every cell and showing a blank for free cells, X for cells taken by the X player, and O for cells taken by the O player.

Evaluating Moves

Next, we'll look at our functions for evaluating moves: evaluate, wins, and game_over.

def evaluate(state)
if wins(state, COMP)
1
elsif wins(state, HUMAN)
-1
else
0
end
end

So far, so good! We return 1 if the provided state (which is a two-dimensional array) would result in the computer winning, -1 if the human player would win, and 0 if it would be a tie. There are two important things to note here:

We choose -1 for a human player win because we want the minimax algorithm to run from the computer player's perspective: it's trying to minimize the maximum score the human player can achieve;

This function is run for possible game states, meaning that the computer player uses this function to look at the results of all possible moves, all the possible moves those moves would allow, all the possible moves those moves would allow, and so on, generally limited to a certain depth (at which point the computer chooses the move that leads to the worst "possible futures" for the human player; that is, the branch in the tree of possible states with the smallest possible maximum score).

This might seem a little confusing now, but we'll go through all the details when we get to the minimax function itself.

Here, we simply return true if the player has achieved a set of winning moves. There are eight winning moves in tic-tac-toe: the top row, the middle row, the bottom row, the left column, the middle column, the right column, and the two diagonals (corresponding to the eight entries in our win_state array).

def game_over(state)
wins(state, HUMAN) || wins(state, COMP)
end

This one's pretty easy! If either the human or the computer wins, the game is immediately over.

The Minimax Algorithm

On to the main event: the minimax algorithm! This is how our computer player decides which moves to make.

As mentioned earlier, this algorithm looks at "possible futures" in the game, assigning expected scores to each, and makes the move that maximizes the score (for the computer) and minimizes the score (for the human). (Remember, since the algorithm is being used by the computer player to select a move, it will want to maximize its score and minimize the human opponent's score.) We take the state of the game, the depth to recurse to in our game tree, and a player, and initialize our "best move" to -1, -1, -INFINITY (that is, a non-existent square with worst/lowest possible score) for the computer player and -1, -1, +INFINITY (a non-existent square with the best/highest possible score) for a human player. Then, for as long as we're recursing and the game isn't over, we continually make imaginary moves and check the resulting scores, assigning as the "best" score the one that represents the highest possible (for the computer) or lowest possible (for the human). In the base case, we return the result of evaluate on our game state, and the overall algorithm returns the move with the highest perceived score for the computer and the lowest one for the human.

def minimax(state, depth, player)
if player == COMP
best = [-1, -1, -INFINITY]
else
best = [-1, -1, INFINITY]
end
if depth.zero? || game_over(state)
score = evaluate state
return [-1, -1, score]
end
empty_cells(state).each do |cell|
x, y = cell[0], cell[1]
state[x][y] = player
score = minimax state, depth - 1, -player
state[x][y] = 0
score[0], score[1] = x, y
if player == COMP
if score[2] > best[2]
best = score
end
else
if score[2] < best[2]
best = score
end
end
end
best
end

Tying It All Together

Finally, we'll complete our game with three functions to manage gameplay: ai_turn, human_turn, and main.

def ai_turn(c_choice, h_choice)
depth = empty_cells(@board).size
return if depth.zero? || game_over(@board)
clear_screen
puts "Computer turn [#{c_choice}]"
render @board, c_choice, h_choice
if depth == 9
x = [0, 1, 2].sample
y = [0, 1, 2].sample
else
move = minimax @board, depth, COMP
x, y = move[0], move[1]
end
set_move x, y, COMP
sleep 1
end

The computer player uses minimax to make the best possible move given the current state of the board, with the exception of when depth == 9 (that is, every square on the board is empty), in which case since the computer player is moving first, there's no "best" move and the computer selects a random square.

Here, we handle the input from the human player, mapping the numbers 1 through 9 to squares in our board (a two-dimensional array).

def main
h_choice = ''
c_choice = ''
first = ''
while h_choice != 'O' && h_choice != 'X'
begin
puts 'Choose X or O:'
h_choice = gets.chomp.upcase
puts "Chosen: #{h_choice}"
rescue SignalException
puts 'Bye!'
exit
rescue StandardError
puts 'Invalid choice'
end
end
if h_choice == 'X'
c_choice = 'O'
else
c_choice = 'X'
end
while first != 'Y' && first != 'N'
begin
puts 'First to start? [y/n]: '
first = gets.chomp.upcase
rescue SignalException
puts 'Bye!'
exit
rescue StandardError
puts 'Invalid choice'
end
end
while empty_cells(@board).size > 0 && !game_over(@board)
if first == 'N'
ai_turn c_choice, h_choice
first = ''
end
human_turn c_choice, h_choice
ai_turn c_choice, h_choice
end
if wins @board, HUMAN
clear_screen
puts "Human turn [#{h_choice}]"
render @board, c_choice, h_choice
puts 'You win!'
elsif wins @board, COMP
clear_screen
puts "Computer turn [#{c_choice}]"
render @board, c_choice, h_choice
puts 'Computer wins!'
else
clear_screen
render @board, c_choice, h_choice
puts 'It\'s a tie!'
end
exit 0
end

Finally, our main function lets the human player select X or O and whether they want to move first or second, then enters the game loop: as long as there's an empty square on the board, alternate human and computer player turns, displaying the appropriate messages for when the human player wins, loses, or ties with the computer. We kick everything off just by calling main at the very end of the file.

...and that's it! I hope you enjoyed this tutorial, and feel free to fork this REPL to add more functionality.

Hi everyone,

I put together a little tic-tac-toe game that uses the minimax AI algorithm and thought I'd write a tutorial on how it works. (This code is based on work by Clederson Cruz; I've also created a Python version.) Feel free to fork the REPL and add to it!

The game is only a couple hundred lines of Ruby in a single

`main.py`

file, so we'll walk through it in this tutorial step-by-step. (This will be almost exactly the same as my tutorial for the Python version.)## The Board

The nice thing about tic-tac-toe is that the game is easy to model! While we could build out an entire

`Board`

class (and that's something you could add if you wanted to fork this REPL and update it), but for simplicity's sake, we'll just model our board is a two-dimensional array:The board-related functions in our game will be

`empty_cells`

(which returns open spots on the board),`set_move`

(which places an`X`

or an`O`

in the provided spot, assuming it's available), and`render`

(which draws the board). Let's look at each one in turn.Here, we iterate through all the cells on the board and check to see if the value is still

`0`

(which is how we initialized the board, as you can see above). If so, we add that cell's coordinates to the`cells`

array, then return that array so we know which spots are still available for a player to choose. If you're curious about the`[].tap`

magic, all that does is let us modify our array inside the block, then automatically return the completed array; you can read more about`Object#tap`

here.This is also pretty straightforward: we check to see if our chosen move is valid (that is, it's in the

`empty_cells`

array), and if so, we set that square on the board to`X`

or`O`

(depending on which player is moving), returning`true`

on success and`false`

if the move can't be made because the square is already occupied.Our last board method just prints out the current state of the board, printing out every cell and showing a blank for free cells,

`X`

for cells taken by the`X`

player, and`O`

for cells taken by the`O`

player.## Evaluating Moves

Next, we'll look at our functions for evaluating moves:

`evaluate`

,`wins`

, and`game_over`

.So far, so good! We return

`1`

if the provided`state`

(which is a two-dimensional array) would result in the computer winning,`-1`

if the human player would win, and`0`

if it would be a tie. There are two important things to note here:`-1`

for a human player win because we want the minimax algorithm to run from the computer player's perspective: it's trying tominimize themaximum score the human player can achieve;possiblegame states, meaning that the computer player uses this function to look at the results of all possible moves, all the possible moves those moves would allow, all the possible movesthosemoves would allow, and so on, generally limited to a certain depth (at which point the computer chooses the move that leads to the worst "possible futures" for the human player; that is, the branch in the tree of possible states with the smallest possible maximum score).This might seem a little confusing now, but we'll go through all the details when we get to the

`minimax`

function itself.Next is the

`wins`

function, which looks like this:Here, we simply return

`true`

if the player has achieved a set of winning moves. There are eight winning moves in tic-tac-toe: the top row, the middle row, the bottom row, the left column, the middle column, the right column, and the two diagonals (corresponding to the eight entries in our`win_state`

array).This one's pretty easy! If either the human or the computer wins, the game is immediately over.

## The Minimax Algorithm

On to the main event: the

`minimax`

algorithm! This is how our computer player decides which moves to make.As mentioned earlier, this algorithm looks at "possible futures" in the game, assigning expected scores to each, and makes the move that maximizes the score (for the computer) and minimizes the score (for the human). (Remember, since the algorithm is being used by the computer player to select a move, it will want to maximize its score and minimize the human opponent's score.) We take the state of the game, the depth to recurse to in our game tree, and a player, and initialize our "best move" to

`-1`

,`-1`

,`-INFINITY`

(that is, a non-existent square with worst/lowest possible score) for the computer player and`-1`

,`-1`

,`+INFINITY`

(a non-existent square with the best/highest possible score) for a human player. Then, for as long as we're recursing and the game isn't over, we continually make imaginary moves and check the resulting scores, assigning as the "best" score the one that represents the highest possible (for the computer) or lowest possible (for the human). In the base case, we return the result of`evaluate`

on our game state, and the overall algorithm returns the move with the highest perceived score for the computer and the lowest one for the human.## Tying It All Together

Finally, we'll complete our game with three functions to manage gameplay:

`ai_turn`

,`human_turn`

, and`main`

.The computer player uses

`minimax`

to make the best possible move given the current state of the board, with the exception of when`depth == 9`

(that is, every square on the board is empty), in which case since the computer player is moving first, there's no "best" move and the computer selects a random square.Here, we handle the input from the human player, mapping the numbers

`1`

through`9`

to squares in our board (a two-dimensional array).Finally, our

`main`

function lets the human player select`X`

or`O`

and whether they want to move first or second, then enters the game loop: as long as there's an empty square on the board, alternate human and computer player turns, displaying the appropriate messages for when the human player wins, loses, or ties with the computer. We kick everything off just by calling`main`

at the very end of the file....and that's it! I hope you enjoyed this tutorial, and feel free to fork this REPL to add more functionality.

This is so cool