Game Tutorial: Tic-Tac-Toe (Python)
ericqweinstein (200)

Hi everyone,

I put together a little tic-tac-toe game that uses the minimax AI algorithm and thought I'd write a tutorial on how it works. (This code is based on work by Clederson Cruz.) Feel free to fork the REPL and add to it!

The game is only a couple hundred lines of Python in a single main.py file, so we'll walk through it in this tutorial step-by-step.

The Board

The nice thing about tic-tac-toe is that the game is easy to model! While we could build out an entire Board class (and that's something you could add if you wanted to fork this REPL and update it), but for simplicity's sake, we'll just model our board is a two-dimensional list:

board = [[0, 0, 0],
         [0, 0, 0],
         [0, 0, 0]]

The board-related functions in our game will be empty_cells (which returns open spots on the board), valid_move (which just checks if a spot is empty/available), set_move (which places an X or an O in the provided spot, assuming it's available), and render (which draws the board). Let's look at each one in turn.

def empty_cells(state):
    cells = []

    for x, row in enumerate(state):
        for y, cell in enumerate(row):
            if cell == 0:
                cells.append([x, y])
    return cells

Here, we iterate through all the cells on the board and check to see if the value is still 0 (which is how we initialized the board, as you can see above). If so, we add that cell's coordinates to the cells list, then return that list so we know which spots are still available for a player to choose.

def valid_move(x, y):
    return [x, y] in empty_cells(board)

Easy! We just return True if the suggested (row, column) pair is in our list of empty cells, and False otherwise.

def set_move(x, y, player):
    if valid_move(x, y):
        board[x][y] = player
        return True
    return False

This is also pretty straightforward: we check to see if our chosen move is valid, and if so, we set that square on the board to X or O (depending on which player is moving), returning True on success and False if the move can't be made because the square is already occupied.

def render(state, c_choice, h_choice):
    for row in state:
        print('\n----------------')
        for cell in row:
            if cell == COMP:
                print('|', c_choice, '|', end='')
            elif cell == HUMAN:
                print('|', h_choice, '|', end='')
            else:
                print('|', ' ', '|', end='')
    print('\n----------------')

Our last board method just prints out the current state of the board, printing out every cell and showing a blank for free cells, X for cells taken by the X player, and O for cells taken by the O player.

Evaluating Moves

Next, we'll look at our functions for evaluating moves: evaluate, wins, and game_over.

def evaluate(state):
    if wins(state, COMP):
        score = 1
    elif wins(state, HUMAN):
        score = -1
    else:
        score = 0
    return score

So far, so good! We return 1 if the provided state (which is a two-dimensional list) would result in the computer winning, -1 if the human player would win, and 0 if it would be a tie. There are two important things to note here:

  1. We choose -1 for a human player win because we want the minimax algorithm to run from the computer player's perspective: it's trying to minimize the maximum score the human player can achieve;
  2. This function is run for possible game states, meaning that the computer player uses this function to look at the results of all possible moves, all the possible moves those moves would allow, all the possible moves those moves would allow, and so on, generally limited to a certain depth (at which point the computer chooses the move that leads to the worst "possible futures" for the human player; that is, the branch in the tree of possible states with the smallest possible maximum score).

This might seem a little confusing now, but we'll go through all the details when we get to the minimax function itself.

Next is the wins function, which looks like this:

def wins(state, player):
    win_state = [
        [state[0][0], state[0][1], state[0][2]],
        [state[1][0], state[1][1], state[1][2]],
        [state[2][0], state[2][1], state[2][2]],
        [state[0][0], state[1][0], state[2][0]],
        [state[0][1], state[1][1], state[2][1]],
        [state[0][2], state[1][2], state[2][2]],
        [state[0][0], state[1][1], state[2][2]],
        [state[2][0], state[1][1], state[0][2]],
    ]
    return [player, player, player] in win_state

Here, we simply return True if the player has achieved a set of winning moves. There are eight winning moves in tic-tac-toe: the top row, the middle row, the bottom row, the left column, the middle column, the right column, and the two diagonals (corresponding to the eight entries in our win_state list).

def game_over(state):
    return wins(state, HUMAN) or wins(state, COMP)

This one's pretty easy! If either the human or the computer wins, the game is immediately over.

The Minimax Algorithm

On to the main event: the minimax algorithm! This is how our computer player decides which moves to make.

As mentioned earlier, this algorithm looks at "possible futures" in the game, assigning expected scores to each, and makes the move that maximizes the score (for the computer) and minimizes the score (for the human). (Remember, since the algorithm is being used by the computer player to select a move, it will want to maximize its score and minimize the human opponent's score.) We take the state of the game, the depth to recurse to in our game tree, and a player, and initialize our "best move" to -1, -1, -infinity (that is, a non-existent square with worst/lowest possible score) for the computer player and -1, -1, +infinity (a non-existent square with the best/highest possible score) for a human player. Then, for as long as we're recursing and the game isn't over, we continually make imaginary moves and check the resulting scores, assigning as the "best" score the one that represents the highest possible (for the computer) or lowest possible (for the human). In the base case, we return the result of evaluate on our game state, and the overall algorithm returns the move with the highest perceived score for the computer and the lowest one for the human.

def minimax(state, depth, player):
    if player == COMP:
        best = [-1, -1, -infinity]
    else:
        best = [-1, -1, +infinity]

    if depth == 0 or game_over(state):
        score = evaluate(state)
        return [-1, -1, score]

    for cell in empty_cells(state):
        x, y = cell[0], cell[1]
        state[x][y] = player
        score = minimax(state, depth - 1, -player)
        state[x][y] = 0
        score[0], score[1] = x, y

        if player == COMP:
            if score[2] > best[2]:
                best = score # max value
        else:
            if score[2] < best[2]:
                best = score # min value
    return best

Tying It All Together

Finally, we'll complete our game with three functions to manage gameplay: ai_turn, human_turn, and main.

def ai_turn(c_choice, h_choice):
    depth = len(empty_cells(board))
    if depth == 0 or game_over(board):
        return

    clean()
    print('Computer turn [{}]'.format(c_choice))
    render(board, c_choice, h_choice)

    if depth == 9:
        x = choice([0, 1, 2])
        y = choice([0, 1, 2])
    else:
        move = minimax(board, depth, COMP)
        x, y = move[0], move[1]

    set_move(x, y, COMP)
    sleep(1)

The computer player uses minimax to make the best possible move given the current state of the board, with the exception of when depth == 9 (that is, every square on the board is empty), in which case since the computer player is moving first, there's no "best" move and the computer selects a random square.

def human_turn(c_choice, h_choice):
    depth = len(empty_cells(board))
    if depth == 0 or game_over(board):
        return

    # Dictionary of valid moves
    move = -1
    moves = {
        1: [0, 0], 2: [0, 1], 3: [0, 2],
        4: [1, 0], 5: [1, 1], 6: [1, 2],
        7: [2, 0], 8: [2, 1], 9: [2, 2],
    }

    clean()
    print('Human turn [{}]'.format(h_choice))
    render(board, c_choice, h_choice)

    while (move < 1 or move > 9):
        try:
            move = int(input('Use numpad (1..9): '))
            coord = moves[move]
            try_move = set_move(coord[0], coord[1], HUMAN)

            if not try_move:
                print('Invalid move.')
                move = -1
        except KeyboardInterrupt:
            print('Goodbye!')
            exit()
        except:
            print('Invalid move.')

Here, we handle the input from the human player, mapping the numbers 1 through 9 to squares in our board (a two-dimensional list).

def main():
    clean()
    h_choice = ''
    c_choice = ''
    first = ''

    # Human chooses X or O to play
    while h_choice != 'O' and h_choice != 'X':
        try:
            print('')
            h_choice = input('Choose X or O\nChosen: ').upper()
        except KeyboardInterrupt:
            print('Goodbye!')
            exit()
        except:
            print('Invalid choice.')

    # Setting computer's choice
    if h_choice == 'X':
        c_choice = 'O'
    else:
        c_choice = 'X'

    # Human may starts first
    clean()
    while first != 'Y' and first != 'N':
        try:
            first = input('First to start? [y/n]: ').upper()
        except KeyboardInterrupt:
            print('Goodbye!')
            exit()
        except:
            print('Invalid choice.')

    # Main game loop
    while len(empty_cells(board)) > 0 and not game_over(board):
        if first == 'N':
            ai_turn(c_choice, h_choice)
            first = ''

        human_turn(c_choice, h_choice)
        ai_turn(c_choice, h_choice)

    # Game over message
    if wins(board, HUMAN):
        clean()
        print('Human turn [{}]'.format(h_choice))
        render(board, c_choice, h_choice)
        print('You win!')
    elif wins(board, COMP):
        clean()
        print('Computer turn [{}]'.format(c_choice))
        render(board, c_choice, h_choice)
        print('Computer wins!')
    else:
        clean()
        render(board, c_choice, h_choice)
        print('It\'s a tie!')

    exit()

Finally, our main function lets the human player select X or O and whether they want to move first or second, then enters the game loop: as long as there's an empty square on the board, alternate human and computer player turns, displaying the appropriate messages for when the human player wins, loses, or ties with the computer. We kick everything off just by calling main at the very end of the file.

...and that's it! I hope you enjoyed this tutorial, and feel free to fork this REPL to add more functionality.

You are viewing a single comment. View All
EdenBrown (1)

This is really cool! I'm not very good at graphics, and I never thought of constructing a board the way you did!