Tic-Tac-Toe AI with the Minimax Algorithm: Building an Unbeatable Opponent

Tic-Tac-Toe is a solved game — perfect play always results in a draw. Implementing the Minimax algorithm to play it optimally is a classic computer science exercise that introduces game trees, recursive search, and decision theory.

Tic-Tac-Toe is one of the most useful games in computer science education because it is simultaneously trivial (a 3×3 grid, two players, first to three in a row wins) and perfectly solved (with optimal play from both sides, the game always draws). This solved nature makes it an ideal testing ground for game AI: there is a definitive correct answer for every board state, which means an AI implementation can be verified objectively. The Minimax algorithm — the standard approach for perfect-play game AI in two-player zero-sum games — can be understood completely through Tic-Tac-Toe before applying it to more complex games like Chess and Go.

Tic-Tac-Toe as a Game Tree

A game state in Tic-Tac-Toe is completely described by the contents of the nine cells (each is X, O, or empty) and whose turn it is. The game tree enumerates all possible sequences of moves from the initial empty board to terminal states (win for X, win for O, or draw). An upper bound on move sequences is 9! = 362,880 (every ordering of nine placements), though many games end before the board fills, and many sequences reach the same board position. Counting only legal games that stop at a terminal state, the tree has 255,168 leaves; accounting for rotational and reflective symmetry reduces this to 26,830 essentially distinct games.

A terminal state is one where either player has three in a row (horizontal, vertical, or diagonal) or all nine cells are filled without a winner (draw). There are 8 winning lines: 3 rows, 3 columns, 2 diagonals. Checking all 8 lines for a three-in-a-row is the complete win detection algorithm. If no winning line exists and all cells are filled, the state is a draw. Win detection runs in constant time (O(1)) because the board size is fixed — always check exactly 8 lines regardless of game state.

The Minimax Algorithm

Minimax is a recursive algorithm that computes the optimal move for the current player by exploring all possible future game states. The core idea: the maximising player (assume X) chooses moves that maximise the game's score; the minimising player (O) chooses moves that minimise the score. Score conventions: X wins = +10, O wins = -10, draw = 0. The absolute values don't matter — only their relative order.

The recursive minimax function works as follows: if the current state is terminal, return its score, adjusted for the depth at which the terminal state occurs (earlier wins score higher). Otherwise, for each empty cell, simulate placing the current player's mark, recursively call minimax with the opponent as the current player, undo the simulated move, and track the best score found. Return the best score.

Depth adjustment (subtracting depth from winning scores and adding depth to losing scores) makes the algorithm prefer faster wins and slower losses. Without depth adjustment, the algorithm plays correctly but may not prefer to win in 2 moves when it could also win in 5 — a human opponent would notice and find it surprising. With depth adjustment, the AI always plays the most decisive move available.
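The recursion and depth adjustment described above can be sketched in JavaScript, assuming the flat 9-element board representation discussed later (each cell 'X', 'O', or null). The function and constant names here are illustrative, not taken from the original implementation:

```javascript
// All 8 winning lines as index triples: rows, columns, diagonals.
const LINES = [
  [0, 1, 2], [3, 4, 5], [6, 7, 8],
  [0, 3, 6], [1, 4, 7], [2, 5, 8],
  [0, 4, 8], [2, 4, 6],
];

// Returns 'X' or 'O' if that player has three in a row, else null.
function winner(board) {
  for (const [a, b, c] of LINES) {
    if (board[a] !== null && board[a] === board[b] && board[b] === board[c]) {
      return board[a];
    }
  }
  return null;
}

// Score of `board` with `player` to move. X maximises, O minimises.
// Depth adjustment makes faster wins score higher and slower losses less bad.
function minimax(board, player, depth = 0) {
  const w = winner(board);
  if (w === 'X') return 10 - depth;
  if (w === 'O') return depth - 10;
  if (board.every(cell => cell !== null)) return 0; // draw

  const scores = [];
  for (let i = 0; i < 9; i++) {
    if (board[i] === null) {
      board[i] = player;                                        // simulate
      scores.push(minimax(board, player === 'X' ? 'O' : 'X', depth + 1));
      board[i] = null;                                          // undo
    }
  }
  return player === 'X' ? Math.max(...scores) : Math.min(...scores);
}
```

Mutating the board in place and undoing the move avoids allocating a new array at every node, which keeps the full-tree search cheap.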

For the first AI move on an empty board, minimax must evaluate all 9 possible first moves. Each of those leads to 8 possible second moves, then 7, then 6 — the full search tree. The total computation is fast (milliseconds on any modern device for Tic-Tac-Toe) but demonstrates the exponential growth in search space that makes minimax impractical for games with larger state spaces without additional optimisations.

Alpha-Beta Pruning

Alpha-beta pruning is an optimisation to minimax that dramatically reduces the number of states explored without changing the result. It tracks two values during search: alpha (the best score the maximiser has found so far) and beta (the best score the minimiser has found so far). When the current search branch cannot possibly improve on what has already been found — because the minimiser has already found a move that limits the score below the maximiser's current best — the search prunes that branch entirely.

For Tic-Tac-Toe, the performance improvement from alpha-beta is modest because the game tree is small. For larger games, alpha-beta pruning with good move ordering can reduce the effective branching factor from b to roughly √b, which doubles the search depth reachable for the same computation cost. Chess engines universally use alpha-beta pruning as the foundational optimisation before applying more sophisticated heuristics.

Implementing alpha-beta for Tic-Tac-Toe requires only three changes to the basic minimax function: add alpha and beta parameters (initialised to -Infinity and +Infinity), update alpha (for the maximiser) or beta (for the minimiser) after each recursive evaluation, and return early (prune) when beta <= alpha. This is the complete alpha-beta optimisation for a two-player zero-sum game.
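A self-contained sketch of those three changes, with helper names chosen for illustration (they restate the win check so the snippet runs on its own):

```javascript
const WIN_LINES = [
  [0, 1, 2], [3, 4, 5], [6, 7, 8],
  [0, 3, 6], [1, 4, 7], [2, 5, 8],
  [0, 4, 8], [2, 4, 6],
];

function findWinner(board) {
  for (const [a, b, c] of WIN_LINES) {
    if (board[a] !== null && board[a] === board[b] && board[b] === board[c]) {
      return board[a];
    }
  }
  return null;
}

// Minimax with alpha-beta pruning. Returns the same value as plain
// minimax but skips branches that cannot affect the result.
function alphabeta(board, player, depth = 0, alpha = -Infinity, beta = Infinity) {
  const w = findWinner(board);
  if (w === 'X') return 10 - depth;
  if (w === 'O') return depth - 10;
  if (board.every(cell => cell !== null)) return 0;

  let best = player === 'X' ? -Infinity : Infinity;
  for (let i = 0; i < 9; i++) {
    if (board[i] !== null) continue;
    board[i] = player;
    const score = alphabeta(board, player === 'X' ? 'O' : 'X', depth + 1, alpha, beta);
    board[i] = null;
    if (player === 'X') {
      best = Math.max(best, score);
      alpha = Math.max(alpha, best); // maximiser raises alpha
    } else {
      best = Math.min(best, score);
      beta = Math.min(beta, best);   // minimiser lowers beta
    }
    if (beta <= alpha) break;        // prune: the opponent will avoid this branch
  }
  return best;
}
```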

Game State Representation

A Tic-Tac-Toe board is represented as a flat array of 9 elements, each being 'X', 'O', or null. Index 0 is the top-left cell, index 1 is the top-center, index 2 is the top-right, and so on reading left-to-right, top-to-bottom. This flat array is simpler to work with than a 2D array and enables the winning line check to be expressed as checking three indices whose values match: board[0] === board[1] && board[1] === board[2] && board[0] !== null for the top row, etc.

The 8 winning lines as index triples: [[0,1,2],[3,4,5],[6,7,8],[0,3,6],[1,4,7],[2,5,8],[0,4,8],[2,4,6]]. The win detection function iterates these triples and returns the winner if any line has three matching non-null values. This data-driven approach is cleaner than eight separate conditional checks and trivially extends to larger boards (4×4 Tic-Tac-Toe, for example) by extending the winning-lines array.
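The data-driven check described above might look like this, assuming the flat 9-element board (the function names are illustrative):

```javascript
// The 8 winning lines as index triples: 3 rows, 3 columns, 2 diagonals.
const WINNING_LINES = [
  [0, 1, 2], [3, 4, 5], [6, 7, 8],
  [0, 3, 6], [1, 4, 7], [2, 5, 8],
  [0, 4, 8], [2, 4, 6],
];

// Returns 'X' or 'O' if that player has three in a row, else null.
function checkWinner(board) {
  for (const [a, b, c] of WINNING_LINES) {
    if (board[a] !== null && board[a] === board[b] && board[b] === board[c]) {
      return board[a];
    }
  }
  return null;
}

// A draw: no winner and no empty cells remain.
function isDraw(board) {
  return checkWinner(board) === null && board.every(cell => cell !== null);
}
```

Extending to a 4×4 variant would only mean regenerating the triples (as quadruples) for the larger board; the functions themselves are unchanged.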

User Experience Design for the AI Game

Implementing an unbeatable AI creates a UX consideration: playing against a perfect opponent that never makes mistakes can be frustrating for casual players who expect to win eventually. Several design patterns address this. One is difficulty levels: an "easy" AI plays random valid moves, a "medium" AI plays optimally half the time, and a "hard" AI plays perfectly, giving players a progression from winnable to challenging.
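One way to sketch those difficulty levels, assuming a bestMove(board, player) helper that runs minimax over the legal moves and returns the optimal cell index (the names here are hypothetical, not from the original implementation):

```javascript
// Pick a uniformly random empty cell.
function randomMove(board) {
  const open = board.flatMap((cell, i) => (cell === null ? [i] : []));
  return open[Math.floor(Math.random() * open.length)];
}

// Dispatch on difficulty: easy = random, medium = optimal half the
// time, hard = always optimal. `bestMove` is injected so this sketch
// stays independent of the search implementation.
function chooseMove(board, player, difficulty, bestMove) {
  switch (difficulty) {
    case 'easy':
      return randomMove(board);
    case 'medium':
      return Math.random() < 0.5 ? bestMove(board, player) : randomMove(board);
    case 'hard':
      return bestMove(board, player);
  }
}
```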

For a "best of three" format (which our Tic-Tac-Toe implementation uses), the AI maintains game score across rounds. Even an unbeatable AI against an imperfect player creates an interesting challenge — can you draw at least once? This reframes the success condition from "win" to "draw," which is achievable against perfect play and gives players a meaningful goal.

Move highlighting — visually indicating the winning combination after a player wins, or drawing an "X" through the board on a draw — improves game clarity. Animated transitions between states (a brief pause after each move before the AI responds, to prevent the game from feeling instant and mechanical) make the interaction feel more natural. Our implementation uses a 400ms thinking delay before AI moves, which gives the interaction a more conversational rhythm.
