Rock-Paper-Scissors Optimal Strategy
October 2, 2025
Question
You are playing rock paper scissors with a perfectly optimal opponent. However, players lose the overall game the first time that they lose with rock, the first time they lose with paper, or the third time they lose with scissors. Wins and ties do not affect this count. Losses do not have to be consecutive. What is your strategy?
Solution
The issue with this question is that it's hopelessly ambiguous. Does the ENTIRE game end the first time someone loses with rock/paper? Like, game over, tournament done, we walk away? Or is this a "best of 3" style rock-paper-scissors match, where the "overall" game is the best-of-3 game, and we want to maximize best-of-3 wins overall?
Less Likely Interpretation 1: Literal entire game ends at first Rock/Paper loss, third Scissors loss
If the entire game ends the first time someone loses with Rock or Paper, the optimal play is the symmetric Nash equilibrium: always throw Scissors. A formal proof is possible, but the intuition is straightforward. A Scissors loss only costs one of your three strikes, so you might expect the opponent to lean on Scissors, which tempts you to put positive probability on Rock to punish it. But an optimal opponent anticipates exactly that reasoning and puts some weight on Paper, so any Rock in your mix exposes you to a non-zero chance of instant elimination.
Symmetrically, if you ever place positive probability on Paper, an optimal opponent can put some weight on Scissors, and you again face a non-zero chance of instant elimination. Scissors is the only move that cannot knock you out in a single round; at worst it costs you one of three "lives" when the opponent plays Rock. Against a perfectly optimal opponent, Rock and Paper only create an immediate-loss threat for you while giving the other player no reason to expose themselves.
The unique safe fixed point is Scissors with probability 1, resulting in infinite ties unless someone deviates; any deviation introduces a non-zero chance of losing the entire game immediately, which is strictly worse in the minimax sense. So the strategy is to always throw Scissors. Which is a really stupid solution, so I'll assume they meant this:
More Likely Interpretation 2: An "Overall game" is a round of three, and we want to maximize our total wins of these overall games
It's relevant to note that your optimal choice depends on which state you are in; this is a stateful process. Rock and Paper are "one-life" options: if you ever lose with them you're instantly dead. Scissors is a "three-life" option: you only die on your third loss with it. So your "state" in the game is just how many Scissors losses you have already accumulated: 0, 1, or 2. The game ends when you either hit a Rock loss, a Paper loss, or reach 3 Scissors losses.
At each state k∈{0,1,2}, let your mixed strategy be (r_k, p_k, s_k). The opponent chooses a pure move to maximize your eventual chance of dying; your goal is to minimize that. The opponent's win probabilities break down as follows:
If the opponent plays Rock: the only relevant outcomes are you playing Paper (they immediately lose) or Scissors (you take one strike and move to k+1). If they commit to Rock, their probability of eventually winning is the product of the terms s_j/(p_j + s_j) for j = k, ..., 2: you must take a strike at each state, without ever throwing Paper, until you accumulate three strikes.
If the opponent plays Paper: the first non-tie decides the game. If it's your Rock you instantly lose; if it's your Scissors they instantly lose. So their winning probability is just r_k/(r_k + s_k).
If the opponent plays Scissors: again the first non-tie decides it. If it's your Paper you instantly lose; if it's your Rock they take one strike (your own state never changes here, since your Scissors only ties). They win unless the first three decisive rounds are all Rock, so their winning probability is 1 − (r_k/(r_k + p_k))^3.
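To make the three cases concrete, here is a minimal Python sketch of the opponent's exploitation probabilities under this model. The function name and the uniform-mix example are my own, not from the original analysis:

```python
# Opponent's eventual win probability at state k for each pure strategy,
# given your mixed strategies mixes[k] = (r, p, s) for k = 0, 1, 2.

def opponent_win_probs(mixes, k):
    r, p, s = mixes[k]
    # Always-Rock: you must take a strike at states k, k+1, ..., 2
    # without ever throwing Paper.
    rock = 1.0
    for j in range(k, 3):
        rj, pj, sj = mixes[j]
        rock *= sj / (pj + sj)
    # Paper: first non-tie decides; your Rock loses, your Scissors wins.
    paper = r / (r + s)
    # Scissors: they lose only if the first three decisive rounds are all Rock.
    scissors = 1 - (r / (r + p)) ** 3
    return rock, paper, scissors

# Example: a naive uniform mix at every state is badly exploitable.
uniform = [(1/3, 1/3, 1/3)] * 3
print(opponent_win_probs(uniform, 0))  # (0.125, 0.5, 0.875)
```

Against a uniform mix the opponent just plays Scissors forever and wins 87.5% of the time, which is why the mix has to be rebalanced until all three numbers agree.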
To be unexploitable, we must pick (r_k, p_k, s_k) at each state so that these three quantities are equal; otherwise the opponent would have a strict best response. Solving these equalities backward from k = 2 gives the following state-dependent optimal mixes:
- State 0 (no Scissors losses yet): r_0 ≈ 39.4%, p_0 ≈ 8.1%, s_0 ≈ 52.5%
- State 1 (one Scissors loss): r_1 ≈ 43.9%, p_1 ≈ 11.2%, s_1 ≈ 44.8%
- State 2 (two Scissors losses, one life left): r_2 ≈ 50.1%, p_2 ≈ 19.0%, s_2 ≈ 30.9%
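These numbers can be reproduced with a short script. The sketch below uses my own parametrization (not from the original): fix the opponent's win probability v at a state, back out the mix from the Paper and Scissors indifference conditions, then bisect on the Rock condition, running the backward induction from state 2:

```python
# Solve the indifference conditions backward from state 2.
# At state k, let v be the opponent's win probability. Indifference gives:
#   Paper:    r/(r+s) = v              =>  s = r(1-v)/v
#   Scissors: 1 - (r/(r+p))^3 = v      =>  p = r(1-c)/c, with c = (1-v)^(1/3)
#   Rock:     (s/(p+s)) * T = v, where T is the always-Rock continuation
#             value carried back from state k+1 (T = 1 past state 2).

def mix_from_value(v):
    c = (1 - v) ** (1 / 3)
    p_over_r = (1 - c) / c
    s_over_r = (1 - v) / v
    total = 1 + p_over_r + s_over_r
    return 1 / total, p_over_r / total, s_over_r / total

def solve_state(T):
    # s/(p+s)*T falls as v rises, so there is a single crossing: bisect.
    lo, hi = 1e-9, 1 - 1e-9
    for _ in range(100):
        v = (lo + hi) / 2
        r, p, s = mix_from_value(v)
        if s / (p + s) * T > v:
            lo = v
        else:
            hi = v
    return mix_from_value((lo + hi) / 2)

T = 1.0  # once you have three strikes, the opponent has already won
for k in (2, 1, 0):
    r, p, s = solve_state(T)
    print(f"state {k}: r={r:.1%} p={p:.1%} s={s:.1%}")
    T *= s / (p + s)  # fold this state's Rock factor into the continuation
```

The printed mixes match the percentages listed above to the stated precision.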
So the "proof" is just backward induction: define the eventual elimination probabilities for each opponent action at each state, impose the indifference conditions, and solve recursively. The solution makes intuitive sense. Early on (state 0), Scissors is cheap because it takes three losses to kill you, so you push Scissors above 50% and keep Paper very small. At state 1 you rebalance toward a roughly 44/11/45 mix. At state 2 the Scissors cushion is gone (one more loss with it kills you), so you cut back on Scissors and shift weight toward Rock and Paper. Note that even state 2 isn't 33/33/33: all three of your moves are now lethal on a loss, but the game is still asymmetric, since the opponent needs three Scissors losses to die, so they still have reason to be unbalanced.
This guarantees that at every state the opponent is indifferent among Rock, Paper, and Scissors, which is the exact condition for an optimal minimax strategy.
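As a sanity check on that indifference claim, here is a quick Monte Carlo simulation (entirely my own; it hard-codes the rounded mixes from the list above) pitting the state-dependent strategy against each pure opponent strategy. All three should win the overall game at roughly the same rate of about 42.9%:

```python
import random

# Rounded optimal mixes (r, p, s) by number of Scissors strikes accumulated.
MIX = {0: (0.394, 0.081, 0.525),
       1: (0.439, 0.112, 0.448),
       2: (0.501, 0.190, 0.309)}

def play(opp_move, rng):
    """Opponent commits to one pure move forever; return True if they win."""
    my_strikes = 0   # my Scissors losses (3 = I die)
    opp_strikes = 0  # their Scissors losses (3 = they die)
    while True:
        me = rng.choices("RPS", weights=MIX[my_strikes])[0]
        if me == opp_move:
            continue  # tie: replay the round
        if opp_move == "R":
            if me == "P":
                return False       # they lose with Rock: instant death
            my_strikes += 1        # my Scissors loses: I take a strike
            if my_strikes == 3:
                return True
        elif opp_move == "P":
            return me == "R"       # my Rock dies instantly; my Scissors kills them
        else:  # opp_move == "S"
            if me == "P":
                return True        # I lose with Paper: instant death
            opp_strikes += 1       # my Rock wins: they take a strike
            if opp_strikes == 3:
                return False

rng = random.Random(0)
N = 100_000
for m in "RPS":
    wins = sum(play(m, rng) for _ in range(N))
    print(m, wins / N)  # each should land near 0.429
```

All three pure strategies come out statistically indistinguishable, which is exactly the indifference the minimax mix is supposed to enforce.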
The annoying thing is that this still makes no sense. Why is the loss condition for Scissors the third loss, when an RPS three-game is typically a best-of-three, requiring only two losses? Do you get one immunity from Scissors losses? I haven't gotten around to getting the answer from BAC yet, but I'll get to it.