Let's play Rock Paper Scissors. After each round the winner receives 3£¹ from the loser. However, I have a handicap: I can never play Scissors. How much would you pay to play this game?
Somewhere between 0£ and 3£?
True. If you paid 0£ you would not expect to lose money, and if you paid 3£ you would certainly not make a profit, so you should pay something in between those values. However, we can do better than that. We can find the specific value that is fair to pay.
How?
Let's understand the game. I can only play Rock or Paper. Hence the possible outcomes (rows are your move, columns are mine, entries are your winnings) are:
| | Rock | Paper |
|---|---|---|
| Paper | 3£ | 0£ |
| Scissors | -3£ | 3£ |
| Rock | 0£ | -3£ |
Since playing Rock is always worse for you than playing Paper, we can assume you never play it, so the possible outcomes become:

| | Rock | Paper |
|---|---|---|
| Paper | 3£ | 0£ |
| Scissors | -3£ | 3£ |
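If you want to double-check the claim that Rock is never worth playing for you, here is a minimal sketch that encodes the payoff table and looks for dominated moves. The names `PAYOFF` and `is_dominated` are just labels I made up for this illustration.

```python
# Your winnings in pounds: PAYOFF[your_move][my_move].
# I can only play Rock or Paper, so each row has two entries.
PAYOFF = {
    "Paper":    {"Rock": 3,  "Paper": 0},
    "Scissors": {"Rock": -3, "Paper": 3},
    "Rock":     {"Rock": 0,  "Paper": -3},
}


def is_dominated(move, by):
    """True if `by` pays you strictly more than `move` against every move of mine."""
    return all(PAYOFF[by][m] > PAYOFF[move][m] for m in PAYOFF[move])


# Collect every (worse move, better move) pair.
dominated = [(a, b) for a in PAYOFF for b in PAYOFF
             if a != b and is_dominated(a, b)]
print(dominated)  # [('Rock', 'Paper')]
```

Rock is the only move that loses to another move in every column, which is why dropping it from the table costs you nothing.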
If you always play Paper you never lose money, and no matter what we play you will never win more than 3£, so your initial guess of paying something between 0£ and 3£ makes sense. If you always played Paper then I would always play Paper too, and you would win no money. You could change your tactic to playing Scissors in order to win the 3£, but then you would be at risk of losing 3£ if I play Rock.
How does that help us reach the fair value?
In practice, in every round we will each choose our move independently and with certain probabilities. For instance, I will choose Rock with probability $p$ and Paper with probability $1 - p$. You will choose to play Scissors with probability $q$ and Paper with probability $1 - q$. As such the probabilities of each outcome are:

| | Rock | Paper |
|---|---|---|
| Paper | $p(1-q)$ | $(1-p)(1-q)$ |
| Scissors | $pq$ | $(1-p)q$ |
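Let $E(p, q)$ be how much money you expect to make when I play Rock with probability $p$ and you play Scissors with probability $q$. Weighting each payoff by its probability, we get

$$E(p, q) = 3\,p(1-q) + 0\,(1-p)(1-q) - 3\,pq + 3\,(1-p)q = 3p + 3q - 9pq.$$

Cool. Now we have a function giving us how much you expect to get after a game. I will try to choose $p$ so that $E$ is minimal, and you will try to choose $q$ so that $E$ is maximal. Nash's work (and the reason for this post's title) was about proving that this kind of equilibrium always exists for finite games once the players are allowed to randomize their moves (it is called the Nash equilibrium).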
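To make the expected return concrete, here is a small sketch that computes it straight from the payoff table. It assumes I play Rock with probability `p` and you play Scissors with probability `q`; the function name `expected_return` is only a label for this example.

```python
from fractions import Fraction


def expected_return(p, q):
    """Your expected winnings per round, in pounds.

    p: my probability of playing Rock (otherwise I play Paper).
    q: your probability of playing Scissors (otherwise you play Paper).
    """
    return (3 * p * (1 - q)      # you Paper, me Rock: you win 3
            - 3 * p * q          # you Scissors, me Rock: you lose 3
            + 3 * (1 - p) * q)   # you Scissors, me Paper: you win 3
    # Both of us playing Paper pays 0, so it adds nothing.


# The four pure-strategy corners reproduce the payoff table.
print(expected_return(1, 0), expected_return(0, 0))  # 3 0
print(expected_return(1, 1), expected_return(0, 1))  # -3 3

# A mixed example with exact fractions.
print(expected_return(Fraction(1, 2), Fraction(1, 2)))  # 3/4
```

Using `Fraction` keeps the arithmetic exact, which is handy when the interesting values turn out to be thirds.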
How do we choose $p$ and $q$?
We have to go back to finding maxima and minima of functions. We have to choose $p, q \in [0, 1]$. Therefore either there is a local maximum somewhere in the middle or the maximum is attained at the endpoints. To find interior maxima we will use derivatives: if $q_0$ is a maximum or minimum, then the derivative of $E$ with respect to $q$ vanishes (i.e. equals zero) at $q_0$. With $E(p, q) = 3p + 3q - 9pq$, doing the actual computations we get:

$$\frac{\partial E}{\partial q} = 3 - 9p.$$

Similarly,

$$\frac{\partial E}{\partial p} = 3 - 9q.$$

Hence our local maxima / minima can be found when $p = \frac{1}{3}$ and $q = \frac{1}{3}$. Visually the expected returns for you (i.e. $E$) are:
| | $p < 1/3$ | $p = 1/3$ | $p > 1/3$ |
|---|---|---|---|
| $q < 1/3$ | (A) $E < 1$ | (B) $E = 1$ | (C) $E > 1$ |
| $q = 1/3$ | (D) $E = 1$ | (E) $E = 1$ | (F) $E = 1$ |
| $q > 1/3$ | (G) $E > 1$ | (H) $E = 1$ | (I) $E < 1$ |

I can choose the value of $p$ (my probability of playing Rock), which amounts to choosing the column we are in. Since I want to minimize your returns (i.e. minimize $E$), I am happy if, for any row you choose, I choose the column with minimal $E$. Hence my choice will end up being one of $\{A, I, D, E, F\}$: $A$ and $I$ because they are the minimum in their respective rows, and $D$, $E$, $F$ because they are all equal in their row.

Similarly, you get to choose $q$ (your probability of playing Scissors), i.e. you choose the row. In order to maximize $E$, you will only be happy with $G$ and $C$, since they are the maximum in their respective columns, or $B$, $E$, $H$, which are all equal in their column, for a final choice in the set $\{G, C, B, E, H\}$.

Looking at the 2 sets we chose, $\{A, I, D, E, F\}$ and $\{G, C, B, E, H\}$, we notice that they intersect in $E$. (Nash proved that when following this method there will always be an intersection. If you want to know how, I would recommend reading his paper.) As such, when we both play optimally we will be in cell $E$.
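The same row-minimum / column-maximum search can be sketched numerically. This is only an illustration under a few assumptions: it takes the expected return to be $3p + 3q - 9pq$ (which follows from the payoff table when I play Rock with probability $p$ and you play Scissors with probability $q$), and it restricts both of us to a hypothetical grid of probabilities in steps of 1/6.

```python
from fractions import Fraction


def expected_return(p, q):
    """Your expected winnings: p is my Rock probability, q your Scissors probability."""
    return 3 * p + 3 * q - 9 * p * q


# Hypothetical grid 0, 1/6, ..., 1. Exact fractions, so 1/3 is hit exactly.
grid = [Fraction(k, 6) for k in range(7)]

# Cells I am happy with: for each row q, every p attaining the row minimum.
mine = {(p, q) for q in grid for p in grid
        if expected_return(p, q) == min(expected_return(x, q) for x in grid)}

# Cells you are happy with: for each column p, every q attaining the column maximum.
yours = {(p, q) for p in grid for q in grid
         if expected_return(p, q) == max(expected_return(p, y) for y in grid)}

# Cells where neither of us wants to move.
equilibria = mine & yours
print(equilibria)  # {(Fraction(1, 3), Fraction(1, 3))}
```

On this grid the only cell we both accept is $p = q = 1/3$. A finer grid would work the same way, as long as it contains 1/3 exactly.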
I lost track. How does this relate to the original problem?
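We saw that if we both play optimally then we will end up in the middle cell, $E$, where $p = q = \frac{1}{3}$. Since $p$ is my probability of playing Rock and $q$ is your probability of playing Scissors, we have found the optimal strategy for both of us: I play Rock a third of the time and you play Scissors a third of the time.
For those probabilities, your expected return is $3 \cdot \frac{1}{3} + 3 \cdot \frac{1}{3} - 9 \cdot \frac{1}{3} \cdot \frac{1}{3} = 1$, so you will have an average return of 1£.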
Going back to the original question, paying 1£ per round would make it a fair game. Pay anything less than that and you will most likely profit in the long run; pay anything more than that and you will most likely lose money.
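If you would rather see this than trust the algebra, here is a rough simulation sketch. It assumes we both stick to the 1/3 strategies (I play Rock a third of the time, you play Scissors a third of the time) and that you pay a fixed entry fee every round. The function names, the seed, and the number of rounds are arbitrary choices for this illustration.

```python
import random


def play_round(rng, p=1/3, q=1/3):
    """Play one round and return your winnings in pounds.

    p: my probability of Rock (otherwise Paper).
    q: your probability of Scissors (otherwise Paper).
    """
    mine = "Rock" if rng.random() < p else "Paper"
    yours = "Scissors" if rng.random() < q else "Paper"
    if yours == "Paper":
        return 3 if mine == "Rock" else 0
    return -3 if mine == "Rock" else 3


def average_net(fee, rounds=200_000, seed=0):
    """Your average profit per round after paying `fee` to play each round."""
    rng = random.Random(seed)
    total = sum(play_round(rng) for _ in range(rounds))
    return total / rounds - fee


for fee in (0.5, 1.0, 1.5):
    print(fee, round(average_net(fee), 3))
```

With a fee of 1£ the average profit should hover around zero, while cheaper or pricier fees tilt it in your favour or mine.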
¹ You are probably wondering why I chose 3£ instead of 1£. During the calculations we end up having to divide that value by 3, so this way I have to work with fewer fractions.