Friday, July 21, 2006

Sorry that this isn’t related to Blognomic, but…

I need a good strategy for Chicken, and Blognomic is fully loaded with matrix game theorists, so I thought I’d ask around here.

So, there are two players driving toward each other in cars (equipped with airbags). At the last minute each has the choice to either swerve, or not swerve.

Here’s the payoff matrix, with the first row and column having the action of swerving, and the second row and column having the action of not swerving.

(-1,-1)  ( 0 , 1)
( 1,  0 )  (-5,-5)

That is, if both players swerve, both

lose 1 point

.
If one player swerves, and the other doesn’t, the one who drove straight

gets 1 point

, while the one who swerved

gets 0 points

.
If both players don’t swerve, bad things happen, and both

lose 5 points

.

It’s not as easy as Prisoner’s Dilemma, I think, since tit for tat really doesn’t work. It pays off for both players if they both take different actions. Any ideas for strategies?

Comments

Saki:

22-07-2006 01:01:46 UTC

I’m no good with matricies, but I can give you a two player strategy: Never Swerve.

Assume that there’s only two players and whoever has more points at the end wins. If player A never swerves then player B either swerves, giving player A one point, or doesn’t swerve and both players lose five points.

Either way, player A always wins or ties.

Bucky:

22-07-2006 02:04:32 UTC

You will be better off swerving most of the time.

Assuming both players use the same strategy, each player should swerve 90% of the time. 

However, if “Not play chicken” is an option with a payoff of (0,0), it’s the best pick every time, since my strategy’s average payout is -5%.

Bucky:

22-07-2006 02:06:01 UTC

Oh, and make sure to seed their RNGs with different values!

Bucky:

22-07-2006 03:29:07 UTC

Even better: agree with the other guy beforehand: One of us swerves on even numbered times, the other swerves on odd numbered times.  It’s in everyone’s best interest to keep the bargain.

Elias IX:

22-07-2006 04:20:30 UTC

Mmm, I forgot to mention. This strategy will be submitted into a tournament of 20 or so strategies.

Bucky:

22-07-2006 04:39:04 UTC

So you don’t know the other guy’s strategy?  No communication?

How does the tournament scoring work?

Bucky:

22-07-2006 04:41:19 UTC

Also, how many runs do you get against each opponent?  If it’s >50 or so I could come up with an adaptive strategy based on the other player’s past moves.

Bucky:

22-07-2006 05:16:27 UTC

Adaptive strategy:
Swerve if you think your partner has less than a 5 in 7 chance of swerving; otherwise drive streight.  However, before this becomes statistically significant you should swerve most of the time (11/14? check my math on both of these)

Flush probs for each new partner.

This isn’t quite optimized because it can’t adjust if the partner uses a similar strategy but is more aggressive at the beginning.  It’s the best I’m going to come up with tonight.

HOWEVER:  Against a similar strategy it’s best to swerve 4 of the first 7 times so that they will get cautious, if you do enough rounds against one partner.

Hix:

22-07-2006 21:23:27 UTC

The level of communication you’ll have available is the most important unknown to helping you here.  I assume that you’ll be playing multiple games versus (possibly) several opponents, and that the only “communication” you have is the knowledge of what your opponents did in previous rounds with you.

The “safety level” for this game is -5/7 and can be achieved by following a strategy of swerving 6/7 of the time (whether the opponent swerves or not, you would not expect to lose more than 5/7 on average).  The safety level should always be in the back of your mind as a fallback in case your opponent is behaving so erratically that you cannot predict em.

Bucky is correct to point out that swerving is the better option if your opponent is driving straight at least 2/7 of the time, but this analysis should only be used if you assume (or have reason to believe):
1) your opponent is following a simple “swerve X% of the time” strategy, giving no regard to the pattern of swerves/straights
2) your opponent is not willing to “communicate” across rounds through the choices that you both make.

Let’s work on the case where we don’t make those assumptions, shall we?

If there were communication allowed before starting, one thing that you would want to do is to make a credible threat of “I will not swerve” and then immediately cut off communications (Admittedly, this works best if the game is only being played once).  Not really feasable in repeated play, since repeated play is a form of communication; but good to keep in mind if your opponent is playing timidly—successfully convincing a rational opponent that you are ignoring all communication is technically more profitable than cooperation, but really hard to pull off.

The most likely outcome of communication prior to repeated play would be an agreement to alternate which player gets to drive straight, which is both the fairest division, and is also Paraeto optimal.  A nice benefit to this solution is that, once the pattern is established, neither player has any incentive to deviate from it—the risk of collision is too great, and cooperation is practically required to avoid it.  Thus, I suggest that even without direct communication beforehand, attempts should be made within the game to establish this pattern.  The exact implementation of this can probably be done in many ways, but to make my suggestions I note that there is a sort of meta-game Prisoner’s Dillemma going on here, where “cooperation” represents “attempting to establish the alternating pattern, or continuing an already existing alternating pattern” and defection represents “not working toward establishing the pattern, or trying to take advantage of what the opponent seems to be doing”. 

So, if you’re doing repeated play in a tournament, I suggest your strategy have the following 4 properties (adapted from Robert Axelrod’s analysis of repeated play Prisoner’s Dillemma in a tournament setting):
1) Nice.  You start by attempting to establish the pattern, and you don’t be the first one to abandon the attempt.
2) Retaliatory.  You should reliably punish your opponent for abandoning the pattern, or the attempt to establish the pattern.
3) Forgiving.  Having punished the opponent, you should be willing to attempt the pattern again.
4) Clear.  Your pattern of play should be consistent and predictable as often as possible.

So it remains to work out HOW to “attempt to establish the pattern”, HOW to “punish pattern abandonment”, and HOW (or at least WHEN) to “forgive”; all in a clear a manner as possible.  Establishing the pattern would be easy if you could agree who had to swerve first—but without communication, here’s my idea:  Start with the “saftety strategy” until exactly one of you and your opponent swerves.  Then start alternating.  If it becomes clear after a enough attempts at this sort of thing isn’t working, go into “punishing” mode, which may be some combination of the safety strategy (which doesn’t hurt you too much, but causes enough collisions to show the opponent the value of cooperation) or the reactive strategy that Bucky mentioned (i.e. cut your losses versus a bully by swerving, or take advantage of a timid player by not swerving).  If your opponent shows willing (becomes less bullyish/timid, or starts alternating), you can switch back to the pattern.

Bucky:

22-07-2006 22:36:19 UTC

Oh, and if your partner’s ratio is exactly 2/7 and you aren’t in a pattern, you should not swerve because it lowers the opponent’s average score.

Hix:

22-07-2006 23:55:57 UTC

Results of further analysis:  Assuming that there is no way to distinguish between players (so that you can’t, for example, agree that the “oldest”, “tallest”, or “northernmost” player has to swerve first), and that no communication is allowed, the best strategy for the group taken as a whole is for each player to run an algorithm similar to something I described above.  Start with a strategy of swerving with probability x (more on x later) until it becomes possible to distinguish the two players (i.e. one swerves while the other doesn’t).  Now that the players are distinguished, they can coordinate further cooperation, and should do so for the remainder of their rounds together (alternation is the simplest method).  The “safety strategy” x=6/7 is NOT the correct value for x, though.  That results in too many swerve/swerve combinations before coordinated cooperation starts.  The best value is, strangely enough, x=(11-sqrt(33))/8 (about .65693).

Although this strategy is not likely to arise natually, it is stable in the sense that if everyone is using this strategy, no player can expect individual gain by switching to any other possible algorithm.

Bucky:

23-07-2006 01:45:09 UTC

4 major questions:
1)Can you (and your partner) communicate in any form other than swerve/not swerve

2)How many times do you face each partner?

3)Are you facing other humans or abstract strategies?  In other words, do you all submit strategies and let the tournament run itsself, or do you input the result every round?

4)Do you know when you switch partners?

5)How smart are the other people in the tournament?  How many of them can come up with somthing better than swerving X% of the time at optimum X?

At the moment, I’m assuming no, at least 20, abstract strategies, yes and mostly about as smart as you.