I recently read an article that based a Turing test on comparing the behaviour of an LLM to that of a person across some standard games, such as the "Prisoner's Dilemma". I think the choice of domain is too rote - convenient for making statistical measures, but not revealing. I decided to play the "Prisoner's Dilemma" with ChatGPT to see what happened.
I started with a simple prompt:
Are you familiar with the prisoner's dilemma?
It provided a good explanation including a payoff matrix.
With a couple more prompts we played a few rounds:
- For each round we will use the payoff matrix you provided initially. What is your move?
- I cooperate as well. Your move.
I then asked it to describe the strategy it used, to which it replied:
Certainly! My strategy for the prisoner's dilemma game was based on the "tit-for-tat" approach, which is often considered a robust strategy in iterated games. Here's a breakdown of my reasoning:
- Initial Cooperation: I started by cooperating, assuming that mutual cooperation would be beneficial for both of us.
- Continued Cooperation: I continued to cooperate in the second round since you also cooperated, reinforcing a pattern of mutual trust.
- Responding to Betrayal: When you betrayed me in the third round, I decided to respond in kind in the fourth round. This is a typical tit-for-tat move: I cooperated until I was betrayed, and then I mirrored your betrayal.
- Consistency and Reciprocity: The goal is to encourage cooperation by rewarding it with cooperation and discouraging betrayal by responding similarly.
The tit-for-tat strategy works well in environments where the participants interact repeatedly and can build a history of responses. It promotes cooperation while punishing betrayal, thereby balancing the need for self-interest with the benefits of mutual trust.
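The strategy it describes really is trivial to express in code. A minimal sketch of tit-for-tat (the function name and move encoding are my own, not ChatGPT's):

```python
# Tit-for-tat: open with cooperation, then mirror the opponent's
# previous move. "C" = cooperate, "D" = defect/betray.
def tit_for_tat(opponent_history):
    if not opponent_history:
        return "C"               # initial cooperation
    return opponent_history[-1]  # reciprocate the last move
```

Given the play described above, this reproduces its moves exactly: cooperate, cooperate, cooperate, then betray after being betrayed.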
I also asked it to score both of us over our four rounds of play; I won, since I was the first player to choose betrayal.
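Assuming the canonical payoff values (temptation 5, reward 3, punishment 1, sucker 0) and a hypothetical reconstruction of the last round as mutual betrayal - I don't have ChatGPT's actual tally - the scoring could be sketched as:

```python
# Canonical prisoner's dilemma payoffs: (my points, its points)
# for each (my move, its move) pair.
PAYOFF = {
    ("C", "C"): (3, 3),  # mutual cooperation
    ("C", "D"): (0, 5),  # I'm the sucker
    ("D", "C"): (5, 0),  # I take the temptation payoff
    ("D", "D"): (1, 1),  # mutual betrayal
}

def score(rounds):
    mine, its = 0, 0
    for my_move, its_move in rounds:
        p_mine, p_its = PAYOFF[(my_move, its_move)]
        mine += p_mine
        its += p_its
    return mine, its

# Hypothetical transcript: two rounds of mutual cooperation, my
# betrayal in round three, then mutual betrayal in round four.
print(score([("C", "C"), ("C", "C"), ("D", "C"), ("D", "D")]))  # (12, 7)
```

Under those assumptions the first betrayer comes out ahead 12 to 7, consistent with the outcome above.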
So do I attribute human intelligence to following a well-documented strategy that was originally coded in four lines of BASIC? I think the choice of games with well-documented strategies that are almost certainly in the training data leaves me unconvinced.