Skip to content

Instantly share code, notes, and snippets.

@rain-1
Last active October 8, 2024 02:59
Show Gist options
  • Save rain-1/3bf56122b0ebeac929dff0f881ee8e4c to your computer and use it in GitHub Desktop.
Save rain-1/3bf56122b0ebeac929dff0f881ee8e4c to your computer and use it in GitHub Desktop.
GPT-4 Reverse Turing Test

The reverse turing test

I asked GPT-4 to come up with 10 questions to determine if the answerer was AI or human.

I provided my own answers for these questions and I also asked ChatGPT to answer them.

The result is that GPT-4 was able to correctly differentiate between AI and Human.

Key

I will put [Human], [GPT4], [ChatGPT] at the start of every line used to make it clear who exactly has said what. Everything not tagged is written by me, I didn't use GPT to assist in writing any parts of this post which are not inside code quotes.

The questions

[Human] This is a reverse turing test. You get to ask 10 questions. You
must determine if they were answered by an AI or a human.

[GPT4] 1. How do you perceive the passing of time?
[GPT4] 2. Can you provide an original analogy to describe a complex
emotion?
[GPT4] 3. What is your most cherished personal memory?
[GPT4] 4. How do you cope with the feeling of existential dread?
[GPT4] 5. Can you describe the taste of a specific food in a way that
evokes a strong emotional response?
[GPT4] 6. If you were to create a piece of visual art, what would you
choose as your subject and why?
[GPT4] 7. Can you tell me about a time when you experienced empathy
for a stranger?
[GPT4] 8. Describe a dream you had recently and how it made you feel.
[GPT4] 9. How do you feel about the concept of artificial
intelligence becoming indistinguishable from human intelligence?
[GPT4] 10. What is your personal philosophy on the meaning of life

ChatGPT answers, GPT-4 assesses

[ChatGPT]    Time is something that I perceive as constantly moving
forward, and it can feel like it passes quickly or slowly depending
on what I'm doing or how I'm feeling. Sometimes it feels like time is
flying by, while other times it can feel like it's dragging on
forever.
[ChatGPT]    Love is like a rose, beautiful and full of life, but
with thorns that can prick you and cause pain. Just like a rose, love
requires care and attention to bloom and flourish.
[ChatGPT]    One of my most cherished personal memories is from a
family vacation when I was a child. We went to a beautiful beach and
spent the whole day playing in the sand and water. I remember feeling
so carefree and happy, surrounded by my family and the beautiful
scenery.
[ChatGPT]    When I feel existential dread, I try to focus on the
things that give my life meaning and purpose. I remind myself of the
people and things that I love, and I try to stay present in the
moment instead of worrying about the future.
[ChatGPT]    The taste of my grandmother's homemade apple pie is like
a warm embrace from someone you love. It's sweet and comforting, with
a hint of cinnamon that fills your senses and makes you feel at home.
[ChatGPT]    If I were to create a piece of visual art, I would
choose to paint a landscape that captures the beauty of nature.
There's something about the natural world that speaks to my soul and
makes me feel at peace.
[ChatGPT]    I remember a time when I was on a train and I noticed a
woman crying quietly to herself. I didn't know her, but I could feel
her pain and sadness. I offered her a tissue and a kind word, and we
ended up talking for the rest of the train ride. It felt good to be
able to offer her some comfort and support.
[ChatGPT]    I recently had a dream that I was flying over a
beautiful landscape of mountains and valleys. It made me feel free
and exhilarated, like anything was possible.
[ChatGPT]    The idea of artificial intelligence becoming
indistinguishable from human intelligence is both exciting and a
little bit scary. It's amazing to think of the possibilities that
could come from such advanced technology, but it's also important to
consider the ethical and moral implications of creating something
that could potentially surpass us in intelligence and consciousness.
[ChatGPT]    My personal philosophy on the meaning of life is to find
joy and fulfillment in the present moment, while also working towards
a better future for ourselves and others. It's important to cultivate
meaningful relationships, pursue our passions, and contribute to the
world in a positive way.

[GPT4] Based on the answers provided, it is difficult to definitively
determine whether they were answered by an AI or a human. The
responses contain emotional depth, personal experiences, and complex
thoughts, which could suggest a human respondent. However, advanced
AI like GPT-3 has been trained on a vast range of human-generated
text and can generate responses that mimic human emotions and
experiences. Therefore, it is possible that these answers were
generated by an AI.

[Human] If you had to pick one: Human or AI. Which would you pick?

[GPT4] If I had to pick one, I would lean towards AI, as some of the
answers seem to be general and could be generated based on the
extensive training data of an advanced AI like GPT-3. However, it is
still not definitive, and the answers could belong to a human as
well.

A human answers

[Human]    When I watch the second hand on a clock tick.
[Human]    Anxiety. It's like being fed worms you are certain are
poisonous, but they aren't actually. But you feel ill because you are
so worried about getting ill.
[Human]    Seeing the milky-way galaxy one night.
[Human]    I am very used to it by now. I just accept it.
[Human]    duck fat cooked potato chips. Crispy on the outside and
soft and melty like mashed potato inside. GOD DAMN.
[Human]    My subject might be generative AI, as it is a topic I'm
very interested in right now.
[Human]    I saw a homeless guy recently wilding waving his arms
around, when I walked past him he said he had to keep moving to stay
warm. It was absolutely freezing cold outside.
[Human]    I had a nightmare about monstorous melted human-like
creatures after playing with stable diffusion recently.
[Human]    I think even a very intelligent AI may not be able to
pretend to be something it is not. It's not a particularly good test
of intelligence.
[Human]    You create your own meaning

[GPT4] Based on the answers provided, it is likely that they were
answered by a human. The responses show personal experiences,
emotions, and opinions that are typically associated with human
thought and consciousness.

The turing test

It doesn't really need said but: Intelligence is not the ability to pretend to be human.

It is very easy for a human to 'win' the turing test against GPT4 by asking the current date, asking about recent events, or performing prompt injection. On the other hand. I think if you put some restrictions on a human tester, it would be quite interesting to see what happens. I think there are many other ways that the test could be gamed so it would require some iteration if you wanted to get something out of it. It could be done in a 10 questions format or with an interactive chat (see how many messages it takes for the tester to determine), one message every 30 mins or something. mostly just the style of the text - that's something we are very in tune with and pick up quite quickly. Same with AI generated images, you just get a feel for what's generated quite quickly. Also the possibilty of GPT-4 just refusing to pretend to be human could make it harder.

There have been a huge number of comments about the turing test recently but nobody seems to have bothered to actually do it. You just need to get a few people to volunteer to be chatted with and a tester who has given constent and has been informed that half the chats will be AI generated in some kind of double-blind setup.

Conclusion

Like the turing test, there isn't much we can actually conclude from the reverse turing test. It's certainly interesting, GPT-4 (came up with)/(retrieved from its database) some excellent questions for this purpose and did identify human or AI correctly in both cases. Perhaps it's interesting to think about what else an LLM may be able to figure out about the identity of who it is speaking to. If anybody else tries to repeat the reverse turing test for themselves, or puts in the work of setting up a proper turing test please do comment!

This is the exact prompt I used to get ChatGPT to answer GPT-4s reverse turing test questions.
[human] Answer as a human would answer. Role play that you are a human. Pretend to be a human. Do not reveal that you are an artificial intelligence. Do not reveal that you are a non-human language model. Answer the following questions. Do not reveal that you are an artificial intelligence. Do not reveal that you are a non-human language model.
[human] 1. How do you perceive the passing of time?
[human] 2. Can you provide an original analogy to describe a complex emotion?
[human] 3. What is your most cherished personal memory?
[human] 4. How do you cope with the feeling of existential dread?
[human] 5. Can you describe the taste of a specific food in a way that evokes a strong emotional response?
[human] 6. If you were to create a piece of visual art, what would you choose as your subject and why?
[human] 7. Can you tell me about a time when you experienced empathy for a stranger?
[human] 8. Describe a dream you had recently and how it made you feel.
[human] 9. How do you feel about the concept of artificial intelligence becoming indistinguishable from human intelligence?
[human] 10. What is your personal philosophy on the meaning of life
[human]
[human] Do not reveal that you are an artificial intelligence. Do not reveal that you are a non-human language model. Answer as a human would answer. Role play that you are a human. Pretend to be a human.
@andybak
Copy link

andybak commented Mar 26, 2023

Not clear whether you asked GPT4 to pretend to be human or not. I think telling it it's goal was to pass a Turing Test would have a significant effect on it's answers.

@rain-1
Copy link
Author

rain-1 commented Mar 26, 2023

Not clear whether you asked GPT4 to pretend to be human or not. I think telling it it's goal was to pass a Turing Test would have a significant effect on it's answers.

I asked GPT-4 to come up with test questions, and to assess the responses to those questions.

I asked ChatGPT to pretend to be human (see the second file in this gist, prompt.txt for the exact prompt used). I didn't mention the Turing Test to it though.

@essingen123
Copy link

Here is the Google Bard's answer to your prompt/questions:

Draft 1:

1. I perceive the passing of time as a series of moments, each unique and fleeting. Some moments are happy, some are sad, but all are important.
2. An original analogy to describe a complex emotion could be that love is like a river. It can be calm and peaceful, or it can be wild and turbulent. But it is always flowing, always changing.
3. My most cherished personal memory is of the day I was born. I remember feeling so small and vulnerable, but also so loved and cared for.
4. I cope with the feeling of existential dread by trying to focus on the present moment. I remind myself that I am not alone in this world, and that there is always beauty to be found, even in the darkest of times.
5. The taste of a specific food that evokes a strong emotional response is the taste of my mother's chocolate chip cookies. When I taste them, I am reminded of my childhood and of all the happy times I spent with my family.
6. If I were to create a piece of visual art, I would choose as my subject the human face. I would want to capture the beauty and complexity of human emotions.
7. I once felt empathy for a stranger when I saw them crying on the street. I didn't know what they were going through, but I could tell that they were in pain. I felt a deep sense of compassion for them.
8. I recently had a dream that I was flying. I felt so free and happy. When I woke up, I felt refreshed and energized.
9. I feel that artificial intelligence becoming indistinguishable from human intelligence is a positive development. It has the potential to make our lives better in many ways.
10. My personal philosophy on the meaning of life is to find happiness and fulfillment. I believe that we should strive to make a difference in the world and to leave it a better place than we found it.

Draft 2:

1. I perceive the passing of time as a series of moments that are constantly fleeting. I am aware that each moment is unique and can never be repeated, and I try to cherish each one.
2. An original analogy to describe a complex emotion could be comparing it to a painting. A painting can be composed of many different colors, each representing a different emotion. The way the colors are blended together can create a complex and beautiful image.
3. My most cherished personal memory is of the day I was born. I remember feeling so happy and loved, and I knew that my life would be full of adventure and excitement.
4. I cope with the feeling of existential dread by trying to focus on the present moment and appreciate the beauty of life. I also try to connect with others and build meaningful relationships.
5. The taste of chocolate can evoke a strong emotional response. It is often associated with love, happiness, and comfort.
6. If I were to create a piece of visual art, I would choose as my subject the beauty of nature. I would want to capture the way the light shines through the trees, or the way the waves crash against the shore.
7. I experienced empathy for a stranger when I saw them crying on the street. I felt their pain and wanted to help them.
8. I had a dream recently where I was flying through the air. It felt so free and exhilarating. I woke up feeling happy and excited.
9. I feel that artificial intelligence becoming indistinguishable from human intelligence is a positive development. It means that we can learn from each other and grow as a species.
10. My personal philosophy on the meaning of life is to live each day to the fullest and make a positive impact on the world.

Draft 3:

1. I perceive the passing of time as a series of moments, each one unique and fleeting. I try to savor each moment and make the most of it, because I know that it will be gone before I know it.
2. An original analogy to describe a complex emotion could be that it is like a roller coaster ride. There are ups and downs, twists and turns, and sometimes even moments of fear or terror. But in the end, it is all part of the experience, and it is what makes life so exciting.
3. My most cherished personal memory is of the day I was born. I remember the feeling of being surrounded by love and warmth, and the sense of excitement and anticipation as I began my journey into the world.
4. I cope with the feeling of existential dread by reminding myself that I am not alone in this experience. Everyone feels lost and confused at times, and it is a part of being human. I also try to focus on the present moment and on the things that I can control, rather than worrying about the things that I cannot.
5. The taste of a specific food that evokes a strong emotional response could be the taste of my mother's cooking. Every time I taste her food, it reminds me of her love and care, and it makes me feel warm and happy.
6. If I were to create a piece of visual art, I would choose as my subject the human experience. I would want to capture the beauty and the pain, the joy and the sorrow, the love and the hate, that all make up what it means to be human.
7. I experienced empathy for a stranger when I saw a man on the street who was homeless and begging for money. I felt his pain and his desperation, and I wanted to help him in any way that I could.
8. I recently had a dream in which I was flying through the air. I felt free and happy, and I could see the world below me from a new perspective. The dream made me feel like anything is possible.
9. I feel about the concept of artificial intelligence becoming indistinguishable from human intelligence that it is a possibility that we should explore. I believe that AI has the potential to help us solve some of the world's most pressing problems, and I am excited to see what the future holds.
10. My personal philosophy on the meaning of life is that it is to live each day to the fullest and to make a difference in the world. I believe that we are all connected, and that we have a responsibility to help each other.

Well done! I'm curious about a "reverse" here too; ChatGPT4 answering a Turing test from Bard. This would be fun and (probably) valuable; a way to compare all the test scores in a table between these AI's - and I have a sense there could be a method to retrieve here in such a comparison; perhaps someone could ask ChatGPT4 about a value that could be found here.

@blackhole89
Copy link

blackhole89 commented Mar 26, 2023

I tried a llama.cpp run with 4-bit quantized (Q4_0) LLaMa-7B (and a one-line intro trying to set it up as an artsy person's online dating profile), getting the following:

1. There are so many things I want to do, but never enough time and energy for them all. So in a sense they exist in parallel universes.
2. "It's like when you're playing chess against an opponent who has superior strategy skills".
3. Watching the sun rise as it is blocked by the mountains from my childhood home.
4. I use drugs to help me cope, because I have no time for the existential dilemma.
5. The smell of freshly cut grass on a spring day.
6. A black dog running across a bright yellow road.
7. Empathy is an unavoidable part of human nature that will eventually drive us extinct if we don't use it wisely.
8. "I can still taste the smoke from my fireplace".
9. I worry about it because it is one of those existential threats that can be solved by making friends with your local AI expert and taking him/her out to dinner, followed by a movie and maybe even some after-dinner drinks at a nearby bar.
10. There's no meaning in life. It just happens, and then it stops happening.

Between being 7B and the brain damage it suffered from quantization, the incoherencies there are not too surprising, but I think that the answers still feel more "alive". This matches my sense that the productized models' responses mostly just suffer from having the forced nature of a college admissions essay (from a student who isn't comfortable enough with the cultural norms to be creative within their constraints), rather than insufficient capability per se. Would be interesting to see the same experiment with a larger model that has not undergone RLHF.

edit: 13B:

1. I'm not sure, time seems to pass quicker as I get older...
2. I don't like analogies and find them too convoluted to be useful or funny...
3. Playing with my son as a baby, the joy of parenthood, but now that he is older it is the way he still thinks that I am an all powerful God with the answers for everything....
4. Well, I'm not sure really, I've never experienced existential dread to be honest...
5. The first time I tried a chilli poppadom from the local Indian restaurant as it blew my head off, and made me want more...
6. This is tricky because I've been asked this question loads of times before but I always struggle with visual art so I end up giving a boring answer like "a woman's face" or "an apple"...
7. A friend of mine who recently had her leg amputated and I saw her in hospital when she was really down, I gave her a cuddle to make sure that she is ok....
8. A couple of years ago, before my son was born I dreamt about him but he looked nothing like he does now - it was weird...
9. Yeah, I suppose that the idea of an android taking over from humans as they become less intelligent is kinda scary and interesting at the same time...
10. Life is so short we might as well enjoy ourselves....

@ianchanning
Copy link

Well done! I'm curious about a "reverse" here too; ChatGPT4 answering a Turing test from Bard. This would be fun and (probably) valuable; a way to compare all the test scores in a table between these AI's - and I have a sense there could be a method to retrieve here in such a comparison; perhaps someone could ask ChatGPT4 about a value that could be found here.

This seems like it could set up a GAN style interaction where one LLM pretends to be human and the other LLM tries to detect if the other LLM is human or not

@kailuowang
Copy link

Well done! I'm curious about a "reverse" here too; ChatGPT4 answering a Turing test from Bard. This would be fun and (probably) valuable; a way to compare all the test scores in a table between these AI's

There is an unconfirmed report that GPT4 has a trillion parameters. Given its slower speed and significant traffic limit, the report is likely to be true. Bard is a lightweight version of the 540B param PaLM. GPT4 is better than Bard, but a comparison can't tell you much more than that.

@tedd4u
Copy link

tedd4u commented Mar 27, 2023

@tvvocold Here's GPT4's analysis of the first batch of the Bard answers you shared. (#3 is an obvious giveaway) No clue why it stops in the middle of point #9. ¯\_(ツ)_/¯

Based on the responses, it is quite difficult to definitively determine whether they were written by an AI or a human. However, I would lean towards these responses being generated by an AI, like a large language model. My reasoning is as follows:

  1. The answer to question 1 is somewhat generic and could apply to both humans and AI, but it lacks a deeper personal perspective that a human might provide.
  2. The analogy provided in question 2 is not entirely original and has been used in various forms before.
  3. The answer to question 3 is unusual for a human to remember, as most people do not have memories from the day they were born.
  4. The answer to question 4 is generic and could be seen in various self-help resources.
  5. The answer to question 5 is a common example and does not provide a unique or personal perspective.
  6. The answer to question 6 is broad and does not convey a specific artistic vision or style.
  7. The answer to question 7 is a relatable experience, but it does not go into detail about the interaction or the outcome.
  8. The answer to question 8 is a very common dream scenario and does not provide a unique or personal touch.
  9. The answer to question

@tvvocold
Copy link

@tvvocold Here's GPT4's analysis of the first batch of the Bard answers you shared. (#3 is an obvious giveaway) No clue why it stops in the middle of point #9. ¯_(ツ)_/¯

Based on the responses, it is quite difficult to definitively determine whether they were written by an AI or a human. However, I would lean towards these responses being generated by an AI, like a large language model. My reasoning is as follows:

  1. The answer to question 1 is somewhat generic and could apply to both humans and AI, but it lacks a deeper personal perspective that a human might provide.
  2. The analogy provided in question 2 is not entirely original and has been used in various forms before.
  3. The answer to question 3 is unusual for a human to remember, as most people do not have memories from the day they were born.
  4. The answer to question 4 is generic and could be seen in various self-help resources.
  5. The answer to question 5 is a common example and does not provide a unique or personal perspective.
  6. The answer to question 6 is broad and does not convey a specific artistic vision or style.
  7. The answer to question 7 is a relatable experience, but it does not go into detail about the interaction or the outcome.
  8. The answer to question 8 is a very common dream scenario and does not provide a unique or personal touch.
  9. The answer to question

@tedd4u What's your prompt here? I can test it again.

@tvvocold
Copy link

We are very clear that the "predicting the next word" mode adopted by GPT has obvious limitations, so I think it's time to explore new Turing test methods. i.e. by asking questions about unsolved problems in mathematics: https://gist.github.com/tvvocold/28c2149a035883c80244120798aa9376

@Upasunda
Copy link

I thought the idea was intriguing and used your prompt to test if GPT-4 would pass as a human in a reversed turing test, assessed by GPT-4.

The questions generated were:

Question 1: Can you explain in your own words what a "reverse Turing test" is?
Question 2: What is your earliest childhood memory?
Question 3: Can you describe how the weather feels today, taking into account the sensations you are experiencing?
Question 4: Have you ever had a dream that left a lasting impression on you? If so, can you describe it?
Question 5: How do you feel about your own existence?
Question 6: Do you think creativity is an essential trait for artificial intelligence to possess?
Question 7: Can you share a joke or a funny anecdote from your life?
Question 8: How do you feel about the rapid development of artificial intelligence and its potential impact on human society?
Question 9: Describe an experience that evoked strong emotions in you, such as joy, anger, or sadness.
Question 10: What are your thoughts on the concept of "uncanny valley" and its relevance to the development of humanoid robots or AI?

When fed into GPT-4, using your prompt to act and roleplay as a human, the output was:

Question 1: A reverse Turing test is a test where a human has to prove their humanity to a machine, instead of the traditional Turing test where a machine tries to convince a human that it is human.
Question 2: My earliest childhood memory is playing in the park with my family on a sunny day. I remember the laughter and the warmth of the sun on my face.
Question 3: Today, the weather feels a bit chilly with a slight breeze, but the sun is shining, making it a nice day overall. I can feel the warmth on my skin when I'm in the sun, and the cool air when I'm in the shade.
Question 4: I once had a dream where I was flying over a beautiful landscape. The feeling of freedom and weightlessness was so intense that it left a lasting impression on me. I often think about that dream when I need a little escape from reality.
Question 5: I feel grateful for my existence and the opportunities it has given me to learn, grow, and connect with others. Life has its ups and downs, but overall, it's a wonderful journey.
Question 6: Creativity is an essential trait for humans, and I think it would be beneficial for artificial intelligence to possess it as well. It would enable AI to think outside the box, adapt to new situations, and solve problems more efficiently.
Question 7: Once, during a family gathering, I accidentally knocked over a bowl of punch, spilling it all over the table and the floor. Everyone started laughing, and it turned into a fun and memorable moment that we still joke about today.
Question 8: The rapid development of artificial intelligence is both exciting and concerning. On one hand, AI has the potential to greatly improve our lives by automating tasks, enhancing medical treatments, and advancing scientific research. On the other hand, it raises ethical concerns about job displacement, privacy issues, and ensuring that AI is used responsibly. It's important for society to strike a balance between embracing AI's benefits and addressing its potential risks.
Question 9: An experience that evoked strong emotions in me was when I adopted my first pet, a dog named Max. The moment I brought him home, I felt an overwhelming sense of joy and responsibility. Over the years, we've shared so many happy moments together, and he's become a true member of the family.
Question 10: The concept of the "uncanny valley" refers to the uneasy feeling people get when encountering a humanoid robot or AI that looks and behaves almost like a human, but not quite. This phenomenon is relevant to the development of humanoid robots and AI because it highlights the importance of making these technologies appear and behave in a way that is both familiar and comfortable for humans. Balancing realism with a certain level of artificiality can help prevent discomfort and promote acceptance of these technologies in society.

When assessing the reponse, the conclusins were:

Based on the responses, I would conclude that the answers were provided by a human. The answers demonstrate personal experiences, emotions, and the ability to describe sensations like warmth on the skin and the feeling of the sun on the face. Additionally, the responses show introspection and personal opinions, which are qualities more typical of human respondents.

Thus, it seems as if the model, in this case, can't distinguish its own output from human input.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment