Alt Text: an image of Agent Smith from The Matrix with the following text superimposed, “1999 was described as being the peak of human civilization in ‘The Matrix’ and I laughed because that obviously wouldn’t age well and then the next 25 years happened and I realized that yeah maybe the machines had a point.”
I was surprised how poorly they still did as chatbots vs ELIZA after 50 years of potential progress, and how revered they are in certain contexts.
https://www.researchgate.net/publication/375117569_Does_GPT-4_Pass_the_Turing_Test
The best-performing GPT-4 prompt passed in 49.7% of games, outperforming ELIZA (22%) and GPT-3.5 (20%), but falling short of the baseline set by human participants (66%).
Given that the human baseline is 66%, the GPT-4 results are fairly impressive.