Although everyone has an intuitive understanding of what AI means, the term is difficult to grasp in its full complexity.
One definition is given by Kaplan and Haenlein (2019), who characterize AI as “a system’s ability to correctly interpret external data, to learn from such data, and to use those learnings to achieve specific goals and tasks through flexible adaptation.”
This raises a question: is there an objective way to measure the capabilities of an intelligent machine?
Or, more fundamentally: is it even possible to draw a line between AI systems and standard computer systems?
One of the first attempts: the Turing Test
One of the first attempts to answer these questions was made by Alan Turing, a mathematician who worked as a cryptanalyst during World War II.
In 1950, he proposed an experiment, now known as the Turing Test, in which an interrogator chats with two other clients (see figure below).
Client A is controlled by a computer program, whereas client B is a human. The interrogator's task is to determine, solely from the clients' answers and chat behavior, which one is which.
If the interrogator cannot reliably tell them apart, the computer program passes the Turing Test and is then considered an AI.
Turing's assumption was that only a machine with human-like intelligence could fool a human in such a complex task.
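The word "reliably" in the pass criterion can be made concrete in many ways. The following is a minimal sketch of one of them, not part of Turing's original proposal: it assumes the test is repeated over many sessions and treats the machine as passing when the interrogator's number of correct identifications is not significantly better than coin flipping. The function names and the 5 percent threshold `alpha` are assumptions chosen for illustration.

```python
from math import comb


def p_value_above_chance(correct: int, trials: int) -> float:
    """Probability of getting at least `correct` identifications right
    out of `trials` sessions by pure guessing (exact binomial tail, p = 0.5)."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


def machine_passes(correct: int, trials: int, alpha: float = 0.05) -> bool:
    """Hypothetical pass rule: the machine passes if the interrogator's
    result could plausibly be explained by guessing. `alpha` is an assumed
    significance threshold, not something Turing specified."""
    return p_value_above_chance(correct, trials) > alpha


# Example: an interrogator who is right in 52 of 100 sessions is
# indistinguishable from guessing; one who is right in 70 of 100 is not.
print(machine_passes(52, 100))
print(machine_passes(70, 100))
```

Under this rule, a single fooled interrogator says very little; only the hit rate over many sessions does.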
Since its creation …
Since its creation, the test has become the best-known way of assessing intelligent behavior in machines.
Nevertheless, scientists question whether the test is sufficient to measure real intelligence and dispute the reliability of claims that certain programs have passed it.
A discussion like this came up when Google presented the Google Assistant in 2018. Recordings were shown in which the digital assistant made phone calls on behalf of its users, for example to schedule appointments at a salon or a restaurant.
Compared to other digital assistants like Siri or Alexa, the new Google Assistant was much better at interacting and at understanding subtle nuances in human speech. To everyone’s surprise, in the examples above the people answering the phone did not even recognize the Google Assistant as a “false conversationalist”.
This suggests a success for the machine under the conditions of the Turing Test.
So what is there to criticize?
Although the presentation by Google was very impressive, the outcome does not count as a result under the original Turing Test design. The person answering the phone takes the role of the interrogator, but is not aware of this role at the time of the call. Therefore, they pay no attention to signs that the caller might be a machine.
Receptionists are unlikely to expect a call from a computer program during their busy working hours. They can easily miss or ignore things such as robotic voice patterns, anomalies in the sound, or unusual pauses in the conversation.
In the future …
In the future, the Turing Test may still hold great potential for measuring general intelligence, especially in interaction. These days, however, the scientific community focuses more on objective ways to measure AI in very specific settings, such as computer vision tasks, voice recognition, and so forth.
Personally, I don’t consider the development of real, complex, human-like intelligent machines possible in the coming years. The current techniques are just too restricted to the area they were designed for. Alan Turing predicted that machines would be fooling interrogators in his test by around the year 2000, but time proved him wrong, and we are still waiting for this to happen.