Decision-making has gathered immense interest in fields like psychology, neuroscience, robotics and even economics, with numerous models and software simulating the human mind. However, most such models treat each decision step in isolation, without taking into account the preceding decisions leading up to it, even though everyday decisions often depend on what came before. Publishing in PLoS One, scientists from EPFL and the University of Berne have validated a model that can simulate this history-dependent type of decision-making and learning with surprising accuracy.
Decision-making comes in two major types: Markovian and non-Markovian, named after the mathematician Andrey Markov (1856-1922). Simply put, in Markovian decision-making, the next decision step depends entirely on the current state of affairs. For example, when playing backgammon, the next move depends only on the current layout of the board, and not on how it got to be like that. This relatively straightforward process has been extensively modeled in computers and machines.
Non-Markovian decision-making is more complex. Here, the next step is affected by other factors, such as external constraints and previous decisions. For example, a person's goal might be to travel on the train. But what happens when he arrives at the door to the train depends on whether or not he has previously visited the ticket booth to buy a ticket. In other words, the next step depends on how he got there; without a ticket, he cannot proceed to the desired goal. In neuroscience, the "buy-ticket" step is referred to as a "switch-state".
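The train example can be sketched in a few lines of Python. This is only an illustration of the Markovian/non-Markovian distinction, not the researchers' model; the function names, location labels and the "board" action are all hypothetical.

```python
def step_markovian(location, action):
    """Markovian: the outcome depends only on the current location and action."""
    if location == "platform" and action == "board":
        return "on_train"
    return location


def step_non_markovian(location, action, history):
    """Non-Markovian: the same location and action can lead to different
    outcomes, depending on whether the switch-state was visited earlier."""
    if location == "platform" and action == "board":
        # Assumption: "ticket_booth" plays the role of the switch-state.
        return "on_train" if "ticket_booth" in history else "platform"
    return location


def augment_state(location, history):
    """Folding the relevant piece of history into the state makes the
    problem Markovian again: (location, has_ticket)."""
    return (location, "ticket_booth" in history)
```

Calling `step_non_markovian("platform", "board", ["entrance"])` leaves the traveller on the platform, while the same call with `"ticket_booth"` in the history puts him on the train. A learner that only sees the current location cannot tell these two situations apart.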
A team led by Michael Herzog at EPFL and Walter Senn at the University of Berne developed the first biologically plausible model that can handle non-Markovian decision-making. The model, developed in a previous study, has now been validated with two distinct tests, designed by Aaron Michael Clarke and Elisa Tartaglia in Michael Herzog's lab. The tests were taken by human subjects and by three computer models with different degrees of learning ability. In addition, the tests were taken by an advanced brain model called a "spiking neuron network", which makes decisions based on whether the majority of neurons in a population fired a signal, or "spike", and which simulates human performance in a very realistic manner.
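The majority rule described above can be illustrated with a toy example. This is a minimal sketch, assuming each neuron fires when a made-up "potential" value reaches a threshold; it is not the spiking neuron network used in the study, and the names `fired` and `population_decision` are hypothetical.

```python
def fired(potentials, threshold=1.0):
    """Assumed toy neuron: it emits a spike when its potential reaches the threshold."""
    return [v >= threshold for v in potentials]


def population_decision(spikes):
    """The population chooses the action only if more than half of its neurons spiked."""
    return sum(spikes) > len(spikes) / 2
```

For instance, `population_decision(fired([1.2, 0.4, 1.5]))` is `True` because two of the three neurons spiked, whereas a population in which only one of four neurons fires does not choose the action.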
The first experiment tested the impact of the switch-state on people's decision-making and learning. Users played a computer game where they had to navigate through eight icons (a gun, a car etc.) to finally reach the end goal (called "Yeah!"). Each icon came with three buttons, each leading down a different route, and the user had to decide which one to take. Although there was a relatively short route from the first icon to the goal, it was impossible to go through it unless the user first went through a switch-state icon - an image of a computer. Users repeated the experiment multiple times, becoming increasingly better at deciding which routes to pick. For example, most people took over 80 clicks to get to the goal when they began, but after 40 games, they needed fewer than ten.
The second experiment tested how delayed feedback affects decision-making and learning. Here, users were shown a set of experimental images and told that each image belonged to either category one or category two. Each category corresponded to either the left or the right arrow on the keyboard, but the participants were not told beforehand which arrow went with which image. Next, the users were shown the images one at a time, and had to press either the left or the right arrow depending on the category of each image. In response, the screen produced a RIGHT or WRONG feedback message. As the test went on, the feedback became delayed, to the point where the feedback for one image would arrive only after the next image had already appeared.
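The effect of such a delay on the order of events can be sketched as follows. This is a simplified illustration, assuming the feedback arrives a fixed number of trials late; the function name `run_trials` and its parameters are hypothetical and do not come from the study.

```python
from collections import deque


def run_trials(responses, correct, delay):
    """Return the order of events when feedback arrives `delay` trials late."""
    pending = deque()  # entries: (trial at which feedback is due, trial, message)
    log = []
    for t, (response, answer) in enumerate(zip(responses, correct)):
        log.append(("respond", t))
        message = "RIGHT" if response == answer else "WRONG"
        pending.append((t + delay, t, message))
        # Show every feedback message that has become due by this trial.
        while pending and pending[0][0] <= t:
            _, trial, msg = pending.popleft()
            log.append(("feedback", trial, msg))
    # After the last image, show whatever feedback is still outstanding.
    while pending:
        _, trial, msg = pending.popleft()
        log.append(("feedback", trial, msg))
    return log
```

With `delay=0`, every response is followed immediately by its own feedback. With `delay=1`, the participant has already responded to the next image before learning whether the previous answer was right, so he has to remember which image the message refers to.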
The study drew three major conclusions. First, humans can perform just as well as current sophisticated computer models under non-Markovian conditions, such as the presence of a switch-state. This is a significant finding in our current efforts to model the human brain and develop artificial intelligence systems.
Second, delayed feedback significantly impairs human decision-making and learning, even though it does not impact the performance of computer models, which have perfect memory. In the second experiment, it took human participants ten times more attempts to correctly recall and assign arrows to icons. Feedback is a crucial element of decision-making and learning. We set a goal, make a decision about how to achieve it, act accordingly, and then find out whether or not our goal was met. In some cases, e.g. learning to ride a bike, feedback on every decision we make for balancing, pedaling, braking etc. is instant: either we stay up and going, or we fall down. But in many other cases, such as playing backgammon, feedback is significantly delayed; it can take a while to find out if each move has led us to victory or not.
Finally, the researchers found that the spiking neuron network model matches and describes human performance very well. This is a significant result, as non-Markovian decision-making has proven to be very challenging for computer models. "This is a proof-of-concept study", stated Michael Herzog. "But the study makes an important contribution toward understanding, and accurately modeling, the human brain - and even surpassing its abilities with artificial intelligence."
This study represents a collaboration of EPFL's Brain Mind Institute with the University of Berne.
Clarke A.M., Friedrich J., Tartaglia E.M., Marchesotti S., Senn W., Herzog M.H. are the authors of "Human and Machine Learning in Non-Markovian Decision Making". The study appears in PLoS One of 21 April 2015 - 10(4): e0123105. DOI:10.1371/journal.pone.0123105