Isotropic sequence order learning using a novel linear algorithm in a closed loop behavioural system

2002 | journal article

Cite this publication

Isotropic sequence order learning using a novel linear algorithm in a closed loop behavioural system
Porr, B. & Wörgötter, P. (2002)
BioSystems 67(1-3), pp. 195-202. DOI: https://doi.org/10.1016/s0303-2647(02)00077-1

Documents & Media

License

GRO License

Details

Authors
Porr, Bernd; Wörgötter, P. 
Abstract
In this article, we present an isotropic algorithm for sequence order learning. Its central goal is to learn the causal relation between two (or more) inputs so that, after successful learning, the system reacts to the earliest incoming signal (as in typical classical conditioning situations). We implement this algorithm in a behaving system (a robot), thereby creating a closed-loop situation in which the learner's actions influence its own sensor inputs, with the aim of creating an autonomous agent. Autonomous behaviour implies that learning goals are internally defined within the organism's capabilities. Standard models for sequence learning (e.g. temporal difference (TD) learning) need an externally defined reward. This, however, conflicts with the requirement of an implicitly defined internal goal in autonomous behaviour. Therefore, in this study we present a system in which the external reward is replaced by a reflex loop. This loop explicitly includes the environment. Every reflex loop has an inherent disadvantage: its reactions occur only after a reflex-eliciting sensor event and are thus ‘too late’. However, a reflex can serve as the internal reference for sequence order learning, whose task is to eliminate this disadvantage by creating earlier, anticipatory actions. In our system, learning is achieved by modifying the synaptic weights of a linear neuron with a correlation-based learning rule that involves the derivative of the neuron's output. All input lines are entirely isotropic. The synaptic weight change curve of this rule is strongly related to the temporal Hebb learning rule, which was found in spike timing experiments. We find that after learning, the reflex loop is replaced in functional terms by an earlier anticipatory action (and pathway). In addition, we observed that the synaptic weights stabilise as soon as the reflex remains silent.
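The learning rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract only states that a linear neuron's weights change through a correlation between each input and the derivative of the neuron's output, and that the same rule applies to all input lines. Everything else here is an assumption made for illustration: the damped-sine input filter, its parameters `A` and `B`, the learning rate `MU`, the pulse timing, and the helper names `filter_pulse`, `run_trial` and `run_trials`.

```python
import math

# Illustrative assumptions, not values from the article.
A, B = 0.05, 0.035      # damping and oscillation rate of the input filter
MU = 0.001              # learning rate
KERNEL = [math.exp(-A * t) * math.sin(B * t) / B for t in range(400)]


def filter_pulse(pulse_time, steps):
    """Response of the (assumed) band-pass filter to a single unit pulse."""
    out = [0.0] * steps
    if pulse_time is None:          # this input stays silent for the trial
        return out
    for k, h in enumerate(KERNEL):
        if pulse_time + k < steps:
            out[pulse_time + k] = h
    return out


def run_trial(weights, reflex_active, delay=20, steps=500):
    """One trial: an early pulse at t=50, the reflex pulse `delay` steps later."""
    u = [
        filter_pulse(50 + delay if reflex_active else None, steps),  # reflex input
        filter_pulse(50, steps),                                      # earlier input
    ]
    w = list(weights)
    v_prev = 0.0
    for t in range(steps):
        v = sum(wj * uj[t] for wj, uj in zip(w, u))   # linear neuron output
        dv = v - v_prev                                # discrete output derivative
        # The same update is applied to every input line (isotropy).
        w = [wj + MU * uj[t] * dv for wj, uj in zip(w, u)]
        v_prev = v
    return w


def run_trials(weights, reflex_active, n):
    """Repeat `n` trials, carrying the weights over from one trial to the next."""
    for _ in range(n):
        weights = run_trial(weights, reflex_active)
    return weights


start = [1.0, 0.0]   # reflex pathway pre-wired, earlier pathway initially unused
learned = run_trials(start, reflex_active=True, n=10)
settled = run_trials(learned, reflex_active=False, n=10)
print("after learning:", learned)
print("after reflex is silent:", settled)
```

In this sketch the weight of the earlier input grows while the reflex input still fires, and it changes only marginally once the reflex input is silent. The small remaining change comes from the discrete-time approximation of the derivative.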
Issue Date
2002
Journal
BioSystems 
ISSN
0303-2647
Language
English
