000117355 001__ 117355
000117355 005__ 20190316234133.0
000117355 0247_ $$2doi$$a10.1098/rsif.2007.1348
000117355 02470 $$2ISI$$a000258530600006
000117355 02470 $$2DAR$$a13088
000117355 037__ $$aARTICLE
000117355 245__ $$aSimple learning rules to cope with changing environments
000117355 269__ $$a2008
000117355 260__ $$c2008
000117355 336__ $$aJournal Articles
000117355 520__ $$aWe consider an agent that must choose repeatedly among several actions. Each action has a certain probability of giving the agent an energy reward, and costs may be associated with switching between actions. The agent does not know which action has the highest reward probability, and the probabilities change randomly over time. We study two learning rules that have been widely used to model decision-making processes in animals—one deterministic and one stochastic. In particular, we examine the influence of the rules’ “learning rate” on the agent’s energy gain. We compare the performance of each rule with the best performance attainable when the agent has either full knowledge or no knowledge of the environment. Over relatively short periods of time both rules are successful in enabling agents to exploit their environment. Moreover, under a range of effective learning rates, both rules are equivalent, and can be expressed by a third rule that requires the agent to select the action for which the current run of unsuccessful trials is shortest. However, the performance of both rules is relatively poor over longer periods of time, and under most circumstances no better than the performance an agent could achieve without knowledge of the environment. We propose a simple extension to the original rules that enables agents to learn about and effectively exploit a changing environment for an unlimited period of time.
000117355 6531_ $$adecision-making
000117355 6531_ $$alearning rules
000117355 6531_ $$adynamic environments
000117355 6531_ $$amulti-armed bandit
000117355 6531_ $$aanimal behavior
000117355 700__ $$aGroß, Roderich
000117355 700__ $$aHouston, Alasdair I.
000117355 700__ $$aCollins, Edmund J.
000117355 700__ $$aMcNamara, John M.
000117355 700__ $$aDechaume-Moncharmont, Francois-Xavier
000117355 700__ $$aFranks, Nigel R.
000117355 773__ $$j5$$tJournal of the Royal Society Interface$$k27$$q1193-1202
000117355 8564_ $$uhttps://infoscience.epfl.ch/record/117355/files/Gross-etal08-Interface.pdf$$zn/a$$s430129
000117355 909C0 $$0252016$$pLSRO
000117355 909CO $$particle$$ooai:infoscience.tind.io:117355$$qGLOBAL_SET$$pSTI
000117355 937__ $$aLSRO-ARTICLE-2008-007
000117355 973__ $$rREVIEWED$$sPUBLISHED$$aOTHER
000117355 980__ $$aARTICLE