Human behavior in contextual multi-armed bandit problems

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Abstract

In real-life decision environments, people learn from their direct experience with alternative courses of action. Yet they can accelerate their learning by using functional knowledge about the features that characterize the alternatives. We designed a novel contextual multi-armed bandit task in which decision makers chose repeatedly between multiple alternatives characterized by two informative features. We compared human behavior in this contextual task with behavior in a classic multi-armed bandit task without feature information. Behavioral analysis showed that participants in the contextual bandit task used the feature information to direct their exploration toward promising alternatives. Ex post, we tested participants' acquired functional knowledge in one-shot multi-feature choice trilemmas. We compared a novel function-learning-based reinforcement learning model to a classic reinforcement learning model. Although the classic reinforcement learning models predicted behavior better in the learning phase, the new function-learning-based models predicted the trilemma choices better.
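To make the contrast in the abstract concrete, the following is a minimal sketch and not the authors' task or models. It assumes a hypothetical linear payoff over two features, a softmax choice rule, and illustrative values for the learning rate, temperature, and payoff weights; none of these are given in the abstract. The classic learner keeps one value estimate per arm, while the feature-based learner fits weights over the two features and can therefore evaluate alternatives it has never chosen, as in a one-shot trilemma.

```python
import math
import random


def delta_update(value, reward, alpha):
    """Classic delta rule: move one arm's value estimate toward the reward."""
    return value + alpha * (reward - value)


def predict(weights, features):
    """Linear function learner: predicted reward is a weighted sum of features."""
    return sum(w * x for w, x in zip(weights, features))


def lms_update(weights, features, reward, alpha):
    """One least-mean-squares step on the feature weights."""
    error = reward - predict(weights, features)
    return [w + alpha * error * x for w, x in zip(weights, features)]


def softmax_choice(values, temperature, rng):
    """Sample an index with probability proportional to exp(value / temperature)."""
    top = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp((v - top) / temperature) for v in values]
    threshold = rng.random() * sum(exps)
    cumulative = 0.0
    for index, e in enumerate(exps):
        cumulative += e
        if threshold <= cumulative:
            return index
    return len(values) - 1


def simulate(use_features, n_trials=200, n_arms=10, alpha=0.1, seed=0):
    """Run one simulated learner; all parameter values are illustrative assumptions."""
    rng = random.Random(seed)
    arms = [(rng.random(), rng.random()) for _ in range(n_arms)]
    true_weights = (2.0, 1.0)  # hypothetical payoff function, not from the paper
    q = [0.0] * n_arms         # per-arm values (classic learner)
    w = [0.0, 0.0]             # feature weights (function-learning learner)
    total = 0.0
    for _ in range(n_trials):
        values = [predict(w, a) for a in arms] if use_features else q
        choice = softmax_choice(values, 0.2, rng)
        reward = predict(true_weights, arms[choice]) + rng.gauss(0.0, 0.1)
        if use_features:
            w = lms_update(w, arms[choice], reward, alpha)
        else:
            q[choice] = delta_update(q[choice], reward, alpha)
        total += reward
    return total / n_trials, q, w


if __name__ == "__main__":
    for use_features in (False, True):
        mean_reward, q, w = simulate(use_features)
        print("feature-based" if use_features else "classic", round(mean_reward, 3))
    # One-shot trilemma over three unseen alternatives: only the feature-based
    # learner has a prediction for options it never experienced.
    _, _, w = simulate(True)
    trilemma = [(0.9, 0.1), (0.2, 0.8), (0.5, 0.5)]
    print("trilemma pick:", max(range(3), key=lambda i: predict(w, trilemma[i])))
```

In this sketch the per-arm learner has no basis for ranking the trilemma options, which is one way to read why a function-learning account could do better on such choices even if it fits the learning phase less well.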

Original language: English
Title of host publication: Proceedings of the 37th Annual Meeting of the Cognitive Science Society, CogSci 2015
Editors: David C. Noelle, Rick Dale, Anne Warlaumont, Jeff Yoshimi, Teenie Matlock, Carolyn Jennings, Paul P. Maglio
Number of pages: 6
Volume: 1
Publisher: Cognitive Science Society
Publication date: 2015
Pages: 2290-2295
ISBN (Print): 9781510809550
ISBN (Electronic): 978-0-9911967-2-2
Publication status: Published - 2015
Externally published: Yes
Event: 37th Annual Meeting of the Cognitive Science Society: Mind, Technology, and Society - Pasadena, CA, United States
Duration: 22 Jul 2015 to 25 Jul 2015
Conference number: 37

Conference

Conference: 37th Annual Meeting of the Cognitive Science Society
Number: 37
Country/Territory: United States
City: Pasadena, CA
Period: 22/07/2015 to 25/07/2015

Keywords

  • contextual multi-armed bandits
  • decision making
  • exploration-exploitation trade-off
  • function learning
  • reinforcement learning
