PT - JOURNAL ARTICLE
AU - Shoichiro Yamaguchi
AU - Honda Naoki
AU - Muneki Ikeda
AU - Yuki Tsukada
AU - Shunji Nakano
AU - Ikue Mori
AU - Shin Ishii
TI - Identification of Animal Behavioral Strategy by Inverse Reinforcement Learning ∼ its Application to Thermotaxis In C. elegans ∼
AID - 10.1101/129007
DP - 2017 Jan 01
TA - bioRxiv
PG - 129007
4099 - http://biorxiv.org/content/early/2017/04/20/129007.short
4100 - http://biorxiv.org/content/early/2017/04/20/129007.full
AB - Animals are able to flexibly adapt to new environments by controlling different behavioral patterns. Identification of the strategy used for this control (behavioral strategy) is important for understanding animals’ decision making, but methods for quantifying such behavioral strategies have not been fully established. In this study, we propose a computational approach to identify an animal’s behavioral strategy from behavioral time-series data. To this end, we utilized inverse reinforcement learning (IRL) of a linearly-solvable Markov decision process (LMDP), with the assumption that animals behave optimally by minimizing costs, i.e., state cost and control cost. As a particular target, we focused on the thermotactic behaviors of C. elegans under a thermal gradient. After identifying the behavioral strategy dependent on thermosensory state, we found that it comprised a mixture of two strategies: directed migration (DM) and isothermal migration (IM). First, DM is a strategy by which the worms efficiently reach a specific temperature, which not only explains the observation that the worms migrate toward the cultivated temperature, but also clarifies how the worms control thermosensory state through the migration. Second, IM is a strategy by which the worms track along a constant temperature, which reflects the isothermal tracking well observed in previous studies. Furthermore, we applied our method to thermosensory neuron-deficient worms, which identified the neural basis of the DM strategy.
Therefore, we believe this novel approach can quantitatively visualize hidden strategies extracted from the behavioral time-series data.