PT - JOURNAL ARTICLE
AU - Shoeb Shaikh
AU - Rosa So
AU - Tafadzwa Sibindi
AU - Camilo Libedinsky
AU - Arindam Basu
TI - Towards Autonomous Intra-cortical Brain Machine Interfaces: Applying Bandit Algorithms for Online Reinforcement Learning
AID - 10.1101/2020.01.08.899641
DP - 2020 Jan 01
TA - bioRxiv
PG - 2020.01.08.899641
4099 - http://biorxiv.org/content/early/2020/01/09/2020.01.08.899641.short
4100 - http://biorxiv.org/content/early/2020/01/09/2020.01.08.899641.full
AB - This paper presents an application of Banditron, an online reinforcement learning (RL) algorithm, in a discrete-state intra-cortical Brain Machine Interface (iBMI) setting. We analyzed two datasets from non-human primates (NHPs), NHP A and NHP B, each performing a 4-option discrete control task over a total of 8 days. Results show average improvements of ≈ 15% and 6% in NHP A and 15% and 21% in NHP B over the state-of-the-art algorithms Hebbian Reinforcement Learning (HRL) and Attention Gated Reinforcement Learning (AGREL), respectively. Besides yielding superior decoding performance, Banditron is also the most computationally friendly, requiring two orders of magnitude fewer multiply-and-accumulate operations than HRL and AGREL. Furthermore, Banditron provides average improvements of at least 40% and 15% in NHPs A and B, respectively, over the commonly used supervised methods LDA and SVM across test days. These results pave the way towards an alternative paradigm of temporally robust, hardware-friendly, reinforcement-learning-based iBMIs.