Abstract
In case of deafness, electrical cochlear implants (eCIs) bypass dysfunctional or lost hair cells by direct electrical stimulation of the auditory nerve. However, the spectral selectivity of eCI sound coding is low, as the wide current spread from each electrode activates large sets of neurons that align to a place-frequency (tonotopic) map in the cochlea. As light can be better confined in space, optical cochlear implants (oCIs) promise to overcome this shortcoming of eCIs. This requires fine-grained, fast, and power-efficient real-time sound analysis and control of multiple microscale emitters. Here, we describe the development, characterisation, and application for hearing restoration of a preclinical low-weight and wireless LED-based multichannel oCI system and its companion eCI system. The head-worn oCI system enabled deaf rats to perform a locomotion task in response to acoustic stimulation, providing proof of concept of multichannel optogenetic hearing restoration in rodents.
Introduction
According to the World Health Organization, in 2018 there were 466 million people in the world with hearing loss1. This number is forecast to reach 630 million in 2030 and to grow further to 900 million in 2050. If profound, hearing loss makes these people potential candidates for hearing rehabilitation by CIs. Currently there are approximately 700,000 eCI users worldwide, most of whom achieve good open-set speech understanding in quiet. CIs are associated with low risk for implantation and device failure2–4. CI systems are composed of an external speech processor and an implanted stimulator. They convert sound into electrical currents delivered from an intracochlear electrode array (eCI) to stimulate the spiral ganglion neurons (SGNs), which are ordered according to their characteristic frequency along the tonotopic axis (place-frequency map) that follows the spiral anatomy of the cochlea. Real-time sound processing involves decomposition into frequency bands and extraction of the intensity within each band. These intensities are then used to scale the amplitude of electrical pulses delivered to the electrode at the tonotopic position corresponding to the respective frequency band.
However, due to the wide spread of the electrical current from each of the 12–24 eCI contacts (depending on manufacturer5), signals containing information of a given frequency band activate a large fraction of the tonotopically ordered SGNs. This results in limited spectral resolution of sound coding with typically fewer than ten perceptually independent stimulation channels6–8. Efforts to improve the performance of the electrical CI include current steering using multipolar stimulation9 as well as intraneural stimulation10, but the potential for reducing the spread of electrical excitation seems rather limited. Poor spectral resolution is commonly considered the bottleneck of the eCI that makes more complex listening tasks like communication in noisy or reverberant environments difficult and limits music appreciation11–13. Light offers an alternative mode of SGN stimulation with the potential to overcome this bottleneck (reviewed in refs. 14–17). Considering the ability to confine light in space, future optical cochlear implants (oCIs) could activate smaller fractions of SGNs and, hence, enable a higher number of perceptually independent stimulation channels. Two approaches toward optical SGN stimulation have been employed: i) infrared direct neural stimulation (INS)18 and ii) optogenetics19. While the INS concept has remained controversial for the cochlea20–23, optogenetics offers a defined molecular mechanism, restores auditory function in various animal models of deafness, and has been successfully implemented in preclinical animal studies by several laboratories (e.g. refs. 24–26). The spectral selectivity of optogenetic SGN stimulation has been shown to be greater than that of electrical stimulation19,27, and near-physiological SGN firing rates can be achieved with fast opsins such as Chronos and f-Chrimson28,29.
In parallel, major advances have been achieved towards the technological implementation of the oCI. Since the proof-of-concept study on flexible oCIs based on microscale gallium nitride (GaN) light emitting diodes (μLEDs)30, their optimization (light extraction and focusing) and technical characterization have progressed31,32. In addition, studies with larger emitters33,34 and waveguides35 have been undertaken. However, to the best of our knowledge, implementation and characterisation of a full oCI system has not yet been presented, except for a brief proof-of-principle demonstration36. Such an oCI system should employ real-time sound processing and coding strategies that use more stimulation channels than current eCIs. Here, we report the development and functional demonstration of a low-weight, wireless, battery-powered oCI sound processor and driver circuitry to be head-mounted for experiments on freely moving animals. We demonstrate the function of the sound processor and driver of the oCI and of its sister eCI. In summary, this preclinical oCI system will help pave the way for developing the future clinical oCI for improved hearing restoration in human deafness.
Results
Requirements for the design of the oCI sound processor and driver circuitry
The primary aim of the study was to develop and characterise a miniaturised oCI system consisting of a sound processor, a driver circuitry, and an emitter array. For comparison, we also implemented a preclinical eCI system with an architecture as similar as possible. These oCI and eCI systems capture and process sound in real time to drive emitters or electrodes for neural stimulation (Figure 1). While the clinical eCI system consists of an external sound processor and an implanted stimulator composed of driver circuitry and electrode array, the miniaturised preclinical oCI and eCI systems combine sound processor and driver circuitry in one head-worn assembly. Moreover, the preclinical oCI and eCI systems require a broader bandwidth to accommodate animal vocalizations containing high frequency components that exceed the upper limit of clinical eCIs (≤ 10 kHz). The oCI system employed previously described microfabricated LED arrays33, while the eCI system used clinical-style rodent electrodes (see Methods). Appropriate benchmarking of the oCI system calls for maximum design similarity, i.e. use of the same input stage and digital signal controller (DSC), accommodating differences where appropriate, primarily at the stages of coding strategy and output for driving the eCI (charge balanced biphasic current pulses) or the oCI (monophasic driving current for light pulses). The presented implementation of the oCI can drive forty-nine stimulation channels, of which we used up to eight implanted LEDs in the in vivo experiments on rats. The eCI was designed to operate up to sixteen stimulation channels, of which we employed two.
Size, weight, and portability compatible with head-mounting were among the most important design requirements. The size of the round multilayer rigid printed circuit boards (PCBs) carrying commercial off-the-shelf electronic components on both sides was limited to 20 mm in diameter. The stack of two PCBs integrating all components including batteries should not exceed 20 mm in height. The whole assembly should be housed in a light but robust enclosure consisting of a base mounted to the animal’s skull by dental acrylic and metal anchors, as well as a screwable cap for convenient battery replacement. The weight of the complete sound processor including enclosure and batteries was limited to below 15 g, which is around 33% of the animal’s head weight (40-45 g). Minimizing the energy consumption of the sound processor was another design criterion with particular relevance for the choice of the DSC and the implementation of wireless communication.
Choice of the digital signal controller
Prior to starting development of the sound processor, we critically reviewed state-of-the-art DSCs to see which would qualify for the envisioned tasks. The initial benchmarking of DSCs was performed in 2016 and the shortlist included PIC32MZ2048EFH and PIC32MX795F512L (Microchip), KW40Z based on ARM Cortex M0 (Freescale), SAM4L based on ARM Cortex M4 (Atmel), and nRF52832 based on the ARM Cortex M4 (Nordic Semiconductor). A comparison of their key properties is provided in Table 1.
We compared and benchmarked the shortlisted systems in various ways. Since the most demanding of the envisioned tasks was the real-time audio processing, we started by benchmarking spectral decomposition methods on all platforms. For that, we implemented spectral decomposition via fast Fourier transform (FFT) and infinite impulse response (IIR) based filter banks in several ways including those optimized for the specific microarchitecture (like the CMSIS DSP library for the ARM-based systems37). First, we determined with all filter bank variants the filtering duration of 1 second of audio sampled at 48 kilosamples per second into 10, 20, and 32 quasi-logarithmically spaced bands followed by an envelope extraction and quantisation in time. This calculation yields a matrix of spectrotemporal data, which can serve as a basis for an n-of-m-like coding strategy (see Sound coding strategy).
After choosing the ideal implementation for each individual system (as judged by the shortest calculation times for FFT and IIR), we determined the clock speeds required for just real-time spectral decomposition as a function of the number of frequency channels. Finally, based on measurements of electrical current and information from the specifications (combined with data extrapolation where the clock speed was not adjustable as required), we estimated the required power as shown in Figure 2. Since CI systems require real-time audio processing over long periods of time, we deem the power efficiency of spectral decomposition to be very important. In this respect, the nRF52832 outperformed the other processors for more than 20 frequency channels (Figure 2).
For the DSCs without a built-in radio transceiver, we considered various small commercial off-the-shelf Bluetooth Low Energy (BLE) modules to be connected via the Serial Peripheral Interface (SPI) and a few general purpose I/O (GPIO) pins for interrupt signalling. The extra space required by those modules is one of the downsides of the radioless systems on a chip. First tests with the BLE protocol stack revealed that a considerable amount of working memory must be dedicated to the wireless functionality. Moreover, latencies can become relatively long when operating via BLE, which must be considered in the design of wireless control (for details see Wireless control and communication protocol).
In a final step, we tested the response speed and quality of the product support, reviewed the clarity and comprehensibility of documentation, as well as the availability of examples and public forums. Eventually, we selected the nRF52832 (Nordic Semiconductor) as implemented in the BL652 module (Laird Connectivity), which back then was the smallest form factor available on the market that implemented the full reference circuitry as specified by Nordic Semiconductor. While this system on a chip lacks a digital-to-analogue converter (DAC) and the possibility to adjust clock speed, it provides a range of useful interfaces, an on-chip 2.4 GHz radio transceiver, and supports high computing power and efficiency. Last but not least, we found the documentation and support favourable.
Sound processor and driver circuitry overview and connectivity
Before detailed description of the individual components, we provide an overview of the processor and driver circuitry for the oCI and eCI as well as their wired and wireless connectivity, as shown in Figure 3. The DSC connects to a PC wirelessly over its 2.4 GHz radio transceiver using a protocol based on Enhanced ShockBurst (ESB), or over a 2-wire universal asynchronous receiver-transmitter (UART) port. Further inputs to the DSC are the PDM microphone, an operational amplifier for electrophysiological recordings via its analogue-to-digital converter (ADC), and a trigger input via a general-purpose input/output pin using tasks and events (GPIOTE). The battery board (not illustrated in Figure 3) is also connected internally to the ADC to be able to monitor battery voltage. For output, the driver circuitry of either oCI or eCI is connected via two SPI instances.
In the oCI system, the DSC controls a 16-channel digitally adjustable current sink (TLC5923, Texas Instruments Inc.) and an 8-channel analogue switch (ADG1414, Analog Devices) via its SPI0 and SPI1 interfaces, respectively. While the analogue switch can provide voltage selectively to the common anodes of LED blocks, the current sink can adjust the exact amount of current (hence light emission) of each LED within a block. It takes around 15 μs for a DSC-internal command to set up the complete LED array (independent of the topology of the LED array) with the requested currents. In the eCI system, the DSC’s SPI ports control a 16-channel multiplexer (MAX14661, Maxim Integrated) and a 12-bit DAC (AD5683, Analog Devices) driving a current source (AD8643, Analog Devices) to select the electrode contacts to be actuated and to supply the charge balanced biphasic stimuli, respectively. The current source voltage can be monitored by the DSC via an ADC channel. The oCI and eCI driver outputs are accessible via the pin connectors CLE-110-01-G-DV (Samtec) and CLE-107-01-G-DV (Samtec), respectively, which were interfaced with the stimulation probes (for details see Methods).
Audio input
For the audio input to the oCI/eCI system, we considered different microphones providing sufficient sensitivity to sound frequencies at least up to 25 kHz in order to cover a substantial part of the audible spectrum of rats (1–70 kHz)38,39 and marmoset monkeys (0.1–35 kHz)40,41. In addition, we evaluated different concepts of interfacing including analogue, inter-IC-sound (I2S), and pulse density modulation (PDM). Given the constraints in PCB space and energy budget, we finally decided on a microelectromechanical system (MEMS) microphone (SPH0641LM4H-1, Knowles), which can be interfaced to the DSC of choice (see Choice of the digital signal controller) via PDM and provides sufficient sensitivity in the high frequency range (Supplementary Figure 2). In addition, the selected microphone is energy efficient (~0.8 mA current consumption during use at 3.3 V) and has both a wide dynamic range and a relatively flat transfer characteristic (Supplementary Figure 2).
Software framework
For evaluating the preclinical oCI system in comparison to the eCI, the software framework provides several functionalities (Figure 4) that encompass a subset of the features of clinical CIs. Real-time sound processing (used for behavioural assessment) and triggered stimulation protocols (used for physiological and behavioural assessment) comprise the two main modes of operation. The software framework switches dynamically between operating modes and manages the related memory allocations and deallocations as well as the starting and stopping of peripherals. This concept was introduced for three reasons. First, the limited random-access memory (RAM) does not allow the DSC to cater to all functionalities at the same time. Second, turning off peripherals not relevant to a given task saves energy, hence extending battery life. Third, task-specific grouping of functions contributes to safe operation and robustness against unintended use.
In addition, the software framework encompasses standard functions such as booting the device, initialising peripherals, and saving settings and logs to the non-volatile memory (NVM). It also provides managed access to 4 kB of the NVM, where initial parameters (among which is a unique identifier) are burned-in before first operation of the DSC. These parameters can be changed and overwritten later during operation in “Setup and debug” mode (see) via specific commands over wired or wireless communication channels. This NVM is also shared with a simple embedded logger that can store critical logs for later read-out.
The software operates on an ARM Cortex M4f central processing unit (CPU), a reduced instruction set computer (RISC) at the heart of the nRF52832 DSC, featuring a three-stage instruction pipeline, Thumb-2, digital signal processing (DSP), and single-precision floating point instructions at excellent energy efficiency42. Thanks to the DSP library of the Cortex Microcontroller Software Interface Standard (CMSIS), a suite of common signal processing functions is available as highly optimised code for the platform43. While a number of operating systems support this DSC, the required precision in stimulus timing led us to retain full control of timing by using bare-metal programming. This requires the software framework itself to schedule and monitor all DSC functions.
Audio processing and spectral decomposition
Whenever the signal processor is in real-time sound encoding mode, it transfers audio samples from the PDM microphone to the RAM, spectrally decomposes the signal, and transforms it into stimulation patterns for the activation of the oCI emitters or the eCI electrodes at the tonotopic positions corresponding to the frequencies contained in the sound signal. While sound capturing (once appropriately configured and started) operates by direct memory access and double buffering without substantial involvement of the CPU, the further processing of the audio samples places major demands on the computational power. For this reason, we carefully evaluated spectral decomposition via FFT- vs. IIR-based filter bank operations.
Because the optical or electrical stimulation channels are placed equidistantly on the cochlear implant, the employed filter bank should ideally have a quasi-logarithmic frequency resolution for the stimulation to match the tonotopic organisation of the cochlea. This is easily set up when using an IIR-based filter bank: with IIR filters, the required computational power scales almost linearly with the number of required frequency channels. When using FFT, the increase of computational power as a function of frequency channels is less obvious: the real FFT implementation of the DSP library is very efficient, but it implies various overheads in our case. First, every block of input data needs to be multiplied by the values of a window function. Then, real-valued magnitudes need to be calculated from the complex spectral values, requiring not only multiplications and additions, but also the computationally more expensive square root calculation. Next, the resulting magnitudes, corresponding to the linearly spaced real FFT frequency bins, need to be combined to form quasi log-spaced channel frequencies, which requires further multiplications and additions. Finally, processing encompasses 50 % overlapping of the temporal analysis windows so as not to miss transients, which doubles the required calculations. For these reasons, we chose an IIR-based filter bank for spectral decomposition. More specifically, we used the single precision floating point implementation of a biquadratic second order IIR (biquad) band-pass filter with quality factor Q = 12 as the basic building block of our filter bank. We also evaluated the fixed-point implementation but found only a trend toward better performance at the same current consumption, at the cost of a reduced dynamic range and noise artefacts due to rounding errors.
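The filter bank described above can be illustrated with a minimal Python sketch. It is not the firmware implementation: the coefficient formulas are the widely used audio-EQ-cookbook band-pass form (an assumption, since the text does not state which biquad design was used), and the band limits of 1–16 kHz, the eight channels, and all function names are hypothetical. Only the sampling rate of 48 kilosamples per second, the quasi-logarithmic spacing, and Q = 12 are taken from the text.

```python
import math

def bandpass_biquad(fc, fs, q):
    """Second-order band-pass coefficients with 0 dB peak gain (cookbook form, assumed)."""
    w0 = 2.0 * math.pi * fc / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)                    # feed-forward taps
    a = (-2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)    # feedback taps a1, a2
    return b, a

def log_spaced_centres(f_lo, f_hi, m):
    """m centre frequencies spaced evenly on a logarithmic axis."""
    ratio = (f_hi / f_lo) ** (1.0 / (m - 1))
    return [f_lo * ratio ** i for i in range(m)]

def band_energies(samples, centres, fs, q=12.0):
    """Mean squared output of each band-pass filter for one block of samples."""
    energies = []
    for fc in centres:
        (b0, b1, b2), (a1, a2) = bandpass_biquad(fc, fs, q)
        x1 = x2 = y1 = y2 = 0.0
        acc = 0.0
        for x in samples:
            # Direct form I difference equation of the biquad.
            y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
            x2, x1 = x1, x
            y2, y1 = y1, y
            acc += y * y
        energies.append(acc / len(samples))
    return energies

fs = 48000.0                                       # sampling rate from the text
centres = log_spaced_centres(1000.0, 16000.0, 8)   # hypothetical band limits
# Test signal: a pure tone at the centre frequency of channel 3.
tone = [math.sin(2.0 * math.pi * centres[3] * n / fs) for n in range(4800)]
energies = band_energies(tone, centres, fs)
strongest = max(range(len(energies)), key=energies.__getitem__)
```

Each channel costs one biquad update per sample, which is why the computational load grows almost linearly with the number of channels, in line with the argument made for the IIR approach.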
Sound coding strategy
Optical sound encoding can take advantage of near-physiological spectral selectivity27. Following the Nyquist sampling theorem, the number of independent optical stimulation channels should be at least twice the number of critical bands the cochlea holds. This would equate to at least 48 channels for the human cochlea44 and ~8–12 for the rat45. At the same time, the most efficient pulse duration for optogenetic stimulation, which depends on the kinetics of the opsin of choice (0.5–2 ms24,28,29), exceeds that of electrical stimulation (typically below 100 μs46). Moreover, the synchrony of optogenetically driven SGN firing24,28,29 is lower than that of electrical stimulation47–49 and closer to that of physiological sound stimulation24,28,29. This suggests choosing ~1 ms optical stimulation pulses and limiting the maximal stimulation rate to SGN-typical steady state firing rates of 300 Hz50–54. This lower rate of optical stimulation seems justified, as speech recognition with eCI sound coding does not improve much beyond stimulation rates of 500 Hz55. Moreover, the greater stochasticity of optogenetically induced firing suggests that high stimulation rates (~1 kHz), as employed in eCIs to invoke SGN refractoriness for avoiding non-natural SGN synchronicity46, will not be required in oCIs.
The coding strategy will also need to consider the design of the oCI. For “active” oCIs, i.e. active optoelectronic light emitters implanted into the cochlea17,56,57, two concepts have been put forward in order to cope with addressing dozens of emitters while not scaling up the number of feeding lines given the rigid space constraints of the cochlea. The matrix addressing concept30 relies on ultrafast activation of the channelrhodopsins (μs range) and integration of the resulting depolarizing current by SGNs for spike generation58. A second concept foresees an array of emitters on CMOS that connect by a limited number of lines for operation59.
Here, we implemented the continuous-interleaved sampling (CIS) strategy60 and the n-of-m strategy61, two state-of-the-art clinically used coding strategies for the sake of a first preclinical optical coding strategy as well as for operating the eCI. In the case of the n-of-m strategy, m denotes the total number of addressable stimulation sites of the implant, whereas n is the maximum number of stimulation sites to be used in a stimulation cycle. Typically, one stimulation cycle is based on the spectral analysis of one buffer of input audio samples and runs sequentially through the stimulation sites from the base to apex once while bypassing those considered least influential with usually n ≤ m/2 (ref. 61).
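The channel selection step of the n-of-m strategy can be sketched as follows. This is a minimal illustration rather than the firmware code: the function name, the use of envelope magnitude as the measure of influence, and the convention that channel 0 is the most basal site are assumptions. Setting n = m reproduces the CIS case, in which every site is stimulated in each cycle.

```python
def select_n_of_m(envelopes, n):
    """Return (channel, value) pairs for the n largest envelopes.

    The result is ordered by channel index, so that one stimulation cycle
    runs once through the selected sites (channel 0 assumed most basal).
    """
    m = len(envelopes)
    if not 0 < n <= m:
        raise ValueError("n must satisfy 0 < n <= m")
    # Rank channels by envelope magnitude, largest first.
    ranked = sorted(range(m), key=lambda i: envelopes[i], reverse=True)
    # Keep the n strongest and restore tonotopic order.
    kept = sorted(ranked[:n])
    return [(i, envelopes[i]) for i in kept]

# Example with m = 8 and n = 3 (hypothetical envelope values).
cycle = select_n_of_m([0.1, 0.7, 0.05, 0.4, 0.9, 0.2, 0.3, 0.0], n=3)
# CIS corresponds to n = m.
cis_cycle = select_n_of_m([0.1, 0.7, 0.05, 0.4], n=4)
```

In the example, channels 1, 3, and 4 carry the three largest envelopes and are therefore the only sites addressed in that cycle.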
In our implementation, real-time sound processing encompasses three concurrent software processes. The first maintains sample transfer from the PDM microphone to a set of buffers, the size of which depends on the required spectral update rate. Whenever a buffer has been filled with new audio samples, the second process performs the core calculations of the coding strategy and stores the order and corresponding magnitude of stimulations for a complete stimulation cycle in the stimulation buffer. The third process is launched periodically at the stimulation rate of the oCI/eCI, fetches stimulation parameters from the buffer, and orchestrates hardware peripherals to output the upcoming stimulus with high temporal precision.
The core calculations of the sound coding strategy process start by spectrally decomposing the signal into m bands using a bank of second order IIR filters (as detailed above in the section Audio processing and spectral decomposition) and extracting the envelopes of the band-filtered signals. Next, either all (CIS, where n = m) or just the n maximal (n-of-m) spectral values are taken and converted into dB relative to full scale (dB FS) values, denoted xi for channel i. Next, xi is mapped to the adequate physical stimulation range, yielding the stimulation amplitude yi for each channel i. The mapping is parameterised by the acoustical threshold and saturation levels (in dB FS), by the threshold and maximum comfortable levels in terms of stimulation intensity, and by L, a parameter that changes the non-linearity of the mapping between soft and loud signal parts and is used in human CI systems to tune the perceived loudness growth. While the optimal acoustical threshold and saturation levels mostly depend on the audio recording hardware (see Supplementary Information), all above parameters are influenced by the acoustic environment the sound processor is intended to be used in. The calculated yi values can be used directly to set the magnitude of the optical or electrical stimulation circuitry for each stimulation pulse. In case of yi = NaN, no stimulation will take place on the corresponding channel i in the current stimulation cycle.
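The role of the mapping parameters can be illustrated with a short sketch. The exact mapping equation is not reproduced here, so the power-law form, the clipping at the saturation level, and the names x_thr, x_sat, y_thr, and y_mcl are assumptions made purely for illustration. Only the qualitative behaviour follows the text: levels between acoustical threshold and saturation are mapped onto the range between the stimulation threshold and the maximum comfortable level, L shapes the non-linearity, and NaN marks a channel that is not stimulated (assumed here to occur below the acoustical threshold).

```python
import math

def map_level(x_db, x_thr, x_sat, y_thr, y_mcl, L=1.0):
    """Map an envelope level in dB FS to a stimulation amplitude (assumed form).

    x_thr, x_sat: acoustical threshold and saturation levels in dB FS.
    y_thr, y_mcl: threshold and maximum comfortable stimulation intensity.
    L:            exponent shaping loudness growth (L = 1 is linear in dB).
    """
    if x_db < x_thr:
        return math.nan                      # below threshold: no stimulation
    # Normalised position within the acoustic range, clipped at saturation.
    frac = (min(x_db, x_sat) - x_thr) / (x_sat - x_thr)
    return y_thr + (y_mcl - y_thr) * frac ** L

# Hypothetical acoustic range of -50 to -10 dB FS, mapped onto 20-80 % intensity.
quiet = map_level(-60.0, -50.0, -10.0, 20.0, 80.0)        # NaN, channel skipped
mid = map_level(-30.0, -50.0, -10.0, 20.0, 80.0)          # halfway: 50.0
loud = map_level(0.0, -50.0, -10.0, 20.0, 80.0)           # clipped: 80.0
shaped = map_level(-30.0, -50.0, -10.0, 20.0, 80.0, L=2.0)  # 35.0
```

With L > 1 in this assumed form, soft sounds are mapped closer to the threshold level, whereas L < 1 would raise them towards the comfortable level.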
Wireless control and communication protocol
While standardisation might be considered an advantage of BLE (e.g. offering remote control by any BLE-capable device), we anticipated substantial drawbacks. BLE requires larger overhead (in terms of code space, working memory that must be dedicated to its functionality, and other resources, e.g. timers) than a proprietary radio protocol. Furthermore, the additional protocol stack layers of BLE cause a latency in transmitting a single short message in the range of 10 ms (measured from enqueueing the message on the sender side until receiving the message contents on the receiver side, see also ref. 62). Since short trigger latency was one of the key design requirements, we chose to build upon Nordic Semiconductor’s proprietary ESB protocol. ESB avoids most of the BLE-typical overhead, thereby freeing up peripherals, code space, and working memory of the DSC, and also enhances wireless responsiveness. With ESB, the time between sending a wireless trigger command and the start of stimulation was approximately 350 μs. Disadvantages of ESB include the lack of more sophisticated 2.4 GHz wireless techniques (manual frequency channel selection is required) and the need for an ESB-capable communication interface on the PC side. For this reason, we implemented custom firmware running on a single-board development kit for 2.4 GHz proprietary applications (Nordic Semiconductor nRF52 DK) connected to the PC via USB (USB-to-ESB bridge). To increase the speed of data transmission, commands sent from a PC via the USB interface and arriving at the bridge are not interpreted by its firmware but forwarded directly to the oCI/eCI system. A TTL trigger input can be provided to the bridge’s GPIOTE and sent wirelessly as a triggering command to the oCI/eCI system or, in a wired configuration, directly to the GPIOTE of the oCI/eCI system. In both the wireless and the wired configuration, stimulation can also be triggered by a command sent from the PC.
For the radio operation of the USB-to-ESB bridge and the oCI/eCI system, different modes were implemented in each firmware: permanent (receiving or transmitting radio is always turned on, for applications where only minimal latencies are acceptable) and periodic (transmission is turned on at a given frequency, for applications with less stringent timing requirements but an emphasis on low power consumption). Preselecting the modes for the bridge and the oCI/eCI system allows for reliable communication and triggering of a predefined stimulation pattern.
The communication protocol (also used over the wired UART interface) uses fixed-length commands in the form of human-readable, 15-character ASCII messages. Owing to the low overhead, real-time interpretation of such messages and debugging are straightforward. All commands sent to the DSC of the oCI/eCI system (e.g. application of a new setting) can be acknowledged to confirm successful communication or to retrieve requested information (e.g. status of the DSC or its settings). The configuration is stored in the NVM and loaded at boot. The performance of the wireless communication of a PC via the USB-to-ESB bridge with a processor was assessed in the animal experiments (see In vivo experiments).
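The benefit of fixed-length messages can be illustrated from the PC side with a minimal sketch. Only the message length of 15 ASCII characters is taken from the text. The padding with trailing spaces, the function names, and the example command "SET RATE 300" are hypothetical, as the actual command set is not described here.

```python
FRAME_LEN = 15   # message length stated in the text
PAD = " "        # padding character: an assumption for this sketch

def encode_command(text):
    """Pad a human-readable command to a fixed-length ASCII frame."""
    if len(text) > FRAME_LEN:
        raise ValueError("command longer than one frame")
    if not all(32 <= ord(c) < 127 for c in text):
        raise ValueError("command must be printable ASCII")
    return text.ljust(FRAME_LEN, PAD).encode("ascii")

def decode_command(frame):
    """Check the frame length and strip the padding again."""
    if len(frame) != FRAME_LEN:
        raise ValueError("unexpected frame length")
    return frame.decode("ascii").rstrip(PAD)

frame = encode_command("SET RATE 300")   # hypothetical command
restored = decode_command(frame)
```

Because every frame has the same length, a receiver can read exactly 15 bytes per command without searching for delimiters, which keeps parsing simple on a bare-metal target.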
Optical stimulation circuitry
The oCI driver circuitry employs a 16-channel digitally adjustable current sink (TLC5923, Texas Instruments Inc.) and an 8-channel analogue switch (ADG1414, Analog Devices), each controlled by the DSC via a dedicated SPI (SPI0 and SPI1, respectively, see Figure 3). This design enables matrix addressing of the oCI arrays: the common anode of the respective LED block (up to a maximum of 8 blocks) is selectively activated by the analogue switch. The individual LED within a block is selected, and its light intensity adjusted, by the current sink, which sets the current level of each LED within a block (up to a maximum of 16 LEDs per block). Independent of the topology of the LED array, the DSC needs ~15 μs for internal commands to set up the LED array with the requested light intensities. The output of the oCI driver is accessible on the female pin header (CLE-110-01-G-DV, Samtec; see Methods).
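The matrix addressing scheme can be sketched as a mapping from a linear LED index to an anode block and a current sink channel. The limits of 8 blocks and 16 sink channels come from the text. The row-major layout, the one-bit-per-switch pattern for the analogue switch, the function names, and the 7 × 7 arrangement used in the example (chosen only because it yields forty-nine channels) are assumptions for illustration.

```python
N_BLOCKS = 8    # anode lines provided by the 8-channel analogue switch
N_SINKS = 16    # channels of the current sink

def led_address(index, leds_per_block):
    """Return (block, sink) for LED `index` in a row-major layout (assumed)."""
    if not 0 < leds_per_block <= N_SINKS:
        raise ValueError("a block holds between 1 and 16 LEDs")
    block, sink = divmod(index, leds_per_block)
    if index < 0 or block >= N_BLOCKS:
        raise ValueError("LED index outside the addressable matrix")
    return block, sink

def switch_mask(block):
    """One-hot 8-bit pattern that closes a single anode switch (assumed encoding)."""
    return 1 << block

# Hypothetical 7 x 7 arrangement giving 49 addressable LEDs.
last = led_address(48, leds_per_block=7)     # (6, 6)
eleventh = led_address(10, leds_per_block=7) # (1, 3)
mask = switch_mask(last[0])                  # 0b01000000
```

With this scheme, 8 + 16 control lines suffice for up to 128 emitters, which is the point of matrix addressing under the space constraints of the cochlea.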
Electrical stimulation circuitry
The eCI driver circuitry employs a dual 16-channel multiplexer (MAX14661, Maxim Integrated) and a 12-bit DAC (AD5683, Analog Devices) driving a current source (AD8643, Analog Devices) to supply the charge balanced biphasic stimuli to the selected electrode via a separate capacitor for each channel. The multiplexer and DAC are controlled by the DSC via dedicated SPIs (SPI0 and SPI1, respectively, see Figure 3). By the design of the eCI, the output of the eCI driver is limited to 10 channels accessible on the female pin header (CLE-107-01-G-DV, Samtec; see Methods).
Size, weight, and power consumption
The complete oCI/eCI sound processor and driver circuitry is built in the form of a cylinder accommodated in a low-weight (~6.5 g) and robust plastic enclosure made of polyether ether ketone (PEEK). The inside diameter of the enclosure is 25 mm and its height is 30 mm. The enclosure consists of a threaded head-mounted base (~1.5 g; figure panels C and D) fixed by dental acrylic and metal anchors to the skull of the animal, as well as a screwable cap (~5 g; figure panel B). This design allows for convenient installation of the device on the animal’s head and for battery replacement. The oCI/eCI sound processor and driver circuitry consists of two round multilayer rigid PCBs (figure panels B, E, and G), each 20 mm in diameter, carrying commercial off-the-shelf electronic components on both sides and interconnected via PCB surface-mounted pin headers (top PCB: FTE-106-03-G-DV, Samtec; bottom PCB: CLE-106-01-G-DV, Samtec). The bottom PCB (closer to the animal’s head) carries the sound processor and the oCI/eCI driver circuitry (Figure 3). It is connected to the head-mounted pin header to interface with the eCI or oCI. The top PCB carries a battery bracket holding a lithium-ion battery (CP1654A3, VARTA Microbattery GmbH; for details see Supplementary Information), a step-down converter (TPS62240, Texas Instruments Inc.) for powering the DSC, a booster (MCP16251, Microchip Technology Inc.) for powering the current source (in case of the eCI system) or the LED driver (in case of the oCI system), and a circuit based on a nanopower comparator (MAX9117, Maxim Integrated) that shuts off the power supply when the battery voltage is too low to allow reliable operation of the system. A flexible antenna for 2.4 GHz radio (FXP73, Taoglas) is wrapped around the PCB assembly. The weight of both PCBs including batteries and antenna is ~8 g.
The performance of the oCI/eCI system operating under battery powering is shown in Supplementary Figure 1. For both eCI and oCI systems the battery enabled 7–8 hours of operation when driven with identical conditions as used in behavioural experiments (typical session duration is less than an hour).
In vivo experiments
In order to test whether optical/electrical stimulation with our oCI/eCI sound processor triggers behavioural responses, we turned to an instrumental conditioning paradigm with negative reinforcement (avoidance behaviour) in a custom-built ShuttleBox setup (Figure 6 B, for details see Methods). oCI systems were tested on kanamycin-deafened rats, which had received a postnatal injection of AAV-PHP.B63 carrying the ChR2 mutant CatCh64 under the synapsin promoter into the left cochlea. Four adult rats were implanted with unilateral (left side) chronic oCI probes. For comparison, additional kanamycin-deafened rats were implanted with eCI probes (4–8 months of age, N = 6 females; for details on surgical procedures see Methods). Prior to deafening and implantation surgery, the animals were trained to show avoidance behaviour in response to acoustic stimulation in the ShuttleBox. After the implantation, we investigated whether the animals showed the same response to the optical/electrical stimulation in the identical setup (see also Methods).
Before implantation animals (both eCI and oCI groups) were trained with acoustic click stimulation in the ShuttleBox paradigm (for details see Methods) to achieve response rate of at least 0.8 for a target while not exceeding 0.2 for non-target control trial (Figure 6 D and E). After deafening, implantation, and adaptation of the animals to the new stimuli, animals performed the same task as previously, reaching similar rates for predefined stimulation with both eCI (Figure 6 F) and oCI (Figure 6 G). In both cases all available stimulation contacts were used, i.e. both eCI electrodes at intensities between 100–200 μA (mean threshold was ~70 μA) for eCIs and all oCI LEDs at intensities between 10-100% (~0.31–3.1 mA LED current per channel).
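The quoted correspondence between percent intensity and LED current (10% ≈ 0.31 mA, 100% ≈ 3.1 mA) is consistent with a linear scaling. The following Python sketch illustrates that relation under two assumptions: that the scaling is indeed linear, and that the full-scale current is the 3.1 mA upper end of the quoted range. The function name is illustrative and is not taken from the firmware.

```python
FULL_SCALE_MA = 3.1  # assumed full-scale LED current (mA), upper end of the quoted range


def led_current_ma(intensity_percent: float) -> float:
    """Return the per-channel LED current in mA, assuming linear scaling."""
    if not 0.0 <= intensity_percent <= 100.0:
        raise ValueError("intensity must lie between 0 and 100 %")
    return FULL_SCALE_MA * intensity_percent / 100.0


# Print the current for a few intensity settings used in the text.
for level in (10, 20, 100):
    print(f"{level:3d} % -> {led_current_ma(level):.2f} mA")
```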
After we had shown that predefined stimuli from the eCI and oCI systems can elicit behavioural responses (Figure 6), we tested whether oCI sound coding can restore hearing in frequency ranges not available to acoustic hearing due to partial deafening of the animal (Figure 7). We observed that operating all LEDs at intensity levels between 30 and 100% triggered behavioural responses but no aversive behaviour such as freezing or intensive scratching. Following the approach used for fitting CIs in humans, we roughly estimated the threshold level (TL) to be 20% intensity and the most comfortable level (MCL) to be 80%. We employed both the CIS and the n-of-m sound coding strategies implemented in the firmware, with m = 8 in both cases and n = 3 for n-of-m (for details see Sound coding strategy). Based on measurements of auditory brainstem responses (ABRs) and ShuttleBox performance, we identified frequency ranges that no longer elicited a response to acoustic stimulation after the surgery (Figure 7 A). The exemplary animal was still able to hear 8 kHz pure tones at ≥ 50 dB (SPL) but was more severely hearing impaired (at both the ABR and the behavioural level) for frequencies ≥ 16 kHz. We adapted the frequency window of the sound processor and used all LEDs inside the cochlea to restore hearing in the range between 16 and 28 kHz. With both coding strategies, n-of-m and CIS, the animal regained hearing ability, showing the pre-trained avoidance behaviour (Figure 7 C and D).
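To make the role of n, m, TL, and MCL concrete, the sketch below selects the n strongest of m band levels and maps them onto the TL to MCL range. It is only an illustration, not the firmware implementation: the linear-in-dB mapping, the input limits `floor_db` and `ceil_db`, and the treatment of CIS as the case in which all m bands are stimulated (ignoring pulse interleaving) are assumptions made here.

```python
TL_PERCENT = 20.0   # threshold level from the text
MCL_PERCENT = 80.0  # most comfortable level from the text


def n_of_m(band_levels_db, n=3, floor_db=40.0, ceil_db=90.0):
    """Return per-channel intensities (%) with only the n strongest bands active.

    floor_db and ceil_db are hypothetical limits of the acoustic input range;
    levels in between are mapped linearly (in dB) onto TL..MCL.
    """
    # Indices of the bands, ordered from strongest to weakest.
    ranked = sorted(range(len(band_levels_db)),
                    key=lambda i: band_levels_db[i], reverse=True)
    selected = set(ranked[:n])
    intensities = []
    for i, level in enumerate(band_levels_db):
        if i not in selected or level < floor_db:
            intensities.append(0.0)  # channel stays off
            continue
        frac = min((level - floor_db) / (ceil_db - floor_db), 1.0)
        intensities.append(TL_PERCENT + frac * (MCL_PERCENT - TL_PERCENT))
    return intensities


def cis(band_levels_db, **kwargs):
    """Simplified CIS: every band above the floor is stimulated."""
    return n_of_m(band_levels_db, n=len(band_levels_db), **kwargs)


levels = [30, 45, 65, 90, 50, 40, 35, 20]  # made-up band levels in dB, m = 8
print(n_of_m(levels))
print(cis(levels))
```

With n = 3 only the three strongest bands receive a non-zero intensity, whereas the simplified CIS call drives every band that exceeds the assumed input floor.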
Discussion
Here we describe and characterise a miniaturised multichannel oCI system for animal studies and compare it to its sister eCI system. Key features include low weight, real-time sound processing for multichannel stimulation (oCI: 32 channels, eCI: 16 channels), wired or wireless control, and operation over several hours in behavioural experiments. In order to enable rigorous benchmarking of the oCI, the hardware and software designs for oCI and eCI were kept as similar as possible, such that the two systems differ primarily in the number of stimulation channels and the specific driver hardware. In addition to the technical characterisation and benchmarking of components, we provide a proof of concept of the oCI system using optogenetically modified rats. This study demonstrates the feasibility of using the miniaturised oCI system to restore auditory-driven behaviour in deaf, freely moving rats.
Utility of the developed oCI system
The oCI sound processor and driver circuitry operated “active” oCIs56 based on microfabricated 1D arrays of ten blue-light-emitting GaN LEDs. This served multichannel optogenetic stimulation of the auditory pathway. However, given the small size of the rat cochlea, the probes used here did not fully capitalise on the potential of the oCI sound processor and driver circuitry (e.g. real-time 32-channel sound processing). The developed oCI systems pave the way for further preclinical development of multichannel optogenetic stimulation of the auditory pathway in animal models with larger cochleae such as gerbils, cats or marmoset monkeys65. Moreover, the hardware and software framework established here can also drive more sophisticated emitter arrays, including for other optogenetic applications in basic and preclinical research. For instance, using the implemented matrix addressing, it can operate 2D arrays or more complex 1D arrays in which the emitters of one block share a common cathode. This is highly relevant when scaling up the number of channels in “active” oCIs56, e.g. based on μLED arrays30,31, where the number of lines is limited by the space constraints of the scala tympani. The same sound processor and a similar driver circuitry can also serve to control laser diodes in multichannel “passive” oCIs56, where the light is fed into the cochlea via polymer-based waveguide probes66. Future modifications of the described hardware and software framework will also accommodate the combination of electrical and optical stimulation (combined oCI and eCI) for hybrid electro-optical stimulation26, or optofluidic circuitry for the application of photopharmacology into the cochlea followed by optical stimulation67–69. Most studies presenting optogenetic stimulators address brain stimulation70–73, e.g. cortical control of movement74, and also the peripheral nervous system75.
Some systems require external devices to deliver stimuli76, either via wireless power transfer (inductive coupling) from an external transmitting coil to a device coil or via radiofrequency power harvesting from an external transmitter antenna to a receiver antenna located on the device. Some of these devices, like ours, offer wireless communication and battery power, and are even lighter. However, our system can not only stimulate with predefined patterns set via a wireless communication protocol but also process captured sound in real time and convert it into optical stimulation pulses, thereby operating as a fully stand-alone device.
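The benefit of the matrix addressing mentioned in this section, in which emitters of one block share a common cathode, can be illustrated with a small sketch. It assumes a hypothetical layout of 4 cathode blocks by 8 anode lines for 32 channels; the actual wiring of the driver is not specified in this section, and the function names are illustrative.

```python
def matrix_address(channel: int, n_anodes: int = 8, n_cathodes: int = 4):
    """Map a channel index to (cathode block, anode line) in a hypothetical matrix."""
    if not 0 <= channel < n_anodes * n_cathodes:
        raise ValueError("channel out of range")
    return divmod(channel, n_anodes)


def line_count(n_anodes: int, n_cathodes: int) -> int:
    """Number of wires needed when anodes and cathodes are shared in a matrix."""
    return n_anodes + n_cathodes


print(matrix_address(13))  # channel 13 sits in cathode block 1, anode line 5
print(line_count(8, 4))    # wires needed for 32 emitters in this layout
```

In this assumed layout 12 lines address 32 emitters, compared with 33 lines if every emitter had its own anode wire plus one shared cathode, which is why matrix addressing matters when the number of lines is constrained by the scala tympani.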
Towards clinical translation of the oCI system
In order to implement a basic, head-wearable multichannel oCI system for animal experiments, we emphasised low weight and small size. The weight of the whole system corresponded to roughly one-third of the animal’s head weight, which is well below the 50 % specified in the provisions of the U.S. Army Aeromedical Laboratory (see also ref. 77) and below the load in another study in rats, which reported no adverse effects in animals carrying nearly twice as much weight78 as in the present study. Unlike clinical eCI systems, which consist of an external sound processor and the actual implant with driver and electrode array, we implemented a single external head-mounted device to which the oCI or eCI probe was interfaced at the vertex of the animal’s skull. Nevertheless, our approach shows the feasibility of driving an oCI with a miniaturised system built from commercial off-the-shelf components and, as a proof of concept, paves the way for the future development of human prototypes based on dedicated components. These will closely follow the design of current eCI systems but will accommodate the specific features of the oCI, such as matrix addressing of a greater number of stimulation channels and their parallel operation. Given the higher per-pulse energy requirement of oCIs79 and the larger number of stimulation channels, saving energy elsewhere will be critical. For instance, through-dermal optical data transmission80 could be considered as a replacement for the costly inductive transmission between the sound processor and an implantable driver. In conclusion, to the best of our knowledge, this is the first functional system consisting of a sound processor and a driver for chronic oCI stimulation that allows experiments in freely moving animals. This proof-of-concept oCI system paves the way for improved hearing restoration in human deafness with future clinical oCI systems.
Methods
Surgical procedures
CI surgeries were performed on animals anaesthetised with isoflurane (3.5–4 % at 1 l/min for induction, 0.5–2 % at 0.4 l/min for maintenance). Appropriate analgesia was achieved by subcutaneous injection of buprenorphine (0.1 mg/kg bodyweight) and carprofen (5 mg/kg bodyweight) 30 minutes prior to surgery. Depth of anaesthesia was monitored regularly via the absence of reflexes (hind limb withdrawal) and the breathing rate, and anaesthesia was adjusted accordingly. Animals were kept on a heating pad to maintain body temperature at ~37°C. All experimental procedures complied with national animal care guidelines and were approved by the local animal welfare committee of the University Medical Center Göttingen as well as the animal welfare office of the state of Lower Saxony, Germany.
Animals were chronically implanted unilaterally (left side) with an eCI or an oCI into the scala tympani via a cochleostomy of the basal turn. oCI surgery was preceded by an early postnatal injection of AAV-PHP.B carrying the ChR2 mutant CatCh under the synapsin promoter into the left cochlea (at postnatal day 6, 3–6 months before the implantation). Animals were deafened prior to oCI/eCI implantation: 2–3 μl of kanamycin solution (100 mg/ml; Kanamysel, Selectavet) were injected into both ears. In the left cochlea, kanamycin was injected via the cochleostomy that was later also used for the implant, while in the right ear kanamycin was injected into the middle ear via a transtympanic approach.
SGN stimulation: probes and protocols
Multichannel oCI and eCI probes were designed to fit the dimensions of the scala tympani of the rat cochlea (length of ~7 mm, ref. 81) and have been described in previous studies: eCI82 and oCI36. They are briefly introduced below.
eCI probes
The eCI probes consist of two linearly arranged intracochlear platinum sheet contacts and an extracochlear platinum-iridium reference ball electrode (Supplementary Figure 3 A and B). All contacts are connected via lead wires to a 3-pin male connector (1.27 mm pitch) and are embedded in silicone. The intracochlear contacts are each 0.3 mm in diameter with a centre-to-centre pitch of 0.8 mm. To ensure reproducibility of implantations across animals, each array was marked with a black dot at a distance of 3 mm from the apical tip (Supplementary Figure 3 B). The most apical electrode was positioned ~3 mm deep inside the cochlea, and the return electrode was inserted between connective tissue and muscles in the neck of the animal. The diameter of the silicone-encapsulated intracochlear and intrabullar part is 0.3 mm, increasing to 0.9 mm at the extrabullar part to provide a stable submuscular connection. The silicone-encapsulated return electrode is 0.3 mm in diameter. Apart from their size and number of stimulation sites, these animal eCIs are virtually identical to implants used in human patients. Monopolar electrical stimuli (charge-balanced biphasic cathodic-first pulses) were delivered.
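The term "charge-balanced biphasic cathodic-first pulse" can be illustrated with a short sketch that builds such a waveform and checks that its net charge is zero. The rectangular pulse shape, the 100 μs phase duration, and the absence of an interphase gap are assumptions chosen for illustration; this section does not specify them.

```python
def biphasic_pulse(amplitude_ua: float, phase_us: int, gap_us: int = 0, dt_us: int = 1):
    """Return current samples (in µA) of a cathodic-first, charge-balanced pulse."""
    n_phase = phase_us // dt_us
    n_gap = gap_us // dt_us
    cathodic = [-amplitude_ua] * n_phase  # negative (cathodic) phase comes first
    gap = [0.0] * n_gap                   # optional interphase gap
    anodic = [amplitude_ua] * n_phase     # equal and opposite anodic phase
    return cathodic + gap + anodic


def net_charge_pc(samples_ua, dt_us: int = 1) -> float:
    """Net delivered charge in pC (µA times µs equals pC)."""
    return sum(samples_ua) * dt_us


pulse = biphasic_pulse(amplitude_ua=200.0, phase_us=100)  # 100 µs phase is hypothetical
print(net_charge_pc(pulse))            # net charge over the whole pulse
print(200.0 * 100 / 1000, "nC per phase")
```

Because the two phases have equal amplitude and duration, the charges cancel, which is the property that "charge-balanced" refers to.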
oCI probes
The oCI probes contained 10 LEDs (C460TR2227-S2100, Cree), each 220 μm by 270 μm by 50 μm, spaced at a pitch of 350 or 450 μm along a polyimide carrier comprising wiring and LED contact pads and encapsulated in silicone (Supplementary Figure 3 C). The ZIF connector of the probe is interfaced with the head-mounted adaptor board (ZIF connector to male pin header FTE-110-03-G-DV, Samtec), which in turn was connected to the female pin header (CLE-110-01-G-DV, Samtec) of the driver circuitry. Light pulses were delivered via custom-made oCIs driven by monophasic current. oCI fabrication processes were adapted to the chosen device layout following refs. 33 and 36.
ABR recordings
ABRs were recorded using needle electrodes on the vertex and mastoid, while an active shielding electrode on the neck was used to reduce the noise level. The differential potential between vertex and mastoid subdermal needles was amplified using a custom-designed amplifier (gain 10,000), sampled at a rate of 50 kHz (NI PCI-6229, National Instruments), and filtered off-line (0.3 kHz to 3 kHz Butterworth filter) for acoustically evoked ABRs.
Stimulus generation and presentation, and data acquisition were realised using custom-coded MATLAB (The MathWorks, Inc.) scripts operating digital-to-analogue converters (National Instruments) in combination with custom-built hardware to amplify and attenuate audio signals. For acoustically evoked ABRs, sounds were presented in an open near field via a single loudspeaker (Vifa, Avisoft Bioacoustics) placed on average 30 cm in front of the animal at the level of the animal’s head. Sound pressure levels were calibrated with a 0.25-inch microphone (D 4039, Brüel & Kjaer GmbH) and a measurement amplifier (2610, Brüel & Kjaer GmbH).
Behavioural experiment setup
The setup was identical to the one previously used in studies of optogenetic stimulation via optical fibre in Mongolian gerbils24. Briefly, the ShuttleBox is composed of two platforms with a hurdle between them. The platforms, consisting of equally spaced rows of round metal bars, are mounted on springs and accelerometers to determine the position of the animal. Each acoustic stimulus is presented through a ceiling-mounted loudspeaker (Vifa, Avisoft Bioacoustics). Sound pressure levels were calibrated using a 0.25-inch microphone (D 4039, Brüel & Kjaer GmbH) in combination with a measurement amplifier (2610, Brüel & Kjaer GmbH). Stimulus generation and data recording are performed with custom-coded MATLAB (The MathWorks, Inc.) scripts via digital-to-analogue converters (National Instruments) and custom-built hardware (for details see ref. 24). These existing MATLAB scripts were extended to interface with the oCI/eCI processor for adjusting settings and for linking the oCI/eCI processor with a PC to trigger predefined stimuli.
Behavioural experiments paradigm
The behavioural training procedure was similar to that published previously24. First, rats were trained in acoustic click sessions, later followed by training with oCI/eCI stimulation (with triggered predefined stimuli and sound coding-based acoustic stimulation). For each training session, the animal was positioned on one of the platforms inside the ShuttleBox and allowed to adapt for 5 minutes before any stimulation. Per session, target trials were presented with an inter-trial interval randomised between 12 and 24 seconds. Each trial consisted of 200–250 ms long stimuli (trains of 1 ms-long acoustic clicks or electrical/optical pulses at a rate of 50 Hz in the case of predefined oCI/eCI stimuli) repeated at a rate of 2 Hz within a response window of 6 seconds. Following perception of a stimulus in a target trial, the animal was expected to shuttle (cross the hurdle) to the opposite side of the ShuttleBox (Figure 6 A). If the animal shuttled within the response window, the trial was counted as a hit and the stimulus was terminated (Figure 6 A left). Otherwise, an aversive stimulus (mild electrodermal stimulation of the paws via the metal bars of the platform, 0.1–1 mA) was applied until the animal shuttled or for a maximum of 6 seconds, and the trial was counted as a miss (Figure 6 A centre). In each session, non-target trials (no stimulus, no aversive event, Figure 6 A right) were randomly intermixed with target trials (Figure 6 A left) to determine the baseline activity of the animal. The response rate of the animal (Figure 6 D–G) for target/non-target trials is reported as the number of target/non-target trials in which the animal crossed the hurdle divided by the total number of target/non-target trials. In all sessions employing oCI/eCI stimulation, each animal was briefly anaesthetised with isoflurane to connect the sound processor and driver circuitry to the implant. Adaptation time was counted from the moment the animal became active again.
After showing that the sound processor can be triggered by acoustic clicks to elicit light stimulation that retrieves the pre-trained avoidance behaviour36, we moved to the first in vivo application of the sound processor using coding strategies. Beforehand, we had found that operating all LEDs at 30% intensity was still sufficient to trigger a behavioural response, pointing towards a TL below that value. For MCL estimation, we referred to the observation that LEDs operating at 100% intensity did not cause adverse behaviour in the animals such as intensive scratching, which had, for example, been observed in eCI-implanted animals. Therefore, for the first implementation of the coding strategies we used a TL of 20% and an MCL of 80%. The processing range of the sound coding strategy was set to the previously identified frequency range without residual hearing (16–28 kHz; see also Methods). This was done for only one rat, which had 8 functional LEDs inside the cochlea, resulting in coding of 6 kHz bandwidth using 8 LEDs. Within the ShuttleBox session, pure tones of 8, 16, 18, 20, and 22 kHz were presented as targets, whereas 8 kHz served as control as natural hearing was still possible at this frequency (Figure 7).
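The response-rate definition above can be written out as a short sketch. The function names and the example trial outcomes are made up for illustration; the default criterion values (at least 0.8 for target and at most 0.2 for non-target trials) are the training criterion stated in Results.

```python
def response_rate(crossed) -> float:
    """Fraction of trials in which the animal crossed the hurdle."""
    if not crossed:
        raise ValueError("no trials recorded")
    return sum(1 for c in crossed if c) / len(crossed)


def meets_criterion(target, non_target, min_target=0.8, max_non_target=0.2) -> bool:
    """Check a session against the training criterion stated in Results."""
    return (response_rate(target) >= min_target
            and response_rate(non_target) <= max_non_target)


# Hypothetical session: True means the animal shuttled within the response window.
target_trials = [True] * 17 + [False] * 3
non_target_trials = [True] * 2 + [False] * 18

print(response_rate(target_trials))      # hit rate in target trials
print(response_rate(non_target_trials))  # baseline crossing rate
print(meets_criterion(target_trials, non_target_trials))
```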
Author contributions
T.H., L.J., G.H., and T.M. designed the study. T.H. and L.J. developed software (firmware for sound processor and control software for computer). T.H., G.H., L.J., and T.M. planned the hardware development, which was implemented by G.H. and L.J. supported by Daniel Weihmüller. A.D. and B.W. performed surgeries/implantations of eCIs/oCIs. B.W. deafened the animals. A.D., B.W., and L.J. performed ShuttleBox trainings/experiments involving rats and analysed the data. R.H. designed and generated eCIs. S.A. and P.R. developed and generated oCIs. L.J., T.H., B.W., and T.M. designed the figures. L.J., B.W., and T.H. prepared the figures. A.D. contributed to preparation of the figures. L.J., T.H., B.W., and T.M. prepared the manuscript. All authors discussed the results and commented on the manuscript.
Competing interests
T.M. is a co-founder of OptoGenTech GmbH.
Supplementary Information
Selection of the battery
Seeking a low-weight battery enabling reliable operation of the oCI/eCI systems, we tried various batteries, starting with the lithium coin cell CR1631 and including hearing-aid zinc-air batteries (MERCURY-FREE, sizes p13 and p675, power one, VARTA Microbattery GmbH) and cochlear-implant batteries (IMPLANT plus series, size p675, power one, VARTA Microbattery GmbH). Most of them were unsuitable for our purpose because they could not provide enough current for long-term wireless operation in permanent receiving or transmitting mode, in which the radio transceiver is always turned on (ShuttleBox paradigm with triggered predefined stimulation, for details see Results and Methods). Finally, we successfully employed a lithium-ion battery (CP1654A3, VARTA Microbattery GmbH). Battery testing with the ShuttleBox-like paradigm revealed 7–8 hours of device operation. The discharge curve of the battery is shown in Supplementary Figure 1.
Testing was performed using a MATLAB (The MathWorks, Inc.) script simulating a ShuttleBox experiment in which stimulation is triggered via the USB-to-ESB bridge for the maximum length of the response window and with a minimum inter-trial interval of 10 s (compare with the ShuttleBox experiment paradigm described in Methods). This simulation is considered the most demanding scenario in terms of power consumption in such ShuttleBox experiments. For the eCI system, a 6.8 kΩ resistor, matching the average impedance of the electrode implanted into the animal, was used instead of the implant and the animal. For the oCI system, an oCI implant was used (see Methods). Stimulus parameters were the same as in the animal experiments for both systems, except for the number of channels stimulated simultaneously (here a single channel for either eCI or oCI), the eCI current intensity of 300 μA (~20% of the maximum current of the current source), and the oCI light intensity of 20% (~620 μA per LED, relative to the maximum current of the LED driver). The current consumption of the device and the voltage level of the battery were measured with a custom-built setup consisting of a Cmod CK1 (Digilent Inc.) and a current-shunt monitor INA186 (Texas Instruments Inc.) running custom-coded firmware that sent data via a UART interface to the PC (Supplementary Figure 1).
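As a plausibility check of the test conditions, the short calculation below applies Ohm's law to the dummy load and relates the quoted percentages to absolute currents. It is only a back-of-the-envelope sketch: it assumes a purely resistive load and takes the 3.1 mA per-channel LED maximum from the range quoted in Results; the derived full-scale eCI current follows from reading "~20%" literally.

```python
LOAD_OHM = 6.8e3         # dummy load replacing implant and animal (from the text)
TEST_CURRENT_A = 300e-6  # eCI test current (from the text)
LED_MAX_A = 3.1e-3       # per-channel LED maximum quoted in Results

load_voltage_v = TEST_CURRENT_A * LOAD_OHM  # V = I * R across the resistor
eci_full_scale_a = TEST_CURRENT_A / 0.20    # if 300 µA is about 20 % of full scale
led_test_current_a = 0.20 * LED_MAX_A       # 20 % intensity, assuming linear scaling

print(f"voltage across load: {load_voltage_v:.2f} V")
print(f"implied eCI full scale: {eci_full_scale_a * 1e3:.1f} mA")
print(f"LED test current: {led_test_current_a * 1e6:.0f} µA")
```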
Transfer characteristic of the sound processor’s audio system
The audio quality of the sound processor is determined by the acoustic set-up (e.g. positioning of the microphone, diameter of the acoustic vent in the PCB, acoustic shadow effects of the processor enclosure), the PDM microphone, and the processor’s PDM interface.
To characterise the overall audio quality of the processor, we developed a program (stored as “Audio calibration mode” in the sound processor, see Figure 4) that analyses signals coming from the PDM microphone and forwards the analysis results to a PC. For the analysis it uses a sampling rate of 50 ksps, a 512-point real-valued FFT with floating-point precision based on the CMSIS DSP library80, and a Blackman-Harris window function.
In an acoustically shielded chamber (Industrial Acoustics) we used a single Tannoy Reveal 402 loudspeaker (±3 dB linearity between 56 Hz and 48 kHz) driven by a Zoom UAC-8 audio interface (±1 dB linearity between 20 Hz and 40 kHz at 96 ksps) to play back sound generated by a custom-coded MATLAB (The MathWorks, Inc.) script. The distance between the centre of the loudspeaker and the sound processor was 1 m. The sound pressure level was checked at the sound processor’s position with a calibrated Phonic PAA3 audio analyser.
Using this script, we generated a sine sweep (a chirp signal sweeping from 100 Hz to 30 kHz at a 96 ksps sampling rate), which we played back at specific sound pressure levels (SPL) while the analysis program was running on the sound processor. The results for 95 dB SPL, 65 dB SPL, and silence (noise floor measurement) for frequencies between 200 Hz and 25 kHz are shown in Supplementary Figure 2.
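The analysis parameters determine the frequency resolution of the calibration measurement. The sketch below derives the bin width from the stated sampling rate and FFT length and constructs a window in pure Python. The 4-term coefficient set in its symmetric form is the commonly used Blackman-Harris definition and is an assumption here; the exact variant used in the firmware is not stated.

```python
import math

FS_HZ = 50_000  # sampling rate from the text
N_FFT = 512     # FFT length from the text

bin_width_hz = FS_HZ / N_FFT  # spacing between adjacent FFT bins
nyquist_hz = FS_HZ / 2        # highest frequency that can be analysed


def blackman_harris(n: int):
    """4-term Blackman-Harris window (symmetric form, standard coefficients)."""
    a0, a1, a2, a3 = 0.35875, 0.48829, 0.14128, 0.01168
    return [
        a0
        - a1 * math.cos(2 * math.pi * k / (n - 1))
        + a2 * math.cos(4 * math.pi * k / (n - 1))
        - a3 * math.cos(6 * math.pi * k / (n - 1))
        for k in range(n)
    ]


window = blackman_harris(N_FFT)
print(f"bin width: {bin_width_hz:.2f} Hz, Nyquist: {nyquist_hz:.0f} Hz")
print(f"bin index nearest to 8 kHz: {round(8000 / bin_width_hz)}")
```

A frame of microphone samples would be multiplied element-wise by `window` before the FFT; the window trades a wider main lobe for strongly suppressed side lobes, which helps when estimating levels across a wide dynamic range.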
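A sweep of this kind can be generated in a few lines. The original stimulus was produced with a MATLAB script that is not reproduced here; the Python sketch below is a stand-in that assumes a linear sweep and a hypothetical duration of 1 s, neither of which is specified in the text.

```python
import math


def linear_chirp(f0_hz: float, f1_hz: float, duration_s: float, fs_hz: int):
    """Return samples of a linear sine sweep from f0_hz to f1_hz."""
    n = int(duration_s * fs_hz)
    rate = (f1_hz - f0_hz) / duration_s  # sweep rate in Hz per second
    samples = []
    for i in range(n):
        t = i / fs_hz
        # Phase is the integral of the instantaneous frequency f0 + rate * t.
        phase = 2 * math.pi * (f0_hz * t + 0.5 * rate * t * t)
        samples.append(math.sin(phase))
    return samples


# 100 Hz to 30 kHz at 96 ksps as in the text; the 1 s duration is hypothetical.
sweep = linear_chirp(100.0, 30_000.0, duration_s=1.0, fs_hz=96_000)
print(len(sweep))
```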
Acknowledgements
The authors thank Daniel Weihmüller for PCB layout, assembly, testing, and support in hardware development. We thank Dr. Vladan Rankovic for producing and injecting AAVs. Dr. Daniel Keppeler and Dr. Marcus Jeschke designed the enclosure, which was then produced by Rainer Schürkötter and colleagues (MPI for Biophysical Chemistry, Göttingen). The work was funded by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (grant agreement No 670759 – advanced grant “OptoHear” to T.M.) and by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) via the Leibniz Program (MO896/5 to T.M.) and under Germany’s Excellence Strategy (EXC 2067/1-390729940 to T.M. and EXC 1086 to P.R.).