Abstract
There is overwhelming evidence that metabolic processes are altered in cancer cells and these changes are manifested in the volatile organic compound (VOC) composition of exhaled breath. Here, we take a novel approach of an insect olfactory neural circuit-based VOC sensor for cancer detection. We combined an in vivo antennae-attached insect brain with an electrophysiology platform and employed biological neural computation rules of antennal lobe circuitry for data analysis to achieve our goals. Our results demonstrate that three different human oral cancers can be robustly distinguished from each other and from a non-cancer oral cell line by analyzing individual cell culture VOC composition-evoked olfactory neural responses in the insect antennal lobe. By evaluating cancer vs. non-cancer VOC-evoked population neural responses, we show that olfactory neurons’ response-based classification of oral cancer is sensitive and reliable. Moreover, this brain-based cancer detection approach is very fast (detection time ~ 250 ms). We also demonstrate that this cancer detection technique is effective across changing chemical environments mimicking natural conditions. Our brain-based cancer detection system comprises a novel VOC sensing methodology that will spur the development of more forward engineering technologies for noninvasive detection of cancer.
Introduction
Breath analysis is a noninvasive disease detection technique that aims to characterize the volatile chemical composition of exhaled breath, which represents the volatile chemicals present in blood and airways inside the body 1–5. Cancer alters cellular metabolism, and these alterations are ultimately reflected in the volatile organic compound (VOC) composition of patients’ exhaled breath 1,6,7. Volatile cancer biomarkers are present in exhaled breath at detectable concentrations (parts per million to parts per trillion) 1,8. Recent studies have identified several putative volatile biomarkers associated with multiple cancers, including head and neck, lung, and breast 1,8,9. Moreover, it is presumed that cancer-induced changes in breath samples are detectable at early stages of the disease, and recent studies have supported the possibility of early cancer detection by analyzing VOCs in exhaled human breath samples 2,5,10,11.
Consequently, patients can potentially be screened early, noninvasively, and periodically by identifying unique exhaled breath volatile compositions indicative of cancer. Currently, however, no gas sensing technology is used in clinical settings for cancer detection. The most commonly used volatile chemical sensing technology is gas chromatography-mass spectrometry (GC-MS), which performs individual component-wise identification of gas mixtures 1. Although the GC-MS-based technique is sensitive and has been shown to identify putative cancer biomarker concentrations in breath samples, it is not suitable for clinical settings because it is generally slow, is not portable, and requires pre-processing and storage of samples. Moreover, this component-wise classification approach is fundamentally different from biological olfactory gas sensing and poses challenges regarding diagnostic capabilities due to internal variations of breath samples and environmental factors. Another gas sensing technology, the electronic nose (e-nose), employs biological principles, such as combinatorial coding approaches, to achieve one-shot VOC sensing 12–14. Although these portable and inexpensive chemical sensors are able to process breath samples in real time, even after decades of development they still lack specificity, cross-selectivity, and the ability to work in natural conditions 13.
While engineered chemical sensors remain limited in their ability to detect volatile compounds reliably in natural settings, biology has solved this problem over millions of years of evolution. The canine nose is the most widely used biosensor and remains the state-of-the-art approach for several gas sensing applications including homeland security and explosive detection 15. Trained dogs are also efficient at detecting diseases via human breath and body odors 16–21. However, bioassays based solely on behavior are binary, i.e., disease vs. no disease, and cannot report on different types of diseases. Insects also have an extremely sensitive sense of smell, are easier to maintain, and can be trained behaviorally to detect specific volatiles.
In our work, we employ a forward engineering approach to ‘sniffing out cancer’ by combining a live insect brain with an electrophysiological recording platform, precision VOC delivery, and sophisticated data analysis tools. Unlike canine gas sensing which uses behavioral readouts, the olfactory neurons in the insect brain do not need to be trained to identify cancer biomarkers. Rather, cancer VOC-evoked neural response templates are used to calibrate the sensor. These VOC composition-specific neural response templates or ‘fingerprints’ can be generated for different types of diseases, which enable us to simultaneously distinguish between multiple cancers. By incorporating a live brain, this approach also harnesses the full power of a biological chemosensory array (antennae) and the associated neural computation (antennal lobe circuitry) for cancer VOC classification.
Insect olfactory sensory systems are extremely powerful and have evolved to detect low concentrations of gas molecules and minute changes in the composition of gas mixtures 22–25. Moreover, the locust olfactory system is well studied for odor-evoked neural coding schemes 26–47, and is accessible for electrophysiological recordings from multiple olfactory brain centers 27,38,47,48. In the insect olfactory system, VOCs are first detected by the olfactory receptor neurons (ORNs) situated in the insect antennae 49. Each ORN generally responds to several VOCs based on their chemical identities and concentrations 50. Employing a combinatorial coding scheme, 50 insect ORNs alone can in principle detect a total of ~2^50 chemicals, which is on the order of 10^15 odorants. This enormous encoding capacity, coupled with chemical specificity and specialized neural computations, renders the insect olfactory system extremely powerful for chemical sensing. The ORNs transmit odor-evoked electrical impulses to the antennal lobe, where the signal is processed by a complex network of excitatory projection neurons (PNs) and inhibitory local neurons. Individual PNs have broad tuning curves and they respond to several odorants and odor mixtures with varying spike rates and temporal firing motifs. The odor-evoked spatiotemporal PN population response gives rise to odor-specific neural codes, which are presumed to determine odor identity, intensity, and time course 23,25,51. Our previous work has identified several functional neural coding schemes that can achieve background-invariant odor recognition and novelty detection, both of which are critical for robust detection in natural settings 26–29.
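The combinatorial capacity quoted above can be checked with a short calculation. The sketch below assumes the simplest possible code, in which each ORN is idealized as a binary unit (responding or silent); real ORNs respond in a graded manner, so this is only an illustrative lower-level model, and the function name is hypothetical.

```python
# Assumption: each ORN is idealized as a binary unit (responding or silent).
N_ORNS = 50


def binary_code_capacity(n_units: int) -> int:
    """Number of distinct on/off activation patterns across n_units."""
    return 2 ** n_units


capacity = binary_code_capacity(N_ORNS)
print(f"{N_ORNS} ORNs -> {capacity:,} patterns (~{capacity:.1e})")
```

Under this idealization, 50 ORNs yield 2^50 patterns, i.e., roughly 10^15.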
We obtained in vivo extracellular neural recordings from the PNs in the locust antennal lobe and used the neural responses for the detection of VOC mixtures emitted from human oral cancer cells. We hypothesized that locust antennal lobe PNs would respond differently to the VOC compositions associated with different oral cancer cell lines and a non-cancer cell line. We also hypothesized that this approach would be fast, reliable, and sensitive to the differences in VOC mixtures associated with different types of oral cancers. Here, we have systematically tested these hypotheses and demonstrated the feasibility and robustness of this forward engineering approach for noninvasive cancer detection.
Results
Cancer vs. non-cancer VOC compositions elicit distinct olfactory neuronal responses
We began by investigating odor-evoked individual projection neuron (PN) responses in the locust antennal lobe. Volatile chemical mixtures emitted from different cell cultures were delivered to the locust antenna using an olfactometer. Three oral carcinoma cell lines (Ca9-22, HSC-3, and SAS) and one non-cancer cell line (HaCaT) were grown in identical cell culture medium after seeding each cell line at the same initial cell number 52. All four cell cultures were grown individually in airtight flasks for 96 hours (h) to protect the emitted VOCs from contaminants. Precise amounts of the cell culture VOCs were delivered to the locust antennae for 4 seconds (s), while in vivo extracellular neural recordings were obtained from PNs (Fig. 1a, b, see Methods). Cell culture VOC samples were examined at 24 h intervals by in vivo PN recordings. Additionally, we used two control odorants, hexanal and undecane, which have been implicated in earlier studies as putative cancer biomarkers 1.
We observed VOC-evoked changes in neural spiking responses in most of the PNs recorded. Since PNs are broadly selective to several odor stimuli and respond to specific odorants or odor mixtures with distinctive temporal firing patterns 22–24,51, we targeted this neuron population for oral cancer classification. At the individual neuron level, the three oral cancer and the non-cancer VOC mixtures elicited distinct spiking responses over the odor presentation window. Raw voltage traces of representative extracellular neural recordings showed clear differentiation between the oral cancer cells, non-cancer cell, and cell culture medium. Moreover, we noted differences in PN spiking responses between the three oral cancer cell lines (Fig. 1c, d).
Next, we investigated how total spike counts (over the entire 4 s stimulus window) varied for each recorded neuron corresponding to different VOC exposures (Fig. 1e). To identify single neurons, spike-sorting of extracellular multi-channel recordings was performed following previously published methods 53. Then, we used a simple metric of VOC-evoked time-averaged and trial-averaged spike counts of individual PNs for each stimulation condition. Individual PN spike counts were summed over the 4 s stimulus presentation window and averaged across trials (n = 5 trials) to quantify these changes. Next, we compared the average spike count of each PN across two stimulus conditions. For example, PN spike counts corresponding to each oral cancer cell line were compared to the spike counts of the same set of PNs elicited by the culture medium VOC composition. When all recorded PNs were analyzed, several PNs showed significant changes in spike counts across two stimulus conditions (P < 0.05, d.f. = 4, 28, one-way ANOVA with Bonferroni correction). These results demonstrated that there were differences in individual PN spike counts elicited by cancer vs. non-cancer vs. control VOCs. Notice that this analysis only compared total PN spike counts corresponding to different stimuli, but differences in temporal firing motifs of individual PNs as seen in Fig. 1c, d were not reflected in this analysis.
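The spike-count metric described above can be sketched as follows. This is a minimal illustration, assuming spike times are stored per trial in seconds relative to stimulus onset; the function name and data layout are hypothetical and not taken from our analysis code.

```python
def mean_spike_count(trials, start_s=0.0, stop_s=4.0):
    """Trial-averaged spike count of one PN within the stimulus window.

    trials: list of trials, each a list of spike times (s) relative to
            stimulus onset (assumed layout).
    """
    # Count spikes that fall inside [start_s, stop_s) for every trial.
    counts = [sum(start_s <= t < stop_s for t in trial) for trial in trials]
    # Average the per-trial counts (n = 5 trials in the experiments above).
    return sum(counts) / len(counts)


# Five hypothetical trials of a single PN.
example_trials = [[0.1, 0.5, 4.2], [1.0], [], [0.2, 0.3, 3.9], [-0.1, 2.0]]
print(mean_spike_count(example_trials))
```

The resulting per-neuron averages for two stimulus conditions can then be compared statistically, as described in the text.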
Strategies to classify oral cancer VOCs employing spatiotemporal olfactory neural responses
To incorporate the temporal spiking characteristics, we analyzed the spatiotemporal PN responses elicited by the three oral cancer cell lines, the non-cancer cell line, and the cell culture medium. To generate ‘spatial’ (neuronal identity) – ‘temporal’ (spiking dynamics) response vectors of the entire PN population, trial-averaged firing rates of each neuron were binned into 50 ms nonoverlapping time windows. Individual neuron responses were temporally aligned following stimulus onset. For this analysis, we combined spiking responses from all recorded PNs over multiple days of cell culture. This resulted in a high dimensional population neuron response, which was represented by an n x m matrix (Fig. 2a, n = 194 PNs; m = 80 time bins with 50 ms bin size over 4 s of odor presentation). Next, all recorded PN responses corresponding to each stimulus were concatenated to generate the population PN time-series data for the stimulus panel corresponding to Ca9-22, HSC-3, SAS, HaCaT, and the cell culture medium.
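The construction of the n x m population response matrix can be sketched as below. This is an illustrative NumPy version under assumed conventions (spike times in seconds after stimulus onset, firing rates expressed in Hz); the function name and input layout are hypothetical.

```python
import numpy as np


def population_matrix(spikes, bin_s=0.05, duration_s=4.0):
    """Build an (n_neurons, n_bins) matrix of trial-averaged firing rates.

    spikes[neuron][trial] is a sequence of spike times (s) aligned to
    stimulus onset (assumed layout).
    """
    n_bins = int(round(duration_s / bin_s))          # 80 bins for 4 s / 50 ms
    edges = np.linspace(0.0, duration_s, n_bins + 1)  # non-overlapping bins
    rows = []
    for neuron_trials in spikes:
        # Spike counts per bin for each trial, then the mean across trials.
        counts = [np.histogram(trial, bins=edges)[0] for trial in neuron_trials]
        rows.append(np.mean(counts, axis=0) / bin_s)  # counts -> rate in Hz
    return np.vstack(rows)


# Two hypothetical neurons with two trials each.
demo = population_matrix([[[0.01, 0.02, 0.06], [0.03]], [[3.99], []]])
print(demo.shape)
```

Matrices built in this way for each stimulus can then be concatenated along the time axis to form the population time-series data.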
To visualize these cell culture VOC-evoked spatiotemporal neural responses, we projected the high dimensional data onto three dimensions using a linear principal component analysis (PCA, Fig. 2b, see Methods). The points in the three-dimensional PCA subspace were connected in a temporal order to generate stimulus-specific neural response trajectories. We observed that each VOC profile generated a closed-loop neural trajectory, which evolved in a unique direction. A long line of work in insect olfaction has established that the unique direction of the population PN trajectories is specific to odor identity and intensity 23,24,26. Our previous work demonstrated that larger angular distances between PN trajectories signify better separability between two odorants 24,26. Therefore, unique neural trajectories corresponding to individual VOC mixtures indicate that oral cancer VOC profiles are distinct from the non-cancer cell line. Moreover, we observed distinctions among the neural trajectories evoked by the three oral cancer cell lines (Fig. 2b), which signify that differences between various oral cancers can be identified by this approach as well.
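A minimal sketch of this projection is given below. It assumes that each time bin is treated as one sample and each PN as one feature, and it computes the principal axes with a singular value decomposition; the function name and the dictionary-based input are hypothetical choices made for illustration.

```python
import numpy as np


def pca_trajectories(responses, n_components=3):
    """Project per-stimulus population responses onto shared PCA axes.

    responses: dict mapping stimulus name -> (n_neurons, n_bins) matrix.
    Returns a dict mapping stimulus name -> (n_bins, n_components) array
    whose rows, taken in temporal order, form the neural trajectory.
    """
    names = list(responses)
    # Stack all stimuli along time: rows are time bins, columns are neurons.
    X = np.concatenate([responses[k].T for k in names], axis=0)
    mean = X.mean(axis=0)
    # Rows of vt are principal axes sorted by decreasing explained variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    axes = vt[:n_components]
    return {k: (responses[k].T - mean) @ axes.T for k in names}


# Synthetic example with 10 hypothetical neurons and 80 time bins.
rng = np.random.default_rng(0)
demo_resp = {"A": rng.normal(size=(10, 80)), "B": rng.normal(size=(10, 80)) + 2.0}
demo_traj = pca_trajectories(demo_resp)
print(demo_traj["A"].shape)
```

Connecting consecutive rows of each returned array produces one trajectory per stimulus in the shared three-dimensional subspace.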
To determine the separation between the cell line specific neural response clusters, we performed linear discriminant analysis (LDA) on the population PN time-series data (Fig. 2c). Similar to the PCA analysis, we used the population PN time-series dataset and plotted the VOC-evoked PN responses in a three-dimensional LDA subspace. This linear dimensionality reduction technique maximized the neural response cluster separation between stimuli. We observed distinct clustering of PN responses corresponding to all five stimuli, indicating that a linear classifier in a three-dimensional LDA space is sufficient to classify cancer vs. non-cancer successfully based on their corresponding VOC profiles.
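For illustration, a textbook Fisher LDA projection can be written in a few lines of NumPy. This sketch is not our analysis code: it assumes time bins as samples and stimulus identity as the class label, and it uses a pseudo-inverse of the within-class scatter matrix as one simple way to cope with an ill-conditioned scatter when the number of neurons is large. The function name is hypothetical.

```python
import numpy as np


def lda_project(X, y, n_components=3):
    """Project samples onto the leading Fisher discriminant axes.

    X: (n_samples, n_features) array; y: (n_samples,) integer class labels.
    """
    classes = np.unique(y)
    mean = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    # Directions maximizing between-class relative to within-class scatter.
    evals, evecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(evals.real)[::-1][:n_components]
    W = evecs[:, order].real
    return (X - mean) @ W


# Synthetic example: three well-separated classes in five dimensions.
rng = np.random.default_rng(0)
centers = np.array([[0, 0, 0, 0, 0], [10, 0, 0, 0, 0], [0, 10, 0, 0, 0]], dtype=float)
labels = np.repeat(np.arange(3), 40)
samples = centers[labels] + rng.normal(size=(120, 5))
Z = lda_project(samples, labels, n_components=2)
print(Z.shape)
```

With k classes, at most k − 1 discriminant axes carry information, so a three-dimensional LDA subspace is available for the five stimuli considered here.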
To obtain a quantitative estimate of the classification performance, we performed a leave-one-trial-out cross validation analysis of the PN time-series data (see Methods, Fig. 2d, e). This analysis was performed on the high dimensional dataset (n = 194 PNs, m = 80 time bins) without any dimensionality reduction. First, Euclidean distances of neural response vectors at each time bin (50 ms duration) were compared between the testing and the training data (total 80 comparisons over 4 s for each test trial), generating a bin-wise classification (Fig. 2d). This bin-wise confusion matrix had its highest values along the diagonal, which implied a high rate of successful detection of all five stimuli. Next, we plotted a trial-wise confusion matrix by calculating the mode of the predicted responses for all 80 time bins (Fig. 2e). The trial-wise analysis was implemented to assign one predicted value for each test trial. This analysis showed 100% correct classification for all three oral cancer VOC mixtures among themselves and in comparison with the non-cancer and control VOCs. Similar dimensionality reduction and confusion matrix analyses were performed on the dataset while including the two other control odorants (Fig. S1). Neural trajectories corresponding to the two control odorants were significantly different from all cell culture VOCs and the confusion matrix analysis showed high classification success for all seven VOCs tested.
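The leave-one-trial-out procedure can be sketched as follows. This is a simplified illustration, assuming that the training template for each stimulus is the mean of the remaining trials and that the data are stored as a four-dimensional array; the function name and array layout are hypothetical.

```python
import numpy as np


def leave_one_trial_out(data):
    """Bin-wise and trial-wise confusion matrices via nearest template.

    data: (n_stimuli, n_trials, n_neurons, n_bins) array of binned responses.
    Both returned matrices hold raw counts indexed [true, predicted].
    """
    n_stim, n_trials, _, _ = data.shape
    bin_wise = np.zeros((n_stim, n_stim), dtype=int)
    trial_wise = np.zeros((n_stim, n_stim), dtype=int)
    for t in range(n_trials):
        # Templates: mean over all trials except the held-out one (assumption).
        train = np.delete(data, t, axis=1).mean(axis=1)
        for s in range(n_stim):
            test = data[s, t]                             # (n_neurons, n_bins)
            # Euclidean distance to every template at every time bin.
            dist = np.linalg.norm(train - test, axis=1)   # (n_stim, n_bins)
            pred = dist.argmin(axis=0)                    # label per time bin
            votes = np.bincount(pred, minlength=n_stim)
            bin_wise[s] += votes
            trial_wise[s, votes.argmax()] += 1            # mode over bins
    return bin_wise, trial_wise


# Synthetic example: 3 stimuli, 5 trials, 20 neurons, 8 time bins.
rng = np.random.default_rng(1)
templates = 5.0 * rng.normal(size=(3, 1, 20, 8))
synthetic = templates + 0.1 * rng.normal(size=(3, 5, 20, 8))
bw, tw = leave_one_trial_out(synthetic)
print(tw)
```

Dividing each row by its sum converts the counts into the classification probabilities typically shown in a confusion matrix.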
Classification of cancer vs. non-cancer cells in a changing chemical background
We anticipated that emitted VOC compositions corresponding to each cell line would vary over time due to cell growth and ongoing metabolic processes in a fixed cell culture medium. We also hypothesized that the neuronal template-based VOC classification approach would be able to compensate for these variations caused by changing environments. To investigate this, the neural data that were previously combined were split and analyzed at four different time points: 24-, 48-, 72- and 96-h after seeding (Fig. S2). All PNs recorded at a specific time point across multiple repetitions of the cell cultures were combined to generate the population PN response vector for that time point. For example, each cell culture was repeated 7 times, and the VOC analysis at the 24-h time point resulted in a total of 42 PNs. All the cell cultures remained viable over 96-h from initiation, which was verified by manually counting healthy cells at different time points of the cell cultures (Fig. S3, S4).
We began by examining VOC-evoked population PN time-series data at 24-h post seeding. We noticed that dimensionally reduced neural trajectories evolved in different directions for different VOC profiles in the PCA space (Fig. 3a). When we performed the same analysis at 48-h, 72-h and 96-h time points, we continued to observe distinct cell line specific neural trajectories, which indicated that all the tested stimuli were distinguishable from each other at different time points of cell growth (Fig. 3a-d). This observation demonstrated that cultured cells started emitting VOCs specific to their identity early (~ 24-h) and remained separable over multiple days based on their emitted VOC profiles. Next, we analyzed the neural cluster separation between the three oral cancer cell lines, the non-cancer cell line, and the control medium at different time points of the cultures using LDA (Fig. 3e-h). Since the number of PNs recorded at each time point was low, PN response clusters showed some overlap in the LDA space. This overlap was also reflected in the bin-wise confusion matrix results obtained in the high-dimensional space (Fig. 3i-l). However, the trial-wise analysis yielded 100% classification success for each test trial for all the VOCs at all four time points (Fig. 3m-p). Note that we generated VOC-specific neural fingerprints at each time point of the cell cultures and performed leave-one-trial-out cross validation between the test trials and the training templates generated at the same time point.
These results validated our hypothesis that neural response-based classification of cancer VOCs is unaffected by the variations in chemical background caused by evolution of cancer cells in the culture medium. Considering that fluctuations due to internal and external factors are a problem for current breath sensing technologies, the ability to differentiate all four cell cultures over days is a unique feat achieved by this approach.
Neural response-based classification of cancer VOCs is fast
We investigated how short a VOC exposure can be while still yielding robust cancer classification. We hypothesized that a neuron response-based classification approach would be fast and able to classify different VOCs with a short inter-stimulus interval (~ 1 minute). Based on the fast PN response dynamics, we anticipated that distinction between cancer VOCs would be achieved within a few hundred milliseconds of stimulus exposure. To achieve fast analyses of neural signals, we employed a different metric of neural response, which was obtained by root mean squared (R.M.S.) filtering of raw neuron voltage responses (Fig. S5, see Methods). Until now, all classification analyses were performed after spike-sorting of multi-unit extracellular voltage responses obtained from each recording location. However, this approach eliminated neurons that did not pass the statistical test necessary to be counted as single units. These lost signals from unresolved neurons could potentially be important for odor discrimination; we therefore employed the R.M.S.-based approach, which takes into account the total energy of the signal acquired from each location. This approach was computationally less expensive and unsupervised, and has been shown to be odor specific in our earlier work 22. Using the R.M.S. filtered population PN voltages, we observed distinct classification of all 7 VOCs tested (Fig. S6). These classification results were qualitatively similar to the results obtained from spike sorted single unit data.
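One simple form of R.M.S. filtering is sketched below. The 20 kHz sampling rate is taken from the Methods, whereas the 50 ms non-overlapping window is an assumption made here for illustration (chosen to match the bin size used for the spike-sorted data); the exact filter parameters are given in the Methods, and the function name is hypothetical.

```python
import numpy as np


def rms_envelope(voltage, fs=20000, window_s=0.05):
    """R.M.S. of a 1-D voltage trace in non-overlapping windows.

    fs: sampling rate in Hz; window_s: window length in s (assumed value).
    Trailing samples that do not fill a whole window are discarded.
    """
    v = np.asarray(voltage, dtype=float)
    w = int(round(fs * window_s))          # samples per window
    n = len(v) // w                        # number of complete windows
    segments = v[: n * w].reshape(n, w)
    return np.sqrt((segments ** 2).mean(axis=1))


# One second of a unit-amplitude 100 Hz sine sampled at 20 kHz.
t = np.arange(20000) / 20000.0
envelope = rms_envelope(np.sin(2 * np.pi * 100 * t))
print(envelope.shape)
```

Because the R.M.S. value reflects the total signal energy at a recording site, no spike sorting is needed before classification.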
To determine the speed and efficacy of this method, we performed VOC classification during four different 250 ms time segments of the 4 s stimulus presentation window (Fig. 4). The rationale behind choosing different time windows follows from the unique odor-evoked response dynamics of the projection neurons. PNs generally fire strongly with high spiking rates within the first ~1.5 s of stimulus onset, which is known as the ‘transient state’ 23,28,32,51. After about 2 s of stimulus exposure, the population PN firing rate converges to a stable firing rate, which stays above baseline firing but does not change significantly over the rest of the odor presentation duration. This is known as the ‘steady state’ response period. Our work and that of others has shown that odor-evoked transient PN responses are more discriminatory 22–24. Therefore, we expected the cell culture VOCs to display the best separation when the population PN responses are within the transient state. We observed that odor plumes took about 0.5 s to elicit spiking responses in PNs. This time corresponded to the delay between the final olfactometer valve opening and the odor plume hitting the antenna. Therefore, we chose the analysis time windows for transient PN response period as 0.5 – 0.75 s and 0.75 – 1 s and the steady state time windows as 2 – 2.25 s and 2.25 – 2.5 s (Fig. 4).
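The four analysis windows can be mapped onto time-bin indices as in the sketch below. It assumes 50 ms bins counted from the final valve opening, as in the spike-sorted analysis; the window labels and the helper name are hypothetical.

```python
BIN_S = 0.05  # assumed bin size (s), matching the spike-sorted analysis

# Window boundaries (s) relative to the final olfactometer valve opening.
WINDOWS = {
    "transient_1": (0.50, 0.75),
    "transient_2": (0.75, 1.00),
    "steady_1": (2.00, 2.25),
    "steady_2": (2.25, 2.50),
}


def window_bins(start_s, stop_s, bin_s=BIN_S):
    """Indices of the time bins that tile the interval [start_s, stop_s)."""
    return range(round(start_s / bin_s), round(stop_s / bin_s))


for name, (start, stop) in WINDOWS.items():
    print(name, list(window_bins(start, stop)))
```

Each 250 ms window therefore spans five consecutive bins, and classification can be restricted to the corresponding columns of the population response matrix.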
We performed PCA dimensionality reduction to visualize population neural trajectories, which showed distinct trajectories at the earliest of the time windows (0.5 to 0.75 s). The VOC-evoked neural trajectories remained distinct during both transient and steady state time epochs (Fig. 4a-d). Next, we performed the quantitative high dimensional confusion matrix analysis using leave-one-trial-out methodology. We observed better classification during transient state time windows compared to the steady state time windows, evident from the higher value of diagonal elements in the confusion matrix shown in Fig. 4e, f in comparison to Fig. 4g, h. Trial-wise classification also showed better predictability during transient state response periods (0.5 – 0.75 s and 0.75 – 1 s) compared to the steady state segments (2 – 2.25 s and 2.25 – 2.5 s, Fig. 4i-l). Finally, when we compared the pairwise R.M.S. response distances of the PN population elicited by all 5 VOCs, we observed that the largest separation also occurred during the transient periods. Together, these results demonstrated that neural response-based cancer classification is fast and only requires 250 ms of neural data from stimulus onset to distinguish oral cancers from controls.
To verify that a one-minute inter-stimulus interval is sufficient for the VOC classification and our results are consistent with the PN response dynamics, we employed the R.M.S.-based classification analysis on the baseline, transient, and steady state epochs of the population PN response (Fig. S7). Each analysis epoch was 1.5 s in duration and the 0.5 s delay for the odor stimulus to reach the antenna was included in the pre-stimulus period. We observed no classification in the baseline period (−1 to 0.5 s), but VOC classification was distinct during the transient (0.5 to 2 s), and steady state (2 to 3.5 s) periods. Overall, VOC-evoked neural responses during the transient period yielded best classification results as expected from the PN response dynamics 24.
Discussion
Current state-of-the-art gas sensing technologies (e.g., GC-MS) analyze a gas mixture to identify individual chemical components and their concentrations. GC-MS has shown promise as a diagnostic technology for a number of diseases including asthma, COPD, cystic fibrosis, diabetes, and cancer 11,54–60. However, there does not exist a single VOC biomarker that is indicative of a specific type of cancer. Instead, subtle changes in VOC compositions indicate altered metabolic processes corresponding to a particular cancer. Moreover, chemicals such as nitric oxide, nitrogen dioxide, and ethane have been observed to be key biomarkers in a number of diseases, yet are difficult to detect with GC-MS due to the lag in sampling-to-processing time 61. This component wise VOC mixture classification approach is also hindered by the variability in VOC compositions in exhaled breath between individuals (internal factors) as well as due to the presence of any background odorants (external factors) in the environment.
While GC-MS has proven essential for volatile chemical identification, the desire for point-of-care clinical implementation has fueled interest in developing low-cost, portable sensors 12,13. Electronic noses have increased in popularity owing to advancements in materials science, nanotechnology, and pattern recognition algorithms. These devices have demonstrated the ability to distinguish between the ‘breath prints’ of healthy controls and those afflicted with diseases, such as cystic fibrosis 62, cancer 63–65 and others 65–69. These point-of-care sensors lack GC-MS’ exceptional chemical sensitivity, typically only achieving detection thresholds in the parts-per-million and parts-per-billion ranges. Although electronic noses have shown promise in classifying some diseases, sensitivity is of concern when considering clinical implementation, as endogenous volatile compounds are typically found in the parts-per-billion to parts-per-trillion range in exhaled breath 70,71. Moreover, differences in environmental volatiles between patient sampling can lead engineered sensors to produce false classifications and diagnoses 72. Therefore, a background-invariant chemical sensing system is essential for the evolution of breath-based clinical diagnostics.
In biological olfaction, natural selection has forced animals to develop highly sensitive olfactory capabilities while preserving chemical specificity. In the olfactory sensory system, a target VOC mixture as a whole is encoded by a distinct neuronal response template (or a neuronal ‘fingerprint’ of a VOC), while a different gas mixture is uniquely encoded by a different neuronal fingerprint. It is important to note that biology does not perform component-wise classification of gas mixtures, but instead achieves optimal separation between the VOC-evoked neural fingerprints. Through experience, biological systems assign meaning to those neuronal fingerprints (e.g., food vs. harmful odors). For example, the implementation of a neuronal template-based classification approach enables honeybees to detect minute changes in odor mixtures, such as differentiating between a flower with nectar vs. the same flower without nectar based on smell alone. We utilized these biological neural coding schemes in the brain-based cancer detection approach. Based on the odor-specific population neuron responses, we first constructed neuronal fingerprints for the target VOC mixtures (training templates, Fig. 5). While testing an unknown VOC sample, we recorded the responses of the neuronal population and determined how well that test template matched the pre-established training templates (e.g., by measuring Euclidean distance). Based on the best match between the training and testing templates, we determined the identity of the unknown odor (Fig. 5). Note that the performance of this approach does not depend on neuronal identity, which will vary across experiments. This approach is based on recording from a large neuronal population (~40-50 PNs), finding distinct training templates using known volatiles (calibration process), and then testing unknown VOCs to determine their identity using biological neural computational schemes.
There is overwhelming evidence that cell metabolism is altered in cancer cells relative to normal cells as they switch from glycolysis to oxidative phosphorylation (OXPHOS) for energy production leading to changes in VOC compositions that are vented in exhaled breath of patients 1,6,7,73. We have shown that the cancer cell lines in this study demonstrated functional increases in both glycolysis and OXPHOS relative to the non-cancer cell line 52. Such metabolic profiles are only now being characterized in cancer biology, and a similar coincident increase in glycolysis and OXPHOS has only been documented in a resistant clone of PDAC cancer stem cells (CSC) where suppression of the c-myc oncogene and an increase in peroxisome proliferator-activated receptor-gamma coactivator-1 alpha (PGC-1α) underlie both OXPHOS and glycolysis 74. These metabolic differences account, in part, for the variations in VOCs from these cells. We observed that all the cell lines were healthy throughout the time of culture; however, cell proliferation rates were not optimal (Fig. S4). In order to retain maximal VOC-populated headspace, the cells were grown in a closed flask and without changing the culture medium. HEPES buffer was added to the medium to maintain the optimal pH in the airtight culture. Use of one medium for all cell lines was essential for comparison to the medium-only control but may have impacted how the various cell lines grew in culture.
The concept of using canines as a way to detect disease began in 1989 when a dog alerted its owner to a suspicious mole that turned out to be cancer 75. Since then, canines have participated in a plethora of studies in cancer detection, including continued studies for melanoma 20, bladder and prostate cancer via urine 17,76–78, ovarian carcinomas from both blood and ex vivo samples 79, and breast and lung cancers from exhaled human breath 16,19. Canines have also been used to detect other conditions, such as hypoglycemia in patients with type I diabetes 80, to identify stool samples containing Clostridium difficile from patients admitted to a hospital 81, and even as a diagnostic tool for COVID-19, correctly identifying infected patients from patients’ clothing, masks, or breath samples 82. The successes in disease detection by canines led researchers to look into other animals, such as the African giant pouched rat, which has been used successfully to identify tuberculosis in human sputum samples 83,84. Another alternative has been to use insects. Researchers have successfully used the honeybee proboscis extension reflex (PER) to detect both COVID-19 from mink throat swabs 85 and tuberculosis odor biomarkers 86. Cell lines of human breast and lung cancers have also been distinguished from healthy tissue by both fruit flies and ants 87,88.
However, all these studies used behavioral training of animals, which has a limited binary output and can be impacted by the animals’ innate behavioral preferences. Until now, biologically inspired VOC detection efforts have mainly been directed towards reverse-engineering the biological olfactory system’s functionality and implementing those rules in e-nose devices. Some research groups have integrated a few live olfactory sensory receptors in engineered platforms 89, but those devices still lack chemical discriminability and long-term performance. For example, e-nose devices have recently incorporated live biological olfactory receptors as chemosensors 14,90–94, including insect olfactory receptors 95–99. However, the integration of biological sensors into engineering platforms has proven challenging 13. Overall, it has become evident that it will be challenging to reverse-engineer highly efficient and intricate biological olfactory sensory systems for diagnostic purposes anytime soon, given that biology has perfected these systems over millions of years of evolution.
Here, we took a forward engineering approach by ‘hijacking’ an insect brain to detect oral cancers from their VOC signatures. We combined in vivo, multi-electrode, population neuronal recordings with a multi-channel micro-amplifier, high speed data acquisition, and biological neural computations to achieve noninvasive cancer detection. This approach is fundamentally different from current gas sensing devices and animal behavior-based disease detection as it uses a fully functional biological chemosensory array (antennae) and olfactory neural circuits as a gas sensor, and neuronal ‘fingerprints’ of cancer VOC profiles as decoding schemes. We envision this study as the first step in ‘sniffing out cancer by the insect brain’ research that we can employ to detect cancer from human breath. Here, we have performed neural recordings from the brain of live animals whose chemosensory array (antennae) were exposed to VOC mixtures produced by cancer cells in culture. This in vivo neural recording technique can be portable as shown in our previous work 22. In the future, we plan to employ an antennae-attached-whole-brain (without body) in a portable and closed chamber that prolongs brain viability. This cyborg VOC sensing device will be ideal for real-time analysis of breath samples while its rapid detection ability will promote high throughput screening of a large number of VOC samples. Our next objective will be to increase the neural recording capacity and extend the brain viability for several days in a closed chamber as we progress towards the development of a portable, one-shot, point-of-care brain-based VOC sensor.
Methods
Electrophysiology experiments
All neural recordings were conducted on post-fifth instar locusts (Schistocerca americana) of either sex raised in a crowded colony. For in vivo extracellular recordings, locusts were immobilized on a surgical platform and antennae were stabilized. Surgery was conducted following a previously published method 24,48. Briefly, a batik wax bowl was constructed to isolate the head region and subsequently filled with a room temperature, physiologically balanced locust saline solution. Exoskeleton and glandular tissue were removed until the brain was fully apparent, and the antennal lobes were desheathed following treatment with protease. A commercial Neuronexus 16-channel silicon probe (A2×2-tet-3mm-150-150-121) with impedances between 200 and 300 kΩ was used for PN recordings. Voltage signals from PNs were recorded by inserting electrodes about 100 μm into the antennal lobe. A silver-chloride ground wire was placed in the saline bath. Voltage signals were sampled at 20 kHz and digitized via an Intan pre-amplifier board. The digital signals were transmitted to a recording controller and successively visualized and stored using the Intan graphical user interface.
Odor stimulation
A commercial olfactometer (Aurora Scientific, 220A) was used for precision odor stimulus delivery. Purified, zero-contaminant air was used as the carrier stream. Throughout the entirety of the experiment, a constant 200 sccm air flow was passed to the locust antenna via a 1/16 in. diameter PTFE tube, positioned approximately 2-3 cm from the last antennal segment.
During stimulus exposure, 40% (80 sccm) of the carrier stream clean air was replaced with the cell culture VOCs or other odorants. To eliminate any neural response due to sudden changes in airflow during odor delivery and removal, we kept the total volume of the airflow constant before, during, and after odor delivery. Stimulus duration for all experiments was 4 s. Each stimulus was repeated five times with an interstimulus interval of one minute. A 6” diameter funnel pulling a slight vacuum was placed immediately behind the locust antennae to ensure swift removal of odorants. The order of odor stimuli was pseudorandomized for each experiment.
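The delivery protocol above can be summarized in a short sketch. It uses only the values stated in the text (200 sccm total flow, 40% odor fraction, five repeats, one-minute interval). Two points are assumptions and not taken from the source: that the interstimulus interval is measured onset to onset, and that the five repeats of a stimulus are presented as one block after the stimulus order is shuffled. The names `flow_split` and `build_schedule` are hypothetical.

```python
import random

TOTAL_FLOW_SCCM = 200   # constant total flow reported in the text
ODOR_FRACTION = 0.40    # share of the carrier stream replaced during a stimulus
N_REPEATS = 5           # presentations per stimulus
ISI_S = 60              # interstimulus interval (assumed onset to onset)


def flow_split(total_sccm=TOTAL_FLOW_SCCM, odor_fraction=ODOR_FRACTION):
    """Return (clean_air_sccm, odor_sccm) during a stimulus; their sum stays constant."""
    odor = total_sccm * odor_fraction
    return total_sccm - odor, odor


def build_schedule(stimuli, n_repeats=N_REPEATS, isi_s=ISI_S, seed=0):
    """Shuffle the stimulus order, then list every trial with its onset time.

    Blocked repeats are an assumption; the source only says the order of
    odor stimuli was pseudorandomized.
    """
    rng = random.Random(seed)
    order = list(stimuli)
    rng.shuffle(order)
    schedule = []
    onset = 0
    for name in order:
        for rep in range(n_repeats):
            schedule.append({"stimulus": name, "trial": rep + 1, "onset_s": onset})
            onset += isi_s
    return schedule


clean_sccm, odor_sccm = flow_split()
schedule = build_schedule(["SAS", "Ca9-22", "HSC-3", "HaCaT", "medium"])
```

Keeping the clean-air and odor flows as a pair that always sums to the total mirrors the constant-airflow constraint described in the text.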
Cell culture
Human oral squamous cell carcinoma (OSCC) cell lines derived from the gingiva (Ca9-22), tongue (SAS), and a site of lymph node metastasis in tongue (HSC-3) were obtained from the Human Science Research Resources Bank (Osaka, Japan). The immortalized normal human epidermal keratinocyte cell line (HaCaT) was obtained from Cell Lines Service (Eppelheim, Germany). As a non-cancer control, HaCaT cells were chosen because they are a nontransformed, immortalized, non-tumorigenic cell line and are widely used to mimic normal stratified squamous epithelium of the oral mucosa 100,101. The cells were all seeded at a density of 1 × 10⁶ in T-25 flasks (Nunc™ EasYFlask™ 156340, Thermo Fisher Scientific, MA, USA) with airtight caps. Airtight T-25 flasks were constructed prior to the start of any experiment. Inlet and outlet 19-gauge needles were inserted into each flask and stabilized using a low-volatile, two-part epoxy at least 24 h prior to cell seeding. All the cells were cultured at 37°C in 5% CO2 using 5 mL of Dulbecco's modified Eagle medium (DMEM, Thermo Fisher Scientific, MA, USA)–high-glucose (4500 mg of D-glucose/liter) medium with 25 mM HEPES and supplemented with 10% fetal bovine serum (FBS) (Biowest, France), and 1% penicillin/streptomycin (Thermo Fisher Scientific, MA, USA). HEPES was used for maintaining the pH values of the cell culture medium. Cells were allowed to grow for four consecutive days and electrophysiological data were collected at each 24-h timepoint post-seeding. Seven replicates of the four-day experiment were conducted. Flasks were maintained at a regulated temperature of 37°C and only removed while conducting experiments (less than 10 minutes). Five mL of the same cell culture medium was also placed in an identical T-25 flask and kept in the same conditions as the cell cultures. Hexanal and undecane (1% v/v in 5 mL mineral oil) were kept in identical T-25 flasks and maintained at 37°C during experiments.
Cell culture imaging and cell counting
Prior to each electrophysiology experiment, cell cultures were imaged using an optical microscope (Olympus CKX53). A total of ten images were taken at different pseudorandom locations throughout each flask. Cells were manually counted from every image (n = 1120 total images) using FIJI/ImageJ. The images from each flask taken at each 24-h timepoint were averaged and then converted to the total cell count in each flask. Mean and standard error of the mean (S.E.M.) were calculated for the total cell counts for each timepoint across seven replicates. One-way ANOVA with Bonferroni correction due to multiple comparisons was then used to determine if the cell count at each 24-h timepoint had statistically significant differences (P < 0.05, d.f. = 6, 16, one-way ANOVA with Bonferroni correction).
Data analyses
Data was imported into MATLAB and high pass filtered using a Butterworth filter to remove any frequency components below 300 Hz. The data was analyzed by custom-written code in MATLAB.
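The source does not state the filter order or the MATLAB functions used. As a dependency-free illustration of the high-pass step only, the following sketch implements a first-order Butterworth high-pass filter (bilinear transform with frequency prewarping) at the 300 Hz cutoff and 20 kHz sampling rate given in the text. The first-order design and the function name `highpass_first_order` are assumptions; this is not the authors' code.

```python
import math


def highpass_first_order(samples, cutoff_hz=300.0, fs_hz=20000.0):
    """First-order Butterworth high-pass via the bilinear transform.

    Transfer function: H(z) = b0 * (1 - z^-1) / (1 + a1 * z^-1),
    with k = tan(pi * fc / fs), b0 = 1 / (1 + k), a1 = (k - 1) / (k + 1).
    """
    k = math.tan(math.pi * cutoff_hz / fs_hz)
    b0 = 1.0 / (1.0 + k)
    a1 = (k - 1.0) / (k + 1.0)
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        # Difference equation: y[n] = b0 * (x[n] - x[n-1]) - a1 * y[n-1]
        y = b0 * (x - prev_x) - a1 * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out


# A constant (DC) offset is removed, while a signal at the Nyquist
# frequency passes with unit gain once the start-up transient has decayed.
dc_out = highpass_first_order([1.0] * 2000)
nyquist_out = highpass_first_order([(-1.0) ** n for n in range(2000)])
```

A higher-order design would give a steeper roll-off below 300 Hz; the first-order form is used here only because it can be written out and checked by hand.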
Spike sorting
For spike sorting analysis, all data were processed with Igor Pro using previously described methods 53. Detection thresholds for spiking events were between 2.5 and 3.5 standard deviations (SD) of baseline fluctuations. Single PNs were identified if they passed the following criteria: cluster separation > 5 SD, inter-spike intervals (ISI) < 10%, and spike waveform variance < 10%. A total of 194 PNs were identified using spike sorting from 23 locusts.
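The thresholding step alone can be illustrated as follows. This sketch covers only event detection at a multiple of the baseline SD; it does not reproduce the Igor Pro clustering or the acceptance criteria listed above. Detecting upward crossings only, the refractory period of 20 samples (1 ms at 20 kHz), and the name `detect_threshold_crossings` are assumptions made for illustration.

```python
import statistics


def detect_threshold_crossings(trace, baseline, k_sd=3.0, refractory=20):
    """Return sample indices where the trace rises through mean + k_sd * SD.

    trace: filtered voltage samples to search.
    baseline: samples from a stimulus-free period used to estimate noise.
    refractory: minimum spacing between events in samples (hypothetical value).
    """
    mu = statistics.fmean(baseline)
    sd = statistics.pstdev(baseline)
    threshold = mu + k_sd * sd
    events = []
    last_event = -refractory
    for i in range(1, len(trace)):
        crossed = trace[i - 1] < threshold <= trace[i]
        if crossed and i - last_event >= refractory:
            events.append(i)
            last_event = i
    return events


# Toy example: baseline noise with SD 1, three large deflections in the trace.
baseline = [(-1.0) ** n for n in range(1000)]
trace = [0.0] * 100
trace[10] = trace[50] = trace[55] = 5.0   # index 55 falls inside the refractory period
events = detect_threshold_crossings(trace, baseline)
```

Varying `k_sd` between 2.5 and 3.5 corresponds to the threshold range reported in the text.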
Scatter plots
The total number of spikes for each PN during the 4 s of odor stimulus presentation was computed for each trial (5 trials per stimulus). The mean spike counts ± S.E.M. across trials for each PN were then plotted for two stimulus conditions along the X- and Y-axes (e.g., SAS spike counts vs. culture medium spike counts). One-way ANOVA with Bonferroni correction for multiple comparisons was then used to determine whether each neuron had statistically significant differences in mean spike counts between conditions (P < 0.05, d.f. = 4, 28, one-way ANOVA with Bonferroni correction). Neurons with a statistically significant increase/decrease in spikes along the vertical axis compared to the horizontal axis were plotted in red/blue, respectively. Statistically nonsignificant differences were plotted in grey (Fig. 1e).
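The per-neuron quantities plotted in the scatter plots can be sketched as below. The code counts spikes in the 4 s stimulus window for each trial and reports the mean and S.E.M. across trials; the ANOVA step is not reproduced. Spike times are assumed to be given in seconds relative to stimulus onset, and the function names are hypothetical.

```python
import math
import statistics


def count_spikes(spike_times_s, onset_s=0.0, duration_s=4.0):
    """Count spikes inside the stimulus window [onset, onset + duration)."""
    return sum(1 for t in spike_times_s if onset_s <= t < onset_s + duration_s)


def mean_sem(counts):
    """Mean and standard error of the mean (sample SD / sqrt(n)) across trials."""
    n = len(counts)
    return statistics.fmean(counts), statistics.stdev(counts) / math.sqrt(n)


# One PN, five trials of one stimulus (toy spike times in seconds).
trials = [
    [0.1, 0.5, 2.0, 4.5],
    [0.2, 1.1, 3.9],
    [-0.3, 0.4, 0.9, 1.7, 2.2],
    [0.6, 3.1],
    [0.05, 0.8, 1.6, 2.4, 3.3],
]
counts = [count_spikes(t) for t in trials]
mean_count, sem_count = mean_sem(counts)
```

Each PN contributes one such mean ± S.E.M. pair per stimulus, giving one point with error bars in a plot of two stimulus conditions.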
R.M.S. transformation of PN voltage response
The filtered data were trimmed to the time window of interest. All data were passed through a 500-point continuous moving R.M.S. filter followed by a smoothing step via a 500-point continuous moving average filter. Stimulus-specific baseline values were calculated as the average voltage over all time bins for the 2 s prior to stimulus onset. Baseline responses were averaged over all trials and subsequently subtracted from the data to obtain the ΔR.M.S. values. These values were then binned according to the specified bin size and the average of each bin was computed. For each recording location, R.M.S.-transformed voltage data of each tetrode were averaged together (Fig. 4).
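The transformation described above can be sketched in a few functions. The 500-point window lengths come from the text. The use of trailing windows that shrink at the start of the trace, the default bin size of 1000 samples (50 ms at 20 kHz), and all function names are assumptions for illustration.

```python
import math


def moving_average(values, window=500):
    """Trailing moving average; the window shrinks at the start of the trace."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]
        out.append(running / min(i + 1, window))
    return out


def moving_rms(values, window=500):
    """Trailing moving root-mean-square, built from a moving average of squares."""
    mean_sq = moving_average([v * v for v in values], window)
    return [math.sqrt(max(m, 0.0)) for m in mean_sq]


def delta_rms(trials, onset_idx, baseline_len, window=500, bin_size=1000):
    """Return one binned, baseline-subtracted R.M.S. response per trial.

    trials: equal-length filtered voltage traces for one stimulus.
    onset_idx: sample index of stimulus onset.
    baseline_len: number of pre-stimulus samples used as baseline
                  (2 s = 40000 samples at 20 kHz in the text).
    """
    smoothed = [moving_average(moving_rms(t, window), window) for t in trials]
    # Baseline: mean over the pre-stimulus window, then averaged across trials.
    per_trial = [sum(s[onset_idx - baseline_len:onset_idx]) / baseline_len
                 for s in smoothed]
    baseline = sum(per_trial) / len(per_trial)
    binned = []
    for s in smoothed:
        delta = [v - baseline for v in s[onset_idx:]]
        binned.append([sum(delta[i:i + bin_size]) / bin_size
                       for i in range(0, len(delta) - bin_size + 1, bin_size)])
    return binned


# Toy trace: amplitude steps from 1 to 3 at stimulus onset (sample 200).
toy_trial = [1.0] * 200 + [3.0] * 200
toy_result = delta_rms([toy_trial, toy_trial], onset_idx=200,
                       baseline_len=100, window=10, bin_size=50)
```

In the toy example the late bins settle at a ΔR.M.S. of about 2, the difference between the post-onset and baseline amplitudes, while the first bin is lower because both moving windows still include pre-onset samples.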
Dimensionality reduction analyses
We performed two methods of dimensionality reduction: Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). In PCA, we binned baseline-subtracted, spike-sorted PN signals in 50 ms non-overlapping time bins and averaged over trials (n = 5, each stimulus was repeated 5 times with a 1 min inter-stimulus interval). The baseline response was calculated for each PN by averaging the firing rate over 2 s time windows immediately before stimulus presentation across trials. Recorded PNs were pooled across multiple experiments. For example, in Fig. 2, spike-sorted and binned responses of all recorded PNs (194 total) over 24 to 96 h of cell culture were combined to generate a PN number (n = 194) × time (t = 80) matrix, where each element in the matrix corresponds to the spike count of one PN in one 50 ms time bin. Similar PN population time-series data matrices were generated for each stimulus. PCA was performed on the time-series data involving 5 odorants (SAS, Ca9-22, HSC-3, HaCaT, and culture medium) and directions of maximum variance were found (Fig. 2). The resultant high-dimensional vector in each time bin was projected along the eigenvectors of the covariance matrix. Only the three dimensions with the highest eigenvalues were considered for visualization purposes and data points in adjacent time bins were connected to generate low-dimensional neural trajectories. The trajectories were smoothed using a third order IIR Butterworth filter (Half Power Frequency = 0.15). Finally, all trajectories were shifted to begin at the origin to examine stimulus-specific response dynamics and trajectory divergence. A similar approach was used for PCA in Fig. 3, except recorded PNs were separated based on the cell culture time points (e.g., 24, 48, 72, and 96 h) and PCA was done separately for each cell culture time point. For LDA, the same population PN time-series data matrix was used.
Here, we maximized the separation between classes while minimizing the within-class distances. To visualize the data, time bins were plotted as unique points in this transformed LDA space and stimulus-specific VOC clusters became readily apparent (Fig. 2, 3). The same PCA and LDA analyses were applied to the baseline-subtracted, R.M.S.-transformed population PN time-series data (Fig. 4).
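Both PCA and LDA start from the same PN × time matrix, whose construction can be sketched as follows. The code bins spike times into 50 ms non-overlapping bins, averages over trials, and subtracts a per-PN baseline estimated from the 2 s before stimulus onset. The PCA and LDA projections themselves are not implemented here. Spike times are assumed to be in milliseconds relative to stimulus onset, and all function names are hypothetical.

```python
def bin_counts(spike_times_ms, start_ms, stop_ms, bin_ms=50):
    """Histogram spike times into non-overlapping bins covering [start, stop)."""
    n_bins = int((stop_ms - start_ms) // bin_ms)
    counts = [0] * n_bins
    for t in spike_times_ms:
        if start_ms <= t < stop_ms:
            idx = int((t - start_ms) // bin_ms)
            if idx < n_bins:          # guard for windows that are not a bin multiple
                counts[idx] += 1
    return counts


def pn_response_row(trials_ms, stim_ms=4000, baseline_ms=2000, bin_ms=50):
    """Trial-averaged, baseline-subtracted binned response of one PN.

    trials_ms: one list of spike times per trial (ms, stimulus onset at 0).
    Returns stim_ms // bin_ms values (80 for a 4 s stimulus and 50 ms bins).
    """
    n_trials = len(trials_ms)
    evoked = [0.0] * (stim_ms // bin_ms)
    baseline_total = 0
    for spikes in trials_ms:
        for i, c in enumerate(bin_counts(spikes, 0, stim_ms, bin_ms)):
            evoked[i] += c / n_trials
        baseline_total += sum(bin_counts(spikes, -baseline_ms, 0, bin_ms))
    # Mean pre-stimulus spike count per bin, averaged across trials.
    baseline_per_bin = baseline_total / (n_trials * (baseline_ms // bin_ms))
    return [e - baseline_per_bin for e in evoked]


def population_matrix(pns):
    """Stack one row per PN to obtain the PN x time matrix for one stimulus."""
    return [pn_response_row(trials) for trials in pns]


# Two toy PNs, two trials each.
toy_pns = [
    [[-1500, -500, 10, 20, 3990], [-100, 60, 3999]],
    [[100, 200], [150]],
]
toy_matrix = population_matrix(toy_pns)   # 2 rows x 80 columns
```

Each column of such a matrix is the population response vector for one time bin, which is the object that is projected onto the principal components or into LDA space.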
Classification analysis
To obtain a quantitative estimate of classification performance, we performed leave-one-trial-out cross-validation. During each iteration, population PN time-series data from one trial were used as the test data and the remaining trials were used to train a linear classifier (total 5 trials for each stimulus). The linear algorithm aimed to generate a model based on the training data set within the original high-dimensional encoding state space to effectively classify testing data. By considering time bins as points in a high-dimensional space, we were able to calculate an average response vector for the training data corresponding to each stimulus. These neural templates were then used to classify individual time bins of each testing data set. Each time bin of the testing data was assigned to the class whose average training response vector was closest under the chosen norm. The Euclidean (L2) norm was used as the distance metric in most cases. For results involving R.M.S.-transformed data, the Manhattan (L1) norm was selected as it outperformed the Euclidean norm in terms of classifier prediction accuracy (Fig. 4 and S6). Furthermore, a winner-take-all approach was also incorporated to calculate the most likely predicted class for each trial. This was performed by taking the mode of all predicted time bins as the trial-wise class identifier. Model performance was illustrated using a confusion matrix, which compared the predicted responses to the true class labels. A fully diagonal matrix indicates 100% classification accuracy.
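The procedure described above can be sketched as a minimal template-matching classifier. Several details are assumptions rather than statements from the source: the same trial index is held out for every stimulus in a given iteration, each template is the mean over all time bins of all training trials, and ties are broken by label order. The function names are hypothetical.

```python
from collections import Counter


def l2(a, b):
    """Euclidean distance between two population vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def l1(a, b):
    """Manhattan distance between two population vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def leave_one_trial_out(data, distance=l2):
    """Trial-wise confusion counts from leave-one-trial-out template matching.

    data[label][trial][time_bin] is a population vector (one value per PN).
    Returns {true_label: Counter({predicted_label: n_trials})}.
    """
    labels = list(data)
    n_trials = len(data[labels[0]])
    confusion = {label: Counter() for label in labels}
    for held_out in range(n_trials):
        # One template per stimulus from the training trials only.
        templates = {}
        for label in labels:
            train_bins = [vec for k, trial in enumerate(data[label])
                          if k != held_out for vec in trial]
            templates[label] = mean_vector(train_bins)
        for label in labels:
            # Classify every time bin, then take the mode (winner-take-all).
            votes = Counter(
                min(labels, key=lambda c: distance(vec, templates[c]))
                for vec in data[label][held_out]
            )
            confusion[label][votes.most_common(1)[0][0]] += 1
    return confusion


# Toy data: 2 stimuli, 3 trials, 4 time bins, 2 PNs.
toy_data = {
    "A": [[[1.0, 0.1 * k]] * 4 for k in range(3)],
    "B": [[[0.1 * k, 1.0]] * 4 for k in range(3)],
}
toy_confusion = leave_one_trial_out(toy_data)
toy_confusion_l1 = leave_one_trial_out(toy_data, distance=l1)
```

Passing `distance=l1` switches to the Manhattan norm that the text reports for R.M.S.-transformed data; a confusion table with counts only on the diagonal corresponds to 100% trial-wise accuracy.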
Supplementary figure captions
Contributions
D.S. conceptualized the study. D.S., A.F., M.P., E.H.A., and C.H.C. designed experimental plans. A.F. and M.P. conducted electrophysiology experiments. M.P., E.H.A., and N.L. performed cell culture and imaging studies. C.H.C. provided resources and supervised the cell culture work. Data analysis was done by A.F., M.P., and D.S. The locust colony was maintained by E.C. The paper was written by D.S., A.F., and M.P. All the authors contributed to review and editing of the manuscript. D.S. performed overall project supervision and administration.
Declaration of competing interests
The authors declare that they have no competing financial interests.