Abstract
Experimental records of single molecules or ion channels from fluorescence microscopy and patch-clamp electrophysiology often include high-frequency noise and baseline fluctuations that are not generated by the system under investigation and have to be removed. More-over, multiple channels or conductance levels can be present at a time in the data that need to be quantified to accurately understand the behavior of the system. Manual procedures for removing these fluctuations and extracting conducting states or multiple channels are laborious, prone to subjective bias, and hinder the processing of often very large data-sets. We introduce a maximum likelihood formalism for separating signal from a noisy and drifting background such as fluorescence traces from imaging of elementary Ca2+ release events called puffs arising from clusters of channels and patch-clamp recordings of ion channels. Parameters such as the number of open channels or conducting states, noise level, and back-ground signal can all be optimized using the expectation-maximization (EM) algorithm. We implement our algorithm following the Baum-Welch approach to EM in the portable java language with a user-friendly graphical interface and test the algorithm on both synthetic and experimental data from patch-clamp electrophysiology of Ca2+ channels and fluorescence microscopy of a cluster of Ca2+ channels and Ca2+ channels with multiple conductance levels. The resulting software is accurate, fast, and provides detailed information usually not available through manual analysis. Options for visual inspection of the raw and processed data with key parameters, and exporting a range of statistics such as the mean open probabilities, mean open times, mean close times, and dwell time distributions for different number of channels open or conductance levels, amplitude distribution of all opening events, and number of transitions between different number of open channels or conducting levels in asci format with a single click are provided.
Introduction
Recent advances in total internal reflection fluorescence microscopy (TIRFM) provide a pow-erful tool for the functional study of thousands of Ca2+ release channels simultaneously at a single channel resolution in minimally invasive manner, maintaining the physiological environment of the channels (1–3). This technique, called “optical patch-clamp”, has been used to study the Ca2+ flux through several individual channels including N-type voltage-gated Ca2+ channels (1, 4, 5), nicotinic acetylcholine receptors (2, 5), and L-type Ca2+ channels in cardiac muscle (5). Optical patch-clamp was also used to resolve the quantal substructure of elementary Ca2+ puffs and sparklets arising from concerted opening of multiple channels in a cluster of inositol 1,4,5-trisphosphate (IP3) receptors (IP3Rs) and dihydropyridine-sensitive voltage-gated CaV1.2 channels respectively, by directly resolving the opening and closing of individual channels in the cluster (6, 7). Recently, the same technique was employed to resolve the conductance levels in a single ion channel while simultaneously imaging thousands of Ca2+-permeable plasma membrane (PM) pores formed by beta amyloid (Aβ1-42) in the cells with Alzheimer’s disease pathology (8, 9).
Optical patch-clamp generates thousands of time-traces representing the gating of individual channels in a single experiment. However, manual analysis of the data generated is extremely tedious and challenging, which hinders the full utility of this powerful technique. The analysis of fluorescence traces from single channels with a single or multiple conductance levels or concerted opening of multiple channels in a cluster of channels (such as IP3Rs, ryanodine receptors, and dihydropyridine-sensitive voltage-gated CaV1.2 channels) obtained from TIRFM requires the removal of background noise and baseline fluctuations in the data records that are not generated by the channels under study. These fluctuations could arise either from the saturation of dye molecules or drift in measuring equipment itself. Since the data is expected to exhibit quantized steps, the signal and the data (and possibly the noise) have significant autocorrelation times, and simple filtering may not yield a clear separation of signal from background and fails to make use of the known quantal character of the data.
A similar situation arises while using other techniques such as single channel patch-clamp and single molecule fluorescence and photobleaching experiments, where analysis of the experimental data first requires the removal of noise and varying background that is not generated by the phenomenon under study (10–15). In the patch-clamp experiments (16), instabilities in the gigaohm seal formed between biological membrane patch and patch-clamp microelectrode often result in substantial fluctuations in the magnitude of the observed current and a drift in the recorded signal. Such baseline fluctuations are usually significantly slower than the abrupt changes in current magnitude caused by opening and closing of the channel, and have to be removed before the current traces can be analyzed further by standardized software such as QUB (17, 18) and HJCFit (19) that model the channel gating characteristics based on idealized data.
Existing baseline subtraction algorithms can require substantial time and/or user interaction on ion channel data as the program forces user to check each individual fit by eye. Without a robust baseline subtraction algorithm, some experimentalists perform this laborious task by hand where the experimenter follows the quantal jumps by eye and subtract the drifting background. Such manual background-subtraction procedure requires eye-balling and mouse clicking to indicate the background trajectory, which is time consuming and susceptible to subjective bias. To overcome these obstacles, we previously developed a minimally-parameterized likelihood (20) approach for separating the signal from noisy and drifting background, which gives results that agree with the mouse-based method and is much faster (21). However, the requirement to compile and execute the algorithm for processing the data proved a major hurdle for the experimentalists not familiar with compiled languages. While the main goal of this study was to extend the algorithm so that it can be used for both patch-clamp electrophysiology and fluorescence data, we develop a user-friendly graphical user interface (GUI) which is easy to use, flexible, and portable. Furthermore, we incorporate several new features in the software that were not included in the previous version. In addition to extracting idealized trace and drifting background, the software extracts many features from the data, including the mean open probabilities (PO), open times (τO), close times (τC), dwell-time distributions for different conductance levels or number of open channels, overall PO, τO, and τC in the trace, amplitude distribution of all events, and the transition frequencies between different conductance levels or channels open with a few simple steps. We demonstrate the utility of our software by processing patch-clamp data containing single and multiple channels per trace, fluorescence traces from TIRFM representing Ca2+ signals generated by concerted opening and closing of several channels in a cluster of IP3Rs, and fluorescence traces representing the Ca2+ flux through PM pores with multiple conductance levels, formed by Aβ1-42 oligomers associated with Alzheimer’s disease pathology.
Methods
In this section we first outline the main points of the algorithm (detailed derivation can be found in Ref. (21)) followed by details of how we generate synthetic records representing patch-clamp electrophysiology of a single ion channel with varying signal-to-noise ratio (SNR) as well as synthetic fluorescence traces representing the concerted opening and closing of multiple channels in a cluster of IP3Rs. A brief description of sample experimental methods and the data used to demonstrate the utility of the software is also given towards the end of this section.
Theory
We use the Expectation Maximization (EM) algorithm to estimate the number of open channels or conductance levels, which is treated as missing data. The EM algorithm was originally developed by Baum, Welch, et al in the 1960’s and formalized by Dempster et al (22–25).
Assuming that nt is the number of channels open or the conductance level in which the channel is gating and I is the current passing through the channel at time t, the observed signal at time t is given by dt = bt + Int + σξξt. Where ξt is discrete-time Gaussian white noise with moments < ξt >= 0, . Here δtt′ is the Kronecker delta and σξ is the noise strength. We further assume that the baseline is undergoing a discrete time random walk,
where
is discrete-time white noise like ξt. Under these assumptions, it follows that dt − bt + Int and bt − bt−1 are zero mean Gaussian distributed random variables with variances
and
respectively.
The joint distribution for d, b, and n is given as
𝓗 is an “energy” function given by:
where T is the number of data points, d = (d1,d2,..…, dT), b = (b1, b2,..…bT, n = (n1,n2,..…, nT), and θ represents the parameters
,
, and I. N is the normalization factor:
Where and Nch is an upper bound on the number of channels contained in the trace (number of channels patched or number of channels in a cluster of channels). For a single ion channel (or molecule) with multiple equally spaced conductance levels, Nch represents the total number of conductance levels that the system can have, where the closed state of the channel is represented by level 0. dd and db represent the differentials of all the variables being integrated over: e.g.
. In principal we would like to maximize the likelihood of the data given the model, L(d; θ)
but this is unwieldy. Although the integral over T dimensions can be dealt with, the sum is over at least 2T values and cannot be maximized in one step. Thus we employ the iterative EM procedure that attempts to find the maximum likelihood estimate of L by first making an initial guess for
and then iterating the following steps:
In the above denotes the expectation of log p(d, b, n; θ′) with respect to the distribution
The EM iteration is known to converge to at worst a local maximum likelihood estimate for L(d|θ).
Because of the simple dependence that the likelihood function has on b we can integrate b out and then employ a slightly modified version of the EM algorithm to solve the remaining maximum likelihood problem (see Ref. (21) for the derivation).
where b* is the maximum likelihood estimate of b found by minimizing H:
and Δ is the finite difference Laplacian: Δbi = bi+1 + bi−1 − 2bi. Thus
where
As written, the b* in the argument of p(d, n; θ, b*) in Eq (6) is redundant since b* is a known function of d, n, and θ. The reason we have employed this seemingly redundant notation is just to make clear in the algorithm which value of b* (old or updated) we used to compute the current expected values of n.
The normalization constant 𝓝 is given by (see Ref. (21) for the derivation):
The distribution needed for Q in Eq (5), becomes
after integrating out b. Here
is the initial guess for the background signal. To find
we note that in general p(x|y) = p(x,y)/p(y). Thus:
The quantity is the conditional probability that there are nt channels open at the single time point t. To simplify the notation, we write
to indicate the conditional probability that there are j channels open at time t.
where
, m = 0,1,2,3,…Nch. We will use angle brackets without subscripts to denote expectations with respect to
so that
For a single channel or single molecule recording, Nch = 1 and Eq (11) results in 〈nt〉 =
At this point we can write Q explicitly:
The dependence on is only through b* and the angle brackets. The end result of this calculation is that we replace the dependence in 𝓗 on powers of nt with the expected value of those powers: nt → 〈nt〉 and
We maximize
by updating θ iteratively in the following steps.
Step 0: (Initialization) Make initial guess for , I, and
Step I: (Estimate or Expectation) Update the expected values of nt and with respect to p(j|dt, θ, t) which we will denote: 〈nt〉 and
.
At the end of this step nt and in 𝓗 will be replaced by 〈nt〉 and
respectively.
Step II: (Maximize)
A. Update baseline signal
We obtain b* with a standard tridiagonal matrix solver which is very fast* Note the dependence in b* is on rather than on θ. In steps B, C, and D we update the parameters I,
and R2 while keeping b* fixed.
B. Update 𝓘.
The derivative of Q with respect to 𝓘 is:
To maximize Q with respect to 𝓘 we set the above equation to zero which yields:
C. Update
The derivative of Q with respect to is:
To maximize Q with respect to we set the above equation to zero which yields:
D. Update R2
The derivative of Q with respect to R2 is: where
The updated R2 is the single real positive root of
. At this point
.
Return to Step I.
To summarize, we iterate the steps described above in the order Step 0 → Step I → Step II → Step I. So, in the E step, we compute the Nch expectation values of nt for each t and in the M step, we update the parameters based on maximizing Q in Eq. 6.
The algorithm is implemented in widely portable java language in a user-friendly GUI with options to setup initial parameters and graphical display of raw and processed traces. Options to store statistics on the processed data including mean PO, τO, τC, dwell-time distributions for different conductance levels or number of open channels for the processed trace are provided. Information regarding the amplitude distribution of all opening events as well as the frequency of transitions between different number of open channels or conductance levels are also provided. An example of algorithms GUI is shown in Fig 1. A compiled and portable java program with the GUI that performs these calculations and user manual will be provided in the online supplement after the paper is accepted for publication.
Graphical user interface of TraceSpecks. Various controls for reading, displaying, and saving data sets are provided. Text controls are used to provide input parameters for the algorithm as well as displaying the optimum parameters and results determined by the algorithm. Raw trace, background noise signal, and idealized signals can be viewed in the graphical panel of the GUI for visual inspection and can be zoomed in and out as needed.
Data Records
We apply our method to synthetic and experimental data sets from both TIRFM experiments and patch-clamp recordings. Our synthetic data sets comprise of two examples, one representing patch-clamp experiments with variable SNR and another one mimicking florescence traces from a cluster of IP3Rs (26). Experimental data set consists of nuclear patch-clamp recordings from IP3 in Sf9 cells (27), florescence recordings from IP3-induced Ca2+ puffs (28), and Αβ-induced Ca2+-permeable PM pores (8) in Xenopus oocytes.
Synthetic data sets
Details of patch-clamp synthetic data generation can be found in our previous article (21). Briefly, we use 10,000 open and closed events with duration of these events drawn randomly from a uniform distribution. The range of distribution for close events is 5 times that of open events. Channel’s PO is controlled by varying the range of the distribution from which the close time is drawn. We construct background signal to get final simulated channel trace dt = bt + 𝓘 nt + σξξt, where
and ξt are zero mean Gaussian noise with deviation σb and σξ respectively. In addition to background and signal noise, channel current 𝓘 is also treated as normal random variable of mean < 𝓘 > and deviation σξ. nt is the number of channels open at time t, which is 0 or 1 for current case but can be higher for multi-channel/multi-conductance levels synthetic data.
To test the robustness and accuracy of the software towards noise levels present in the experimental data, we generated single channel synthetic patch-clamp data with SNR ranging from 2.1 to 12.6 as described below.
As mentioned above, there are two noise sources in our model, ξt and :
where NO and NC are the mean number of sampling intervals for open and closed event respectively. SNRξ is a standard signal to noise ratio. SNRb indicates how much the baseline drifts relative to the current during the average open or closed event. SNR is the combined signal to noise ratio: SNR
Using
, we vary mean current 𝓘 from 20 to 120 (arbitrary current units) to get a SNR from 2.1 to 12.6.
We also process fluorescence data from a simulated cluster composed of 10 IP3R Ca2+ channels. Details about simulating a single cluster have been published before (26, 29, 30). Briefly, channels were arranged in a 2D array on a patch of endoplasmic reticulum (ER) membrane with an inter-channel spacing of 120nm. The gating of each channel is given by a four state Markov chain model with rest (R), active (A), open (O), and inactivate (I) states connected in a loop (Fig. 2 of Ref.(26)). The channel is open when in state O and closed otherwise. The state of each channel was determined by using a stochastic scheme outlined in (31).
TraceSpecks uses two schemes for calculating open and close dwell-times. As an example, we show an idealized trace with three channels/open conductance levels. (a) In scheme 1, toij and tcij represent the jth dwell-time when a minimum of i channels are open (or the channel is gating in conductance level i or above) and the jth dwell-time when a minimum of i channels are not open respectively. (b) In scheme 2, toij represents the jth dwell-time when i channels are open (or the channel is gating in conductance level i) and tclj represents the jth dwell-time when no channel is open (or the channel is in closed level).
Ca2+ concentration on the cytosolic side of the cluster was controlled by diffusion, flux through IP3Rs, free stationary buffers, mobile buffers, and imaging dye. We considered slow mobile buffer mimicking ethylene glycol tetraacetic acid (EGTA). The propagation of Ca2+ and buffers was solved implicitly using the Laplacian of Ca2+ and buffers in spherical coordinates on a hemispherical volume of radius 5μm and spatial grid size of 5nm. We estimated the fluorescence changes (ΔF/F0) from TIRF microscopy by following the procedure in (32, 33), i.e.
Where is the Ca2+-bound dye at distance rj(x,y,z) due to Ca2+ released by the jth channel at x0 = 0, y0 = 0,
and γz = 0.15 μm. The integration in Eq. 18 was performed over a cubic volume of 1.5 μm × 1.5 μm × 0.15 μm. To make the simulated TIRF signal as close to the experiment as possible, we added experimental noise and drifting background extracted from TIRF microscopy of puff sites in Xenopus oocytes to the simulated fluorescence data.
Experimental data sets
Experimental traces of current passing through single IP3R channels, which are ubiquitous intracellular Ca2+-release channels localized mainly to the ER and outer nuclear membranes (34), were acquired by nuclear patch-clamp electrophysiology as described in Refs.(27, 35, 36). Currents passing through outer nuclear membrane patches isolated at the tips of micro-pipettes were amplified using an Axopatch 200B patch-clamp amplifier (Molecular Devices, Sunnyvale, CA); filtered either at 1 kHz using a tunable low-pass four-pole Bessel filter (Frequency Devices, Ottawa, IL) or at 5 or 10 KHz using the internal tunable low-pass four-pole Bessel filter of the Axopatch amplifier. The current signals were digitized at 5 kHz using an ITC16 interface (HEKA Instruments, Bellmore, NY) and recorded directly onto a data acquisition computer using the Pulse + PulseFit software (HEKA Instruments, Bellmore, NY).
Full details of TIRFM experiments on IP3-induced puffs are given in Ref.(28). Briefly, records of IP3-induced puffs were obtained from Ca2+ imaging of Xenopus laevis oocytes at room temperature by widefield epifluorescence microscopy using an Olympus inverted microscope (IX 71) equipped with a 60xoil-immersion objective, a 488 nm argon ion laser for fluorescence excitation, and a CCD camera (Cascade 128+; Roper Scientific) for imaging fluorescence emission (510 - 600 nm) at frame rates of 30 to 100 s−1. Fluorescence was imaged within a 40 × 40 μm region within the animal hemisphere of the oocyte, and measurements are expressed as a ratio (ΔF/F0) of the mean change in fluorescence (ΔF) at a given region of interest (1.5 μm × 1.5 μm) centered on the putative IP3R cluster, relative to the resting fluorescence at that region before stimulation (F0). Mean values of F0 were obtained by averaging over several frames before stimulation. MetaMorph (Molecular Devices) was used for preliminary image processing. Fluorescence traces representing Ca2+ release events due to channels’ gating in the cluster were extracted from the image sequences using a previously developed software called CellSpecks that can detect all Ca2+ events in a video sequence recorded from cell’s membrane and extracts fluorescence time-traces for individual clusters. Further details about CellSpecks can be found in Ref. (8).
Detailed experimental methods for imaging the activity of PM pores formed by Aβ1-42 oligomers are given in Ref.(8). Briefly, solution containing soluble oligomers prepared from human recombinant Aβ1-42 peptide and aliquots were applied using a glass pipette of tip diameter of ≈ 30 μm to voltage-clamped oocytes of defolliculated stage VI Xenopus treated with fluo-4 dextran. For imaging, oocytes were placed animal hemisphere down in a chamber whose bottom is formed by a fresh ethanol washed microscope cover glass (type-545-M; Thermo Fisher Scientific) and were bathed in Ringer solution (110 mM NaCl, 1.8 mM CaCl2, 2 mM KCl, and 5 mM Hepes, pH 7.2) at room temperature (≈23°C) continually exchanged at a rate of ≈ 0.5 ml/min by a gravity-fed superfusion system. The membrane potential was clamped at a holding potential of 0 mV using a two-electrode voltage clamp (Gene Clamp 500; Molecular Devices) and was stepped to more negative potentials −100 mV when imaging Ca2+ flux through amyloid pores to increase the driving force for Ca2+ entry into the cytosol. The video sequences generated by TIRFM were processed by CellSpecks to extract fluorescence traces for individual Aβ1-42 pores.
Results and Discussion
We use TraceSpecks to idealize experimental time traces from patch-clamp experiments with single and multiple channels in the patch, TIRFM of Ca2+ puffs generated by clusters of IP3Rs, fluorescence traces representing the gating of Ca2+- permeable pores formed by Aβ1-42 oligomers in the PM, simulated TIRF signals from a cluster of IP3R channels as well as synthetic patch-clamp data with variable SNR. In this section, we first demonstrate the robustness and accuracy of our software using synthetic data with variable SNR and then apply it to other data sets mentioned above.
Before presenting our results, we would like to point out that the software uses two different schemes for reporting the dwell-time distributions, mean PO, τO, and τC for different conductance levels or number of open channels in the trace as shown in Fig. 2. Although we only present the dwell time distributions and other statistics from the first scheme (Fig. 2a) in the paper, TraceSpecks saves the results from both schemes in ascii format in separate files.
Synthetic data
As described in the “Methods” section, using , and varying mean current 𝓘 from 20 to 120 (arbitrary units), we have generated synthetic data sets with SNR ranging from 2.1 to 12.6. As shown in Fig. 3, TraceSpecks idealized these noisy synthetic patch-clamp traces with drifting and variable background, and accurately determined various parameters of interest for almost all values of SNR. A sample noisy trace with the background separated by the software and the idealized trace are shown in Fig. 3a, and c respectively. We also show the actual trace without the noise and background for comparison in Fig. 3b. We repeat this experiment for many traces with varying SNR, estimate different features, and compare these estimates with the actual values of these features. In Fig. 3d-f we show the mean squared error for τO, τC, and PO. Even for SNR as small as 2.1, the estimates for these parameters are very good. It is worth mentioning that the SNR in TIRFM experiments is close to 8 (37) while that for the patch-clamp electrophysiology is even higher (38). TraceSpecks extracts the exact values for different parameters of interest for traces with SNR of 5 or above (Fig. 3d-f).
Processing synthetic patch-clamp data with variable SNR of 2.1 – 12.6. (a) Sample raw trace with noisy and drifting background (blue line) and background separated by TraceSpecks (red line), (b) actual trace, and (c) idealized trace. Mean squared error for mean τO (d), mean τC (e), and mean PO (f).
Application to fluorescence data from a simulated cluster, composed of 10 IP3R channels demonstrates (black line in Fig. 4a.) that the software works well for multi-channel data too. The drifting background separated from the signal by the software is shown by the red line in Fig. 4a. The idealized signal representing the expected number of open channels in the cluster at time t is shown in Fig. 4b. For simulations shown in Fig. 4a, we also recorded the number of open channels in the cluster that is shown in Fig. 4c for comparison. It is clear from Fig. 4b and c that the processed signal closely compares with the actual number of open channels. It should be noted that our method also reports τO, τC, PO as well as the dwell-time distributions for different number of open channels and inter-channel transitions (transition from X number of channels open to Y number of channels open at a time t where Y can be less than or larger than X) for the given trace (not shown for this example).
Processing simulated puffs. (a) Noisy and drifting fluorescence time-series generated by a simulated single IP3R cluster with added background extracted from the raw experimental data (black) and the drifting background estimated by the software (red). (b) Idealized signal generated by the software. (c) Number of open channels in the cluster as a function of time from simulation shown for comparison.
Experimental data
In this section, we demonstrate the application of our method to fluorescence time-traces representing IP3-induced Ca2+ puffs by IP3R clusters in the ER membrane and the gating of PM pores formed by Αβ1-42 oligomers, both observed through TIRFM in Xenopus oocytes, as well as traces from nuclear patch-clamp electrophysiology of IP3Rs in Sf9 cells with single and multiple channels per patch.
First, we idealize traces from nuclear patch-clamp electrophysiology of IP3Rs. While the method works equally well for traces with single channel per patch (see for example Fig. 3), we only show slightly complex examples of multiple channels per patch. A summary of eight traces with two to four channels per patch is shown in Fig. 5 and Table 1. A sample raw trace with two channels along with the separated background and idealized trace identified by the algorithm is shown in Fig. 5a and b respectively. Dwell-time distributions when at least one, two, three, or four channels are open in all eight traces are shown in Fig. 5c-e respectively. Our method also provides mean POi, τOi, and τCi for at least 𝓘 channels open for the eight traces (Table 1). For example, PO1 and τO1 is the mean probability and mean open time that at least one channel is open. Similarly, τc1 is the mean time when none of the channels are open (a minimum of one channel is not open) and τC2 is the mean time when either one or no channels are open (a minimum of two channels are not open).
Channel statistics for eight traces from nuclear patch-clamp electrophysiology of IP3Rs in Sf9 cells with two (trace 1-3), three (trace 4-6), and four (trace 7 and 8) active channels per patch. Columns 2-4 and 5-8 are the mean probabilities and mean open times respectively when at least one, two, three, and four channels are open. Columns 9-12 are the mean close times when a minimum of one, two, three, and four channels are not open.
Processing current time-traces that record the gating of IP3R channels during single-channel patch-clamp electrophysiological experiments using nuclei isolated from Sf9 cells. (a) A typical raw current trace (blue line) in which two channels were active, and the background leak current level (red) identified by our software. (b) The number of open channels at any time during the experiment in (a). Dwell-time distributions when at least one (c), two (d), three (e), or four channels are open in all eight traces processed.
We also processed many fluorescence records from TIRFM of IP3-induced Ca2+ puffs in Xenopus oocytes with varying amplitudes (in terms of peak fluorescence or number of open channels during the puff) and duration. We identified up to ten channels in the traces processed and show a range of parameters and statistics extracted from the fluorescence traces by our software in Fig. 6. Panels (a1) and (a2) show a raw trace, representative of IP3-induced puffs (blue) with background noise (red) and the number of channels identified by TraceSpecks respectively. The raw TIRFM signal is the relative fluorescence change due to the cluster activity, averaged over a 1.5 μm × 1.5 μm area surrounding the cluster. As pointed out above, the software allows us to store the mean PO, τO, τC, and dwell-time distributions for different number of channels open in the cluster, and amplitudes of all puffs in a trace in ascii format for plotting and post-processing. As an example, we show the distribution of puff amplitudes represented as the maximum number of channels opened simultaneously in a puff from all traces processed in panel (a3) of Fig. 6.
Processing fluorescence records from TIRFM of IP3-induced Ca2+ puffs in Xenopus oocytes. The first column on the left shows a sample raw trace (blue) and the background separated by the software (red) (a1), the corresponding idealized trace (a2), amplitude (maximum number of channels open simultaneously) distribution of all puffs in all traces processed, and the transition between different number of channels open/close (a4). The remaining four columns show distributions of dwell-times (first row), mean PO (second row), mean τO (third row), and mean τC (fourth row) when at least one (panels b1-b4), two (panels c1 - c4), three (panels d1 - d4), and four channels (panels e1 - e4) are open in all traces processed. Mean δC for channel i represents the mean time when the number of open channels is less than i.
Another key information about the traces representing the activity of multiple channels or single channel with multiple conductance levels is determining the probability of transitions between different number of channels opened or different conducting states. For example, what is the likelihood that Y channels will be open at time t+δt given that there are X channels open at time t, where X and Y don’t have to be consecutive digits? Or what is the probability that a channel in conductance level X at time t will be in conductance level Y at time t+δt? This information can be used for modeling the kinetics of a single channel with multiple conductance levels or a cluster with multiple channels. We report this information as an inter-state or inter-channel transition graph as shown in panel (a4). In this graph, the point at location (X,Y) = (2,1) indicates that there is a transition from 2 simultaneous open channels to 1 open channel (one of the open channels closed). The size of the circle is proportional to the number of such transitions found in all traces we processed. Similarly, the point at location (X,Y) = (1,2) indicates that another channel opened up between time t and t+δt and the total number of open channels is now 2. We also observed transitions where the total number of channels at time t and t+δt are not consecutive digits. The dwell-time distributions (first row), mean PO (second row), mean τO (third row), and mean τC (fourth row) when at least one (panels b1 - b4), two (panels c1 - c4), three (panels d1 - d4), and four (panels e1 - e4) channels are open in all traces are also shown in Fig. 6. We remark that the software saves this information for all channels identified (ten in this case) but we skip the statistics for the remaining six channels for clarity.
The method also works for fluorescence traces representing the function of ion channels having multiple levels with quantal conductances and traces with very low SNR. As an example, we process fluorescence traces from TIRFM of many Aβ1-42-induced Ca2+-permeable pores in the PM of Xenopus oocyte and summarize our results in Fig. 7. We observe a closed level where no Ca2+ flux is passing through the pore (represented by level 0) and up to five open conductance levels (represented by level 1, 2, 3, 4, and 5) (9). In Fig. 7, we show a sample raw trace, the separated background, and idealized trace, the distribution of dwell times, mean PO, mean τO, mean τC when the channel is gating in a conductance level equal to or larger than 1, 2, 3, and 4, and the number of transitions between different conductance levels. We skip the distributions for conductance level 5 as there were only a few instances where the channel opened in that level. The details about the data for Aβ1-42 pores is similar to that of IP3-induced puffs except that the number of open channels from puffs discussion is replaced by conducting levels in case of A㭂1-42 pores. We would like to point out that there are significantly larger number of transitions between levels 0 and 1 in comparison to the puffs data, indicating that Aβ1-42 pores prefer to gate in low conductance levels. Also, in comparison to IP3-induced puffs, we observed a larger number of transitions where the levels involved are not immediately next to each other (for example transitions from level 0 to level 2 and vice versa) (Fig. 7a4).
Processing fluorescence records from TIRFM of Aβ1-42-induced pores in the PM of Xenopus oocytes. The first column on the left shows a sample raw trace (blue) and the background separated by TraceSpecks (red) (a1), the corresponding idealized trace representing the conductance level in which the pore is gating (a2), amplitude (the highest conductance level in which the channel is gating during a single opening event) distribution, and the transition between different conductance levels (a4) for all opening and closing events in all traces processed. The remaining four columns show distributions of dwell-times (first row), mean PO (second row), mean τO (third row), and mean τC (fourth row) when the pore is gating in level 1 or above (panels b1-b4), level 2 or above (panels c1 - c4), level 3 or above (panels d1 - d4), and level 4 or above (panels e1 - e4) in all traces processed. Mean τC for conductance level i represents the mean time when the conductance level is less than i.
Conclusions
Data from fluorescence imaging as well as the routinely employed electrophysiological patch-clamp techniques are contaminated with noise and drifting background that does not arise from the system under study. Removing this noise and fluctuating background is usually the first step in the analysis of the data. In the absence of automated method, experimentalists adhere to the eye and mouse-based approach for processing noisy imaging and patch-clamp data. This laborious manual procedure could cost experimentalists more time for processing the data than the actual experiment.
One example of where the lack of efficient algorithm for separating the signal from background noise could limit the full utilization of very powerful experimental techniques is the high resolution imaging of Ca2+ signals. Recent advances in imaging techniques such as TIRFM offer a paradigm shifting improvement in our ability to study thousands of ion channels in parallel in their native environment (1, 2). This technique called “optical patch-clamp” generates fluorescence time traces from thousands of ion channels in a single experiment. While an algorithm to automatically detect, localize, and measure fluorescence changes during localized Ca2+ signals, imaged through optical patch-clamp has been recently developed (39), the lack of efficient algorithms to extract signal out of the noise and idealize the fluorescence traces remains the main hurdle in accessing a wide range of information about the system under investigation and developing single channel/molecule models based on huge data-sets collected from thousands of channels/molecules in their native environment in these experiments.
In this work, we developed a minimally-parameterized likelihood method that can process both noisy patch-clamp and imaging data with drifting background. The robustness and accuracy of our algorithm is demonstrated by closely recovering the signal from noisy synthetic patch-clamp data with variable SNR and fluorescence traces representing Ca2+ puffs generated by a cluster of IP3R channels. This was followed by the processing of significant data from both TIRFM experiments as well as patch-clamp recordings and extracting many important features from the data.
We implemented our algorithm in the very portable java language with options to setup initial parameters, display optimized parameters, raw trace, separated background, and idealize trace, mean PO, τO, and τC for all channels/conductance levels in the trace, and overall mean PO, τO, and τC of the trace (Fig 1). Option to store all these traces and statistics in addition to dwell-times for the channel gating in different conducting states or different number of open channels in a cluster or a patch, amplitude distribution of all events in terms of the highest number of channels open or conductance level in which the channel is gating during the event as well as the number of transitions between different states in ascii format for plotting and later use is also provided. A compiled java program with user-friendly GUI that performs all these tasks, portable to Mac, PC, and Unix platforms with user manual, will be provided in the online supplement upon acceptance of this paper for publication.
Acknowledgements
This work was supported by NIH through grants R01 AG053988 (to AD and GU), R01 GM065830 (to DODM, IP, and JEP), and R37 GM048071 (to IP).