Abstract
This paper documents the development of a novel method to predict the occurrence and exact locations of mitochondrial fission, fusion and depolarisation events in three dimensions. The occurrence and locations of these events were successfully predicted with a three-dimensional version of the Pix2Pix generative adversarial network (GAN), as well as with a three-dimensional adversarial segmentation network called the Vox2Vox GAN. The Pix2Pix GAN predicted the locations of mitochondrial fission, fusion and depolarisation events with accuracies of 35.9%, 33.2% and 4.90%, respectively. Similarly, the Vox2Vox GAN achieved accuracies of 37.1%, 37.3% and 7.43%. The accuracies achieved by the networks in this paper are too low for the immediate implementation of these tools in life science research. They do, however, indicate that the networks have modelled the mitochondrial dynamics to some degree of accuracy and may therefore still be helpful as an indication of where events might occur if time-lapse sequences are not available. The prediction of these morphological mitochondrial events has, to our knowledge, never been achieved before in the literature. The results from this paper can serve as a baseline for future work.
Introduction
Mitochondria are highly dynamic organelles present in most mammalian cells and perform the vital biochemical processes of cell respiration and energy production. These organelles are highly networked and capable of rapidly changing form and function to meet the physiological demands of the cell [1]. If mitochondrial DNA (mtDNA) is mutated, it can lead to various common age-related diseases, which include Alzheimer’s disease and certain types of cancer [2–7]. To reduce the risk of mtDNA mutation accumulation, a mitochondrion can undergo dynamic events such as fission, fusion and mitochondrial autophagy (mitophagy).
Fission is the division of a mitochondrion into two or more distinct daughter organelles and fusion is the inverse of this process. The fission and fusion processes play vital roles in the quality control of cells and mitochondrial health, which is crucial for cellular homeostasis [8].
Adenosine triphosphate (ATP) is produced in the mitochondria through a process called oxidative phosphorylation, for which the mitochondria must maintain constant membrane potentials [9]. Negative by-products of this process are reactive oxygen species, which are free radicals capable of damaging the organelles and possibly leading to the mutation of their DNA [10]. The organelle can undergo membrane depolarisation (a reduction of its membrane potential) to reduce the production of these reactive oxygen species [11,12]. If depolarisation is sustained for prolonged periods, it can lead to cell death [13].
A high-throughput method that can effectively predict the occurrences and exact locations of these mitochondrial events will greatly enhance the potential for research of these organelles. This may, in turn, aid the development of diagnosis and treatment strategies for human pathologies associated with mitochondrial dysfunction.
1 Related work
1.1 Mitochondrial Event Localiser (MEL)
The localisation of mitochondrial fission, fusion and depolarisation events, especially in three dimensions, has, until recently, been done manually. The mitochondrial event localiser (MEL), developed by Theart et al. [14], is an automatic, deterministic and high-throughput analysis system. MEL takes as input a time-lapse sequence of fluorescence microscopy z-stack images in which the mitochondria were stained (with TMRE) and provides as output the three-dimensional locations of mitochondrial fission, fusion and depolarisation events.
MEL is, to our knowledge, the only method currently available to automatically localise mitochondrial events. The method is not perfect and introduces false-positive events during localisation. For this reason the outputs of MEL are manually evaluated by the user, who compares the localised events to the relevant frames of the time-lapse sequence using a tool developed by the author of MEL, shown in S1 Figure. This process allows for the removal of false-positive event localisations, but does not solve the problem of “missed” events.
For this reason the dataset created by MEL cannot be viewed as the absolute ground-truth, but only as a good approximation of it. It is, however, not feasible to create such a training dataset by hand, since it would take too long to complete and would not necessarily guarantee an increase in accuracy.
An adapted version of MEL was used to generate the labelled data for this paper.
1.2 Pix2Pix Generative Adversarial Network (GAN)
The Pix2Pix GAN was developed specifically for image-to-image translations and was first introduced in 2016 by Isola et al. [15]. In order to train the network, both input and target output images are passed to the network. The generator network encodes the input image to a lower-dimensional latent vector and then tries to generate an output image that is indistinguishable from the target image. The discriminator classifies its input images as real or fake.
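As a rough illustration of this adversarial scheme, the sketch below outlines a single conditional-GAN training step in TensorFlow/Keras. It is a minimal sketch, not the implementation used in this paper, and it assumes that `generator`, `discriminator` and the two optimisers have already been constructed (the ELP-MAE term added to the generator loss in this paper is introduced in Section 2.4).

```python
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy()

def train_step(input_image, target_image, generator, discriminator, g_opt, d_opt):
    """One conditional-GAN training step (minimal sketch)."""
    with tf.GradientTape() as g_tape, tf.GradientTape() as d_tape:
        fake_image = generator(input_image, training=True)
        # The discriminator judges (input, output) pairs as real or fake.
        real_score = discriminator(tf.concat([input_image, target_image], axis=-1),
                                   training=True)
        fake_score = discriminator(tf.concat([input_image, fake_image], axis=-1),
                                   training=True)
        # Discriminator: push real pairs towards 1 and fake pairs towards 0.
        d_loss = (bce(tf.ones_like(real_score), real_score)
                  + bce(tf.zeros_like(fake_score), fake_score))
        # Generator: fool the discriminator into scoring fakes as real.
        g_loss = bce(tf.ones_like(fake_score), fake_score)
    d_opt.apply_gradients(zip(d_tape.gradient(d_loss, discriminator.trainable_variables),
                              discriminator.trainable_variables))
    g_opt.apply_gradients(zip(g_tape.gradient(g_loss, generator.trainable_variables),
                              generator.trainable_variables))
```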
1.3 Vox2Vox GAN
The Vox2Vox GAN is a three-dimensional adversarial image segmentation network [16]. It is similar in architecture and principle to the Pix2Pix GAN, but differs in its bottle-neck. The Vox2Vox generator does not encode the input z-stack to a single-row latent vector, but instead to a three-dimensional array. This array is then passed to a bottle-neck with four Res-Net style encoder blocks and then decoded to the output image.
2 Materials and methods
For this project, U-118MG mammalian cells were purchased from the American Type Culture Collection. These cells were supplemented with Dulbecco’s Modified Eagle’s Medium, 1% penicillin-streptomycin and 10% fetal bovine serum. They were incubated in an SL SHEL LAB CO2 humidified incubator in the presence of 5% CO2 at 37 °C.
To allow for visualisation of the mitochondrial network, the U-118MG cells were seeded in 8-chamber Nunc® Lab-Tek® II dishes and incubated with 100 nM tetramethylrhodamine-ethyl ester (TMRE) for 5 minutes in the presence of 5% CO2 at 37 °C. TMRE is widely used when assessing mitochondrial dysfunction and has a good signal-to-noise ratio [17].
2.1 Microscopy
A Carl Zeiss Elyra PS1 microscope was used to conduct live-cell confocal microscopy. Z-stack images were acquired with a 0.5 μm step width, a Plan-Apochromat 100x/1.46 oil DIC M27 objective, a GaAsP detector and a 561 nm laser as the illumination source. A time-lapse sequence of mitochondrial dynamics was recorded at 10 s intervals for 30 cycles. The laser was used at a low power setting to limit photo-bleaching and photo-oxidative stress during the acquisition period.
2.2 MEL training data generation
MEL was used to localise mitochondrial fission, fusion and depolarisation events for 44 distinct three-dimensional fluorescence microscopy z-stack time-lapse sequences of healthy, untreated cells. The parameters proposed by Theart et al. [14] were used to create the dataset for the training of the neural networks.
A review tool, developed by Theart et al. [14], was used to validate the legitimacy of the events localised by MEL by manually comparing the relevant frames of the time-lapse sequence to the localised events. False-positive events were removed from the dataset through this process.
After the removal of all of the unsatisfactory samples, a dataset containing 944 z-stacks remained. From this, 94 z-stacks (approximately 10%) were used as validation z-stacks and the remaining 850 z-stacks as training z-stacks.
In order to create a subset of the best representative samples, these 944 z-stacks were manually evaluated once more, comparing the relevant frames of the time-lapse sequence to the validated events, to determine the most appropriate z-stacks for training. A total of 179 z-stacks were selected as the best samples, which were divided into 160 training and 19 validation z-stacks. For the remainder of this paper, the dataset containing all 944 z-stacks will be referred to as the complete dataset and the subset containing 179 z-stacks as the clean dataset.
2.3 Dataset Deviation From Absolute Ground-Truth
The MEL method uses various image pre-processing steps which, although vital to the process, lead to localisation errors such as false-positive event localisations or the omission of true-positive events. The choice of pre-processing parameters affects the binarisation, which in turn affects the localisation accuracy of MEL. It is, however, not possible to eliminate both the false positives and the omission of true positives; consequently, the MEL pre-processing parameters were chosen to achieve the best compromise between the two.
A common cause of localisation errors was the binarisation of low-intensity bridges, which were created during imaging between two structures in close proximity. These low-intensity bridges were incorrectly binarised by the thresholding process, consequently creating one large structure instead of two separate structures. MEL uses only these binarised z-stacks, and not the intensity z-stacks, to localise the mitochondrial events. S2 Figure shows an example in which MEL would have incorrectly localised a fusion event due to the thresholding of a low-intensity bridge between structures.
To ensure that the neural networks were not adversely affected by this limitation of binarised data, the neural networks discussed in this paper received the deconvolved z-stacks as input, rather than the thresholded z-stacks. It was expected that the networks would predict events that were not localised by MEL, because the networks must also discern between mitochondrial structures and the background. This implies that the low-intensity bridges created by the thresholding process in MEL would not be passed as input to the neural networks. In contrast, the MEL method received thresholded z-stacks with a clear boundary between mitochondrial structures and the background, which limited its localisation performance. It was presumed that the neural networks would perform better than a thresholding method when discerning between foreground and background pixels.
This lack of accurate ground-truth data complicated the optimisation of the neural networks, because the validation loss cannot be used as the metric for determining the optimal point of the network. Due to the use of the deconvolved z-stacks as input, it was likely that the neural network would predict more (true-positive) events than were localised by MEL. This could lead to a high validation loss but a more accurate representation of the mitochondrial events.
2.4 Event location penalised MAE loss (ELP-MAE)
The nature of the training data meant that standard loss functions would be unsuitable for our application. The mean absolute error (MAE) loss was therefore modified by adding a penalty factor to the error at the voxels containing the events localised by MEL. To our knowledge, adapting the standard MAE loss in this manner has not been documented before. This modified MAE loss was named the event location penalised MAE loss (ELP-MAE).
To do this, MEL was first altered to provide two z-stacks as output: one containing only the localised events, and the other the localised events superimposed on the first input frame of the time-lapse sequence. The two output z-stacks of MEL are illustrated in Fig 1.
The voxel values of the events localised by MEL (Fig 1A) were all assigned a value of 1 and the surrounding background was assigned a value of 0. This array is henceforth referred to as the thresholded events z-stack.
The ELP-MAE is designed to receive three z-stacks as input: the ground-truth, the generator output and the thresholded events z-stack. However, the implementation of loss functions in the Keras Python library allows for only two inputs, namely the predicted value (ypredicted) and the ground-truth value (ytrue).
In order to overcome this restriction of the Keras library, the thresholded events z-stack was concatenated with the ground-truth z-stack (shown in Fig 1B) to create a 2 × 8 × 128 × 128 × 3 array, which will henceforth be referred to as the stacked array. For the implementation of the ELP-MAE, the stacked array is passed as the ytrue input, whilst the output of the generator remained the ypredicted input to the loss function.
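As a concrete illustration, the stacked array can be assembled as follows. This is a minimal NumPy sketch in which the array names are hypothetical placeholders for the two MEL outputs of Fig 1.

```python
import numpy as np

# Placeholder arrays standing in for the two MEL outputs of Fig 1:
# `events_only` (Fig 1A) and `ground_truth` (Fig 1B), both 8 x 128 x 128 x 3.
events_only = np.zeros((8, 128, 128, 3), dtype=np.float32)
ground_truth = np.zeros((8, 128, 128, 3), dtype=np.float32)

# Event voxels become 1, the background 0 (the thresholded events z-stack).
thresholded_events = (events_only > 0).astype(np.float32)

# Concatenate along a new leading axis to form the 2 x 8 x 128 x 128 x 3
# stacked array that is passed to the loss function as ytrue.
stacked_array = np.stack([ground_truth, thresholded_events], axis=0)
```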
Fig 2 provides an illustration of the ELP-MAE loss function implementation within the Keras Python library on a 2 × 2 pixel example image. The black pixels indicate background pixels with a value of 0, and the white pixels are foreground pixels with a value of 1. The red pixel indicates a mitochondrial event, also with a pixel value of 1.
Within the function that implements the ELP-MAE, the 2 × 8 × 128 × 128 × 3 stacked array is split into two 8 × 128 × 128 × 3 z-stacks: the ground truth (ytrue) and the thresholded events z-stack. The absolute error between the ground-truth and the generator output (ypredicted) is calculated, and the thresholded events z-stack is multiplied by a user-defined penalty value (100 in this example), after which a value of 1 is added. This is done to prevent the elimination (multiplication with zero) of error values in areas where mitochondrial events are not present in the thresholded events array. If this addition of 1 were not implemented, only the absolute errors in the areas containing mitochondrial events would be used to update the parameters of the GAN. The network would perceive the surrounding regions as having an error value of zero, so they would not contribute to the training of the generator. This would result in an untrainable model.
The final step of the ELP-MAE loss function is to multiply the penalised thresholded events z-stack with the absolute error, after which the mean of the voxel values is calculated.
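The complete loss function can therefore be sketched as follows. This is a minimal Keras-compatible sketch of the ELP-MAE as described above, not the exact implementation; the penalty value and variable names are illustrative.

```python
import tensorflow as tf

PENALTY = 100.0  # user-defined penalty value (100 in the Fig 2 example)

def elp_mae(y_true, y_pred):
    """Event location penalised MAE (minimal sketch).

    y_true: the stacked array of shape (batch, 2, 8, 128, 128, 3), holding
            the ground-truth z-stack and the thresholded events z-stack.
    y_pred: the generator output of shape (batch, 8, 128, 128, 3).
    """
    ground_truth = y_true[:, 0]   # (batch, 8, 128, 128, 3)
    events = y_true[:, 1]         # 1 at event voxels, 0 elsewhere
    abs_error = tf.abs(ground_truth - y_pred)
    # Multiplying by the penalty up-weights the event locations; adding 1
    # keeps every other voxel contributing to the gradient as well.
    weights = events * PENALTY + 1.0
    return tf.reduce_mean(weights * abs_error)
```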
During the training iterations it was found that increasing the penalty factor led to increased detail in the predicted events; however, this also increased the creation of unwanted random noise artefacts by the generator. These noise artefacts were reduced with noise-reduction training methods, which will be discussed later. Fig 3 shows the noise artefacts that were generated for one of the training images.
It was also found that if the penalty factor was too small, the neural network was unable to predict the locations of mitochondrial events. On the other hand, if the penalty was too large, excessive noise artefacts were generated. The penalty factors that were used were iteratively determined during training.
A pertinent shortcoming of the neural network training discussed in this paper is the lack of an accurate method to validate the accuracy of the predictions made by these networks. The dataset generated by MEL cannot be viewed as the absolute ground-truth, since localisation errors due to image pre-processing and spatial movement of the mitochondria affect the accuracy of its localisations. For this reason, the ELP-MAE loss for the validation dataset cannot be monitored during training to determine when the network has reached an optimal point. Visual inspection of the generator’s outputs after every few iterations was the only viable method of monitoring the accuracy. This complicated the optimisation of the neural networks, because alterations leading to slight improvements in training were difficult to notice through visual inspection.
2.5 Mitochondrial event location prediction in three dimensions
Two methods of predicting the locations of mitochondrial fission, fusion and depolarisation events in three dimensions were investigated. The first method was with the use of the Pix2Pix GAN and the second with the Vox2Vox GAN. These networks received a single z-stack from a time-lapse sequence as input and were tasked with producing z-stacks containing the event locations.
2.5.1 Three-dimensional Pix2Pix GAN generator architecture
The generator network that was used in the Pix2Pix GAN is similar to the standard architecture described by Isola et al. [15]. It contains an encoder and a decoder network with skip connections between them, as illustrated in Fig 4.
The encoding region of the generator was constructed from six encoder blocks, each containing a three-dimensional convolution layer with kernels of size 4, followed by an instance normalisation and a Leaky ReLU activation layer. Normalisation was not applied in the first encoding block in order to preserve the semantic information of the input z-stack, which is often lost in the encoder. The output of each encoding block is passed across the network via a skip connection to a corresponding decoding block. The stride lengths and kernel sizes used in each encoding block are listed in Table 1.
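A single encoder block of this form can be sketched as follows. This is a minimal Keras sketch in which the filter counts and strides are placeholders for the Table 1 values; the Leaky ReLU slope of 0.2 is an assumption, matching the value used elsewhere in this paper, and instance normalisation is taken from the TensorFlow Addons package.

```python
from tensorflow.keras import layers
import tensorflow_addons as tfa

def encoder_block(x, filters, strides, apply_norm=True):
    """One generator encoder block: 3D convolution (kernel size 4),
    optional instance normalisation, then Leaky ReLU."""
    x = layers.Conv3D(filters, kernel_size=4, strides=strides, padding="same")(x)
    if apply_norm:  # skipped in the first block to preserve semantic information
        x = tfa.layers.InstanceNormalization()(x)
    x = layers.LeakyReLU(0.2)(x)  # slope of 0.2 assumed, as used elsewhere in the paper
    return x
```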
The final encoding block is followed by the bottle-neck, which contains a single convolutional layer and a ReLU activation layer. Normalisation was not applied to the bottle-neck because it would zero the activations, which would cause the network to skip the bottle-neck [15].
The decoder blocks were constructed from three-dimensional up-sampling layers, followed by three-dimensional convolutional layers with kernels of size 4 and stride lengths of 1 in all three directions. After this, instance normalisation, the ReLU activation function and dropout with a rate of 50% were applied. The output of the decoder block was concatenated along the colour axis with its corresponding skip connection. Dropout layers were only used in the first four decoder blocks, as suggested by Isola et al. [15]. Normalisation was not applied in the final decoding block and the hyperbolic tangent activation function was used to generate output z-stacks with voxel values in a range of −1 to 1, which is the accepted norm for GANs. The parameters that were used for each decoder block in the generator architecture are listed in Table 2. The additional decoder block in the generator architecture is used to scale the output of the previous decoder block to the desired dimensions of the target z-stack.
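A corresponding decoder block can be sketched as follows, again as a minimal Keras sketch; the up-sampling factor and filter counts are placeholders for the Table 2 values.

```python
from tensorflow.keras import layers
import tensorflow_addons as tfa

def decoder_block(x, skip, filters, apply_dropout=True, apply_norm=True):
    """One generator decoder block: 3D up-sampling, 3D convolution
    (kernel 4, stride 1), instance normalisation, ReLU and 50% dropout,
    followed by concatenation with the corresponding skip connection."""
    x = layers.UpSampling3D(size=2)(x)  # up-sampling factor assumed; see Table 2
    x = layers.Conv3D(filters, kernel_size=4, strides=1, padding="same")(x)
    if apply_norm:      # not applied in the final decoding block
        x = tfa.layers.InstanceNormalization()(x)
    x = layers.ReLU()(x)
    if apply_dropout:   # only the first four decoder blocks use dropout
        x = layers.Dropout(0.5)(x)
    # Concatenate along the colour (channel) axis with the skip connection.
    return layers.Concatenate(axis=-1)([x, skip])
```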
The number of filters used for the convolutional layers, and consequently the depth of the neural network, was maximised for the available VRAM of 12 GB.
2.5.2 Three-dimensional Pix2Pix GAN discriminator architecture
The architecture of the discriminator, as discussed in Isola et al. [15], was modified to work for three-dimensional z-stacks. The concatenated input and target/generator output z-stacks were used as input to the discriminator network. The network was constructed from five encoding blocks, which reduced the input z-stack to a 1 × 8 × 8 × 1 patch of probability values. Each voxel in this patch relates to an 8 × 94 × 94 cube of voxels in the input z-stack. This relation is called the receptive field of the discriminator. The architecture of the discriminator is shown in Fig 5.
The discriminator encoding blocks were constructed from three-dimensional convolutional layers with kernels of size 4. The convolutional layers were followed by Leaky ReLU activation and instance normalisation layers. A slope of 0.2 was used for the Leaky ReLU activation function. In order to preserve semantic information, such as the colours of the input z-stacks, normalisation was not applied in the first block, as was suggested by Isola et al., [15]. In order for the discriminator to assign values close to or equal to 0 for fake z-stacks, and 1 for real z-stacks, the output of the last encoding block was passed to a sigmoid activation function. The binary cross-entropy loss was used as the loss function for the discriminator. The parameters that were used for each of the discriminator encoding blocks are listed in Table 3.
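Putting these blocks together, the discriminator can be sketched as below. The filter counts and stride patterns are illustrative assumptions chosen only to reproduce the stated 1 × 8 × 8 × 1 output patch; the actual values are those listed in Table 3.

```python
import tensorflow as tf
from tensorflow.keras import layers
import tensorflow_addons as tfa

def build_discriminator(input_shape=(8, 128, 128, 6)):
    """PatchGAN-style discriminator sketch. The input is the channel-wise
    concatenation of the input z-stack and the target/generated z-stack."""
    inp = layers.Input(shape=input_shape)
    x = inp
    # Illustrative (filters, strides) per encoding block; the actual values
    # are in Table 3. These strides reduce (8, 128, 128) down to (1, 8, 8).
    blocks = [(64, (2, 2, 2)), (128, (2, 2, 2)), (256, (2, 2, 2)), (512, (1, 2, 2))]
    for i, (filters, strides) in enumerate(blocks):
        x = layers.Conv3D(filters, kernel_size=4, strides=strides, padding="same")(x)
        x = layers.LeakyReLU(0.2)(x)
        if i > 0:  # no normalisation in the first block
            x = tfa.layers.InstanceNormalization()(x)
    # Final encoding block: a single-channel 1 x 8 x 8 x 1 patch of probabilities.
    x = layers.Conv3D(1, kernel_size=4, strides=1, padding="same")(x)
    out = layers.Activation("sigmoid")(x)
    return tf.keras.Model(inp, out)
```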
The clean dataset, which contains 179 z-stacks as discussed in Section 2, was used to train the network with a large penalty value in the ELP-MAE loss function. This created random noise artefacts in the output z-stacks, which were reduced by continuing the training of the network using the complete dataset with a low penalty value. This continuation of training with the complete dataset will henceforth be referred to as noise reduction training.
For this project, the training of the Pix2Pix GAN with the clean dataset was deemed to be complete after 545 epochs with 16 mini-batches of size 10 and an ELP-MAE loss penalty value of 50 000. This penalty value was iteratively determined to be large enough to ensure the correct training of the generator, whilst not generating excessive amounts of noise.
Training with the clean dataset was deemed sufficient after 545 epochs because the network predicted mitochondrial events with bright, round kernels on the validation z-stacks and the generator training loss remained constant.
Training of the network was continued using the complete dataset containing 944 samples, for an additional 25 epochs, with 85 mini-batches of size 10, and a penalty value of 500. After the 25 epochs of noise reduction training there was no further reduction in noise artefacts and the brightness of the predicted events began to fade. The penalty value of 500 was small enough to allow for a significant reduction of the noise artefacts, whilst not completely eliminating predicted events. Fig 6 shows the difference in results obtained on one of the validation z-stacks, before and after noise reduction training with the complete dataset.
2.6 Vox2Vox GAN
The Vox2Vox GAN was specifically designed for three-dimensional adversarial z-stack segmentation tasks. Similar to the Pix2Pix GAN, the Vox2Vox GAN received a single frame of a time-lapse sequence as input. The architecture and the training of the Vox2Vox GAN are discussed in this section.
2.6.1 Vox2Vox GAN generator architecture
The Vox2Vox GAN was constructed according to the specifications discussed in Cirillo et al. [16]. The architecture of the generator consists of an encoder, bottle-neck, decoder and skip connections, as illustrated in Fig 7. The generator architecture of the Vox2Vox GAN is shallower than that of the Pix2Pix GAN. This was due to an increase in the number of trainable parameters (329 546 755 for the Vox2Vox GAN compared to 129 546 755 for the Pix2Pix GAN), a consequence of the short skip connections in the bottle-neck of the Vox2Vox generator. Theoretically, a deeper model would perform better and should be investigated with improved hardware.
The encoder was constructed from four encoding blocks similar to those that were used for the Pix2Pix GAN. This was followed by the bottle-neck region, which contained four bottle-neck blocks with Res-Net style short skip connections. These blocks were constructed from three-dimensional convolutional layers with kernels of size 4, followed by instance normalisation, Leaky ReLU activation and dropout layers.
All Leaky ReLU functions had a slope of 0.2 and the dropout was applied at a rate of 20%, as suggested by Cirillo et al., [16]. The parameters that were used in the encoding and bottle-neck blocks are listed in Table 4.
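One such bottle-neck block can be sketched as follows. This is a minimal Keras sketch under the assumption that the Res-Net style short skip is implemented as an addition, which requires the block to preserve the channel count of its input; the actual parameters are those listed in Table 4.

```python
from tensorflow.keras import layers
import tensorflow_addons as tfa

def bottleneck_block(x):
    """One Res-Net style bottle-neck block: 3D convolution (kernel 4,
    stride 1), instance normalisation, Leaky ReLU (slope 0.2) and 20%
    dropout, with a short skip connection around the block."""
    filters = x.shape[-1]  # preserve the channel count so the skip addition is valid
    y = layers.Conv3D(filters, kernel_size=4, strides=1, padding="same")(x)
    y = tfa.layers.InstanceNormalization()(y)
    y = layers.LeakyReLU(0.2)(y)
    y = layers.Dropout(0.2)(y)
    return layers.Add()([x, y])  # Res-Net style short skip connection
```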
The Vox2Vox generator’s decoding region was constructed from five decoding blocks with architectures similar to those used for the Pix2Pix GAN. A fifth decoding block was added to scale the output of the fourth decoding block to the dimensions of the target output image.
The decoder blocks were constructed from a three-dimensional up-sampling layer, followed by a convolutional layer with kernels of size 4 and stride lengths of 1 in all directions. The ReLU activation and instance normalisation layers were used in these blocks. Similar to the Pix2Pix GAN, and as dictated by convention [18], normalisation was not applied in the last decoder block and the hyperbolic tangent activation function was used to normalise the voxel values to a range from −1 to 1. The parameters used for the decoding blocks of the Vox2Vox GAN are listed in Table 5. The listed output sizes are the dimensions after the decoder block output was concatenated with the skip connections.
The architecture of the Vox2Vox discriminator is identical to that of the Pix2Pix GAN as was shown in Fig 5.
2.6.2 Vox2Vox GAN training
Training of the Vox2Vox GAN was done using the same principles as for the Pix2Pix GAN. The network was trained for 514 epochs with an ELP-MAE penalty of 50 000, using the clean dataset, which consisted of 40 mini-batches, each with a batch size of 4.
After 514 epochs of training, the generator loss stabilised, and events were predicted with bright kernels. Similar to the Pix2Pix GAN, the penalty value was determined iteratively to be large enough for the sufficient penalisation of the areas containing mitochondrial events. This ensured the optimal training of the generator, whilst not generating excessive amounts of noise.
To reduce noise, the network was trained using the complete dataset for an additional 26 epochs with an ELP-MAE penalty of 500. For this training, 170 mini-batches with a batch size of 5 were used. After 26 epochs the validation loss began increasing, which indicated over-fitting, and no further noise reduction was observed. The output of the network after 514 and 540 epochs of training is shown in Fig 8 for one of the validation z-stacks.
3 Results and discussion
3.0.1 Pix2Pix GAN results
The output z-stacks of the Pix2Pix GAN for the 19 validation z-stacks of the clean dataset were manually evaluated to determine the accuracy with which the network can predict the three-dimensional locations of mitochondrial fission, fusion and depolarisation events. This was done by comparing the predicted events to the relevant frames of the time-lapse sequence. The 19 validation z-stacks of the clean dataset were of high quality; the low-quality validation z-stacks from the complete dataset were not used, because MEL also performed poorly on these z-stacks and they would therefore not accurately indicate the network’s event location prediction capabilities.
The total number of predicted and correctly predicted fission, fusion and depolarisation events are summarised in Table 6.
3.0.2 Discussion and conclusion regarding the three-dimensional Pix2Pix GAN
A three-dimensional variation of the Pix2Pix GAN was trained to predict the locations of mitochondrial fission, fusion and depolarisation events. This implementation achieved accuracies of 35.9% for fission, 33.2% for fusion and 4.90% for depolarisation events, as shown in Table 6.
Low accuracies were expected, since a relatively small dataset was used to train the network. The ground-truth data generated with MEL is not an absolute ground-truth and it is therefore expected that a larger and more accurate dataset will yield better results.
It is also possible, however unlikely, that these percentages represent the upper limit of accuracy achievable when predicting the three-dimensional locations of mitochondrial fission, fusion and depolarisation events with a single z-stack as input.
3.0.3 Vox2Vox GAN results
As with the Pix2Pix GAN, the accuracy of the Vox2Vox GAN for the location prediction of mitochondrial fission, fusion and depolarisation events was determined by manually evaluating the results of the network for the 19 validation z-stacks of the clean dataset. The total numbers of predicted and correctly predicted events are listed in Table 7.
3.0.4 Discussion and conclusion regarding the results of the Vox2Vox GAN
The location prediction accuracies of the Vox2Vox GAN were 37.1%, 37.3% and 7.43% for fission, fusion and depolarisation events, respectively, as listed in Table 7.
The Vox2Vox GAN achieved better accuracies than the three-dimensional Pix2Pix GAN, despite being a shallower network, which could have adversely affected the accuracy of its predictions. This improvement can be ascribed to the Res-Net blocks used in the bottle-neck of the generator architecture, which pass additional information through the bottle-neck via the short skip connections.
4 Comparison of results with MEL accuracies
MEL was used to generate the ground-truth data that was used to train the Pix2Pix and Vox2Vox GANs. MEL achieved accuracies of 41.0%, 38.0% and 10.0% for fission, fusion and depolarisation events, respectively, for the dataset used in this paper. These accuracies were obtained by validating the events that were localised by MEL.
The Pix2Pix and Vox2Vox GANs (Tables 6 and 7) achieved comparable accuracies, and the exceptionally low depolarisation event localisation accuracy of MEL explains the low accuracies of the GANs when predicting the locations of these events.
The GANs made their predictions using a single frame of a time-lapse sequence as input, in contrast to the two successive frames required as input by MEL. Notably, the GANs required less input information, predicting mitochondrial events using only the information encoded in the morphology of the mitochondria, yet achieved accuracies comparable to those of MEL. Furthermore, the GANs made these predictions in a fraction of the time MEL requires to process the images: the neural networks have an inference time of a few seconds, whereas MEL took 3 to 4 minutes to process a pair of z-stacks.
Conclusion
In this paper, modified versions of the Pix2Pix and Vox2Vox GANs were trained to predict the three-dimensional locations of mitochondrial fission, fusion and depolarisation events using a single frame of a time-lapse sequence as input. The training dataset was generated using the MEL method.
The Pix2Pix GAN was capable of predicting the locations of mitochondrial fission, fusion and depolarisation events with accuracies of 35.9%, 33.2% and 4.90%, respectively. Similarly, the Vox2Vox GAN was capable of predicting the locations of these events with accuracies of 37.1%, 37.3% and 7.43%. The Vox2Vox GAN therefore outperformed the Pix2Pix GAN for the task of three-dimensional mitochondrial event location prediction.
The low prediction accuracies, although comparable to the un-validated results achieved by MEL, can be ascribed to inaccuracies in the dataset that was used for training of the neural networks. It is expected that these location prediction accuracies will improve with a larger and more accurate dataset. However, it may be possible that the accuracies obtained by the neural networks were the highest accuracies that can be achieved with these methods.
The location prediction accuracies that were obtained are currently too low for a viable implementation of these tools in the life sciences industry. However, it is clear that the morphology of the mitochondria was modelled by the networks to some degree of accuracy. These networks can be used for a qualitative study of the regions where mitochondria are likely to undergo events.
The use of deep neural networks to predict the occurrence and exact location of mitochondrial fission, fusion and depolarisation events in three dimensions has, to our knowledge, never been achieved before. For this reason, fission and fusion event location prediction accuracies in excess of 30% are acceptable. Future research projects can use these results as a baseline.
Acknowledgments
The financial assistance of the National Research Foundation (NRF) towards this research is hereby acknowledged (grant MND200423516047). Opinions expressed and conclusions arrived at are those of the author and are not necessarily to be attributed to the NRF.