Abstract
In many mammals, including rodents, social interactions are often accompanied by active urination (micturition), which is considered a mechanism for spatial scent marking. Urine and fecal deposits contain a variety of chemosensory signals that convey information about an individual’s identity, genetic strain, social rank, and physiological or hormonal state. Furthermore, scent marking has been shown to be influenced by the social context and by the individual’s internal state and experience. Therefore, analyzing scent-marking behavior during social interactions can provide valuable insight into the structure of mammalian social interactions in health and disease. However, such analyses have been hindered by several technical challenges. For example, the widely used void spot assay lacks temporal resolution and is prone to artifacts such as urine smearing. To address these issues, recent studies employed thermal imaging for the spatio-temporal analysis of urination activity. However, this method relied on manual analysis, which is time-consuming and susceptible to observer bias. Moreover, defecation activity has rarely been analyzed in previous studies. Here, we integrate thermal imaging with an open-source algorithm built around a transformer-based video classifier for the automatic detection and classification of urine and fecal deposits made by male and female mice during various social behavior assays. Our results reveal distinct dynamics of urination and defecation in a test-, strain-, and sex-dependent manner, indicating two separate processes of scent marking in mice. We validate this algorithm, which we term DeePosit, and show that its accuracy is comparable to that of a human annotator and that it performs efficiently across various setups and conditions. Thus, the method and tools introduced here enable efficient and unbiased automatic spatio-temporal analysis of scent-marking behavior in behavioral experiments in small rodents.
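As described in the footnotes, detection proceeds in two stages: a preliminary detector flags newly appearing warm blobs in the thermal video, and the transformer-based classifier then labels each candidate as a urine or fecal deposit. The following is a minimal sketch of the preliminary stage only, assuming frames are calibrated temperatures in degrees Celsius compared against a background estimate; the function and parameter names (`detect_new_blobs`, `min_area_px`) are illustrative and not taken from the DeePosit code.

```python
import numpy as np
from scipy import ndimage

# Temperature rise above background that triggers a candidate detection;
# 1.6 deg C is the best-performing threshold reported in footnote 8.
TEMP_THRESHOLD_C = 1.6

def detect_new_blobs(frame, background, min_area_px=4):
    """Return (x, y) centroids of pixel clusters that are at least
    TEMP_THRESHOLD_C warmer than the background estimate."""
    hot = (frame - background) > TEMP_THRESHOLD_C
    labels, n = ndimage.label(hot)          # connected components
    centroids = []
    for i in range(1, n + 1):
        ys, xs = np.nonzero(labels == i)
        if ys.size >= min_area_px:          # drop single-pixel noise
            centroids.append((xs.mean(), ys.mean()))
    return centroids

# Toy usage: a synthetic 2 deg C hotspot on a uniform 24 deg C background.
background = np.full((64, 64), 24.0)
frame = background.copy()
frame[10:14, 20:24] += 2.0
print(detect_new_blobs(frame, background))  # one centroid near (21.5, 11.5)
```

Per footnote 8, accuracy is stable (mean F1 of 0.88-0.89) for thresholds between 1.1 °C and 3 °C, so the exact threshold value is not critical.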
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
1. We increased the training set from 39 to 97 video clips and the test set from 25 to 60 video clips. The larger training set improved overall accuracy from a mean F1 score of 0.81 in the previous version to 0.891 in the current version; the F1 score for urine detection improved from 0.79 to 0.88.
2. We evaluated the accuracy of the DeePosit algorithm against a second human annotator and found that the algorithm's accuracy is comparable to human-level accuracy.
3. The additional test videos allowed us to test the consistency of the algorithm's performance across sex, space, time, and experiment type (SP, SxP, and ESPs). We found consistent performance across all categories (see Figure 3), suggesting that the algorithm's errors are uniform across conditions and therefore should not bias the results.
4. We tested the algorithm's performance on a second mouse strain (male C57BL/6) under a different environmental condition (a white arena instead of a black one) and found comparable accuracy, even though neither C57BL/6 mice nor the white arena were included in the training set.
5. Analyzing urination and defecation dynamics in this additional strain revealed interesting strain-specific features, as discussed in the revised manuscript.
6. Overall, we found DeePosit's accuracy to be stable, with no significant bias across experiment stages, experiment types, sex, strain, and experimental conditions.
7. We compared the performance of DeePosit to a classic object detection algorithm, YOLOv8. We trained YOLOv8 both on a single-image input (YOLOv8 Gray) and on a three-image input representing a sequence of three time points around the deposition event t: t+0, t+10, and t+30 seconds (YOLOv8 RGB; see the sketch after this list). DeePosit achieved significantly better accuracy than both YOLOv8 alternatives.
8. As for the algorithm's parameters, we tested the effect of the main parameter of the preliminary detection (the temperature threshold for detecting a new blob) and found that a threshold of 1.6 °C gave the best accuracy; we used this value for all experiments instead of the 1.1 °C used in the original manuscript. It is worth noting that performance is quite stable (mean F1 score of 0.88-0.89) for thresholds between 1.1 °C and 3 °C.
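Regarding footnote 7, the YOLOv8 RGB baseline encodes a candidate deposit's thermal time course by placing the frames from t+0, t+10, and t+30 seconds after the event into the three channels of a single image. Below is a minimal sketch of that input construction, assuming the clip is given as a (frames, height, width) temperature array; the frame rate and helper name are assumptions, not the authors' code.

```python
import numpy as np

FPS = 8.7  # hypothetical thermal-camera frame rate; not reported here

def three_point_stack(video, event_frame, fps=FPS):
    """Stack the frames at t+0, t+10, and t+30 s after a candidate
    deposition event into an (H, W, 3) pseudo-RGB image."""
    offsets_s = (0.0, 10.0, 30.0)
    idx = [min(event_frame + int(round(o * fps)), len(video) - 1)
           for o in offsets_s]
    return np.stack([video[i] for i in idx], axis=-1)

# Toy usage with a synthetic ~60 s clip of 64x64 thermal frames.
rng = np.random.default_rng(0)
clip = rng.normal(24.0, 0.1, size=(int(60 * FPS), 64, 64))
x = three_point_stack(clip, event_frame=0)
print(x.shape)  # (64, 64, 3)
```

Presumably, stacking time points lets a single-image detector exploit a deposit's cooling dynamics, which the single-frame input (YOLOv8 Gray) cannot capture, while DeePosit's transformer-based video classifier sees the full frame sequence instead.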
https://drive.google.com/drive/folders/13md92rBTyqe1blTBNV1_7ObcudG-Jh1u?usp=drive_link