ThruTracker: Open-Source Software for 2-D and 3-D Animal Video Tracking

Tracking animal movement patterns using videography is an important tool in many biological disciplines ranging from biomechanics to conservation. Reduced costs of technologies such as thermal videography and unmanned aerial vehicles have made video-based animal tracking more accessible; however, existing software for processing acquired video limits the application of these methods. Here, we present a novel software program for high-throughput 2-D and 3-D animal tracking. ThruTracker provides tools for video tracking under a variety of conditions with minimal technical expertise or coding background and without the need for paid licenses. Notable capabilities include calibrating the intrinsic properties of thermal cameras; tracking and counting hundreds of animals at a time; and making 3-D calibrations without dedicated calibration objects. Automated 2-D and 3-D workflows are integrated to allow for analysis of large-scale datasets. We tested ThruTracker with two case studies. The 2-D workflow is demonstrated by counting bats emerging from bridges and caves using thermal videography. Tests show that ThruTracker achieves accuracy similar to that of human observers under a variety of conditions. The 3-D workflow is demonstrated by making accurate calibrations for tracking bats and birds at wind turbines using only the wind turbine itself as a calibration object. ThruTracker is a robust software program for tracking moving animals in 2-D and 3-D. Major applications include counting animals such as bats, birds, and fish that form large aggregations, and documenting movement trajectories over medium spatial scales (∼100,000 m³). When combined with emerging technologies, we expect videographic techniques to continue to see widespread adoption for an increasing range of biological applications.


Video-based animal tracking is a widely used tool in fields as diverse as biomechanics, animal behavior, ecology, and population monitoring (Dell et al., 2014). The reduced price of thermal video cameras, high-speed cameras, and other technologies such as unmanned aerial vehicles (UAVs) has dramatically expanded the capabilities of investigators studying animal movement in the lab and in the field (Jackson, Evangelista, Ray, & Hedrick, 2016). This has placed great demand on video processing software.

Intrinsic and extrinsic camera calibrations are used to generate direct linear transformation (DLT) equations that transform UV coordinates into real-world 3-D (XYZ) points for each detection. XYZ points are then stitched together across frames to generate 3-D tracks.

A single intrinsic calibration can typically be used for all cameras and lenses of the same make and model. In some special circumstances one might need to calibrate each individual camera-lens pair for increased accuracy. Cameras with variable focal length (i.e., a zoom lens) typically need to be calibrated separately at each focal length used for recording. We use MATLAB's built-in camera calibration functions (Bouguet, 1999) with some modifications to calibrate thermal images (Yahyanejad, Misiorny, & Rinner, 2011). We recommend using MATLAB's built-in camera calibrator app or freely available functions in OpenCV or Argus (Jackson et al., 2016) for calibrating light-based cameras.

Traditionally, calibration procedures for light-based cameras rely on detecting the corners of a checkerboard pattern. With thermal imaging, one must first heat the checkerboard so that the darker squares become hotter than the lighter squares because of their higher light absorption (Figure 2a). For example, we achieved this by moving two 100-watt lamps over the checkerboard pattern for about 20 seconds before taking thermal images.
However, uneven heating combined with rapid conduction of heat through the checkerboard blurs the boundaries between cold and hot squares (Figure 2b), making corner detection unreliable. Instead of detecting the corners, the cooler portions of the image are dilated so that each hot square is reduced in size and no longer connected to adjacent hot squares. A blob detector is then used to detect the centers of the hot squares. The pixel values of the image are then inverted, and the procedure is repeated to detect the centers of the cold squares. This is repeated for 20-30 images taken at different positions and orientations relative to the camera. The resulting data are used by the software to calculate the camera's intrinsic parameters (Figure 2c, d).

Step 4: Extrinsic camera calibration. A second calibration procedure is required for generating 3-D tracks for each recording setup, that is, any time the cameras are moved even slightly. This extrinsic calibration allows one to map objects that are detected in two or more cameras to real-world 3-D coordinates.

As noted previously, there are several approaches for generating extrinsic calibrations. ThruTracker has two options: a wand-based procedure based on methods used for easyWand (Lourakis & Argyros, 2009; Theriault et al., 2014) and a procedure based exclusively on background points. One can also import calibrations made using DLTdv or easyWand (Hedrick, 2008; Theriault et al., 2014). Background points include any object that is visible within both cameras, such as the tips of tree branches, wind turbines, or animals. Moving objects can be used if the images are taken synchronously between cameras. This background-point procedure [also known as structure from motion (Schönberger & Frahm, 2016)] means that 3-D calibrations can be made without dedicated calibration objects within the field of view. In effect, any object visible to two or more cameras can serve as a calibration object.
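The shrink-then-detect-centers procedure for thermal checkerboards described above can be sketched as follows; scipy.ndimage stands in for whatever blob detector one prefers, and the threshold choice is an assumption:

```python
import numpy as np
from scipy import ndimage

def square_centers(image, threshold=None, shrink=3):
    """Centers of hot squares: threshold the image, erode so that adjacent
    hot squares disconnect, then take the centroid of each connected blob."""
    if threshold is None:
        threshold = image.mean()  # crude split between hot and cold pixels
    hot = image > threshold
    # erosion shrinks each hot square so neighbors no longer touch
    hot = ndimage.binary_erosion(hot, structure=np.ones((shrink, shrink)))
    labels, n = ndimage.label(hot)
    hot_f = hot.astype(float)
    return np.array(ndimage.center_of_mass(hot_f, labels, range(1, n + 1)))

def all_centers(image):
    """Hot-square centers, then repeat on the inverted image for cold squares."""
    hot = square_centers(image)
    cold = square_centers(image.max() - image)
    return hot, cold
```

The combined hot and cold centers from 20-30 views then feed the intrinsic-parameter estimation in place of checkerboard corners.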
Two additional components of the calibration must be determined when no dedicated calibration objects are used: the scale of the scene and the gravitational axis. First, the scale can be set by specifying two points in the scene that are at a known distance from one another. Alternatively, one can use the distance between cameras to set the scale. Second, the gravitational axis can be set by measuring the inclination angle of one of the cameras using an inclinometer. This value is input into ThruTracker's calibration app during the calibration procedure. With these options, one can obtain calibration data rapidly in the field with only a few measurements and with no requirement for deploying calibration objects. This is especially helpful in situations where it is not feasible to deploy calibration objects or where their use would disturb the animals under study.

A notable downside of not using calibration objects is the absence of objects at known positions that can be used to check the accuracy of the calibration. Therefore, it is important to conduct tests of the calibration procedure using objects at known distances. For example, in our wind turbine test below, we measured the variation in the reconstructed lengths of the turbine blades at different times and positions throughout our recording. Another alternative is to conduct test setups in the field at locations where it is easier to deploy calibration objects such as a wand.

Step 5: Generating 3-D points. The DLT residual is the distance in pixels between the observed image coordinates of a marker and the "ideal" image coordinates computed from the estimated 3-D location of the marker and the calibration information for the camera that captured the image.
If we visualize each 2-D detection as a vector in 3-D space with its origin at the camera, the residuals indicate how closely a given set of 2-D detections and their associated vectors come to crossing in 3-D space. The algorithm starts by creating 3-D points from the sets of 2-D points with the lowest residuals and removing those points from the available pool. It proceeds until there are no more 3-D points with residuals below the specified threshold.

Step 6: Generating 2-D and 3-D tracks. Two-dimensional and three-dimensional points are stitched together across frames to make tracks. Each 2-D or 3-D point in the first frame is a putative track. Detections in each subsequent frame are assigned to existing tracks, or, if no assignment is made, they become the beginning of a new track. A Kalman filter is applied to each track to predict the position of the track in the next frame. The distances between each detection and the predicted positions are computed, both in 2-D and in 3-D, to build a cost matrix. The Hungarian algorithm is then used to determine the combination of assignments that minimizes total cost across the assignment matrix (Kuhn, 1955). Finally, a threshold is specified in ThruTracker such that assignments are only made if their cost is below the threshold. One can specify the number of frames between detections that are allowed (i.e., the gap distance) before a track is terminated. One also specifies the minimum number of detections required for a track to be retained. Longer gaps and smaller minimum numbers of detections increase the number of tracks that are retained but also increase the number of false-positive tracks generated from noise.

Step 7: Data visualization, analysis, and classification. ThruTracker offers multiple tools for visualizing and processing tracks. One can rapidly toggle between tracks and use shortcut keys to classify them into different categories.
For example, for wind turbine applications, tracks can be labeled "bird", "bat", "airplane", "noise", etc. Another option allows all tracks to be visualized at once. Tracks can then be selected as groups and classified based on their positions or their start or end points. This tool is helpful for selecting tracks based on their location, as exhibited in the bat emergence case study presented below. Another option allows the user to draw a rectangle over the camera image to count exits and re-entries as animals pass into or out of the rectangle. This is useful, for example, when counting bats exiting a cave roost. The resulting 2-D, 3-D, and track data can be exported to CSV files for use in other analysis programs.

2.2 Case Studies

2.2.1 Case Study 1: Counting bat exits from bridges and caves using thermal imaging.

We used ThruTracker to count bats emerging from bridges using a DJI Zenmuse XT2 thermal camera with a 13 mm lens (45-degree field of view) suspended from a DJI Matrice 300 drone (SZ DJI Technology Co., Shenzhen, China). Because this analysis was done in 2-D, there was no need for intrinsic or extrinsic calibrations. The drone was flown at altitudes of 50 m and 80 m above a bridge known to be a roost location for big brown bats (Eptesicus fuscus) in August 2020 near Burnsville, NC, USA. We also counted gray bats leaving caves using thermal cameras placed at ground level. Videos of gray bats were provided by the US Fish and Wildlife Service.

Our goals were 1) to determine the maximum distance at which bats could be detected and 2) to compare manual counts of emergences with those produced using ThruTracker. The bridge recordings provide a test of relatively low numbers of bats counted near the limits of their detection range. The cave recordings test detection of large numbers of bats with high rates of occlusion.
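Looking back at the tracking step (Step 6), the per-frame assignment of detections to tracks can be sketched as follows. The constant-velocity predictor is a simplified stand-in for ThruTracker's Kalman filter, the function and parameter names are hypothetical, and the cost threshold plays the role of the match threshold:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def assign_detections(tracks, detections, max_cost=10.0):
    """Assign new detections to existing tracks.
    Each track is a list of past positions; a constant-velocity prediction
    stands in for the Kalman filter's one-frame-ahead estimate."""
    preds = []
    for t in tracks:
        p = np.asarray(t[-1], float)
        v = (p - np.asarray(t[-2], float)) if len(t) > 1 else 0.0
        preds.append(p + v)
    # cost matrix: distance from each predicted position to each detection
    cost = np.linalg.norm(
        np.asarray(preds)[:, None, :] - np.asarray(detections, float)[None, :, :],
        axis=2)
    # Hungarian algorithm (Kuhn, 1955) for the minimum-cost assignment
    rows, cols = linear_sum_assignment(cost)
    matches = [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= max_cost]
    matched_dets = {c for _, c in matches}
    unmatched = [c for c in range(len(detections)) if c not in matched_dets]
    return matches, unmatched  # unmatched detections seed new tracks
```

Assignments whose cost exceeds the threshold are rejected, so a far-away detection starts a new track rather than corrupting an existing one.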
Videos were processed in ThruTracker with the following parameters: sensitivity, 35; background frames, 20; minimum object pixels, 1; maximum object pixels, 100; minimum track length, 5; maximum gap length, 5; match threshold, 10. After detections were made in ThruTracker, the TrackSelector applet was used to rapidly select tracks that originated near the edge of the bridge. Manual observers used VirtualDub software to play videos at 50% of normal speed and paused and reviewed videos frame by frame as necessary. We processed two videos taken at heights of 50 m and 80 m above the bridge and two videos representing different bat densities at caves (Table 1). The videos were not meant to census the entire emergence, but rather to provide data for comparing detection abilities.
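A detection pass in the spirit of these parameters can be sketched as follows; the frame-minus-median-background difference is a simplified stand-in for the Gaussian-mixture background model ThruTracker actually uses, and all names are hypothetical:

```python
import numpy as np
from scipy import ndimage

def detect(frame, background, sensitivity=35, min_pixels=1, max_pixels=100):
    """Return centroids of foreground blobs whose pixel count lies within
    [min_pixels, max_pixels]; `background` is, e.g., the median of the
    previous N frames (the 'background frames' parameter)."""
    fg = np.abs(frame.astype(float) - background) > sensitivity
    labels, n = ndimage.label(fg)
    fg_f = fg.astype(float)
    sizes = ndimage.sum(fg_f, labels, range(1, n + 1))
    # keep only blobs within the allowed pixel-size range
    keep = [i + 1 for i, s in enumerate(sizes) if min_pixels <= s <= max_pixels]
    return np.array(ndimage.center_of_mass(fg_f, labels, keep))

# background from a rolling stack of frames, e.g.:
# background = np.median(frame_stack[-20:], axis=0)
```

The per-frame centroids produced this way are what the tracking step stitches into 2-D tracks.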
We tested ThruTracker's ability to make 3-D calibrations for tracking the 3-D flights of bats and birds at wind turbines. Studying animal movements at wind turbines is a problem of considerable interest, especially for bats, which are being killed in large numbers for mostly unknown reasons (Arnett & Baerwald, 2013).

Our tests showed that the 50 m recording height was comfortably within the detection range for the bats. Bat detections had a size of 5.2 ± 3.4 pixels (mean ± st. dev.), and tracks extended across most of the image, terminating only in areas where bats lacked contrast with the background (e.g., see tracks terminating as they approach the land area in the bottom right portion of Figure 3a). Manual inspection of videos at these locations found that bats were not readily visible to the human eye, so this appears to be a limitation of the thermal imaging, not of the detection algorithm.

The 80 m recording height was much closer to the detection limit for E. fuscus bats using this recording setup. Bats had a detection size of 1.9 ± 1.1 pixels. Visual inspection of videos revealed that bats were only barely visible and that they were not detectable to the human eye within a short distance of their emergence. This is reflected in the ThruTracker tracks that terminate over open water some distance from the bridge exits (Figure 3b). This may have resulted from bats dropping to a lower altitude (and a farther distance from the camera) as they left the bridge, as was visible in other camera views taken at an oblique angle relative to the ground. ThruTracker performed slightly worse at the 80 m height than at the 50 m height, but still detected 93% of the tracks that were detected manually (Table 1).

Figure 3. Example ThruTracker detections of bats leaving a bridge (a, b) and cave (c, d). Big brown bats (Eptesicus fuscus) were filmed exiting a bridge using a thermal camera on a UAV flying at 50 m (a) and 80 m altitude (b). Red tracks indicate exits and white tracks indicate other detections. In (a, b) circles indicate the starting point of tracks to highlight departures from the bridge, and the entirety of all tracks is shown. In (c, d) detections from a single frame are shown (circles) with tracks indicating movement over the two previous frames. See Table 1 for statistics.

In our second case study, we demonstrate ThruTracker's 3-D workflow for calibrating large spatial volumes using only the turbine itself as a calibration object. We calibrated the FLIR A65 thermal cameras (640 x 512 pixel resolution) using the intrinsic calibration method described above. Example images and the resulting intrinsic camera parameters are shown in Figure 2. We used 28 checkerboard images, with an average reprojection residual error of 0.69 pixels (range 0.48-0.83 pixels).

We generated an extrinsic calibration in ThruTracker using 67 points from the wind turbine as background points (Figure 4). These included hot spots, corners, an anemometer on the nacelle, and the turbine blade tips. Points were digitized manually using DLTdv8 (Hedrick, 2008). Efforts were made to select points that 1) were clearly visible in both cameras, 2) were distinct points in 3-D space, such as small hot spots or sharp edges of objects, and 3) covered a broad range of 2-D and 3-D positions. We excluded six points because they had DLT residuals > 3 pixels. The remaining calibration had a mean reprojection error of 0.63 pixels.

We set the scale of the scene using the distance between the two cameras (33 m), and the gravitational axis was set using the inclination angle of the second camera (62.2 degrees). The resulting calibration had a volume of 235,597 m³ assuming a maximum detection distance of 200 m. The maximum detection range is likely less than 200 m for small bats (<30 g) with this camera setup, but it is possible that some large birds could be detected at this distance.

To test the spatial accuracy of our calibration, we measured the distance between the tips of the turbine blades at different times and positions throughout our recording.

ThruTracker provides integrated workflows for 2-D and 3-D tracking; a tool for calibrating intrinsic parameters of thermal cameras; the ability to track and count hundreds of animals simultaneously; and the ability to make 3-D calibrations without dedicated calibration objects. We demonstrate these capabilities by counting bats leaving bridges and caves (Table 1; Figure 3) and making a 3-D calibration using only a wind turbine as a calibration object (Figure 4).
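The DLT residual used to filter calibration points (those with residuals > 3 pixels were excluded above) follows directly from the standard 11-coefficient DLT projection; a minimal sketch with hypothetical coefficient values:

```python
import numpy as np

def dlt_project(L, xyz):
    """Project a 3-D point to image coordinates (u, v) using the 11
    standard DLT coefficients L[0]..L[10]."""
    x, y, z = xyz
    d = L[8] * x + L[9] * y + L[10] * z + 1.0
    u = (L[0] * x + L[1] * y + L[2] * z + L[3]) / d
    v = (L[4] * x + L[5] * y + L[6] * z + L[7]) / d
    return np.array([u, v])

def dlt_residual(L, xyz, uv_observed):
    """Pixel distance between an observed detection and the reprojection
    of its estimated 3-D position: the DLT residual."""
    return float(np.linalg.norm(dlt_project(L, xyz) - np.asarray(uv_observed, float)))
```

Averaging this quantity over all points and cameras gives the mean reprojection error reported for the calibrations.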
The software is compatible with thermal and light-based imaging and most standard video formats (e.g., avi, wmv, mp4). ThruTracker uses an app-based environment that requires no coding, making well-established detection and tracking algorithms widely available. Users simply import videos and select the detection and tracking options. These features make it easy to track moving animals under a variety of conditions.

ThruTracker uses a well-established background subtraction algorithm for object detection (Zivkovic, 2004), with the background generated from a number of frames that can be specified by the user. This approach is best suited for stationary backgrounds and animals that are in near-continuous motion. It has difficulty with animals that stay in one place; however, using more images to generate the background helps address this problem. ThruTracker aims to detect one point for each animal (the blob centroid).

Without the requirement for dedicated calibration objects, it is now possible to calibrate nearly any volume in the lab or field. We demonstrate this workflow by generating 3-D calibrations at wind turbines, where it would be logistically challenging to put calibration objects in the airspace (Figure 4). Another approach would be to use the animals themselves as background points (Corcoran & Hedrick, 2019).
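Setting the scene scale from the measured camera separation, as in the wind-turbine calibration (33 m between cameras), amounts to a single rescaling of the up-to-scale reconstruction; a minimal sketch with hypothetical names:

```python
import numpy as np

def apply_scene_scale(points, cam1, cam2, known_separation):
    """Rescale an up-to-scale 3-D reconstruction so that the distance
    between the two reconstructed camera centers equals the separation
    measured in the field (e.g., 33 m in the wind-turbine case study)."""
    reconstructed = np.linalg.norm(np.asarray(cam1, float) - np.asarray(cam2, float))
    scale = known_separation / reconstructed
    return np.asarray(points, float) * scale
```

The same single scale factor applies to every reconstructed point, which is why one known distance anywhere in the scene suffices.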

Technological development is driving price reductions and capability expansions in thermal and high-speed cameras, along with supporting equipment such as UAVs. However, the software required to make full use of these capabilities for research in fields as diverse as biomechanics, animal behavior, ecology, and population monitoring remains the province of specialized workflows in individual lab groups. ThruTracker provides an integrated, graphical, and user-friendly package to fill these needs, thus expanding the number of researchers able to make effective use of these emerging technologies.