Artificial Pinnae for Imitating Spatial Target Localization by the Brown Long-Eared Bat (Plecotus auritus)

Echolocating bats locate a target by ultrasonic echolocation and their performance is related to the shape of the binaural conformation in bats. In this study, we developed an artificial sonar system based on the vertical sound localization characteristics of the brown long-eared bat (Plecotus auritus). First, using the finite element method, we found that the beam of the first side lobe formed by a pinna constructed according to that in the brown long-eared bat shifted in an almost linear manner in the vertical direction as the frequency changed from 30 kHz to 60 kHz. We then estimated the elevation angle of the spatial target using the bat-inspired artificial active sonar. We employed an artificial neural network trained with the labeled data obtained from target echoes to directly estimate spatial targets. In order to improve the accuracy of the estimates, we also developed a majority vote-based method called sliding window cumulative peak estimation to optimize the outputs from the neural network. In addition, an L-shaped pinna structure was designed to simultaneously estimate the azimuth and elevation. Finally, we established a model of the relationship between the time-frequency features of the echo emitted by brown long-eared bats and the spatial direction by using the pre-trained neural network. Our field experiments indicated that the binaural conformation and relative binaural orientation both played vital roles in spatial target localization by these bats. Accurate echolocation can be achieved using a simple binaural sonar device even without binaural time difference information.

The disruptive development of mobile and flying robots such as automatic guided vehicles and unmanned aerial vehicles demands new sensory approaches for target search and obstacle avoidance. Machine vision-based methods can satisfy the basic requirements in favorable lighting environments, but conditions such as darkness, fog, and smoke hinder their wider application. Sonar sensing can effectively complement the commonly used vision sensing techniques in robot applications. In traditional sonar systems, transmitter and receiver arrays are used widely for navigation and localization, but the large number of sensors in these systems often causes difficulties during installation and layout. In contrast, the bat echolocation system is an extremely compact sonar system with only two sound receivers and one speaker, which is preferable for robot applications because of its simplicity, low cost, and high performance [1] [2].

The excellent navigation and target localization capabilities of bat sonar have attracted much attention from researchers, who have focused on various aspects. In particular, the relationships between echolocation and the auditory neurons or auditory cortex [3] [4] [5] have been investigated in order to identify a suitable echo processing method for navigation and localization by observing the activities of the auditory nerve and auditory cortex during echolocation by bats. In addition, studies of the behavior of bats [6] [7] [8] [9] [10] have provided great insights into the intrinsic mechanism responsible for navigation and localization by bats. Moreover, investigations have considered the acoustic roles of the physiological structures of bats, including the sound field characteristics related to the facial physiological structure [11], auricle structure [12], and vocal tract [13].
All of these studies have provided important insights to facilitate the artificial reproduction of echolocation by bats.

Some interesting results in bat research have been physically reproduced and even implemented in practical applications. Most two-dimensional target localization methods based on binaural systems rely on the interaural time difference [14] [15] or interaural intensity difference [16], whereas the majority of the three-dimensional (3D) target localization systems have employed multi-receiver designs [17] [18]. However, these types of applications are not fundamentally different from the conventional sensor array methods. In contrast, some reproduction techniques exploit the unique sound field characteristics determined by the physiological structure of bat ears, and thus imitate the localization effect of bat sonar more strictly. Müller et al. analyzed spatial localization based on information theory and designed a set of devices to imitate the structure of the horseshoe bat's ear [19]. Furthermore, Schillebeeckx et al. presented a bat-like experimental device for single target 3D spatial localization using artificial Phyllostomus discolor pinnae [20].

Significant progress has been made in bat research, but the large variety of bat species has resulted in diverse methods for the artificial reproduction of bat sonar. For example, Chiu et al. conducted behavioral experiments with big brown bats (Eptesicus fuscus) and found that the tragus is related to vertical sound localization [21]. In our previous study [22], we obtained similar results using a finite element method (FEM) [23] based on sonar in the brown long-eared bat (Plecotus auritus). In our localization simulation experiments using a bat-like device, we also examined the vertical localization performance of the time-frequency characteristics.
We found that in a specific wide range, the time-frequency characteristics of sonar in long-eared bats had a strong relationship with the vertical direction (i.e., elevation angle) and a weak relationship with the azimuth angle. We also showed that the time-frequency characteristics had a relatively strong correlation with the direction of the echo in the vertical direction.

Due to the physiological differences among bat species, the bio-inspired methods developed for 3D spatial target localization are highly variable. Kuc designed an experimental model based on an acoustic mirror formed by rotating a lancet, which exhibited an elevation versus notch frequency sensitivity [24]. Schillebeeckx et al. [20] used only two microphones and one ultrasonic emitter, and successfully localized a spatial target in the laboratory, which was an important advance in strictly bat-like sonar localization. In their experiment, they treated every location as a category and made independent assumptions for each direction in the hemispheric space. However, the number of categories must be sufficiently large to obtain high resolution, which consumes an excessive amount of computational resources in practical applications. In the present study, we estimated the angle of the echo using another approach inspired by the sound field characteristics of the pinnae in brown long-eared bats based on FEM simulations.
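To make the resolution argument concrete, the following sketch counts how many direction categories a per-location classifier would need on a regular angular grid. The grid limits and step sizes are illustrative assumptions (borrowed from the measurement ranges listed in Table 1), not the grid used in [20], and `count_steps` and `count_directions` are hypothetical helpers.

```python
def count_steps(start_deg, stop_deg, step_deg):
    """Number of grid points from start to stop (inclusive) at a fixed step."""
    return (stop_deg - start_deg) // step_deg + 1


def count_directions(az_range, el_range):
    """Categories needed when every (azimuth, elevation) pair is its own class."""
    return count_steps(*az_range) * count_steps(*el_range)


# Assumed grid: azimuth -90..90 deg in 10 deg steps, elevation 20..55 deg in 5 deg steps.
coarse = count_directions((-90, 90, 10), (20, 55, 5))
# Same field of view at 1 deg resolution.
fine = count_directions((-90, 90, 1), (20, 55, 1))

print(coarse)  # 19 azimuths x 8 elevations = 152 categories
print(fine)    # 181 x 36 = 6516 categories
```

A regression-style output, by contrast, keeps a single output unit per angle regardless of the angular resolution.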

This study obtained the following three main conclusions. First, we employed the FEM to simulate the sound field characteristics of the auricle structure in brown long-eared bats and the results indicated that the acoustic characteristics of this bat can effectively provide spatial resolution information in the vertical direction. Second, we constructed a bat-like device and conducted spatial target localization tests in a physical system using real echoes and artificial pinnae. Third, we showed that the scope of the spatial resolution measured by the bat-like device was related to the relative positions of the two pinnae.

In order to obtain more evidence to inspire the design of an accurate artificial bat-like sonar system, we conducted numerical simulations of the pinnae in Plecotus auritus by using the FEM and a Kirchhoff integral [25] to analyze the spatial frequency characteristics. In particular, 3D digital models of Plecotus auritus pinnae were obtained by digital image processing [26] using computed tomography scans of pinnae (Fig 1B). We placed a sound source in the inner ear canal in the numerical model and performed numerical calculations based on the FEM. First, we used the FEM and Kirchhoff integral to obtain the beam pattern for the digital ear in the simulation. The acoustic near field inside a cuboid-shaped volume surrounding the ear was then calculated using a FEM comprising linear cubic elements derived directly from the voxel shape representation. Finally, the far-field directivity pattern was calculated by projecting the complex wave field amplitudes outward onto the surface of the computational domain of the FEM using a Kirchhoff integral formulation.

[...] bat. To avoid damage during assembly and to allow the fixation of a microphone, each printed artificial bat-like ear was three times the size of the original ear.

The frequency range of Plecotus auritus ears is 20-60 kHz, so the frequency range used in the experiments was adjusted to 5-20 kHz according to the scale model principle [27]. A pair of ultrasonic microphones (SPU0410LR5H-QB, Knowles Electronics, Itasca, Illinois, USA) was placed inside the pair of artificial ears and insulating glue was applied to the gaps between the microphones and the pinnae to prevent outside sound waves from entering the microphones from the bottom of the model. The artificial ears were then fixed to a rotating platform and tilted forward by 40° (Fig 2E). A stepping motor was mounted under the rotating platform to rotate the ears and to measure the azimuth position. An ultrasonic loudspeaker (UltraSound Gate Player BL Light; Avisoft Bioacoustics e.K., Glienicke, Germany) was fixed under the stepping motor (Fig 2A,B).

[...] frequency-modulated pulse signal, which has similar frequency characteristics to a chirp [28], so we set the ultrasonic loudspeaker to emit a chirp pulse signal (see Fig 3A) in our active target echolocation experiments in order to imitate the pulse emitted by the brown long-eared bat. Its time-frequency intensity is shown in Fig 3D. We set the signal acquisition card to two-channel synchronous acquisition mode in order to collect the binaural signals synchronously. Details of the signals emitted in the active echolocation experiment are also listed in Table 1.
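As a rough illustration of the emitted signal, the sketch below generates a linear frequency-modulated (chirp) pulse with the band and duration listed in Table 1 (5-20 kHz, 5 ms). The sample rate, the downward sweep direction, and the function names `scaled_band` and `linear_chirp` are assumptions made for illustration; the text does not specify them. The sketch also shows what a simple 1/scale frequency rule gives for a pinna printed at three times the original size.

```python
import math


def scaled_band(f_low_hz, f_high_hz, scale):
    """Scale-model rule: enlarging the geometry by `scale` divides frequencies by `scale`."""
    return f_low_hz / scale, f_high_hz / scale


def linear_chirp(f_start_hz, f_end_hz, duration_s, sample_rate_hz):
    """Samples of a unit-amplitude linear FM pulse sweeping f_start -> f_end."""
    n_samples = int(round(duration_s * sample_rate_hz))
    sweep_rate = (f_end_hz - f_start_hz) / duration_s  # Hz per second
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz
        # Phase is the integral of the instantaneous frequency f(t) = f_start + sweep_rate * t.
        phase = 2.0 * math.pi * (f_start_hz * t + 0.5 * sweep_rate * t * t)
        samples.append(math.sin(phase))
    return samples


# Natural band of 20-60 kHz mapped to a pinna printed at three times the original size.
low, high = scaled_band(20e3, 60e3, 3)
print(round(low), round(high))  # about 6667 Hz and 20000 Hz; the experiments used 5-20 kHz

# Assumed: downward sweep and a 100 kHz sample rate (hypothetical values).
pulse = linear_chirp(20e3, 5e3, 5e-3, 100e3)
print(len(pulse))  # 500 samples for a 5 ms pulse
```

The simple rule yields a lower band edge near 6.7 kHz, slightly above the 5 kHz lower limit reported for the experiments.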
[...] where x(n) is the signal received after the endpoint detection process and W(n) is the window function, which shifts along the sound signal by a step length on the time axis. We [...]

E. Neural network for estimating the angle of the spatial target

Artificial neural networks are widely used as efficient methods in pattern recognition tasks. At present, deep learning is very popular because of its good classification performance, but we selected the traditional back-propagation (BP) feed-forward neural network as an estimation tool because of its simplicity, practicality, and convenient regression function. We applied the BP feed-forward neural network to three tasks comprising elevation estimation in the case of parallel erect pinnae, elevation estimation in the case of orthogonal pinnae, and azimuth estimation. The BP feed-forward neural networks used in the three tasks had the same structure (Fig 4A), which comprised an input layer with 60 neurons (30 + 30, i.e., the features extracted from the echo signals for the left and right artificial ears were fed directly into the network), a hidden layer with nine neurons, and an output layer with one output neuron. The three layers were fully connected in the BP neural network. Tan-sigmoid and pure linear functions were selected as the activation functions in the hidden layer and output layer, respectively. In the [...] time-frequency patterns generated from untrained ultrasonic echoes were fed into the network. The structure of the neural networks used in the three tasks was the same but the normalization functions were different because the output angles differed. The three types of outputs were linearly transformed into activities between 0.05 and 0.95, and they are shown in Fig 4B, 4C, and 4D, respectively.
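The network structure described above can be sketched as follows. This is a minimal illustration of a 60-9-1 forward pass with a tanh (tan-sigmoid) hidden layer and a linear output, together with the linear mapping between angle labels and activities in [0.05, 0.95]. The weights are untrained random placeholders, the 20-55° label range is taken from Table 1 as an example, and all function names are hypothetical.

```python
import math
import random

N_INPUT, N_HIDDEN = 60, 9          # 30 features per ear, nine hidden neurons
ACT_MIN, ACT_MAX = 0.05, 0.95      # target activity range of the output neuron


def angle_to_activity(angle_deg, angle_min, angle_max):
    """Linearly map an angle label onto the [0.05, 0.95] activity range."""
    ratio = (angle_deg - angle_min) / (angle_max - angle_min)
    return ACT_MIN + ratio * (ACT_MAX - ACT_MIN)


def activity_to_angle(activity, angle_min, angle_max):
    """Inverse mapping, applied to the network output to recover an angle."""
    ratio = (activity - ACT_MIN) / (ACT_MAX - ACT_MIN)
    return angle_min + ratio * (angle_max - angle_min)


def forward(features, w_hidden, b_hidden, w_out, b_out):
    """60-9-1 forward pass: tan-sigmoid (tanh) hidden layer, linear output."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Untrained placeholder weights; in practice these would come from BP training.
rng = random.Random(0)
w_hidden = [[rng.uniform(-0.1, 0.1) for _ in range(N_INPUT)] for _ in range(N_HIDDEN)]
b_hidden = [0.0] * N_HIDDEN
w_out = [rng.uniform(-0.1, 0.1) for _ in range(N_HIDDEN)]
b_out = 0.5

features = [rng.random() for _ in range(N_INPUT)]  # stand-in for left + right ear features
activity = forward(features, w_hidden, b_hidden, w_out, b_out)
elevation = activity_to_angle(activity, 20.0, 55.0)  # assumed 20-55 degree label range
print(round(angle_to_activity(20.0, 20.0, 55.0), 2))  # 0.05
print(round(angle_to_activity(55.0, 20.0, 55.0), 2))  # 0.95
```

Keeping the labels away from 0 and 1 leaves headroom at both ends of the output range, so that estimates slightly outside the trained angles can still be re-transformed.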
In the estimation process, the output neuron activities were linearly re-transformed according to the activity functions and used as the angle estimates for the respective inputs in each task.

[...]

Frequencies spanning the entire frequency range (from 20 kHz to 60 kHz with a step size of 1 kHz) known to be covered by the first harmonic [29] of the biosonar pulses were analyzed using the numerical method (Method A). The first side lobe in the beam pattern (Fig 6A, 6B) exhibited a frequency-driven scanning characteristic and its power was relatively strong when the frequency exceeded 32 kHz. When the frequency was less than 30 kHz (Fig 6C), the half-power beam width was relatively large, the power of the side lobe was low, and the lobe was poorly directed, which resulted in low directional resolution. The beam direction of the first side lobe shifted along the elevation in an almost linear manner as the frequency changed from 30 kHz to 60 kHz, whereas the azimuth of the side lobe remained almost stable (Fig 6D). The peaks of the beams for different frequencies appeared to alternate along the elevation, thereby suggesting that in this frequency band, the combination of the frequency intensities changed greatly with the elevation, and this implied that the [...]

Table 1. [...]

No. | Item | Description
--- | --- | ---
2 | [...] | Three times the original size
3 | Microphone distance | 5 cm
4 | Relative position between target and device | As illustrated in Fig 5, from the top view, the target was fixed on the green spots, and the device was located at the eight different locations in turn to obtain the training data. Yellow spots A, B, and C were used for locating the device to collect the testing data. The detailed layout of the target and device in the training and testing processes is described in the Results section.
5 | Transmission signal | Linear frequency modulation signal
6 | Frequency range | 5-20 kHz
7 | Passive signal duration | 2 s
8 | Active signal duration | 5 ms
9 | Alteration mode of azimuth | By altering the height of the target (small ball)
10 | Alteration mode of elevation | By rotating the pinna
11 | Scope of azimuth alteration | -90° to 90° (step length: 10°) for estimating the elevation in the parallel erect pinnae experiments; -28° to 28° (step length: 7°) for the orthogonal pinnae experiments
12 | Scope of elevation measurement | 20-55° (step length: 5°) in the parallel erect pinnae experiments; 12-68° (step length: 7°) for the orthogonal pinnae experiments
We used the artificial bat-like sonar device to emit and receive acoustic signals with the two artificial pinnae pointing upward and collected data using the data acquisition device (Method C), as summarized in Table 1 [...]

The simulation results demonstrated the differentiation of the frequency combination with the elevation in a linear region. To test whether the properties of this frequency combination were related to the elevation in a wider range, we combined 10 data sets of spatial feature samples according to the different widths in the horizontal direction, and estimated the elevations in each data set. Single pulses and different sets of data categories under different azimuth scopes (Fig 7A) were used to evaluate the accuracy of the elevation estimated under different width ranges. In order to obtain reliable results, the sonar device was located at eight different sites (blue circles in Fig 5), where the small ball used as the target was located at the center, as indicated by the green circle in Fig 5. The horizontal distance between the ball and the sonar device was 1.5 m. The training data and testing data were different but they came from the same data set, as explained in Method G. The values of −N° to +N° represented the scope of the limited azimuth angle from the left and right (Fig 7A). [...] nearly 100% with the scope from −20° to +20° of the limited azimuth. The average accuracy exceeded 90% when the limit angle of the azimuth was −60° to +60° and the average accuracy exceeded 80% even when the limit angle of the azimuth was −90° to +90°. These findings illustrated the good wide-angle effect on elevation estimation and the high accuracy in the middle direction of the azimuth, which are essential for target detection.
In contrast, the average accuracy of the azimuth estimation was only about 35% when the limited angle of elevation was 20° to 55° (Fig 7B). Thus, the [...]

3) Generalization tests
Robustness testing methods were designed to verify the generalizability of the system used for estimation. The training data were the same as those used for single pulse target elevation estimation. The training process was also similar to the process employed for single pulse target elevation estimation, except that no data in the training set were retained for use as test data. Thus, for every limited scope of the azimuth, the amount of data used in each training set was ((N/10) × 2 + 1) × 8 (e.g., 152 for N = 90). The generalization testing samples were collected in the same experimental chamber but under conditions where the artificial bat-like devices were placed outside the eight sites used in the training process (e.g., sites A and B in Fig 5). The neural network used for the test was trained with the ±90° limited scope of the azimuth and under the 1.5 m condition, as described above. Cross-validation was not conducted in the testing process. SWCPE (Method F) was also used for multiple pulse testing to obtain the final estimated values. Fig 8C shows the performance under six conditions. The results were similar for the steel ball used as a reflector in this experiment, although it was not used for training.
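The details of SWCPE (Method F) are not given in this part of the text, so the following is only one plausible reading of a majority vote over a sliding window, not the authors' implementation. It assumes that each pulse in a train yields one angle estimate, that a fixed-width window is slid along the angle axis, and that the mean of the estimates inside the most populated window is taken as the final value. The window width, step size, and the function name `swcpe` are hypothetical.

```python
def swcpe(estimates, window_deg=5.0, step_deg=1.0):
    """Return the mean of the estimates inside the most populated sliding window."""
    if not estimates:
        raise ValueError("at least one estimate is required")
    lo, hi = min(estimates), max(estimates)
    best_votes, best_members = 0, []
    n_windows = int((hi - lo) / step_deg) + 1
    for i in range(n_windows):
        start = lo + i * step_deg
        members = [e for e in estimates if start <= e <= start + window_deg]
        if len(members) > best_votes:   # keep the first window with the most votes
            best_votes, best_members = len(members), members
    return sum(best_members) / len(best_members)


# Hypothetical per-pulse elevation estimates (degrees) with two outliers.
pulse_estimates = [34.1, 35.2, 36.0, 35.5, 52.3, 34.8, 21.0, 35.9]
print(round(swcpe(pulse_estimates), 2))  # close to 35.25; the outliers are ignored
```

Voting over a window rather than averaging all outputs keeps isolated outliers, such as a single badly distorted echo, from shifting the final estimate.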

[...] In the elevation estimation process described above, the two pinnae of the brown long-eared bat were parallel to each other. We observed that the two pinnae of the brown long-eared bat often spread to a certain angle when hunting for prey. If the angle is 90°, the orthogonality of the two pinnae can be used to obtain the aspect angles in the two orthogonal directions. The orthogonal pinnae of the artificial brown long-eared bat used for target localization are shown in Fig 9. Except for the differences shown in Table 1, the measurement conditions were the same as those employed in the generalization tests.

In this study, we developed a "strictly bat-like" sonar system inspired by the brown [...] direction were related to the combination of the frequency components and this vertical correlation was maintained in a wide range up to ±60° (Fig 8B). The correlation was still present in the horizontal direction (Fig 7B), but it was significantly weaker than that in the vertical direction.

We also identified the intrinsic relationship between the time-frequency features of the echoes used by the brown long-eared bat and the spatial direction. Spatial angle information could be obtained from the echo, and thus the echo must contain angle information. According to the relationship between the features of the echolocation-related transfer function (ERTF) and spatial position, if we use X to denote the ERTF features and f to express their distribution in two-dimensional space (θ, ϕ), then X can be written as follows.

[...]

Our experiments suggested that in a certain range of azimuth angles, a function g(X) exists that is only related to ϕ:

[...]

where θ and ϕ denote the azimuth and elevation of a target, respectively. Thus, it is necessary to determine the function in terms of the relationship between the angle information and features in the orthogonal coordinate system. This function is usually difficult to obtain, but the relationship can be approximated by the BP-based feed-forward neural network as:

[...]

where k is the number of output variables, and f_1 and f_2 are discrimination functions in the hidden and output layers, respectively. In the case of parallel erect pinnae, we can obtain:

[...]

where f is the discrimination function in the hidden layer:

[...]

For orthogonal pinnae, we can obtain g_ϕ(X) and g_θ(X) separately using two independent neural networks.
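The equations referred to in this passage are not present in the text. A plausible reconstruction is sketched below, assuming the standard form of a feed-forward network with one hidden layer. The weights w, the biases b, the indices i, j, and the input components x_i of X are symbols introduced here for illustration, and the layer sizes (60 inputs, nine hidden neurons) and tanh hidden function follow the network description given earlier; the authors' exact notation may differ.

```latex
\begin{align}
X &= f(\theta, \phi) \\
g(X) &= g\bigl(f(\theta, \phi)\bigr) \approx \phi \\
y_k &= f_2\Bigl(\sum_{j} w_{jk}\, f_1\Bigl(\sum_{i} w_{ij}\, x_i + b_j\Bigr) + b_k\Bigr) \\
g_\phi(X) &\approx \sum_{j=1}^{9} w_j\, f\Bigl(\sum_{i=1}^{60} w_{ij}\, x_i + b_j\Bigr) + b \\
f(z) &= \tanh(z) = \frac{2}{1 + e^{-2z}} - 1
\end{align}
```

Under this reading, the linear output layer reduces f_2 to the identity in the parallel erect pinnae case, which is why only the hidden-layer function f appears in the fourth line.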

Feature distortion had important effects on the estimated results and a change in the SNR was one of the factors responsible for feature distortion, as also mentioned by Schillebeeckx et al. [36]. However, for the results estimated directly for the targets made of different materials and at different distances (Fig 8C), the SNR was not the only factor responsible for distortion. In particular, the SNR for the echo from the steel ball was much higher than that for the wave reflected from the rubber ball, but there was no significant difference in the estimates for the steel ball and rubber ball.

When we tested the targets located at 2 m using the network trained at 1.5 m, the results were poor even though the SNR of the reflected echo at 1 m was better than that for the target at 2 m. No other form of distance compensation was conducted in our experiments apart from using a simple classification decision method for multiple outputs. However, a real bat may have the ability to compensate for distance. Effectively compensating for the loss of echo features at different distances should be addressed in future research.

In this study, we developed a single target direction estimation method within a large field of view. In addition to the direct application of this method to the recognition of a single target by artificial machine bats, it can also provide a preliminary estimate before subsequently obtaining more accurate estimates. Multi-target direction estimation and sonar imaging based on a strictly bat-like system will be more challenging, and this will also be considered in future research. It should be noted that the current system is only an approximation and some problems still need to be investigated, such as whether a single receiver located at the ear root is an authentic reproduction of the reception method in bats and the potential effects of different auricle materials. Therefore, the bat-like model will be improved further in order to authentically reproduce the working mode of bat sonar based on biological discoveries. Moreover, further applications inspired by this method will be investigated in our future research, such as changing the medium to electromagnetic waves, the application of biomimetic radar, and applications in underwater sonar systems.

In this study, we demonstrated that the frequency combination had a strong relationship with the direction of the sound beam based on FEM simulations of the pinnae in Plecotus auritus. We then constructed a physical experimental platform for collecting the active sonar signals. Erect pinnae with a BP feed-forward neural network algorithm were employed to estimate the spatial angle of a single target and tenfold cross-validation results indicated the high accuracy of the elevation estimations. Generalizability experiments using a rubber reflector and steel reflector demonstrated the robustness of our method. A pulse train was used to provide multiple inputs in order to compensate for the loss caused by distance. We also developed a new method called SWCPE for determining the optimum value based on multiple outputs from the pulse train input. Experiments using orthogonal pinnae showed that the L-shaped pinna structure could improve the target detection ability in the horizontal direction without the need to acquire more information from the echo signals, which was not the case with the erect pinnae. In contrast to other spatial angle classifier estimation methods, our proposed algorithm only used a one-dimensional label, i.e., either the elevation or azimuth, as the output from our classifier model. Our experiments also suggested that the proposed structure and model are suitable for accurate spatial target echolocation with a limited computational load in a real-time system.

We thank all the anonymous reviewers for their helpful and stimulating comments on