The repo with code that infers tag detections from raw data is find_tags. The command-line parameters are documented in find_tags_motus.cpp: the description in lines 222-456 is current. The other comments in the code are fairly out of date, but perhaps still useful for getting a general idea of how it works.
Detection of tags is not a deterministic process. Although the tags are simple enough that we can count on them to produce a continuous sequence of bursts with fairly precise timing, there are effects that disrupt the signal on its way to our detector code:
atmospheric effects can drastically change the quality and number of transmission paths from tag to receiver
orientation of the antenna on the bird, and spatial configuration of the bird's body can change the strength of signal leaving in any given direction (the bird's body is well within the tag antenna's "near field", and so can influence the pattern of radiation it emits)
pulses from multiple tags can collide, preventing detection of both tags, or of one tag but not the other. To detect a burst from a tag, we need all four of its pulses to be detected. (In principle, we could allow for missed pulses, using a mechanism similar to what is used for missed bursts; this would require filtering out sequences that are compatible with multiple tags, and I'd guess it would significantly increase the memory consumption and running time of the tag finder.)
noise can mask pulses, as well as generating new ones
timing latency in the USB stack can lead to discrepancies in the inter-pulse spacing, causing a burst not to be recognized. The USB hub on SGs (whether built-in or external) has to interleave packets generated independently on multiple USB devices, and send them down a single USB channel to the host. This means some packets are delayed more than others. Additional latency issues are caused by USB 1.0 audio packets from the funcubedongles having to be re-packaged into USB 2.0 frames (transaction translation).
excessive noise on the USB system can cause data packets to be dropped, which, because funcubedongles use USB isochronous transfer mode (where low latency is prioritized over correctness), means that packets are simply lost. This disrupts short-scale time calculations, again leading to a non-recognized burst.
antennas might be multiplexed (e.g. on Lotek receivers), meaning only one antenna's signal is processed at any given time, so the stream of detections from a single antenna may miss bursts. For example, if a tag has a burst interval of 10 seconds, and the receiver is multiplexing 6 antennas with a switching period of 10 seconds, then on average a given antenna will only see every 6th burst. And even all antennas considered together might not see all bursts, depending on the direction and distance to the tag. (Most set-ups use very directional Yagi antennas, so there are large volumes of effectively uncovered space between regions of good coverage, especially beyond a certain range.)
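The multiplexing arithmetic in the last item can be checked with a small sketch. It assumes an idealized receiver that cycles through its antennas in fixed order, and a tag whose every burst would otherwise be detectable; the function name and the `phase` offset are made up for illustration.

```python
def bursts_seen_per_antenna(burst_interval, n_antennas, switch_period,
                            n_bursts, phase=0.5):
    """Count how many bursts fall in each antenna's listening window.

    Idealized: antennas are cycled in fixed order, each for switch_period
    seconds, and every burst that arrives while an antenna is active is seen.
    """
    seen = [0] * n_antennas
    for k in range(n_bursts):
        t = phase + k * burst_interval                  # time of burst k (s)
        active = int(t // switch_period) % n_antennas   # antenna listening at t
        seen[active] += 1
    return seen


# 10 s burst interval, 6 antennas, 10 s switching period, 600 bursts (100 min)
counts = bursts_seen_per_antenna(10.0, 6, 10.0, 600)
print(counts)  # each antenna gets 100 of the 600 bursts, i.e. every 6th one
```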
Because of such factors, we have to allow for considerable slop in detections. This is controlled by parameters to find_tags_motus:
pulse_slop: for timing latency issues that mean intra-burst intervals won't always be exact
burst_slop: for larger discrepancies in the interval between bursts, sometimes due to temperature effects on the tag's own clock
max_skipped_bursts: we want to assemble runs of bursts from the same tag, but take into account the fact that we often won't see every burst in a run. So we allow for up to a maximum specified number of bursts to be skipped, keeping track of the elapsed time and requiring the next detected burst to come at an appropriate time.
timestamp_wonkiness: Lotek receivers appear to have a crude way of syncing their clock to GPS time, wherein the whole number of seconds is synced but the fractional seconds (used for calculating inter-pulse intervals) are not. This keeps a time sync from breaking up bursts, but leads to burst intervals which can be off by exactly +/-1 s from what they should be.
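To make the interplay of these parameters concrete, here is a minimal sketch of the kind of test that decides whether a newly detected burst can extend a run. It is not the logic in find_tags_motus.cpp: the function `skipped_bursts_for_gap` and the flag `allow_whole_second_offset` are hypothetical, all times are taken to be in seconds, and the slop is assumed to stay fixed no matter how many bursts were skipped.

```python
def skipped_bursts_for_gap(gap, burst_interval, burst_slop,
                           max_skipped_bursts,
                           allow_whole_second_offset=False):
    """Return how many bursts were skipped if `gap` fits the tag, else None.

    gap: time between the last accepted burst and the candidate burst (s)
    burst_interval: the tag's nominal burst interval (s)
    burst_slop: tolerated deviation from the expected gap (s)
    allow_whole_second_offset: illustrative stand-in for timestamp_wonkiness;
        also accept gaps that are off by exactly +/-1 s
    """
    offsets = (0.0, -1.0, 1.0) if allow_whole_second_offset else (0.0,)
    for skipped in range(max_skipped_bursts + 1):
        expected = (skipped + 1) * burst_interval
        for offset in offsets:
            if abs(gap - expected - offset) <= burst_slop:
                return skipped
    return None  # too late, too early, or too many skipped bursts


# A tag with a 10 s burst interval, 10 ms of slop, up to 2 skipped bursts:
print(skipped_bursts_for_gap(10.004, 10.0, 0.01, 2))   # next burst, on time
print(skipped_bursts_for_gap(30.002, 10.0, 0.01, 2))   # two bursts were missed
print(skipped_bursts_for_gap(40.000, 10.0, 0.01, 2))   # too many missed
print(skipped_bursts_for_gap(11.003, 10.0, 0.01, 2,
                             allow_whole_second_offset=True))  # 1 s clock jump
```

Keeping the number of skipped bursts, rather than just a yes/no answer, lets the caller track the elapsed time across gaps in a run.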
Why are runs the basic unit of detection?
individual bursts don't uniquely identify tags; there are only 521 distinct coded IDs (in the Lotek4 codeset; only some 250 in Lotek3), and several of these are not (or at least should not be) used because of high susceptibility to false positives. We need at least 2 bursts, and so the interval between them, to identify a tag uniquely (ignoring ambiguity).
many raw files are noisy, containing far more bursts (Lotek) or pulses (SG) than are being generated by tags. Because detection of a burst depends only on presence of pulses at the right intervals (and some sanity checking of frequency and signal strength coherence, on SGs), presence of many noise pulses means many more apparent bursts detected. But the chances of noise pulses generating a sequence of pulses (or bursts) compatible with a given real tag decline exponentially with the number of bursts, so we can use runs to filter the potentially huge number of false positives.
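The exponential decline can be illustrated with a toy model. Assume, purely for illustration, that noise produces a compatible burst at any one expected time with some fixed probability, independently of the others; the value 0.05 below is made up and is not a measured rate.

```python
def false_run_probability(p_noise_burst, run_len):
    """Chance that noise alone fills run_len consecutive burst slots,
    assuming each slot is filled independently with probability
    p_noise_burst (a simplifying assumption)."""
    return p_noise_burst ** run_len


# Each extra burst required multiplies the false-positive chance by p.
for run_len in range(1, 6):
    p = false_run_probability(0.05, run_len)
    print(f"runLen = {run_len}: {p:.2e}")
```

Under this model, requiring one more burst in a run cuts the expected number of false runs by the same factor each time, which is why a modest minimum run length removes most noise-generated detections.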
Our traditional filtering criteria have been:
freqsd < 0.1 (SG detections only; this enforces a fairly tight level of offset frequency coherence among pulses within a single burst; we don't enforce this across bursts because there are temperature, Doppler, and body configuration effects that can change offset frequency significantly between bursts; the 4 pulses from a burst are generated in ~250 ms or less, while bursts can be up to 40 seconds apart, so the time scales are quite different).
runLen > 2 (sites with low noise) or runLen > 3 (most sites). Sometimes, users have had to increase this threshold. A run with only two bursts (runLen == 2) should almost never be treated as a real detection, unless there are strong corroborating circumstances (e.g. nearby runs of the same tag on other antennas of the same receiver).
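As a rough sketch, the two criteria could be applied to a list of detection records like this. The fields `freqsd` and `runLen` are the ones named above; the `is_sg` flag, the record layout, and the function name are assumptions made for the example.

```python
def passes_traditional_filter(det, run_len_threshold=3, max_freqsd=0.1):
    """Apply the freqsd and runLen criteria to one detection record.

    det: dict with "freqsd", "runLen" and a hypothetical "is_sg" flag
    run_len_threshold: 3 for most sites, 2 for sites with low noise
    """
    # freqsd is only enforced for SG detections; Lotek records may lack it.
    if det["is_sg"] and not det["freqsd"] < max_freqsd:
        return False
    return det["runLen"] > run_len_threshold


detections = [
    {"is_sg": True,  "freqsd": 0.02, "runLen": 5},   # kept
    {"is_sg": True,  "freqsd": 0.30, "runLen": 5},   # dropped: freqsd too large
    {"is_sg": True,  "freqsd": 0.02, "runLen": 2},   # dropped: run too short
    {"is_sg": False, "freqsd": None, "runLen": 4},   # kept: Lotek, no freqsd test
    {"is_sg": True,  "freqsd": 0.02, "runLen": 3},   # kept only at low-noise sites
]

kept_most_sites = [d for d in detections if passes_traditional_filter(d)]
kept_low_noise = [d for d in detections
                  if passes_traditional_filter(d, run_len_threshold=2)]
print(len(kept_most_sites), len(kept_low_noise))
```

A filter like this cannot express the corroborating-circumstances exception for short runs; that still needs a look at neighbouring runs on other antennas.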