Monday, March 3, 2014

Reality mining: tracking individuals in groups

The study of animal social behavior is undergoing a phase shift. Ever-improving tracking technology now allows unprecedented resolution of the mechanisms underlying social interactions in large groups in the wild. Researchers can now begin asking questions that require knowledge of individual identities, fine-scale association patterns, and more. The review article covered in this blog post, Krause et al. 2013, investigates and explains these exciting new technologies.

Article details
- Krause J, Krause S, Arlinghaus R, Psorakis I, Roberts S, Rutz C. 2013. Reality mining of animal social systems. Trends in Ecology & Evolution. 28: 541-551.
- Corresponding authors (Krause J and Rutz C) are affiliated with:
   o The Department of Biology and Ecology of Fishes in the Leibniz Institute of Freshwater Ecology and Inland Fisheries and the Department of Crop and Animal Sciences in Humboldt University, Germany
   o The School of Biology at the University of St. Andrews, UK

Very brief summary
Automated technology can now provide data at time intervals small enough to get at the mechanisms of social processes in terrestrial and marine animal groups. Acquiring and analyzing these tremendous amounts of data has been dubbed 'reality mining.' Reality mining can yield descriptive and predictive models of disease transmission, predator-prey interactions, information flow within and between populations, formation of social hierarchies, cooperation between individuals, and responses to ecological events. A major trade-off in these technologies is battery life and weight versus how detailed the data are. Two major applications of reality-mined data are social network analyses and Hidden Markov Models.

Glossary
- association pattern - information on which individuals were physically close to one another over time. Loggers can measure direct patterns, where proximity between two individuals causes their loggers to record each other's identities, or indirect patterns, which are inferred either from records of when an individual was near a fixed receiver (e.g. by a bird feeder) or from a later comparison of the GPS positions of two individuals showing that they were near each other at some point

- biologger - a miniature animal-attached tag that stores information. The data are not uploaded continuously; rather, the tag has to be within range of a receiver for the data to be downloaded (see biotelemetry: passive)

- biotelemetry - a general term for technology where a signal is sent from a transmitter (e.g. on the animal) to a receiver elsewhere. Active biotelemetry is where the data are uploaded continuously, while in passive biotelemetry, the data are uploaded when the transmitter and receiver are within range

- GPS - global positioning system. The system uses satellites to provide time and location information

- Hidden Markov Model - a model of probabilities based on random processes that can't be directly observed, only the outcomes of those processes. e.g. a model depicting the probability that we observe a fish school migrating at this particular moment, based on movement data that says "when the fish move like this, they are preparing to migrate; when they move like that, they are just hungry; when they move this other way, they are courting, etc." We can't actually know what's going on in the heads of the fish, but based on their movement patterns (which we can observe), we can estimate their motivations

- PIT/RFID - passive integrated transponder tag / radio-frequency identification. A small microchip placed on the animal allows for touch-free future identification; when exposed to the radio field around a receiver, the tag transmits its identification code to be recorded by the receiver. e.g. a bird feeder equipped with an RFID reader can record the identity of a PIT-tagged bird and the time it uses the feeder

- reality mining - the collection and analysis of machine-acquired data on social behavior of animals or humans

Article summary-----------------------------------------------------------------------------------
From humans to animal groups
The advent of the internet completely redefined the possibilities of researching human behavior. Humans browsing the internet leave digital 'footprints' of the websites they visit, from passive information like visitation rates and visit duration, to more active data like comments and clicks. Consider the fact that as of February 2014, more than 95% of the Western world uses cell phones, Facebook has 1.3 billion monthly active users, and more than 500 million messages are posted on Twitter daily. This allows for analyses of nearly entire human populations with incredible spatiotemporal accuracy (though it definitely makes you think twice about your internet habits...!).

The interdisciplinary field of data mining is dedicated to finding patterns in enormous pools of data like human internet usage. Population-level patterns can emerge from analyzing aggregates of individuals' data. These include the topology and dynamics of social networks (i.e. who you're connected to and how that changes over time), the flow of information within and across populations (e.g. how a town learns about the murder of a politician vs. how the rest of the U.S. finds out), and the daily activity of an entire population (e.g. when, exactly, are 13- to 17-year-old males visiting Facebook, to be most efficient with advertising that action movie?).

Where we're coming from
For decades, behavioral ecologists have been limited to observing animals indirectly, by eye, or with surreptitiously placed video cameras. This makes systematic and disturbance-free observations of animals in the wild incredibly difficult, especially for animals that steer clear of humans, are rare, or live in inaccessible habitats (such as underwater!). Now, though, tracking technology has advanced to allow us to begin collecting tremendous amounts of data on the movements, behavior, physiology, and/or environments of animals, even the animals we had trouble tracking before. This tidal wave of data is expected to have an impact similar to what the internet had on understanding human behavior.

Previously, researchers relied on standardized resighting methods to obtain data on a small number of individuals in a population. Say you wanted to catalogue the social interactions of a 40-member baboon troop. If you want to have accurate results for an individual, you'll want data on many of that individual's social interactions. Simultaneously, though, you'll want data from as many individuals as possible... what's the point of knowing a lot about only one individual, 2.5% of the group?


You and two undergraduate assistants set out to record this troop with video cameras, focusing on six of the forty individuals (those six are the easiest to tell apart from a distance). You get 25 hours of video, over four days. You'd like more time but the troop crossed a river into a game reserve that doesn't allow researchers in. Now comes the hard part. Extracting the social interactions from those videos will take FOREVER (only a minor exaggeration). The more detail you want from the videos, the more times you will have to rewatch the videos. With whom did Individual 1 interact? Where were they? How long was the interaction? Was it aggressive, neutral, conciliatory? Was there more than one stage to the interaction? You return from Africa and spend the next few months coding the videos. (Let's read that again: you will go through the start of the academic year, Halloween, Thanksgiving, the winter holidays, the New Year, and probably Valentine's Day watching and rewatching the same videos from your summer in Africa.)

The more time you spend coding, the more detailed your data will be... but ultimately, the data only cover 15% of the members of the troop, over a very short time scale. Your data hint at an interesting association between subordinate adult males and subordinate juvenile males...  but despite all your work, there just aren't enough data to know for sure. You realize that for the next field season, the scope of your questions must be much more focused and your data collection more rigorous than you could have anticipated.

Current tracking technology
Current reality-mining technology makes the research in the previous paragraphs immensely easier. It's now possible to get social associations and spatial data down to the second for every member of a group (assuming you have the money, of course). The output of these technologies isn't a video that needs to be analyzed by hand; the output is the data you would have had to extract from the video, and it's likely to be more accurate than if the data were collected by a human.

While the major tracking technologies differ in what, exactly, they record and how the data are uploaded, they all involve some sort of miniature logger placed on the animal. The logger can be a collar or a chip planted underneath the skin or attached to the foot. The black-capped chickadee on the right is carrying such a logger (a PIT tag in this case) on its foot. (The lime green rings are for visual identification.)

The loggers fall into two general categories: biologgers and biotelemeters. Biologgers store the data in internal memory, and the data are downloaded later. Think of the logger as an SD card for your camera. Biotelemeters, on the other hand, are like a radio shooting off information to be recorded elsewhere. Biotelemeters can be active, where they're transmitting information constantly (and usually require a battery), or passive, where they transmit information when they're close to the receiver (and get a jolt of life from the receiver).

So what can you actually gain from biologgers and biotelemetry? For one, quantifying social interactions becomes much easier. Association patterns are essentially lists of when an individual was close to another particular individual. Instead of you recording this from a video (or, more challenging, live through binoculars), the tracker on the animal does it for you. The resulting association pattern is either direct or indirect. Direct encounter patterns are a list of animal-animal interactions; any time two loggers are close, both record who the other individual was and when they met (and in a few exceptional cases, also where). This requires that the loggers act as both transmitters and receivers, which can require more expensive technology and more battery life.

Indirect encounter patterns are more common and less costly. These data can be the GPS spatial coordinates of an individual over time (which you can then compare with the coordinates of other animals to see when they were close to each other), or they can come from fixed receivers that record an animal's visit to a location. This location can be a feeder for mammals or birds, or it can be a section of a river, as in the picture on the right. The monitors record the identities of tagged fish that swim through this section of the stream. While these data aren't nearly as detailed as direct encounter patterns, the advantage is that the loggers can be much smaller and have markedly longer battery life.
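
To make the GPS comparison concrete, here is a minimal Python sketch of how indirect encounters might be pulled out of position fixes. Everything in it is an assumption made for illustration: the function names, the 10 m distance threshold, the made-up fixes, and the simplification that all animals are sampled at exactly the same timestamps. It is not the procedure from the review.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in metres
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * r * asin(sqrt(a))

def indirect_encounters(fixes, max_dist_m=10.0):
    """Return (timestamp, id_a, id_b) for every pair within max_dist_m.

    fixes is a list of (animal_id, timestamp, lat, lon) tuples. Only fixes
    that share a timestamp are compared (an illustrative simplification).
    """
    by_time = {}
    for animal, t, lat, lon in fixes:
        by_time.setdefault(t, []).append((animal, lat, lon))

    encounters = []
    for t in sorted(by_time):
        present = sorted(by_time[t])
        for i in range(len(present)):
            for j in range(i + 1, len(present)):
                a, lat_a, lon_a = present[i]
                b, lat_b, lon_b = present[j]
                if haversine_m(lat_a, lon_a, lat_b, lon_b) <= max_dist_m:
                    encounters.append((t, a, b))
    return encounters

# Hypothetical fixes: three animals, two sampling times (seconds).
fixes = [
    ("A", 0, 0.0, 0.0), ("B", 0, 0.00005, 0.0), ("C", 0, 0.001, 0.0),
    ("A", 60, 0.0, 0.0), ("B", 60, 0.001, 0.0), ("C", 60, 0.00002, 0.0),
]
encounters = indirect_encounters(fixes)
print(encounters)
```

With these invented fixes, A and B are about 5.6 m apart at time 0 and A and C are about 2.2 m apart at time 60, so those are the only two encounters reported. Real GPS data would also need handling for position error and for fixes that are not perfectly synchronized.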

Current limitations
The data produced from biologging and biotelemetry have enormous potential, but a few limitations remain. Chief among these is the lack of context for the social interactions. What was actually happening in those 4.773 seconds when Individual A44J482DKK was near A44A990RWE? Future work aims to include sensors on the loggers that could measure heart rate or hormone levels, or miniature video cameras or accelerometers for a window into the animal's surroundings and behavioral state.

One last limitation may remain no matter how far we push the technology: we will always have to deal with missing data. In the wild, it is exceedingly difficult for every individual in a social network to be tagged; aside from the challenges of capturing every member of a group, many groups change membership over time. In species with fission-fusion groups (members entering and leaving, groups joining and breaking apart), the actual number of individuals needing to be tagged becomes much larger than the 20-bird flock or 100-fish school you originally had in mind. (Otherwise, your data will look like your animals are interacting with ghosts!)

Applications of reality mining data
Reality-mined data serve to vastly improve studies on social networks and Hidden Markov Models. A great example of reality-mined data in social networks comes from a study by Dr. Lucy Aplin and colleagues at Oxford University. In the study, sympatric great tits, blue tits, and marsh tits in the Edward Grey Institute woods were PIT-tagged and feeders were placed at random locations in the woods. These feeders recorded the identities of visiting birds, allowing for an analysis of whether social learning (e.g. learning where a feeder is) has any relation to social connectivity. In other words, do birds differ in their access to and use of information on what others are doing? The answer was yes: the three species of tits all varied in their social connectivity, and more connected individuals found food more quickly than less-connected birds. Because the feeders recorded the identities and times that birds visited, constructing associations between individuals was possible. The social network from the study is shown above.
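
As a rough illustration of how feeder records can become a social network, the Python sketch below links two birds whenever they visit the same feeder within a fixed time window, then counts each bird's associates. The 60-second window, the function names, and the visit records are all assumptions made up for this example; the study itself may well have inferred associations differently.

```python
from collections import Counter
from itertools import combinations

def covisit_network(visits, window_s=60):
    """Count co-visits for each pair of birds.

    visits is a list of (feeder_id, bird_id, time_s) tuples. Two visits count
    as a co-visit if they are by different birds at the same feeder and no
    more than window_s seconds apart (an assumed, simplistic rule).
    """
    by_feeder = {}
    for feeder, bird, t in visits:
        by_feeder.setdefault(feeder, []).append((t, bird))

    edges = Counter()
    for records in by_feeder.values():
        records.sort()  # chronological order, so t2 >= t1 below
        for (t1, b1), (t2, b2) in combinations(records, 2):
            if b1 != b2 and t2 - t1 <= window_s:
                edges[tuple(sorted((b1, b2)))] += 1
    return edges

def degree(edges):
    """Number of distinct associates per bird (a simple connectivity measure)."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg

# Hypothetical PIT-tag reads from two feeders.
visits = [
    ("F1", "tit01", 0), ("F1", "tit02", 20), ("F1", "tit03", 50),
    ("F1", "tit01", 400), ("F2", "tit02", 1000), ("F2", "tit04", 1030),
]
edges = covisit_network(visits)
print(edges)
print(degree(edges))
```

In this toy data set tit02 ends up with three associates, more than any other bird, which is the kind of connectivity difference the study related to how quickly birds found food.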

Hidden Markov Models (HMMs) can also benefit from reality-mined data. An HMM is a probabilistic model of a stochastic process that we can't directly observe; we can only observe its outcomes. An example would be using data on where a person is and at what time to try to figure out whether they are awake or sleeping. If a person is at the mall at 3:00am, for example, they're probably awake. If they're at the office at 10:30am, they're also (probably) awake. However, if they are in their room at 3:00am, they're probably asleep. We can never say for sure whether an individual is awake or asleep, but with sufficient data, we can give accurate estimates for whether they will be awake or asleep at a given time and place.
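
The awake/asleep example can be sketched as a tiny HMM in Python. The code below uses the Viterbi algorithm, a standard way to find the most likely sequence of hidden states given a sequence of observations. All of the probabilities are invented for illustration (in practice they would be estimated from data), and the observations are reduced to hourly locations only, ignoring the time of day mentioned above.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely hidden-state sequence for the observation sequence obs."""
    # best[t][s]: probability of the best path that ends in state s at step t
    best = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    back = [{}]  # back[t][s]: previous state on that best path
    for o in obs[1:]:
        prev = best[-1]
        col, ptr = {}, {}
        for s in states:
            p, arg = max((prev[r] * trans_p[r][s], r) for r in states)
            col[s] = p * emit_p[s][o]
            ptr[s] = arg
        best.append(col)
        back.append(ptr)

    # Trace back from the most probable final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for ptr in reversed(back[1:]):
        path.append(ptr[path[-1]])
    return path[::-1]

# All numbers below are assumptions chosen for the example.
states = ["awake", "asleep"]
start_p = {"awake": 0.5, "asleep": 0.5}
trans_p = {  # chance of moving between states from one hour to the next
    "awake": {"awake": 0.9, "asleep": 0.1},
    "asleep": {"awake": 0.2, "asleep": 0.8},
}
emit_p = {  # chance of being seen at each location, given the hidden state
    "awake": {"bedroom": 0.2, "office": 0.5, "mall": 0.3},
    "asleep": {"bedroom": 0.95, "office": 0.04, "mall": 0.01},
}

obs = ["bedroom", "bedroom", "bedroom", "office", "mall"]
path = viterbi(obs, states, start_p, trans_p, emit_p)
print(path)  # ['asleep', 'asleep', 'asleep', 'awake', 'awake']
```

For animal data, the hidden states might be behaviors such as foraging or migrating and the observations might be movement measurements from a logger, but the structure of the calculation stays the same.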

Conclusions
The sheer amount of data reality mining offers is exciting. It is becoming increasingly feasible to investigate how population-level structure emerges from highly dynamic, individual-level associations. This tidal wave of data is likely to change our understanding of the mechanisms of evolutionary adaptations of animal groups and social behavior in the wild.

The full text of the article is available here.

Photo credits: 
- Baboon troop - Ari Strandburg-Peshkin
