CV4Animals: Computer Vision for Animal Behavior Tracking and Modeling
In conjunction with Computer Vision and Pattern Recognition 2021
Afternoon, June 25
Many biological organisms have evolved to exhibit diverse behaviors, and understanding these behaviors is a fundamental goal of multiple disciplines including neuroscience, biology, animal husbandry, ecology, and animal conservation. These analyses require objective, repeatable, and scalable measurements of animal behavior that are not possible with existing methodologies, which rely on manual annotation by animal experts and specialists. Computer vision is having an impact across multiple disciplines by providing new tools for the detection, tracking, and analysis of animal behavior. This workshop brings together experts from these disciplines to stimulate the emerging field of computer-vision-based animal behavior understanding.
How to Attend?
You can attend our workshop by registering for the CVPR conference (http://cvpr2021.thecvf.com/node/47). The CVPR virtual platform will link our workshop landing pages including Zoom and Gatherly.
Check our CV4Animals How-to: https://docs.google.com/presentation/d/1qu1YdOUBQWcO166GetKcVSbbrGBN4vsUnkh1GM-oiCU/edit?usp=sharing
June 25, 2021 (Afternoon; Pacific Time)
12:00 - 12:10 Welcoming Remarks and Workshop Overview (Michael Black)
Session I (Chair: Natalia Neverova)
12:10 - 12:40 Invited Talk I: Mackenzie Mathis (EPFL)
Title: Robust and Efficient Animal Pose Estimation
Abstract: Neural networks are highly effective tools for pose estimation. I will discuss our work on robustness and leveraging transfer learning for animal pose estimation. In particular, I will discuss open challenges for robustness to out-of-domain data, and ongoing work to leverage optimized datasets for building more robust and body plan-agnostic sharable tools and models for the larger community of users across life sciences and computer vision.
Bio: Mackenzie Mathis is the Bertarelli Foundation Chair of Integrative Neuroscience at EPFL and a lead developer of DeepLabCut, an open-source software package for animal pose estimation. Her lab works on understanding adaptive mechanisms in intelligent systems. She received her PhD at Harvard in 2017.
12:40 - 12:45 Paper Presentation I: Spatio-Temporal Event Segmentation for Wildlife Extended Videos
12:45 - 12:55 Q&A
Session II (Chair: Helge Rhodin)
12:55 - 01:25 Invited Talk II: Silvia Zuffi (IMATI-CNR)
Title: 3D Shape Models of Animals for Mesh Reconstruction
Abstract: Reconstructing the 3D articulated shape of animals from images is an interesting problem with many applications including biology, conservation and entertainment.
Bio: Silvia Zuffi is a research scientist at the CNR Institute for Applied Mathematics and Information Technologies in Milan (Italy) and an affiliated researcher at the Max Planck Institute for Intelligent Systems in Tübingen (Germany). She graduated in Electronic Engineering from the University of Bologna (Italy). After graduation she worked for some time in industry, then joined the Italian National Research Council (CNR) in 2000, where her research interests were color imaging, specifically multispectral imaging and the readability of colored text on Web pages. In 2008 she won a Marie Curie fellowship and began a PhD in computer vision at Brown University under the supervision of Michael J. Black. She graduated with a thesis on shape models of the human body for distributed inference. In 2015 she was a postdoc in the Perceiving Systems department of the Max Planck Institute for Intelligent Systems in Tübingen, Germany. Now at IMATI-CNR, her research interest is modeling the shape of animals for applications in computer vision and graphics.
01:25 - 01:30 Paper Presentation II: Tracking Grow-Finish Pigs Across Large Pens Using Multiple Cameras
01:30 - 02:00 Invited Talk III: Andrew Fitzgibbon (Microsoft)
Title: Humans and other animals
Abstract: My interest in the field of animal motion was always rather selfish: it wasn’t an interest in animals per se, but in 3D objects that change their shape over time. To be interesting (by which I really meant “hard but not too hard”) those shape changes had to be somehow gentle and repetitive. Hence giraffes, clownfish, pigeons, dolphins. Ultimately that work led to applications in human body and hand tracking, but animal motion was the impetus. Recently, with my student Ben Biggs and other collaborators, we have been crystallizing the differences between human and animal reconstruction, looking at how to transfer learnings from one to the other. I will talk about these learnings, and about potential future directions that mean we are not in a world where “all animals are equal, but some are more equal than others”.
Bio: Andrew Fitzgibbon has been closely involved in the delivery of three groundbreaking computer vision systems over two decades. In 2000, he was computer vision lead on the Emmy-award-winning 3D camera tracker “Boujou”; in 2009 he introduced large-scale synthetic training data to Kinect for Xbox 360, and in 2019 was science lead on the team that shipped fully articulated hand tracking on HoloLens 2. His passion is bringing the molten gold of mathematics to the crucible of real-world engineering. He has numerous research awards, including ten “best paper” prizes at leading conferences, and is a Fellow of the UK’s Royal Academy of Engineering.
02:00 - 02:10 Q&A
02:10 - 03:00 Coffee Break and Poster Session (Gatherly)
Session III (Chair: Shohei Nobuhara)
03:00 - 03:30 Invited Talk IV: Kristin Branson (Howard Hughes Medical Institute)
Title: Can Self-supervised Machine Learning Help Us Discover Principles of Animal Behavior?
Abstract: In recent years, supervised machine learning approaches have made huge strides in solving important computer vision problems related to animal behavior analysis, including video-based tracking and action recognition. These approaches involve training classifiers to replicate human labels, with the goal of automating, at high-throughput, tasks humans perform well at. A new frontier is to try to use self-supervised machine learning to predict relationships between different types of data better than humans. In our work, we have been training classifiers that can predict animals' future behaviors from their pasts. Our goal is to discover principles of animal behavior by interpreting and interrogating these classifiers. In this talk, I will discuss our current progress toward these goals, and the open questions we are facing.
Bio: Kristin Branson is a Senior Group Leader and the Head of Computation and Theory at the Howard Hughes Medical Institute's Janelia Research Campus. Her research focuses on developing cutting-edge machine vision and learning methods that can advance biological insight. She has developed influential software for video-based tracking and categorization of animal behavior that has been adopted by hundreds of labs. Her lab is using these tools to understand the structure, organization, and neural implementation of behavior. As the Head of Computation and Theory, she is responsible for leading all computational research at Janelia. Branson received her undergraduate degree from Harvard, received her PhD from the University of California, San Diego with Serge Belongie and Sanjoy Dasgupta, and performed her postdoctoral studies at Caltech with Pietro Perona and Michael Dickinson. She joined Janelia as a Group Leader in 2010.
03:30 - 03:35 Paper Presentation III: hSMAL: Detailed Horse Shape and Pose Reconstruction for Motion Pattern Recognition
03:35 - 03:40 Paper Presentation IV: Analysis of Visual Attention of a Harris’ Hawk in Flight using a Synthetic Reconstruction of its Visual Field
03:40 - 03:50 Q&A
Session IV (Chair: Angjoo Kanazawa)
03:50 - 03:55 Paper Presentation V: Towards Self-Supervision for Video Identification of Individual Holstein-Friesian Cattle: The Cows2021 Dataset
03:55 - 04:25 Invited Talk V: Marc Badger (UPenn)
Title: Birds of a Feather Flock Together in 3D Shape Space
Abstract: Automated understanding of animal activity is poised to transform many fields in biology. Of particular interest to ecology, biomechanics, behavior, and neuroscience are studies of animals moving through naturalistic 3D environments and interacting in complex social groups. In both situations, we aim to capture movement dynamics and social cues encoded in pose trajectories and shape changes (such as puffing and hair raising), while gracefully handling frequent occlusions that occur when animals interact with each other and objects in the environment. Working with socially gregarious cowbirds as a model species, we developed a method for single-view pose and shape estimation using an articulated 3D mesh model, allowing accurate reconstruction of a flock of birds interacting in a large aviary using relatively few cameras. In our most recent work, we disentangle pose and shape to capture shape spaces of novel bird and dog species directly from image collections, bypassing the need for multi-view setups or 3D scans of many individuals. We then formed a multi-species shape space, which reflects the phylogenetic relationships among birds better than learned perceptual features. I will also discuss the challenges involved with using such parametric models to study animal morphometrics and the challenges related to multi-view multi-animal tracking in settings where only limited annotations are available.
Bio: Marc Badger is a postdoctoral researcher in Computer Science at the University of Pennsylvania advised by Marc Schmidt and Kostas Daniilidis. His research is at the intersection of machine perception and animal locomotion, focusing on animal pose and shape estimation, tracking, and action recognition. He loves creating high-throughput robotic systems to examine the relationships between biomechanics, learning, and animal behavior. Marc received his PhD in Integrative Biology at UC Berkeley, where he studied hummingbird flight biomechanics and was advised by Robert Dudley, and received his Bachelor’s in Physics from Harvey Mudd College. He has also spent time with Stacey Combes at UC Davis working on maneuvering flight and obstacle avoidance in bees. His work has been funded by a Graduate Research Fellowship from the National Science Foundation and a CiBER IGERT Traineeship, and has been featured in National Geographic Magazine.
04:25 - 04:35 Q&A
04:35 - 04:40 Closing Remarks (Hyun Soo Park)
04:40 - Additional Poster Session and Mingling (Gatherly)