Have you ever wondered what it would be like if your dog could talk? Or what they were trying to tell you, or what they wanted? Researchers from the University of Michigan may have answers, thanks to artificial intelligence.

Understanding Animal Communication

Researchers at the University of Michigan are exploring what AI can reveal about animal communication. They are developing tools that identify playfulness or aggression in a dog's bark, and the same AI model extracts information from animal vocalizations, such as the animal's age, breed, and sex. According to the study, AI models originally built for human speech can serve as a starting point for training new systems aimed at animal communication.

Rada Mihalcea is a professor at U-M and the director of the University’s AI Laboratory. “By using speech processing models initially trained on human speech, our research opens a new window into how we can leverage what we built so far in speech processing to start understanding the nuances of dog barks,” she said. “There is so much we don’t yet know about the animals that share this world with us. Advances in AI can be used to revolutionize our understanding of animal communication, and our findings suggest that we may not have to start from scratch.”

University of Michigan researchers developed an AI tool that differentiates between playful and aggressive dog barks; Photo: Marcin Szczepanski/Michigan Engineering

A lack of publicly available data is a major obstacle to developing AI models that analyze animal vocalizations. While there are abundant resources for recording human speech, collecting comparable data from animals is far more difficult.

“Animal vocalizations are logistically much harder to solicit and record,” said Artem Abzaliev, lead author and U-M doctoral student in computer science and engineering. “They must be passively recorded in the wild or, in the case of domestic pets, with the permission of owners.”

Dog Vocalizations

Because usable data is scarce, techniques for analyzing dog vocalizations are difficult to develop, and the ones that exist are limited by their small training sets. The researchers overcame these challenges by repurposing existing models trained on human speech. This approach let them tap into the backbone of voice-enabled technology, including voice-to-text and language translation. These models pick out nuances in human speech, such as tone, pitch, and accent, and convert that information into a format computers can use to identify which words were said, recognize who is speaking, and much more.
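To make the idea of acoustic features concrete, here is a minimal sketch of two classic measurements a speech pipeline might start from: a rough pitch estimate from the zero-crossing rate and loudness from RMS energy. This is an illustrative toy, not the study's code; the sample rate, synthetic tone, and function names are assumptions for the example.

```python
import math

SAMPLE_RATE = 16000  # Hz; a common rate for speech models (assumed here)

def estimate_pitch(samples, sample_rate):
    """Rough pitch estimate from the zero-crossing rate.

    A pure tone crosses zero twice per cycle, so
    frequency is roughly crossings / (2 * duration).
    """
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    duration = len(samples) / sample_rate
    return crossings / (2 * duration)

def rms_energy(samples):
    """Root-mean-square energy, a simple proxy for loudness."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

# Synthesize one second of a 440 Hz tone as a stand-in for a recording.
tone = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
        for n in range(SAMPLE_RATE)]

print(round(estimate_pitch(tone, SAMPLE_RATE)))  # ~440
print(round(rms_energy(tone), 2))                # ~0.71
```

Real speech models learn far richer representations than these two numbers, but the principle is the same: raw audio is converted into numeric features a computer can compare and classify.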

“These models are able to learn and encode the incredibly complex patterns of human language and speech,” Abzaliev said. “We wanted to see if we could leverage this ability to discern and interpret dog barks.”

Analyzing a Dog’s Bark

Researchers recorded the barks of 74 dogs of various breeds, ages, and sexes to create a dataset. Abzaliev used the recordings to fine-tune a machine-learning model, a type of algorithm that identifies patterns in large datasets. The team chose Wav2Vec2, a speech representation model originally trained on human speech, and used it to generate and interpret representations of the acoustic data collected from the dogs. They found that Wav2Vec2 succeeded at four classification tasks and outperformed other models trained specifically on dog bark data.

“This is the first time that techniques optimized for human speech have been built upon to help with the decoding of animal communication,” Mihalcea said. “Our results show that the sounds and patterns derived from human speech can serve as a foundation for analyzing and understanding the acoustic patterns of other sounds, such as animal vocalizations.”