Fresh off the last experiment, I’ve decided to go back to an older one and dig a little deeper.
A while ago I was working on an experiment trying to figure out what attention layers are actually doing in transformers.
The traditional introduction to transformers goes something like this:
- You turn your text into tokens.
- You create embeddings for each of those tokens.
- Some additions and projections later, that token encoding goes into an attention layer, at a particular spot in the sequence based on the order of the text. Let’s assume our token is at position 3.
- The attention layer compares that token encoding to all the others in the sequence (query and keys, yes yes), and uses the relative weights of their similarities as weights to apply to the original embeddings in the sequence (projected by a value matrix). (This won’t make sense if you don’t already know how transformers work, for which I apologize.)
- The output of that attention layer at position 3 is the original token, but with information from closely-related tokens mashed into it. So it’s a wider, more conceptually complex representation of the original token at position 3.
- Do this 12 times with some standard fully-connected neural networks in-between, and you end up with a new sequence where each encoding is a representation of that original token, but with “meaning” infused based on the overall context of every other token in the sequence.
- Use those fancy high-class output encodings to predict the next token, or classify your sequence, or whatever.

But that “output of the attention layer is another form of the original token” has always been interesting to me. What does an encoding like that look like? Can you do something else with it? Does it relate back to the original token in an interesting way?
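To make the query/key/value step concrete, here’s a minimal numpy sketch of single-head attention for one position. The shapes, random projection matrices, and variable names are all illustrative stand-ins, not anyone’s real model weights:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
seq_len, d_model = 5, 8                     # toy sizes: 5 tokens, 8-dim encodings
X = rng.normal(size=(seq_len, d_model))     # token encodings entering the layer

# Learned projections in a real model; random stand-ins here.
W_q = rng.normal(size=(d_model, d_model))
W_k = rng.normal(size=(d_model, d_model))
W_v = rng.normal(size=(d_model, d_model))

Q, K, V = X @ W_q, X @ W_k, X @ W_v

pos = 3                                     # the position from the walkthrough above
scores = K @ Q[pos] / np.sqrt(d_model)      # similarity of token 3's query to every key
weights = softmax(scores)                   # relative weights across the sequence
output = weights @ V                        # weighted mix of value-projected encodings

# output has the same shape as the original encoding at position 3:
# the "original token with related tokens mashed into it."
```

The point the sketch makes is that `output` lives in the same space as `X[pos]`, which is exactly what invites the questions below about what such an encoding looks like.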
...