As the main contributor of SoundFingerprinting, I regularly review issues and bug reports that I receive on GitHub.
A large number of users submit issues without steps to reproduce or even basic details. It's even funnier when an issue is a plain rant: twice I received one-liners describing what the developer thought was a bug, with their manager in CC.
In Sound fingerprinting, why we are making so many database connection in sound fingerprinting? Is it really required? Since it will create too many connection issue.
SoundFingerprinting is Shazam for developers. Since the framework is very domain-specific, I try to understand the details of every submitted issue as some of them may lead to bigger discoveries.
One morning I received the following bug submission:
I am asking you for help, because I am stuck in code :(
internal class FastFingerprintDescriptor : FingerprintDescriptor
the problem is in method:
public int Find(int kth, float[] list, int[] indexes, int lo, int hi)
results of this method are different because it has rand.Next(lo, hi + 1), but still the end results of ExtractTopWavelets are the same for most frames. I am wondering how can I fix the logic, so when we will extract fingerprints for the same song we will get 100% same HashTables all the time
It wasn’t a particularly detailed bug report. The user didn’t specify any steps to reproduce and didn’t attach the files he was fingerprinting. I had to step back and try to guess what he was trying to achieve.
After rereading it a couple of times, I suspected that the user was fingerprinting an audio file and was getting different fingerprints during multiple runs on the same file.
My first reaction: It can’t happen! I’ve fingerprinted thousands of audio files and was sure that audio fingerprints for the same file always stay the same.
As unlikely as the bug sounded, I decided not to ignore it and tried to reproduce my assumption with a simple Monte Carlo simulation. The approach was straightforward: generate a random array, make two identical copies of it, run the top-wavelet selection on both, assert that the results match, and repeat tens of thousands of times. I wrote a simple test to do it:
[Test]
public void ShouldGenerateSameSignatureDuringMultipleRuns()
{
    const int count = 4096;
    const int topWavelets = 200;
    var random = new Random();
    for (int run = 0; run < 50000; ++run)
    {
        // two identical copies of the same random array
        (float[] a, float[] b) = GetTwoRandomCopies(count, random);
        ushort[] indexes1 = Range(0, count);
        ushort[] indexes2 = Range(0, count);

        // partition both copies so that the top topWavelets elements (by magnitude) come first
        int akth = algorithm.Find(topWavelets - 1, a, indexes1, 0, a.Length - 1);
        int bkth = algorithm.Find(topWavelets - 1, b, indexes2, 0, b.Length - 1);

        Assert.AreEqual(akth, bkth);
        AssertFingerprintsAreSame(topWavelets, a, b);
    }
}
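The test relies on a few helpers that are not shown here. The sketches below are my own reconstruction: only the names and signatures come from the test, and the assertion format merely mimics the failure message, so treat the bodies as assumptions rather than the project's actual code.

// requires: using System; using System.Linq; using NUnit.Framework;

private static (float[], float[]) GetTwoRandomCopies(int count, Random random)
{
    // one random array, returned as two identical copies
    var a = new float[count];
    var b = new float[count];
    for (int i = 0; i < count; ++i)
    {
        a[i] = b[i] = (float)(random.NextDouble() * 2 - 1);
    }
    return (a, b);
}

private static ushort[] Range(int from, int count)
{
    var indexes = new ushort[count];
    for (int i = 0; i < count; ++i)
    {
        indexes[i] = (ushort)(from + i);
    }
    return indexes;
}

private static void AssertFingerprintsAreSame(int topWavelets, float[] a, float[] b)
{
    // compare the two top-N slices as sets: order by magnitude (then value)
    // and check element by element
    var topA = a.Take(topWavelets).OrderByDescending(v => Math.Abs(v)).ThenBy(v => v).ToArray();
    var topB = b.Take(topWavelets).OrderByDescending(v => Math.Abs(v)).ThenBy(v => v).ToArray();
    for (int i = 0; i < topWavelets; ++i)
    {
        Assert.AreEqual(topA[i], topB[i], $"Not matched: {topA[i]},{topB[i]}");
    }
}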
The test passed at 1,000 iterations. Same for 10,000 and 25,000. Then, at 50,000 iterations, boom: the test went red.

Not matched: 0.9458628,-0.9458628 // same value, different sign

To my horror, the simulation reproduced the issue. The error message indicated that the audio fingerprints were different: the same random array generated two fingerprints that differed in a single value, same magnitude, opposite sign.
To explain the reasons why this happened, I will dive a bit into the algorithm that produced these results.
The call to

int akth = algorithm.Find(topWavelets - 1, a, indexes1, 0, a.Length - 1);

is made to find the top N elements of the array by their absolute values. The simplest way to find the top N elements is to sort the array and then iterate over the first N of them. This approach is highly inefficient when you need only a small N out of a large array (e.g., the 200 largest elements out of 8192), since you have to sort the whole array.
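As an illustration (not the library's code), the naive version could look like this:

// naive top-N by magnitude: sorts the whole array even though only N elements are needed
using System;
using System.Linq;

static class NaiveTopN
{
    public static float[] TopByMagnitude(float[] values, int n) =>
        values.OrderByDescending(v => Math.Abs(v)).Take(n).ToArray();
}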
The approach I took was a modification of the Quickselect algorithm. It partially sorts the array until the Nth element lands in its pivot position. As a consequence, the elements on the left-hand side of the pivot are larger (by absolute value) than the elements on the right-hand side. This is exactly what’s needed: the top N elements, in no particular order.
public int Find(int kth, float[] list, ushort[] indexes, int lo, int hi)
{
    var rand = new Random();
    while (lo != hi)
    {
        int mid = lo + ((hi - lo) / 2);   // midpoint, computed but not used as the pivot
        int pi = rand.Next(lo, hi + 1);   // problematic line: the pivot is chosen at random
        pi = Partition(list, indexes, pi, lo, hi);
        if (pi == kth)
            return pi;
        if (pi > kth)
            hi = pi - 1;
        else
            lo = pi + 1;
    }
    return lo;
}
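Partition itself is not shown above. For readers who want to follow the code, here is a minimal sketch of what it is assumed to do: a Lomuto-style partition that orders by absolute value, descending, and permutes the indexes array in lockstep. SoundFingerprinting's real implementation may differ.

private static int Partition(float[] list, ushort[] indexes, int pivotIndex, int lo, int hi)
{
    // assumed behavior: after the call, elements with a larger magnitude than the
    // pivot sit to the left of the returned position, the rest to the right
    float pivotMagnitude = Math.Abs(list[pivotIndex]);
    Swap(list, indexes, pivotIndex, hi);      // move the pivot out of the way
    int store = lo;
    for (int i = lo; i < hi; ++i)
    {
        if (Math.Abs(list[i]) > pivotMagnitude)
        {
            Swap(list, indexes, i, store);
            store++;
        }
    }
    Swap(list, indexes, store, hi);           // place the pivot at its final position
    return store;
}

private static void Swap(float[] list, ushort[] indexes, int i, int j)
{
    (list[i], list[j]) = (list[j], list[i]);
    (indexes[i], indexes[j]) = (indexes[j], indexes[i]);
}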
After reviewing the code, it finally dawned on me. The elements differed for the same input on two separate runs because of three combined conditions:
1. Find picks its pivot at random, so the order in which elements get partitioned changes from run to run.
2. Partitioning compares elements by their absolute values, so two values with equal magnitude but opposite signs are treated as equal.
3. Such a pair of values happened to sit exactly at the top-N boundary, at positions 200 and 201 with topWavelets set to 200:

[.... 200th, 201st ...] // positions at the top-N boundary
[....  13  ,  -13  ...] // equal magnitude, opposite signs

Depending on which of the two the random pivot pushes to the left, one run keeps 13 in the top 200, and the other keeps -13.
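To see the ambiguity in isolation, here is a small, self-contained illustration (plain LINQ, not SoundFingerprinting's code); the numbers mirror the example above:

using System;
using System.Linq;

class BoundaryTieDemo
{
    static void Main()
    {
        const int topN = 200;

        // 199 values that clearly belong to the top N, the tie pair at the boundary,
        // and smaller filler values for the rest of a 4096-element frame
        float[] values = Enumerable.Range(0, 199).Select(i => 100f + i)
            .Concat(new[] { 13f, -13f })
            .Concat(Enumerable.Repeat(1f, 4096 - 201))
            .ToArray();

        // two equally valid ways to break the tie at the boundary
        var keepsPlus13 = values.OrderByDescending(v => Math.Abs(v)).ThenByDescending(v => v).Take(topN).ToArray();
        var keepsMinus13 = values.OrderByDescending(v => Math.Abs(v)).ThenBy(v => v).Take(topN).ToArray();

        Console.WriteLine(keepsPlus13.Contains(13f) && keepsMinus13.Contains(-13f)
            ? "Both are valid top-200 sets, yet they differ in the sign of one value."
            : "Unexpected");
    }
}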
These reasons combined produced the bug. Since the preconditions are very unlikely, I would probably never have spotted it on my own. The takeaway: some issues, even poorly described ones, are worth the effort of further analysis.
After rereading the bug report, I was actually surprised by how close the reporter was to the solution. He had put quite a bit of effort into understanding the root cause!
The solution itself was simple. Instead of choosing a random pivot, I would always pivot around the middle element: int pi = mid;
This doesn’t make the partitioning stable (audio fingerprinting doesn’t require stability), but it does produce consistent results for the same input when executed multiple times.
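Applied to the Find method shown earlier, the fix boils down to the following sketch (the Random instance disappears along with the random pivot):

public int Find(int kth, float[] list, ushort[] indexes, int lo, int hi)
{
    while (lo != hi)
    {
        int mid = lo + ((hi - lo) / 2);
        int pi = Partition(list, indexes, mid, lo, hi); // always pivot around the middle element
        if (pi == kth)
            return pi;
        if (pi > kth)
            hi = pi - 1;
        else
            lo = pi + 1;
    }
    return lo;
}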
Monte Carlo simulations used in tests can really help you spot highly unlikely events. Since I received this bug report, I have started using them more frequently.
Even if a bug report is not very well-formed, it may still be worth the effort to understand its details; it can lead to surprising discoveries in your code.