Fingerprinting MP3 files with libfooid

I’ve been working on a project that needs technology to fingerprint MP3 files in order to identify different recordings of the same track. There are a number of commercial solutions for this, but there are also one or two open-source libraries. Unfortunately none of the open-source solutions that we could find are actively maintained.

One promising library is libfooid. It’s not maintained (I can’t even figure out how to contact the person who recovered the project from Google’s source code search and took over after the original maintainer dropped out), but it’s simple, published under a friendly license, and apparently used to work well enough to be the basis of a fully operational, now defunct, web service for identifying tracks.

The libfooid code downloaded from Google Code is easy enough to compile on Linux, though you have to write a Makefile for it. The programming interface is simple, but it requires the caller to supply the audio as individual samples, which in turn means decoding the MP3 file first.

Decoding the MP3 file

The simplest way to do this, which will work for testing purposes, is just to decode the MP3 to a WAV at the command line. I had to install mpg123 on my Ubuntu box to support this, then it was simple:

$ mpg123 -w test.wav test.mp3

Reading the WAV file

WAV is a pretty simple format, but it doesn’t make sense to try and parse it ourselves. The libsndfile project appears to do just what we need. Downloading the tarball and installing worked perfectly well for me:

$ tar -zxvf libsndfile-1.0.20.tar.gz
$ cd libsndfile-1.0.20
$ ./configure
$ make
$ make install
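
To give a feel for why hand-parsing is not worth it, here is a minimal sketch that reads only the simplest possible case: a canonical 44-byte PCM header held in memory. The names wav_header and parse_canonical_wav are hypothetical, and the function deliberately rejects anything else. Real files may carry extra chunks before the audio data or use other sample encodings, which is exactly the work libsndfile takes off our hands.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Fields of interest from a canonical 44-byte PCM WAV header. */
typedef struct {
    uint16_t channels;
    uint32_t sample_rate;
    uint16_t bits_per_sample;
    uint32_t data_bytes;
} wav_header;

/* WAV stores integers little-endian; assemble them byte by byte so the
   result does not depend on the host's byte order. */
static uint16_t read_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: returns 0 on success, -1 if the buffer is not a
   canonical PCM header (too short, wrong tags, extra chunks, non-PCM). */
static int parse_canonical_wav(const unsigned char *buf, size_t len,
                               wav_header *out)
{
    assert(buf != NULL && out != NULL);

    if (len < 44)
        return -1;
    if (memcmp(buf, "RIFF", 4) != 0 || memcmp(buf + 8, "WAVE", 4) != 0)
        return -1;
    if (memcmp(buf + 12, "fmt ", 4) != 0 || read_le32(buf + 16) != 16)
        return -1;
    if (read_le16(buf + 20) != 1)          /* 1 = uncompressed PCM */
        return -1;
    if (memcmp(buf + 36, "data", 4) != 0)  /* no extra chunks allowed */
        return -1;

    out->channels        = read_le16(buf + 22);
    out->sample_rate     = read_le32(buf + 24);
    out->bits_per_sample = read_le16(buf + 34);
    out->data_bytes      = read_le32(buf + 40);
    return 0;
}
```

Even this toy version needs byte-order handling and several sanity checks, and it still gives up on any file that is not laid out in the simplest way.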

It’s then easy enough to link this library in:

$ gcc main.o -L. -lfooid -lsndfile -o test

Some quick test code shows that this works to some extent:

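#include <stdio.h>
#include <string.h>

#include "fooid.h"
#include "sndfile.h"

void print_file_info(const SF_INFO * file_info)
{
    printf("--- File Info ---\n");
    printf("Sample rate:\t%d\n", file_info->samplerate);
    printf("Channels: \t%d\n", file_info->channels);
    printf("\n");
}

int main(int argc, char ** argv)
{
    /* libsndfile expects the format field to be zero when opening for reading */
    SF_INFO sfinfo;
    memset(&sfinfo, 0, sizeof(sfinfo));

    SNDFILE * file = sf_open("test.wav", SFM_READ, &sfinfo);
    if (file == NULL) {
        fprintf(stderr, "Could not open test.wav: %s\n", sf_strerror(NULL));
        return 1;
    }

    print_file_info(&sfinfo);
    sf_close(file);
    return 0;
}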

With my test data, this gives the expected output:

timm@howler:~/dev/libfooid$ ./test
--- File Info ---
Sample rate:	44100
Channels: 	2

Compiling libfooid

The currently available libfooid distribution comes with some Windows build files, but no Makefiles. I wrote a very quick and dirty Makefile, but ran into one or two issues.

libfooid attempts to use a customised round() function written in assembler, presumably for speed, and there are some preprocessor macros to switch between this and a pure C version that is preferred on 64-bit Windows. I had to tweak the preprocessor directives to stop the custom version being compiled in on Linux, since it clashes with the implementation in the standard library.

Actually, the issue with round() wasn’t quite that simple. If you leave the custom version in, you get the message “conflicting types for built-in function ‘round’”, which implies that the function is already defined by the library. However, if you take the declaration out, you get “incompatible implicit declaration of built-in function ‘round’”. This seems highly confusing, but there’s a simple enough explanation: the function exists as a built-in, but its declaration is only visible under the C99 standard, which isn’t enabled by default. Even without a visible declaration, the built-in can still conflict with the customised declaration (and a full definition would cause a linker error). The solution is to compile with the -std=c99 compiler flag.
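
Another way to sidestep the clash, sketched here rather than taken from libfooid, is to give a pure C fallback a name that cannot collide with the built-in. The name fp_round_int and the round-half-away-from-zero behaviour are assumptions for illustration and may not match what the library's own version does.

```c
/* Hypothetical replacement for a custom round(): the distinct name means it
   can never conflict with the compiler's built-in, whatever -std is in use.
   Rounds to the nearest integer, with halves going away from zero.
   Only valid for values that fit in an int. */
static int fp_round_int(double x)
{
    return (int)(x >= 0.0 ? x + 0.5 : x - 0.5);
}
```

With this approach every call site in the library would need renaming too, which is more invasive than adding a compiler flag.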

The library also seems to assume that a min() function is available, but standard C does not provide one. I inserted a trivial definition to fill the gap.
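
A definition along these lines is enough. This is a sketch assuming integer arguments; the name min_int is hypothetical, chosen so that it cannot collide with a min macro that some platform headers define.

```c
/* Hypothetical stand-in for the missing min(): returns the smaller of two
   ints. A function rather than a macro, so each argument is evaluated
   exactly once. */
static int min_int(int a, int b)
{
    return a < b ? a : b;
}
```

The classic macro form, ((a) < (b) ? (a) : (b)), works for any arithmetic type but evaluates one of its arguments twice, which matters if an argument has side effects.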

You can get hold of my tweaked version of libfooid on GitHub. I’m not very good with recursive make, so you have to run make first in the libresample directory, then in the root libfooid directory.

Testing the fingerprinting

The code on GitHub includes a very simple command-line wrapper that takes the name of a WAV file as its argument and runs it through the fingerprinter. You can use it like this:

timm@howler:~/dev/libfooid$ ./test test.wav
--- File Info ---
Sample rate:    44100
Channels:     2

00 00 e9 c5 00 00 07 05 e7 0f 41 40 00 ec 80 02 25 d4 c4 08 40 e0 95 14 00 cc d2 21 40 dc 80 01
44 fc a1 10 00 e0 40 41 70 bc 60 41 fe f4 00 50 84 fc 55 04 80 c8 40 00 02 fc 80 00 3d ec 40 40
92 fc 00 44 04 cc 86 00 00 c0 40 40 04 cc c4 40 24 f0 10 00 00 cc 40 55 00 c0 40 14 00 e0 84 00
0f f4 08 c3 20 cc 40 c7 7c b4 42 c3 30 b9 48 83 81 f8 a1 c3 8b 3f 81 c3 30 fc 30 e3 04 3c 82 c3
80 fc 92 d3 74 f1 02 f7 30 a0 4c 33 b0 c8 fc 30 34 30 94 30 30 f8 8d 30 fd 38 69 30 f3 bd 80 30
30 f0 80 f0 71 e3 1f 21 33 a7 51 c6 08 cc 90 d3 38 c0 40 d3 80 ec 11 d3 d0 f5 40 c3 40 fc 83 d3
40 fc ad f3 30 fd a6 b7 f8 f7 00 30 31 c8 6c 70 74 f0 02 31 f5 f9 1c 30 33 bc 0c 31 b3 f0 0b 70
30 b0 00 f3 3c bc 33 f7 77 c0 78 f3 71 fe 10 f7 b0 fc 01 f7 3c d8 9c e1 b4 e1 01 f3 04 fc e8 f7
70 7c 68 f7 b8 f4 0f f7 73 fe 05 f3 bf 30 9a f3 3f fc 0b f7 f0 fd 0f ff bf 38 df fb 7e 38 13 f3
fc f4 37 b7 b1 fc 20 fb 3a f0 0f f3 f0 30 7a c7 50 d0 50 cb f0 44 58 c3 48 32 d4 d7 28 bf 68 9f
fc 24 40 cf f0 3c 28 c8 82 81 8d cf 64 00 32 ff 70 3c 03 ff 34 d1 00 ff 31 30 40 ff 90 d2 b0 ef
49 c0 b0 ff b6 78 ff ff ff ff ff ff ff ff ff f3 bf ff ff ff 39 ef ff be ef ef be fb ef bc f3 cf
3c ff ff 27 ef ef be fb ef 8b 2c b2 d3 4d 34 d7 59 65 93 4d 32 cb 2c b4 d2 49 35 96 55 32 cb 2c
b2 cf 3c f4 d3 49 35 80
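
The dump above is just the raw fingerprint bytes printed as hex, 32 to a row. A minimal sketch of that kind of formatting is below. The names format_hex_row and print_hex are hypothetical, and the wrapper's actual printing code may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Format len bytes as space-separated two-digit hex into dst.
   Returns the number of characters written (excluding the NUL),
   or -1 if dst is too small. */
static int format_hex_row(char *dst, size_t dst_size,
                          const unsigned char *buf, size_t len)
{
    size_t pos = 0;
    size_t i;

    assert(dst != NULL && buf != NULL);

    for (i = 0; i < len; i++) {
        /* two hex digits plus either a separator or the final NUL */
        if (pos + 3 > dst_size)
            return -1;
        sprintf(dst + pos, "%02x", buf[i]);
        pos += 2;
        if (i + 1 < len)
            dst[pos++] = ' ';
    }
    if (pos >= dst_size)
        return -1;
    dst[pos] = '\0';
    return (int)pos;
}

/* Print a whole buffer, 32 bytes per row. */
static void print_hex(FILE *out, const unsigned char *buf, size_t len)
{
    char row[32 * 3];   /* 32 * 2 digits + 31 spaces + NUL */
    size_t off;

    for (off = 0; off < len; off += 32) {
        size_t n = (len - off < 32) ? len - off : 32;
        if (format_hex_row(row, sizeof row, buf + off, n) >= 0)
            fprintf(out, "%s\n", row);
    }
}
```

Calling print_hex(stdout, buffer, size) on a buffer the size of the fingerprint above would give 13 full rows and one short row of 8 bytes.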

One thought on “Fingerprinting MP3 files with libfooid”

  1. Daniel

    Hi, I would like to experiment a little bit with this fooid library, but I’m having one problem when I’m trying to compile main.o :(.
    When I run:
    gcc main.o -L. -lfooid -lsndfile -o test
    I get this:
    /usr/bin/ld: skipping incompatible ./libfooid.a when searching for -lfooid
    /usr/bin/ld: cannot find -lfooid
    :(. I’m quite new to Linux and just can’t figure out why the library libfooid.a is incompatible for me…
    Do you have any idea? BTW I’m running on 64bit Fedora 12.
