32

The title says it all. How can I detect duplicates in my media library?

Isaiah
  • 60,750
Ingo
  • 6,348

7 Answers

27

dupeGuru Music Edition is what you want. Set the scan type to "Audio Contents" in Preferences. Note that the program is fairware, so please contribute if you can.


I suggest you couple this with MusicBrainz Picard, which can tag your music files automatically.


Li Lo
  • 16,382
10

There is a Rhythmbox plugin that was made some time ago for this. I've used it recently, but it still leaves a little to be desired. There is a "PPA" for it, but no built packages yet, just the Bazaar branch. The install instructions go something like this:

wget http://scrawl.bplaced.net/duplicate-source.tar.gz -O tmp.tar.gz && mkdir -vp ~/.gnome2/rhythmbox/plugins/duplicate-source/ && tar -xf tmp.tar.gz -C ~/.gnome2/rhythmbox/plugins && rm -v tmp.tar.gz

If you'd rather use the source code from the Bazaar branch, do the following instead:

mkdir -vp ~/.gnome2/rhythmbox/plugins && cd ~/.gnome2/rhythmbox/plugins && bzr branch lp:rb-duplicate-source duplicate-source

Once it's installed, restart Rhythmbox and you should now have a Duplicates Finder in the plugin list.


After activating it, additional configuration options become available.


After the plugin is enabled, and once it finds duplicates, it adds an additional entry to your library list.

A few things I've found "odd": I've tried this on a media library with over 120,000 songs (over 1,000 duplicates) and on a library with about 1,000 songs and maybe 30 duplicates. On the former it took a VERY long time and crashed Rhythmbox several times during the search. I eventually went with the automatic "Remove from Library" option to avoid having to rebuild the list. On smaller libraries everything works great, though.

When a duplicate is found, and if you have the default options selected, the lower-quality version of the song is added to the list. So it's safe to select all songs on the Duplicates list and "Remove" them (either delete from disk or remove from library).

Marco Ceppi
  • 48,827
6

You can use fdupes for that:

$ fdupes -r ~/Music

which gives you a list of all duplicate files.

You can easily install it with:

sudo apt-get install fdupes
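
One thing worth knowing: fdupes compares files byte for byte, so it will catch exact copies but not the same song encoded twice at different bitrates. Here is a small sketch of how I'd wrap it so the report also shows file sizes. The function name `music_dupes` is just something I made up for this example, and it only lists files, it never deletes anything.

```shell
# music_dupes DIR: report byte-identical files under DIR (hypothetical helper).
# Nothing is deleted; this only prints the duplicate sets.
music_dupes() {
    # -r: recurse into subdirectories
    # -S: show the size of the files in each duplicate set
    fdupes -r -S "$1"
}
```

Call it as `music_dupes ~/Music`. If you only want the totals, `fdupes -r -m ~/Music` prints a one-line summary instead of the full list.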
Johann
  • 77
4

It might be a dozen years late, but I just wrote a command-line program that tries to detect similar audio files by comparing acoustic fingerprints: https://codeberg.org/derat/soundalike

It uses the fpcalc utility from Chromaprint to generate the fingerprints, and then builds a lookup table to find possible matches before comparing fingerprints more rigorously.

derat
  • 41
3

I ran into a similar issue when I had a bunch of duplicate image files. In my case, I just used md5sum on the files and sorted the results:

find "$rootdir" -name "*.jpg" -exec md5sum {} + | sort

Files with the same contents generated the same hash, so duplicates could be found easily. I manually deleted the dupes from there. I could have extended the script to delete all but the first occurrence, but I'm always paranoid about doing that in an ad-hoc script.

Note that this only works for duplicate files with identical contents.
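
To make the hash approach a bit easier to read, here is a rough sketch that prints only the files that actually have a twin, plus a second helper that lists every copy except the first one in each group. The function names `list_dupes` and `extra_copies` are made up for this example, and the sketch assumes GNU coreutils (`uniq -w` and `--all-repeated` are GNU options). Neither function deletes anything; they only print.

```shell
# list_dupes DIR PATTERN: print groups of files whose MD5 hashes match.
# Hypothetical helper; assumes GNU uniq.
list_dupes() {
    # md5sum prints "<32-char hash>  <filename>", so comparing only the
    # first 32 characters groups lines by hash.
    find "$1" -type f -name "$2" -exec md5sum {} + \
        | sort \
        | uniq -w32 --all-repeated=separate
}

# extra_copies DIR PATTERN: print every file except the first one
# (in sort order) of each group of identical files.
extra_copies() {
    # The filename starts at column 35 of each md5sum output line.
    find "$1" -type f -name "$2" -exec md5sum {} + \
        | sort \
        | awk 'seen[substr($0, 1, 32)]++ { print substr($0, 35) }'
}
```

For example, `list_dupes ~/Pictures '*.jpg'` shows the groups, and `extra_copies ~/Pictures '*.jpg'` shows what you would remove if you kept one file per group. I'd still review that list by hand before deleting anything.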

John Bode
  • 139
1

Try FSlint or dupeGuru.

To install FSlint, open a terminal (Ctrl+Alt+T) and run:

sudo apt-get install fslint

Hope this is useful.

stephenmyall
  • 9,885
-2

I've used FSlint to find duplicate files in general. FSlint is "a utility to find and clean various forms of lint on a filesystem."

Aputsiak
  • 234