38

When I run fdupes it finds more than 30,000 duplicate files. I need to keep one file from each set and delete all the other duplicates (because some of them are system files). Please give me a command or script that does this without my having to press "1", "2" or "all" for every group of duplicate files.

user84055

5 Answers

46

You can do this if you want it to run silently (I've just used it to clear 150 GB of dupes on Rackspace block storage... £kerching!!):

fdupes -rdN dir/

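r - recurse into subdirectories
d - delete duplicates (on its own, prompts for which files to keep)
N - used with d: preserve the first file in each set and delete the other dupes without prompting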
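Since -rdN deletes without asking, it may be worth looking at what would go first. A minimal sketch, assuming the -f option (omit the first file of each set) lists the same extra copies that -dN would remove when run with the same options on unchanged files; preview_dupes is a made-up helper name, not part of fdupes:

```shell
# Sketch: list the extra copies without deleting anything.
# preview_dupes is a hypothetical helper name, not an fdupes feature.
preview_dupes() {
    # -r: recurse into subdirectories
    # -f: omit the first file of each set, leaving only the extra copies
    # grep -v '^$' drops the blank lines that separate sets
    fdupes -r -f "$1" | grep -v '^$'
}
```

Something like `preview_dupes dir/ | less` lets you skim the list, and `fdupes -rdN dir/` can follow once it looks right.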
user288359
10

fdupes has a rich CLI:

fdupes -r ./stuff > dupes.txt

Then, deleting the duplicates was as easy as checking dupes.txt and deleting the offending directories. fdupes can also prompt you to delete the duplicates as you go along (the -d option).

fdupes -r /home/user > /home/user/duplicate.txt

Output of the command goes in duplicate.txt.
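The saved list holds sets of paths separated by blank lines, so it can be filtered before anything is deleted. A rough sketch, assuming that layout and paths without embedded newlines; extra_copies is a hypothetical helper, and treating the first path of each set as the one to keep is an arbitrary choice:

```shell
# extra_copies is a hypothetical helper: it reads a saved fdupes listing
# (sets of paths separated by blank lines) and prints every path except
# the first one in each set.
extra_copies() {
    awk '
        /^$/   { seen = 0; next }   # blank line: the next path starts a new set
        !seen  { seen = 1; next }   # first path of a set: keep it, print nothing
               { print }            # remaining paths: candidates for removal
    ' "$1"
}
```

For example, `extra_copies /home/user/duplicate.txt | less` shows only the copies you might remove, which is easier to review than the full listing.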

fdupes compares file sizes and MD5 signatures, followed by a byte-by-byte comparison, to find duplicates.
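To get a feel for the hashing step, here is a rough illustration built from standard tools. It is not how fdupes is implemented, it skips the size and byte-by-byte checks, and it assumes GNU coreutils (md5sum, and uniq with -w and -D); md5_groups is a made-up name:

```shell
# md5_groups is a hypothetical helper illustrating the hashing step only.
md5_groups() {
    # md5sum prints "<32-hex-digit hash>  <path>" for each file;
    # sort puts equal hashes on adjacent lines;
    # uniq -w32 compares only the hash, -D prints every line whose hash repeats.
    find "$1" -type f -exec md5sum {} + | sort | uniq -w32 -D
}
```

Running `md5_groups ./stuff` prints only the files whose hash occurs more than once, grouped by hash.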

Check the fdupes manpage for detailed usage info.

Eliah Kagan
Amol Sale
5

I would use this safer way:

Create a script and move the duplicated files to a new folder. If you move to a folder outside the original folder, fdupes won't report the duplicated files on a second scan, and it will be safer to delete them.

#!/bin/bash

# Folder that receives the duplicates; create it if it does not exist yet
mkdir -p Duplicates

# fdupes -r scans recursively; -f omits the first file of each set,
# so only the extra copies are listed, one per line.
# Reading line by line keeps filenames with spaces intact.
fdupes -r -f . | while IFS= read -r f
do
    # Skip the blank lines that separate sets of duplicates
    [ -n "$f" ] || continue
    # Log the files I'm moving
    echo "Moving $f to folder Duplicates" >> ~/log.txt
    # Move the duplicated file, keeping the original in the original folder
    mv -- "$f" Duplicates/
done
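One thing to watch when moving everything into a single folder: two duplicates with the same file name from different directories would overwrite each other there. A possible variation, sketched under the assumption that paths arrive relative to the scanned directory (e.g. ./a/file.txt), recreates each file's directory under the destination; move_keeping_path is a hypothetical helper name:

```shell
# move_keeping_path is a hypothetical helper.
# $1 is the file to move, $2 is the destination root folder.
move_keeping_path() {
    src=$1
    dest_root=$2
    # Drop a leading "./" so the relative path can be appended to the destination
    rel=${src#./}
    target_dir=$dest_root/$(dirname -- "$rel")
    # Recreate the file's directory under the destination, then move the file there
    mkdir -p -- "$target_dir" && mv -- "$src" "$target_dir/"
}
```

Inside a loop over the duplicates, `move_keeping_path "$f" ../Duplicates` would take the place of the plain mv.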
derHugo
2

I have used FSlint and dupeGuru for quite some time.

  • FSlint supports selection by wildcard and other cleanup methods
  • dupeGuru supports regex

Both can handle more than 10,000 files/folders.

seb
0

I have tried them all: diff, fdupes, rsync, rdfind and shell scripts. Without a doubt, FSlint beats them all hands down. It shows the duplicates and lets you examine them, then merge or delete them. The GUI is very clean and easy to use. I'm using Ubuntu 20.04.

John