52

I have a folder that contains many files, and "rm -rf" takes a long time to complete. Is there any faster way to remove a directory and its contents (subdirectories, etc.)?

7 Answers

50

You could try unlinking the inode for the directory but that would leave you with a whole load of orphan files that fsck will flip out about.

rm is as good as it gets.


A few people are mentioning edge cases where some things are faster than others. But let's make sure we're comparing the best versions of the same things.

If you want to delete a directory and everything in it, I'm suggesting you:

rm -rf path/to/directory

rm internally lists the files and directories it's going to delete, and it's all compiled C. Those two things are why it's the fastest option.

This is very pointedly not the same thing as rm -rf path/to/directory/* which will expand at shell level and pass a load of arguments into rm. Then rm has to parse those and then recurse from each. That's much slower.

Similarly, a "benchmark" that compares find path/to/directory -exec rm {} \; is nonsense. That runs rm once per file it finds. So slow. find can build xargs-style command arguments with -exec rm {} + but that's just as slow as shell expansion. You can use -delete, which issues an internal unlink call to the kernel (like rm does), but it removes files first and directories only once they're empty.
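As a hedged sketch of the -delete variant mentioned above (the /tmp/rm_demo path is just an example): -delete implies -depth, so find unlinks the files first and then rmdirs the now-empty directories on the way back up.

```shell
# Example scratch tree (paths are illustrative, not from the question).
mkdir -p /tmp/rm_demo/sub
touch /tmp/rm_demo/a /tmp/rm_demo/sub/b

# -delete implies -depth: files go first, then the emptied directories,
# including /tmp/rm_demo itself.
find /tmp/rm_demo -delete
```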

So to repeat, unless you throw the disk into liquid hot magma, rm is king.


On a related note, different filesystems delete things at different rates because of how they're structured. If you're doing this on a regular basis you might want to store these files in a partition formatted in XFS which tends to handle deletions pretty fast.

Or use a faster disk. If you have tons of RAM, using /dev/shm (a RAM disk) could be an idea.
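To illustrate the /dev/shm idea (assuming a Linux system where /dev/shm is a tmpfs mount, which is the usual default; the scratch path is hypothetical): files that live on tmpfs exist only in RAM, so deleting them is memory bookkeeping rather than disk I/O.

```shell
# Put short-lived data on the RAM-backed tmpfs at /dev/shm.
mkdir -p /dev/shm/scratch
dd if=/dev/zero of=/dev/shm/scratch/blob bs=1M count=8 status=none

# Deletion touches no disk, so it is near-instant even for big trees.
rm -rf /dev/shm/scratch
```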

Oli
  • 299,380
20

If you don't need the free space the quickest way is delay the deletion and do that in the background:

  • mkdir .delete_me
  • mv big-directory-that-i-want-gone .delete_me

Then have a crontab that does it in the background, at a quiet time, with a low I/O priority:

3 3 * * * root ionice -c 3 nice find /path/to/.delete_me -maxdepth 1 ! -name \. -exec echo rm -rf "{}" +

Notes:

  • check your output before removing the echo in the crontab!
  • the .delete_me directory has to be in the same filesystem - in case it's not obvious to everyone.
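The two steps above can be sketched end to end (directory names are hypothetical). The key point is that mv within one filesystem is just a rename of the inode, so it returns instantly no matter how large the tree is; the slow rm happens later, and the crontab above additionally wraps it in ionice -c 3 to keep it out of the way.

```shell
# Simulate the pattern inside /tmp (same filesystem throughout).
mkdir -p /tmp/demo_fs/.delete_me
mkdir -p /tmp/demo_fs/big-directory
touch /tmp/demo_fs/big-directory/file1

# Instant: same filesystem, so this is a rename, not a copy.
mv /tmp/demo_fs/big-directory /tmp/demo_fs/.delete_me/

# Later (e.g. from cron), do the slow part at low CPU priority.
nice rm -rf /tmp/demo_fs/.delete_me/big-directory
```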

Update: I found a neat trick to run multiple rm in parallel - this will help if you have a large disk array:

ionice -c 3 nice find target_directory -depth -maxdepth 3 | xargs -d '\n' -P 5 -n 5 rm -rf
  • -depth to do a depth-first traversal.

  • -maxdepth to limit the depth of the directory traversal so we don't end up listing individual files.

  • -d '\n' to split on newlines, which handles spaces in filenames.

  • -P and -n handles the degree of parallelism (check manpage).

ref: http://blog.liw.fi/posts/rm-is-too-slow/#comment-3e028c69183a348ee748d904a7474019

Update 2 (2018): With ZFS shipped with Ubuntu 18.04 I use it for everything and I will create a new dataset for any big project. If you plan ahead and do this beforehand you can simply "zfs destroy" a filesystem when you are done. ;-)

I used the instructions from the zfsonlinux wiki to install Ubuntu to ZFS natively: https://github.com/zfsonlinux/zfs/wiki/Ubuntu-18.04-Root-on-ZFS

14

Sometimes, find $DIR_TO_DELETE -type f -delete is faster than rm -rf.

You may also want to try out mkdir /tmp/empty && rsync -r --delete /tmp/empty/ $DIR_TO_DELETE.

Finally, if you need to delete the contents of a whole partition, the fastest way will probably be to umount it, run mkfs, and remount.

mivk
  • 5,811
2

I think the issue is that there is no perfect way to remove a very large directory and its entire contents without a truly indexed filing system that understands unlinking and doesn't conclude it has missing files, à la fsck. There has to be trust.

For instance, I have ZoneMinder running for a golf range. I constructed a Linux RAID of 1.5 TB to handle the immense amount of data she captures each day (12 camera feeds); how she ever ran on a 120 GB drive is beyond me. Long story short, the folder for all the captured data is about 1.4 TB of her storage. Lots to purge.

Having to reinstall ZM and purge the 1.4 TB old library is no fun, because deleting the old images can take 1-2 days.

A truly indexed FS allows dropping a directory, knowing the data under it is dead, and zeroing out that data is a waste of our time and PC resources. Zeroing out deleted data should be an option. rm just takes too long in the real world on ext4.

Answer: Recursively unlinking all files would be marginally faster, but you would still have to set aside time to run fsck.

Create a script that runs a recursive "for" loop to "unlink" all files under your folders, then just rm or rmdir all the folders to clean up. Manually run fsck to zero out the rest of the data when it's convenient. Kinda lazy, didn't write it out, sorry :).
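A hedged sketch of what that script might look like (the answer doesn't spell it out, so the paths and exact commands here are assumptions): unlink every regular file first, then remove the emptied directories deepest-first. A find pipeline avoids the cost of a shell for-loop per file.

```shell
# Hypothetical target directory, populated for illustration.
TARGET=/tmp/purge_me
mkdir -p "$TARGET/a/b"
touch "$TARGET/a/f1" "$TARGET/a/b/f2"

# Step 1: unlink all regular files (unlink takes one operand per call).
find "$TARGET" -type f -exec unlink {} \;

# Step 2: drop the now-empty directories, deepest first.
find "$TARGET" -depth -type d -exec rmdir {} +
```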

1

The fastest way to delete all files and folders recursively that I was able to come up with is (it's faster than everything posted here, so definitely faster than rm -rf):

perl -le 'use File::Find; find(sub{unlink if -f}, ".")' && rm -rf *

More details and benchmarks can be found here: https://www.slashroot.in/which-is-the-fastest-method-to-delete-files-in-linux

These are the benchmarking results:

Find command with -exec: 14 minutes for half a million files
Find command with -delete: 5 minutes for half a million files
Perl: 1 minute for half a million files
rsync with --delete: 2 minutes 56 seconds for half a million files
1

I've created a multi-threaded replacement for rm with the sole purpose of being the fastest way to delete files, period. In my benchmarking, at worst it is 20% faster than anything else, and it tends to be 2-3 times faster than rm.

The tool: https://github.com/SUPERCILEX/fuc/tree/master/rmz
Benchmarks: https://github.com/SUPERCILEX/fuc/tree/master/comparisons#remove

1

Albeit not useful if you want to purge an existing directory, I'll mention a possible strategy if you know you will have a directory with a lot of files that you will need to purge regularly: put the directory on its own filesystem (e.g., partition). Then when you need to purge it, unmount it, run mkfs, and remount it. For example, OpenBSD advises doing this for /usr/obj, where many files are created during a build of the system and must be deleted before the next build.

fkraiem
  • 12,813