Delete all files EXCEPT for a defined list (including files with spaces, quotes, special characters, etc)

Sometimes a bunch of unwanted files end up somewhere they don't belong. Maybe a cp went wrong, or some other command-line typo created a slew of oddly named files, and mixed in with the bad ones are a number of files you'd really like to keep. The following should be able to handle things:

find /tmp/disasterpiece -type f -print0 | grep -zxvf SAVE_THESE | xargs -0 -L1 rm -v
# or if you want to remove links and other non-directories
find /tmp/disasterpiece ! -type d -print0 | grep -zxvf SAVE_THESE | xargs -0 -L1 unlink

The find will list all the normal files (not directories, etc), or in the second command's case all non-directories. That's passed to grep which has been given the following options:

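  1. -z  delimit lines based on nuls rather than the normal newlines (line feeds)
  2. -x  only match if the pattern matches the whole line (e.g. foo doesn't match food)
  3. -v  invert the selection (e.g. only print files that aren't in SAVE_THESE)
  4. -f  read patterns out of the specified file

One caveat: grep treats each line of SAVE_THESE as a regular expression, so a saved name containing characters such as [ or * may fail to match itself and be deleted. Adding -F (giving -zxvFf) makes grep compare the lines as literal strings instead.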
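To see how those four options combine, here is a minimal sketch run against made-up records rather than real files. It assumes GNU grep (for -z); the scratch directory, the patterns file and the record names are all hypothetical.

```shell
#!/bin/sh
# Hypothetical scratch data; none of these names come from a real system.
tmp=$(mktemp -d)

# Patterns file: one "keeper" per line, just like SAVE_THESE.
printf '%s\n' 'keep me' 'foo' > "$tmp/patterns"

# Feed four NUL-terminated records through the same grep options.
# tr turns the NULs back into newlines so the result is readable.
kept=$(printf '%s\0' 'keep me' 'food' 'foo' 'drop me' |
    grep -zxvf "$tmp/patterns" | tr '\0' '\n')

# "food" survives the filter because -x demands a whole-record match for "foo".
printf '%s\n' "$kept"

rm -r "$tmp"
```

The records that come out are the ones that would go on to rm in the real pipeline, so anything printed here is what would be deleted.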

Without the -v, SAVE_THESE effectively becomes "REMOVE_THESE": only the listed files are deleted. If you're nervous, that may be the better way to go, because the inverted command pointed at the wrong location will find a lot of files that aren't in SAVE_THESE and delete them all.
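A rough sketch of the "REMOVE_THESE" variant, run in a throwaway directory so nothing real is touched. The directory, list file and file names are hypothetical. Two things differ from the command above and are my own choices: -F is added so the list is matched literally, and -n1 stands in for -L1 as the one-argument-per-command switch.

```shell
#!/bin/sh
# Throwaway directory and list file (hypothetical names).
dir=$(mktemp -d)
list=$(mktemp)
touch "$dir/junk one" "$dir/junk two" "$dir/keeper"

# The list now names the files to delete, one full path per line.
printf '%s\n' "$dir/junk one" "$dir/junk two" > "$list"

# No -v here: only paths that appear in the list reach rm.
find "$dir" -type f -print0 | grep -zxFf "$list" | xargs -0 -n1 rm -v

# Whatever was not listed is still there.
remaining=$(ls "$dir")
printf '%s\n' "$remaining"

rm -r "$dir" "$list"
```

Pointed at the wrong directory, this form simply finds nothing to match and deletes nothing.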

Making SAVE_THESE is easy using the find command and editing the output:

    find /tmp/disasterpiece -type f > SAVE_THESE
    # do a little clean up
    vim SAVE_THESE
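If the keepers follow a pattern, the list can also be built without an editor. The following is only a sketch under that assumption: the scratch directory, the file names and the grep expression are illustrative, and the list is written outside the directory being cleaned so it never shows up in find's output.

```shell
#!/bin/sh
# Hypothetical scratch directory with two keepers and one unwanted file.
dir=$(mktemp -d)
list=$(mktemp)
touch "$dir/good_file" "$dir/not_bad" "$dir/s p a c e d"

# Keep only the paths whose final component is one of the wanted names.
find "$dir" -type f | grep -E '/(good_file|not_bad)$' > "$list"

# One full path per line, exactly as find printed it.
count=$(( $(wc -l < "$list") ))
saved=$(sort "$list")
printf '%s\n' "$saved"

rm -r "$dir" "$list"
```

Because the list is newline-delimited, a file whose name itself contains a newline cannot be listed this way and so cannot be saved by it.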

Finally, giving xargs the -0 option in the original command makes it split its input on the nuls, so issues with unusual file names are avoided. The -L1 sends only one file at a time to rm or unlink, which matters for unlink because it accepts exactly one file name.
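The effect of -0 can be checked without deleting anything by counting how many times xargs runs a harmless command. This is a sketch with made-up names; it uses -n1 (at most one argument per command) where the command above uses -L1.

```shell
#!/bin/sh
# Three awkward names, NUL-terminated: spaces, quotes and an embedded newline.
calls=$(printf '%s\0' 's p a c e d' '"quoted"' 'line
feed' | xargs -0 -n1 sh -c 'echo called' sh | wc -l)
calls=$((calls))   # normalise any padding wc adds

# Without -0, xargs splits on blanks, so one name becomes six arguments.
plain=$(printf '%s\n' 's p a c e d' | xargs -n1 sh -c 'echo called' sh | wc -l)
plain=$((plain))

echo "with -0: $calls commands, without: $plain commands"
```

With -0 each name arrives intact as a single argument, which is what keeps rm from being handed fragments of a file name.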

And this is what it should look like:

    $ mkdir /tmp/disasterpiece
    $ cd /tmp/disasterpiece
    $ # going to use STDIN redirection to create 0 length files with odd names
    $ > '!!!'; > "line
    feed"; > "s p a c e d"; > '"quoted"'; > -opt; > --longopt; > '*'; > '???'; > $(echo -e "be\007ll"); > good_file; > not_bad
    $ ls -1
    !!!
    ???
    *
    be?ll
    good_file
    line?feed
    --longopt
    not_bad
    -opt
    "quoted"
    s p a c e d
    $ # question marks are due to unprintable characters, in this case a line feed (\n) and bell (\a)
    $ # to get a better idea of what these filenames actually are use -b
    $ ls -b1
    !!!
    ???
    *
    be\all
    good_file
    line\nfeed
    --longopt
    not_bad
    -opt
    "quoted"
    s\ p\ a\ c\ e\ d
    $ # time to make the list of good files
    $ cd /tmp/
    $ find /tmp/disasterpiece -type f > SAVE_THESE
    $ # a little vi house cleaning, only going to keep good_file and not_bad
    $ # could have just as easily done
    $ # echo -e "/tmp/disasterpiece/good_file\n/tmp/disasterpiece/not_bad" > SAVE_THESE
    $ # for this simple case
    $ vim SAVE_THESE
    $ # could throw an echo in before the rm in the xargs section to get an idea
    $ # of the commands that will be issued
    $ # or give xargs the -p option so that it will prompt for each command
    $ find /tmp/disasterpiece -type f -print0 | grep -zxvf SAVE_THESE | xargs -0 -L1 rm -v
    removed ‘/tmp/disasterpiece/!!!’
    removed ‘/tmp/disasterpiece/be\all’
    removed ‘/tmp/disasterpiece/???’
    removed ‘/tmp/disasterpiece/*’
    removed ‘/tmp/disasterpiece/--longopt’
    removed ‘/tmp/disasterpiece/-opt’
    removed ‘/tmp/disasterpiece/"quoted"’
    removed ‘/tmp/disasterpiece/s p a c e d’
    removed ‘/tmp/disasterpiece/line\nfeed’
    $ cd /tmp/disasterpiece
    $ ls -1
    good_file
    not_bad
