Suramya's Blog


December 14, 2014

Cleaning your Linux computer of cruft and duplicate data

When you use a computer and keep copying data forward every time you upgrade, or when you work with multiple systems, it is easy to end up with multiple copies of the same file. I am very OCD about organizing my data and I still ended up with multiple copies of the same file in various locations. This could have happened because I was recovering data from a drive and needed a temp location to save the copy, or because I forgot that I had already saved the same file under another directory (because I changed my mind about how to classify the file). So this weekend I decided to clean up my system.

This was precipitated by the fact that after my last system reorg I didn’t have a working backup strategy and needed to get my backups working again. Basically I had moved 3 drives to another server and installed a new drive on my primary system to serve as the Backup drive. Unfortunately this required me to format all these drives because they were originally part of a RAID array and I was breaking it. Once I got the drives set up I didn’t get the chance to copy the backup data to the new drive and re-enable the cron job that took the daily backup snapshots (mostly because I was busy with other stuff). Today when I started copying data to the new Backup drive I remembered reading about software that allowed you to search for duplicate data, so I thought I should try it out before copying data around. It is a good thing I did because I found a lot of duplicates and ended up freeing more than 2 GB of space (most of it was due to duplicate copies of ISO images and photos).

I used the following two tools to clean my system: FSlint and BleachBit.

Both of them delete files but are designed for different use cases. So let’s look at them in a bit more detail.

FSlint

FSlint is designed to remove lint from your system. That lint can be duplicate files, broken links, empty directories and other cruft that accumulates when a system is in constant use. Installing it is quite easy: on Debian you just need to run the following command as root:

apt-get install fslint

Once the software is installed, you can either use the GUI or run it from the command line. I used the GUI version because it was easier to visualize the data when seen in a graphical form (Yes I did say that. I am not anti-GUI, I just like CLI more for most tasks). Using the software was as easy as selecting the path to search and then clicking on Find. After the scan completes you get a list of all duplicates along with their paths, and you can choose to ignore them, delete all copies or delete all except one. You need to be a bit careful when you delete because some files might need to be in more than one location. One example of this is DLL files installed under Wine: I found multiple copies of the same DLL under different directories and I would have really messed up my install if I had blindly deleted all duplicates.
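If you just want a rough idea of how much duplication you have before installing anything, the same idea can be sketched with standard tools. This is only an approximation under a few assumptions: it is not how FSlint works internally, list_duplicates is a made-up helper name, and it relies on GNU find and coreutils (uniq -w and --all-repeated are GNU extensions). It only lists files and never deletes anything.

```shell
# Sketch only: print groups of files that share the same SHA-256 checksum.
# list_duplicates is a hypothetical helper name; $1 is the directory to scan.
list_duplicates() {
    # Hash every non-empty regular file, sort so identical hashes are adjacent,
    # then keep only lines whose first 64 characters (the hash) repeat.
    # Groups of duplicates are separated by a blank line.
    find "$1" -type f ! -empty -exec sha256sum {} + \
        | sort \
        | uniq -w 64 --all-repeated=separate
}
```

Calling it as list_duplicates ~/Pictures prints one block per set of identical files. Hashing every file is slow on large trees, and the same caution applies as with the GUI: review each group before removing anything by hand.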

Flossmanuals.net has a nice FSlint manual that explains all the other options you can use. Check it out if you want to use some of the advanced features. Just ensure that you have a good backup before you start deleting files and don’t blame me when you mess up your system without a working backup.
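For two of the other kinds of lint mentioned earlier, broken links and empty directories, plain find can give a quick read-only report. Again this is a sketch and not FSlint's own implementation: the function names are made up for illustration, and -xtype and -empty are GNU find features.

```shell
# Sketch only: report broken symlinks and empty directories without deleting them.
# find_broken_links and find_empty_dirs are hypothetical helper names.
find_broken_links() {
    # With GNU find, -xtype l matches symbolic links whose target does not resolve.
    find "$1" -xtype l
}

find_empty_dirs() {
    # -empty combined with -type d matches directories that contain no entries.
    find "$1" -type d -empty
}
```

Running find_broken_links ~ and find_empty_dirs ~ gives you a list to review. Some applications create empty directories on purpose, so don't treat every hit as something to delete.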

BleachBit

BleachBit is designed for the privacy conscious user and allows you to get rid of cache, cookies, Internet history, temporary files, logs, etc. in a quick and easy way. You also have the option to ensure that the deleted data is really gone by overwriting the file with random data. Obviously this takes time, but if you need to ensure data deletion then it is very useful. BleachBit works on both Windows and Linux and is quite easy to install and use (at least on Linux, I didn’t try it on Windows). The command to install it on Debian is:

apt-get install bleachbit

Usage is also very simple: you just run the software, tick the boxes relevant to the clutter that you want gone, and BleachBit will delete it. It gives you a preview of the files it found so that you can decide whether you actually want to delete them before anything is removed.
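To illustrate what overwriting a file before deleting it looks like, here is a small sketch using shred from GNU coreutils. This is not how BleachBit does it internally, and overwrite_and_delete is a made-up helper name. Keep in mind that on SSDs and on journaling or copy-on-write filesystems an in-place overwrite is not guaranteed to destroy every copy of the data.

```shell
# Sketch only: overwrite a single file with random data, then remove it.
# overwrite_and_delete is a hypothetical helper name; $1 is the file to destroy.
overwrite_and_delete() {
    # -n 1 : make one overwrite pass instead of the default three
    # -u   : deallocate and remove the file after overwriting
    # --   : stop option parsing so file names starting with '-' are handled
    shred -n 1 -u -- "$1"
}
```

For example, overwrite_and_delete old-passwords.txt would overwrite and then remove that (hypothetical) file. There is no preview step and no undo here, so double check the path first.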

Well this is all for now. Will write more later.

Thanks to "How to Sort and Remove Duplicate Photos in Linux" for pointing me towards FSlint and "Ten Linux freeware apps to feed your penguin" for pointing me towards BleachBit.

– Suramya


2 Responses to " Cleaning your Linux computer of cruft and duplicate data "

  1. So good info in so little words.
    Keep it up!

    Comment by Ashish — December 15, 2014 @ 2:17 AM

  2. Thanks. Glad you liked it…

    Comment by Suramya — December 26, 2014 @ 2:26 AM
