Introduction: Cleaning Up My Data Storage - the Adventure...

About: Daddy-O...

Howdy All -

I've finally got a handle on this, and here's my story.


I mention different hardware and software in this Instructable. I will just share the names and basic functions.

Should you want or need to use these packages, please do some research on your own, as I did, to learn the functions, details, and configuration.

Time does not allow me to address specific details on how to use these. I find that the forums for various products are very helpful.

I recognize that some folks are going to be more computer-savvy than I am, and others will be less so. I hope this provides enough of a general set of steps for folks who may have similar data storage challenges.

I'm confident that some readers have a more efficient way of performing these operations - these are the steps that have worked for me. I'll chat this up with my colleagues face-to-face to discuss other options.

Step 1: Some Data Storage History...

I have been using a 2TB RAID 1 NAS (2 x 2TB SATA drives), my Z: drive, for about 4 years to back up my Windows laptops and computers at home. It has been hard-wired by Ethernet to the switch ports on my WiFi router. It's been showing about 300GB of free space over the last couple of months.

I've been using GoodSync to make my backups happen. GoodSync has also been handy for backing up my USB sticks to my laptop and Google Drive, especially when I lose a USB stick - AGAIN!! I often need access to these 2D/3D drawing files from other locations, namely the TechShop.

Step 2: Pondering...

After many months (perhaps years?) of wondering how to go about it, I started my Storage Pruning project about 3 weeks ago.

I finally started getting some clarity once I wrote down some of the general tasks required to do this.

I've also been moving my home media distribution to a new, compact ASUS computer, and I'm looking to add more Linux activity to my home network and systems.

Step 3: Beginning...

Not wanting to be a cowboy about the whole operation, I first did the following:

Starting a couple of weeks ago, I re-provisioned a portable 2TB WD USB3.0 drive (my Y: drive), which I attached to my (Linux-based) WiFi router. I found a new OS for the router and gained SFTP functionality, which makes copying files with FileZilla easier and more visible.

I fully duplicated the Z: Drive to the Y: drive, making me feel even more comfortable about my photo/music/movie/video/document files being safe.

Being basically paranoid, I had also made a copy of the Z: drive back when it held only about 500GB of its 2TB, and I keep that copy stored at a different house.

Step 4: Tragedy...

As I was about to Prune, tragedy struck - both the Y: and Z: drives jumped from 7 feet (without parachutes, mind you) to the floor!!! I won't explain the how or why...

I was pleased, though, because the portable Y: drive was still working, allowing me to begin building ANOTHER redundant data set on a new 2TB USB3.0 Seagate (the Blue Drive).

About 18 hours into the transfers, though, I found that my Y: drive would only click, and could not be initialized by Windows 7. I could not get access to these files...

Step 5: But Wait! There's More!

One of the 2 SATA drives from the NAS was also clicking when I powered the NAS back up.

I was no longer able to reach the configuration GUI on this (newly engineered and created!) NAS brick, and I couldn't make changes or transfer files - networking was broken.

The other SATA drive was happier, fortunately, and I used a USB3.0/SATA converter to regain access to the RAIDed files on this old Z: drive.

But Windows 7 was not able to readily access the RAID partition on this SATA drive.

Step 6: Phew!!!

Using DiskInternals Linux Reader for Windows 7, I successfully copied ALL files from my (formerly RAID 1) Z: drive to the new Blue Drive. The (formerly) NAS box ran a Linux-based OS before spilling all the 1s and 0s all over the floor, which would explain why Windows could not read the partition on its own.

The healthy old Z: drive will be kept in a safe place in case I might ever need it again for recovering files.

Step 7: Proceeding...

Here's how I'm proceeding now:

Step 8: Finding the Files of Different Types...


I used SpaceMonger to get a feel for where there might be "hidden" media files in my (formerly NAS) data set and on my laptop's Windows 7 data drive (F:).

This helped speed up my scans with Duplicate Cleaner Pro, since I could point each scan at the directories most likely to contain duplicates.
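For anyone who would rather script this "where is the media hiding?" survey, here is a minimal Python sketch. It is not what SpaceMonger does internally; it just totals media file sizes per directory so the biggest candidates float to the top. The extension list and the `F:\` path in the usage comment are assumptions to adjust for your own drives.

```python
import os
from collections import defaultdict

# Assumed set of media extensions; edit to match your own files.
MEDIA_EXTENSIONS = {".mp3", ".flac", ".wav", ".jpg", ".png", ".mp4", ".avi", ".mkv"}


def media_bytes_by_directory(root):
    """Return {directory: total bytes of media files directly inside it}."""
    totals = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            extension = os.path.splitext(name)[1].lower()
            if extension in MEDIA_EXTENSIONS:
                try:
                    totals[dirpath] += os.path.getsize(os.path.join(dirpath, name))
                except OSError:
                    pass  # unreadable file; skip it rather than abort the survey
    return dict(totals)


def largest_directories(root, count=10):
    """Return the `count` directories holding the most media bytes, biggest first."""
    totals = media_bytes_by_directory(root)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:count]


# Example usage (the drive letter is an assumption):
# for directory, size in largest_directories("F:\\"):
#     print(f"{size / 1e9:8.2f} GB  {directory}")
```

The directories at the top of that list are the ones worth scanning for duplicates first.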

Step 9: The Spreadsheet


I created a spreadsheet with the following 5 column headings:

Date/Time, Step #, Scan Profile Location, Scan against self?, Free space after this scan
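If you prefer a plain CSV file to a spreadsheet program, the same five columns can be kept with a few lines of Python. This is only a sketch: the function name `append_scan_row`, the file name `pruning_log.csv`, and the free-space figure in the usage comment are made up for illustration.

```python
import csv
import os
from datetime import datetime

# The five column headings from the spreadsheet.
COLUMNS = ["Date/Time", "Step #", "Scan Profile Location",
           "Scan against self?", "Free space after this scan"]


def append_scan_row(csv_path, step, profile_path, against_self, free_space_gb, when=None):
    """Append one scan record, writing the header first if the file is new or empty."""
    when = when or datetime.now()
    is_new = not os.path.exists(csv_path) or os.path.getsize(csv_path) == 0
    with open(csv_path, "a", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        if is_new:
            writer.writerow(COLUMNS)
        writer.writerow([
            when.strftime("%Y-%m-%d %H:%M"),
            step,
            profile_path,
            "yes" if against_self else "no",
            f"{free_space_gb:.1f} GB",
        ])


# Example usage (the step number and free-space value are placeholders):
# append_scan_row(
#     "pruning_log.csv", 1,
#     r"C:\Users\ToGo\Desktop\Duplicate Cleaner\old z drive match all audio only.dc",
#     against_self=True, free_space_gb=300.0,
# )
```

Any spreadsheet program will open the resulting file, so nothing is lost by logging this way.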

Step 10: It Helps!


I want to go through the Storage Pruning in steps, slowly and deliberately, and the spreadsheet helps me keep track.

Step 11: Logging


I made sure to log all the activity using Duplicate Cleaner's logging function. The log shows the scan configuration, matched duplicates, deletions, etc., which may come in handy at some point.

Each duplicate scan appends to the same log file.

Step 12: Scan Profiles


As I took the small, deliberate Pruning steps, I saved each Scan Profile with a descriptive name. I saved it just after setting the file parameters and scan locations, and before launching the scan.

For example:

C:\Users\ToGo\Desktop\Duplicate Cleaner\old z drive match all audio only.dc

The descriptive name also appears in the log file, which is very handy for confirming which operations ran with which parameters. I also placed these file names in my spreadsheet.

Step 13: Time Stamps


I took the time stamp of each scan from the log and placed it in my spreadsheet to help me keep track of the operations - much faster and easier for me than grepping the log file.

Step 14: Monitor the Scan Progress


During each scan, I monitored the disk activity using the Windows 7 Resource Monitor (Task Manager - Performance - Resource Monitor).

This gave me a feel for how far along the scan was - I could see which specific files in which specific disk locations were being accessed for the scan and then the audio hash build.

Fortunately, the disk reads were on either my laptop's F: drive or the USB3.0-attached Blue Drive. They went very fast.

My ReadyBoost-dedicated 32GB USB stick kicked in a number of times during the scan, as I could see in the Resource Monitor's listing of Disk Reads.

Step 15: Pruning...


Once the scan and hash were done, I pruned (marked, confirmed, and deleted) the files using various Selection Assistant functions in Duplicate Cleaner.
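To give a feel for what a scan-and-hash pass involves, here is a minimal Python sketch of duplicate detection. It is a stand-in technique, not Duplicate Cleaner's own method (I don't know its internals, and its audio matching is a different thing): this version only finds files that are byte-for-byte identical, by grouping on file size and then on a SHA-256 hash. It only reports groups and never deletes anything; the function names are my own.

```python
import hashlib
import os
from collections import defaultdict


def file_sha256(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large media files never sit in memory whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return a list of groups; each group is a sorted list of identical files."""
    # First pass: group by size, since files of different sizes cannot match.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                by_size[os.path.getsize(path)].append(path)
            except OSError:
                continue  # unreadable file; leave it out of the scan

    # Second pass: hash only the files that share a size with another file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_sha256(path)].append(path)
        groups.extend(sorted(group) for group in by_hash.values() if len(group) > 1)
    return groups
```

Reviewing each group by hand before deleting anything keeps the same marked, confirmed, then deleted discipline described above.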

Step 16: Free Space...


To help me keep track, I also added the Blue Drive free space on my spreadsheet.
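The free space figure can also be read programmatically instead of from the drive's Properties window. A small sketch using Python's standard `shutil.disk_usage`; the drive letter in the usage comment is an assumption, since the Blue Drive's letter will differ from machine to machine.

```python
import shutil


def free_space_gb(path):
    """Return the free space, in decimal gigabytes, on the drive holding `path`."""
    return shutil.disk_usage(path).free / 1e9


# Example usage (the drive letter "E:\\" is hypothetical):
# print(f"Blue Drive free space: {free_space_gb('E:\\\\'):.1f} GB")
```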

Step 17: It Continues...


I'll keep going through these Pruning steps by scanning with SpaceMonger to look for the likeliest locations of duplicate files, then using Duplicate Cleaner to clear space.

Step 18: Second Tier...


I've ordered another 2TB USB3.0 drive, which I'll use as a second-tier backup location for my Blue Drive.

(As I'm writing this, I've just received a message from Amazon that my drive has been shipped!)

Step 19: Moving Forward


Once I'm back to steady-state, I'm looking into using rsync to keep my backups and storage more tidy.
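As a starting point for that research, here is a hedged Python sketch that builds and runs an rsync mirror command. It assumes rsync is installed (for example on a Linux box), and the mount points in the usage comment are hypothetical. The flags are standard rsync options: `--archive` preserves timestamps and permissions, `--delete` removes files from the destination that no longer exist in the source, and `--dry-run` only reports what would change. The sketch defaults to a dry run because `--delete` is unforgiving.

```python
import subprocess


def build_rsync_command(source, destination, dry_run=True):
    """Build an rsync command that mirrors the contents of source into destination."""
    command = ["rsync", "--archive", "--delete", "--itemize-changes"]
    if dry_run:
        command.append("--dry-run")  # report changes without touching any files
    # A trailing slash on the source means "copy the contents of this directory".
    command += [source.rstrip("/") + "/", destination]
    return command


def run_backup(source, destination, dry_run=True):
    """Run the mirror; raises CalledProcessError if rsync exits with an error."""
    return subprocess.run(build_rsync_command(source, destination, dry_run), check=True)


# Example usage (both mount points are hypothetical):
# run_backup("/mnt/blue", "/mnt/backup2")                 # preview only
# run_backup("/mnt/blue", "/mnt/backup2", dry_run=False)  # actually mirror
```

Reading the dry-run output before the real run is a cheap way to catch a mistyped path.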

Participated in the Coded Creations contest