Introduction: Code to Order Filenames to Your Liking

In this Instructable you will learn how to quickly order files into a desired sequence. We will do this by padding the numbers in file names with leading zeros so they sort in incremental order. Zero padding gives the number in every file name the same count of digits, so plain alphabetical sorting matches numeric order.
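
To see why padding matters, here is a minimal sketch in Python (not one of the two languages used in this Instructable; the track names are made up for illustration). It sorts the same five names with and without zero padding:

```python
# Plain string sorting compares character by character,
# so "track-10" lands before "track-2".
numbers = (1, 2, 10, 11, 100)
names = [f"track-{n}.mp3" for n in numbers]
print(sorted(names))

# With every number padded to three digits,
# alphabetical order and numeric order agree.
padded = [f"track-{n:03d}.mp3" for n in numbers]
print(sorted(padded))
```

The first list comes out as 1, 10, 100, 11, 2, while the padded list comes out as 001, 002, 010, 011, 100.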

This is especially useful when you are preparing files for input to a system or process that only handles files in alphabetical order. I built the solution in the Perl programming language and also in PowerShell. With the code I am supplying here I can pad and order file names without the need for additional software.

The problems that caused me to come up with this solution are as follows:

I have a stereo that in addition to normal CDs, will play CDs which I've burned mp3 files onto. I wanted a way to control the order in which the songs are played. I found that the mp3 filenames can be whatever I like. However, the player will only use the filename order to decide which order to play the files. There might be a way to build a playlist, but even if there is it's too time consuming compared to simply naming the files the way I want from the beginning.

Another reason the included code is handy is when I use the command ffmpeg to convert a sequence of images into video. The files need to be in alphabetical order for the sequence to be correct in the video.

Step 1: Making a "Pile of Files" ( Test Files )

First we'll need a set of test files so you can play with ordering. I've attached a file (filepile.tarfile.gz) that can be expanded into test files (using the free program 7zip), or you can run one of the following scripts to create test files that better match your situation (filepile.pl or filepile.ps1)


The file name structure has a heavy influence on how the solution parses names, which is why we set it up first. In this case file names are structured as "name-#.mp3", allowing us to use the "-" and the "." for parsing.

As a side note, programs like CDEX can create zero padded files when extracting tracks. However, it's pretty easy to forget, and you need to know the proper number of places before you start.

For Linux I used Perl:

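#!/usr/bin/perl
use strict;
use warnings;

my $dir = "./data/";
my $title = "track-";
my $extension = ".mp3";

# make sure the target directory exists before creating files in it
mkdir $dir unless -d $dir;

foreach my $index (1 .. 100) {
  my $fn = $dir . $title . $index . $extension;
  # create an empty file (same effect as the touch command)
  open(my $fh, '>>', $fn) || die "Can't create $fn: $!\n";
  close($fh);
}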

For Windows I used PowerShell:

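$array = 1..100
$dir = '.\data'
# create the target directory if it does not exist yet
New-Item -ItemType Directory -Path $dir -Force | Out-Null
foreach ($element in $array)
{
  $filnam = "lecture-" + $element + ".mp3"
  #Write-Host $filnam
  # write a small placeholder file
  "Hello" > $dir\$filnam
}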

In the end you will have a directory with 100 files that don't include zero padding.
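
If neither Perl nor PowerShell is at hand, the same pile of files can be sketched in Python. This is only an illustration: the function name make_file_pile and its parameters are my own, not part of the attached scripts.

```python
from pathlib import Path


def make_file_pile(directory, title="track-", extension=".mp3", count=100):
    """Create empty, unpadded test files such as track-1.mp3 ... track-100.mp3."""
    target = Path(directory)
    # create the directory first so the script works on a fresh checkout
    target.mkdir(parents=True, exist_ok=True)
    created = []
    for index in range(1, count + 1):
        path = target / f"{title}{index}{extension}"
        path.touch()  # empty file, like the touch command
        created.append(path.name)
    return created
```

Calling make_file_pile("./data") should leave the same 100 unpadded names in ./data as the Perl script does.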

Step 2: Using Powershell to Pad Files

I wanted a solution that uses a scripting language that comes pre-installed on each platform. On Windows, the choice I made was PowerShell. This scripting language, built on top of the .NET framework, comes configured on the most recent versions of Windows and is even available for download for XP. Additionally, the framework provides a long laundry list of features that pique my curiosity. Most of those features fall under the category of "that made things a whole lot simpler" but are outside the scope of this discussion.

On the whole, I found the language easy to pick up using a combination of trial and error, some language reference guides, and internet searches for syntax examples. Searches not only helped me find answers to immediate problems, but also gave me ideas for other solutions. The documentation invited discovery and was easily available at the prompt.

Additionally, an IDE (the PowerShell ISE, powershell_ise) is available with no need for any additional installation. The only thing I needed to do was enable my scripts to run. The two images illustrate the command I ran as administrator to allow my scripts to run, and the IDE, which can be quickly opened by right-clicking a file with a .ps1 extension and selecting Edit.

The solution is customized to the input files' naming structure. Different naming conventions might require some slight alteration to the script. (More on this later.)

The Preconditions:

  • Each file sequence is kept in its own directory.
  • The files will have the naming convention of sequence-#.extension, allowing them to later be combined with other sequences.
    • examples: lecture-1.mp3 lecture-13.mp3 lecture-5.mp3
    • examples: ece453_lecture-1.mp3 ece453_lecture-1.mp3
    • counter-example (not supported): ece453-lecture-1.mp3, because the extra "-" breaks the split that isolates the number
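
The naming convention above can be expressed as a small check. The following is a hedged Python sketch (the function name split_name is hypothetical). It mirrors how the scripts in this Instructable split on "." and "-", and rejects names with an extra hyphen:

```python
def split_name(filename):
    """Return (sequence, number, extension), or None if the name does not fit."""
    # everything before the first "." is the part that carries the number
    stem, dot, extension = filename.partition(".")
    parts = stem.split("-")
    # exactly one "-" is allowed, and the part after it must be all digits
    if not dot or len(parts) != 2 or not parts[1].isdigit():
        return None
    return parts[0], parts[1], extension


print(split_name("lecture-13.mp3"))
print(split_name("ece453_lecture-1.mp3"))
print(split_name("ece453-lecture-1.mp3"))
```

The first two calls return the three parts of the name; the last one returns None because of the second hyphen.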

The process:

  • iterate through all filenames in a directory
    • break down filenames
    • find the maximum number of digits in the numbered part of the sequence
  • iterate through all filenames in a directory
    • break down filenames (reuse if stored)
    • rebuild the filename with the max number of digits found above
    • move the file to new name (test output before executing / copy instead of move)

# Zero padding file numbers
$dir = "c:\dev\data"
$items = Get-ChildItem -Path $dir
# widest number found so far
$places = 0
# first pass: find the maximum number of digits
foreach ($item in $items)
{
      # only process files; skip directories
      if (-not $item.PSIsContainer)
      {
            # Write-Host $item.Name
            $fs1 = $item.Name.split(".")
            $fs2 = $fs1[0].split("-")
            $siz = $fs2[1] | Measure-Object -Character | select -expandproperty characters
            if ( $siz -gt $places)
            {
                $places = $siz
                # Write-Host "places is " $places
            }
      }
}

# second pass: create new file names & rename
foreach ($item in $items)
{
      # only process files; skip directories
      if (-not $item.PSIsContainer)
      {
            # Write-Host $item.Name
            $fs1 = $item.Name.split(".")
            $fs2 = $fs1[0].split("-")

            # left-pad the number with zeros to the widest width found in the first pass
            $newnum = $fs2[1].PadLeft($places, "0")
            $newnam = $fs2[0] + "-" + $newnum + "." + $fs1[1]
            # Write-Host $newnam
            # rename only when the name actually changes
            if ($newnam -ne $item.Name)
            {
                  $item | rename-item -NewName $newnam
            }
      }
}

The name structure allows me to peel back the different parts of the file name and get at the number. I check to see if the number has more digits than previously checked numbers. If so, I update the number to check for subsequent iterations. Once I have checked all the files, I can use what I have found to reconstruct the number and move the file into the desired name.
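
The two-pass routine described above can also be sketched in Python, which makes the "test output before executing" advice easy to follow. This is an illustration under the same name-#.extension assumption, not a port of the author's scripts; plan_renames and apply_renames are hypothetical names.

```python
from pathlib import Path


def plan_renames(directory):
    """Return sorted (old_name, new_name) pairs; nothing is renamed here."""
    entries = []
    for path in Path(directory).iterdir():
        if not path.is_file():
            continue  # skip subdirectories
        stem, dot, extension = path.name.partition(".")
        prefix, dash, number = stem.partition("-")
        # keep only names shaped like prefix-digits.extension
        if dot and dash and number.isdigit():
            entries.append((path.name, prefix, number, extension))
    if not entries:
        return []
    # first pass result: the widest number decides the padding width
    places = max(len(number) for _, _, number, _ in entries)
    plan = []
    for old, prefix, number, extension in entries:
        new = f"{prefix}-{number.zfill(places)}.{extension}"
        if new != old:  # leave names that are already wide enough
            plan.append((old, new))
    return sorted(plan)


def apply_renames(directory, plan):
    """Carry out a plan produced by plan_renames."""
    base = Path(directory)
    for old, new in plan:
        (base / old).rename(base / new)
```

Printing the result of plan_renames first acts as a dry run; apply_renames is only called once the plan looks right.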

Step 3: Using Perl to Pad Files

The Perl version is useful on Linux boxes, where Perl is a commonly pre-installed interpreter. However, Perl can easily be installed on Windows for free as well.

The solution follows almost the exact same routine. It checks that a file name has actually changed before renaming the file. Otherwise there are only minor syntax changes, along with swapping in functions that do the same job. For example, Measure-Object becomes length, and the PadLeft call becomes sprintf.

#!/usr/bin/perl
use strict;
use warnings;

my $dir = "./data/";

opendir(D, "$dir") || die "Can't open directory $dir: $!\n";
my @list = readdir(D);
closedir(D);
my $len = 1;
# first pass: find the maximum number of digits
foreach my $index (@list) {
  if (($index ne '.') && ($index ne '..')) {
    my @fn1 = split("\\." , $index);
    my @fn2 = split('-' , $fn1[0]);
    # numeric comparison (<); the string operator lt would sort 10 before 9
    if ($len < length($fn2[1]) ){
      $len = length($fn2[1]);
    }
  }
}

#print "length : " . $len . "\n";

my $formatstring = "%0" . $len . "d";

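# second pass: rebuild each name and rename the file
foreach my $index (@list) {
  if (($index ne '.') && ($index ne '..')) {
    my @fn1 = split("\\." , $index);
    my @fn2 = split('-' , $fn1[0]);
    my $nfn = $fn2[0] . '-' . sprintf($formatstring , $fn2[1] ). '.' . $fn1[1] ;
    # only rename when the name actually changes
    if ($index ne $nfn) {
      # the built-in rename avoids a shell call, so names with spaces are safe
      rename($dir . $index, $dir . $nfn) || warn "Can't rename $index: $!\n";
    }
  }
}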

Step 4: Further Development and Reference Links

As I mentioned earlier, the name structure has a large impact on how the script handles parsing. However, fact checking the ffmpeg documentation gave me the idea that I might be able to use regular expressions to make a program that might adapt to different name structures.
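
As a rough idea of what the regular expression approach could look like, here is a Python sketch. It assumes the number to pad is the last run of digits before the extension; that rule, the pattern, and the name pad_last_number are my own assumptions, not something taken from the ffmpeg documentation.

```python
import re

# the last run of digits: digits followed only by non-digits up to the end
LAST_NUMBER = re.compile(r"(\d+)(?=\D*$)")


def pad_last_number(filename, places):
    """Zero pad the last run of digits in the name, leaving the extension alone."""
    # split off the extension first so the "3" in "mp3" is never touched
    stem, dot, extension = filename.rpartition(".")
    if not dot:
        stem, extension = filename, ""
    padded = LAST_NUMBER.sub(lambda m: m.group(1).zfill(places), stem, count=1)
    return padded + dot + extension


print(pad_last_number("ece453-lecture-1.mp3", 3))
print(pad_last_number("frame7.png", 4))
```

Because the pattern does not depend on a "-" separator, the name that was excluded by the preconditions in Step 2 (ece453-lecture-1.mp3) can be handled as well.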

Additionally, I was recently told that Microsoft is taking .NET open source and cross-platform.

I think this might be a good learning experience to explore implementing the project with that tool.

Some of the links I found helpful in developing this solution:

Perl documentation perldoc.perl.org

Enable your scripts to run : how-to-create-and-run-a-powershell-script.html

Zero padding:formatting-through-f-parameter.html/

http://tfl09.blogspot.com/2007/11/formatting-with-...

http://snipplr.com/view/14130/leftpadding-in-power...

Getting directory contents : http://geekswithblogs.net/TechnicallySpeaking/arch...

Renaming file: http://irisclasson.com/2013/12/14/renaming-files-i...

Participated in the
Coded Creations