Wednesday, 27 July 2011

App::UniqFiles (a case for building apps with Dist::Zilla and Sub::Spec)

When watching videos on Tudou or Youku, both Chinese YouTube-like video sites, you'll often get one or two 15- or 30-second video ads at the beginning. Since I have downloaded lots of videos recently, my Opera browser cache contains a bunch of these video ad files, each usually ranging from around 500 KB to a little over 1 MB. Many of them are duplicates.

I thought I'd collect these ads for learning Chinese, but I don't want the duplicates, only one file per distinct ad. The result: App::UniqFiles, which contains a command-line script called uniq-files. Now all I need to do is type mkdir .save; mv `uniq-files *` .save/ and then delete the duplicate videos, which are the files not moved to .save/.
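The idea behind such a script can be sketched in a few lines of Perl. This is only a minimal illustration that assumes files count as duplicates when their MD5 digests match; the sub name uniq_by_content is made up here, and this is not necessarily how App::UniqFiles itself is implemented.

```perl
use strict;
use warnings;
use Digest::MD5;

# Hypothetical helper (not the App::UniqFiles API): return one file name
# per distinct content, keeping the first file seen for each digest.
sub uniq_by_content {
    my @files = @_;
    my (%seen, @uniq);
    for my $file (@files) {
        next unless -f $file;    # skip directories and other non-files
        open my $fh, '<', $file
            or do { warn "Can't open $file: $!\n"; next };
        binmode $fh;
        my $digest = Digest::MD5->new->addfile($fh)->hexdigest;
        close $fh;
        push @uniq, $file unless $seen{$digest}++;
    }
    return @uniq;
}

# Print the unique files, one per line, like a bare-bones uniq-files.
print "$_\n" for uniq_by_content(@ARGV);
```

Run as a script with a list of file names, it prints only the first file of each group of identical files, which is the list the mv command above needs.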

With the help of Dist::Zilla, Sub::Spec::CmdLine, Pod::Weaver::Plugin::SubSpec, and Log::Any::App, I managed to finish App::UniqFiles, from scribbling down the concept to uploading the first release to CPAN and GitHub, in just under an hour (00:54 to be exact). Not super-speedy for a small script (I could probably write a one-off script version in 15-30 minutes), but for the extra 30 minutes, I get:


  • a proper Perl distribution, with tests and POD and all;
  • all the core functionality contained in subroutines (which is much more reusable than a script);
  • POD API documentation for the subroutines;
  • a command-line application with --help message, argument parsing, configurable log levels, even bash completion with just 3 lines of code.


I think developing with Dist::Zilla and Sub::Spec is great, mainly because they put the DRY ("Don't Repeat Yourself") principle into practice and free you from mundane tasks. Having to repeat the same stuff or do mindless, tedious tasks is a significant source of frustration for programmers. It distracts us from the real, important task: writing the code that actually solves our problems.

Dist::Zilla lets you generate the dist's README from the main module's POD instead of creating the file by hand. It inserts the LICENSE, AUTHORS, and VERSION sections into your POD so that you don't have to add and update them manually. It frees you from mundane tasks like creating dist tarballs, checking the ChangeLog, incrementing version numbers, uploading to CPAN, etc. Really, I wouldn't want to build dists manually ever again without tools like Dist::Zilla.

Sub::Spec lets you specify rich metadata for your sub in one place, from which you can generate Getopt::Long options, POD documentation, the command-line --help message, etc., instead of maintaining each of them manually. Modules like Sub::Spec::CmdLine also free you from many mundane UI issues (which, coincidentally, I hate) like parsing arguments and formatting output data for the screen.
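To illustrate the "one place for metadata" idea, here is a rough sketch in plain Perl. The %meta layout and the helper subs getopt_specs and help_text are invented for this example; they are not Sub::Spec's actual schema or API, only a toy version of the same principle.

```perl
use strict;
use warnings;

# Hypothetical metadata layout (NOT Sub::Spec's real format): a single
# description of the sub's arguments, written once.
my %meta = (
    summary => 'Report unique files',
    args    => {
        count   => { type => 'bool', summary => 'Show number of occurrences' },
        pattern => { type => 'str',  summary => 'Only consider matching names' },
    },
);

# Derive Getopt::Long option specs from the metadata.
sub getopt_specs {
    my ($m) = @_;
    my %suffix = (bool => '!', str => '=s', int => '=i');
    return map { $_ . $suffix{ $m->{args}{$_}{type} } }
           sort keys %{ $m->{args} };
}

# Derive a --help message from the very same metadata.
sub help_text {
    my ($m) = @_;
    my $text = "$m->{summary}\n\nOptions:\n";
    $text .= sprintf("  --%-10s %s\n", $_, $m->{args}{$_}{summary})
        for sort keys %{ $m->{args} };
    return $text;
}
```

Adding a new argument means touching only %meta; the option specs and the help text follow automatically, which is the kind of repetition the real modules remove on a much larger scale.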

Monday, 25 July 2011

Undocumented Getopt::Long::Configure feature

Getopt::Long has a Configure() function to let you customize its parsing behaviour, e.g. whether or not to be case-sensitive, whether or not unknown options are passed unmodified or generate an error, etc. However, this customization is global: it affects every piece of code using Getopt::Long.

Since I use Getopt::Long in a utility module, my settings might conflict with those of a user who also uses Getopt::Long alongside my module, so I need to localize the effect of my Configure() call. I was about to submit an RT wishlist ticket for this, but some quick checking revealed that Configure() already has this feature.

Configure() returns an arrayref containing all the options as they were before the call (so calling it with no arguments gives you the current options). If you pass such an arrayref back to it, it sets all the options from it. This way, you can save and restore options.
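A save-and-restore round trip might look like the sketch below. The wrapper name parse_args_locally and the option names (verbose, name) are made up for the example, and since the feature is undocumented, treat the exact behaviour as something to verify against your installed Getopt::Long version.

```perl
use strict;
use warnings;
use Getopt::Long ();

# Hypothetical wrapper: parse $argv with our own settings, then put the
# caller's Getopt::Long configuration back the way it was.
sub parse_args_locally {
    my ($argv) = @_;

    # Apply our settings; the return value holds the previous configuration.
    my $old = Getopt::Long::Configure('pass_through', 'no_ignore_case');

    my %opt;
    my $ok = Getopt::Long::GetOptionsFromArray(
        $argv, \%opt, 'verbose!', 'name=s',
    );

    # Passing the saved arrayref back restores the earlier configuration.
    Getopt::Long::Configure($old);

    return ($ok, \%opt);
}
```

Code that calls this sub keeps whatever Getopt::Long settings it had before. A more defensive version would also restore the configuration if parsing dies, for example by wrapping the parse in eval.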

Thanks to the Getopt::Long author, Johan Vromans, who apparently has maintained this module since 1990!