Monday, 07 February 2011

The coming bloated Perl apps?

A few weeks ago, I got annoyed that one of our command-line applications was getting slower and slower to start up (the delay was becoming more and more noticeable), so I decided to do some refactoring, e.g. splitting large files into smaller ones and delaying the loading of modules until they are needed.

Sure enough, one of the main causes of the slow start up was preloading too many modules. Over the years I had been blindly sticking 'use' statements into our kitchen sink $Proj::Utils module, which was used by almost all scripts in the project. Loading $Proj::Utils alone pulled in over 60k lines from around 150 files!
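As a minimal sketch of what "delay loading until necessary" can look like: a compile-time `use` is swapped for a runtime `require` inside the one function that needs the module. The `checksum` helper is hypothetical, and the core module Digest::MD5 merely stands in for any heavy dependency; this is not the project's actual code.

```perl
use strict;
use warnings;

# Hypothetical helper; Digest::MD5 stands in for any heavy module.
sub checksum {
    my ($data) = @_;
    # 'require' runs when this line is reached, not at compile time,
    # so the module is only loaded the first time checksum() is called.
    # Later calls find it in %INC and do not load it again.
    require Digest::MD5;
    return Digest::MD5::md5_hex($data);
}

my $loaded_before = exists $INC{'Digest/MD5.pm'} ? 1 : 0;
my $sum           = checksum("hello");
my $loaded_after  = exists $INC{'Digest/MD5.pm'} ? 1 : 0;

print "loaded before first call: $loaded_before\n";
print "loaded after first call:  $loaded_after\n";
print "md5: $sum\n";
```

Scripts that never call the function never pay for the module. The trade-off is that `require` does not call `import`, so functions have to be called by their full package name (or imported by hand).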

After I split things up, it became clearer which modules are particularly heavy. This one stood out:

% time perl -MFile::ChangeNotify -e1
real 0m0.972s


% perl -MDevel::EndStats -MFile::ChangeNotify -e1
# Total number of module files loaded: 129
# Total number of modules lines loaded: 46385


So that's almost 130 files and more than 46k lines just from loading File::ChangeNotify. 130 files for a filesystem monitoring routine! Who would've thought that a filesystem monitor needs that much code? Compare with, say, a recent HTTP client:

% perl -MDevel::EndStats -MHTTP::Tiny -e1
# Total number of module files loaded: 18
# Total number of modules lines loaded: 6089


I quickly switched to Linux::Inotify2 and things are much better now (but I might have to revisit this since we want to give the new Debian/kFreeBSD a Squeeze).

As I suspected (since the module is also written by Dave Rolsky), File::ChangeNotify uses Moose, which is not particularly lightweight either:

% time perl -MMoose -e1
real 0m0.712s


% perl -MDevel::EndStats -MMoose -e1
# Total number of module files loaded: 100
# Total number of modules lines loaded: 35760


Compare with:

% time perl -MMouse -e1
real 0m0.089s


% perl -MDevel::EndStats -MMouse -e1
# Total number of module files loaded: 20
# Total number of modules lines loaded: 6675


Come to think of it, running Dist::Zilla is also quite painfully slow these days. Just running "dzil foo" pulled in around 60k lines and took 1.7s! Of course, dzil is Moose-based.

While it is a good thing that Moose is getting more popular, it's a bit shameful to see that Ruby and Python scripts "get OO for free" while Moose scripts have to endure a 0.7s startup penalty. Mouse, Moo, and Role::Basic come to the rescue, but I wonder what Ruby/Python programmers would think (you have how many object systems?? Why can you people never agree on one thing, and why must you TIMTOWTDI everything?)

Disclaimer: The number of lines includes all blank lines, comments, POD, DATA sections, etc. from all files loaded in %INC; the actual SLOC is probably significantly less. Timing was done on a puny HP Mininote netbook (Atom N450 1.66GHz), which I've been stuck with for the past few weeks. With all due respect to the authors of all the modules mentioned: they all write fantastic, working code.
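To make the counting method concrete, here is a rough sketch of tallying files and raw lines from %INC. It only illustrates the idea described in the disclaimer and is not Devel::EndStats' actual implementation; the `inc_stats` function name is made up for this example.

```perl
use strict;
use warnings;

# Count files and raw lines (blanks, comments, POD and all) for
# everything currently recorded in %INC.
sub inc_stats {
    my ($files, $lines) = (0, 0);
    for my $path (values %INC) {
        # Skip entries that do not point at a plain file on disk
        # (undef after a failed require, or a hook reference).
        next unless defined $path && !ref $path && -f $path;
        open my $fh, '<', $path or next;
        $files++;
        while (defined(my $line = <$fh>)) {
            $lines++;
        }
        close $fh;
    }
    return ($files, $lines);
}

my ($files, $lines) = inc_stats();
print "# Total number of module files loaded: $files\n";
print "# Total number of modules lines loaded: $lines\n";
```

Run as is, this counts only strict, warnings, and whatever they pull in; add `use` lines for the modules of interest above the call to see their footprint.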

2 comments:

  1. Rumor has it that if Moo is good, it's going to become part of Moose.

  2. This is exactly why I forked Config::JFDI to Config::ZOMG. JFDI, even after switching from Moose to Mouse, still did a lot of weird stuff (mostly, recursively messing with the data structure) that hardly anyone ever needs. If I recall correctly, after switching to Moo and removing a couple small but unused features the test suite ran in 20% of the time.
