Sure enough, one of the main causes of the slow startup was preloading too many modules. Over the years I had been blindly sticking 'use' statements into our kitchen-sink $Proj::Utils module, which is used by almost all scripts in the project. Loading $Proj::Utils alone pulled in over 60k lines from around 150 files!
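Besides splitting a kitchen-sink module apart, one common way to keep a rarely needed dependency out of the startup path is to replace a compile-time 'use' with a runtime 'require' inside the function that needs it. The sketch below is only an illustration of that idea, not what $Proj::Utils actually does: wrap_paragraph is a hypothetical helper, and the core module Text::Wrap stands in for a genuinely heavy dependency.

```perl
use strict;
use warnings;

# Hypothetical helper standing in for a rarely used function in a utility
# module. Text::Wrap (core) plays the role of a heavy dependency.
sub wrap_paragraph {
    my ($text) = @_;
    require Text::Wrap;    # loaded on first call, not at compile time
    return Text::Wrap::wrap('', '', $text);
}

# %INC records every module file loaded so far, keyed by relative path.
print "Text::Wrap loaded before first call? ",
    (exists $INC{'Text/Wrap.pm'} ? "yes" : "no"), "\n";

print wrap_paragraph(
    "Deferring a require keeps rarely needed modules out of the startup path."
), "\n";

print "Text::Wrap loaded after first call? ",
    (exists $INC{'Text/Wrap.pm'} ? "yes" : "no"), "\n";
```

The trade-off is that import-time conveniences (exported functions, compile-time checks) are lost, so calls have to be fully qualified as above.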
After I split things up, it became clearer which modules are particularly heavy. This one stood out:
% time perl -MFile::ChangeNotify -e1
% perl -MDevel::EndStats -MFile::ChangeNotify -e1
# Total number of module files loaded: 129
# Total number of modules lines loaded: 46385
So almost 130 files and a total of 46k+ lines just from loading File::ChangeNotify alone. 130 files just for a filesystem monitoring routine! Who would've thought that a filesystem monitor needs so many lines of code? Compare with, say, a recent HTTP client:
% perl -MDevel::EndStats -MHTTP::Tiny -e1
# Total number of module files loaded: 18
# Total number of modules lines loaded: 6089
I quickly switched to Linux::Inotify2 and things are much better now (but I might have to revisit this since we want to give the new Debian/kFreeBSD a Squeeze).
As I suspected (since the module is also written by Dave Rolsky), File::ChangeNotify uses Moose, which is not particularly lightweight either:
% time perl -MMoose -e1
% perl -MDevel::EndStats -MMoose -e1
# Total number of module files loaded: 100
# Total number of modules lines loaded: 35760
% time perl -MMouse -e1
% perl -MDevel::EndStats -MMouse -e1
# Total number of module files loaded: 20
# Total number of modules lines loaded: 6675
Come to think of it, running Dist::Zilla is also quite painfully slow these days. Just running "dzil foo" pulled in around 60k lines and took 1.7s! Of course, dzil is Moose-based.
While it is a good thing that Moose is getting more popular, it's a bit shameful to see that Ruby and Python scripts "get OO for free" while Moose scripts have to endure a 0.7s startup penalty. Mouse, Moo, and Role::Basic come to the rescue, but I wonder what Ruby/Python programmers would think (you have how many object systems?? Why can you people never agree on one thing, and why must you TIMTOWTDI everything?)
Disclaimer: The number of lines includes all blanks/comments/POD/DATA/etc. from all files loaded in %INC; actual SLOC is probably significantly less. Timing was done on a puny HP Mininote netbook (Atom N450 1.66GHz), which I have been stuck with for the past few weeks. With all due respect to all authors of the modules mentioned. They all write fantastic, working code.
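For readers curious how such a count can be produced, here is a minimal sketch that walks %INC at exit and counts raw lines, as described in the disclaimer. It is not Devel::EndStats's actual implementation; count_loaded is a hypothetical name, and the output format simply mimics the lines quoted above.

```perl
use strict;
use warnings;

# Count files and raw lines (blanks, comments, POD and all) for everything
# currently recorded in %INC. Returns (number_of_files, number_of_lines).
sub count_loaded {
    my ($files, $lines) = (0, 0);
    for my $path (values %INC) {
        # Entries may be undef (failed require) or a reference (@INC hook).
        next unless defined $path && !ref $path && -f $path;
        open my $fh, '<', $path or next;
        $files++;
        while (defined(my $line = <$fh>)) {
            $lines++;
        }
        close $fh;
    }
    return ($files, $lines);
}

# Report at interpreter exit, after every module has had a chance to load.
END {
    my ($files, $lines) = count_loaded();
    print STDERR "# Total number of module files loaded: $files\n";
    print STDERR "# Total number of modules lines loaded: $lines\n";
}
```

Saved as a module and loaded with -M, something like this would report on whatever else the script pulls in; the totals include the lines of strict.pm and warnings.pm themselves.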