At the September Los Angeles Perl Mongers meetup, Tommy Stanton presented on an in-progress bit of code he’s working on, App::Git::HomeSync. (Presentation)
As you’d hope and expect from something headed CPAN-ward, he’s got lots of tests. As you might have guessed from the name, this module needs to run git quite a bit, with different command line arguments. Tommy’s approach to testing this is good – ship the module with some “fixtures” (a directory in a known state which gets unpacked into a temp directory) and then run the command line app in that directory.
There is another way to approach this, and I realized I didn’t have any “open sourceable” code which demonstrates this technique. I got a lot of the way through writing this before realizing I have blogged a more basic version of this idea before, but this is a new-and-improved take on things, with much deeper examples.
tl;dr
- Use IPC::Run to run your command line apps
- In your test code, intercept the calls to IPC::Run::run and return your own data, based on the command line used
- Store sample command line output inside the test file using Data::Section
I’ve stashed a full, functional example of this idea in my Acme::System repository on GitHub.
Code Walkthrough
Using IPC::Run in your code
First, there’s the module code itself. This module does 2 very stupid things.
- It returns the sum of all the PIDs (process IDs) on the system, and it calls ps to get this information.
- It returns the value of one of the columns from the vmstat tool.
The important thing is to use IPC::Run::run to actually run the command, instead of a blind system() call. Because it’s a module call, and has a very simple interface, it’s much easier to mock it (“override the functionality with ‘fake’ functionality”) for testing.
This is in lib/Acme/System.pm. Here’s the pidsum method (the vmstat_col method is basically the same, check out the full code if you’re curious):
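A minimal sketch of what a method like pidsum might look like, not the code from the repository. The `ps -e` flag, the empty `$in` scalar, and the exact parsing are assumptions made for illustration. Note that it calls `IPC::Run::run` by its full name rather than importing `run`, so a replacement installed later in a test is the one that gets called.

```perl
package Acme::System;
use strict;
use warnings;

use IPC::Run ();              # import nothing; always call IPC::Run::run by full name
use List::Util qw(sum);

sub new { my ($class) = @_; return bless {}, $class }

# Sum every PID reported by `ps -e` (the flag is an assumption).
sub pidsum {
    my ($self) = @_;
    my ( $in, $out, $err ) = ( '', '', '' );

    # Arguments: command, stdin, stdout, stderr. Only $out is used afterwards.
    IPC::Run::run( [ 'ps', '-e' ], \$in, \$out, \$err )
        or die "ps exited with status $?\n";

    # Keep lines that start with a number (this drops the header row),
    # then take the first whitespace-separated field of each line.
    my @pids = map { ( split ' ', $_ )[0] }
               grep { /^\s*\d+\s/ }
               split /\n/, $out;

    return sum( 0, @pids );
}

1;
```

Splitting the parsing into named steps makes it easier to see which part a canned-output test is actually exercising.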
So, as I said, stupid code. I only care about getting the output, so I pass in undef for the other values – IPC::Run::run wants you to supply them anyway.
However, that little sum/map/split thing sure looks fancy. How much would you be willing to wager that it’s bug-free? Probably not much. How would you even test code like that?
So, let’s cheat – run ps just the once, stash the results, and use those for testing from there on out.
Storing the command line output results
Check out the whole test file to see what’s going on overall.
There are a few bits of weirdness, for sure. Down at the end you’ll see:
This is the format that Data::Section wants. You just need to stash each command you run between the underscored square brackets, and follow it with some sample command line data.
The tricky thing is this – Data::Section was built to be used with Modules, not “plain jane .t files.”
So to make that work, you need to:
- Give your test a package name. I chose to just stick a Test:: in front of the module name.
- Give Data::Section an instance of that object to stick its methods into.
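Here is one way the package name, the Data::Section setup, and the data block could fit together in a single script. It is a sketch under assumptions: the `$data` variable simply holds the class name, the section name `ps -e` is a guess at the command line, and the sample rows are made up.

```perl
package Test::Acme::System;    # a test script would otherwise live in package main
use strict;
use warnings;

# -setup installs section_data() and related methods into the current package.
use Data::Section -setup;

# Hypothetical stand-in for the "$data" handle: the class name works as an invocant.
my $data = __PACKAGE__;

# section_data() returns a reference to the text stored under that section name.
my $canned = $data->section_data('ps -e')
    or die "no section named 'ps -e'\n";

print ${$canned};

__DATA__
__[ ps -e ]__
  PID TTY          TIME CMD
    1 ?        00:00:01 init
    2 ?        00:00:00 kthreadd
  417 ?        00:00:00 sshd
```

Running the script prints the three sample rows plus the header, which is a quick way to confirm that the section name in the brackets matches the key you look up.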
This is the only part of this whole technique that feels truly hacky. If anyone has suggestions for better ways to manage this, let me know.
Overriding IPC::Run::run in a test
There are lots of ways to override a module’s methods; I have had good experiences with Test::MockModule. It’s pretty easy.
Write a callback that emulates IPC::Run::run
Here’s the code that makes that evil hack above worthwhile. This is all you have to do to recover those canned program execution results, using the section_data method provided by Data::Section (and that hacky $data reference):
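The shape of such a callback can be sketched without any CPAN modules. In this illustration a plain hash (`%canned`, a hypothetical stand-in) plays the role of Data::Section; in a real test file the lookup would presumably go through `section_data` instead. The function name `fake_run` is also made up.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Data::Section: command line => canned stdout.
my %canned = (
    'ps -e' => "  PID TTY          TIME CMD\n    1 ?        00:00:01 init\n",
);

# Same calling convention as IPC::Run::run: (\@cmd, \$stdin, \$stdout, \$stderr).
sub fake_run {
    my ( $cmd, $stdin, $stdout, $stderr ) = @_;

    # The command line, joined with spaces, is the lookup key.
    my $key = join ' ', @{$cmd};
    die "No canned output for '$key'\n" unless exists $canned{$key};

    ${$stdout} = $canned{$key};    # write into the caller's output scalar
    return 1;                      # true, like run() when the command succeeds
}

fake_run( [ 'ps', '-e' ], \undef, \my $out );
print $out;
```

Dying on an unknown command line is a deliberate choice here: if the module starts calling a command you have no sample for, the test fails loudly instead of silently parsing an empty string.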
“Mock” that in place
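A minimal sketch of the mocking step with Test::MockModule, assuming the code under test calls `IPC::Run::run` by its full name. The replacement sub and its output string are invented for the example.

```perl
use strict;
use warnings;

use IPC::Run ();
use Test::MockModule;

# Keep the mock object in scope: the real run() is restored when it is destroyed.
my $ipc = Test::MockModule->new('IPC::Run');

$ipc->mock(
    run => sub {
        my ( $cmd, $stdin, $stdout ) = @_;
        ${$stdout} = "canned output for: @{$cmd}\n";
        return 1;
    }
);

# No process is started here; the replacement fills in $out.
IPC::Run::run( [ 'ps', '-e' ], \undef, \my $out );
print $out;    # canned output for: ps -e
```

One thing to watch: a module that does `use IPC::Run qw(run)` gets its own copy of the function at import time, so a mock installed afterwards may not be the one it calls.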
Actually doing the testing
At this point, test away!
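Putting the pieces together, a complete test file might look roughly like the sketch below. To keep it runnable on its own, it carries an inline stand-in for Acme::System; the `ps -e` command line, the parsing, and the sample rows are all assumptions rather than the contents of the real repository.

```perl
use strict;
use warnings;

# Inline stand-in for lib/Acme/System.pm so the sketch runs by itself.
package Acme::System {
    use IPC::Run ();
    use List::Util qw(sum);

    sub new { my ($class) = @_; return bless {}, $class }

    sub pidsum {
        my ($self) = @_;
        IPC::Run::run( [ 'ps', '-e' ], \undef, \my $out )
            or die "ps exited with status $?\n";
        my @pids = map { ( split ' ', $_ )[0] }
                   grep { /^\s*\d+\s/ }
                   split /\n/, $out;
        return sum( 0, @pids );
    }
}

package Test::Acme::System;
use Test::More;
use Test::MockModule;
use Data::Section -setup;

# Replace IPC::Run::run with a lookup into the __DATA__ sections below.
my $ipc = Test::MockModule->new('IPC::Run');
$ipc->mock(
    run => sub {
        my ( $cmd, $stdin, $stdout ) = @_;
        my $key    = join ' ', @{$cmd};
        my $canned = __PACKAGE__->section_data($key)
            or die "No canned output for '$key'\n";
        ${$stdout} = ${$canned};
        return 1;
    }
);

# The expected value is worked out by hand from the sample rows.
is( Acme::System->new->pidsum, 1 + 2 + 417, 'pidsum adds up the canned PIDs' );

done_testing;

__DATA__
__[ ps -e ]__
  PID TTY          TIME CMD
    1 ?        00:00:01 init
    2 ?        00:00:00 kthreadd
  417 ?        00:00:00 sshd
```

Writing the expectation as `1 + 2 + 417` rather than `420` keeps the link between the sample data and the assertion visible to the next reader.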
So I can (independently) calculate what the results should have been, given the arbitrary data I’ve saved in the __DATA__ block, and test based on those values. Awesome.
Trust me, the normal code will still actually call the system
Just for fun, I threw in a script that actually uses this module to get live data:
And sure enough, if you run it, the data is getting updated live. IPC::Run really is working on the live system.
Wrapping it all up
Other than the hackish trick to get Data::Section working in what’s not really a module, this code is really clean, readable, and easy to maintain. It works well for pretty much any module you might care to use instead of IPC::Run – there are lots of options, but as long as you use one of the module-ized ones, you can hook the module name and go from there.
Especially if you write lots of sysadmin tools, and especially if they have costs or risks associated with running them (fsck? rm -rf?) this technique can be a lifesaver. It’s only as good as the inputs you give it, though. The first time I worked out this workflow, I made the mistake of grabbing sample output that didn’t match real life: the live output ended up with more whitespace than I’d accounted for, because the counters had gotten bigger between when I snagged my “output to test with” and when the code was running on “live output.”
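One way to make column parsing less sensitive to that kind of drift is to split on runs of whitespace rather than on single spaces. The rows below are made up, and the helper name `field` is hypothetical; this is an illustration of the idea, not the parsing used in Acme::System.

```perl
use strict;
use warnings;

# Made-up rows: same columns, different padding once the counters grow.
my $small = " 1  0      0 812340  10240 204800";
my $big   = " 1  0      0 8123400 102400 2048000";

# split ' ' skips leading whitespace and treats any run of whitespace
# as one separator, so the column index stays stable.
sub field {
    my ( $line, $index ) = @_;
    return ( split ' ', $line )[$index];
}

print field( $small, 3 ), "\n";    # 812340
print field( $big,   3 ), "\n";    # 8123400

# Splitting on a single literal space produces empty fields wherever
# the padding is wider than one character, which shifts the indexes.
my @naive = split / /, $small;
print scalar(@naive), " fields from the single-space split\n";
```

Keeping a couple of sample rows with different widths in the test data is a cheap way to catch this class of bug before it shows up on a live system.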
I hope it helps, and if you have any suggestions about how to improve the technique, let me know (or send a pull request!)