Perl

Packaging CPAN Modules for Debian

I'm giving a talk on CPAN at this year's YAPC in Pittsburgh. One of the things I plan to mention is the cpan2dist program from CPANPLUS. I use cpan2dist (with CPANPLUS::Dist::Deb) to build Debian packages for our internal use.

I had a conversation on IRC this morning with someone who asked, roughly, "What's the difference between cpan2dist and dh-make-perl?" Until recently I don't know that I would have been able to answer this, and I think the answer reveals a distinction that is important to keep in mind whenever you're trying to bridge between CPAN and your operating system's package manager.

Specifically, I think dh-make-perl is for maintaining a Debian package that happens to be a Perl module, and cpan2dist is for wrapping modules from CPAN in a package manager-friendly way. Let me expand on both of those:

dh-make-perl was written by Debian people, and seems aimed towards people who would like to maintain a module in full Debian style, filling in all the metadata like a changelog and description and so on.

  • It leaves you a source tree with a debian/ directory, so it's easy to take the autogenerated files and go from there on your own.
  • It lets you easily tweak an individual package's Debian package information.
  • It does not recurse into dependencies at all; it just notes them in the generated debian/control file, though I think by default it will warn about and ignore any dependencies that don't have a Debian package.

If there are one or two modules you need to install that aren't in Debian -- maybe because they're internal code -- and you don't mind maintaining the Debian metadata, you probably want dh-make-perl (or to submit a request for packaging bug to the Debian Perl Group).

cpan2dist was written by Perl people, and CPANPLUS::Dist::Deb really only pays lip service to Debian's packaging conventions.

  • It assumes that modules' licenses permit redistribution, and in many places fills in only the bare minimum metadata needed to install the package.
  • It does recursively build debs of dependencies, and gives you some control over how to build all of those packages -- a common prefix to use in the package name, build tool options, etc.
  • It doesn't know anything about some of the weird historical decisions that have gone into Debian's perl (e.g. libcgi-perl is not CGI.pm but some distribution named cvswebedit), and does not (currently) give you quite enough control to make the packages it generates fit in with Debian's perl in every case.

I treat cpan2dist as a way of making it easier on my sysadmin to manage Perl modules that I would otherwise just install with the CPAN shell -- installing into /usr/local instead of /usr, for example -- and I don't try to match up with any of those historical edge cases. If you frequently need newer versions of modules than Debian provides, or if you want to install a lot of internal Perl code as debs (especially if you have an internal CPAN mirror that your code is injected into, e.g. with CPAN::Mini::Inject), or if you don't really feel like interacting with Debian's perl much at all, you probably want cpan2dist.
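To make the difference in dependency handling concrete, here is a small core-Perl sketch. It is not code from either tool: the %requires map, the distribution names, and the 'local-' prefix are all made up for illustration. The only real-world detail is Debian's usual naming convention for Perl modules (Foo-Bar becomes libfoo-bar-perl). One function lists only direct dependencies, which is roughly what ends up noted in debian/control; the other walks the whole tree so that every dependency is built first.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical dependency data, standing in for real CPAN metadata.
my %requires = (
    'My-App'    => [ 'Foo-Bar', 'Baz' ],
    'Foo-Bar'   => [ 'Baz', 'Quux-Util' ],
    'Baz'       => [],
    'Quux-Util' => [],
);

# Debian's usual convention: Foo-Bar -> libfoo-bar-perl.
# The optional prefix illustrates the "common prefix" idea; its value is an assumption.
sub deb_name {
    my ($dist, $prefix) = @_;
    $prefix = '' unless defined $prefix;
    return $prefix . 'lib' . lc($dist) . '-perl';
}

# Note-only style: name the direct dependencies of one distribution.
sub direct_deps {
    my ($dist) = @_;
    return map { deb_name($_) } @{ $requires{$dist} || [] };
}

# Recursive style: dependencies first, each distribution exactly once.
sub build_order {
    my ($dist, $seen, $order) = @_;
    $seen  ||= {};
    $order ||= [];
    return $order if $seen->{$dist}++;    # already handled (also stops cycles)
    build_order($_, $seen, $order) for @{ $requires{$dist} || [] };
    push @$order, $dist;
    return $order;
}

print "Depends: ", join(', ', direct_deps('My-App')), "\n";
print "build: ", deb_name($_, 'local-'), "\n" for @{ build_order('My-App') };
```

With a prefix like this, locally built packages can't collide with Debian's own libfoo-bar-perl, which is one way to sidestep the historical naming oddities mentioned above.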

(Disclaimer: I do not use dh-make-perl with any regularity, so my assessment of it may be insufficiently deep. Please correct me if I've said anything wrong.)

Long term, I'd like to be distributing my own perl package, at which point I'd be using apt-get and dpkg as a convenient distribution and management mechanism. cpan2dist is a much better fit for that goal -- I don't want to have to spend the time munging each Perl module's Debian package's control files for use with my own perl.

If you aren't using Debian, there may be something like dh-make-perl written specifically for your packaging system (I'm aware of g-cpan and g-cpanp for Gentoo, and I'd be amazed if RPM didn't have something too). cpan2dist should work the same everywhere; it has backends for Arch, Gentoo, Deb, RPM for Fedora and Mandriva, and finally PAR, which is itself cross-platform.

In every case there will probably be this same tradeoff between tighter integration with the native package manager's assumptions and ease of use for installing arbitrary CPAN modules, so be prepared to choose one or the other.

Tagged as: CPAN, Debian, Perl, Ubuntu

Catalyst-Action-REST update

A while ago, I talked about my plans for Catalyst-Action-REST, and since then I've made some progress on refactoring to use roles instead of classes. Unfortunately, I've been stalled for the last few weeks, and I could use a nudge to get back on track.

My brain is stuck, basically, on naming issues. This feels a little silly and trivial, but it's something I've run aground on twice now, so I need to fix it somehow. The biggest problem is "serialize" -- Catalyst-Action-REST uses it both as a generic name for "things that either serialize or deserialize" and specifically for "things that serialize" (that is, "convert data to formatted text"). This means I end up wanting to use it to describe a bunch of things:

  • Serialization formats, like YAML or JSON
  • Common code shared by all serialization formats
  • Actions implementing specific serialization formats
  • The role for actions that do either serialization or deserialization
  • The specific action role that does serialization

You can see the results of this sort of naming confusion in the current code on github: Catalyst::ActionRole::SerializeFormat is a pretty goofy namespace.

In my previous post I threatened to split serialization out into its own distribution. I'm not sure how this would overlap with something like Data::Serializer, which is currently used by some of C-A-REST's serialization formats. Maybe it'd be simpler to just punt to Data::Serializer for all the serialization and deserialization, but even then I'd have to keep some class names around for backwards compatibility.

Thoughts?

Tagged as: catalyst, Perl, REST

Dieter and Schwern at YAPC (Yet Another Perl Conference)

Two of OpenSourcery's Perl developers will travel across the country for the 2009 YAPC (Yet Another Perl Conference), the premier conference for Perl developers. It's a quality, affordable conference with roots in the Perl Mongers user groups across the world.

Dieter will give two 50-minute presentations and one 20-minute presentation during the course of YAPC. The first is called Code Reuse with Moose, which covers how to create your own reusable extensions in Moose, with roles, type constraints, and Moose::Exporter. The second is called CPAN, A Big Enough Lever to Install the World. Dieter (aka confound) will share tricks on how Perl developers can convince their operations staff to trust CPAN, using techniques such as:

  • CPAN Distroprefs
  • cpan2dist, PAR, local::lib
  • CPAN::Mini
  • and more!

Finally, Dieter will hold forth for twenty minutes on the topic of Dist::Zilla - Automating quality since 2008.

Schwern, who's also speaking at OSCON and the Open Source Bridge, will deliver two presentations. First, he'll give a talk that literally only Schwern could give: "Trapped in a Room with Schwern." Details on that talk are sketchy at best. Second, he'll lead a session on "Better Programming through Testing," which will cover everything from basic testing to specific best practices for version control, debugging, and so on.

Tagged as: conferences, open source events, Perl

local::lib followup: now with easy bootstrapping

Last week I posted about local::lib, some of the work I was doing on it, and what I wanted to get done.

Well, it's done (for now). As of 1.004001, local::lib comes with an example script that completely bootstraps local::lib into an arbitrary directory. It installs cleanly on a fresh install of Perl 5.8.8, just like this:

  wget -O- http://weftsoar.net/~hdp/scripted_install.pl \
    | TARGET=./local perl

(I copied the script to my server so the URL is shorter -- it's the same as what's on CPAN, though.)

Future enhancements:

  • A shorter url (t0m suggested http://install.local-lib.pl, which would be pretty cute)
  • Better shell/environment integration -- right now, you have to Just Know how to get at the directory that local::lib created for you
  • Build tool integration, e.g. make locallib, make localinstalldeps (that's a bit of a mouthful)

Anyone have other suggestions for useful features?

Tagged as: CPAN, Perl

working on local::lib

local::lib is really, really fantastic. It's a simple module: it bundles up all of the settings and environment variables needed to install Perl modules into a private directory. By default it uses ~/perl5, but you can give it some other directory for an argument, so it's perfect for bundling dependencies with your application or installing things in a scratch directory for testing. (It also depends on the correct versions of toolchain modules needed to make things behave consistently, which is a sweet bonus.)
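As a rough illustration of what "settings and environment variables" means here, this core-Perl sketch computes values along the lines of what recent local::lib releases arrange for a target directory. It is not local::lib's actual code: the function name local_lib_env is made up for this example, architecture-specific library directories are ignored, and the exact set of variables has varied between releases.

```perl
use strict;
use warnings;
use Config;
use File::Spec;

# Approximate the environment needed to install into, and load from, $target.
# $env defaults to the real environment; pass a hashref to experiment safely.
sub local_lib_env {
    my ($target, $env) = @_;
    $env ||= \%ENV;
    my $base = File::Spec->rel2abs($target);
    my $sep  = $Config{path_sep};

    # Put a new entry in front of an existing path-style variable, if any.
    my $prepend = sub {
        my ($new, $old) = @_;
        return join $sep, $new, grep { defined($_) && length($_) } $old;
    };

    return (
        PERL5LIB    => $prepend->(File::Spec->catdir($base, 'lib', 'perl5'), $env->{PERL5LIB}),
        PATH        => $prepend->(File::Spec->catdir($base, 'bin'), $env->{PATH}),
        PERL_MM_OPT => "INSTALL_BASE=$base",      # read by ExtUtils::MakeMaker
        PERL_MB_OPT => "--install_base $base",    # read by Module::Build
    );
}

my %env = local_lib_env('./local');
print qq{export $_="$env{$_}"\n} for sort keys %env;
```

The real module does more than this (including pulling in suitable toolchain versions), so treat the sketch as a picture of the idea rather than a replacement.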

The Makefile.PL has a bootstrap mode, which means that you can configure and install local::lib into one of these private directories without having it pre-installed at all and without touching your @INC.

This is really great, but I want one step better; I want to be able to bootstrap local::lib into a target directory without having to find the latest version, unpack it, etc. I've been working on a script to do this, something like

wget -O- $some_url/local-lib | TARGET=./local perl

(possibly with some sort of integration with David Golden's -Mylib), but there are a few tangles I still have to work out.



I did make some progress yesterday. I found a bug in Module::AutoInstall that was breaking the normal installation of local::lib under some circumstances, and I fixed the bootstrapping code in Makefile.PL so that the latest version should install cleanly on Perl 5.8.8 (still the default on many operating systems). Hopefully I'll figure the rest of my problems out this weekend, and local::lib will become even easier to use.

Tagged as: CPAN, Perl

Schwern speaking at OSCON, Open Source Bridge

The spring conference season is fully upon us, and various OpenSourcerers have hit the circuit to spread the good word on open source technologies. Michael Schwern has been particularly active.

Schwern begins close to home, presenting "Is the Web Down: A Practical Look at How the Web Works" at the Open Source Bridge conference. In this session, Schwern and Joshua Keroes will lift the veil that keeps many users from understanding "how the Web and its plumbing works." This talk is part of the Chemistry track at Open Source Bridge, which is dedicated to "Understanding how our systems work, in order to improve and extend."

On Thursday, July 23rd, Schwern will share "How to Lie Like a Geek" at OSCON. In short, the talk will cover how geeks, while relentlessly pursuing the Truth, sometimes find themselves accidentally perpetrating lies: Lies by Omission, Lies by Precision, Lies by Irrelevancy, and, perhaps most painfully, Lies rooted in the word "should." As in, "The user /should/ have realized."

Stay tuned to learn what Schwern will deliver at the upcoming YAPC (Yet Another Perl Conference).

Tagged as: conferences, open source events, Perl, portland events, technology events

Ticket tracking with SD

I have a few longer posts that I want to write, but none of them are coming together in my head right now, so I'll put together a quick note about Prophet and SD, by Jesse Vincent and Chia-liang Kao. Prophet is a replicated, peer-to-peer database (I'm cutting its sales pitch short, see the website for more), and SD is a bug tracking system written on top of it.

I had looked at SD a year or so ago, and then forgot about it until I read a post by gugod describing how to use SD to interact with rt.cpan.org, through SD's helper script git-sd. This caught my interest, of course, because I use rt.cpan.org for all the CPAN distributions I maintain.

What makes SD interesting to me, at least more so than other local-storage ticketing systems like ticgit, is that it has adaptors ("foreign replica" classes, in Prophet terms) for several other bug tracking systems, including (as mentioned above) RT and Trac, both of which we use at work.

I like the idea of having tickets available right there with my repository, and I especially love not having to switch contexts, fiddle with my web browser, etc., every time I finish one bug and I'm ready to move on to the next, but I don't have the luxury of telling all my coworkers that we're switching to some new ticketing system. The fact that I also get free disconnected operation is just icing on the cake. SD feels like git in this regard -- distributed, customizable, and able to play well with existing systems instead of needing a complete change-over.

Since reading gugod's post and getting excited about SD, I've tweaked git-sd a little, and pushed my patches back upstream. In particular, git-sd now defaults to using GIT_DIR/.git/sd for its replica, instead of requiring that you manually configure one, which streamlines the process of setting it up. I have plans for a few more things that I want to do before I use SD heavily:

  • make it easy to store username and password per-replica
  • give replicas a short name to refer to for push and pull, like git remote

SD and Prophet are relatively young, but so far almost all the problems I've run into have been minor interface issues rather than big showstopping bugs. Here's the only thing I'm aware of that might count as a showstopper:

<obra> soooo close to trac roundtrip too
<confound> which would be great for me
<confound> since we use trac at work
<obra> I have one crippling bug.
<obra> I just need a 4 hour block of time
<confound> what is it?
<obra> when you create a ticket in sd, push to trac and pull from trac, it doesn't properly map the uuid
<obra> oh.
<obra> I know what it is now
<obra> (explaining problems)++

And now that Jesse has achieved enlightenment through explaining the problem to me, I expect it'll be fixed quickly. So glad I could help! :)

Tagged as: Perl, Prophet, RT, SD, Trac

Bugzilla on Catalyst

One of the talks I submitted for this year's YAPC::NA is on converting legacy CGI and mod_perl applications to run on Catalyst instead, focusing on ways to get Catalyst in between existing code and the webserver so that incremental refactoring is a possibility, rather than "maintain as-is" or "rewrite entirely". I've done this successfully for elementalClinic, and I wanted to find a few more applications, maybe with a little more complexity (the nice way of saying "insanity", in this context), to try my hand at.

Unfortunately, I failed. Instead of a crazy application that would test the limits of my ability to get Catalyst into those hard-to-reach places, I found Bugzilla. (Yes, Bugzilla has some very strange internals, as someone who has hacked on it has told me, but it's not a mixture of three separate templating engines/application systems spread across your webserver's document root.) The end result was anticlimactic -- I got Bugzilla running on Catalyst's standalone http server after only a few hours and some patches to Catalyst::Controller::CGIBin and HTTP::Request::AsCGI. I ended up spending as much time on making it easy for other people to duplicate my results as I did on the actual Bugzilla-specific code, which is, in its entirety:

around wrap_perl_cgi => sub {
  my $next = shift;
  my $code = shift->$next(@_);
  sub { $Bugzilla::_request_cache = {}; $code->(); Bugzilla::_cleanup() };
};

This is just a quick hack, and for normal users of Bugzilla it's probably not a big deal. If the Bugzilla maintainers wanted to start converting to Catalyst, though, it would be a great place to start; for example, it'd be trivial now to set up some Catalyst actions to clean up Bugzilla's urls from e.g. /show_bug.cgi?id=17 to /show_bug/17. That could start out as simple as this:

sub show_bug :Local :Args(1) {
  my ($self, $c, $id) = @_;
  $c->request->params->{id} = $id;
  $c->go('/CGI_show_bug_cgi');
}

Later, more logic could be brought into Catalyst controllers, and eventually the old show_bug.cgi path could exist only for backwards compatibility. (I don't know that the Bugzilla team cares about this at all, but it is the kind of thing that people often would like to do with their own applications.)

Not all of the Catalyst code is perfect. I did have to spend a few hours fixing bugs in the CGI-wrapping controller code, and there's at least one more I know of related to handling file uploads. Still, it's a great place to start, and each bug fixed in the library is one you won't have to spend time on when you want to modernize your own applications.

Tagged as: bugzilla, catalyst, hack, Perl

elementalClinic installer rebuilt

Until today, elementalClinic (emC) could be installed in one of two ways:

  1. from a Debian package, generated by a script hidden somewhere in svn
  2. by hand, after reading the INSTALL instructions, with cp

This is as inconvenient as it sounds.

Now, though, emC uses a boring, standard Perl distribution installer written with Module::Build. This is a huge win for several reasons:

  • less custom installer code to maintain, and what's left is Perl instead of make/shell
  • improved handling of Perl module dependencies -- I had hand-rolled something to work with the old Makefile, but the standard Perl tools are better
  • a wider selection of tools in general -- anything anyone's ever written to analyze, transform, or package a Perl distribution now works with emC

It was a real joy to go through the install instructions and delete huge swaths of text instructing people to find those dependencies by hand, not to mention the fact that the dependency lists were often somewhat stale (as such documentation tends to be).

Tagged as: elementalClinic, Perl

Meta-Moose wrapup

A few weeks ago I gave my talk to the Portland Perl Mongers on Moose. Slides are up, if anyone who couldn't make it wants to read them.

The talk went fairly well, especially considering that I haven't presented anything since I talked about POE for the Philadelphia Perl Mongers; there were some projector issues, which were completely my fault for assuming that the projector would be the same resolution as my laptop screen, but I managed to keep some decent momentum going anyway.

The audience mostly seemed to understand the subject matter, which surprised me a little. I expected more questions than I got, so either I was very clear or people were so baffled that they couldn't even formulate coherent queries. (Or everyone was really eager to get to the Lucky Lab for beer.)

Next I'll rip these slides apart and submit one or two talks for YAPC (submission deadline this Friday!); I know someone else is planning to talk about the MOP, so I'll do a more in-depth talk about type constraints or roles, either of which should provide me with plenty of material.

Tagged as: Moose, Perl, talk, yapc

ExodusCC, an IM/Chat communication center application.

I recently started work on a new chat/IM application in my spare time, and I have now decided to work on it as my bench project here at OpenSourcery. I know there are a lot of IM and IRC clients out there, and many are quite good. However, I find that none meet my needs exactly.

Desired features of my ideal chat/im application

  • Ability to leave it logged in 24/7 on one system and access it remotely from any computer, even multiple computers at once - No client I could find does this.
  • Ability to connect to AIM, Yahoo, Google talk, ICQ, Jabber, and IRC all in one client - Pidgin does this.
  • Usable without X - IRSSI and a few other IRC only clients do this.
  • Easily hackable in perl
  • Ability to connect to multiple IRC networks/channels

I decided that to have all these features I would need to write my own. At first I did not expect I would get very far. Obviously this program is not a small undertaking. However, I knew at the very least it would be a good learning experience. If nothing else I figured it would help me feel content with my other IM/Chat clients.

I decided if I was going to do something big I might as well go all out. As such I am taking this as an opportunity to learn several new technologies: Moose, POE, Fey::ORM, and possibly Catalyst if I make a web front-end. These are relatively new and amazing perl packages.

As of today I have completed stage 1, which is a working IRC bot that can connect to multiple networks and channels, and logs several key events to a database. This was accomplished using Fey::ORM, which builds most of the object code I need at run-time from the database schema, and POE::Component::IRC, which is the best IRC client module currently available for perl.

Next week I intend to work on stage 2, which will bring in basic client-server connectivity so that a client app on any machine can connect to the IRC lurker I have now.

Tagged as: AIM, Chat, IM, IRC, Perl

Catalyst and elementalClinic

Part of my job involves development on elementalClinic (emC). emC is a few years old, and has undergone a lot of architectural change; it started as a collection of CGI scripts, grew into a collection of scripts, templates, and modules, and made the transition to a mod_perl application in late 2006.

Now, mod_perl is fine for writing Apache modules; but I have a web application, and I don't need most of the power that mod_perl exposes. Being tied so tightly to Apache has caused some problems and inconveniences along the way. For example, any tool to deal with Perl module dependencies becomes much more complicated when one of them is mod_perl, and running tests through Apache has made certain kinds of test failures much more difficult to debug.

We'd talked internally about moving emC to Catalyst at some unspecified point in the future. Catalyst is mature, well-tested, and very powerful, and while I don't love everything about it, there are a lot of people working on it (besides, I have a commit bit, so if I don't like how things work it's partly my own fault anyway). I've been using Catalyst for another project recently after being away from it for a while, and decided that maybe it wouldn't take as much effort to start moving emC over as I thought.

So, on Thursday, I told Randall that I was going to take a few hours from the next couple of weeks, and see how far I got in just replacing the dispatch mechanism from emC -- converting the controllers and the session and so on is something we can do incrementally, but getting Catalyst in "underneath" everything is a big first step.

It turns out that the answer to "how far can I get" is "all tests passing"; with just over 5 hours' work, Catalyst is now sitting between the outside world and emC's controllers. The immediate benefit is that emC can run on any of Catalyst's engines: mod_perl, standalone, prefork, FastCGI, etc. The long-term benefits are no more maintenance of hand-rolled dispatching code and the ability to incrementally replace even more hand-rolled code with modules from CPAN.

A huge factor in this is that Matt Trout had already written Catalyst::Controller::WrapCGI, a controller for running existing CGI scripts seamlessly inside a Catalyst application. A shout out to Rafael Kitover, too, for helping me find and fix a problem with it under mod_perl, and then releasing a new version with my patch included.

My first target for Catalyzing emC's code is probably going to be the functional tests, which use WWW::Mechanize and currently each have to spin up their own Apache process -- Catalyst's test module can fake up HTTP requests in-process, which is really nice for speed during the test-edit-test cycle, and it already has a WWW::Mechanize-based wrapper, so I shouldn't need to change too much code to make use of it.

This morning I merged the Catalyst branch into emC trunk. I'm excited about how easy it was to slip Catalyst in underneath the existing emC code, and I'm looking forward to the changes it'll make possible in the future.

Tagged as: catalyst, elementalClinic, Moose, Perl

Plans for Catalyst-Action-REST

A while ago, I told Jay Shirley (in a moment of weakness) that I'd co-maintain the Catalyst-Action-REST distribution. Over the past week, I've closed several bugs, and I feel comfortable enough with the codebase to start making more sweeping changes.

C-A-REST is certainly useful in its current state, but it has its warts. At the same time, I've been thinking about the fact that Catalyst 5.80 (svn) is built on Moose, and about how Moose could possibly help with some of C-A-REST's rough spots.

Here's a list of things I'd like to improve in the near future, in rough order of how much I've thought them through:

  • Using roles instead of classes for every type of object (Action, Request, and Controller) will make it much easier for C-A-REST's modules to play nicely with other Catalyst extensions. The old classes will stay around for backcompat, but new code will be able to use with 'Catalyst::Controller::Role::REST' instead. (See also Catalyst::Controller::ActionRole, thanks to Florian Ragwitz.)
  • Serialize/Deserialize plugins are currently only loaded from under Catalyst::Action:: -- there's no way to look under MyApp::Action:: or any other namespace. There's also no good way to pass configuration to plugins (a common request). Finally, they all share a lot of code, like looking up data in the stash or reading the body, that needs to be refactored.
  • Serialize/Deserialize are symmetrical -- that is, they're forced to use the same Content-Type. This is OK for basic use, but it puts a pretty hard barrier on extending the basic de/serialization mechanism for application-specific content types, and I'd like to make it easier to specify that certain content types are only used in one direction or another.
  • REST status helpers have clunky syntax and don't cover enough HTTP status codes. Calling $self->method($c, ...) makes me sad; we've talked about using a REST view and/or a REST response role instead. Whatever the solution, the REST component of Catalyst::Controller::REST needs to feel less bolted-on than it currently does.
  • REST and Serialize/Deserialize are unnecessarily conflated. I'm not entirely sure they should even be in the same distribution, but they definitely need to be easier to use separately. Switching to roles may be enough to make me happy, here.
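To show the shape of two of these ideas (searching more than one namespace for plugins, and allowing a content type in only one direction), here is a rough core-Perl sketch. It is not C-A-REST code and not a committed design: the %formats table, the plugin_candidates function, the MyApp::Action namespace, and the 'Form' plugin are hypothetical, and nothing is actually loaded; the function only returns the class names that would be tried, in order.

```perl
use strict;
use warnings;

# Hypothetical registry: each content type names a plugin per direction.
# A missing direction means the type is not supported that way.
my %formats = (
    'application/json' => { serialize => 'JSON', deserialize => 'JSON' },
    'text/x-yaml'      => { serialize => 'YAML', deserialize => 'YAML' },
    'text/html'        => { serialize => 'YAML::HTML' },                  # output only
    'application/x-www-form-urlencoded' => { deserialize => 'Form' },     # input only
);

# Namespaces to search, application-specific first (an assumption).
my @namespaces = ('MyApp::Action', 'Catalyst::Action');

# Return the class names to try, in order, or an empty list if the
# content type is unknown or not allowed in that direction.
sub plugin_candidates {
    my ($direction, $content_type) = @_;
    my $entry = $formats{$content_type} or return;
    my $short = $entry->{$direction}    or return;
    return map { join '::', $_, ucfirst($direction), $short } @namespaces;
}

for my $try ([ serialize => 'text/html' ], [ deserialize => 'text/html' ]) {
    my @classes = plugin_candidates(@$try);
    print "@$try: ", (@classes ? join(', ', @classes) : 'not supported'), "\n";
}
```

Keeping the two directions as separate keys is what lets text/html be output-only here, instead of forcing every content type to round-trip.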

Check out Catalyst-Action-REST on github, and let me know what you think.

Tagged as: catalyst, Moose, Perl

"Meta-Moose" at the Portland Perl Mongers

I'll be speaking about Moose at the next Portland Perl Mongers meeting (April 8, 6:53pm, at Free Geek).

Moose is a postmodern object system for Perl 5.

Moose's recent rise in popularity has led to a surge of declarative class-building and accessor-generating modules, but the real power of Moose comes from its metaclass fundamentals, not from the syntactic sugar of has(). Using Moose as a foundation makes it easier for your code to grow and scale.

I'll cover some of the concepts in Moose that the MOP (Meta-Object Protocol) makes possible, especially roles and type constraints. If we have time, I'll go through a simple Moose extension, focusing on the mechanisms Moose provides to help your code play nicely with others'.

If the first sentence of this description was news to you, you should at least read the SYNOPSIS of Moose, and if you can get through Moose::Manual and Moose::Manual::Concepts, so much the better. I'll expect a lot of questions, but I hope to move past "what is an object" pretty quickly.

By the end of the night I hope you'll have a better understanding of the depth of what Moose provides, and why has() is only the tip of the iceberg. I don't expect that everyone will immediately understand every concept provided – my goal is to impress you so much with Moose's awesomeness that you're willing to follow up later on the documentation pointers that I throw out.

3:o Moose!

Tagged as: Moose, pdx.pm, Perl

Use your digital camera as a data logger

Now that the days are getting shorter, bike commuting demands some lights - and the brighter the better! To this end, I recently acquired a powerful LED flashlight that mounts on my handlebars. The light contains the impressive Seoul Semiconductor P7, with an advertised output of 900 lumens, but it comes at the cost of battery life. I wanted to find out precisely how long the battery would last, but I didn't feel like sitting around watching it. I also thought it would be cool to see how the intensity changes over time.

Lacking a lumen meter, or any other sophisticated test equipment, I came up with a plan: My aging Canon camera, ImageMagick, and a little Perl. The camera came with some remote capture software, so I configured it to take a photo every 30 seconds, while the light was aimed at a white wall. Note that it's important to manually set the camera's shutter and aperture so that you get a consistent exposure. I also locked the ISO speed, white balance, and focus - no auto-anything to interfere with my measurements. Finally, I set the resolution to the smallest available setting, to save processing time later.

The next morning, I had several hundred JPEG files on my desktop. A quick perl script, and I had a CSV file ready for plotting.

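#!/usr/bin/perl
use strict;
use warnings;

# Ask ImageMagick for verbose statistics on every JPEG, converted to grayscale.
open my $magick, '-|', 'convert *JPG -colorspace gray -verbose info:'
  or die "can't run convert: $!";
open my $stats, '>', 'stats.csv' or die "can't write stats.csv: $!";

while (<$magick>) {
  if (/Mean:/) {
    # keep the value inside the parentheses on the Mean: line
    my @fields = split /[()]/;
    print {$stats} "$fields[1],";
  } elsif (/Exif:DateTime:/) {
    my @fields = split " ";
    $fields[1] =~ s/:/-/g; # convert to ISO 8601
    print {$stats} "$fields[1] $fields[2]\n";
  }
}

close $stats or die "can't finish writing stats.csv: $!";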
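If you'd rather get a number than read it off a plot, something like the following sketch could estimate the runtime from stats.csv. It assumes each row looks like "mean,YYYY-MM-DD HH:MM:SS" (the format the script above writes), and it defines "battery is done" as brightness falling below half of the first reading; that 50% threshold and the function names are my own arbitrary choices, not part of the original measurement.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Turn "mean,YYYY-MM-DD HH:MM:SS" lines into [epoch, mean] pairs, oldest first.
# Timestamps are treated as UTC, since only the differences matter here.
sub parse_rows {
    my @rows;
    for my $line (@_) {
        my ($mean, $y, $mo, $d, $h, $mi, $s) =
            $line =~ /^([\d.]+),(\d{4})-(\d\d)-(\d\d) (\d\d):(\d\d):(\d\d)/
            or next;    # skip anything that doesn't look like a data row
        push @rows, [ timegm($s, $mi, $h, $d, $mo - 1, $y), $mean ];
    }
    return sort { $a->[0] <=> $b->[0] } @rows;
}

# Seconds from the first sample until brightness first drops below
# $fraction of the initial reading; returns nothing if it never does.
sub runtime_until {
    my ($fraction, @rows) = @_;
    return unless @rows;
    my ($t0, $m0) = @{ $rows[0] };
    for my $row (@rows) {
        return $row->[0] - $t0 if $row->[1] < $fraction * $m0;
    }
    return;
}

if (-e 'stats.csv') {
    open my $in, '<', 'stats.csv' or die "can't read stats.csv: $!";
    my @rows = parse_rows(<$in>);
    my $secs = runtime_until(0.5, @rows);
    print defined $secs
        ? sprintf("below 50%% of initial brightness after %.1f minutes\n", $secs / 60)
        : "never dropped below 50% of initial brightness\n";
}
```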

[Figure: battery life]
Good news! My normal commute is about 50 minutes, round trip, and so this is a pretty decent safety margin. If I need to stretch the battery life, the light does have some low power modes, and a strobe. (What? Low power? Whatever for?!)

Tagged as: bike, ImageMagick, Perl, sustainability
