Diffing Images on the Command Line

July 24, 2015 holman

So about a year ago I realized that a play on Spaceman Spiff — one of Calvin’s alter-egos — would be a great name for a diffing tool. And that’s how spaceman-diff was born.

Then I forgot about it for a year. Classic open source. But like all projects with great names, it eventually came roaring back once I was able to make up an excuse — ANY mundane excuse — for its existence.

So today I’ll shout out to spaceman-diff, a very short script that teaches git diff how to diff image files on the command line.

Most of the heavy lifting is handled by jp2a: spaceman-diff is just a thin wrapper around it that makes it more suitable for diffing.

Install

This ain’t the README, dammit, so go to the repo to learn about all of that junk.

Learning via Git internals

Part of the fun of doing this (of doing anything silly like this, really) is digging into your tools and seeing what’s available to you. Writing spaceman-diff was kind of a fun way to learn a little bit more about extending Git’s diffing workflow.

There are a couple of different approaches you can take to do this within Git. The first was slightly naive and basically involved overriding git-diff entirely. That way, spaceman-diff handled all the file extension checks and had quite a bit more control over the actual diff itself. git-diff was invoked using an external diff tool set up with gitattributes. If the file wasn’t an image, we could pass the diff back to git-diff using the --no-ext-diff flag. This was cool for a while, but it became problematic once you realize your diff wrapper would have to support all the flags and commands passed to git-diff so you can fall back correctly (and, because of how Git passes arguments to your external diff script, you don’t have access to the original command).
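To make that limitation concrete, here’s a rough sketch of what an external diff program sees. This is not spaceman-diff’s actual code (that’s a shell script); it’s a Python illustration. The extension list and the function names are made up for the example. The seven arguments are the ones Git documents for external diff programs in the common case. Other cases, like unmerged paths, get a different argument count, and this sketch simply rejects anything shorter than seven.

```python
import os
from typing import NamedTuple

# Assumption for illustration: which extensions count as "images".
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif"}


class ExternalDiffArgs(NamedTuple):
    """The seven arguments Git hands to an external diff program."""
    path: str
    old_file: str
    old_hex: str
    old_mode: str
    new_file: str
    new_hex: str
    new_mode: str


def parse_external_diff_args(argv):
    """Keep the first seven arguments; fail loudly if there are fewer."""
    if len(argv) < 7:
        raise ValueError(f"expected at least 7 arguments, got {len(argv)}")
    return ExternalDiffArgs(*argv[:7])


def is_image(path):
    """Decide by file extension alone, case-insensitively."""
    return os.path.splitext(path)[1].lower() in IMAGE_EXTENSIONS


def fallback_command(path):
    """Build the command that hands a non-image back to Git's built-in diff.

    Note what is missing: any flags the user originally typed. The
    wrapper only ever sees the seven arguments above, not the original
    command line.
    """
    return ["git", "diff", "--no-ext-diff", "--", path]
```

The fallback can only rebuild a plain git diff for the path. If someone ran git diff with extra flags, the wrapper never sees them, which is exactly the problem described above.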

Another option is to use git difftool here. It’s actually a decent approach if you’re looking to replace the diffing engine entirely. Maybe you’re writing something like Kaleidoscope, or maybe a tool to view diffs directly on Bitbucket instead of locally. It’s pretty flexible, but with spaceman-diff we only want to augment Git’s diff rather than rebuild the entire thing. It’d also be great to let people use git-diff rather than have to remember to type git-difftool when they want to diff images.

The Pro Git book has a nice section on how to diff binary files using gitattributes. There’s even a section on image files, although they use textconv, which basically converts each version of a file into a textual representation (in their case, a few lines of image EXIF data: filesize, dimensions, and so on) and then lets Git’s own diffing algorithm diff that as normal blocks of text. That’s pretty close to what we want, but we’re not heathens here… we prefer a more visual diff. Instead, we use gitattributes to tell Git to use spaceman-diff for specific files, and spaceman-diff takes over the entire diff rendering at that point.
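For a feel of the textconv side of things, here’s a minimal sketch of a textconv-style converter in Python. It is not the tool the Pro Git book uses, and it has nothing to do with how spaceman-diff renders images. It just reads the width and height out of a PNG header and prints a couple of lines of text for Git to diff. The script name and the driver name "pngsummary" in the comments are hypothetical.

```python
import struct
import sys

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Hypothetical wiring (the names are assumptions, the mechanism is Git's):
#   .gitattributes:   *.png diff=pngsummary
#   git config diff.pngsummary.textconv "python3 png_summary.py"
# Git runs the textconv command with the file path appended as an
# argument and diffs whatever the command writes to stdout.


def png_summary(data):
    """Return a short text description of PNG bytes for a line-based diff."""
    # A PNG starts with an 8-byte signature, followed by the IHDR chunk:
    # 4-byte length, 4-byte type, then big-endian width and height.
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    width, height = struct.unpack(">II", data[16:24])
    return f"filesize: {len(data)} bytes\ndimensions: {width}x{height}\n"


def main(argv):
    """Entry point: argv[1] is the path Git passes to the converter."""
    with open(argv[1], "rb") as handle:
        sys.stdout.write(png_summary(handle.read()))
    return 0
```

With something like this wired up, resizing an image shows up as a one-line change to the "dimensions" line. Handy, but still a text diff of metadata and not a picture.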


Nothing ground-breaking or innovative in computer science happening here, but it’s a fun little hack. Git’s always interesting to dive into because it does offer a lot of little hooks into its internals. If you’re interested in this, or if you have a special binary file format you use a lot that could benefit from a lo-fi diff like this, take a peek and see what’s available to you.

Provided, of course, you have a great pun for your project name. That comes first.