Archive for the 'Best practices' Category

macOS installer cards

Thursday, September 29th, 2016

Being both a creature of habit and a digital packrat, I never install macOS from the “Install macOS” app.

Instead, what I do is have a cache of 8 GB class-10 microSDHC cards (usually bought from Micro Center, but they make great Amazon filler items, too), which I permanently assign as macOS installer media.

(I would buy SDHC cards, but Micro Center sells the microSDHC cards for cheaper—50¢ cheaper as I write this.)

Populating the card

Once you have a card, you’ll need to put the bits on it. DiskMaker X is the easiest way to do this, though there is also a manual process.

Labeling it

Anonymous cards all of the same capacity and speed and possibly even description does not scale well as an inventory system. Ideally, your card comes in a clamshell case, like the Micro Center ones do, or you can reassign such a case that came with another card. You can probably even buy empty microSD cases, like you can CD jewel-cases.

I have a Brother labeler that I use to print a small label with the release’s version and marketing name on it: “OS X 10.7 Lion”, “OS X 10.8 Mountain Lion”, “OS X 10.9 Mavericks”, “OS X 10.10 Yosemite”, “OS X 10.11 El Capitan”, and now “macOS 10.12 Sierra”. That label goes on the front of the case.

Photo of my flash memory cards on which I have the past several years' macOS installers. Lion was on an SD rather than microSD card; Micro Center have changed their case form factor a few times; I started printing rather than hand-writing labels with 10.9, and I haven't yet labeled the card that will go to Sierra.
I wrote this while the Sierra installer application was downloading.

And then done

The very pretty installation volume that DiskMaker X created for Sierra, complete with window background matching the installer icon.

Now you have labeled, permanent installer media that you can use forever (or however long the cards last). If you ever need to roll back to an old version, reinstall the dot-zero from the card and then combo-update up to whatever version you want.

So you want to run an open-source project

Sunday, September 11th, 2016

How much of this you’ll want to do will depend on just how much of a Thing you want your project to be, on a scale from “I’m just gonna drop some code on GitHub” to “I want to wear the title of ‘maintainer’”.

GitHub

You don’t have to use GitHub as such; Gitlab and Bitbucket are alternatives, and they’re all more or less equivalent. But GitHub has network effects in its favor: Most potential contributors are there.

“Contributors” doesn’t just mean code. Anybody can clone your repository, check something in, and email you the patch. But writing issue tickets and editing the wiki (if you choose to have one) are valuable, too, and those typically require an account.

Your project on GitHub (or wherever) will include one or more of the following:

  • Code/contents (the Git repository from which GitHub and Gitlab take their names)
  • An issue tracker (more on that later)
  • Forks (other people’s copies of your repository, to which they have their own access to push changes)
  • Merge requests (changes made in one fork, then submitted “upstream” toward the original repository for consideration and possible integration into the “main” code base)
  • A wiki (documentation)
  • A website

Travis

Like with GitHub, there are alternatives. Travis is the one I’ve used.

Travis is a continuous-integration system. Whenever someone pushes changes to your repository, or submits a merge request, Travis will build your project and run your tests, according to instructions you provide. It’ll then provide an indication—on GitHub, right on the merge request page—of how well or poorly that went.

There’s a related service called Coveralls that tracks the code coverage in your project—how much of your code is being run by your tests. Code paths that the tests aren’t going through are potential bug sites: When that code path does get traveled, what happens might not be what you expect.

Coveralls tells you how much of your code was exercised, as a percentage, and also gives you a line-by-line map of what code was run and what code wasn’t.

Project infrastructure documents: README, LICENSE, CONTRIBUTING/CODE_OF_CONDUCT

(If you write these in Markdown, they’ll have the .md extension: README.md, etc.)

README

Your README is your Getting Started guide for both users and contributors. Awesome Labs has a template you can follow (optimized for iOS libraries); the basic points you should cover include:

  • What is the product? What does it do? Why do I need it?
  • What does it look like? (Screenshots/example terminal sessions/example code)
  • What does it require?
  • How do I install it?
  • How do I use it, at least in the most common case?
  • How do I uninstall it (if applicable)?

LICENSE

This is where you put your statement of copyright, and the license agreement for your program, outlining what sorts of copying you explicitly allow and which ones you explicitly don’t. Common ones include variants of the BSD and MIT licenses, GPLv2, and GPLv3.

GitHub suggests a few popular licenses when you create a new repository, and provides an advanced section with many more licenses where you can compare their finer points.

CONTRIBUTING

This file describes guidelines for contributors, including:

  • Requirements of the contributions themselves. Examples of such requirements include:
    • Please add tests for any new features, and update tests for any behavior changes.
    • Please run the tests and be sure they pass before submitting your merge request.
    • Your change must pass all tests on all supported platforms. See the README for our platform requirements.
    • Please match the style of existing code. In particular, please use one tab/four spaces/eight spaces/6.34 spaces per indent, and please put opening braces ({) on the same line/their own line.
    • Please submit the complete, original series of changes—don’t squash merge.
    • Please squash-merge your changes—we don’t want the complete, original series of changes.
  • Information on how to contribute, if it differs at all from the usual “fork the project, commit and push your change, then submit a merge request”.
  • Requirements of contributors (e.g., a code of conduct such as the Contributor Covenant).
  • Info about any project mailing lists, IRC channels, Slack channels, etc., perhaps with their own code of conduct, such as the Community Covenant.

Codes of conduct are important for making explicit the standards of behavior you expect from project members and contributors. Don’t tolerate jerks—if somebody breaks your rules, even if their contributions are good on technical merit, kick them out. (You define what “kick them out” means, and may want to document it in your CONTRIBUTING file.)

Known jerks continuing to be tolerated/welcomed in your project is a visible “stay away” sign to other people who might otherwise bring equally good contributions with less hassle.

Issue tracker

Issue trackers are two things:

Your project management tool

You use issue tickets to track the work that has been done and that needs doing.

Most issue trackers provide a category/tag/label feature to identify on each ticket what sort of work it is. You can also create labels for priorities, if you want to track how essential or postponable each item is. (Some issue trackers have a separate priority field.)

Have a “help wanted” or “up for grabs” tag you can put on bugs that you don’t want to do yourself, and mention it in your CONTRIBUTING instructions. “help wanted” bugs are the ones you explicitly reserve for new contributors to do. Most should be simple or good introductions to the codebase, but you can also include harder jobs where (for reasons of your own time) you’re willing to accept the patch but not write it yourself.

You may also be able to create milestones for future releases you’re planning, into which you put each ticket to track when that work should be done (or at least landed), and, inversely, what work must be done before a release can ship.

Everyone else’s wish list

You’ll get a lot of bug reports and feature requests. Many of them will be duplicates, which you’ll close with a reference to the original/anointed ticket for the same bug/request/task.

If your project grows large/popular enough, you may choose to lock down the issue tracker so only contributors can add or edit tickets, and then take requests and bug reports instead through a support email address or form. Whoever reads that email then updates tickets as needed and sends ticket links or ticket numbers to support querents.

You will need to say no to some things, and some people will be displeased. You will also need to postpone some things because they aren’t as important or there isn’t a clear path forward (e.g., no steps to reproduce), and some people will be displeased because they want their feature/a fix now. There are no two ways about it; you are going to make some people unhappy. It’s not your fault.

Be polite; be fair; be the better person. Sometimes that still won’t be enough—some people won’t accept anything less than “yes, right away, it’ll be fixed in the next ten minutes”. And that sucks. But it’s not your fault.

Code review

Code review is where one or more contributors (and/or the maintainer) look at a proposed (or even already committed) change, looking for errors and potential improvements.

Some guidelines:

  • Review the patch, not the person. Don’t call the contributor an idiot or accuse them of wasting their time. Be gentle; be constructive. Yes, even if the patch is total trash—make it a teachable moment. Recognize language barriers and early learners. (But also don’t waste your time excessively—sometimes you will have to give up on someone who is a time sink. Even then, be nice.)
  • Prefer questions to statements. Be open-minded, not reactionary. Be willing to accept that an approach that isn’t what you would have done may be better. At the very least, be willing to accept it if it isn’t worse.
  • Praise good ideas, novel (but not overly clever) approaches, attention to performance, reliability, and ease-of-use, and hard work. Code review isn’t just a place to point out what they did wrong—it’s also a place to appreciate, explicitly, what they did right.

The simplest form is to look at the diff (the proposed change) in each merge request. You can comment on the merge request as a whole, and, on most services, on individual lines within the diff. Use these comments to ask for clarification, ask for documentation/comments, suggest changes, or suggest alternative approaches.

There are dedicated code review tools, such as Phabricator and ReviewBoard. I’ve used Phabricator; it’s good. I haven’t used ReviewBoard.

On a GitHub project, I typically just use the built-in merge request interface.

Package managers

They are many, and which ones you have to choose from will depend on what you’re making. For libraries for Apple platforms, there are CocoaPods and Carthage. For tools for macOS, there are Homebrew, MacPorts, and Fink.

Your README should include information on which package managers you support, and how to obtain your software through them. Expect people to request that you add your software on other package managers, or somebody to add it themselves and other people to subsequently complain to you when it doesn’t work. You decide what you do when these happen.

Documentation

It should start with your README, but ideally should not end there.

If you’re making a library, have header/API documentation. On Apple platforms, Xcode now has built-in support for this; it’ll display the contents of your documentation comments when you option-click a symbol in C, Objective-C, or Swift code.

Another place to put documentation (and also do broader planning than fits comfortably in an issue ticket) is your project’s wiki.

Alternatives to “guys”

Friday, June 19th, 2015

When you’re addressing a mixed-gender or unknown-gender group, you should not use the word “guys”.

(Everything in this post also applies to “dudes” and “fellows”, and the singulars of all three. For example, “the IT guy”.)

The word refers to people who are male. It doesn’t matter what you meant by it in a particular instance, or how you as an individual tend to use it: That is what it means. That is what it conveys. That is what people hear when you say it.

As Julia Evans found, different usages vary in how they’re received, and there’s almost always a difference by gender. You’re more likely to think “guys” is gender-neutral if you’re a guy.

When you use it to address people of mixed or unknown gender, you reinforce the idea of male-as-default: This masculine word can refer to anybody! Funny how that doesn’t work for feminine words.

When you use it to address people of mixed or unknown gender, you erase the non-male people in the audience: Everybody here is guys! There’s nobody else here, no non-guys at all, no, sir.

So stop it. Stop saying “guys”.

(Except, of course, when you really do mean a group entirely of guys, like a men’s sport team.)

You may think it’s perfectly normal. It’s common, and that’s different. “Normal” implies healthy, and this isn’t healthy.

If you start paying attention, it won’t sound so “normal” after all, once you start noticing every time somebody refers to people who aren’t all guys as guys.

You might think “OK, I’ll say ‘guys and girls’ or ‘ladies and gentlemen’ instead.” Don’t do that. That does not include everybody: There are more than two genders, and not everybody inhabits any of them.

We can do better than that. We can include everybody.

So, what should you say instead?

These are words and phrases that include rather than exclude. That acknowledge rather than erase.

If you have other alternatives to suggest, please do suggest them in the comments.

The to-do graph

Sunday, December 21st, 2014

Earlier this year, I started practicing GTD.

I’m not very good at it yet. But I’ve learned some things.

And I’ve proven that lists suck.

A list is a single, flat, ordered collection of items. In GTD, you’ll have one list per project, and each item is an action.

This makes sense if you consider the list to be a record of everything you did toward that project, in order, written before instead of after.

But who does that?

Who micromanages the order of things they’re going to do before they do it?

Who has such perfect foresight?

Who has that kind of time?

Moreover, the order of a list is implicitly transitive: Every item in the list must come after every item before it and before every item after it.

That isn’t true: Some things can be done at the same time, or in either order. For example, you can work on assets while you install Xcode, and then you can work on assets before code or code before assets.

A list is a serial queue. No work can proceed until the work before it is finished.

The alternative to that is to throw out the ordering altogether: Put items in in the order you think of/receive them, and then every time you want to start one, scan through the whole list until you find something that you haven’t done and can do.

It’s a choice between false information—order relationships that don’t really exist—and no information.

The latter seems worse: Some of the orders are true, and therefore valuable, so why throw them all out?

How can we avoid that?

What makes the true orders true? Why do those actions have to be done before those other actions?

Dependencies.

This action must be done before that other action because the other action depends on it.

Some actions depend on other actions. Some actions depend on multiple other actions.

This is a graph.

That’s what I’ve switched to. I still use OmniFocus (version 1), but only as an inbox; I migrate those items to my to-do graph, which I keep in OmniGraffle.

Example graph of two projects, “Build new app” and “Work on existing app”, with actions such as “Create Xcode project”, “Fix bug in frobulator”, and “Add BSP reticulation”.
You can tell this is a made-up example because some of these actions are not concrete enough to be proper GTD.

The graph enables me to express dependencies without making up false orderings. Items that can be done in parallel are in parallel.

I mainly edit the outline, rather than the boxes on the graph directly. You could use something like OmniOutliner or TaskPaper, but those can’t visualize the graph. OmniGraffle has an “auto layout” (no relation to the Cocoa feature) option that automatically creates and arranges boxes in the graph corresponding to items in the outline.

The top-level items in the outline, the roots of the graph, are goals. I typically write these as high-level imperative sentences such as “build initial version of app”.

All, or occasionally nearly all, of the other nodes are indivisible actions. Each is a single concrete step toward the goal.

The leaf nodes are “next actions”: At any time, I should be able to pick a next action as the next thing I’m going to do.

I also create nodes for other people’s actions that I’m waiting on. These nodes look like “So-and-so: Do such-and-such”. When that happens, I take it off; its parent—if it has no other dependencies—then becomes a next action.

I set OmniGraffle to lay the graph out as an upward tree, so that each project actually does look tree-like, with the “root” at the bottom.

To mark an action as done, I have two choices: I can set the node’s font to strike-through, or just delete it. I typically strike through my own completed actions, and delete obviated actions, completed actions by others, and completed goals.

You’ve gathered by now that OmniGraffle lacks some things for this.

  • It wasn’t designed to be used as a to-do list, so it lacks a concept of “done”. Instead of a checkbox, I have the Font Panel.
  • Similarly, there are no priority or due-date options. I generally don’t use these, but some projects do, in fact, have a due date, or greater or lesser priority than other projects, and it’d be helpful to track that.
  • The outline editor, which superficially looks like a mini OmniOutliner, lacks or changes about half of OmniOutliner’s keyboard commands. I really wish that if the keyboard focus is on the outline, that it would just respond exactly as OmniOutliner would to every possible keypress.
  • The option to have a node pinned to the far (top) row of the tree is a per-node option, so I can’t have it automatically lay out all next actions in the same row.
  • The outline structure, like OmniOutliner’s, means I cannot have multiple nodes depend on the same action—which they very well could in reality.

I have two points:

  • A graph is a much better way to express to-dos than a flat list.
  • There currently isn’t a Mac app ideally suited to this. OmniGraffle is a great graph editor, but I’m using it for a purpose it wasn’t designed for. I’d pay good money for an app of OmniGraffle’s quality and basic nature, but optimized for to-do keeping.

On the API design of CGBitmapContextCreate

Friday, June 1st, 2012

Let’s review the prototype of the CGBitmapContextCreate function:

CGContextRef CGBitmapContextCreate (
 void *data,
 size_t width,
 size_t height,
 size_t bitsPerComponent,
 size_t bytesPerRow,
 CGColorSpaceRef colorspace,
 CGBitmapInfo bitmapInfo
);

The arguments:

  • data may be a pointer to pixels. If you pass NULL, the context will create its own buffer and free that buffer itself later. If you pass your own buffer, the context will not free it; it remains your buffer that you must free after you release the context, hopefully for the last time.
  • width and height are what their names say they are, in pixels.
  • bitsPerComponent is the size of each color component and the alpha component (if there is an alpha component), in bits. For 32-bit RGBA or ARGB, this would be 8 (32÷4).
  • bytesPerRow is as its name says. This is sometimes called the “stride”.
  • colorspace is a CGColorSpace object that specifies what color space the pixels are in. Most importantly, it dictates how many color components there are per pixel: An RGB color space has three, CMYK has four, white or black has one. This doesn’t include alpha, which is specified separately, in the next argument.
  • bitmapInfo is a bit mask that specifies, among other things, whether components should be floating-point (default is unsigned integer), whether there is alpha, and whether color components should be premultiplied by alpha.

The most immediate problem with this function is that there are so damn many arguments. This is especially bad in a C function, because it’s easy to lose track of what each value specifies, especially when so many of them are numbers. Suppose you want to make an 8-by-8-pixel grayscale context:

CGContextRef myContext = CGBitmapContextCreate(NULL, 8, 8, 8, 8, myGrayColorSpace, kCGImageAlphaNone);

Now, without looking at the prototype or the list, which argument is bitsPerComponent, which is bytesPerRow, and which are width and height?

Objective-C’s names-and-values message syntax can help with this, as we can see in the similar API (for a different purpose) in NSBitmapImageRep:

NSBitmapImageRep *bir = [[NSBitmapImageRep alloc]
    initWithBitmapDataPlanes:NULL
                  pixelsWide:8
                  pixelsHigh:8
               bitsPerSample:8
             samplesPerPixel:4
                    hasAlpha:YES
                    isPlanar:NO
              colorSpaceName:NSCalibratedRGBColorSpace
                 bytesPerRow:8
                bitsPerPixel:8*4];

But this has other problems, notably the redundant specification of bitsPerPixel and samplesPerPixel. With that and the isPlanar argument, this method takes even more arguments than CGBitmapContextCreate. More importantly, it doesn’t solve the greater problems that I’m writing this post to talk about.

EDIT: Uli Kusterer points out that bitsPerPixel is not redundant if you want to have more bits not in a component than just enough to pad out to a byte. That’s a valid (if probably unusual) use case for NSBitmapImageRep, so I withdraw calling that argument redundant.

I’m going to use the example of both of these APIs, but mainly CGBitmapContextCreate, to talk about a few principles of API design.

The first is that it should not be possible for an object to exist in an unusable state. From the moment a freshly-created object is returned to you, you should be able to use it without it blowing up in your face.

From this principle follows a corollary: Everything an object needs in order to function, it should require when you instantiate it. Otherwise, the object would exist without the needed information—and thereby be unable to function—until you provide it.

It might seem that these APIs are as long as they are in order to uphold that principle. After all, a bitmap context needs to have someplace to put its pixels, right? (In fact, CGBitmapContextCreate‘s buffer argument was required until Snow Leopard and iOS 4.) It needs to know what format the pixels should be in, right?

Now for the second principle: Any information that an object does not need in order to function should be omitted from initialization and provided afterward. In Objective-C, the most common means of this post hoc specification are readwrite properties and delegate messages. Generally, for anything that could be specified in the initializer, the post hoc way to specify it would be via a property.

We’d like to invoke the second principle and move things out of the initializer, but that would seem to conflict with the first principle: What can we move that the context does not require?

The resolution is in a third principle—one that is not specific to APIs, but applies to all interfaces, including user interfaces: An interface should have reasonable defaults for as many parameters as it can—it should only require the user to provide values for parameters for which no default can be reasonably chosen in advance.

With that in mind, let’s look at some of CGBitmapContextCreate‘s arguments and see how we might apply the reasonable-defaults principle to simplify it:

  • bitsPerComponent, bitmapInfo, and colorspace: Most commonly, the caller will want 8-bit RGBA or ARGB, often with the goal of making sure it can be used on the graphics card (either by way of a CG- or CALayer or by passing the pixels directly to OpenGL). That’s a reasonable default, so these three can be eliminated.

    We could make them properties, but there’s an alternative: We could dynamite bitmapInfo and merge some of its values with bitsPerComponent in the form of several pixel-format constants. You’ve seen this approach before in QuickTime and a few other APIs. CGBitmapContext only supports a specified few pixel formats anyway, so this simply makes it impossible to construct impossible requests—another good interface principle.

  • bytesPerRow: Redundant. The number of bytes per row follows from the pixel format and the width in pixels; indeed, CGBitmapContextCreate computes this internally anyway and throws a fit if you guessed a number it wasn’t thinking of. Better to cut it and let CGBitmapContextCreate infer it.

    Making you compute a value for bytesPerRow does provide an important safety check, which I’ll address shortly.

    EDIT: Alastair Houghton points out another case for keeping bytesPerRow. This doesn’t apply to CGBitmapContextCreate, which rejects any value that doesn’t follow from the pixel format and width in pixels, but could be valid for NSBitmapImageRep and CGImage.

  • data (the buffer): Since Snow Leopard and iOS 4, the context will create its own buffer if you don’t provide one. That makes it explicitly optional, which means it is not required.

The only arguments that are truly required are the width and height, which tell the context how many pixels it should allocate its initial buffer for in the given (or default) pixel format.

In fact, if we take the above idea of replacing three of the arguments with a single set of pixel-format constants, then we don’t actually need to make any of the properties readwrite—there isn’t any reason why the owner of the context should be changing the pixel format on the fly. You might want to change the width or height, but CGBitmapContext doesn’t support that and we’re trying to simplify, not add features.

So, what problems do the current APIs solve, what problems do they raise, and how would we address all of both problems?

  • Specifying the pixel format (bitsPerComponent, colorspace, bitmapInfo) up front saves the context having to reallocate the buffer to accommodate any pixel-size changes.

    If we simply removed the pixel format arguments from the initializer and made them readwrite properties (or a property), then the context would have to reallocate the buffer when we change the pixel format from the default (ARGB or something similar) to something else (e.g., grayscale).

    The immediate solution to that would be for the context to allocate its buffer lazily the first time you draw into it, but that would mean every attempt to draw into the context would hit that “have we created our buffer yet” check.

    A better solution would be to follow the above idea of condensing the specification of the pixel format down to a single constant; then, we could have a designated initializer that would take a pixel-format value, and a shorter initializer for the default case that calls the DI with the default pixel-format value.

  • Specifying the buffer as a plain pointer (or pointer to one or more other pointers) requires the dimensions of the buffer to be specified separately.

    It’s a mystery to me why CGBitmapContextCreate doesn’t take a CFMutableData and NSBitmapImageRep’s initializers don’t take an NSMutableData. With these, the length in bytes would be associated with the buffer, enabling the context/rep to check that the length makes sense with the desired (or default) pixel format. This would be better than the current check in two ways: First, the current check only checks bytesPerRow, ignoring the desired height; second and more importantly, the current check only checks the value you gave for bytesPerRow—it can’t check the actual length of the buffer you provided.

    (From that, you can derive a bit of guidance for using the current API: If you pass your own buffer, you should use the value you computed for bytesPerRow in computing the length of your buffer. Otherwise, you risk using one stride value in allocating the buffer and telling a different one to CGBitmapContextCreate.)

  • Requiring (or even enabling) the buffer to be provided by the caller is redundant when the API has all the information it needs to allocate it itself.

    This was especially bad when the buffer was required. Now that CGBitmapContext can create the buffer itself, even having that optional input is unnecessary. We can cut this out entirely and have the context always create (and eventually destroy) its own buffer.

  • The caller must currently choose values for parameters that are not important to the caller.

    The current API makes you precisely describe everything about the context’s pixels.

    WHY? One of the central design aspects of Quartz is that you never work with pixels! It handles file input for you! It handles rendering to the screen for you! It handles file output for you! Core Image handles filtering for you! You never touch pixels directly if you can help it!

    99% of the time, there is no reason why you should care what format the pixels are in. The exact pixel format should be left to the implementation—which knows exactly what format would be best for, say, transfer to the graphics card—except in the tiny percentage of cases where you might actually want to handle pixels yourself.

With all of this in mind, here’s my ideal API for creating a bitmap context:

typedef enum
#if __has_feature(objc_fixed_enum)
: NSUInteger
#endif
{
    //Formats that specify only a color space, leaving pixel format to the implementation.
    PRHBitmapContextPixelFormatDefaultRGBWithAlpha,
    PRHBitmapContextPixelFormatDefaultRGBNoAlpha,
    PRHBitmapContextPixelFormatDefaultWhiteWithAlpha,
    PRHBitmapContextPixelFormatDefaultWhiteNoAlpha,
    PRHBitmapContextPixelFormatDefaultCMYK,
    PRHBitmapContextPixelFormatDefaultMask,

    PRHBitmapContextPixelFormatARGB8888 = 0x100,
    PRHBitmapContextPixelFormatRGBA8888,
    PRHBitmapContextPixelFormatARGBFFFF, //128 bits per pixel, floating-point
    PRHBitmapContextPixelFormatRGBAFFFF,
    PRHBitmapContextPixelFormatWhite8, //8 bpc, gray color space, alpha-none
    PRHBitmapContextPixelFormatWhiteF, //Floating-point, gray color space, alpha-none
    PRHBitmapContextPixelFormatMask8, //8 bpc, null color space, alpha-only
    PRHBitmapContextPixelFormatCMYK8888, //8 bpc, CMYK color space, alpha-none
    PRHBitmapContextPixelFormatCMYKFFFF, //Floating-point, CMYK color space, alpha-none

    //Imagine here any other CGBitmapContext-supported pixel formats that you might need.
} PRHBitmapContextPixelFormat;

@interface PRHBitmapContext: NSObject

- (id) initWithWidth:(NSUInteger)width
    height:(NSUInteger)height;
- (id) initWithWidth:(NSUInteger)width
    height:(NSUInteger)height
    pixelFormat:(PRHBitmapContextPixelFormat)format;

//There may be an initializer more like CGBitmapContextCreate/NSBitmapImageRep's (taking individual pixel-format values such as color space and bits-per-component), but only privately, to be used by the public DI.

//Mutable so that an asynchronous loader can append to it. Probably more useful in an NSBitmapImageRep analogue than a CGBitmapContext analogue.
@property(readonly) NSMutableData *pixelData;

@property(readonly) NSColorSpace *colorSpace;
@property(readonly) bool hasAlpha;
@property(readonly, getter=isFloatingPoint) bool floatingPoint;
@property(readonly) NSUInteger bitsPerComponent;

- (CGImageRef) quartzImage;
//scaleFactor by default matches that of the main-menu (Mac)/built-in (iOS) screen; if it's not 1, the size (in points) of the image will be the pixel size of the quartzImage divided by the scaleFactor.
#if TARGET_OS_MAC
- (NSImage *) image;
- (NSImage *) imageWithScaleFactor:(CGFloat)scale;
#elif TARGET_OS_IPHONE
- (UIImage *) image;
- (UIImage *) imageWithScaleFactor:(CGFloat)scale;
#endif

@end

With the current interface, creating a context generally looks like this:

size_t bitsPerComponent = 8;
size_t bytesPerComponent = bitsPerComponent / 8;
bool hasAlpha = true;
size_t bytesPerRow = (CGColorSpaceGetNumberOfComponents(myColorSpace) + hasAlpha) * bytesPerComponent * width;
CGContextRef context = CGBitmapContextCreate(NULL, width, height, bitsPerComponent, bytesPerRow, myColorSpace, myBitmapInfo);

With an interface such as I’ve described, creating a context would look like this:

PRHBitmapContext *context = [[PRHBitmapContext alloc] initWithWidth:width height:height];

Or this:

PRHBitmapContext *grayscaleContext = [[PRHBitmapContext alloc] initWithWidth:width height:height pixelFormat:PRHBitmapContextPixelFormatWhite8];

Why you should almost always write a test app for your Radar bug reports

Wednesday, November 9th, 2011
  1. You’ll understand the bug better. This means you can write a better bug report, which will help Apple fix it more quickly (meaning you may get the fix more quickly).

  2. They’ll understand the bug better. This, too, helps Apple fix it more quickly.

  3. You may find that it is not a bug in the API at all, but that you were misusing it. Perhaps you were using something on a thread that you shouldn’t have been, or expecting some argument to be used a certain way when it’s actually used differently.

    In this case, you may be able to use the API after all, saving you the time you would have spent hacking around a non-bug. This also saves them the time they would have spent triaging and eventually responding to a non-bug.

    If your misunderstanding was borne out of poor documentation (misleading, inaccurate, vague, incomplete), you can file a bug report about that instead. Then the documentation gets better and future users of the same API avoid making the same error you did.

A test app isn’t appropriate for every kind of Radar, but when it is, including it helps everyone.

The application delegate and the new singletons

Friday, March 18th, 2011

Here is a global variable:

Wizard *gWizard;

I’ll call this a zeroth-order global, on the premise that I need to talk to exactly zero objects (including classes) to gain access to this object.

Next, let’s look at a singleton:

[Wizard sharedWizard]; //hope he's not busy

I’ll call this a first-order global, as we need to ask the class for it (1 step) to gain access to it.

Now, here’s a second-order global:

MyAppDelegate *appDelegate = [[UIApplication sharedApplication] delegate];

(I use UIApplication here because I see this most frequently in Cocoa Touch code, but the pattern applies equally to Cocoa.)

And here’s a third-order global:

Wizard *wizard = [appDelegate wizard];

I need to (1) ask the UIApplication class for the application object, (2) ask that for its delegate, and (3) ask that for the wizard. (Assume here that wizard is a property, not a factory method that creates Wizards on the fly.)

None of these is any less global. If I can get to it from anywhere in the program without knowing about it directly, it is global.

Therefore, all the problems of globals apply:

  • What if two threads want to use the same Wizard?
  • What if the Wizard has a delegate of its own, and I have two objects that want to be its delegate?
  • What if the Wizard keeps internal state that may be corrupted by multiple objects trying to use it? (Nothing should have to worry about this outside of the Wizard itself.)

Your application’s objects form a graph. It should not be a complex one like this:

At the top, the application object. From it, its delegate. From it, your controller objects and a wizard. From each controller object, a path (colored in red) back to the delegate and then to the wizard.

Whenever you have paths bouncing around off of other objects like that, that’s a problem. The red arrows in the problem graph show where you violate the Law of Demeter.

Your object graph should, instead, be straightforward:

At the top, the application object. From it, its delegate. From it, your controller objects. From each one, a wizard.

Note that each of your controllers should own—or, if you prefer, hire—a Wizard all to itself. This eliminates contention between objects and reduces the likelihood of contention between threads (assuming each of the owning objects is supposed to only work on a single thread and not juggle multiple threads).

If contention is not a problem and you have a good reason why there should be only one Wizard, such as memory pressure or union regulations, then use a singleton. But use a real singleton, and only when necessary, and beware of singletons in disguise.

Mentoring

Tuesday, March 8th, 2011

I originally wrote this for the Adium development list, in a discussion of the upcoming Google Summer of Code, for the benefit of those of our developers who have never mentored students in GSoC before and may be considering doing so. Read this in that context; for example, “our” means “the Adium project’s”.


  • Work with students from the very beginning on the quality of their [application] submissions. 90% of the submissions will be crap. Look past this. See what lies within. If it can be improved, try to get them to improve it. Those who do will be good students.

  • Spur your student to work on code. Make sure they’re committing at a regular and satisfactory rate.

  • Ensure your student actually writes code and doesn’t just crib together tutorials and/or Stack Overflow answers and/or ask you to provide code* for various tasks they need to accomplish. Wildly varying coding styles, inconsistent variable and method naming, and poor indentation are warning signs.

  • Communicate constantly. This includes the aforementioned spurring your student to work as well as being available to answer any questions they have about where things are in our code base, what they need to do to achieve certain goals or specific sub-tasks, etc. These questions are virtuous; you should encourage them, just short of demanding them. The student is not that only in name; they are here both to write code and to learn.

  • Find a medium that works for both of you. For me, email (or Twitter, nowadays) would be best. Maybe you prefer IM, or even voice chat/phone (if it isn’t too expensive). Don’t try to force your habits on the student; if you love the phone but they hate it (or vice versa), a happy medium will work better.

  • Be sure they understand and use Mercurial and good VCS practices generally. Frequent, discrete commits; neither waiting “until it’s done” to commit (it should compile) nor committing half-done work periodically (e.g., daily) nor committing amalgams of unrelated work (they should commit specific sections that comprise a single change).

  • Nowadays, I recommend that you have them fork our Bitbucket repo and push to their fork.

  • Review their code constantly. Subscribe to the commits list (or their fork’s commits feed) and review everything they write.

  • Read their commit messages, not just their code. Lists are a warning sign (unrelated changes lumped together). Inadequately describing the change is also a problem. Work the student out of these habits as soon as possible.

  • Don’t wait until mid-term exams to sit your student down for a serious talk about their work or lack thereof. If they’re committing garbage, set them straight as soon as possible—do not wait. If they don’t commit, get them writing and committing as soon as possible.

  • Always be ready to fail your student. Be compassionate, understanding of life’s realities; they’re not a slave. But they are here to work, and if they don’t do the job or if they do a bad job, be ready to fail them.

  • Make sure they know where they stand. If there’s 1–2 weeks before exams and they’re still in danger of failing, make sure they know what’ll happen if they don’t shape up.

I can’t claim to have been perfect in all of these points in my own mentoring (many of these I learned by not doing them), but it’s what I’ve found works.

There might also be something on the GSoC sites about this. Some viewpoints vary, particularly along the leniency-to-hardassness spectrum.

If you are not willing to do all of this, or don’t think you’ll have the time, you should not mentor a student.

* Six words that should worry every mentor or other help-offerer: “Can you show me a sample?”

Ship-It Saturday: PRHEmptySingleton repository

Saturday, September 4th, 2010

The singleton-done-right example from my article on the subject is now in a Mercurial repository on Bitbucket. The repository includes not only the class (which I’ve put in the public domain), but also a test suite for some of the test cases listed in the post.

There are some adventurous testing techniques at work here.

First, since we don’t want multiple test runs to use the same singleton instance, the test cases actually run in subprocesses. Each test method is prefaced with this code:

if (!isInSubprocess) {
    [self runTestInSubprocess:_cmd];
    return;
}

That method calls fork.

In the child process, the test case sets isInSubprocess to YES, then tells itself to perform that selector; this gets execution back to the test method, which checks the variable again, finds that it’s true this time, and continues on with the test.

The parent process calls waitpid and checks the result; if the child failed, the parent propagates the failure. If the child crashed (got a signal), the parent raises the same signal; if the child simply exited abnormally, then the parent exits with the same status.

Second, there’s test case #6:

  • If [super init] returns a different object, alloc/init won’t break.

That one is hard to test, because PRHEmptySingleton’s superclass is NSObject, and -[NSObject init] doesn’t return a different object. (Not in any version of Foundation I’ve ever encountered, anyway.)

So, the first step is to create an intermediate class that does return a different object. But that doesn’t help as long as PRHEmptySingleton is still a direct subclass of NSObject.

The simple solution would be to just change PRHEmptySingleton’s superclass, but that reduces the purity of the testing: A test should be able to work without modifying the code under test, and any changes to the code under test should be permanent changes that aren’t only to enable the test; you should be able to explain the changes as if the test case did not exist.

So what I did was to import everything in my prefix header, even more than I usually do, and then import my intermediate class’s header, and then use the preprocessor to ensure that any direct subclasses of NSObject declared elsewhere are declared as subclasses of the intermediate class. Thus, the prefix header causes PRHEmptySingleton to be a subclass of the intermediate class with no changes to PRHEmptySingleton’s header. It’s a bit of a hack, and doing this sort of thing could potentially cause problems in a larger program or test suite, but in this context, it works.

With that, five of the six test cases in the original post are now covered (I’m not sure how to cover #3 without changing PRHEmptySingleton.[hm]), and the code is under version control, so you can subscribe to and track changes.

Please use singletons responsibly.

The Green Checkmark of Acceptance

Saturday, March 13th, 2010

Here’s how answers to a question on Stack Overflow appear to the questioner:

Every answer has, below the helpful/unhelpful buttons, a hollow checkmark button.

When the questioner clicks on one of those checkmarks, it marks the answer as the accepted answer to that question, and changes the checkmark from a gray stroke to a green fill.

Everybody else reading the question will see, below the questioner’s name, an indication of how many of their questions have accepted answers. Today, for example, my questions have:

Peter Hosey
23.1k 2 14 28
67% accept rate

This indicates that I have accepted answers on four-sixths of my questions.

Sometimes, I see a comment like this semi-fictional example (written by me, based on several real examples I’ve seen) on a question whose author has a low or zero acceptance rate:

0% accept rate? You really should accept answers on your questions, or people may not answer any further questions from you.

This is a bad reason to accept answers.

The real reason to accept an answer is that you believe it’s the correct answer.

Sometimes questioners choose bad answers (deprecated APIs, hacky solution, etc.). When that happens, it’s a problem because it may lead future readers astray—they may think that this is the correct answer (because the questioner said so), without reading the other answers or the comments and finding out that this way sucks and/or there is a better one.

The same problem happens when a questioner accepts an answer because they think they have to, out of some sort of social obligation, rather than because they truly believe it is the correct answer. They may not have the correct answer yet, or there may not be a correct answer yet, but they feel like they have to accept something, so they accept the best answer they have, however good or bad it is, solely to raise that all-important number.

That sucks.

Questioners: About a day after asking a question, you should return to it, read all the answers, try them in descending order by votes, and accept the one that works and is the least hacky, for the benefit of other people who have the same question you asked. Take comments into account—something may not look hacky, but a comment may point out the hackiness.

And if there is no good answer, you don’t need to accept anything. For the same reason (the benefit of future readers), you should leave the question open.

It’s OK to have an acceptance rate that is below 100% or even low, as long as you are accepting answers that you find work and are non-hacky, on as many of your questions as you can. As long as you’re making that effort, you’re doing it right.

People who post comments like the one above: Why are you so desperate for karma? It’s not like it’s scarce or valuable. Net scores on answers are meaningful (usually), but your personal total, like mine, is next to meaningless. It’s a reward, yes, but an empty one, so I don’t see why you get all hurt when you perceive a risk that someone may not give it to you.

In summary: Don’t worry about it. Accept correct answers, write correct answers, and don’t worry about your acceptance rate or anyone else’s.

Blog posts vs. web pages

Monday, January 25th, 2010

Steve Smith says “Stop Blogging”:

I mean it. All of you people are writing fantastic, useful articles about code, methods, and technologies, but you’re putting them in blog posts — a date-based format that encourages us to leave things as they were, historically.

This got me to thinking about the difference between two of the tutorials I’ve published.

The pointers tutorial is a single web page. There’s a date stamp, but it’s way down at the bottom. The ASL series is nine blog posts.

In the three years since the previous version of the pointers tutorial, dozens of people emailed me to tell me about its major errors.

In the two years since I published the last of the ASL series (ignoring approximately a week afterward), nobody has told me of an inaccuracy in any of the posts.

There are a number of possible explanations for the ASL series receiving fewer (that is, no) corrections:

  • That its audience is narrower: Anyone who programs C has to deal with pointers. Only a very few Mac OS X programmers will ever touch ASL.
  • That it is less visible: One of these is linked from my home page and plenty of CS course reading lists (exhibits A, B, C, and D), and was linked for a while from the Wikipedia article on the C programming language; the other is practically unknown to anyone who wasn’t subscribed to my blog at the time.
  • That I’m just that good. (Ha!)
  • That ASL hasn’t changed at all since Leopard. (Ha!)

Smith writes from the perspective of the author and publisher, who must maintain a web page; he says that the author and publisher finds no (or not much of) such obligation for a blog post. I think the difference in my supply of corrections hints at a reader side to this, although, as shown above, my two examples are hardly comparable.

I have been meaning to move the ASL tutorial into a pointers-style web page at some point, although I don’t know when. I may start receiving corrections then, which means I’ll have to spend time to fix them. The flip side to that is that if I leave it as blog posts, I’ll have that time for other things, but the posts will be consigned to periodically-increasing inaccuracy.

I expect to think more about Smith’s suggestion.

There’s also the merit of the word “blog”, which is wearing thin for me.

Warnings I turn on, and why

Saturday, November 7th, 2009

I’ve started turning on most of Xcode’s warning options and one warning-related build setting in all of my personal projects. I suggest you do the same.

ADDED 2013-09-27: The easiest way to do this nowadays is to use this xcconfig file, which turns all of the settings described here on at once.

There are some warnings I don’t turn on, for any of several reasons:

  • They’re inapplicable. For example, “‘Effective C++’ Violations” doesn’t apply to me, because I don’t write C++ code.
  • They don’t help anything. An example is “Four Character Literals”, which is about 'abcd' literals for four-byte-code types such as OSType. These sacrifice something convenient for no benefit, so I leave them off.
  • They are impossible for me to fix. An example is “Multiple Definition Types for Selector”. Cocoa raises that one all over its headers, and I can’t do anything about Cocoa.

The rest of the warnings, I turn on because either they make something clearer or they tell me about either real or potential (i.e., future real) bugs. They are:

  • Check Switch Statements

    Warn whenever a switch statement has an index of enumeral type and lacks a case for one or more of the named codes of that enumeration. The presence of a default label prevents this warning.

    Leave no case unhandled.

    Consider whether a default: label is appropriate for your enumeration. If your switch statement handles all possible values, cut out the default and assert that the value is one of the possible ones instead. An easy way to do this, if the enumeration values are serial and the enumeration is not part of an API you expose, is to have one extra name defined as the number of possible values:

    enum Foo {
    	kFooFoo,
    	kFooBar,
    	kFooBaz,
    	kFooNumberOfValidFooValues
    };
    

    Then, in your assertion macro, compare the value against that name:

    #define MyParameterAssertValidFoo(foo) \
    	NSAssert1((foo) < kFooNumberOfValidFooValues, @"Invalid Foo value: %d", (foo));

    When you add kFooQux, insert it above kFooNumberOfValidFooValues, and the value of kFooNumberOfValidFooValues will increase by one to fit it.

    The result is that your switch statement covers all known values for the enumeration (or you get a warning because it doesn't), and your method throws an exception (from the assertion) whenever anyone passes an unknown value.

  • Hidden Local Variables

    Warn whenever a local variable shadows another local variable, parameter or global variable or whenever a built-in function is shadowed.

    One common way to get this warning is to name a variable index, because there is a function by that name in the standard C library. That's not as much of a false positive as you may think: If you fail to declare your index variable, all your references to it will actually refer to the index function. You can see how it would be bad to send a message such as [myArray objectAtIndex:index] with this bug.

    The solution is simple: Never, ever name a variable index.

  • Implicit Conversion to 32 Bit Type

    Warn if a value is implicitly converted from a 64 bit type to a 32 bit type.

    This is most useful when converting old code to work correctly in a 64-bit architecture. Storing a pointer into an int variable (such as a reference constant) when targeting an LP64 architecture is a good way to get this warning, and rightly so.

  • Initializer Not Fully Bracketed

    Example, Here initializer for a is not fully bracketed, but that for b is fully bracketed.

    	int a[2][2] = { 0, 1, 2, 3 };
    	int b[2][2] = { { 0, 1 }, { 2, 3 } };

    This is a cleanliness warning. It also applies to structures, such as NSRect:

    NSRect warns = { 0.0f, 0.0f, 640.0f, 480.0f };
    NSRect doesNotWarn = { { 0.0f, 0.0f }, { 640.0f, 480.0f } };

    (In real code, I'm more likely to use NSZeroPoint instead of the { 0.0f, 0.0f } element above. It's harder to spell that wrong and get away with it than it is to get away with typing 9.9f, 1.1f, or 2.2f instead of 0.0f.)

  • Mismatched Return Type

    Causes warnings to be emitted when a function with a defined return type (not void) contains a return statement without a return-value. Also emits a warning when a function is defined without specifying a return type.

  • Missing Braces and Parentheses

    Warn if parentheses are omitted in certain contexts, such as when there is an assignment in a context where a truth value is expected, or when operators are nested whose precedence people often get confused about.

    Also warn about constructions where there may be confusion to which if statement an else branch belongs. Here is an example of such a case:

    	if (a)
    		if (b)
    			foo ();
    	else
    		bar ();

    In C, every else branch belongs to the innermost possible if statement, which in this example is if (b). This is often not what the programmer expected, as illustrated in the above example by indentation the programmer chose.

    This may appear to be just a cleanliness warning, but as you can see from the example, it can also warn you about code that may not flow the way you expect it to.

  • Missing Fields in Structure Initializers

    Warn if a structure's initializer has some fields missing. For example, the following code would cause such a warning, because "x.h" is implicitly zero:

        struct s { int f, g, h; };
        struct s x = { 3, 4 };

    This option does not warn about designated initializers, so the following modification would not trigger a warning:

        struct s { int f, g, h; };
        struct s x = { .f = 3, .g = 4 };

    I'm not sure why it warns about the former and not the latter, since all the members get initialized in both code examples (C99 §6.7.8 ¶21). If nothing else, this warning is good motivation for you to switch to designated initializers, which make your code more explicit about which members it's initializing.

  • Missing Newline At End Of File

    Another cleanliness warning—this one, about the cleanliness of diffs.

  • Sign Comparison

    Warn when a comparison between signed and unsigned values could produce an incorrect result when the signed value is converted to unsigned.

  • Strict Selector Matching

    Warn if multiple methods with differing argument and/or return types are found for a given selector when attempting to send a message using this selector to a receiver of type "id" or "Class". When this setting is disabled, the compiler will omit such warnings if any differences found are confined to types which share the same size and alignment.

    I don't turn this one on, because it's unnecessary. When the multiple declarations differ significantly (e.g., one method returns an object and the other returns a float), the compiler will raise the warning whether it's turned on or not. When the declarations don't differ significantly (e.g., both methods return an object), the difference won't cause a problem, so you don't need to worry about it.

    So, you should leave this one off.

  • Typecheck Calls to printf/scanf

    Check calls to printf and scanf , etc, to make sure that the arguments supplied have types appropriate to the format string specified, and that the conversions specified in the format string make sense.

    The biggest reason to turn this on is that it checks your use of methods that take a nil-terminated list of arguments:

    NSArray *array = [NSArray arrayWithObjects:@"foo", @"bar"];

    That message should have a nil after the last argument. With this warning turned on, the compiler will point out that I don't.

    The ostensible main reason to turn this on is to have the compiler check your uses of printf and scanf formats. I don't use printf often (and I never use scanf), so that's not so important for me, but when I do, this could come in handy.

    Sadly, it doesn't work on NSLog calls. This has been fixed in Clang as of 2013. Your printf, NSLog, and stringWithFormat calls all get checked (despite the name of the setting not having changed).

  • Undeclared Selector

    Warn if a "@selector(...)" expression referring to an undeclared selector is found. A selector is considered undeclared if no method with that name has been declared before the "@selector(...)" expression, either explicitly in an @interface or @protocol declaration, or implicitly in an @implementation section.

    Another benefit of this warning is that you can use it to get a warning when you pass a wrong key to a KVC, KVO, KVV, or Bindings method. Uli Kusterer has a macro for that.

  • Unused Functions

    Warn whenever a static function is declared but not defined or a non-inline static function is unused.

    Works best with a policy of declaring any function as static that you don't need to be visible elsewhere in your program.

  • Unused Labels

  • Unused Values

  • Unused Variables

    These follow the general rule of “code you don't have is code you don't have to debug”. If you're not using a label, expression statement, or variable, you don't need it, and you will find your code clearer without it.

    You may notice that I don't turn on Unused Parameters. Most times when I trip that warning, it's a callback function or method, so I can't get rid of the argument. Rather than litter my code with bright yellow #pragma unused(foo) directives, I prefer to just turn this one off. (See my rule above about less code being better.)

Once I have turned on all of these warnings and then eradicated them from my code, I turn on two more build settings:

  • Treat Warnings as Errors

    I call this “hardass mode”.

    Remember what I said above: Almost all of these warnings represent real or potential (i.e., future) bugs in my program. Rather than tolerate them, I turn this setting on so that any time I write such a bug, I break the build.

    I haven't been able to turn this on yet in Adium or Growl, although I have turned it on in Adium's Spotlight-importer project. I do, however, turn it on in all of my solo projects.

  • Run Static Analyzer

    Activating this setting will cause Xcode to run the Clang static analysis tool on qualifying source files.

    The Clang Static Analyzer is the find-your-bugs-for-you tool you've heard so much about. This setting makes Xcode run it whenever you build. Thus, every build, you get all those warnings errors and your analysis results.

    Whenever possible, I leave this on; if there's a source file that it takes a long time to analyze (e.g., GrowlPluginController.m), then I turn it off, but only then.

UPDATE 2009-11-22: Jonathan “Wolf” Rentzsch wrote a script to turn on all of these settings in all of the projects you have open.

UPDATE 2009-11-28: Updated the entry on “Typecheck Calls to printf/scanf” after seeing that Jeremy W. Sherman pointed out a much better benefit of it in a comment on a Stack Overflow answer.

UPDATE 2009-12-05: Corrected the discussion of the index problem. You can't use index, or any other function, as a C-array subscript, so the problem only affects higher-level arrays, such as NSArray.

UPDATE 2013-09-27: Added link to xcconfig; updated section on “Typecheck Calls to printf/scanf”.

The peril of index(3)

Thursday, November 5th, 2009

This is mainly for Andy Finnell on Twitter, who wonders why some of us avoid naming variables index.

I pointed out that there is a function in standard C named index, and this causes one of two problems: If you declare a variable named index, you have shadowed the function and should get a warning for that; if you fail to declare the variable, you pass the pointer to the index function as your array index, which is probably not what you intended.

I say “should” there because, as he noted in his response, the shadowed-name warning is off by default. You should turn it on, because it catches bugs. In fact, the index bug is one that it can prevent.

Suppose you do name a variable index, and either you don’t have the shadowed-name warning turned on or you ignore it. You initialize the variable with an index, but don’t otherwise assign to it. Then, you attempt to access an object in an array by this index.

All well and good so far. index is a variable, so everything works as intended.

But then, one of several things happens:

  1. You comment out both the declaration and the usage of index, for whatever reason, but then you uncomment the usage but forget to uncomment the declaration.
  2. You update and/or merge in your version-control system, or otherwise apply one or more diffs. Usually, this works, but today isn’t your lucky day: The merge breaks your source code. Perhaps it introduces conflicts, and you resolve them incorrectly. Or maybe it breaks the code silently (e.g., by merging in another branch’s division of this function into two).
  3. You move the code to another location, but you forget to move half of it, or you move one half and delete another, forgetting that the declaration of index was in the code you deleted.

You had a variable named index, but now you don’t—but the index function is always there*. Since there is something named index, your code compiles. It’s the wrong type, so you’ll get a warning, but maybe you don’t notice it.

Then you run the code and it crashes. Why? Because you passed a function as the index into an array.

In the worst possible case, it was #2 and you weren’t aware that this code was affected. Maybe you’d been working on something else. Anyway, since you hadn’t been working on the now-broken code, you aren’t testing it**, so you don’t know that it’s now broken.

So you ship it. You ship this index-way-out-of-range crasher. And then your user runs the code and gets the crash.

This isn’t theoretical. I’ve had this happen more than once (fortunately, not in the hands of a user). It’s one reason why I turn on the shadowed-name warning and “Treat Warnings as Errors”, and it’s the reason why I never use index as a variable name.

UPDATE 2009-12-05: To clarify, this problem does not affect C arrays, as C does not allow you to use a pointer in an array subscript. It mainly affects higher-level array interfaces, such as Cocoa’s NSArray.

* Assuming that, directly or indirectly, you’ve included string.h. If you’re using Cocoa, you have (Core Foundation includes it).

** Unless, of course, you have automated test cases covering this code, and you run those.

How not to use UTIs

Tuesday, September 22nd, 2009

John C. Welch has written an article telling you to use UTIs to replace creator codes.

It is wrong.

Daring Fireball, rosscarter.com, and (the most sober of the bunch), TidBITS have all gotten on the “Apple has abandoned creator codes, now we’re screwed, damn you Apple” bandwagon. They’re all right about one thing: Apple has dumped creator codes in Snow.

So before you panic, there’s a replacement mechanism, the UTI, or Universal Type Identifier.

No. Uniform Type Identifiers are a replacement for type codes (and MIME media types, and filename extensions), not for creator codes. There is no replacement for creator codes.

Uniform Type Identifiers are just that: They identify types of data. Using them for anything else is misusing them.

Ideally, Coda should assign an application-specific UTI for CSS files it creates.

No, it should not.

As it turns out, Coda is a UTI-aware application, and according to Coda’s info.plist file, that identifier is com.panic.coda.css.

And that identifier is wrong, because CSS is not Panic’s format.

I use CSSEdit. Currently, it doesn’t declare a UTI, but if they should follow Coda’s example, they should make one up of their own, such as com.macrabbit.cssedit.css. Now we have one type with two names. Which is right?

Suppose I write an application that accepts CSS data. Which UTI do I look for: com.panic.coda.css or com.macrabbit.cssedit.css? Should I look for both of them? What if I find both? What if a third developer releases a CSS editor? Now I have to keep up with the entire population of Mac CSS editors just to accept all the many names for one type.

The right type identifier would be something like org.w3.css or org.w3.cascading-style-sheets, following the rule that the type should be named after whoever controls it. The W3C controls CSS, so its UTI should be named after them. Some other formats, such as PNG, aren’t controlled by anybody, so they go in public.

You might point out public.html and public.xml, as both of those are also controlled by the W3C. Obviously, I disagree that these should be in public.*. But it’s too late to change them now, so we have to put up with them.

Better examples include com.adobe.pdf and com.microsoft.word.doc. In each of these cases, some third party controls the format, so they get the format’s UTI named for them. Notice that Preview does not invent a com.apple.preview.pdf UTI, and TextEdit does not invent a com.apple.textedit.word.doc UTI; they use the UTIs named for the creators and maintainers of those types.

So now we see a problem. All the text-ish files have the same type identifier: com.apple.traditional-mac-plain-text.

Because they are all the same type. And that type is named after Apple because Apple controls that format (in this case, a text encoding).

The Photoshop PNG file is typed as public.png.

There’s no such thing as a “Photoshop PNG” file. PNG is an open format controlled by no-one (or, arguably, the W3C, in which case the type should have been named for them). What you have is a PNG file that you created in Photoshop. That latter detail is irrelevant to its type, which is why HFS had the creator code separate from the type code in the first place.

If there is such a thing as a “Photoshop PNG” file (that is, if Adobe incompatibly extends the PNG format in some way), then they should assign that format its own name (say, com.adobe.png), because it’s a different format.

If you take away the creator code, you’re screwed, because all you have are the extensions.

Exactly why John Gruber, Ross Carter, and Matt Neuberg don’t like the change.

Coda CSS file props:

displayed name:"test.css", 
creator type:"TStu", 
type identifier:"com.apple.traditional-mac-plain-text",

I’m surprised you didn’t notice this. Why isn’t this file’s type identifier com.panic.coda.css?

… why aren’t any of these applications doing the right things? Pages ’09 does. It sets the type for .pages files to “com.apple.iwork.pages.pages”.

Because it’s Apple’s own format specifically for Pages. Again, see also com.microsoft.word.doc.

Unsurprisingly, it doesn’t even have a creator code.

Yeah. Cocoa makes it difficult to apply a creator code to a file and doesn’t do it automatically, so expect most Cocoa apps to not do it.

Even thought the default for the .html extension is BBEdit, the UTI overrode that. When I changed the handler for public.html to BBEdit, then the file opened in BBEdit. That’s exactly the behavior I’d expect, and it matches what you got with Creator Codes.

It matches what you got when you assigned the default handler of a file type, as well. That’s what you did the newer equivalent of.

There never was a way to set which application opened files that had a given creator code. That was what the creator code was for: Determining which application would opened the file.

Yes, Apple did abruptly kill off Creator Codes, however, they’ve been warning about that for some time now.

Sort of. They’ve been warning that they’d kill off the combination of file type codes and creator codes. UTIs are the replacement for file type codes; as I said above, there is none for creator codes.

Now, I know that ‘simple’ is a big fat lie when it comes to application development, however, at some point during the 10.5 lifecycle, all these applications should have been updated to not only support UTIs, but to use them. Instead, they all relied overmuch on FIle Types/Creator Codes …

Actually, it was more common to use filename extensions. See what I said above about the difficulty of assigning file types and creator codes from a Cocoa app. I think that was actually quite rare, simply because filename extensions were so much easier.

… and now that those are gone, well, the OS does what it can with the information it has available.

No, it doesn’t. The creator code information is still available, and the OS ignores it. The OS now only uses type information to determine how to open a given file, except for files that the user has assigned a specific application to.

Oh, and what about when you change a file association in the Finder? Well, unsurprisingly, the Finder cheats. It adds a resource fork to the file with the path to the desired application coded within. … [M]aybe the Finder could start doing the right thing too?

Agreed. I say it should set the file’s creator code.

Incidentally, I tested in Tiger, Leopard, and Snow Leopard, and Finder did this resource trick in all three versions. If the Info window ever did set the file’s creator code, it hasn’t done that for many years.

If the only way to have a unique UTI associated to a file is for that file to use a completely unique file extension, then WHAT THE FUCK GOOD ARE UTIS? Fucking seriously, why the FUCK bother? BBEdit creates TEXT FILES. Does Apple seriously expect that Bare Bones, to properly use UTIs, is going to now create a .bbedittext extension, a .bbedithtml extension, a .bbeditohmyfuckinggodwhatthefuckisgoingon extension, then make you EXPORT to ‘standard’ extensions, just so you can know a file created in BBEdit will open in it?

And now you know why smashing type information and creator/opener information into a single value is a bad thing.

If this is what Apple is pushing, then everyone who signed off on non-settable UTIs is a fucking idiot…

There is no reason why the average user should be able to change the type of a file. If they created an HTML file, it’s an HTML file, and there’s no reason for a user to be able to set it to anything different unless they really know what they’re doing. (I’ve done it, and I’m assuming you have, too. We’re exceptions.)

What you want to do is to change which application opens the HTML file, and that is completely separate from the fact that the file contains HTML.

Variables for clarification

Friday, August 14th, 2009

Before:

if (![newTrackURL isEqualToString:trackURL] || ([newTrackURL hasPrefix:@"http://"] && !([lastPostedDescription isEqualToString:displayString]))) { // this is different from previous notification, or it's a stream and the playing track has changed

After:

BOOL URLChanged = ![trackURL isEqualToString:newTrackURL];
BOOL isStream = [newTrackURL hasPrefix:@"http://"];
BOOL descriptionChanged = ![lastPostedDescription isEqualToString:displayString];
if (URLChanged || (isStream && descriptionChanged)) {

Why remove the comment? Well, I had just changed the code to what you see in the Before example, but I had not changed the comment, and I thought I was ready to commit. I almost committed a stale comment. It’s that easy. Luckily, I caught it in reading the patch before I qfinished it, so I was able to change the comment to what you see above.

Now I have no comment at all, but the code is clear, so I don’t need one. This is an improvement.

The correct way to write a bug report in RadarWeb

Wednesday, July 15th, 2009

While I’m making screencasts of OmniWeb…

RadarWeb provide a JavaScript tool to fill in the section headers for their desired format automatically.
QuickTime/H.264, 960×538, 376 K

This is the format they want you to use, and since a few months ago, it’s now easy for you to use it.

Singletons in Cocoa: Doing them wrong

Wednesday, June 17th, 2009

UPDATE 2010-09-04: I wrote this post about an older version of Apple’s singleton example. The current version is much better: It still doesn’t handle over-releases the way I would like, but there’s nothing objectively wrong with it. As such, there’s nothing wrong with using Apple’s current example. You can continue reading this post to see my criticism of Apple’s old singleton example and my own implementation.


A singleton is a simple concept: It’s an object that you can only have one of. Being that simple, you may think it’s hard to get wrong.

If you do, I welcome you to the wonderful world of programming.

The Cocoa Fundamentals Guide is necessary reading for all Cocoa programmers. Even so, nobody’s perfect, and the CFG’s authors are no exception.

On Twitter last night, a mistake of theirs came to light when someone named Nick tweeted about how hard it is to make a singleton in Cocoa. Nick cited an example from the CFG; Chris Hanson responded that “that example does WAY too much work”, and Bill Bumgarner added that the example is “ridiculous”. They‘re were both right.

Doing it wrong.

The example, in order to “ensure singleton behavior”, overrides four basic Foundation methods:

  • retain
  • release
  • autorelease
  • retainCount

In the override implementations, retain, release, and autorelease are all no-ops. release does exactly nothing; the other two do nothing but return self. The override implementation of retainCount unconditionally returns UINT_MAX.

Before I proceed any further, I’ll give you a caution about singletons. You generally do not need to make your object a singleton at all. Even if your application has a single-window interface, with only one of each controller, you probably don’t need to enforce that. Just don’t create more than one. Then, if you do create more than one, you have a bug.

Let’s digress for a moment. Imagine if you allowed it. Imagine that you allow, say, your app delegate to create more than one of your root controller object. Your app delegate will then proceed to set up both twin objects, and they both respond to such things as notifications.

Bad, right? Now let’s say you “fix” this by making your root controller a singleton.

Your app delegate is still trying to create two root controllers. It’s still trying to set up two root controllers, and as a result of the two set-up messages, the controller is signed up twice for notifications.

But now you really only have one root controller. Your app delegate creates the first one, then gets it again when it tries to create the second. All according to plan so far. Then your app delegate sets up the first object—and then it sets it up again, again thinking that it’s talking to a second object. Even worse, because the object is now signed up twice for notifications (once per set-up message), every notification response happens twice.

You now have only one root controller, but you didn’t fix the bug, which wasn’t in the controller at all, but in the app delegate. To fix the bug, you must fix the app delegate; you don’t need a singleton for this at all.

OK, digression over. Singletons are bad. Avoid them. If you have a lot of them, rip most of them out. (Dave Dribin has bookmarked a lot of other good cases against singletons, and BJ Homer points out that they aren’t all bad. More on BJ’s post later.)

Back to the singleton at hand.

First, let’s look at retainCount. A comment explains that UINT_MAX “denotes an object that cannot be released”, implying that the framework itself considers UINT_MAX a special value.

This is actually correct, although I originally thought (and wrote, in an earlier version of this post) that it was bogus. The documentation for retainCount explicitly says that objects whose release method does nothing should return UINT_MAX.

Next on the hit list: autorelease. This is just mostly pointless. autorelease means nothing more than “send this a release message later”. If release does nothing, then autorelease becomes “do nothing later”. All this override does is move the nothing up to now. A performance optimization, perhaps, but I’d say it’s premature.

And now we come to the real villains: retain and release.

First off, you shouldn’t retain a singleton anyway. I can’t imagine why you would do this, except incidentally, by making it a timer target or something.

But even if you do think of a reason to retain a singleton, you still need to balance retain and release. If you retain without releasing afterward, you are under-releasing the object. If you release without retaining previously, you are over-releasing the object. These two statements have no exceptions.

If you break retain and release, then you’re able to over-retain or over-release (or even both!) the object with impunity. Your app no longer crashes, but you didn’t really fix the problem; you’re just getting away with it. You’re still trying to over-retain and/or over-release an object.

The Cocoa Fundamentals Guide’s primary audience is new Cocoa developers. Those who have never used a retain-counting system before may very well over-release the singleton, and Apple’s example singleton implementation hides that bug from them. That’s bad; every Cocoa programmer should know how to recognize an over-release crash, and breaking retain and release denies the reader an opportunity to learn that. I’ve filed a documentation bug report about the example.

Also, a rebuttal

BJ Homer responds at a different angle to last night’s flurry of tweets.

First, though, we need to get a definition straight. A singleton is a class of which there should only be one instance in any given process. There are actually very few singleton classes in the Cocoa framework including NSDocumentController and NSFontManager. You cannot create more than one of these objects; …

This is true. A singleton object is an object that your app can only have one of. There are a couple of different ways to do this:

… if you try to call [[NSDocumentController alloc] init], you’ll get back the exact same object as you do when you call [NSDocument sharedDocumentController], no matter how many times you call alloc and init. NSApplication is arguably a singleton as well; you can alloc another one, but you’ll get an assertion failure when you call init.

Another way is to simply implement the sharedFramistan method, and leave allocWithZone: and init alone. I suspect that this is common, since it’s the easiest way, but BJ is right that it isn’t a true singleton, since it allows multiple instances.

Where BJ goes wrong is in his defense of the retain and release overrides:

Consider the following example:

MyFloozit *floozit1 = [[MyFloozit alloc] init];
[floozit1 doSomething];
[floozit1 release];

MyFloozit *floozit2 = [[MyFloozit alloc] init]; // MyFloozit is a singleton, so this should be the same object as floozit1
[floozit2 doSomething]; // CRASH HERE
[floozit2 release];

We’ll leave aside the problem that you probably should be using sharedFloozit to obtain the singleton instance of MyFloozit.

When floozit1 is set, a new MyFloozit is allocated, and a static MyFloozit pointer is set. When floozit1 is released, that static pointer is still pointing to the old instance. As a result, when we try to set floozit2 (or when anyone else tries to call [MyFloozit sharedFloozit]), we get back a pointer to that same instance. The one that has been dealloc-ed.

BJ is missing something else the CFG says, which he even quoted later on:

Situations could arise where you want a singleton instance (created and controlled by the class factory method) but also have the ability to create other instances as needed through allocation and initialization. …

(Emphasis added by me.)

A singleton object owns itself. As such, it retains itself. As such, it should never die, because it always has at least one owner—and that’s without breaking retain and release. If it does die, it’s because something over-released it; later, something that was using it will crash, which lets you know that you have a bug. This is a good thing, because now you can fix the bug.

Each of the release messages in BJ’s example balances an alloc message above it. That alloc message may actually return an existing object, but we’re expecting to own a new object. Therefore, the singleton’s allocWithZone: should implicitly retain the existing object.

There is no good reason to override retain and release. Don’t do it. This also goes for autorelease. And, since you never override release, you also never need to override retainCount.

Doing it right

So, having thoroughly dismantled the [old] Apple documentation’s poor implementation of a singleton, let’s look at the correct way to do it.

Let’s go through the requirements:

  • The One True Instance is the only instance. (If you deliberately allow multiple instances, I call this a “multipleton”. I’ll leave that as an exercise for the reader, and concentrate on the true singleton here.)
  • There is a sharedFramistan method. It tests whether the One True Instance exists; if not, it creates the object and remembers it in file-scope static storage. Then it returns the One True Instance.
  • We’ll allow going through alloc and init, and return the same instance. We’ll do this in allocWithZone:, as Apple did. We’ll also need to make sure init doesn’t do its work twice on the same object.

[UPDATE 2010-09-04: I used to have the code inline in the post here; I have since moved it to a Mercurial repository on Bitbucket. You can download PRHEmptySingleton.h and .m from there.]

This is actually even more complex than Apple’s example, but it passes all six of these tests:

  1. sharedFramistan always returns the same object.
  2. alloc/init always produce the same object (which, itself, is the same object as sharedFramistan).
  3. alloc/init will not return an object confused by multiple init messages.
  4. Over-releasing causes a crash.
  5. Keeping alloc/allocWithZone:/retain balanced with release never causes a crash.
  6. If [super init] returns a different object, alloc/init won’t break.

Apple’s [old] example fails test 4. BJ doesn’t show an implementation for his example, but according to his own description of its behavior, it fails test 5. My version passes all six.

Take-aways

Hiding bugs is bad. Even worse is giving code that can hide a bug to new Cocoa programmers who could really use practice in detecting and fixing that kind of bug.

If you really need to implement a singleton, there is a right way to do it. The way currently shown in the Cocoa Fundamentals Guide isn’t it.

Don’t change the behavior of retain, release, retainCount, or autorelease. Ever.

UPDATE 2009-09-04: Removed hasInited instance variable after Christiaan Hofman pointed out its redundancy in the comments.

UPDATE 2009-09-19: Moved assignment of the sharedInstance static variable from +initialize to -init, as suggested by Uli Kusterer.

UPDATE 2010-09-04: Replaced inline code sample with links to the Mercurial repository for the sample.

New tool: The Symbolicator

Sunday, April 19th, 2009

The most significant behind-the-scenes change in Growl 1.1.5 is that we now build using the DWARF-with-dSYM debug symbol format.

Now, we distribute a fully-stripped executable for every release—even betas—and we keep the dSYM bundles on hand separately. The idea here is that we can use the dSYM bundles to obtain symbolic information (function name, filename, and line number) for the bare addresses in users’ crash logs.

Unfortunately, there are no good tools for symbolicating Mac OS X crash logs. If this were an iPhone app, we could just drag the crash logs into the Xcode Organizer window, but Xcode doesn’t do this for Mac crash logs.

I tried all the other tools as well, and every one of them had one of two problems:

  1. Didn’t work at all
  2. Needed the dSYM bundle and main bundle to be next to each other on disk

#2 is not a deal-breaker, but it is a hassle.

So I wrote my own symbolication tool.

The Symbolicator is a Python script that:

  1. Reads in a crash log.
  2. Finds the necessary dSYM bundles using Spotlight. (This means that you can put the dSYM bundles anywhere you want, and as long as Spotlight can find them, the Symbolicator will be able to use them.)
  3. Uses dwarfdump to extract the relevant symbol information.
  4. Replaces the address offsets with the symbol information (just like CrashReporter does when it has debug symbols to work with).
  5. Writes the symbolicated log to stdout.

Use it like this:

% symbolicator < unsymbolicated.crash > symbolicated.crash

Or use ThisService to make a service out of it.

If you want to see it in action right now, download Growl 1.1.5b1 and the corresponding dSYM bundles from the Growl beta page. Make Growl crash (killall -TRAP works well), then unpack the bundles and use the Symbolicator on the crash log. (If you unpack the bundles first, CrashReporter will symbolicate the log before you even get to it. Handy, unless you’re trying to test the Symbolicator. ☺)

To make this work on your own app, follow the instructions in the aforementioned ADC article, then make sure you archive the dSYM bundles for every release, including betas. On Growl, I added code to our Release Makefile for this.

Note: If your app is closed-source, you should not put your dSYM bundles on your website, since the debug symbols are arguably trade secrets (information about your source code). Keep them locally, perhaps on a flash-memory drive. Disclaimer: IANAL.

Don’t pass objects as context pointers

Friday, April 3rd, 2009

Cocoa has several functions and methods that run a panel (usually as a sheet) asynchronously. These functions and methods take a target object, a selector, and a context pointer, and when they finish, they send a message to the target with that selector, passing that context pointer.

A common practice is to pass an object in the context pointer. You retain the object before you call the run-panel function or method, then release it in your callback. For example:

    NSBeginAlertSheet(/*title*/ NSLocalizedStringFromTableInBundle(@"Update Available", nil, bundle, @""),
                      ⋮
                      /*modalDelegate*/ self,
                      /*didEndSelector*/ NULL,
                      /*didDismissSelector*/ @selector(downloadSelector:returnCode:contextInfo:),
                      /*contextInfo*/ (void *)downloadURL,
                      …);

In this example, downloadURL is a local variable, and the object in it was retained above, or maybe not autoreleased in the first place. The downloadSelector:returnCode:contextInfo: method will cast the context pointer back to an object type, probably do something with it, then release it.

This has recently become a visible problem because of the Clang Static Analyzer. The analyzer doesn’t know that NSBeginAlertSheet will retain the object, so it flags this call as a leak. This is the most-often-cited example of the checker’s “false positives”: things that it flags as bugs but that aren’t really bugs.

But is it really not a bug? Is it really appropriate to use an object as a context pointer?

Let’s think about what we’re doing here:

  1. The panel doesn’t know the context pointer is really an object, so it doesn’t retain it. To compensate for this, we’re retaining an object on behalf of another object (the panel). This alone should make you feel dirty.

  2. The panel doesn’t know the context pointer is really an object, so it won’t release the object, either. This is particularly a problem if something releases the panel (quite possible in the case of, say, an NSAlert instance). Then it really is a leak.

  3. If we get released (and, hopefully, take the panel with us or at least cancel it), then we leak the object, because we are only expecting to have to release it (or even to have access to it!) in the callback.

Passing objects through context pointers is an anti-pattern. I hope the Clang Static Analyzer developers never remove this warning, because this is not a false bug—it is a real bug that we should all fix in our programs.

Solutions

That object needs to go in an instance variable somewhere.

The lightweight solution is to create an instance variable on the object that is running the sheet. You store the object in the instance variable when you run the sheet (and leave the context pointer NULL), then retrieve the object from the instance variable in your callback. You’ll still need to retain it when putting it in, and release it both in your callback and in dealloc—those aspects don’t change.

I can hear you saying that this is a waste of an instance variable. What are they, rationed? There’s nothing to waste; instance variables are cheap. And leaking objects is far more of a waste of memory than adding an instance variable.

But this isn’t the only solution.

Another way is to create your own subclass of the panel class (NSOpenPanel, NSSavePanel, or NSAlert), and give the subclass a property you can use instead of the context pointer. You could make it abstract (making the property similar to NSMenuItem’s representedObject, nothing more), or make it specific to your application. The former approach is simpler, whereas the latter gives you a base on which to build a more-customized alert if you later decide you want one. It’s up to you.

The advantage of this solution is that, since you’ve formalized the instance variable as part of a class’s API (and since the instance variable is no longer a bit player but the central part of a class), it’ll be harder for you to forget to manage your ownership of it correctly. Memory problems solved.

One way or another, though, you need to put that object into an instance variable somewhere to avoid leaking memory, feeling dirty, and having the Clang Static Analyzer chew you out.

(And yes, all this applies with garbage collection as well. See the section “GC Gotchas” in episode 36 of Late Night Cocoa, but I disagree with Andre Pang’s recommendation of CFRetain and CFRelease; the right solution is not to dip into Core Foundation, but to avoid the context pointer altogether and use an instance variable instead.)

hg precommit hooks and the Clang Static Analyzer

Friday, April 3rd, 2009

Fraser Speirs has a post about configuring your Git repository to vet commits with the Clang Static Analyzer.

The idea of pre-commit hooks is that you get to run a script before the commit happens. Depending on the result code of the script, the commit will either proceed or be aborted.

I wrote a wrapper around the scan-build tool, so that I could run the analyzer by hand with my preferred options at any time: …

The –status-bugs flag is the trick here: it makes scan-build return a non-zero status code if it detects bugs. That’s what we want with Git pre-commit hooks: a non-zero status indicates a possible bug, and that causes the Git commit to be aborted.

Mercurial, of course, has the same feature. The hg book has instructions; I’ll show you how I set up my repository to do this.

First, I created a shell script named RunClang.zsh:

#!/bin/zsh -f
~/bin/checker-latest/scan-build \
    -checker-cfref -warn-objc-methodsigs -warn-objc-missing-dealloc -warn-objc-unused-ivars \
    --status-bugs -o checker.out \
    xcodebuild -configuration $1

Next, I added my precommit hook to the repository’s .hg/hgrc file:

[hooks]
precommit.RunClang = ~/bin/RunClang.zsh Development

Here’s what an example session looks like:

% echo 'Testing precommit testing hook' >> test.txt
% hg ci -m 'Testing precommit testing hook'
[churn churn churn]
** BUILD SUCCEEDED **
scan-build: 17 bugs found.
scan-build: Run 'scan-view [snip]/growl-boredzo-precommit-test/checker.out/2009-04-03-2' to examine bug reports.
abort: precommit.RunClang hook exited with status 1

(255)% hg log --limit=1
changeset:   4188:b208862a586d
tag:         tip
user:        Peter Hosey
date:        Fri Mar 13 05:40:09 2009 -0700
summary:     Fix encoding of the Norwegian Growl-WithInstaller strings file.

17 bugs—mostly leaks. Glad I didn’t commit this test file!

So now you know how to have Mercurial block you from committing if the clang checker can find bugs. This should also work for your unit tests. And if you have 100% test coverage (lucky!), you can combine them: have scan-build build your test-bundle target. Then, the hook will prevent the commit if the checker can find bugs or any tests fail.

I don’t think I’ll actually use this set-up, though.

First, in order to find bugs, you need to build your entire main product. Any significantly large program is going to take a long time to build and analyze—Growl, for example, takes about one-and-a-quarter minutes for a clean build. Even committing to a Subversion repository over dial-up was quicker.

More significantly, precommit hooks like this interfere with patch queues. The mq extension implements patches as mutable commits, so any qnew or qrefresh will run the hook. It would be useful on qfinish, but it’s just annoying on qnew and especially qrefresh, as the all-too-frequent builds thwart rapid iteration.

So, if you use patch queues, this won’t work for you. But, if you don’t, then this should work as well in Mercurial as it does in Git.