Archive for the 'Creations' Category

New service: del.icio.us Info for URL

Sunday, May 20th, 2007

I just posted my second ThisService-created service: del.icio.us Info for URL. Select a URL within some text, then invoke the service, and it will open the del.icio.us info page for the URL (assuming that at least one person has bookmarked it) in your browser.

AFAIK, this is also the first pure-shell-script service.

The most efficient way to waste time

Wednesday, May 16th, 2007

In profiling CPU Usage, I need to get my CPUs busy so that the CPU-usage views have something to do. This means that I need a program to busy-wait.

Busy-waiting means running a tight loop that doesn’t actually do anything except run. In C, the most efficient such loop is:

for(;;);

That’s all well and good, but it only busy-waits a single processor. I have four, and I need 1 < n < 4 of them to be lit up so that CPU Usage has something to indicate (otherwise it will sit there showing 0-0-0-0, which doesn’t make for good profiling—busy processors will jump around a bit, which gives CPU Usage something to do).

Now, my first approach was to write this in Python. That’s my go-to language for anything without a GUI. Here’s what came out:

#!/usr/bin/env python

def busy_wait():
    from itertools import repeat
    for x in repeat(None):
        pass

import thread, sys
try:
    num_threads = int(sys.argv[1])
except IndexError:
    num_threads = 100

for i in xrange(1, num_threads): #We'll do the first one ourselves after starting all the other threads.
    throwaway = thread.start_new_thread(busy_wait, ())
busy_wait()

Looks good, right?

What was weird was that I couldn’t seem to get it to max out all my processors, even with num_threads=5000. That seemed mighty suspicious.

It was then that I remembered the Global Interpreter Lock.

You see, in CPython, only one thread can be running Python code at a time. (Exceptions exist for things like I/O, of which my program contains none.) This means that my yummy multithreaded busy-wait program—being purely Python—was effectively running single-threaded.

I reimplemented the program in pure C. Not only does it run much more efficiently now (no interpreter overhead), but it also requires far fewer threads: Four threads will light up all four processors. Victory!

If you want a copy for yourself, here it is. It takes one argument: the number of threads to spawn. It defaults to the number of logical CPUs in your machine (HW_NCPU in sysctl), so if you just run it with no arguments, it will spawn one thread per processor.
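
The linked download is the real thing; purely for illustration, here is a minimal sketch of the same idea (spawn N busy-waiting POSIX threads, with N defaulting to the hw.ncpu sysctl value), which is not the tool’s actual source:

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/sysctl.h>

/* Sketch only: busy-wait one thread per logical CPU (or per the argument). */

/* Each thread just spins forever. */
static void *busy_wait(void *unused) {
    (void)unused;
    for (;;);
    return NULL;
}

int main(int argc, char **argv) {
    int num_threads = (argc > 1) ? atoi(argv[1]) : 0;

    if (num_threads < 1) {
        /* Default to one thread per logical CPU. */
        int mib[2] = { CTL_HW, HW_NCPU };
        size_t len = sizeof(num_threads);
        if (sysctl(mib, 2, &num_threads, &len, NULL, 0) != 0 || num_threads < 1)
            num_threads = 1;
    }

    /* Spawn num_threads - 1 workers, then spin on the main thread ourselves. */
    for (int i = 1; i < num_threads; ++i) {
        pthread_t tid;
        pthread_create(&tid, NULL, busy_wait, NULL);
    }
    busy_wait(NULL);
    return 0;
}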

New utility: EasyMD5

Thursday, May 10th, 2007

I’ve just released a simple application called EasyMD5. All it does is compute an MD5 hash for any file you drop on it.
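
EasyMD5’s own source isn’t shown here, but the underlying computation is simple. Here’s a rough sketch using CommonCrypto (the print_md5_of_file function is just for illustration, not EasyMD5’s actual code):

#include <CommonCrypto/CommonDigest.h>
#include <stdio.h>

/* Illustration only: computes the MD5 digest of the file at `path` and prints
   it as hex. Returns 0 on success, -1 on failure. */
static int print_md5_of_file(const char *path) {
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;

    CC_MD5_CTX context;
    CC_MD5_Init(&context);

    unsigned char buffer[4096];
    size_t bytesRead;
    while ((bytesRead = fread(buffer, 1, sizeof(buffer), f)) > 0)
        CC_MD5_Update(&context, buffer, (CC_LONG)bytesRead);
    fclose(f);

    unsigned char digest[CC_MD5_DIGEST_LENGTH];
    CC_MD5_Final(digest, &context);

    for (int i = 0; i < CC_MD5_DIGEST_LENGTH; ++i)
        printf("%02x", digest[i]);
    putchar('\n');
    return 0;
}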

I plan to use this to help debug the “your disk image doesn’t work” reports that we occasionally get on the Adium feedback list.

New service: Make Obj-C Accessors

Thursday, April 19th, 2007

A pair of services, actually: One to generate declarations, and the other to generate definitions. Simply select a run of instance variable declarations (ideally copied and pasted into the area where you want the methods to go), then invoke the service. Couldn’t be simpler—at least until Obj-C 2.0 arrives.

New utility: qtsetclip

Wednesday, April 11th, 2007

I’ve just released my latest command-line utility, qtsetclip. This is a utility that allows you to set the clipping region of a QuickTime movie to a rectangle you specify. I use it in producing the Adium screencasts, since Final Cut Express limits me to choosing among certain predefined resolutions. (Further details will appear on the Adium Blog on Saturday.)

A novel way to reduce the size of a grayscale PNG file

Sunday, April 8th, 2007

Today, I scanned in one of my old drawings: a study of five-pointed stars that I made when I was trying to figure out how to draw a proper star (this was back when I was working on Keynote Bingo MWSF2007 Edition, and a derivative of the same star is used in TuneTagger).

The odd thing is, after I corrected the image using Preview’s Black Point and Aperture controls (no relation to the photo-management program), the image weighed about two-fifths as much:

du -b Five-pointed\ star\ study*
1403443 Five-pointed star study-adjusted levels.png
3346498 Five-pointed star study.png

(These sizes are after pngout, but even if I re-correct the original image and save it elsewhere, it comes out 1790244 bytes long.)

Go figure.

Negative Turing Test now supports deletion

Wednesday, April 4th, 2007

As of r58, you can now tell Negative Turing Test to delete spam comments instead of marking them as spam. (This is in the NTT Options pane.)

I just turned this on here. It worked fine on the test post; we’ll see how well it works in real usage.

Oh, and in case you ever need to delete a comment from a WP plug-in: Use wp_set_comment_status. I thought for so long that WP had no programmatic way to delete comments—now I know that it does.

CPU Usage 0.4

Tuesday, March 27th, 2007

Those of you with multiprocessor Macs may have been eagerly awaiting this, and now it’s here. CPU Usage version 0.4 makes the meter work correctly for multiple CPUs. (Obviously, actually having a multiprocessor Mac helped me test it. I went through 25 alphas back when I was on the Cube; thanks go out to my three testers for banging on those.)

The other big thing in the 0.4 release is that you can now have your CPU usage meter in the Dock tile. You can have the floater, the Dock tile, or both. I prefer having a floater up the right side of my screen, but if you’d rather have the meter in your Dock, now you can.

Here's a screenshot of my floater in CPU Usage 0.4.

New utility: exif-confer

Monday, March 26th, 2007

Not too long ago, I was at the bank and decided to take this photo of a couple of magazines sitting next to each other. As you can see, I edited out the bank’s address.

I did this using Lineform. The problem is, Lineform is a vector app, so it doesn’t keep any EXIF data from the original image (most of the time, that would not make sense). In my situation, I did want to keep the EXIF info, but there’s no way to make Lineform do that.

So I wrote a command-line tool to bring EXIF properties over from one image to another image. I call this tool exif-confer. Enjoy.

Beads how-to updated

Sunday, March 18th, 2007

I just updated the two paper images from my How to draw Beads article.

Brief recap: Each bead in the application has a sheen. The how-to includes two mock-ups that I drew on paper so that I could more easily write the Quartz code to draw them in the computer.

Those who saw the article before may remember how awful the mock-ups looked. In case you didn’t or you don’t, here’s how awful they looked:

First draft of the bead sheen.
Second/final draft of the bead sheen.

Yowch. I didn’t have a scanner back then, so I did the best I could with my Zire 71’s built-in camera. Mainly, they suffered from poor lighting: the Zire 71 is very sensitive to having or not having Just the Right Amount of light. In this case, I was on the not-enough side, so I had to use Photoshop to try to make up the difference.

I said to myself that when I got a scanner, I’d redo the images. I got one last week, so now I’ve done it and you can see the glorious results:

First draft of the bead sheen.
Second/final draft of the bead sheen.

If you’ve never seen the how-to before, have a look. There’s some good info there on drawing and Quartz.

Sweet, somebody else is now using Negative Turing Test!

Saturday, March 17th, 2007

From time to time, I check a Technorati search for my blog, which I have bookmarked, to see who’s linking to me and what they have to say.

Today I find that chucker has installed NTT. Cool!

chucker: I invite you to email me about you getting spam every few minutes. I’m interested to hear more specifics. My address is on the front page. (Initial hunch: Try turning off Akismet. I gave up on Akismet after all the false positives we had when we tried using it on Adium Trac; as such, I’ve never used it here.)

LMX and Adium message history Q&A

Saturday, March 17th, 2007

There’s been some discussion of LMX on the web since I announced LMX 1.0’s release. As I mentioned then, LMX is the library that powers Adium’s message history feature. Mostly, people have questioned whether XML was the best choice for logging given the message history requirement.

I recommend first reading my post on the Adium blog about message history. (If you came here from the Adium blog, sorry for the bouncing back and forth—that’s the last bounce, I promise. ;)

Welcome back. Let’s begin.

The questions and objections listed here are drawn from the comments on my LMX 1.0 announcement post, this article on the O’Reilly XML blog, the reddit post about LMX, and Tim Bray’s mention of LMX.

  • Why not just store the messages in reverse order? Then you wouldn’t need a backward parser; you could retrieve the n most recent messages from the top with an ordinary parser.

    Because file I/O doesn’t have an insert mode; you can only overwrite or append. That means that Adium would have to rewrite the entire rest of the file every time it inserted a message (which is whenever a message is received or sent). That would get very expensive for long transcripts, and some Adium users leave their chats open all the time, so their transcripts would indeed get very long.

  • How do you append to the transcript? You must have to leave off the end tag, which means that the file is not a valid XML document until you close it, which would be bad if Adium crashed, since the end tag would never get written and the transcript would be broken XML.

    Not so. This time, overwrite behavior is our friend: Adium simply overwrites the </chat> tag each time it writes a message, and appends a new </chat> tag in the same write. The file is always a valid XML document, thanks to overwriting. (There’s a sketch of this trick after this list.)

    Yes, this is slightly wasteful, but the waste here is constant (that is, it does not go up over time) and insignificant. The upsides vastly outweigh the downsides.

  • Why go with XML if you have to perpetrate such hackery as a backward parser? Why not use SQLite or a plain-text format?

    SQLite: We would have had to include it with Adium, since Adium 1.0’s minimum requirement was OS X 10.3, and SQLite has only been bundled with Mac OS X since 10.4. LMX is much smaller than SQLite. Also, we’re not big on formats that aren’t directly human-readable.

    Plain-text format: A simple format (e.g. TSV) would have some growing pains if we ever wanted to grow (or shrink) the format, and a more complex format would require a new parser from the ground up just like XML does. For this purpose, we like XML’s trade-off between readability and extensibility, and LMX fills in the gap for reading from the end.

    For more on formats we didn’t elect and why not, you can read our LogFormatIdeas page on the Trac (deprecated since we chose a format, but still around for posterity).

  • How will you determine the encoding of the data, or read entity declarations? Those things are at the start of the file, and you’re parsing from the end.

    LMX naïvely assumes that the data is UTF-8 and that the application knows about any entities it will need. Yes, this is wrong, but Adium didn’t need anything different.

    Either 2.0 or 3.0 will do a forward parse until the opening tag of the root element, in order to discover the actual encoding and any entity declarations. (I’m not doing it in a 1.0 version because 1.0’s parser is a hedge of thorns, and I’m not willing to touch it for something that most people won’t need anyway. And I’m tempted to leave this out of 2.0 as well, since 2.0 will be a big enough version with its rewrite of the parser in pure C.)

  • How does LMX tell whether --> is the end of a comment or simply an unescaped > following two hyphens?

    Simple: It assumes it’s the end of a comment.

    There’s no way to definitively find out one way or another without scanning all the way to the start of the data and backtracking. This is one of the pitfalls of a backward parser. It’s the nature of the game, so all I can do is say “make really sure you’re feeding good XML to the parser”. That includes not having unescaped ‘>’s in your text.

  • What about storing one message per line and scanning through the file line-by-line?

    You can have a valid XML log file without that constraint, and constraints like that are the sort of detail you don’t want to rely on, because other apps can break them. (To Tim Bray: Part of the point of the Unified Logging Format is that we want other IM/chat clients to use it, which means that we should be forgiving when their output doesn’t exactly match ours.)

  • Can I grep these logs?

    Mostly. You can grep an XML log in the usual way, but your expression can’t contain non-ASCII characters, <, >, or & unless you replace them with the appropriate entity references. We recommend using the search field in the Chat Transcript Viewer anyway.
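
As promised above, here’s a rough illustration of the end-tag overwrite trick. This is generic C stdio, not Adium’s actual logging code; the append_message function and the assumption that the file ends with "</chat>" plus a newline are just for the sketch.

#include <stdio.h>
#include <string.h>

/* Sketch only: appends one already-serialized <message> element to an open
   transcript while keeping the file a valid XML document at all times. */
static int append_message(FILE *transcript, const char *message_xml) {
    static const char end_tag[] = "</chat>\n";

    /* Seek back over the existing end tag... */
    if (fseek(transcript, -(long)strlen(end_tag), SEEK_END) != 0)
        return -1;

    /* ...then overwrite it with the new message plus a fresh end tag, in one write. */
    if (fprintf(transcript, "%s%s", message_xml, end_tag) < 0)
        return -1;

    return fflush(transcript);
}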

I’m glad I finally announced LMX 1.0—not just because it is now, finally, out the door, but also because people have suggested new alternatives that we on the Adium team never thought of. For example, this reddit comment suggests saving one file per message (in a directory per chat), and this other one suggests inserting a fake start tag before the -nth message element, and the O’Reilly article suggests a hybrid XML+binary format. We never thought of any of these.

To be totally clear, we’re not switching—this post is a clarification, not an announcement. Two of those ideas won’t work for various reasons; the problem with the one-file-per-message idea can be overcome by tarring old chats. But LMX is not a future plan—we’ve written it and it’s here, and the same goes for the Unified Logging Format.

Call it inertia, but replacing either one with something else will require either the existing solution to break or the proposed replacement to exhibit massive, world-changing superiority. These things are done and they work, so at this point, we’re not going to rock the boat. It ain’t broke anymore, so we’re not fixing it.

The design for LMX 2.0

Monday, March 12th, 2007

LMX 1.0 didn’t really have much design to it. I set out to clone NSXMLParser’s API, which I did, but didn’t give a whole lot of thought to how I would actually implement the parser.

As a result, the parser itself is one humongous method that takes a lot of effort to read. It is only navigable at all because I had the foresight to put in lots of #pragma marks.

LMX 2.0 will not make that mistake. This time, there’s a design, and the parser will not all be in one function. Here’s the design, which I drew on a quadrille pad:

All states have prefix “lmx_state_”. All states are functions; struct LMXParser's “state” member has type LMXParserStateFunc, which is a function pointer. There is also a “saved_state” member, used when entering entity_ref state. parser->state is called for every character in the XML data.

To be explicit, these are all implementation details which will not be exposed to clients of the API.
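
To make that concrete, here’s a rough sketch of what the design looks like in C. The names LMXParser, LMXParserStateFunc, state, saved_state, and the lmx_state_ prefix come from the notes above; the field layout, the example states, and the driver loop are placeholders, not LMX 2.0’s actual source (which doesn’t exist yet).

/* Sketch of the planned state-function design; details are placeholders. */
#include <stddef.h>

struct LMXParser;
typedef void (*LMXParserStateFunc)(struct LMXParser *parser, char c);

struct LMXParser {
    LMXParserStateFunc state;       /* called once per character */
    LMXParserStateFunc saved_state; /* restored when leaving entity_ref state */
};

static void lmx_state_text(struct LMXParser *parser, char c);
static void lmx_state_entity_ref(struct LMXParser *parser, char c);

static void lmx_state_text(struct LMXParser *parser, char c) {
    if (c == ';') {
        /* Reading backward, ';' is where an entity reference ends. */
        parser->saved_state = parser->state;
        parser->state = lmx_state_entity_ref;
    }
    /* ...accumulate characters, watch for '>' to enter a tag state, etc. */
}

static void lmx_state_entity_ref(struct LMXParser *parser, char c) {
    if (c == '&') {
        /* Reading backward, '&' is where the reference began, so leave this state. */
        parser->state = parser->saved_state;
    }
    /* ...accumulate the entity name... */
}

/* The driver hands every character of the XML data, last to first,
   to whatever state function is current. */
static void lmx_parse_backward(struct LMXParser *parser, const char *data, size_t length) {
    for (size_t i = length; i > 0; --i)
        parser->state(parser, data[i - 1]);
}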

And the scanner I used to import this from dead tree format is the CanoScan LiDE 600F I mentioned in passing in my post about my HP M425 camera.

One month of Negative Turing Test

Monday, March 12th, 2007

One of Negative Turing Test’s most recent features is a counter of how many spams it eats. I added this feature last month, and made a note to reveal today what it got up to.

The number of spams blocked by NTT from 2007-02-12 to 2007-03-12 is:

6,220

I should probably get around to making it delete those…

LMX 1.0 released

Saturday, March 3rd, 2007

Some of you know that I’m a developer on Adium. (Hopefully all of you; it is mentioned in the sidebar. ;)

Adium has a feature called “message history”. When you open a new chat with a person, message history shows you the last n messages from your previous chat with that person. Since 1.0 (which changed message history to draw from the logs rather than separate storage and changed the log format to be XML rather than bastardized HTML—more info on the Adium blog post), message history has been implemented using a library that I wrote called LMX.

LMX is a reverse XML parser. Whereas most XML parsers (AFAIK, all of them except LMX) parse the XML data from the start to the end, LMX parses it from the end to the start. Thus, while characters are kept in their original order (“foo” will still be “foo”; it will not become “oof”), everything else is reported in the reverse order: elements close before they are opened, and appear from last to first. All this is by design, so that Adium can retrieve the last n message elements without having to parse all the message elements before them.
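
One practical consequence of parsing from the end is that you also feed the parser the data from the end. A generic way to do that (an illustration, not LMX’s own I/O code) is to read fixed-size chunks backward from the end of the file:

#include <stdio.h>

enum { CHUNK_SIZE = 4096 };

/* Illustration only: calls handle_chunk() for successive chunks of the file,
   last chunk first. Within each chunk the bytes are still in forward order;
   a backward parser walks them from the end. Returns 0 on success, -1 on error. */
static int for_each_chunk_backward(FILE *f,
                                   void (*handle_chunk)(const char *bytes, size_t length)) {
    if (fseek(f, 0L, SEEK_END) != 0)
        return -1;
    long remaining = ftell(f);
    if (remaining < 0)
        return -1;

    char buffer[CHUNK_SIZE];
    while (remaining > 0) {
        long chunk = (remaining < CHUNK_SIZE) ? remaining : CHUNK_SIZE;
        if (fseek(f, remaining - chunk, SEEK_SET) != 0)
            return -1;
        if (fread(buffer, 1, (size_t)chunk, f) != (size_t)chunk)
            return -1;
        handle_chunk(buffer, (size_t)chunk);
        remaining -= chunk;
    }
    return 0;
}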

Today, LMX gets its very own webpage (not just a page on the Adium wiki, but a real webpage), and is released at version 1.0. It’s the same code as shipped with Adium 1.0.1, but shined up into a release tarball.

So, if you too ever find yourself in desperate need of a reverse XML parser, now there is one.

What’s the resolution of your screen?

Sunday, February 4th, 2007

A few weeks ago, I installed Adobe Reader to view a particular PDF, and noticed something interesting in its Preferences:

Its Resolution setting is set by default to “System setting: 98 dpi”.

“Wow”, I thought, “I wonder how it knows that.” So I went looking through the Quartz Display Services documentation, and found it.

The function is CGDisplayScreenSize. It returns a struct CGSize containing the physical size of the screen in millimeters. Convert each dimension to inches (divide by 25.4), divide the number of pixels in that dimension by the result, and you’ve got DPI.

Not all displays support EDID (which is what the docs for CGDisplayScreenSize say it uses); if yours doesn’t, CGDisplayScreenSize will return CGSizeZero. Watch for this; failure to account for this possibility will lead to division-by-zero errors.
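
In code, the computation, including the zero check, comes down to something like this (a sketch for the main display, not ShowAllResolutions’ actual source):

#include <ApplicationServices/ApplicationServices.h>
#include <stdio.h>

int main(void) {
    CGDirectDisplayID display = CGMainDisplayID();

    /* Physical size in millimeters; CGSizeZero if the display has no EDID info. */
    CGSize sizeInMM = CGDisplayScreenSize(display);
    if (sizeInMM.width <= 0.0 || sizeInMM.height <= 0.0) {
        fprintf(stderr, "No physical size available for this display\n");
        return 1;
    }

    /* Pixel dimensions of the display's current mode. */
    size_t pixelsWide = CGDisplayPixelsWide(display);
    size_t pixelsHigh = CGDisplayPixelsHigh(display);

    /* 25.4 mm per inch: pixels divided by inches = dots per inch. */
    double dpiX = pixelsWide / (sizeInMM.width  / 25.4);
    double dpiY = pixelsHigh / (sizeInMM.height / 25.4);
    printf("%.2f x %.2f dpi\n", dpiX, dpiY);
    return 0;
}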

Here’s an app to demonstrate this technique:

ShowAllResolutions' main window: “Resolution from Quartz Display Services: 98.52×96.33 dpi. Resolution from NSScreen: 72 dpi.”

ShowAllResolutions will show one of these windows on each display on your computer, and it should update if your display configuration changes (e.g. you change resolution or plug/unplug a display). If CGDisplayScreenSize comes back with CGSizeZero, ShowAllResolutions will state its resolution as 0 dpi both ways.

The practical usage of this is for things like Adobe Reader and Preview (note: Preview doesn’t do this), and their photographic equivalents. If you’re writing an image editor of any kind, you should consider using the screen resolution to correct the magnification factor so that an 8.5×11″ image takes up exactly 8.5″ across (and 11″ down, if possible).

“Ah,” you say, “but what about Resolution Independence?”

The theory of Resolution Independence is that in some future version of Mac OS X (possibly Leopard), the OS will automatically set the UI scale factor so that the interface objects will be some fixed number of (meters|inches) in size, rather than some absolute number of pixels. So in my case, it would set the UI scale factor to roughly 98/72, or about 1+⅓.

This is a great idea, but it screws up the Adobe Reader theory of automatic magnification. With its setting that asks you what resolution your display is, it inherently assumes that your virtual display is 72 dpi—that is, that your UI is not scaled. Multiplying by 98/72 is not appropriate when the entire UI has already been multiplied by this same factor; you would essentially be doing the multiplication twice (the OS does it once, and then you do it again).

The solution to that is in the bottom half of that window. While I was working on ShowAllResolutions, I noticed that NSScreen also has a means to ascertain the screen’s resolution: [[[myScreen deviceDescription] objectForKey:NSDeviceResolution] sizeValue]. It’s not the same as the Quartz Display Services function, as you can see; it seemingly returns { 72, 72 } constantly.

Except it doesn’t.

In fact, the size that it returns is premultiplied by the UI scale factor; if you set your scale factor to 2 in Quartz Debug and launch ShowAllResolutions, you’ll see that NSScreen now returns { 144, 144 }.

The Resolution-Independent version of Mac OS X will probably use CGDisplayScreenSize to set the scale factor automatically, so that on that version of Mac OS X, NSScreen will probably return { 98.52, 98.52 }, { 96.33, 96.33 }, or { 98.52, 96.33 } for me. At that point, dividing the resolution you derived from CGDisplayScreenSize by the resolution you got from NSScreen will be a no-op, and the PDF view will not be doubly-magnified after all. It will be magnified by 133+⅓% by the UI scale factor, and then magnified again by 100% (CGDisplayScreenSize divided by NSDeviceResolution) by the app.

Obviously, that’s assuming that the app actually uses NSScreen to get the virtual resolution, or corrects for HIGetScaleFactor() itself. Adobe Reader doesn’t do that, unfortunately, so it suffers the double-multiplication problem.

So, the summary:

  • To scale your drawing so that its size matches up to real-world measurements, scale by NSDeviceResolution divided by { 72.0f, 72.0f }. For example, in my case, you would scale by { 98.52, 96.33 } / { 72.0, 72.0 } (that is, the x-axis by 98.52/72 and the y-axis by 96.33/72). The correct screen to ask for its resolution is generally [[self window] screen] (where self is a kind of NSView).
  • You do not need to worry about HIGetScaleFactor most of the time. It is only useful for things like -[NSStatusBar thickness], which return a number of pixels rather than points (which is inconvenient in, say, your status item’s content view).

Weekly Cocoa app challenge #3 solution

Thursday, January 18th, 2007

HERE THERE BE SPOILERS. If you intend to participate in the challenge, read no further because my solution follows.

DudeMenu-PRH-1.0-source.tbz

Source and executable are both included. It took me about an hour and 19 minutes to complete. Source-control-wise, this is revision 7 of its SVN repository.

Things used therein:

  • Acquisition of and intake from a plist
  • Definition of custom symbols for the auto-documentation of dictionary keys
  • Dynamically building a menu
  • Validation of input, both with alerts and assertions
  • Many, many comments
  • String localization
  • String localization using a custom table
  • Primitive token-replacement (primitive because there is no provision for escaping)
  • Inspection of the current locale
  • A status item
  • Menu items representing other objects
  • An app without a main menu (as in NSApp’s mainMenu)
  • LSUIElement
  • A menu item that messages NSApp directly
  • Retrieval of images from the search path by name
  • Walking arrays
  • Auto-incrementation

A Core-Image-less Image Unit

Wednesday, January 17th, 2007

Can you imagine an Image Unit that didn’t actually use Core Image?

I just wrote one.

Well, OK, so I did use CIFilter and CIImage — you can’t get away without those. But I did not use a CIKernel. That’s right: This simple filter does its work without a kernel.

For the uninitiated, a kernel is what QuartzCore compiles to either a pixel shader or a series of vector (AltiVec or SSE) instructions. All Image Units (as far as I know) use one — not only because it’s faster than any other way, but because that’s all you see in the documentation.

But I was curious. Could an Image Unit be written that didn’t use a kernel? I saw nothing to prevent it, and indeed, it does work just fine.

The image unit that I wrote simply scales the image by a multiplier, using AppKit. I call it the AppKit-scaling Image Unit. Feel free to try it out or peek at the source code; my usual BSD license applies.

Obviously, this Image Unit shouldn’t require a Core Image-capable GPU.

The first working version of Negative Turing Test

Friday, January 12th, 2007

I just committed revision 44 of Negative Turing Test, and am running it now on this blog (and I’ve turned off the Comment Authorization plug-in, which is what used to email you when you commented, prompting you to approve your own comment). It now correctly blocks spam and allows ham; these being the minimum requirements, I call r44 the first working version of NTT. Feel free to try it out on your own blog — or on mine — and report any problems (preferably using the Google Code issue tracker, but email‘s fine too).

I’m not quite done with it. My next step is to add an option for it to outright delete spam instead of simply stamping it “spam” and saving it for some spam-studying plug-in that I don’t use. I’m confident that I will never see a false positive, and if I ever see a false negative, I can simply change the problem that it poses in order to avoid future false negatives.

The plug-in comes with no default challenge: All fields are empty. This means that if you want to block spam with it, you’ll need to think of a challenge to put in there. Please don’t borrow mine, as I put in no default for a reason: If there’s a default (or a really popular challenge), the spammers will pre-program their bots with the correct response and the plug-in will be defeated (and all NTT users who’ve used that challenge will have to change it, and/or will send me a bunch of email). I recommend searching a book of easy jokes or logic riddles.

All known Keynote Bingoes for MWSF 2007

Tuesday, January 9th, 2007