Parsing the preprocessor

2006-06-15 12:20:00 -08:00

If you’ve ever run GCC’s preprocessor alone and looked at its output, you’ve seen lines like these:

# 1 "/usr/include/sys/types.h"
# 1 "<built-in>"
# 1 "<command line>"
# 1 "/usr/include/sys/types.h"
# 66 "/usr/include/sys/types.h"
# 1 "/usr/include/sys/appleapiopts.h" 1 3 4
# 67 "/usr/include/sys/types.h" 2


# 1 "/usr/include/sys/cdefs.h" 1 3 4
# 70 "/usr/include/sys/types.h" 2

And you probably wondered what all that means. Here’s your secret decoder ring.

First, these are called “line markers” in libcpp. After the leading #, a line marker consists of:

  1. A line number
  2. The path to the relevant file, in double quotes
  3. Zero or more flags, separated by spaces

The flag values are:

  1: Push (enter) header
  2: Pop (leave) header
  3: This is a system header (determined by these rules with this modification)
  4: Requires extern "C" protection (determined by the same rules as above); never found without 3

Note that a pop marker names the file being returned to, not the file being left: the header that gets popped is the one above the referenced file in the include stack.
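To make the format concrete, here is a minimal sketch of a line-marker parser in Python. It is not how libcpp does it; the names parse_line_marker and LineMarker are my own, and the regular expression assumes the path never needs unescaping (any backslash escapes are kept as written).

```python
import re
from typing import NamedTuple, Optional

# '#', whitespace, a line number, a double-quoted path, then zero or more flags (1-4).
_MARKER_RE = re.compile(r'^#\s+(\d+)\s+"((?:[^"\\]|\\.)*)"((?:\s+[1-4])*)\s*$')


class LineMarker(NamedTuple):
    line: int          # line number in the named file
    path: str          # path exactly as written between the quotes
    flags: frozenset   # subset of {1, 2, 3, 4}


def parse_line_marker(text: str) -> Optional[LineMarker]:
    """Return a LineMarker, or None if the text is not a line marker."""
    m = _MARKER_RE.match(text)
    if m is None:
        return None
    flags = frozenset(int(f) for f in m.group(3).split())
    return LineMarker(int(m.group(1)), m.group(2), flags)


marker = parse_line_marker('# 1 "/usr/include/sys/appleapiopts.h" 1 3 4')
print(marker.line, marker.path, sorted(marker.flags))
```

Anything that does not match, including ordinary source text, simply comes back as None.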

Example:

# 66 "/usr/include/sys/types.h"
# 1 "/usr/include/sys/appleapiopts.h" 1 3 4
# 67 "/usr/include/sys/types.h" 2

  1. Fast-forward to line 66 of <sys/types.h> (nothing interesting occurs before this line).
  2. Enter <sys/appleapiopts.h>. Everything from this point until the next marker is from that header. Note that this header is a system header (3) and requires extern "C" protection (4).
  3. As it turns out, nothing interesting happened there. So the very next line is a pop marker: <sys/appleapiopts.h> is popped, so now we’re back in <sys/types.h>, now on line 67 (the line after the #include <sys/appleapiopts.h>).
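The three steps above can be replayed in code. This is only a sketch under my own assumptions: track_includes is a hypothetical helper, it treats a marker with neither flag 1 nor flag 2 as “same file, new position”, and it does not check that a pop marker really names the file underneath the one being left.

```python
import re

# Same shape as a line marker: '# <line> "<path>" <flags...>'.
_MARKER_RE = re.compile(r'^#\s+(\d+)\s+"([^"]*)"((?:\s+[1-4])*)\s*$')


def track_includes(lines):
    """Return the include stack (outermost file first) after each line marker."""
    stack = []
    history = []
    for text in lines:
        m = _MARKER_RE.match(text)
        if m is None:
            continue  # ordinary source text or some other directive
        path = m.group(2)
        flags = {int(f) for f in m.group(3).split()}
        if 1 in flags:
            stack.append(path)      # push: entering a header
        elif 2 in flags:
            if stack:
                stack.pop()         # pop: leave the header on top
            if not stack:
                stack.append(path)  # input started in the middle of an include
        elif stack:
            stack[-1] = path        # no push/pop: just a new position
        else:
            stack.append(path)      # very first marker
        history.append(tuple(stack))
    return history


example = [
    '# 66 "/usr/include/sys/types.h"',
    '# 1 "/usr/include/sys/appleapiopts.h" 1 3 4',
    '# 67 "/usr/include/sys/types.h" 2',
]
for stack in track_includes(example):
    print(" > ".join(stack))
```

For the example, the stack grows to two entries on the push marker and shrinks back to just <sys/types.h> on the pop marker.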

The relevant code in libcpp is in directives.c. The function that parses line markers is do_linemarker; presumably the compiler uses it to read the markers back in, since the preprocessor itself is what generates them. Additional include-related code is in files.c.

UPDATE 23:24 PDT: Beware of pragmas. It seems obvious now, but I didn’t think of it earlier: the preprocessor leaves #pragma directives untouched, since they’re meant for the compiler rather than the preprocessor. So if you’re scanning only for line markers, a pragma can trip you up unless you recognize it and handle or skip it.
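One way to avoid that trap is to classify every line before acting on it. A rough sketch, with classify as a made-up name and the assumption that a pragma may have optional whitespace around the #:

```python
import re

_MARKER_RE = re.compile(r'^#\s+\d+\s+"[^"]*"(?:\s+[1-4])*\s*$')
_PRAGMA_RE = re.compile(r'^\s*#\s*pragma\b')


def classify(text):
    """Label one line of preprocessor output as 'marker', 'pragma', or 'text'."""
    if _MARKER_RE.match(text):
        return "marker"
    if _PRAGMA_RE.match(text):
        return "pragma"  # passed through for the compiler; not a line marker
    return "text"


for line in ['# 67 "/usr/include/sys/types.h" 2',
             '#pragma pack(push, 1)',
             'typedef int example_t;']:
    print(classify(line), "|", line)
```

Checking for the marker shape first means a line starting with # is never assumed to be a marker just because of that first character.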


Do not delete the second sentence.