Everything you need to know about pointers in C

DELTA FROM 1.2




Starting off

Updated to refer to Intel processors instead of PowerPC processors.

This variable occupies some memory. On current mainstream Intel processors, it occupies four bytes of memory (because an int is four bytes wide).


Interlude: Declaration syntax

Declarations are not statements in C. These are two of many corrections to the use of the word “statement” to describe what is actually a declaration:

The obvious way to declare two pointer variables in a single declaration is:

int* ptr_a, ptr_b;


C's declaration syntax ignores the pointer asterisks when carrying a type over to multiple declarations. If you split the declaration of ptr_a and ptr_b into multiple declarations, you get this:

int *ptr_a;
int ptr_b;


Added:

Further reading: The right-left rule for reading C declarations.


Dereferencing

Another declaration-referred-to-as-statement fix, and removed references to the PowerPC.

int bar = *foo_ptr;

In this declaration, the dereference operator (prefix *, not to be confused with the multiplication operator) looks up the value that exists at an address. (This is called a “load” operation.)

It's also possible to write to a dereference expression (the C way of saying this: a dereference expression is an lvalue, meaning that it can appear on the left side of an assignment):

*foo_ptr = 42; /* Sets foo to 42 */

(This is called a “store” operation.)


Interlude: Arrays

Replaced claims that arrays always decay to pointers with explanation of the difference between arrays and pointers, and (hopefully) clearer explanation of when they do decay.

Here's a declaration of a three-int array:

int array[] = { 45, 67, 89 };

Note that we use the [] notation because we are declaring an array. int *array would be illegal here; the compiler would not accept us assigning the { 45, 67, 89 } initializer to it.

This variable, array, is an extra-big box: three ints' worth of storage.

Removed:

But here’s a little secret: you can never refer to this array again.

‘What?’ you say. ‘But the compiler lets me do that! Watch!’

printf("%p\n", array); /* Prints some hexadecimal string like 0x12307734 */

Ah, but what does %p mean?

It means ‘pointer’.

When you use the name of an array in your code, you actually use a pointer to its first element (in C terms, &array[0]). This is called ‘decaying’: the array ‘decays’ to a pointer. Any usage of array is equivalent to if array had been declared as a pointer (with the exception that array is not an lvalue: you can’t assign to it or increment or decrement it, like you can with a real pointer variable).

Added:

One neat feature of C is that, in most places, when you use the name array again, you will actually be using a pointer to its first element (in C terms, &array[0]). This is called “decaying”: the array “decays” to a pointer. Most usages of array are equivalent to if array had been declared as a pointer.

There are, of course, cases that aren't equivalent. One is assigning to the name array by itself (array = )—that's illegal.

Another is passing it to the sizeof operator. The result will be the total size of the array, not the size of a pointer (for example, sizeof(array) using the array above would evaluate to (sizeof(int) = 4) × 3 = 12 on a current Mac OS X system). This illustrates that you are really handling an array and not merely a pointer.

In most uses, however, array expressions work just the same as pointer expressions.

So, for example, let's say you want to pass an array to printf. You can't: When you pass an array as an argument to a function, you really pass a pointer to the array's first element, because the array decays to a pointer. You can only give printf the pointer, not the whole array. (This is why printf has no way to print an array: It would need you to tell it the type of what's in the array and how many elements there are, and both the format string and the list of arguments would quickly get confusing.)

Decaying is an implicit &: array, &array, and &array[0] all evaluate to the same address. In English, these expressions read “array”, “pointer to array”, and “pointer to the first element of array” (the subscript operator, [], has higher precedence than the address-of operator). In C, all three point at the same place, although &array has a different type (pointer to an array of three ints) from the other two (pointer to int).

(They would not all mean the same thing if array were actually a pointer variable, since the address of a pointer variable is different from the address inside the variable—thus, the middle expression, &array, would not be equal to the other two expressions. The three expressions are all equal only when array really is an array.)


Pointer arithmetic (or: why 1 == 4)

Another PowerPC-to-Intel update.

In case you're wondering about 1 == 4: Remember that earlier, I mentioned that ints are four bytes on current Intel processors. So, on a machine with such a processor, adding 1 to or subtracting 1 from an int pointer changes it by four bytes. Hence, 1 == 4. (Programmer humor.)


Indexing

The correct name of the operator for accessing elements of an array by index is the subscript operator. While descriptive, “the index operator” is not the correct name.

This is another one of those secrets of C. The subscript operator (the [] in array[0]) has nothing to do with arrays.

Wording change to reflect the above correction to the discussion of decaying.

Oh, sure, that's its most common usage. But remember that, in most contexts, arrays decay to pointers. This is one of them: That's a pointer you passed to that operator, not an array.


Multiple indirection

Added the pointer-to-member operator as an operator that decreases the pointer level of an expression.

Thus, the & operator can be thought of as adding asterisks (increasing pointer level, as I call it), and the *, ->, and [] operators as removing asterisks (decreasing pointer level).


Pointers and const

Corrected the distinctions between locations of the const keyword.

The const keyword is used a bit differently when pointers are involved. These two declarations are equivalent:

const int *ptr_a;
int const *ptr_a;

These two, however, are not equivalent:

int const *ptr_a;
int *const ptr_b;

In the first example, the int (i.e. *ptr_a) is const; you cannot do *ptr_a = 42. In the second example, the pointer itself is const; you can change *ptr_b just fine, but you cannot change (using pointer arithmetic, e.g. ptr_b++) the pointer itself.


Function pointers

Minor wording change, reflecting above correction to discussion of decaying.

It's possible to take the address of a function, too. And, similarly to arrays, functions decay to pointers when their names are used. So if you wanted the address of, say, strcpy, you could say either strcpy or &strcpy. (&strcpy[0] won't work for obvious reasons.)

Another declaration-referred-to-as-statement fix.

Here’s a pathological declaration, taken from the C99 standard. ‘[This declaration] declares a function f with no parameters returning an int, a function fip with no parameter specification returning a pointer to an int, and a pointer pfi to a function with no parameter specification returning an int.’ (6.7.5.3[16])

int f(void), *fip(), (*pfi)();

In other words, the above is equivalent to the following three declarations:

int f(void);
int *fip();   /* Function returning int pointer */
int (*pfi)(); /* Pointer to function returning int */

















Version history

1.2 — 2006-01-11
1.1 — 2006-01-01
1.0 — 2005-12-22
First public release.

This document is also available in zip format. The previous version is also available.


2010-01-16 http://boredzo.org/pointers