A Guide to Understanding Even the Most Complex C Declarations

By Greg Comeau

[ This text was originally published in Microsoft Systems Journal, Volume 3, Number 5. We could not find an archived copy online so are reproducing it here. — Ed. ]

The C language has been around for a good number of years, but many parts of its syntax are still not clearly understood, even by professional programmers. The reason for this ambiguity is the lack of unified documentation covering all the options, especially concerning pointers, that are now available within the language, including the American National Standards Institute (ANSI) and Microsoft extensions to the language.

Unfortunately, as long as these features are not clearly understood, or worse, misunderstood, programmers will not be utilizing C to its full capacity. Instead, code is being written, even by talented programmers, that is more complicated than it needs to be and may in some instances be incorrect. The purpose of this article is to clarify some of the typical constructs of C declarations that baffle novice and expert C programmers alike. As a starting point, let's look at a declaration that many C programmers find difficult to comprehend:

struct vtag far * (far * const far var[5])();

If you know for certain what this declaration is saying, just skip to the end of this article and work on the declaration offered there as a challenge. If you are at that stage of the game where I was just a short time ago, that is, you think you know but aren't quite sure, then stay around. You will be able to read and make use of the preceding identifier by the end of this article, and you should even be able to explain reading and writing C declarations to others. Only then can you use the complete power of the language.

Declaration Syntax

To use a language, you must know something about its structure and syntax. The first thing to look at regarding C declarations is how they are organized. Within a given declaration arrangement, different attributes can be specified. Depending on what attributes are used, the type of an identifier can be determined. The syntax for explicitly declaring identifiers in C is shown in Figure 1.

The syntax of a C declaration is: storage-class type qualifier
declarator = initializer; where storage-class is only one of the
following:

  typedef
  extern
  static
  auto
  register

Type could be one or more of the following:

  void
  char
  short, int, long
  float, double
  signed, unsigned
  struct ...
  union ...
  typedef type

A declarator  contains an  identifier and one or more, or none at
all, of the following in a variety of combinations:

  *

  []

  ()

possibly grouped within parentheses to create different bindings.
Figure 1: Standard Syntax for C Declarations

A declaration can contain many of the specifiers shown in Figure 1, but it must contain at least a type and a declarator. Note that some compilers will accept a declarator alone, with no type, if the identifier is declared outside any function; C programmers should avoid such coding practices, because they are bad programming style and may lead to errors and maintenance problems. Furthermore, the draft ANSI proposal has made this practice obsolete. For simplicity, this article will also not discuss cases where more than one identifier is being declared in a declaration.

Declarations: Theory

Many of us can read declarations such as

int i;
char *p;

and we can even decipher

int *ia[3];
int (*ia)[3];

thanks to Brian W. Kernighan and Dennis M. Ritchie, The C Programming Language (Prentice-Hall, Inc., 1978), or "K&R", as I will refer to it from here on. By now we've all simply memorized them. Our good fortune is that these probably account for 85 percent of the declarations you will encounter. However, understanding the other 15 percent can be an uphill battle, and that's where the trouble begins.

Many programmers suffer through complex declarations and the usage of the identifiers declared by them. Guessing is usually the only recourse one has on short notice because of poor documentation. Sooner or later it will burn you since guessing means making generalizations that are not necessarily true. Even though your guess may be close, the low-level code generated by your compiler may be completely different from what you wanted.

I wish that I had not been forced to play the guessing game when I first learned C. This is especially bothersome now since the actual theory behind reading declarations is very simple. You only need to understand that declarations are based on the C operator precedence chart, the same one you use to evaluate expressions. In the case of declarations this means bindings in the order of

( ) or [ ]   highest, associating from left to right
*            lowest

with parentheses overriding normal bindings. That's all there is to it. Knowing this, all we have to do is formulate a set of rules such as the one suggested in Figure 2.

1.  Parenthesize declarations as if they were expressions.

2.  Locate the innermost parentheses.

3.  Say "identifier is" where identifier is the name of the variable.
    Say "an array of X" if you see [X].
    Say "a function returning" if you see ().
    Say "pointer to" if you see *.

4.  Move to the next set of parentheses.

5.  If more, go back to 3.

6.  Else, say "type" for the remaining type left (such as short int).
Figure 2: Rules for Reading and Writing K&R Declarations

Let's go through some sample declarations. Using page 200 of K&R as a reference point, study the declarations shown in Figure 3, which are parenthesized according to the C language precedence chart.

int      i;
int      (i);
│         │
│         ▼
│         i is
▼
an int
───────────────────────────────────────────
int      *i;
int      (*(i));
│         │ │
│         │ ▼
│         │ i is
│         ▼
│         a pointer to
▼
an int
───────────────────────────────────────────
int      *i[3];
int      (*((i)[3]));
│         │  │
│         │  ▼
│         │  i is an array of 3
│         ▼
│         pointers to
▼
int
───────────────────────────────────────────
int      (*i)[3];
int      ((*(i))[3]);
│          │ │
│          │ ▼
│          │ i is
│          ▼
│          a pointer to
▼          an array of 3
ints
───────────────────────────────────────────
int      *i();
int      (*((i)()));
│         │  │
│         │  ▼
│         │  i is a function returning
│         ▼
│         a pointer to
▼
an int
───────────────────────────────────────────
int      (*i)();
int      ((*(i))());
│          │ │
│          │ ▼
│          │ i is
│          ▼
│          a pointer to a function returning
▼
an int
Figure 3: Interpreting Declarations by Parenthesizing

As you can see, once a declaration has been parenthesized, deciphering it is merely a case of stating what each parenthesized expression is. This is similar to parenthesizing arithmetic expressions, for instance, where a certain addition may have precedence over a certain multiplication. The only difference is that addition and multiplication are binary operators (they must work on two operands) whereas here we are dealing with unary tokens, which need only one operand.

Note that every parenthesized declaration in Figure 3 is also a valid declaration syntactically; they are not shown merely for their equivalence.
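
To see that the readings in Figure 3 are more than an exercise in English, here is a minimal sketch that declares one object of each form and reaches a value through it. The names (value, table, address_of_value, forty_two, figure3_demo) are mine, chosen only for illustration, and I have used the ANSI prototype form (void) where the figure writes ().

```c
#include <assert.h>

static int value = 7;
static int table[3] = { 1, 2, 3 };

/* address_of_value is a function returning a pointer to an int */
static int *address_of_value(void) { return &value; }

/* forty_two is a function returning an int */
static int forty_two(void) { return 42; }

/* Declares one object of each Figure 3 form and sums what they reach. */
int figure3_demo(void)
{
    int i = 1;                      /* i is an int                          */
    int *pi = &i;                   /* pi is a pointer to an int            */
    int *ap[3] = { &table[0], &table[1], &table[2] };
                                    /* ap is an array of 3 pointers to int  */
    int (*pa)[3] = &table;          /* pa is a pointer to an array of 3 ints */
    int *(*pfp)(void) = address_of_value;
                                    /* pointer to the "int *i()" form       */
    int (*pf)(void) = forty_two;    /* pf is a pointer to a function
                                       returning an int                     */

    /* The two array forms differ in size as well as in meaning. */
    assert(sizeof ap == 3 * sizeof(int *));
    assert(sizeof *pa == 3 * sizeof(int));

    return *pi + *ap[1] + (*pa)[2] + *(*pfp)() + (*pf)();
}
```

Note how each expression in the return statement mirrors its declaration: the operators that appear around the identifier in the declaration are the ones you apply to get back to an int.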

Declarations: Usage

Although the rule stated in Figure 2 is simple, its disadvantage is that you've got to sit there and parenthesize the declaration. Instead of spending a lot of time parenthesizing "subdeclarations," it would be helpful to generalize the theory into something more useful. The set of rules in Figure 4, sections 1, 2, 3, and 5, permits you to read declarations "on the fly." At first glance these rules appear to be more complex than the rule in Figure 2, but they are merely expanded versions of that rule.

1. Given: Intermediate attributes are [], (), and *, implying
   array, function, and pointer, respectively.

2. Memorize right-to-left rule: Look to the right (within
   parentheses), pick up intermediate attributes if any, then look
   to the left and pick up intermediate attributes, if any.

3. To convert a C declaration to English:

   a. Locate the identifier in the declaration. Say "identifier is"
      where identifier is the name of the variable.

   b. Look to the right of the identifier for the intermediate
      attributes () or []. Note there may be none.

      Say "an array of" if you see [].
      Say "an array of x" for each [x] you see.
      Say "an x-by-y array of" if you see [x][y].
      Say "an x-by-y by ... array of" if you see [x][y][...].
      Say "functions returning" for () if the last right attribute
      found was [].
      Say "a function returning" for ().

   c. Now look to the left of the identifier (per the right-to-left
      rule), and look for any further intermediate
      attributes. We're only concerned about asterisks here, and
      any other attribute would be an error. Note there may be
      none. Also, be aware of parentheses. So, for each *, say
      "pointers to" if the last attribute found on the right was []
      and the current attribute is *, otherwise, say "a pointer to"
      if you see *.

   d. Look to the right again for any more intermediate
      attributes. There may be none. Be careful of parentheses
      here. If any, go back to b.

   e. You should be left with terminating attributes: char, int,
      short, long, float, double, struct, union, and/or their
      respective modifiers: signed, unsigned, static, register, and
      extern.

      Say "struct of type y" for struct y.
      Say "union of type y" for union y.
      Say "attribute" read left-to-right (verbatim) for the
      terminating attributes.

4. Rules for converting English to a C declaration:

   a. Write "identifier", where identifier will be the name of the
      variable.

   b. We will need to keep track of whether the last attribute in
      our processing was an asterisk by using a flag. We will call
      our flag "active-*" and set it equal to 0.

   c. Write "*" to the left of what you've written down as long as
      you see "pointer to" or "pointers to" in the English
      description. Also set active-*=1 (signal that the last
      attribute was an asterisk).

   d. Write "(previous_written_down_attributes )" if active-*=1.
      Write "[x]" to the right if you see "array of x."
      Write "[x][y]..." to the right if you see "an x-by-y array of."
      Write "[]" to the right if you see "array of."
      Write "()" to the right if you see "function returning" or
      "functions returning."

   e. Go to b if there are still more intermediate attributes in
      the list.

   f. Write "simple attribute" to the left of everything, where
      simple attribute is one of the combinations of terminating
      attributes (keywords) shown in Figure 1.

      Write ";" to right of everything.

5. Notes:
   a. You cannot have an array of functions. You can have an array
      of pointers to functions, though. The declaration "int a[5]()"
      is invalid.

   b. A function cannot return an array. A function can return a
      pointer to an array, though. The declaration "int a()[]" is
      invalid.

   c. A function cannot return another function, only a pointer to
      one, meaning that "int a()()" is invalid.
Figure 4: Reading and Writing K&R Declarations
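
Section 4 of Figure 4 is mechanical enough to be written as a small routine. The following is only a sketch of those steps under my own assumptions: the English description is supplied as an array of attributes in reading order, and the names attr_kind, attr, and build_declaration are hypothetical. The active_star variable plays the part of the "active-*" flag.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum attr_kind { ATTR_POINTER, ATTR_ARRAY, ATTR_FUNCTION };

struct attr {
    enum attr_kind kind;
    int size;               /* ATTR_ARRAY only; 0 means an unsized "[]" */
};

/* Builds "type declarator;" into out. Returns 0 on success, -1 if the
   result does not fit. attrs lists the attributes in the order they are
   spoken, for example "pointer to", "function returning", ...          */
int build_declaration(const char *type, const char *name,
                      const struct attr *attrs, int n,
                      char *out, size_t cap)
{
    char cur[256], next[256];
    int active_star = 0;    /* was the last attribute written a "*"?    */
    int i, len;

    assert(type != NULL && name != NULL && out != NULL && cap > 0);

    len = snprintf(cur, sizeof cur, "%s", name);            /* step a   */
    if (len < 0 || (size_t)len >= sizeof cur)
        return -1;

    for (i = 0; i < n; i++) {
        if (attrs[i].kind == ATTR_POINTER) {                /* step c   */
            len = snprintf(next, sizeof next, "*%s", cur);
            active_star = 1;
        } else {                                            /* step d   */
            /* Parenthesize what is written so far if it ends in a "*". */
            const char *lp = active_star ? "(" : "";
            const char *rp = active_star ? ")" : "";
            if (attrs[i].kind == ATTR_FUNCTION)
                len = snprintf(next, sizeof next, "%s%s%s()", lp, cur, rp);
            else if (attrs[i].size > 0)
                len = snprintf(next, sizeof next, "%s%s%s[%d]",
                               lp, cur, rp, attrs[i].size);
            else
                len = snprintf(next, sizeof next, "%s%s%s[]", lp, cur, rp);
            active_star = 0;
        }
        if (len < 0 || (size_t)len >= sizeof next)
            return -1;
        strcpy(cur, next);
    }

    len = snprintf(out, cap, "%s %s;", type, cur);          /* step f   */
    return (len < 0 || (size_t)len >= cap) ? -1 : 0;
}
```

Feeding it "array of 3, pointer to" with type int and name i yields "int *i[3];", while "pointer to, array of 3" yields "int (*i)[3];". The routine does not check the restrictions in section 5, so it will happily write an array of functions if asked.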

After going through several sample declarations, you'll see how natural these rules are. They are really the same as the rule shown in Figure 2, but since we know about operator precedence, there's no sense in actually parenthesizing the declaration. Taking the examples from Figure 3, we will get the results shown in Figure 5.

int    i;
│      │
│      ▼
│      i is
▼
an int
───────────────────────────────────────────────────────────────────────────
int    *i;
│      ││
│      │▼
│      │i is
│      ▼
│      a pointer to        (nothing to the right, so look to the left)
▼
an int                     (terminating attribute is all that's left)
───────────────────────────────────────────────────────────────────────────
int    *i[3];
│      ││
│      │▼
│      │i is an array of 3 (first look to the right)
│      ▼
│      pointers to         (no more attributes to right, so look to left)
▼
an int                     (terminating attribute is all that's left)

───────────────────────────────────────────────────────────────────────────
int    (*i)[3];
│       ││
│       │▼
│       │i is
│       ▼
│       a pointer to       (found parentheses-nothing to right, * to left)
│
│       an array of 3      (finished paren attributes, look to right)
▼
ints                       (terminating attribute is all that's left)
───────────────────────────────────────────────────────────────────────────
int    *i();
│      ││
│      │▼
│      │i is a function
│      │returning          (found () to the right)
│      ▼
│      a pointer to        (nothing more to right, * on the left)
▼
an int                     (terminating attribute is all that's left)
───────────────────────────────────────────────────────────────────────────
int    (*i)();
│       ││
│       │▼
│       │i is
│       ▼
│       a pointer to       (* is within parentheses)
│       a function
│       returning          ( () is attribute to the
│                            right of the parenthesis)
▼
an int                     (terminating attribute is all that's left)
Figure 5: Interpreting Declarations Using Rules

Creating several obvious derivations from K&R's declarations, we get declarations such as the ones shown in Figure 6. And using derivations from examples in the Microsoft(R) C Compiler (MSC) Version 5.0 Language Reference, we get the declaration in Figure 7. We know this declaration must be different from "char *(*(*abc()))[10]", in which abc is a function returning a pointer to a pointer to an array of 10 pointers to char, which can also be written as "char *(**abc())[10]". Finally, Figure 8 shows an example of a union declaration.

int    **i;
│      │││
│      ││▼
│      ││i is
│      │▼
│      │a pointer to           (no attributes to right, * to left)
│      ▼
│      a pointer to            (another * to the left)
▼
an int
───────────────────────────────────────────────────────────────────────────
int    *(*i)();
│      │ ││
│      │ │▼
│      │ │i is
│      │ ▼
│      │ a pointer to a        (nothing to right, * to left)
│      │ function returning    (finished with parentheses, () is to right)
│      ▼
│      a pointer to            (no more attributes on right, try left)
▼
an int
───────────────────────────────────────────────────────────────────────────
int    *(*i[])();
│      │ ││
│      │ │▼
│      │ │i is an array of     (stay with parens, look to right first)
│      │ ▼
│      │ pointers to           (now to left)
│      │ functions returning   (finished with parens so look to their right)
│      ▼
│      pointers to
▼
an int
Figure 6: Interpreting Declarations Using Rules
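
The last declaration in Figure 6 is the kind you meet in dispatch tables. As a rough illustration, the sketch below fills such an array and walks it; the names first, second, get_first, get_second, handlers, and sum_through_handlers are invented for the example, and the functions are written with ANSI (void) prototypes.

```c
#include <assert.h>

static int first = 10;
static int second = 20;

static int *get_first(void)  { return &first; }
static int *get_second(void) { return &second; }

/* handlers is an array of pointers to functions returning pointers to int */
static int *(*handlers[])(void) = { get_first, get_second };

/* Calls every function in the table and sums the ints they point to. */
int sum_through_handlers(void)
{
    int total = 0;
    unsigned k;
    unsigned count = sizeof handlers / sizeof handlers[0];

    assert(count == 2);
    for (k = 0; k < count; k++)
        total += *(*handlers[k])();  /* index, dereference, call, dereference */
    return total;
}
```

The expression in the loop applies the attributes in the same order in which the declaration is read: array element first, then pointer, then function call, then the final pointer.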

char    *(*(*i)())[10];
│       │ │ ││
│       │ │ │▼
│       │ │ │i is
│       │ │ ▼
│       │ │ a pointer to a function returning
│       │ ▼
│       │ a pointer to an array of 10
│       ▼
│       pointers to
▼
a char
Figure 7: Interpreting Declarations Using Rules
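
A declaration like the one in Figure 7 is easier to trust once you have seen something assigned to it. Here is one possible sketch; the storage (n0, n1, n2, names) and the functions get_names and third_name are hypothetical, introduced only to give i something to point at.

```c
#include <assert.h>

static char n0[] = "zero", n1[] = "one", n2[] = "two";

/* names is an array of 10 pointers to char (unused slots are null) */
static char *names[10] = { n0, n1, n2 };

/* get_names is a function returning a pointer to an array of 10
   pointers to char */
static char *(*get_names(void))[10] { return &names; }

/* Reaches the third string by way of the Figure 7 declaration. */
const char *third_name(void)
{
    /* i is a pointer to a function returning a pointer to an array
       of 10 pointers to char */
    char *(*(*i)(void))[10] = get_names;
    char *(*rows)[10] = (*i)();

    assert(sizeof *rows == 10 * sizeof(char *));
    return (*rows)[2];
}
```

Compare the definition of get_names with the declaration of i: they differ only in that (*i) stands where get_names stands, which is exactly the "pointer to" that the reading adds.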

union sign    *(*i[5])[5];
│             │ ││
│             │ │▼
│             │ │i is an array of 5
│             │ ▼
│             │ pointers to arrays of 5
│             ▼
▼             pointers to
union sign
Figure 8: Interpreting Declarations Using Rules

Writing Declarations

Rules for writing declarations from English descriptions are every bit as simple since it's exactly what we've done so far, only in reverse. Figure 4, sections 1, 2, 4, and 5 show the rules for accomplishing this. The examples from Figures 5, 6, 7, and 8 are listed in Figure 9, but in reverse translation.

i is -► an int
│       │
▼       │
i       │
        ▼
        int i;
───────────────────────────────────────────────────────────────────────────
i is -► a pointer -► to an int
│       │            │
▼       │            │
i       │            │
        ▼            │
        *i           │
                     ▼
                     int *i;
───────────────────────────────────────────────────────────────────────────
i is -► an array -► of 3 pointers -► to int
│       │           │                │
▼       │           │                │
i       ▼           │                │
        i[]         ▼                │
                    *i[3]            │
                                     ▼
                                     int *i[3];
───────────────────────────────────────────────────────────────────────────
i is -► a pointer -► to an array -► of 3 ints
│       │            │              │
▼       │            │              │
i       ▼            │              │
        *i           ▼              │
                     (*i)[]         │
                                    ▼
                                    int (*i)[3];
───────────────────────────────────────────────────────────────────────────
i is a function returning  -► a pointer -►  to an int
│                             │             │
▼                             │             │
i()                           ▼             │
                              *i()          │
                                            │
                                            ▼
                                            int *i();
───────────────────────────────────────────────────────────────────────────
i is a pointer -► to a function returning  -►   an int
│                 │                             │
▼                 │                             │
*i                │                             │
                  ▼                             │
                  (*i)()                        │
                                                ▼
                                                int (*i)();
───────────────────────────────────────────────────────────────────────────
i is a pointer  -► to a pointer   -► to an int
│                  │                 │
▼                  │                 │
*i                 │                 │
                   ▼                 │
                   **i               │
                                     ▼
                                     int **i;
───────────────────────────────────────────────────────────────────────────
i is a pointer -► to a function returning -► a pointer -► to an int
│                 │                          │            │
▼                 │                          │            │
*i                ▼                          │            │
                  (*i)()                     ▼            │
                                             *(*i)()      ▼
                                                          int *(*i)();
───────────────────────────────────────────────────────────────────────────
i is array-► of pointers-► to functions returning-► pointers-► to int
│            │             │                        │          │
▼            │             │                        │          │
i[]          ▼             │                        │          │
             *i[]          ▼                        │          │
                           (*i[])()                 ▼          │
                                                    *(*i[])()  ▼
                                                               int*(*i[])();
───────────────────────────────────────────────────────────────────────────
i is a pointer-► to a    -► returning-► to an array -► to char
│                function   a pointer   of 10 pointers │
│                │          │           │              │
▼                │          │           │              │
*i               ▼          │           │              │
                 (*i)()     ▼           │              │
                            *(*i)()     ▼              │
                                        (*(*i)())[]    │
                                        (*(*i)())[10]  │
                                        *(*(*i)())[10] ▼
                                                       char*(*(*i)())[10];
───────────────────────────────────────────────────────────────────────────
i is an array-► of 5 pointers-► to arrays of 5-► union signs
│               │               pointers to      │
▼               │               │                │
i[]             ▼               │                │
                i[5]            ▼                │
                *i[5]           (*i[5])[]        │
                                (*i[5])[5]       │
                                *(*i[5])[5]      ▼
                                                 union sign *(*i[5])[5];
Figure 9: Deriving Declarations from English
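
The step-by-step derivations in Figure 9 can also be expressed with typedef, one step per line, and the compiler will confirm that the result is the same type as the declaration written out in full. This is a sketch of that idea rather than anything from the figures themselves; row_t, get_rows_t, rows, get_rows, and typedef_and_direct_forms_agree are names I made up.

```c
#include <assert.h>

typedef char *row_t[10];          /* an array of 10 pointers to char     */
typedef row_t *get_rows_t(void);  /* a function returning a pointer to
                                     such an array                       */

static char *rows[10];
static row_t *get_rows(void) { return &rows; }

/* Returns 1 if the typedef form and the written-out form behave alike. */
int typedef_and_direct_forms_agree(void)
{
    /* Written out in full, as in Figure 9. */
    char *(*(*direct)(void))[10] = get_rows;

    /* Built from the typedef steps; same type, so no cast is needed. */
    get_rows_t *via_typedef = direct;

    assert(sizeof(row_t) == 10 * sizeof(char *));
    return (*direct)() == (*via_typedef)() && (*direct)() == &rows;
}
```

If the two forms named different types, the initialization of via_typedef would draw a diagnostic from the compiler, so a clean compile is itself a check on your derivation.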

ANSI Extensions

The X3J11 committee of the ANSI has enhanced the K&R definition of C by adding keywords (see Figure 10) and nomenclature (mainly function prototypes, which we will not get into) to its C language standardization proposal.

The syntax of a C declaration as mandated by the Draft ANSI proposal:

  storage-class type qualifier declarator = initializer;

Additional new qualifiers can be one or more of the following:

  const
  volatile

Additional new types can be one of the following:

  void
  signed char
  unsigned char
  unsigned int
  unsigned long
  long double
  enum ...
Figure 10: ANSI Extensions

The keywords we will discuss are const and volatile. The const type qualifier specifies that the object associated with that type will not be modified by the program; that is, it will not be assigned to, incremented, or decremented. The volatile type qualifier specifies that every access to the object associated with that type must be performed exactly as written, in the order required by the sequence rules of C. Because this rules out a large class of optimizations that could otherwise be applied to a given piece of code (keeping the value in a register, for example, or removing a read that appears redundant), the value of a volatile object can be modified by something other than the program "owning" the object without fear of inconsistency.

Points to note regarding keywords include:

To understand these keywords better let's look at a practical example, an excellent one from the proposed standard that provides a declaration for a memory-mapped input port connected to a real-time clock. This is given as

extern const volatile int real_time_clock;

which declares an integer that cannot be modified by the program (const), but can be modified by some external event (volatile), the clock.
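
Without the hardware at hand we can only imitate this, but the following sketch shows the idea. It is my own arrangement, not the standard's example: simulated_port stands in for the memory-mapped port, simulated_tick stands in for the external event, and real_time_clock is declared here as a pointer to a const volatile int so that the program's view of the port is read-only.

```c
#include <assert.h>

/* Stand-in for the hardware. On a real system this storage would be a
   memory-mapped port updated by the clock, not by the program. */
static volatile int simulated_port;

/* The program's view of the port: it may be read, never written. */
static const volatile int *const real_time_clock = &simulated_port;

/* Stand-in for the external event that advances the clock. */
static void simulated_tick(void) { simulated_port++; }

/* Reads the clock, lets it advance, and reads it again. */
int ticks_elapsed(int ticks)
{
    int start, k;

    assert(ticks >= 0);
    start = *real_time_clock;
    for (k = 0; k < ticks; k++)
        simulated_tick();

    /* *real_time_clock = 0;   rejected: the object is const in this view */

    /* volatile obliges the compiler to read the port again here rather
       than reuse the value it fetched for start. */
    return *real_time_clock - start;
}
```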

Now that we have some idea of what these keywords accomplish, we must integrate their usage into our rules. An addendum to our previous rules is shown in Figure 11. The ramifications of this are that type qualifiers modify the type of an object, whether it is the base type or a pointer type. So, in the declaration

const int i;

i is declared as an integer that cannot and will not be modified, for example, through assignment. Because any later statement of the form i = some value is an error once i has been declared const, the only way to give i a value is to initialize it in the declaration itself. It is essential to understand that const will modify what one can do with i, but it will not change anything about the basic type directly; this is true of all the qualifiers. An analogy would be that painting a big green house another color (changing one of its characteristics) would not change its size or the fact that it is a house.

Reading and writing ANSI declarations is similar to reading and writing
K&R declarations. However, there are a few additional concerns due to
the addition of type qualifiers. They are as follows:

1.  Our main concern with proposing an extension to the rules given
    in Figure 4 is that the ANSI additions are keywords and don't
    necessarily have a precedence. Therefore keywords don't follow a
    consistent pattern per se.

2.  Disregarding other type specifications, each qualifier (const,
    volatile) has a corresponding pointer type (const *, volatile *).

3.  A missing type specifier is taken to be int. For example, "const
    x" means "int const x". However, this should be considered
    obsolescent style.

4.  Type qualifiers and type specifiers may be intermixed without
    concern for their order. It will not change the resulting type in
    question. For instance, "const int var" and "int const var" are
    the same.

5.  Intermixing type qualifiers within  declarators (the part of a
    declaration specifying the identifier and function, array, and
    pointer attributes) does change the meaning of a declaration.
    Therefore the binding of qualifiers changes depending upon
    context.

6.  To further clarify items 4 and 5 and to make clear a case such as
    "const int * p", we need to propose the following: type
    qualifiers modify the type of a declarator. Also, the case of a
    declaration involving pointers such as the case of "type *
    type-qualifier(s) declarator" (say, "int * const  var") is said
    as "declarator qualifier(s) pointer to type" (var is a constant
    pointer to int).

7.  Nonqualifier declarations (those using  default qualifications)
    can be read as such. So, nonconst is "variable"  and nonvolatile
    is "nonvolatile." It is usually good practice to include the
    default qualifications of an identifier when translating
    declarations, even though the default qualification is not a
    keyword.
Figure 11: Reading and Writing ANSI Declarations

Besides qualifying an identifier or object, qualifiers can also qualify pointers. You must understand here how to code a qualified pointer, which is probably one of the most difficult aspects of reading declarations. For instance, would a constant pointer be coded as "int const * p" or "int * const p"? If "int * p" says that p is a pointer to an int, then "int const * p" says p is a pointer to a const int. So, since p is a pointer to a constant, then it must be a constant pointer, right? Maybe, since in the case of "int * const p," p itself is a const that is a pointer to an int, so it is also a constant pointer, right?

It can only be one or the other. Let's explore this in more detail, as a similar problem will come up later on when we discuss the Microsoft extensions (near and far keywords) to C. The point is to understand the difference between a "constant pointer" and a "const pointer." Although it is a small difference, it resolves an ambiguity that crops up in using generic terms like constant pointer.

The difference is that a constant pointer does not change; it is constant, a pointer whose value cannot change. Pointers that are not constant are variable. A const pointer (const *), on the other hand, is a pointer to a constant of some given type. Please note the use of the English word constant in one explanation and the keyword const in the other; they are not the same.

Here are some examples to reinforce this:

int i;

i is a variable integer; it can be assigned various values.

const int i;

i is a constant integer; it cannot be assigned any values after it has been declared.

int *p;

p is a variable pointer to a variable integer; both p and the integer that it points to can be assigned values.

int * const p;

p is a constant pointer to a variable integer; p cannot change value, the integer it points to can.

const int *p;

p is a variable pointer to a constant integer; p can be modified, the integer it points to cannot.

int const *p;

p is a variable const pointer to an int. Since a const pointer is a pointer to a constant integer, p is a variable pointer to a constant integer, the same as const int *p.

const int * const p;

p is a constant pointer to a constant integer. Neither p nor the integer it points to can be modified.

const *p;

p is a variable pointer to a constant integer. This confirms const int * p and int const *p since there is no other way to read it. However, note that ANSI considers the lack of an explicit type declaration to be bad form as it encourages sloppiness in coding declarations.
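
The examples above can be gathered into one function to show which operations a compiler will accept. This is a minimal sketch with names of my own choosing (qualified_pointer_demo and its locals); the operations that would be rejected are left in comments so that the code still compiles.

```c
#include <assert.h>

int qualified_pointer_demo(void)
{
    int a = 1, b = 2;
    const int c = 3;

    int *p = &a;                 /* variable pointer to a variable int   */
    int * const cp = &a;         /* constant pointer to a variable int   */
    const int *pc = &c;          /* variable pointer to a constant int   */
    int const *pc2 = &c;         /* same type as pc                      */
    const int * const cpc = &c;  /* constant pointer to a constant int   */

    *p = 10;      /* allowed: the int is variable (a becomes 10)         */
    p = &b;       /* allowed: p is variable                              */
    *cp = 20;     /* allowed: the int is variable (a becomes 20)         */
    pc = &a;      /* allowed: pc is variable, and may even be aimed at
                     an int that is not itself const                     */
    pc2 = pc;     /* allowed: identical types                            */

    /* cp = &b;      rejected: cp is a constant pointer                  */
    /* *pc = 5;      rejected: pc points to a constant int               */
    /* cpc = &a;     rejected: cpc is a constant pointer                 */
    /* *cpc = 5;     rejected: cpc points to a constant int              */

    assert(*pc2 == 20);
    return *p + *cp + *pc + *cpc;
}
```

Notice that pc = &a does not make a constant; it only means that a cannot be changed through pc.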

With these examples under your belt, you can now go back and look at some derivations of the examples used earlier on. For simplicity, I will only use the const attribute in the examples below. Microsoft C only acknowledges the volatile keyword syntactically, not semantically. Also, we will leave out the so-called variable attribute that I introduced since that is understood.

const int i;

i is a constant int.

const int *i;

i is a pointer to a constant int.

int * const i;

i is a constant pointer to an int.

const int * const i;

i is a constant pointer to a constant int.

const int *i[3];

i is an array of three pointers to constant int.

int * const i[3];

i is an array of three constant pointers to int.

const int * const i[3];

i is an array of three constant pointers to constant int.

const int (*i)[3];

i is a pointer to an array of three constant ints.

int (* const i)[3];

i is a constant pointer to an array of three ints.

int (const *i)[3];

This is an error since const cannot immediately follow a left parenthesis.

const int (const * const i)[3];

This is also an error since const cannot immediately follow a left parenthesis.

const int (* const i)[3];

i is a constant pointer to an array of three constant ints.

const int *i();

i is a function returning a pointer to a constant int.

int * const i();

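i is a function returning a constant pointer to an int. The qualifier here binds to the returned pointer, not to the function, and since a returned value cannot be assigned to anyway, it has no practical effect. Qualifying the function type itself, which can only come about through a typedef, is another matter. Many compilers will choose to ignore the qualification of function types. To ease parsing, ANSI does allow it without requiring that a compiler issue a diagnostic; however, ANSI has classified this as "undefined behavior." At first glance, qualification of function types appears to be meaningless, but it might be used to specify that the function be placed in ROM.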

const int (*i)();

i is a pointer to a function returning a constant int.

int (* const i)();

i is a constant pointer to a function returning an int.

int (const *i)();

This is an error since const cannot immediately follow a left parenthesis.

const int **i;

i is a pointer to a pointer to a constant int.

int ** const i;

i is a constant pointer to a pointer to an int.

int * const *i;

i is a const pointer to a pointer to an int; that is, i is a pointer to a constant pointer to an int.

int const **i;

i is a pointer to a pointer to a constant int.

int const * const *i;

i is a pointer to a constant pointer to a constant int.

const int const * const *i;

This is a syntax error, since the first two const keywords are being applied to the type and not to the first asterisk.

const int * const * const i;

i is a constant pointer to a constant pointer to a constant int.

You may have noticed in the descriptions above, in particular those of "const int *i[3]", "int * const i[3]", and "const int * const i[3]", that the terminology used is not "constant array," but is "array of constant ..." instead. A brief explanation is in order.

If, while parsing a declaration, the last type is an array type and the current type is a qualified type, we say that the elements of the array should inherit the qualified type and not the array type itself. This makes sense when you consider some of the idiosyncrasies of array types and arrays, specifically that an array represents an address and does not have many other attributes beyond that since the address is merely some memory location. Besides, it's actually the elements of the array that we want to give the qualification to because we do not want anything to be assigned to them. In addition, you should keep in mind that an array invocation is also a constant.
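
A short sketch may make the point about arrays concrete. The names primes and sum_primes are hypothetical; the example assumes nothing beyond the rule just described, that the qualification lands on the elements.

```c
#include <assert.h>

/* primes is an array of three constant ints */
static const int primes[3] = { 2, 3, 5 };

int sum_primes(void)
{
    const int (*row)[3] = &primes;  /* pointer to an array of three
                                       constant ints                   */
    const int *first = primes;      /* the array name yields a pointer
                                       to its first constant int       */
    int total = 0, k;

    for (k = 0; k < 3; k++)
        total += (*row)[k];

    /* primes[0] = 7;   rejected: the elements are const               */
    /* primes = 0;      rejected with or without const: an array
                        cannot be assigned to                          */

    assert(first == &(*row)[0]);
    return total;
}
```

The second commented-out line is the reason a "constant array" would add nothing: the array itself can never be assigned to, so only its elements are left to protect.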

MSC Extensions

Along with ANSI, Microsoft has also considered it necessary to add new keywords to the C language (see Figure 12). Four of them are applicable to attributes given to functions.

Microsoft has extended the K&R and ANSI definitions with the following
additional keywords in the area of qualifiers:

  far
  near
  huge
  cdecl
  pascal
  fortran
  interrupt
Figure 12: Microsoft Extensions

The first three of these function keywords, cdecl, pascal, and fortran, deal with calling functions written in a language other than C (BASIC, FORTRAN, MASM, or Pascal) or letting C functions be called from other languages. These special keywords are really compiler directives that tell the compiler to generate code so that parameters to a function are handled as if they were in the language used in the function being called. So, in order to declare from C a function that might be written in Pascal, you would use:

extern int pascal func(long, int);

A pointer to such a function might look like this:

int (pascal *fp)(long, int);
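If you want to try these two declarations on a compiler that does not have the pascal keyword, one option is to define the keyword away. The sketch below does exactly that, so it only shows where the keyword sits in each declaration and says nothing about the calling sequence a Microsoft compiler would generate. The body of func and the helper call_through_pointer are stand-ins invented for the example.

```c
/* Assumption: this compiler has no pascal keyword, so it is defined to
   expand to nothing. Remove this line on a compiler that supports it. */
#define pascal

extern int pascal func(long, int);     /* declaration, as in the text         */
int (pascal *fp)(long, int) = func;    /* pointer to such a function          */

/* Stand-in definition; in the article's scenario this would be Pascal code. */
int pascal func(long a, int b)
{
    return (int)a + b;
}

int call_through_pointer(void)
{
    return fp(40L, 2);                 /* calls func through the pointer      */
}
```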

The interrupt keyword is also a compiler directive and thus tells the compiler to generate special code. In this case the generated code lets the function that is given the interrupt attribute follow the sequence of events that interrupt handlers are expected to follow.

The other three special keywords, near, far, and huge, deal with addressing an object on a machine that supports a segmented architecture such as that of the Intel microprocessor family (the 8086/88 and 80286 series). These keywords, especially the far attribute, will be used often in the mixed-memory model programming typical of this architecture.

Microsoft C treats all the special keywords as if they were declarator (not declaration) qualifiers, that is, they can only modify syntactic entities on the right-hand side of a declaration and cannot modify base types, such as int, directly. The Microsoft C manual states that the special keywords modify the "item" immediately to their right in a manner similar to the const keyword. The manual's description is incomplete on this, though.

Recalling the discussion on the ANSI additions, this means that the special keywords either modify an identifier or pointer, that is, they modify objects or pointers to objects. There is a difference between the ways these are handled, however, which is not clear from the manual. In Microsoft C, the syntax is organized so that pointers can be modified by one of the special keywords. Therefore a sequence such as

int (far *p);

is valid, but, since ANSI prohibits such syntax,

int (const * p);

is not. What this boils down to is that Microsoft C will accept a declarator syntax of

MODIFIER * more_declarator_info

while ANSI only accepts

* QUALIFIER more_declarator_info

where MODIFIER and QUALIFIER are optional positional parameters.

Keep in mind, though, that both far and near were made part of the Microsoft C language long before const and volatile came along. The reason for this unfortunate difference is that ANSI borrowed the syntax of type qualifiers from C++ (a superset of C supporting some object-oriented programming features), which was being developed independently around the same time that Microsoft introduced its modifiers.

The discussion concerning the difference between const pointer and constant pointer holds true here as well. In the case of these attributes, say far, the situation is even more ambiguous since far is already an English word. In other words, is int * far p a far pointer (since p is a pointer located in far memory), or is int far * p a far pointer (since it is a pointer that points to a far address)?

We resolve it in exactly the same manner as we do the const case. We remain consistent and declare that "far *" (a keyword modifying a pointer) means far pointer. As for the other case, we say that a pointer located in far memory is called a distant pointer; this is similar to the variable versus constant idea used to clarify the const keyword.

One final point to remember is that the Microsoft C memory model keywords cannot be applied to automatic variables. This makes sense since they are stack-based variables and there is no mechanism for you to control their nearness or farness. This does not imply that you cannot have local static variables that are near or far, nor does it mean that you cannot have automatics that are pointers to data in segments other than the segment that the function containing the automatic variable is located in. All it says is that the automatic itself cannot have the attribute.

Finally, here are several more examples to emphasize these points:

int * far p;

p is a distant pointer to an int.

int far * p;

p is a far pointer to int. Since a far pointer is a pointer to far, p is a pointer to a far int.

int far * far p;

p is a distant pointer to a far int.
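On a flat-memory compiler the three declarations can still be parsed if far is defined away, which is handy for experimenting with where the keyword may appear. This is only a sketch under that assumption: with the macro in place all three pointers are ordinary pointers, and the names p1, p2, p3, value, and sum_through_pointers are made up here. The pointers are declared at file scope because, as noted above, automatics cannot carry the attribute.

```c
/* Assumption: no segmented memory model here, so far expands to nothing.
   Remove this line on a compiler that supports the keyword. */
#define far

static int value = 7;

int * far p1 = &value;       /* distant pointer to an int                  */
int far * p2 = &value;       /* far pointer to int: pointer to a far int   */
int far * far p3 = &value;   /* distant pointer to a far int               */

int sum_through_pointers(void)
{
    return *p1 + *p2 + *p3;  /* all three reach the same int               */
}
```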

Down the Road

As of this writing I do not have Version 5.1 of the Microsoft C Compiler. It will be interesting to see if it has any new surprises. In the meantime, you should certainly be familiar enough with C declarations and the rules of the road by now to be able to go back and solve the declaration put forth at the beginning. Try to figure it out before reading on. Just to be sure we are all in agreement,

struct vtag far * (far * const far var[5])();

says: var is a constant distant array of five pointers to far functions returning pointers to far struct vtags.
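One way to check a reading like this is to rebuild the declaration one layer at a time with typedefs and see that both forms can be used the same way. The sketch below assumes a flat-memory compiler, so far is defined away; the typedef names, the struct member id, and the helper functions are all hypothetical and are there only so the example can be run.

```c
#define far   /* assumption: keyword defined away on a flat-memory compiler */

struct vtag { int id; };

static struct vtag sample = { 42 };
static struct vtag far *get_sample(void) { return &sample; }

/* The article's declaration, given an initializer so the array has contents. */
struct vtag far * (far * const far var[5])() = { get_sample };

/* The same shape, one layer per typedef. */
typedef struct vtag far *vtag_ptr;      /* pointer to far struct vtag   */
typedef vtag_ptr vtag_func();           /* function returning vtag_ptr  */
typedef vtag_func far *vtag_func_ptr;   /* pointer to far function      */
vtag_func_ptr const far var_by_steps[5] = { get_sample };

int first_id(void)          { return var[0]()->id; }
int first_id_by_steps(void) { return var_by_steps[0]()->id; }
int entry_count(void)       { return (int)(sizeof var / sizeof var[0]); }
```

Both arrays hold five function pointers, and calling through the first entry of either one returns a pointer to the same struct vtag.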

It's not really all that difficult, is it? Now, let's see if you really understand it. What does the following declare?

unsigned long(far * (far * const(far * far const V[2])[4])())[6];

As a final thought, consider the possibility of writing a program that takes as its input either a C declaration or an English declaration and produces either the appropriate English translation or the appropriate C declaration. Remember to take into consideration the case where there may be more than one declarator, such as int far * p, far q; should this occur, immediately stop all processing and issue a stern warning declaring this to be poor programming style. Have fun.
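As a starting point for the C-to-English half of that program, here is a minimal sketch of a recursive-descent translator in the style of the well-known dcl example from Kernighan and Ritchie. It is deliberately incomplete: it accepts a one-word base type, asterisks, parentheses, empty parameter lists, and array brackets, and it does not understand const, volatile, the Microsoft keywords, struct tags, or multiple declarators. The function name explain and its interface are assumptions made for this example.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

enum { MAXDECL = 64, NAME = 256, PARENS, BRACKETS, END };

static const char *src;                      /* read position in the input  */
static char token[MAXDECL], name[MAXDECL];   /* latest token; declared name */
static char out[MAXDECL * 16];  /* description; each input char adds at most 11 */
static int tokentype, failed;
static void dcl(void);

/* Read the next token: a name, "()", "[...]", end of input, or one character. */
static int gettoken(void)
{
    char *p = token;
    while (*src == ' ' || *src == '\t')
        src++;
    if (*src == '\0')
        return tokentype = END;
    if (src[0] == '(' && src[1] == ')') {
        src += 2;
        return tokentype = PARENS;
    }
    if (*src == '[') {
        while (*src != '\0' && *src != ']')
            *p++ = *src++;
        if (*src == ']')
            *p++ = *src++;
        else
            failed = 1;                      /* unterminated bracket */
        *p = '\0';
        return tokentype = BRACKETS;
    }
    if (isalpha((unsigned char)*src) || *src == '_') {
        while (isalnum((unsigned char)*src) || *src == '_')
            *p++ = *src++;
        *p = '\0';
        return tokentype = NAME;
    }
    return tokentype = (unsigned char)*src++;
}

/* Direct declarator: a name or (declarator), then any number of () or []. */
static void dirdcl(void)
{
    if (tokentype == '(') {
        dcl();
        if (tokentype != ')')
            failed = 1;
    } else if (tokentype == NAME) {
        strcpy(name, token);
    } else {
        failed = 1;
    }
    while (!failed && (gettoken() == PARENS || tokentype == BRACKETS)) {
        if (tokentype == PARENS) {
            strcat(out, " function returning");
        } else {
            strcat(out, " array");
            strcat(out, token);
            strcat(out, " of");
        }
    }
}

/* Declarator: optional asterisks, then a direct declarator. */
static void dcl(void)
{
    int stars = 0;
    while (gettoken() == '*')
        stars++;
    dirdcl();
    while (stars-- > 0)
        strcat(out, " pointer to");
}

/* Describe decl in English; returns 0 on success, -1 if it is not understood. */
int explain(const char *decl, char *result, size_t size)
{
    char base[MAXDECL];
    if (strlen(decl) >= MAXDECL)
        return -1;
    src = decl;
    out[0] = name[0] = '\0';
    failed = 0;
    if (gettoken() != NAME)
        return -1;
    strcpy(base, token);
    dcl();
    if (!failed && tokentype == ';')
        gettoken();
    if (failed || tokentype != END)
        return -1;
    snprintf(result, size, "%s:%s %s", name, out, base);
    return 0;
}
```

Calling explain("int (*fp[3])();", buf, sizeof buf) leaves "fp: array[3] of pointer to function returning int" in buf. Teaching it the qualifiers and the special keywords means recording what follows each asterisk and what precedes the identifier, which is where the const pointer versus constant pointer distinction starts to matter.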