Syntax of Variables


A variable is written in program code by first writing a type and then the symbol of the variable. After the declaration, the variable can be addressed elsewhere by specifying the symbol. A variable can be initialized when it is defined.

int    a;
double b;
char   c        = 'M';
int*   ptr      = &a;
int    array[3] = {5, 6, 7};

Details

This page describes the syntax and semantics of variables in detail. For a general overview of declaration and definition, please refer to the relevant page under concepts.

A variable is a changeable value that the programmer can address by name. In the C and C++ languages, every variable is strictly assigned a type. This type is written first in a variable declaration, followed by the symbol (the name) of the variable.

Whenever a variable is defined, it can be given a start value. The initial value is written after the symbol, following an equals sign =. More explanations can be found below.

Multiple variables of the same type can be declared at once by listing the symbols in a comma-separated list. Note that this comma is not the comma (sequence) operator; it merely separates the symbols. Each symbol can also be initialized individually. Caution is advised with pointer types; see the explanations below.

int   a, b, c;
float x = 4.f, y = 5.f, z = 6.f;

A variable declaration, definition, or initialization always ends with a semicolon ;. Although this makes them look like normal statements, semantically they are not. A statement consists of a concatenation of operators and accordingly generates executable code. Declarations and definitions do not. A variable declaration only serves as a hint for the compiler. A variable definition at most ensures that enough storage space is reserved somewhere, and the amount can always be determined at compile time. Executable code is only generated if an initialization takes place in addition to the definition.

Initialization of Variables

When defining a variable, it is possible or even necessary to initialize it, i.e. to give the variable a start value:

int x = 1;
double blue[3] = {0., 0., 1.};
Particle p = oldparticle;
static int numerrors = 0;

Initializations are allowed for every variable definition. Note, however, that this applies only to definitions and never to declarations (see declaration and definition). It is therefore not possible, for example, to initialize a variable declared with the keyword extern or a variable within a class declaration. The corresponding initializations must be made elsewhere (preferably in a separate implementation file).
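The split between declaration and initializing definition can be sketched as follows. This is only an illustration: the names error_count, instances, and register_particle are hypothetical, the class member is assumed to be a static member, and everything is shown in one file although the definitions would normally live in a separate implementation file.

```cpp
#include <cassert>

// Declaration only: it announces the variable but reserves no storage,
// so the start value is not written here.
extern int error_count;

struct Particle {
    static int instances;   // declaration inside the class declaration
    double mass;
};

// Definitions with their initial values. In a real project these lines
// would usually be placed in exactly one implementation file.
int error_count = 0;
int Particle::instances = 0;

// Hypothetical helper that uses both variables after they were defined.
int register_particle() {
    assert(error_count == 0);
    return ++Particle::instances;
}
```

Calling register_particle() for the first time returns 1, because the counter was initialized to 0 by its definition.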

Although an initialization with the equals sign = and the final semicolon ; looks like a normal statement, it is not to be regarded as such semantically. Rather, it tells the compiler which value a variable should be loaded with BEFORE it becomes valid. A compiler will generate different code depending on the situation:

If the variable was declared with the static keyword, the initialization may only take place once. The compiler solves this, for example, by defining all static variables in the global area and initializing them before calling the main function. Built-in types with constant start values are typically encoded directly into the binary, so no initialization code needs to be executed at all.

PODs can be populated with values using aggregate initialization. This allows multiple values to be initialized at the same time, which the compiler can accelerate significantly. However, aggregates are explicitly only allowed for initializations.

If the variable type is a complex class with a complex constructor, or even a large array of such objects, initialization can take a lot of time. It should be noted that for objects of classes with virtual methods, the compiler also implicitly inserts into each constructor (even an empty default constructor) the initialization of a hidden variable (the so-called vtable pointer).

If an expensive initialization is placed within a function that is called very often, it will dramatically affect the runtime of the program. In such cases, the static keyword could help, but it should be pointed out again that with static the initialization only takes place once.
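The effect of static on a costly initialization can be sketched as follows. The names expensive_init, init_calls, lookup, and initializer_runs_after are hypothetical, and the "expensive" work is only simulated by a counter. The sketch is C++: there, a function-local static with a non-constant start value is initialized the first time the function runs, whereas C only accepts constant start values for static variables.

```cpp
#include <cassert>

// Hypothetical stand-in for an expensive initialization.
// It counts how often it is actually executed.
int init_calls = 0;

int expensive_init() {
    ++init_calls;
    return 42;
}

int lookup() {
    // The initializer runs only once, no matter how often lookup() is called.
    static int value = expensive_init();
    return value;
}

// Calls lookup() several times and reports how often the initializer ran.
int initializer_runs_after(int calls) {
    for (int n = 0; n < calls; ++n) {
        int v = lookup();
        assert(v == 42);
        (void)v;
    }
    return init_calls;
}
```

Without the static keyword, expensive_init would run on every call of lookup.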

The Strange Syntax of Pointer Declarations

The C and C++ languages define the syntax of declarations in such a way that pointer, array, and reference markers are considered to belong to the symbol and not to the type. In the following example, the first line incorrectly suggests that the pointer asterisk belongs syntactically to the type. The second line reflects the syntactic affiliation more accurately.

int* a;
int *b;

If only a single variable is declared, however, the spelling doesn't matter, since a parser evaluates both lines in exactly the same way. Problems arise only when several symbols are declared at the same time:

int  *a,       b;        // One int pointer   and one int
int  *c,      *d;        // Two int pointers
int   e[7],    f;        // One int array     and one int
int   g[7],    h[7];     // Two int arrays
int  &i = b,   j;        // One int reference and one int
int  &k = b,  &l = b;    // Two int references

Here it becomes clear that the pointer, array and reference characters are considered to belong syntactically to the symbol and not to the type. This affiliation to the symbol is unintuitive at first glance and has repeatedly caused confusion, especially in earlier times.
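One way to check this affiliation is to ask the compiler for the declared types. The following is a small sketch using the standard header type_traits; the variable names are chosen only for illustration.

```cpp
#include <type_traits>

int* first, second;   // looks like two pointers, but only `first` is one
int  row[7], count;   // only `row` is an array

// These checks are evaluated at compile time; the program does not
// compile if one of them is wrong.
static_assert(std::is_same<decltype(first),  int*>::value,  "first is an int pointer");
static_assert(std::is_same<decltype(second), int>::value,   "second is a plain int");
static_assert(std::is_same<decltype(row),    int[7]>::value, "row is an array of 7 ints");
static_assert(std::is_same<decltype(count),  int>::value,   "count is a plain int");
```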

The C90 standard stipulated that all declarations must be at the beginning of a block. This forced early programmers to write down all the declarations (which could be dozens) in one place. To improve readability, several similar declarations were often written on a single line. So the notation shown in the last example made perfect sense at the time. Starting with the C99 standard, however, this restriction was removed, and declarations could be written where they made sense.

When developing libraries and frameworks, however, the C90 standard often still has to be adhered to today for compatibility with the corresponding compilers. Over the last few decades, specially named pointer types have increasingly been introduced using typedef. When declaring several symbols with such a type, it is no longer possible to mix up the affiliation.
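A minimal sketch of such a named pointer type is shown here. The name IntPtr is hypothetical and only serves as an illustration; the checks again rely on the standard header type_traits.

```cpp
#include <type_traits>

typedef int* IntPtr;   // hypothetical named pointer type

IntPtr p, q;           // both symbols are int pointers
int*   r, s;           // without the typedef, only r is a pointer

static_assert(std::is_same<decltype(p), int*>::value, "p is an int pointer");
static_assert(std::is_same<decltype(q), int*>::value, "q is an int pointer");
static_assert(std::is_same<decltype(r), int*>::value, "r is an int pointer");
static_assert(std::is_same<decltype(s), int>::value,  "s is a plain int");
```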

Such framework types are often marked with suffixes or prefixes such as ptr, p or addr. Marking symbols with prefixes and suffixes used to be very common, especially since IDEs were not as powerful as they are today and since compilers did not allow long symbol names (sometimes no more than 8 characters). For example, the type LPCWSTR stands for Long Pointer to Constant Wide String. While for some programmers this notation is still a valuable aid, for other programmers the same notation is the epitome of unreadable code.

Author's Recommendation

Over all these years, the author of this page has never been able to get used to the supposedly correct spelling of the pointer asterisk. In his understanding, a pointer belongs both logically and semantically to the type, which is why the asterisk is usually written next to the type throughout the ManderC pages.

However, the author does not recommend anything regarding the position of the asterisk. Every programmer can handle this syntax peculiarity as they see fit.

In the author's opinion, the tiresome question of where to write the pointer asterisk is just an echo from earlier times, and it only turns into a matter of debatable recommendations when several symbols are declared with a common type at the same time. However, because C99 allows declarations to be mixed with executable code, and because of widespread code structuring guidelines and the significantly longer symbol names in use nowadays, multiple declarations in one statement are rarely encountered.

Yet, to this day, thousands of innocent novice programmers are tortured into putting the asterisk next to the variable name, not the type. This is usually justified by the fact that the syntax of C is defined in this way. What is forgotten, however, is the reason WHY it is defined in this way. Here is the answer:

The creators of the C language thought it was a good idea.

Their reasoning was: a declaration should look as similar as possible to how the variable will later be used in the code. The asterisk of the pointer in the declaration is also used in C and C++ code as the dereference operator. And when a variable declared as a pointer is dereferenced, it yields exactly the type that was written in its declaration.

int i;
int a;
int *b;
int **c;

i = a;
i = *b;
i = **c;

This reasoning was also applied consistently to the declaration of arrays and function pointers. But it rears its ugly head mainly in variable declarations with the pointer asterisk, since the array brackets [] have to be written next to the symbol one way or the other. Arrays of pointers are much rarer than normal arrays, and just like function pointers, they already look very complicated when they occur. For this reason, several symbols of such types are rarely declared together in one line.
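The same "declaration mirrors use" idea can be sketched for arrays and function pointers. All names here (values, twice, fp, ptr, sum_of_uses) are hypothetical and chosen only for this illustration.

```cpp
#include <cassert>

int values[3] = {5, 6, 7};   // declared as "int values[3]"
int twice(int x) { return 2 * x; }
int (*fp)(int) = twice;      // declared as "int (*fp)(int)"
int *ptr = &values[0];       // declared as "int *ptr"

// Each use below looks like the corresponding declaration and yields an int.
int sum_of_uses() {
    int i = 0;
    i += *ptr;        // dereferencing as declared: the first element, 5
    i += values[1];   // indexing as declared: the second element, 6
    i += (*fp)(4);    // calling as declared: twice(4), which is 8
    assert(i == 19);
    return i;
}
```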

The idea of writing the pointer asterisk next to the symbol has survived to this day. In the author's opinion, however, the risk of making a semantic error if this spelling is not adhered to is extremely low. The author recommends using the simultaneous declaration of several symbols only in tried and tested exceptional cases.