Null Pointer

This page was translated by a robot.

Memory addresses can be thought of as unsigned integers, which are stored in pointers. The address 0 is thus a potentially valid address just like any other integer. However, this address is reserved for system-specific purposes on all current systems, and accessing it nowadays leads to a program crash due to various security measures.

The C standard stipulates that the address 0 is explicitly considered invalid. A corresponding macro is defined in the standard libraries: NULL, which is used in C. C++ additionally provides the keyword nullptr since the C++11 standard. Both serve to mark a pointer as invalid or uninitialized.

Details

Pointer variables are often initialized with a valid address when they are defined. However, this initialization is not always possible and is not carried out automatically. It is possible that a non-initialized pointer variable contains a value that describes an invalid address. If you access this address, the program might crash.

For example, pointer variables are used to store dynamically allocated memory blocks (see the new operator). It is up to the programmer when exactly such an allocation takes place during the program run. The null pointer is used to indicate whether the pointer already contains a valid address or not. A null pointer explicitly has the semantic meaning that the pointer does not point to a valid address.
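The role of the null pointer as a marker for "not yet allocated" can be sketched as follows. This is a minimal illustration, not a complete implementation: the type LazyBuffer and the functions isAllocated and ensureAllocated are hypothetical names introduced only for this example, and cleanup of the memory is omitted for brevity.

```cpp
#include <cstddef>

// Hypothetical example type: a buffer whose memory is allocated on first use.
struct LazyBuffer {
  int* data = nullptr;     // null means: no memory has been allocated yet
  std::size_t count = 0;
};

// Returns true if the buffer already holds a valid address.
bool isAllocated(const LazyBuffer& buffer) {
  return buffer.data != nullptr;
}

// Allocates the memory only if this has not happened before.
void ensureAllocated(LazyBuffer& buffer, std::size_t count) {
  if (buffer.data == nullptr) {
    buffer.data = new int[count]();  // elements are value-initialized to 0
    buffer.count = count;
  }
}
```

A caller can invoke ensureAllocated any number of times; only the first call allocates, because afterwards the pointer is no longer null.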

The macro NULL is provided as a null pointer in the standard libraries of both C and C++. The macro NULL is basically defined in the stddef library, but is used in so many places that it is also defined in other standard libraries. The easiest way for a programmer to make the macro NULL available is to include the stdio library. In C, the macro is typically defined as a void pointer with value 0; in C++ it is an implementation-defined null pointer constant, typically an integer literal with value 0.

Since the C++11 standard, C++ has a new keyword for the null pointer: nullptr. Like NULL, it denotes a null pointer with the value 0, but it is not a macro; it is a separate keyword and thus part of the language itself. However, some compilers still do not support this standard, and programmers are sometimes forced by older code to use older standards.

When using older standards, the preprocessor macro NULL should always be used as a null pointer. When programming in pure C or C++, it is strongly discouraged to use other macros or even just the number 0. In particular, this includes keywords such as NUL, NIL, Nil, nil, Null, null or zero. These terms and keywords are used for other things or by other languages.
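One reason why the bare number 0 is a poor null pointer can be seen with overloaded functions in C++. The following sketch uses two hypothetical overloads named describe, introduced only for this illustration. The literal 0 is first and foremost an integer, while nullptr can only convert to a pointer type.

```cpp
#include <string>

// Hypothetical overloads, introduced only for this illustration.
std::string describe(int)         { return "integer"; }
std::string describe(const char*) { return "pointer"; }

// The literal 0 is an exact match for int, so the integer overload is chosen.
std::string withLiteralZero() { return describe(0); }

// nullptr cannot convert to int, so the pointer overload is chosen.
std::string withNullptr() { return describe(nullptr); }
```

The call describe(NULL) is deliberately left out: its outcome depends on how the implementation defines NULL, and it may select the integer overload or be rejected as ambiguous.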

Since a null pointer explicitly evaluates to the value 0 in C, this definition means that a pointer can also be understood as a boolean value, which results in false when the pointer is a null pointer and true in every other case. This property is very commonly used in the conditions of control structures. For example, the following code checks whether memory allocation was successful:

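#include <stdio.h>
#include <stdlib.h>

int main(){
  int* a = NULL;
  int* b = NULL;
  // Request a small block and a very large block (about 4 GB with 4-byte int).
  a = (int*)malloc(10 * sizeof(int));
  b = (int*)malloc(1000000000 * sizeof(int));
  // A pointer evaluates to false if it is a null pointer, true otherwise.
  if(a) {printf("a Allocated.\n");}
  if(b) {printf("b Allocated.\n");}
  // free accepts null pointers, so no check is needed here.
  free(a);
  free(b);
  return 0;
}

Output:

a Allocated.
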
In this example, the allocation of b was not successful because there was not enough memory. Note that the free function (like the delete operator in C++) also explicitly accepts a null pointer without an error occurring.
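Because free accepts a null pointer, resetting a pointer to null after freeing it makes a repeated release harmless. The following is a minimal sketch under that assumption; the helper names tryAllocate and releaseBlock are hypothetical and not part of any library.

```cpp
#include <cstdlib>

// Hypothetical helper: returns true if the requested block could be allocated.
bool tryAllocate(int*& block, std::size_t count) {
  block = static_cast<int*>(std::malloc(count * sizeof(int)));
  return block != nullptr;
}

// Hypothetical helper: frees a block and resets the caller's pointer to null.
void releaseBlock(int*& block) {
  std::free(block);   // free accepts a null pointer and then does nothing
  block = nullptr;    // mark the pointer as invalid for all later checks
}
```

Calling releaseBlock twice on the same pointer variable is safe here, since the second call only passes a null pointer to free. Without the reset, the second call would free an already freed address.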

Final remark: A null pointer in C has the value 0, but in principle it has nothing to do with the German term Null (zero). Rather, the word null should be understood in its English sense of invalid.

Billion dollar mistake?

The use of null pointers sometimes leads to heated discussions among students, and even experienced programmers often find themselves torn between idiomatic programming and practicality. This is because the concept of the null pointer is very simple, but just as error-prone. Today's computer science students often learn that programming languages can be designed without the concept of an invalid address, so that the question of correct addressing can be left entirely to the compiler.

Since such concepts are known today and are common in modern languages, the question arises whether the seemingly simple solution from the time when C was invented has, in the long run, caused a great deal of additional debugging effort. If all the debugging time spent worldwide over the past fifty years because of null pointers is added up, it quickly becomes clear why people speak of a billion dollar mistake. The question is whether this assessment is justified.

Everyone must answer this question for themselves. Those used to working with null pointers will continue to use them, and those who despise them will continue to despise them. The opinion of the author is the following:

Null pointers are not a billion dollar mistake, for the simple reason that although such errors are easy to make, they are arguably the easiest bugs to find, and even the most helpful ones. Experience shows that when a program crashes, it very probably does so because of a wrong pointer, and very often because the pointer is null. A simple run through a debugger, which often takes no more than a few seconds, is enough to pinpoint the precise location of the problem. The solution is usually found just as quickly. And writing unit tests to detect such errors is either trivial or often not even necessary, since they occur in the course of normal program flow anyway.

Far more problematic are bugs that arise from automatisms, i.e. when pointers are automatically initialized with an apparently valid value. Languages that do not allow null pointers design their compilers and interpreters in such a way that even an undefined object is still valid. However, the semantics of what these initial objects mean is left up to the user.

How often an object is not correctly initialized is about the same in both language paradigms, since this is ultimately always the responsibility of the programmer. The likelihood that accessing a variable that has not been fully initialized leads to an error is also about the same.

The problem with automatic initialization is that even with a debugger it is not always possible to determine immediately what lies behind an apparently valid object, or how, when and where it was initialized. The same error that leads to an immediate crash with a null pointer can therefore lead to hours of troubleshooting with an automatically set value: the symptom may only appear at a completely different point in the code, and the programmer can find the actual cause only through clever detective work and possibly extensive knowledge of the affected area of the source code.

The author refers to such errors as semantic hide-and-seek. In favor of syntactic correctness and blind compliance with idiomatic rules, potential sources of error are disguised as correct code in such a way that the compiler, static analysis tools and even unit tests no longer catch them. The fact that the problem remains latent is ignored. The following example illustrates the problem in a very simplified way:

Note: this example must be compiled as C++.

#include <stdio.h>

// remote file

typedef struct Unit Unit;
struct Unit{
  float scale;
};

// Forgotten function: it is never called in this example.
void setUnitScale(Unit* unit, float scale){
  unit->scale = scale;
}

Unit standardUnit = { 1.f };

// Automatically initialized with an apparently valid default object.
typedef struct SuperMegaSafeScaler SuperMegaSafeScaler;
struct SuperMegaSafeScaler{
  Unit* unit = &standardUnit;
};

// Initialized with a null pointer.
typedef struct BillionDollarScaler BillionDollarScaler;
struct BillionDollarScaler{
  Unit* unit = NULL;
};

// ---------------------------------

// local file

float GetLength1(float length, SuperMegaSafeScaler scaler){
  return scaler.unit->scale * length;
}

float GetLength2(float length, BillionDollarScaler scaler){
  return scaler.unit->scale * length;
}

int main(){
  SuperMegaSafeScaler safeScaler;
  BillionDollarScaler nullScaler;
  printf("Inch to cm:\n");
  printf("%f\n", GetLength1(1234., safeScaler));
  printf("%f\n", GetLength2(1234., nullScaler));
  return 0;
}

Output (Error stands for the program crash):

Inch to cm:
1234.000000
Error

The idea of this example is to convert inches into centimeters using an automatic conversion. Unfortunately, the call to setUnitScale, which would initialize the unit with the correct scaling factor (2.54), was forgotten. While the printf line calling GetLength1 prints a result, the program crashes on the line calling GetLength2 due to a null pointer access. However, it is not obvious that the program is already defective in the GetLength1 line: it prints 1234 instead of the expected 3134.36.

This example is very simple and the error is more or less obvious here. However, note that the code above the dividing line can potentially be far away from the main function, possibly nested in many more objects, possibly in a non-visible area of code, or possibly even in a precompiled module. Finding out that the unit object isn't valid can become an ordeal. Such errors can sometimes go undetected for years and lead to inconsistencies during program execution.
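The asymmetry between the two scalers can also be expressed in code: a null pointer can be tested at the point of use, whereas a pointer to a default object cannot be told apart from an intentionally set unit. The following is only a sketch; tryGetLength is a hypothetical variant of GetLength2 that is not part of the original example, and the Unit struct is repeated so that the block stands on its own.

```cpp
// Same layout as the Unit struct in the example above.
struct Unit {
  float scale;
};

// Hypothetical variant of GetLength2: reports a missing unit instead of crashing.
// Returns true and writes the result only if a unit has been set.
bool tryGetLength(float length, const Unit* unit, float& result) {
  if (!unit) {       // a null pointer evaluates to false
    return false;    // the caller learns right here that the unit is missing
  }
  result = unit->scale * length;
  return true;
}
```

No comparable check exists for the variant with the default object, because a scale of 1 is a perfectly valid value.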

Over many years of programming, the author has learned that one of the most important quality characteristics of code is maintainability. Some constructs of modern programming may in theory be better in terms of runtime, memory consumption or adherence to programming paradigms, but in practice they are difficult to maintain. Practicality, meaning the ability to quickly decipher code and adapt it flexibly without a complete redesign, is a very prominent part of software development that is sometimes ignored in purely ideological planning.

In short, the use of null pointers is incredibly convenient.