Integer Arithmetics

This page was translated by a robot.

Integer arithmetic handles the basic operations of addition, subtraction, multiplication and division within a processor. Integer numbers are processed in processors in so-called registers . Registers are memory cells that store a single value. At the beginning of the home computer era, registers processed 8 or 16 bits, but a little later 32 bit values ​​could also be processed. The 32-bit era lasted relatively long, establishing a quasi-standard of 32 bits for the intubiquitous type. Today's processors have 64-bit registers.

Different integer sizes can be stored and processed in a single register, but the built-in basic operations normally expect the registers of the two operands to store the same size, i.e. the same number of binary digits.

In principle, the four basic operations can be treated in binary in exactly the same way as the decimal arithmetic, which is learned in elementary school. Only calculations have to be mapped in a processor using the binary system and carried out using logical operations. On this page, for the sake of simplicity, only integer encoding using two's complement is considered.

Addition and Subtraktion

When adding two numbers with the same number of digits (regardless of the number system), the result can always be stored with the same number of digits plus one additional digit. The additional digit is called Übertrag in German or Carry in English . For example, gives 7 + 8 = 15, ie the sum 5with carry 1. The following four combinations are possible for binary digits:

Addition       Sum             Remainder
------------------------------------------
0 + 0 =  0     0 XOR 0 = 0     0 AND 0 = 0
0 + 1 =  1     0 XOR 1 = 1     0 AND 1 = 0
1 + 0 =  1     1 XOR 0 = 1     1 AND 0 = 0
1 + 1 = 10     1 XOR 1 = 0     1 AND 1 = 1

This table shows that the sum can be simulated as an XOR operation and the carry as an AND operation of the two operands. A processor executes these logical operations for all binary digits at the same time. If a carry remains after the link, this is added to the next higher digit in a further step using a shift-left operation (SHL).



Decimal 14
Decimal 11













Sum: 25
  XOR         AND         SHL
--------    --------    --------
00001110    00001110
00001011    00001011
--------    --------
00000101    00001010 -> 00010100

00000101    00000101
00010100    00010100
--------    --------
00010001    00000100 -> 00001000

00010001    00010001
00001000    00001000
--------    --------
00011001    00000000

00011001

Since the registers in a processor cannot store any number of binary digits, an addition can result in a carry that can no longer be accommodated in the register. This excess bit is called the carry bit . If, for example, 4-bit registers are used below (with bit numbers 0 to 3), then the carry bit (abbreviated to C) designates the hypothetical bit number 4 of the register. The same example as above now looks like this:



Decimal 14
Decimal 11













Sum: 9 + Carry
C XOR     C AND     C SHL
------    ------    ------
0 1110    0 1110
0 1011    0 1011
------    --------
0 0101    0 1010 -> 1 0100

0 0101    0 0101
1 0100    1 0100
------    --------
1 0001    0 0100 -> 0 1000

1 0001    1 0001
0 1000    0 1000
--------    --------
1 1001    0 0000

1 1001

The result is (wrongly) 9with the carry bit set. After the addition has been carried out, the carry bit remains set in the processor and can be read out if required. Thus, the carry bit serves as a status indication of whether the previous operation resulted in a carry. In this context, the bit is also referred to as a flag (display flag).

However, a set carry flag only means that a carry into the surplus bit took place when the digits were linked. Depending on how the bits are interpreted, this does not necessarily mean that the result would be wrong. So far, it was implicitly assumed in the example that the numbers used are unsignedto be interpreted without a sign. However, if the numbers are signedassumed to be correct, the following interpretation of the calculation results:

Decimal: -2
Decimal: -5

Sum: -7 + Carry
  1110
  1011
+ ----
1 1001

This result already corresponds to the correct result if the numbers are encoded in two's complement . The most significant bit of the 4-bit register is interpreted as the sign and is referred to as the sign bit . This bit also remains stored in the processor as a sign flag after the operation has been carried out , regardless of whether the numbers are to be interpreted as signedor unsigned. This makes it very easy to query whether the result is negative or positive (if it is a signednumber).

By using two's complement, both positive and negative numbers can be treated exactly the same when adding. In addition, by using the two's complement, an addition can easily be made into a subtraction: simply negate the second operand.

More Arithmetic Flags

The flags discussed above are not important for the actual operation, but are important for subsequent decisions. For example, a programmer might want to run different code depending on whether the sign flag is set (that is, whether the result is positive or negative). Just as often a programmer wants to test whether a number is zero or whether one number is greater or less than another. Such decisions are all made in the processor by subtracting two binary numbers. After the subtraction has been performed, all the required flags are set and a processor can read them and execute appropriate code at the programmer's request.

However, the carry and sign flags mentioned above are not yet sufficient for correct handling and decision-making in all possible cases of addition or subtraction. On the one hand, the so-called zero flag is required. It is set by the processor if (and only if) the result is zero. This can be used, for example, to check whether two numbers are equal by subtracting them from one another. If the result is zero, they are equal.

Furthermore, the so-called overflow flag is required, which indicates whether an operation caused an overflow of a signedvalue. This bit is set if the carry to the most significant place differs from the carry bit when adding two values.

The following consideration is made for an explanation of this link: If two positive signedvalues ​​are added, the carry bit is definitely deleted. If the values ​​are too large and there is a carry to the most significant bit, the result will be negative, which is not correct. If, on the other hand, two negative signedvalues ​​are added, the carry bit is definitely set, but the sum of the two sign bits alone results in 0, which would not be correct. The result is therefore only correct if there is also a carry to the most significant bit, i.e. the two summands are sufficiently small. Thus the result is correct in both cases if the carry to the most significant bit is the same as the carry bit.

If when adding twosigned-Addends with different signs are used, a distinction must be made as to whether the amount of the negative addend is smaller or larger than that of the positive addend. If the absolute value of the negative addend is smaller, the result must be positive. Because of the two's complement coding, there is always a carry to the most significant bit. Since exactly one of the summands was negative, the carry bit will also be set as a result. If the absolute value of the negative addend is greater than that of the positive addend, there is guaranteed to be no carry to the most significant bit due to the coding using the two's complement. However, since only one of the addends was negative, the carry bit will not be set either. So the result is atsigned- Operands with different signs are always correct and also follows the rule that the carry flag must be the same as the carry to the most significant bit.

Summary: The carry flag, the overflow flag, and the sign flag denote different things. The sign flag indicates how the most significant bit of the result is set. The carry flag indicates whether the value range of a unsignednumber has been exceeded. The overflow flag indicates whether the value range of a signednumber has been exceeded.

When adding any unsignedand signednumbers, different combinations of carry , sign and overflow flags can result. The following table lists all possible cases. The zero flag is not listed here, it is simply set when the result is zero.

Operands           4-Bit Example        unsigned        signed        C S O
---------------------------------------------------------------------------
pos + pos = pos    0011 + 0100 = 0111    3 +  4 =  7    3 +  4 =  7   0 0 0
pos + pos = neg    0100 + 0101 = 1001    4 +  5 =  9    4 +  5 = -7   0 1 1
pos + neg = pos    0101 + 1110 = 0011    5 + 14 =  3    5 + -2 =  3   1 0 0
pos + neg = neg    0100 + 1010 = 1110    4 + 10 = 14    4 + -6 = -2   0 1 0
neg + neg = pos    1011 + 1100 = 0111   11 + 12 =  7   -5 + -4 =  7   1 0 1
neg + neg = neg    1101 + 1110 = 1011   13 + 14 = 11   -3 + -2 = -5   1 1 0

This table shows that the result of an unsignedarithmetic is wrong if the carry flag is set. However, the result of an signedcalculation is incorrect if the overflow flag is set. In these cases, the register is underflow or overflow, which is referred to in English as underflow and overflow , respectively . However, a clear distinction must be made here between the overflow of the register and the overflow bit. We generally speak of an overflow or underflow when the range of values ​​of a type is exceeded during any operation and the result can therefore not be stored with the same type.

The tricks with which these flags are used to decide whether numbers are greater than, less than or equal to are not covered on this page. For this, please refer to literature on assembler. The comments on the flags are only included here for the sake of completeness.

Multiplication

Multiplication works very similar to written multiplication in the normal decimal system:

123 * 456
---------
     1368
     912-
+   456--
---------
    56088

Each digit of the first operand is individually multiplied by the entire second operand, with the respective intermediate result being shifted one digit to the left (digit shift) and summed up at the end. In the computer, this now works according to a very similar scheme, whereby here it is only necessary to multiply by 0 or 1 (which admittedly is not too difficult). In any case, the result fits into a register that is twice as large as the original operands. In the following example, 4-bit registers are used for calculation, which means that the result can be stored in an 8-bit register.

Decimal: 3
Decimal: 5








Decimal: 15
0011
0101

0011 * 0101
-----------
       0101
      0101-
     0000--
+   0000---
-----------
   00001111

Here on this page, the explanation of multiplication is basically complete with this example. Nevertheless, it should be noted that the actual execution in a processor can still be greatly optimized. Since register space was very scarce in the past, each bit was used twice where possible. For detailed information, however, reference is made to other pages.

However, attention should be paid to the fact that this method only works correctly if both operands are positive. Here is an example of an incorrect calculation:

Decimal: -3
Decimal: 5








Decimal: 65
1101
0101

1101 * 0101
-----------
       0101
      0000-
     0101--
+   0101---
-----------
   01000001

The most obvious solution to this problem would be to convert the negative operands using two's complements, carry out the multiplication as normal with positive numbers and, at the end of the calculation, negate the result again depending on the sign of the two operands (minus times minus gives plus). Digital technology also has some amazing mathematical tricks for calculating negative multiplications. However, they will not be explained further on this page; the interested reader should find out more elsewhere.

Division

When dividing integers, a so-called subtraction algorithm is normally used. The first operand is called the dividend and the second operand is called the divisor. The idea behind the subtraction algorithm is that the largest possible multiple of the divisor is always subtracted from the dividend until the dividend ends up being too small to be divided at all. This remainder is called remainder , as in textbook mathematics, or remainder in English.

In order to produce a multiple of the divisor in the binary system, all bits can simply be shifted to the left. In the following example, two 8-bit numbers are divided by each other. At the beginning, the divisor is shifted to the left so far that the most significant digit 1 of the binary number is in the most significant position of the 8-bit register. The divisor is then successively shifted one digit to the right (shift right, SHR), while the dividend becomes smaller and smaller when the divisor finds its place in the dividend and is subtracted from it.

Dividend: 220
Divisor:   12

Dividend: 220
Divisor:  192

Dividend:  28
Divisor:   96

Dividend:  28
Divisor:   48

Dividend:  28
Divisor:   24

Dividend:   4
Divisor:   12

Ergebnis: 18 Rest 4
  11011100
  00001100 -> 11000000 (shifted to the left)

  11011100
- 11000000 1 times
----------
  00011100
- 01100000 0 times
----------
  00011100
- 00110000 0 times
----------
  00011100
- 00011000 1 times
----------
  00000100
- 00001100 0 times
  
10010 remainder 00000100

A detailed description of various optimizations and boundary conditions is not included on this page. Here, too, reference is made to the special treatment of negative numbers, but information on this can be found in other sources.

Next Chapter: ASCII and UTF-8