Endianness

Big endian vs little endian

Endianness is not a concept directly related to C/C++. However, if we are not aware of it, it may cause errors in some situations.

There are basically two types of endianness: little endian and big endian. Endianness is a characteristic of the processor. x86 and x86-64 processors, from both Intel and AMD, are little endian. Big endian is found on other architectures, such as the Motorola 68000 family and SPARC.

Endianness is about the order in which bytes are stored. Storage devices (hard drives, solid-state drives...) and the memory of computers (RAM) only store bytes. So if we want to store a variable of type int, which, let us say, occupies 4 bytes, we need to divide it into 4 bytes and then store them.

Let us look at the following number: 1234. The digit 1 is the most significant digit, because it is the one that carries the most weight (1000), and 4 is the least significant digit. When we divide a variable into bytes, we can also rank them from most significant to least significant. So, should we store the bytes from the least significant to the most significant, or the other way around? This is where processors differ. Big endian processors store bytes from the most significant to the least significant, and little endian processors store bytes from the least significant to the most significant.

By the way, we write numbers in big endian order. For example, in the number 5002, the most significant digit is 5 and we write it first.

When can endianness be problematic?

When a program does not interact with other computers, endianness does not matter much. However, if the program receives, over the Internet, data from a computer of a different endianness, or if it reads a binary file written by a computer with another endianness, the variables that are made of more than one byte will be wrongly interpreted. A solution, when the endianness is different, is to invert the order of the bytes of the variables that occupy more than 1 byte.

Obviously, a variable made of one byte is stored the same way on a big and little endian machine, because there is no most or least significant byte in a set of one byte.

Detecting the endianness

We can use functions from libraries to retrieve the endianness of the machine the program is running on, but we will see how to test it manually. This is not very complex. Note that the following example works only if the byte size is 8 bits. Here is a way to do it:

    #include <iostream>

    /*
    Returns true if the processor is little endian. Otherwise, false.
    */
    bool isLittleEndian()
    {
        // We define an unsigned int and set it to 0. All its bits are set to 0.
        unsigned n = 0;

        // We set the first byte of n to 255 (maximum value an 8-bit unsigned byte can hold).
        ((unsigned char*)&n)[0] = 255;
        // Now, the first 8 bits (as stored in memory) of n are set to 1.

        /*
        If the processor is little endian, then variables are stored with their
        least significant bytes first. Setting the first 8 bits to 1 and the
        others to 0 should then set the value of an unsigned int variable to 255.
        255 == 0000000011111111b
        */
        return n == 255;
    }

    /*
    Returns true if the processor is big endian. Otherwise, false.
    */
    bool isBigEndian()
    {
        // We define an unsigned int and set it to 0. All its bits are set to 0.
        unsigned n = 0;

        // We set the first byte of n to 255 (maximum value an 8-bit unsigned byte can hold).
        ((unsigned char*)&n)[0] = 255;
        // Now, the first 8 bits (as stored in memory) of n are set to 1.

        /*
        If the processor is big endian, then variables are stored with their
        most significant bytes first. The first byte is then the most
        significant one, so the value of n is 255 shifted left by the width of
        all the other bytes (assuming 8-bit bytes).
        With a 2-byte unsigned int, that is 65280 == 1111111100000000b
        */
        return n == (255u << (8 * (sizeof(unsigned) - 1)));
    }

    int main()
    {
        if(isLittleEndian())
            std::cout << "Little endian." << std::endl;
        else
            std::cout << "Big endian." << std::endl;

        return 0;
    }

The variable n has been defined as unsigned int. It could have been something else. All that matters is that the variable is an unsigned integer made of at least 2 bytes.

The example above works, but it is not the best way to test endianness, and it works only if a byte is one octet (8 bits). That is rarely a problem, since it is almost always safe to assume so, but it is better to write code that does not depend on the byte size when possible. The example above is there only to help illustrate endianness.

Here is a better version that does not depend on the byte size:

    #include <iostream>

    // Enum class is a feature of C++11. We can replace it by a simple enum to make it C compatible.
    enum class Endianness{ Little=0, Big=1 };

    /*
    Returns a value, of the enum class type Endianness, representing the
    endianness of the processor.
    */
    Endianness endianness()
    {
        unsigned n = 0;
        ((unsigned char*)&n)[0] = 255;

        unsigned n2 = 0;
        ((unsigned char*)&n2)[1] = 255;

        /*
        Even without knowing the byte size, we know that if the endianness is
        big endian, then the value of the variable n is greater than the value
        of n2. The reason is that if the processor is big endian, then it
        considers that the first bits stored are the most significant, and the
        first bits of n are set to 1 while the first bits of n2 are set to 0.
        If the processor is little endian, it is the other way around.
        */

        /*
        If the result is 1 (true), then the endianness is big endian
        (int(Endianness::Big) == 1).
        If the result is 0 (false), then the endianness is little endian
        (int(Endianness::Little) == 0).
        */
        return (Endianness)(n > n2);
    }

    int main()
    {
        if(endianness() == Endianness::Little)
            std::cout << "Little endian.\n";
        else
            std::cout << "Big endian.\n";

        return 0;
    }

Switching the endianness of a variable

Converting a variable from little endian to big endian (or the other way around) is done by reversing the order of its bytes. Here is a way to do it:

    #include <iostream>

    /*
    Switches the endianness of a variable between big and little endian.
    The first argument is a pointer to the variable to convert.
    The second argument is the size, in bytes, of the variable.
    */
    void switchEndianness(void* var, unsigned byteSize)
    {
        /*
        We switch the values of the first and last byte, then we switch the
        values of the second and second-to-last byte and so on, until all
        bytes are inverted.
        */

        unsigned char* bytes = (unsigned char*)var;
        unsigned char temp; // To temporarily hold the value of a byte for the switch.

        for(unsigned c=0; c < byteSize/2; c++)
        {
            // Saves the value of the byte at the left.
            temp = bytes[c];

            // Assigns the value of the byte at the right to the one at the left.
            bytes[c] = bytes[byteSize-(c+1)];

            // Assigns the saved value of the byte at the left to the one at the right.
            bytes[byteSize-(c+1)] = temp;
        }
    }

    int main()
    {
        unsigned short num = 255;

        std::cout << num << std::endl;

        // Switches the endianness of num
        switchEndianness(&num, sizeof(unsigned short));

        std::cout << num << std::endl;

        return 0;
    }

If we consider that the type unsigned short occupies 16 bits, then, when the value of a variable of that type is 255, its bits (the most significant ones being on the left) are equal to:

00000000 11111111b

If we change its endianness, its value becomes 65280 and its bits:

11111111 00000000b