Binary IO with files

Opening a file

Before reading from or writing to a file, we must open it. With the C Standard Library, a file is opened with the function fopen, whose prototype is FILE* fopen(const char* fileName, const char* mode). The first argument is the path to the file and the second one is the mode in which the file will be opened (for example, read only or write only). Both arguments are null-terminated strings. On success, a pointer to a structure of type FILE is returned; otherwise, a null pointer is returned. The structure FILE represents what is called a stream: an interface used to read data from or write data to something (we will see later that streams are not used only for files).

Here are the opening modes:

Mode   Description
"r"    Opens an existing file for reading operations only.
"w"    Creates an empty file for writing operations only. If the file already exists, its content is deleted.
"a"    Opens a file where writing operations are added to the end of it. Attempts to change the output position are ignored. The file is created if it does not exist.
"r+"   Opens a file for input and output operations. The file must exist.
"w+"   Opens a file for input and output operations. If the file does not exist, it is created; if it already exists, its content is deleted.
"a+"   Opens a file for input and output operations. If the file does not exist, it is created. We can change the position inside the file for reading, but each writing operation sets the cursor back to the end of the file, so we can only write to the end of the file.

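As a minimal sketch of how three of these modes behave, the helpers below (count_bytes and put_text are hypothetical names, not part of the C Standard Library) show that "r" fails on a missing file, "w" discards previous content and "a" adds to the end. The function fputs, used here to write a string, is a standard function that is not covered on this page.

```c
#include <assert.h>
#include <stdio.h>

/* Returns the number of bytes in the file, or -1 if it cannot be opened with "r". */
long count_bytes(const char *path)
{
    assert(path != NULL);
    FILE *file = fopen(path, "r");   /* "r" fails if the file does not exist */
    if (file == NULL)
        return -1;
    long count = 0;
    while (fgetc(file) != EOF)
        count++;
    fclose(file);
    return count;
}

/* Writes 'text' to 'path' using the given mode ("w" or "a").
   Returns 0 on success, -1 on failure. */
int put_text(const char *path, const char *mode, const char *text)
{
    assert(path != NULL && mode != NULL && text != NULL);
    FILE *file = fopen(path, mode);
    if (file == NULL)
        return -1;
    int ok = (fputs(text, file) != EOF);
    if (fclose(file) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Calling put_text with "w" and then with "a" on the same path should leave both texts in the file, while a second call with "w" should leave only the last text.
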
By default, files are opened as text files. If the character b is added after the first letter of the opening mode, the file is opened as a binary file (e.g. "rb", "wb", "ab+", "rb+"). It is also possible to add the b character after the plus sign (e.g. "a+b", "r+b"). Whether the file is in binary or text mode may change the behavior of some read and write operations. For example, on some systems (Windows, for instance), a new line is represented in files using two characters instead of one. So, if we output the value 10, which is the newline character in ASCII, in text mode, two bytes may be written instead of one. In binary mode, what we output is exactly what is written.
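
The difference can be observed with a small experiment. The sketch below writes the single value 10 with a given mode and then counts the bytes actually stored in the file; size_after_writing_newline is a hypothetical helper name chosen for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Writes the single byte 10 using 'mode', then returns the size of the
   resulting file in bytes, or -1 on error. */
long size_after_writing_newline(const char *path, const char *mode)
{
    assert(path != NULL && mode != NULL);
    FILE *file = fopen(path, mode);
    if (file == NULL)
        return -1;
    int write_failed = (fputc(10, file) == EOF);
    if (fclose(file) != 0 || write_failed)
        return -1;

    file = fopen(path, "rb");   /* binary mode, so every stored byte is counted */
    if (file == NULL)
        return -1;
    long size = 0;
    while (fgetc(file) != EOF)
        size++;
    fclose(file);
    return size;
}
```

With the mode "wb" the result should always be 1. With the mode "w" it depends on the system: 1 where text and binary files are stored the same way, 2 where a new line is stored as two characters.
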

Since the 2011 version of the C language (C11), it is possible to append the letter x to a w mode (e.g. "wx", "wbx", "w+x") to make the opening fail when the file already exists. This prevents the content of an existing file from being erased.
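
A minimal sketch of this exclusive mode follows. It assumes a C library that implements the C11 x modifier (older libraries may not), and create_if_absent is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>

/* Creates 'path' only if it does not already exist.
   Returns 1 if the file was created, 0 otherwise. */
int create_if_absent(const char *path)
{
    assert(path != NULL);
    FILE *file = fopen(path, "wx");   /* C11: fails instead of erasing an existing file */
    if (file == NULL)
        return 0;
    fclose(file);
    return 1;
}
```

Calling the function twice with the same path should return 1 the first time and 0 the second time, leaving the file created by the first call untouched.
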

Note that although the C Standard Library can create files, it cannot create folders (directories). Platform-specific functions or other libraries must be used for that.

#include <stdio.h>

int main(void)
{
    FILE *file = fopen("file.txt", "w");
    /* The file "file.txt" is created and opened for writing, in text mode.
       The pointer 'file' points to an object of type FILE, which is used as
       an interface to execute input and output operations on the file. */

    return 0;
}

Closing a file

Something is missing from the example above: when we are finished with a file, we must close it using the function fclose, whose prototype is int fclose(FILE* stream). The function fclose returns 0 if the file has been successfully closed, and EOF otherwise.

#include <stdio.h>

int main(void)
{
    FILE *file = fopen("file.txt", "w");
    // The file "file.txt" is created and opened for writing.
    if (file == NULL)
        return 1; // The file could not be opened.

    fclose(file);
    // We close it.

    return 0;
}

Reopening a file

The function freopen, whose prototype is FILE* freopen(const char* path, const char* mode, FILE* stream), can be used to reopen a file using the same stream object of type FILE. The first two arguments are the same as those of the function fopen. The third one is the stream object to reuse. If the first argument is a null pointer, freopen attempts to change the access mode of the stream to the one specified as second argument, without changing the file it is associated with; which mode changes are allowed depends on the implementation. On success, freopen returns the stream passed as third argument. On failure, it returns a null pointer and the stream should no longer be used.

#include <stdio.h>

int main(void)
{
    FILE *file = fopen("file.txt", "w");
    // The file "file.txt" is created and opened for writing.
    if (file == NULL)
        return 1;

    if (freopen(NULL, "r", file) == NULL)
        return 1;
    // The file "file.txt" is reopened for reading, if the implementation allows this mode change.

    if (freopen("secondFile.txt", "w", file) == NULL)
        return 1;
    /* The file "secondFile.txt" is created and opened for writing
       using the stream 'file'. */

    fclose(file);
    // We close the file.

    return 0;
}

Writing binary data

The prototype of the function used to output binary data (bytes) to a stream (file) is size_t fwrite(const void *data, size_t size, size_t count, FILE* stream). The first argument is a pointer to the data to write, the second is the size in bytes of one element in the array pointed to by the first argument, the third is the number of elements to output and the last one is a pointer to the stream object that represents the file to output to. The function returns the number of elements that have been successfully written, which is less than count only if an error occurred. Note that the type size_t is a typedef of an unsigned integer type.

#include <stdio.h>

int main(void)
{
    float data[] = {5.5f, 6.7f, 9.4f};

    FILE *file = fopen("file.bin", "wb");
    // Opens in binary write-only mode.
    if (file == NULL)
        return 1;

    fwrite(data, sizeof(float), 3, file);
    // Writes 3 floats from 'data' to the file.

    fclose(file);

    return 0;
}
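
Since fwrite reports how many elements were written, a caller can detect a failed or partial write by comparing that value with the requested count. The following is a minimal sketch of that check; save_floats is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>

/* Writes 'count' floats to 'path' in binary mode.
   Returns 0 on success, -1 on failure. */
int save_floats(const char *path, const float *values, size_t count)
{
    assert(path != NULL && values != NULL);
    FILE *file = fopen(path, "wb");
    if (file == NULL)
        return -1;
    /* fwrite returns the number of elements actually written. */
    size_t written = fwrite(values, sizeof values[0], count, file);
    /* Buffered data is flushed when the file is closed, so closing can fail too. */
    int close_failed = (fclose(file) != 0);
    return (written == count && !close_failed) ? 0 : -1;
}
```

The result of fclose is checked as well because output is usually buffered, so a write problem may only show up when the file is closed.
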

Reading binary data

The prototype of the function used to read binary data from a file is size_t fread(void* data, size_t size, size_t count, FILE* stream). The first argument is a pointer to the array where the data read from the file will be stored, the second one is the size in bytes of one element in the array, the third one is the number of elements to read and the last one is a pointer to the file stream from which the data will be read. The function returns the number of elements successfully read, which may be less than count if the end of the file is reached or an error occurs.

#include <stdio.h>

int main(void)
{
    unsigned data[5];

    FILE *file = fopen("file.bin", "rb");
    // Opens the file in binary read-only mode.
    if (file == NULL)
        return 1;

    size_t count = fread(data, sizeof(unsigned), 5, file);
    // Reads up to 5 values of unsigned type into the array 'data'.
    // 'count' holds the number of values actually read.

    fclose(file);

    return count == 5 ? 0 : 1;
}
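
A file may contain fewer elements than requested, so the value returned by fread is the only reliable count of what was read. The sketch below handles that case; load_unsigned is a hypothetical helper name, and ferror is the standard function that tells a read error apart from reaching the end of the file.

```c
#include <assert.h>
#include <stdio.h>

/* Reads up to 'capacity' unsigned values from 'path' into 'out'.
   Returns the number of values read, or -1 if the file cannot be
   opened or a read error occurs. */
long load_unsigned(const char *path, unsigned *out, size_t capacity)
{
    assert(path != NULL && out != NULL);
    FILE *file = fopen(path, "rb");
    if (file == NULL)
        return -1;
    size_t got = fread(out, sizeof out[0], capacity, file);
    int failed = ferror(file);   /* non-zero only if an I/O error occurred */
    fclose(file);
    return failed ? -1 : (long)got;
}
```

If the file holds 3 values and the caller asks for 5, the function should return 3 and leave the last two elements of the array untouched.
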

Binary data compatibility

Unlike text files, which are generally portable between different machines, binary data might not be compatible, for two main reasons. It is however possible to resolve these potential compatibility problems by manually converting the data.

Endianness

The first reason binary data might not be compatible between two machines is endianness. If a little-endian computer outputs numbers composed of more than one byte to a file and a big-endian machine reads that file, the numbers will not be interpreted correctly. A solution is to write a byte at the start of the file with the value 0 if the data inside the file is written in little endian, or with the value 1 if it is written in big endian. When reading the file, we first read that byte to know the endianness of the data, and we then convert the multi-byte values we read (by reversing the order of their bytes) if the endianness of the machine is different.
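
The marker-byte scheme described above can be sketched as follows for 32-bit unsigned values. All function names are hypothetical, the fixed-width type uint32_t from <stdint.h> is an assumption made so that the element size is the same on both machines, and only little-endian and big-endian hosts are considered.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Returns 1 on a big-endian host, 0 on a little-endian one. */
int host_is_big_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 0;
}

/* Reverses the byte order of a 32-bit value. */
uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Writes a one-byte marker (0 = little endian, 1 = big endian)
   followed by the values in the host's byte order. */
int write_u32_file(const char *path, const uint32_t *values, size_t count)
{
    assert(path != NULL && values != NULL);
    FILE *file = fopen(path, "wb");
    if (file == NULL)
        return -1;
    unsigned char marker = (unsigned char)host_is_big_endian();
    int ok = fwrite(&marker, 1, 1, file) == 1 &&
             fwrite(values, sizeof values[0], count, file) == count;
    if (fclose(file) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Reads the marker, then 'count' values, reversing their bytes if the
   file's byte order differs from the host's. */
int read_u32_file(const char *path, uint32_t *out, size_t count)
{
    assert(path != NULL && out != NULL);
    FILE *file = fopen(path, "rb");
    if (file == NULL)
        return -1;
    unsigned char marker;
    int ok = fread(&marker, 1, 1, file) == 1 &&
             fread(out, sizeof out[0], count, file) == count;
    fclose(file);
    if (!ok)
        return -1;
    if ((marker != 0) != host_is_big_endian())
        for (size_t i = 0; i < count; i++)
            out[i] = swap32(out[i]);
    return 0;
}
```

An alternative design is to always store the data in one fixed byte order and convert on both sides when needed, which removes the need for a marker byte.
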

Floating point number representation

The other problem arises if the machine writing a binary file represents floating point numbers differently from the machine reading it. Converting between floating point representations is, however, more complex than reversing the order of a group of bytes. It is not the subject of this page, so it will not be presented here.

Why use binary files

Why use binary files if they are more complex to handle from a compatibility point of view? Because of performance. Storing numbers in binary instead of text is almost always more compact, and loading them into memory is much faster because there is no conversion from text to binary to do.
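
To make the size argument concrete, the sketch below computes how many characters a 32-bit unsigned value needs when stored as decimal text, which can be compared with the 4 bytes it always takes in binary. The helper name text_size is hypothetical, and the count excludes any separator between numbers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Number of characters needed to store 'value' as decimal text. */
int text_size(uint32_t value)
{
    /* With a null buffer and a size of 0, snprintf only computes the length. */
    int length = snprintf(NULL, 0, "%lu", (unsigned long)value);
    assert(length > 0);
    return length;
}
```

For example, 4000000000 takes 10 characters as text but only sizeof(uint32_t) bytes, that is 4, in binary. Very small values such as 7 are the exception: a single character as text is shorter than 4 bytes, which is why binary is only almost always more compact.
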