Iterators

The problem

Containers like static arrays and vectors store their elements, in memory, one after the other. The first element is stored at the address of the container, the second one is stored after the first and so on... We can then easily iterate (Access each element, one by one, from the first to the last) through the elements of such containers, simply by using pointers.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
#include <iostream> int main() { double array[] = {1.1, 2.2, 3.3}; // Defines a pointer and makes it point to the first element of the array. double *ptr = array; // Prints the value of the element pointed by the pointer. std::cout << *ptr << std::endl; // Makes the pointer point to the next element of the array. ptr++; std::cout << *ptr << std::endl; ptr++; std::cout << *ptr << std::endl; return 0; }

An other example with a vector:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
#include <iostream> #include <vector> int main() { std::vector<int> vec; vec.push_back(1); vec.push_back(2); vec.push_back(3); // Defines a pointer and makes it point to the first element of the vector 'vec'. int *ptr = &vec[0]; /* As long as the pointer does not point to the element after the last element of the vector (Outside of the vector). */ while(ptr != ((&vec.back()) + 1)) { // Prints the value pointed by the pointer. std::cout << *ptr << std::endl; // Makes the pointer point to the next element of the vector. ptr++; } return 0; }

Now, the problem is that not all types of containers store their elements, in memory, next to another. Let us look at the structure below:

template<class T> struct Cell { T value; Cell *before; };

A certain kind of container could, for example, store its elements inside structures like the one above. There would be one object of type Cell for each element of the container. The container would have one member, a pointer of type Cell, and it would point to its last element. Each cell would be dynamically allocated individually, so they would not be stored next to another. The member value of a cell would store the value of the element and the member before would point to the element before it. We could make the member before of the first element point to either the address 0 or the last element, in order to detect it is the first one. To access the second last element, we would have to access the first element and use its pointer before, which would point to it. To access the element before the second last element, we would have to go through the last one and then one before it. As we will see later, there are advantages and disadvantages to that way of storing elements.

Now, the thing is that with that kind of containers, it is not possible to directly access an element of the container. We have, in order to access a specific element inside it, to go through all the elements between it and the last one. Also, to increment through the elements, we can not use pointers and simply increment them using the increment operator ++, because the elements are not stored, in memory, one after the other...

The solution

The solution to the problem presented above is... Using iterators. An iterator is a special kind of pointer made to increment through the elements of a container. Using the increment operator ++, we can make the iterator point to the element after the one it points to, and using the decrement operator --, we can make it point to the element before the element it points to, and that, no matter what kind of container is used. If the container used is a vector, the iterator will behave similarly to a normal pointer, but if the container stores the elements differently, the iterator will execute the appropriate instructions to access the element after/before the one it points to.

Like classic pointers, an iterator has a type and an iterator of a certain container can not point to the elements of an other kind of container.

Defining and using iterators

An iterator is defined like any other variable. Its type must be the type of the container it will iterate through, suffixed with ::iterator. There is no asterisk *.

1
2
3
4
5
6
7
8
9
10
#include <vector> int main() { std::vector<int> vec(4); std::vector<int>::iterator it; return 0; }

We can not directly assign an iterator with the address of an element of a container. We must assign it with an other iterator. The method std::vector<T>::iterator std::vector<T>::begin() returns an iterator to the first element of the vector and the method std::vector<T>::iterator std::vector<T>::end() returns an iterator to the (imaginary) element after the last one (Directly outside the vector).

Like with pointers, the element pointed by an iterator is accessed by dereferencing it using an asterisk *.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
#include <iostream> #include <vector> int main() { std::vector<int> vec; for(unsigned c=0; c < 5; c++) vec.push_back(c); // Defines an iterator pointing to the first element of the vector 'vec'. std::vector<int>::iterator it = vec.begin(); // As long as the iterator has not reached out the vector. while(it != vec.end()) { // Prints the value of the element pointed by the iterator. std::cout << *it << std::endl; // Makes the iterator point to the next element of the vector. it++; } return 0; }

An other example, iterating from the last element to the first one:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
#include <iostream> #include <vector> int main() { std::vector<int> vec; for(unsigned c=0; c < 5; c++) vec.push_back(c); // Defines an iterator pointing to the last element of the vector 'vec'. std::vector<int>::iterator it = vec.end()-1; /* An iterator pointing to the (imaginary) element before the first one (Directly outside the vector). */ std::vector<int>::iterator rightBeforeBegining = vec.begin()-1; // As long as the iterator has not reached out the vector. while(it != rightBeforeBegining) { // Prints the value of the element pointed by the iterator. std::cout << *it << std::endl; // Makes the iterator point to the element before the one it points to. it--; } return 0; }

Note that we can use the +, -, += and -= operators with iterators to 'jump over' elements:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
#include <iostream> #include <vector> int main() { std::vector<int> vec; for(unsigned c=0; c < 5; c++) vec.push_back(c); std::vector<int>::iterator it = vec.begin(); it += 2; std::cout << *it << std::endl; it -= 2; std::cout << *it << std::endl; it += 4; std::cout << *it << std::endl; return 0; }

Iterators are often used inside loops for:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
#include <iostream> #include <vector> int main() { std::vector<char> vec; for(unsigned c=0; c < 10; c++) vec.push_back('A' + c); for(std::vector<char>::iterator it = vec.begin(); it != vec.end(); it++) std::cout << *it << std::endl; return 0; }

Reverse iterators

Reverse iterators are like iterators, except that adding numbers to them has to effect to make them go toward the beginning of the container, rather than toward the end, and subtracting numbers from them, make them go toward the end.

A reverse iterator is defined by writing, after the type of the container, ::inverse_iterator rather than ::iterator.

The method std::vector<T>::reverse_iterator std::vector<T>::rbegin() returns a reverse iterator pointing to the last element of the vector (Reverse of the beginning). The method std::vector<T>::reverse_iterator std::vector<T>::rend() returns a reverse iterator to the (imaginary) element before the first element of the vector (Reverse of the end).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
#include <iostream> #include <vector> int main() { std::vector<char> vec; for(unsigned c=0; c < 5; c++) vec.push_back('A' + c); /* Iterates from the last element of the vector to the first, using a reverse iterator. */ std::vector<char>::reverse_iterator revIt = vec.rbegin(); while(revIt != vec.rend()) { std::cout << *revIt << std::endl; revIt++; } std::cout << std::endl; /* Iterates from the first element of the vector to the last, using a reverse iterator. */ revIt = vec.rend() - 1; while(revIt != (vec.rbegin()-1)) { std::cout << *revIt << std::endl; revIt--; } return 0; }

Iterators pointing to constant elements

Since C++11, we can define iterators pointing to constant elements, so they can not change their values. They are defined with the names: const_iterator and const_reverse_iterator, rather than iterator and reverse_iterator.

The methods std::vector<T>::cbegin, std::vector<T>::cend, std::vector<T>::crbegin and std::vector<T>::crend return iterators pointing to constant elements.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
#include <iostream> #include <vector> int main() { std::vector<int> vec; for(unsigned c=0; c < 10; c++) vec.push_back(c*c); std::vector<int>::const_iterator cIt; cIt = vec.cbegin(); // Error: The iterator points to a constant element. // *cIt = 67; // Works. The iterator, itself, is not constant. cIt += 6; // We can read the value of the element pointed by the iterator. std::cout << *cIt << std::endl; return 0; }

An iterator pointing to constant elements and a constant iterator are not the same. A constant iterator may change the value of the element it points to, but we can not change which element it points to.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
#include <iostream> #include <vector> int main() { std::vector<int> vec; vec.resize(5); // Defines a constant iterator pointing to the first element of the vector. const std::vector<int>::iterator it = vec.begin(); // It works, the iterator do not point to a constant element. *it = 43; // Error, the iterator, itself, is constant. // it++; // Error, the iterator, itself, is constant. // it = vec.begin() + 2; std::cout << *it << std::endl; *it = 79; std::cout << *it << std::endl; return 0; }

Note on the validity of iterators

It it important to be aware that if a vector is reallocated, all the iterators (And pointers) pointing to its elements are invalidated. Iterators (And pointers) store the address of the element they point to and when a vector is reallocated, the address of its elements changes.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
#include <iostream> #include <vector> int main() { std::vector<unsigned> vec(3); std::vector<unsigned>::iterator it = vec.begin(); *it = 6; // Resizes the vector: It is reallocated. vec.resize(500); // Error, the iterator is now invalid: // *it = 9; // Makes the iterator point to the first element of the vector, again. it = vec.begin(); return 0; }