Sandrino's WEBSITE

C++ Rule of Five

Introduction

Welcome, fellow coders and enthusiasts, to a journey through the elegant and powerful realm of C++. Today, we delve into a fundamental concept that not only embodies the essence of modern C++ design but also empowers developers to write efficient, expressive, and resource-conscious code: the Rule of Five.

In the dynamic landscape of C++ programming, the Rule of Five stands as a guiding principle that transcends the mere syntax of the language. It encapsulates the quintessence of resource management, enabling us to wield the full potential of our creations while navigating the complexities of memory, ownership, and the life cycle of objects.

Join me as we embark on a quest to demystify the Rule of Five, unraveling its significance, understanding its implications, and mastering its application. Whether you're a seasoned developer seeking to deepen your understanding of C++ intricacies or a newcomer eager to grasp the essence of efficient code craftsmanship, this exploration promises to be a rewarding adventure.

So, let's embark on this voyage together, as we uncover the secrets behind the Rule of Five, unlocking a world where simplicity and performance converge in the heart of C++ programming.

Definition

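The Rule of Five says that a class which needs a user-defined version of any one of the five special member functions (destructor, copy constructor, copy assignment operator, move constructor, move assignment operator) most likely needs all five. As we explore the Rule of Five, this definition will serve as the foundation for understanding the nuances of resource management and the importance of each special member function in achieving efficient and robust C++ code.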

We have already encountered most of these special member functions in previous articles about move semantics.

Why is it needed

The Rule of Five is needed in C++ to address the challenges and responsibilities associated with resource management in user-defined types. C++ is a versatile programming language that provides developers with manual control over memory allocation and deallocation, enabling efficient low-level programming. However, with this power comes the responsibility of correctly managing resources to prevent memory leaks, resource exhaustion, and undefined behavior.

In summary, the Rule of Five in C++ is needed to provide a clear and consistent approach to managing resources in user-defined types. It enhances control over memory and other resources, promotes efficient object manipulation, and helps developers create robust, predictable, and high-performance code.

But what about the Rule of Three?

Before C++11, the guideline was known as the Rule of Three: a class that needs a user-defined destructor, copy constructor, or copy assignment operator most likely needs all three. With the introduction of move semantics in C++11, which added the move constructor and move assignment operator, the guideline was expanded to the "Rule of Five." The Rule of Five recognizes the importance of these additional operations for more efficient resource management, especially when dealing with movable resources like dynamically allocated memory. And because I already talked about move semantics, it seemed like a good idea to talk about the Rule of Five.

Additional Information

To make it clear, as far as I understand it: the "custom" special member functions are only needed if we have to implement at least one of them. This is mostly the case when we work with dynamic data (new, delete), because we have to allocate the memory in the constructor and also take care to delete that allocated memory correctly in the destructor.

Some Examples

Here is the code of a simple class that holds some data on the heap. As we can see, the member "data" is a pointer to a heap-allocated array of integers. Therefore we need a custom constructor and also a custom destructor. The cppreference page on the Rule of Five states: "Because the presence of a user-defined (or = default or = delete declared) destructor, copy-constructor, or copy-assignment operator prevents implicit definition of the move constructor and the move assignment operator, any class for which move semantics are desirable, has to declare all five special member functions" (Rule of Five, cppreference).

#include <algorithm> // For std::copy
#include <cstddef>   // For size_t
#include <iostream>

class DynamicIntArray {
public:
  // 1. Default Constructor; new int[sz]() value-initializes every element to 0
  DynamicIntArray(size_t sz = 0) : data(new int[sz]()), size(sz) {}

  // 2. Destructor
  ~DynamicIntArray() { delete[] data; }

  void printArray() const {
    std::cout << "Array Elements: ";
    for( size_t i = 0; i < size; ++i )
    {
        std::cout << *(data + i) << " ";
    }
    std::cout << std::endl;
  }

public:
  int *data;
  size_t size;
};

int main() {
    DynamicIntArray arr1(5);
    arr1.printArray(); // Prints "Array Elements: 0 0 0 0 0"
    return 0;
}

This code compiles just fine with: clang++ content.cpp -o content -std=c++17 -Wall

But now we will extend the main function with a little copy construction:

int main() {
  DynamicIntArray arr1(5);
  std::cout << "Address of arr1: " << arr1.data << std::endl; // Address of arr1: 0x600003229240
  arr1.printArray(); // Array Elements: 0 0 0 0 0
  
  {
    DynamicIntArray arr2 = arr1; // Copy Constructor
    std::cout << "Address of arr2: " << arr2.data << std::endl; // Address of arr2: 0x600003229240

  }
  arr1.printArray(); // Undefined behavior (reads freed memory), e.g. Array Elements: 1792507904 32568 0 0 0

  return 0;
}

If we try to compile this, it works. The compiler says something like "Yes, I see you have a custom destructor, and you will surely take care of the other special member functions, so here, I compile it :)" But if we execute the code, we get a double free error:

content(40943,0x1d70edec0) malloc: *** error for object 0x600003495240: pointer being freed was not allocated
content(40943,0x1d70edec0) malloc: *** set a breakpoint in malloc_error_break to debug

(content is the name of the output file.) And yes, the problem is that we actually ignored the Rule of Three/Five. The question I would ask is: where do we free the memory?

Why a double free?

Because we did not provide a custom copy constructor, the compiler was friendly enough to generate one for us. The problem is that the generated copy constructor copies each member as it is, which is fine for simple data types but not for a pointer that owns dynamically allocated memory. That means we only create a shallow copy of the pointer to the heap memory (check the addresses of the two pointers), not a copy of the array itself.

Also check the two printArray() calls. The first call prints the data just fine, but the second call gives us garbage in the best case. Because I created a separate scope for arr2, its destructor runs when that scope exits and calls delete[] on its data pointer. After that, arr1 accesses an already released memory area and prints whatever currently is at that address. Finally, on exit of the main function, the destructor of arr1 is called, which in turn tries to free the already freed memory again.

Some Improvements

Adding this to the DynamicIntArray class:

    // 3. Copy Constructor
    DynamicIntArray(const DynamicIntArray& other) : data(new int[other.size]), size(other.size) {
        std::copy(other.data, other.data + other.size, data);
    }

Now we have explicitly added a copy constructor and given instructions on what to do with the heap data. Let's try again to compile and run that bad boy.

Now I see the following output:

Address of arr1: 0x600003599240
Array Elements: 0 0 0 0 0
Address of arr2: 0x600003599260
Array Elements: 0 0 0 0 0

We can see that the addresses of the two pointers are now finally different because of the call to new int[other.size] in the initializer list, and the data is copied correctly with std::copy. In this run the address of data in arr2 is 0x20 (32) bytes higher than the one in arr1; where exactly the second block ends up is decided by the allocator.

TO BE CONTINUED