Tuesday, February 9, 2010

C++: Shallow Copy & Deep Copy (with Copy Constructor and Assignment Operator)

Shallow Copy: Copies only the member field values. If an object stores an integer value10, it copies 10, and if an object stores an address say 0x100, it copies only the address and not the data stored at that address. So when data stored is an address, we don’t have a distinct copy of the object. It is also called Default copy.

Deep Copy: It makes an entirely distinct copy of the object, no matter what the object stores. To perform deep copy, copy-constructor and overloaded assignment operator functions are required.

In order to make duplicate of an object it is necessary either to make a shallow copy or a deep copy of the object. To make copies of some objects, a shallow copy is enough but for some other objects a deep copy is necessary. So the question is how we distinguish what to perform, to copy an object.

This is identified based on how memory is allocated for the object. If the memory is allocated statically, then a shallow copy is required, but if the memory is dynamically allotted then a deep copy is necessary.

Here in this article we try to understand more about shallow and deep copies, by defining copy constructor and assignment operator functions for implementing a circular queue. Also here we try to analyze the difference between copy-constructor and assignment operator.

Example: Circular Queue

In this example not all the functions required to implement a circular queue is defined. Only the functions concerning shallow copy and deep copy are defined. To perform deep copy we need a copy constructor and an overloaded assignment operator.

//START OF CODE

Queue.h (Header File)

class Queue {

public:

//Queue constructor: creates an empty queue.

Queue();

//empty: returns true if the queue is empty; false otherwise.

bool empty();

//enqueue: adds element to the back of the queue.

void enqueue(QueueElement);

//Queue destructor: removes all elements from the queue(heap) so

//memory can be reclaimed.

~Queue();

//Queue copy constructor: allows for making copies of queues.

Queue(Queue&);

//Assignment operator:allows for assignments of queues to each //other.

Queue& operator=(Queue&);

private:

class Node {

public:

QueueElement data;

Node* next;

Node() {

//empty Node constructor: creates an empty node.

//Needed especially for the next pointer.

data = 0;

next = 0;

}

Node(QueueElement thing) {

//Node constructor: creates a node with data in it.

data = thing;

next = 0;

}

};

typedef Node* NodePointer;

NodePointer last;

int mySize;

};

Queue.cpp (Source File)

//Queue constructor: makes an empty queue.

Queue::Queue()

{

last = 0;

mySize = 0;

}

//empty: returns true if the queue is empty; false otherwise.

bool Queue::empty()

{

return (mySize == 0);

}

//enqueue: adds an element to the back of the queue.

void Queue::enqueue(QueueElement thing)

{

NodePointer nuPtr = new Node(thing);

if (mySize == 0) {

//for an empty queue, stick the new element in all applicable slots.

last = nuPtr;

nuPtr->next = nuPtr;

}

else {

//otherwise, we need to file our new member into the back.

nuPtr->next = last->next;

last->next = nuPtr;

last = nuPtr;

}

//increment the size to that size calculations are correct.

mySize++;

}

//Destructor

Queue::~Queue()

{

while (!empty())

dequeue();//definition not included here.

cout << "Queue removed." <<>

}

//Copy constructor definition.

Queue::Queue(Queue& copy)

{

last = 0;

mySize = 0;

NodePointer ptr = copy.last->next;

do {

enqueue(ptr->data);

ptr = ptr->next;

} while (ptr != copy.last->next);

}

//Assignment operator: allows for assignments of queues to each other.

Queue& Queue::operator = (Queue& data)

{

if (this == &data)//(1)checking if LHS and RHS objects are the same, so //it doesn’t result in self-assignment.

{

cout << "You can't do that! You tried self-assignment!" <<>

return (*this);

}

this->~Queue(); //(2) Empty the LHS object, before copying RHS to LHS.

NodePointer ptr = data.last->next;

do {

enqueue(ptr->data);

ptr = ptr->next;

} while (ptr != data.last->next);

return (*this); //(3)Return LHS, the new object to which copy is made.

}

//END OF CODE

Declare the circular queue object.

Queue q1; //Refer to the default constructor definition.

Insert some elements to the queue through the ‘enqueue’ process.

q1.enqueue(10);

q1.enqueue(20);

q1.enqueue(30);

[Refer to the ‘enqueue’ function, where for each new element inserted, memory is allotted dynamically using ‘new’.]

Pictorially the queue looks like the one below.




[Note: As this is a circular queue, we have just one pointer “myBack” pointing to the last node in the queue. Also the last node is connected to the first node making it a circular queue.]

Here ‘q1’ is the name of the queue stored at memory address ‘1000’ and ‘myBack’ is a pointer to ‘q1’ whose address ‘100’ is stored in q1.

Now we need to make a copy of the queue ‘q1’ to another queue object say ‘q2’.

Queue q2 = q1; //Copy constructor.

Here a new circular queue object ‘q2’ is declared and we copy ‘q1’ to ‘q2’.

Upon the occurrence of the above statement (Copy constructor), the compiler identifies a deep-copy is necessary as the memory to queue objects are dynamically allocated. If a copy constructor is defined, then it is called. (Refer to the copy-constructor function in the above code.)

Pictorially this is shown below.


In the above diagram, ‘myBack’ is distinct in ‘q1’ and ‘q2’. Also the copied queue ‘q2’ is at memory address ‘2000’ and ‘q2’ holds the address of its own ‘myBack’ pointer at location ‘200’. These memory addresses clearly specify that ‘q1’ and ‘q2’ are entirely distinct copies.

Suppose, if destroy one queue say q1, still we can access q2 and its elements.

Suppose if copy constructor function is not defined, then upon the execution of the below statement

Queue q2 = q1;

A shallow (default) copy is made. After shallow copy this is the situation.


So if a shallow copy is made for a dynamically allocated object, only the value of the member field (address i.e 100 in this case) of the object is copied and not the entire object. Here ‘q2’ holds the address of the memory location which ‘q1’ is pointing to. So both q1 and q2 are referring to the same memory location.

Suppose, if we destroy a queue, say q1, all the nodes of the queue are destroyed and now if we try to access the other queue object q2, results in an unexpected situation. This is because the memory allocated to q2 was already destroyed using q1, which was also pointing to the same location. So if a shallow copy is made on dynamically created objects, no distinct copy is made and the new object created will point to the memory location of the old object.

Similarly, to do the assignment of one object to another object for which memory is allocated dynamically, we need to overload the assignment (=) operator. (Please refer to the overloaded assignment operator function in the above code sequence.)

Call to copy-constructor and overloaded assignment operator.

Call to Copy-Constructor is like,

Queue q2 = q1;

Call to overloaded assignment operator is like,

Queue q2;

q2 = q1;

[Note: In the above two calls it is assumed the q1 is already defined.]

Many confuse as the call to copy-constructor is actually a call to assignment operator function as it has the presence of assignment operator (=).

Let’s clarify this confusion,

Overloaded assignment operator call: Queue q2; //Line 1

q2 = q1; //Line 2 [Note: q1 is already created.]

In Line-1, the circular queue object q2 is created by calling default constructor.

In Line-2, the object q1 is assigned to already created but empty object q2. This makes a call to overloaded assignment operator function.

Copy-constructor call: Queue q2 = q1;

Here we are trying to assign object q1 to q2, even before q2 is created. So this doesn’t call the default constructor to create q2, but intern makes a call to copy-constructor, which will create object q2 using object q1 and also copy data of q1 to q2.

Differences between Copy-Constructor and Overloaded Assignment operator.

Copy-constructor copies an existing object to a non-existing object, which will be created before copying where as an assignment operation happen between two existing objects.

In both copy-constructor and assignment operations we copy from one object to another object, but in case of overloaded assignment operator function additionally we do the following.

· A check is made to prevent self-assignment.

· Existing values from the LHS object is removed, before copying.

· A reference to itself is returned.

A call to overloaded assignment operator could even be,

Queue q2, q3; //Line 1

q3 = q2 = q1; //Line 2 [It is assumed the object q1 is already created and hold some values]

In Line-1, we are creating two new queue objects q2 and q3.

In Line-2 we are making multiple assignments, which results in multiple calls to overloaded assignment function. First the assignment of q1 to q2 (q2 = q1) is made, followed by the assignment of q2 to q3 (q3=q2). For this reason of supporting multiple assignments, the Assignment operator function returns a reference to itself.

In this article, I have tried to explain the concepts of Shallow copy and Deep copy fusing the concepts of Copy constructor and Overloaded assignment operator. For any question and concerns please leave your comment.

1 comment:

Karthik said...

Good one hemanth. Its easily understandable even to a lay man.