Monday, March 29, 2010

Designing a Self-Healing Mechanism as a Layered Architecture

This post is a sequel to my previous post on achieving Reliability with Self-Healing Systems. Here, I go a step further and describe the architectural design of the Robot example we discussed in that post.

A robot is comprised of both hardware and software components. Here the hardware components are the camera, sensors and other mobility components. The Face recognition and obstacle detection constitute the main software components of the robot. The data captured by camera will be sent to the Face Recognition component, which processes the received data and identifies the desired recipient. Similarly, the data sent by the sensors will be processed by the obstacle detection component to identify the objects on the path. So for the robot application to perform successfully all these software and hardware components have to operate together.

The above described robot application is just one particular example of robot. There are many such robot applications that are designed and developed for various purposes. And as we know each such robot application is comprised of several components. So instead of providing the Self-Healing mechanism for each robot application it is more appropriate to design a Self-Healing mechanism for the robot platform, which supports the design, implementation and execution of all component based robot applications developed using the platform.

Below we will study the software architecture of one such component based robot platform. The platform is designed as the layered software architecture, structured into the service layer and self-management layer. The service layer contains executors or Threads processing various hardware and software components periodically within the defined cycle times, whereas the self-management layer monitors and self-manages failures of the executors. Figure 1 depicts executors in the service layer, and the monitor and executor repair manager in the self-management layer. The monitor is composed of the monitor listener and monitor handler which detects the executors violating the cycle time requirements, whereas the executor repair manager does the self-managing of failed executors. (click below image for enlarged view)


Failure Situation: An Executor or a Software Thread is assigned to execute only those components with same cycle time. In the figure, couple of camera components with same cycle time of 50ms is executed by Executor 1 (thread) and the Face recognition and two Obstacle detection components, each of cycle time 100ms are executed by Executor 2. The requirement is that each camera component has to be executed once every 50ms, so Executor 1 should be capable of executing both camera component once every 50ms. Similarly, Executor 2 has to execute the three (one Face recognition and two Obstacle detection) components once every 100ms. If the Executor fails to achieve this time constraint, it is termed as failed and in turn the components executed by the failed executor won’t operate as expected, resulting in the malfunction of the robot.

Interaction between Service layer and Self-Healing Layer: All the executors are monitored by the self-Healing layer. After executing each component the Executor has to notify the Monitor about the details of the component executed. This interaction happens through a message queue. The executors puts the message at one end of the message queue and the Executor Monitor Listener retrieves the message at the other end and store the executed component information in the component sequence table and time table.

Detecting the Failed Executor: As an Executor start one cycle, the start time of the cycle and the total cycle duration is updated in the Time-Table. The Executor Monitor Handler is responsible for checking if the Executor’s current cycle can be completed within the cycle duration. A violation of this result in the failure of the Executor and this has to be immediately notified to Executor Repair Manager.

Referring to the diagram, let’s assume Executor 2 fail’s while executing the first component i.e. ‘Face Recognition component’. This means the Face Recognition component took more time to execute as a result of which the other components assigned to Executor 2 could not be executed with in the cycle time.

Repair and Recovery Process: So it is clear that the failed executor is not capable of executing all the assigned components within the given cycle time. So the components are split and assigned to a new executor. All the components from the first component till the failed component are assigned to a new Executor (thread) and the remaining components continue to execute in the same executor after the executor is restarted in the next cycle. In this example as the first component itself failed it is removed from Executor 2 and assigned to a new executor (Executor 3). Now the work load of one executor is split among two executors and this may probably lead to the successful execution of all the components. If any failure is detected further, the splitting process will occur again.

The above approach has been prototyped and tested under practical circumstances and has shown significant improvement in the performance of robot applications.

Your feedback and suggestions are greatly appreciated.

Source: Self-Healing component executors for OPRoS, Dr. Michael Shin, Hemanth gowda, Taeghyun Kang, Texas Tech University and ETRI, South Korea.

Friday, March 5, 2010

Reliability with Self-Healing Mechanism

Reliability is the ability of the system to perform its operations in routine circumstances, as well as hostile or unexpected circumstances. Normally as a system is being developed attention is focused mainly on ensuring optimum performance under normal circumstances. It becomes highly challenging to simulate hostile scenarios due to their unpredictable nature and most of it is understood only when encountered.

Self-Healing mechanism is a technology, which considers the worst case scenario that a system could face and addresses it to ensure that it delivers the desired performance. Self-healing deals with self- cure of problems by the failed objects themselves. This will ensure that they are back at work again. In order to cure the problem, it has to first identify the problem. So a Self-Healing mechanism is composed of automatic error detection and recovery mechanisms.

Before looking to other detailed aspects of Self-Healing systems, let us understand the problem more clearly and how the self healing mechanism solves the problem to a greater extent in our daily usage of computers software.

Example 1: Consider the MS Word application. You want to document some notes and you just start with it and in the process you forget to save the document. After you have documented several pages unexpectedly the word application closes and this could be because of several reasons like power outage, word process being killed by an external process, system shuts down because of hardware failure etc. But the end result of this incident is some of your valuable data is lost as well as the time spent.

Fortunately this is not the case; the word application has a self-healing mechanism which recovers the unsaved or lost data. When the system or application resumes after the unexpected failure and when you restart the word application, you will find a popped-up recovery window, which shows all the documents that were recovered from the unexpected failure. If you open the document that you haven’t saved, to your surprise you will find most of the data you entered.

Example 2: Consider the case of Robots which are getting more used gradually at home, industry and military for wide variety of tasks. Suppose that a robot application is to deliver water from a kitchen to a human being. This application can be composed of several software components, which include Camera, Face Recognition, Obstacle Detection, and Mobility components. The Camera component takes pictures on the path between a kitchen and a human being repeatedly, sending the pictures to both the Face Recognition component and Obstacle Detection component. The Face Recognition component analyzes the camera data to recognize the human being, whereas the Obstacle Detection component uses the data to detect obstacles on the path.

As can be seen in the above example robotic application is composed of several components to accomplish a particular work. So if any one of these components fails, the work is not completed as expected. In this kind of adverse situation a Self-Healing mechanism, which is monitoring all the components in the robot application will immediately analyze the problem in the failed robot component and will automatically take measures to repair and recover the failed component. This greatly improves the performance of robotic applications even in adverse situations.

In summary, either it is a widely used application as Word or more sophisticated as a Robot, a Self-healing Mechanism surely adds the Reliability component to the system and makes it more Robust. But still this technology is very naïve as you find this feature only in some widely used and more popular products or in some critical applications.

In this post, I have just discussed the reliability issue in software systems for which Self-Healing Mechanism could be a possible solution.

In my next post, Designing a Self-Healing mechanism as a layered architecture I’ll be discussing more about design and implementation details of a Self-Healing system.

Tuesday, February 9, 2010

C++: Shallow Copy & Deep Copy (with Copy Constructor and Assignment Operator)

Shallow Copy: Copies only the member field values. If an object stores an integer value10, it copies 10, and if an object stores an address say 0x100, it copies only the address and not the data stored at that address. So when data stored is an address, we don’t have a distinct copy of the object. It is also called Default copy.

Deep Copy: It makes an entirely distinct copy of the object, no matter what the object stores. To perform deep copy, copy-constructor and overloaded assignment operator functions are required.

In order to make duplicate of an object it is necessary either to make a shallow copy or a deep copy of the object. To make copies of some objects, a shallow copy is enough but for some other objects a deep copy is necessary. So the question is how we distinguish what to perform, to copy an object.

This is identified based on how memory is allocated for the object. If the memory is allocated statically, then a shallow copy is required, but if the memory is dynamically allotted then a deep copy is necessary.

Here in this article we try to understand more about shallow and deep copies, by defining copy constructor and assignment operator functions for implementing a circular queue. Also here we try to analyze the difference between copy-constructor and assignment operator.

Example: Circular Queue

In this example not all the functions required to implement a circular queue is defined. Only the functions concerning shallow copy and deep copy are defined. To perform deep copy we need a copy constructor and an overloaded assignment operator.

//START OF CODE

Queue.h (Header File)

class Queue {

public:

//Queue constructor: creates an empty queue.

Queue();

//empty: returns true if the queue is empty; false otherwise.

bool empty();

//enqueue: adds element to the back of the queue.

void enqueue(QueueElement);

//Queue destructor: removes all elements from the queue(heap) so

//memory can be reclaimed.

~Queue();

//Queue copy constructor: allows for making copies of queues.

Queue(Queue&);

//Assignment operator:allows for assignments of queues to each //other.

Queue& operator=(Queue&);

private:

class Node {

public:

QueueElement data;

Node* next;

Node() {

//empty Node constructor: creates an empty node.

//Needed especially for the next pointer.

data = 0;

next = 0;

}

Node(QueueElement thing) {

//Node constructor: creates a node with data in it.

data = thing;

next = 0;

}

};

typedef Node* NodePointer;

NodePointer last;

int mySize;

};

Queue.cpp (Source File)

//Queue constructor: makes an empty queue.

Queue::Queue()

{

last = 0;

mySize = 0;

}

//empty: returns true if the queue is empty; false otherwise.

bool Queue::empty()

{

return (mySize == 0);

}

//enqueue: adds an element to the back of the queue.

void Queue::enqueue(QueueElement thing)

{

NodePointer nuPtr = new Node(thing);

if (mySize == 0) {

//for an empty queue, stick the new element in all applicable slots.

last = nuPtr;

nuPtr->next = nuPtr;

}

else {

//otherwise, we need to file our new member into the back.

nuPtr->next = last->next;

last->next = nuPtr;

last = nuPtr;

}

//increment the size to that size calculations are correct.

mySize++;

}

//Destructor

Queue::~Queue()

{

while (!empty())

dequeue();//definition not included here.

cout << "Queue removed." <<>

}

//Copy constructor definition.

Queue::Queue(Queue& copy)

{

last = 0;

mySize = 0;

NodePointer ptr = copy.last->next;

do {

enqueue(ptr->data);

ptr = ptr->next;

} while (ptr != copy.last->next);

}

//Assignment operator: allows for assignments of queues to each other.

Queue& Queue::operator = (Queue& data)

{

if (this == &data)//(1)checking if LHS and RHS objects are the same, so //it doesn’t result in self-assignment.

{

cout << "You can't do that! You tried self-assignment!" <<>

return (*this);

}

this->~Queue(); //(2) Empty the LHS object, before copying RHS to LHS.

NodePointer ptr = data.last->next;

do {

enqueue(ptr->data);

ptr = ptr->next;

} while (ptr != data.last->next);

return (*this); //(3)Return LHS, the new object to which copy is made.

}

//END OF CODE

Declare the circular queue object.

Queue q1; //Refer to the default constructor definition.

Insert some elements to the queue through the ‘enqueue’ process.

q1.enqueue(10);

q1.enqueue(20);

q1.enqueue(30);

[Refer to the ‘enqueue’ function, where for each new element inserted, memory is allotted dynamically using ‘new’.]

Pictorially the queue looks like the one below.




[Note: As this is a circular queue, we have just one pointer “myBack” pointing to the last node in the queue. Also the last node is connected to the first node making it a circular queue.]

Here ‘q1’ is the name of the queue stored at memory address ‘1000’ and ‘myBack’ is a pointer to ‘q1’ whose address ‘100’ is stored in q1.

Now we need to make a copy of the queue ‘q1’ to another queue object say ‘q2’.

Queue q2 = q1; //Copy constructor.

Here a new circular queue object ‘q2’ is declared and we copy ‘q1’ to ‘q2’.

Upon the occurrence of the above statement (Copy constructor), the compiler identifies a deep-copy is necessary as the memory to queue objects are dynamically allocated. If a copy constructor is defined, then it is called. (Refer to the copy-constructor function in the above code.)

Pictorially this is shown below.


In the above diagram, ‘myBack’ is distinct in ‘q1’ and ‘q2’. Also the copied queue ‘q2’ is at memory address ‘2000’ and ‘q2’ holds the address of its own ‘myBack’ pointer at location ‘200’. These memory addresses clearly specify that ‘q1’ and ‘q2’ are entirely distinct copies.

Suppose, if destroy one queue say q1, still we can access q2 and its elements.

Suppose if copy constructor function is not defined, then upon the execution of the below statement

Queue q2 = q1;

A shallow (default) copy is made. After shallow copy this is the situation.


So if a shallow copy is made for a dynamically allocated object, only the value of the member field (address i.e 100 in this case) of the object is copied and not the entire object. Here ‘q2’ holds the address of the memory location which ‘q1’ is pointing to. So both q1 and q2 are referring to the same memory location.

Suppose, if we destroy a queue, say q1, all the nodes of the queue are destroyed and now if we try to access the other queue object q2, results in an unexpected situation. This is because the memory allocated to q2 was already destroyed using q1, which was also pointing to the same location. So if a shallow copy is made on dynamically created objects, no distinct copy is made and the new object created will point to the memory location of the old object.

Similarly, to do the assignment of one object to another object for which memory is allocated dynamically, we need to overload the assignment (=) operator. (Please refer to the overloaded assignment operator function in the above code sequence.)

Call to copy-constructor and overloaded assignment operator.

Call to Copy-Constructor is like,

Queue q2 = q1;

Call to overloaded assignment operator is like,

Queue q2;

q2 = q1;

[Note: In the above two calls it is assumed the q1 is already defined.]

Many confuse as the call to copy-constructor is actually a call to assignment operator function as it has the presence of assignment operator (=).

Let’s clarify this confusion,

Overloaded assignment operator call: Queue q2; //Line 1

q2 = q1; //Line 2 [Note: q1 is already created.]

In Line-1, the circular queue object q2 is created by calling default constructor.

In Line-2, the object q1 is assigned to already created but empty object q2. This makes a call to overloaded assignment operator function.

Copy-constructor call: Queue q2 = q1;

Here we are trying to assign object q1 to q2, even before q2 is created. So this doesn’t call the default constructor to create q2, but intern makes a call to copy-constructor, which will create object q2 using object q1 and also copy data of q1 to q2.

Differences between Copy-Constructor and Overloaded Assignment operator.

Copy-constructor copies an existing object to a non-existing object, which will be created before copying where as an assignment operation happen between two existing objects.

In both copy-constructor and assignment operations we copy from one object to another object, but in case of overloaded assignment operator function additionally we do the following.

· A check is made to prevent self-assignment.

· Existing values from the LHS object is removed, before copying.

· A reference to itself is returned.

A call to overloaded assignment operator could even be,

Queue q2, q3; //Line 1

q3 = q2 = q1; //Line 2 [It is assumed the object q1 is already created and hold some values]

In Line-1, we are creating two new queue objects q2 and q3.

In Line-2 we are making multiple assignments, which results in multiple calls to overloaded assignment function. First the assignment of q1 to q2 (q2 = q1) is made, followed by the assignment of q2 to q3 (q3=q2). For this reason of supporting multiple assignments, the Assignment operator function returns a reference to itself.

In this article, I have tried to explain the concepts of Shallow copy and Deep copy fusing the concepts of Copy constructor and Overloaded assignment operator. For any question and concerns please leave your comment.

Saturday, January 30, 2010

Creating Threads in C++ using Boost C++ Libraries

At present standard C++, does not support Threads. But there are other open source libraries available like ‘Boost C++’, which provides Threading feature along with many other features. This allows a C++ programmer to go ahead with developing applications that require threading support.

This article explains how to create a thread. Here we create two threads ‘ThreadA’ and ‘ThreadB’ which continuously outputs the text ‘ThreadA output’ and ‘ThreadB output’ to the console accordingly.

So we need to download and configure the Boost C++ libraries with visual studio.
Download Path: http://www.boost.org
Configuration Steps: http://technologicalthemes.blogspot.com/2009/08/configuring-boost-c-library-with-visual.html

After the configuration, create a project and include the following code.
Different parts of the code are explained appropriately, by adding comments in the code.

Main Function: main.cpp

#include"iostream"
#include"string"
#include"boost/thread/thread.hpp"
#include"boost/thread/mutex.hpp"
#include"boost/thread/condition.hpp"

#include"ThreadA.h"
#include"ThreadB.h"

using namespace std;

void main()
{
//Create an object of Thread-A
ThreadA threadA;

//Creating the thread: "ThreadA"
//Argument-1: Call 'thread_A_Function' function in Thread-A.
//Argument-2: Pass an instance of thread.
//NOTE: The two arguments are arguments to the BOOST C++ Thread
//Library, so that it starts the thread. So these two arguments are a must to
//create a thread.
boost::thread thrdA(&ThreadA::thread_A_Function, &threadA);

//As explained for Thread-A, same thing applies for Thread-B.
ThreadB threadB;
boost::thread thrdB(&ThreadB::thread_B_Function, &threadB);

//Initiates the Thread and waits for its completion.
thrdA.join();
thrdB.join();
}

Thread –A:

Header File: ThreadA.h

#ifndef THREADA_H
#define THREADA_H

#include"iostream"
#include"boost/thread/thread.hpp"

class ThreadA
{
public:
ThreadA(void);
~ThreadA(void);
void thread_A_Function();
};

#endif

CPP File: ThreadA.cpp

#ifndef THREADA_CPP
#define THREADA_CPP

#include "ThreadA.h"

ThreadA::ThreadA(void)
{
}

ThreadA::~ThreadA(void)
{
}

//Defining the 'thread_A_Function' Function.
void ThreadA::thread_A_Function()
{
for(;;)
{
std::cout<<"Thread A output"<

//Make the Thread sleep for 1 second.

boost::xtime xt; //create a timer object.
boost::xtime_get(&xt, boost::TIME_UTC); //initialize the timer
//-object to a standard time.
xt.sec += 1; //Set the delay in seconds.
boost::thread::sleep(xt); //make the thread sleep.
}
}

#endif

Thread -B:

Header File: ThreadB.h

#ifndef THREADB_H
#define THREADB_H

#include"iostream"
#include"boost/thread/thread.hpp"

class ThreadB
{
public:
ThreadB(void);
~ThreadB(void);
void thread_B_Function();
};

#endif

CPP File: ThreadB.cpp

#ifndef THREADB_CPP
#define THREADB_CPP

#include "ThreadB.h"

ThreadB::ThreadB(void)
{
}

ThreadB::~ThreadB(void)
{
}

//Defining the 'thread_A_Function' Function.
void ThreadB::thread_B_Function()
{
for(;;)
{
std::cout<<"Thread B output"<

//Make the Thread sleep for 1 second.

boost::xtime xt; //create a timer object.
boost::xtime_get(&xt, boost::TIME_UTC); //initialize the timer object
//to a standard time.
xt.sec += 1; //Set the delay in seconds.
boost::thread::sleep(xt); //make the thread sleep.
}
}

#endif

This example only works with the threading support provided by Boost C++ libraries. The upcoming proposed standard for C++, that is “C++0x” will include threading support.