Reliability is the ability of a system to perform its operations under routine circumstances as well as hostile or unexpected ones. While a system is being developed, attention normally focuses on ensuring optimum performance under normal circumstances. Hostile scenarios are hard to simulate because they are unpredictable, and most of them are understood only once they are encountered.
Before looking at the more detailed aspects of Self-Healing systems, let us understand the problem more clearly and see how a Self-Healing mechanism largely solves it in our daily use of computer software.
Example 1: Consider the MS Word application. You want to write up some notes, so you simply start typing and, in the process, forget to save the document. After you have written several pages, Word closes unexpectedly. This could happen for several reasons: a power outage, the Word process being killed by an external process, the system shutting down because of a hardware failure, and so on. You would expect the end result to be that some of your valuable data is lost, along with the time you spent on it.
Fortunately, this is not the case: Word has a Self-Healing mechanism that recovers unsaved or lost data. When the system or application resumes after the unexpected failure and you restart Word, a recovery window pops up listing all the documents that were recovered. If you open the document you had not saved, you will find, to your surprise, most of the data you entered.
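To make the idea concrete, here is a minimal sketch of a periodic autosave with recovery on restart. It is not how Word implements its recovery feature; the class AutoSavingEditor, the recovery file name and the autosave_every parameter are all made up for illustration. The editor snapshots the unsaved text after every few edits, and a fresh instance can restore the last snapshot after a simulated crash.

```python
import json
import os
import tempfile


class AutoSavingEditor:
    """Toy editor (hypothetical) that snapshots unsaved text to a recovery file."""

    def __init__(self, recovery_path, autosave_every=3):
        self.recovery_path = recovery_path
        self.autosave_every = autosave_every  # snapshot after this many edits
        self.text = ""
        self._edits_since_snapshot = 0

    def type_text(self, chunk):
        """Append text and take a snapshot once enough edits have accumulated."""
        self.text += chunk
        self._edits_since_snapshot += 1
        if self._edits_since_snapshot >= self.autosave_every:
            self._snapshot()

    def _snapshot(self):
        # Write to a temporary file and then rename it, so that a crash in
        # the middle of a write does not leave a half-written recovery file.
        tmp_path = self.recovery_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump({"text": self.text}, f)
        os.replace(tmp_path, self.recovery_path)
        self._edits_since_snapshot = 0

    def save(self, path):
        """Explicit save: write the document and drop the recovery file."""
        with open(path, "w", encoding="utf-8") as f:
            f.write(self.text)
        if os.path.exists(self.recovery_path):
            os.remove(self.recovery_path)
        self._edits_since_snapshot = 0

    @classmethod
    def recover(cls, recovery_path, autosave_every=3):
        """Start a new editor, restoring the last snapshot if one exists."""
        editor = cls(recovery_path, autosave_every)
        if os.path.exists(recovery_path):
            with open(recovery_path, encoding="utf-8") as f:
                editor.text = json.load(f)["text"]
        return editor


def demo_recovery():
    recovery_path = os.path.join(tempfile.mkdtemp(), "notes.recovery.json")
    editor = AutoSavingEditor(recovery_path, autosave_every=3)
    for chunk in ["one ", "two ", "three ", "four "]:
        editor.type_text(chunk)
    del editor  # simulate a crash: save() was never called
    restored = AutoSavingEditor.recover(recovery_path)
    return restored.text


if __name__ == "__main__":
    print(repr(demo_recovery()))
```

In this sketch the snapshot is taken after the third edit, so the fourth chunk is lost in the simulated crash. That matches the behaviour described above: you get back most of what you typed, not necessarily all of it.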
As can be seen in the above example, a robotic application is composed of several components that work together to accomplish a particular task. If any one of these components fails, the task is not completed as expected. In this kind of adverse situation, a Self-Healing mechanism that monitors all the components in the robot application will immediately analyze the problem in the failed component and automatically take measures to repair and recover it. This greatly improves the performance of robotic applications, even in adverse situations.
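The monitor-and-repair loop described above can be sketched roughly as follows. This is only an illustration under simple assumptions: the Component and Supervisor classes, the component names and the max_restarts limit are hypothetical, and "repair" is reduced to a restart. A real robot would need proper failure diagnosis and much more careful recovery.

```python
class Component:
    """Hypothetical robot component that exposes a health flag."""

    def __init__(self, name):
        self.name = name
        self.healthy = True
        self.restarts = 0

    def is_healthy(self):
        return self.healthy

    def restart(self):
        # Stand-in for a real repair action (reset driver, reload config, ...).
        self.healthy = True
        self.restarts += 1


class Supervisor:
    """Monitors all components and tries to recover the ones that failed."""

    def __init__(self, components, max_restarts=3):
        self.components = components
        self.max_restarts = max_restarts  # assumed limit before giving up
        self.log = []

    def check_once(self):
        """One monitoring pass; returns True if every component is healthy."""
        for component in self.components:
            if component.is_healthy():
                continue
            if component.restarts >= self.max_restarts:
                # Repeated failures: stop retrying and report instead.
                self.log.append(f"{component.name}: giving up, needs an operator")
                continue
            component.restart()
            self.log.append(f"{component.name}: restarted")
        return all(c.is_healthy() for c in self.components)


def demo_supervision():
    arm, camera, wheels = Component("arm"), Component("camera"), Component("wheels")
    supervisor = Supervisor([arm, camera, wheels])
    camera.healthy = False  # simulate a component failure
    recovered = supervisor.check_once()
    return recovered, supervisor.log


if __name__ == "__main__":
    print(demo_supervision())
```

The restart limit is a deliberate design choice in this sketch: a component that keeps failing is reported instead of being restarted forever, so the healing mechanism itself does not hide a permanent fault.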
In summary, whether it is a widely used application such as Word or a more sophisticated one such as a robot, a Self-Healing mechanism adds reliability to the system and makes it more robust. Still, this technology is at an early stage: you find this feature only in some widely used, popular products or in some critical applications.
In this post, I have only discussed the reliability problem in software systems, for which a Self-Healing mechanism could be a possible solution.
In my next post, "Designing a Self-Healing mechanism as a layered architecture", I’ll discuss the design and implementation details of a Self-Healing system.