Markov-Chain Model for Reliability Evaluation

**Figure 11:** Markov state diagram for Protocol 1.
$\begin{figure}\centering \centerline{\hbox{\epsfig{figure=figures/markovpb.eps, width=3.5in}}} \end{figure}$

Figure 11 shows the Markov state diagram for Protocol 1, which can also be applied to the other protocols. In this diagram,

signifies that the state number/index is

, and there are

and

failed nodes in the primary and backup group, respectively. All the states shown are working states, with the exception of

, which is the data loss state. The total number of states in the Markov state diagram is denoted by

and is equal to

. The Markov chain begins with State 1 (

), followed by State 2(

), and so on.

To facilitate the solution to this model, we derive a function, given in Eqn. 2, that maps from the system state with

failed primary nodes and

failed backup nodes to the state index

of the Markov state diagram:

$\displaystyle n$	$\textstyle =$	$\displaystyle i -1 - \frac{x(x+1)}{2}$	(3)
$\displaystyle m$	$\textstyle =$	$\displaystyle x - n$	(4)

**Figure 12:** The transition probability between different states.
$\begin{figure}\centering \centerline{\hbox{\epsfig{figure=figures/transitionNew.eps, width=3in}}} \end{figure}$

Figure 12 shows the transition rate between the neighboring states. In the diagram, $P_{ij}$ denotes the probability that the system remains functional, also referred to as safety probability, given that one more primary node fails while

primary nodes and

backup nodes have already failed. Similarly, $P_{ik}$ denotes the probability, or safety probability, of the system remaining functional when one more backup node fails while

primary nodes and

backup nodes have already failed. $P_{ij}$ can be calculated as Eqn. 5.

$\displaystyle P_{ij}$	$\textstyle =$	$\displaystyle \hat{P}((m+1)PnB\,\vert\,mPnB)$
	$\textstyle =$	$\displaystyle \frac{\hat{P}((m+1)PnB \cap mPnB)}{P(mPnB)}$
	$\textstyle =$	$\displaystyle \frac{\hat{P}((m+1)PnB)}{P(mPnB)}$	(5)

$\begin{displaymath} \hat{P}(mPnB) = \left\{ \begin{array}{rl} \frac{{N\choose m... ... $m+n \leq N$;} \\ 0 & \mbox{otherwise.} \end{array}\right. \end{displaymath}$

(6)

The transition probability from State

to the data loss state, denoted as $q_{i, DL}$ , can be calculated as Eqn. 8.

$\displaystyle q_{i,DL}$	$\textstyle =$	$\displaystyle \mbox{loss caused by one more primary node failure}$
		$\displaystyle + \; \mbox{loss caused by one more backup node failure}$
		$\displaystyle + \; \mbox{loss caused by network switch failure }$
	$\textstyle =$	$\displaystyle (N-m) \lambda [(1-P_{ij}) + P_{ij}(1- P_{c})]$
		$\displaystyle +(N-n)\lambda(1-P_{ik}) + \lambda_s$
	$\textstyle =$	$\displaystyle (N-m)\lambda (1-P_{c}P_{ij})$
		$\displaystyle +(N-n)\lambda(1-P_{ik}) + \lambda_s$	(8)

The stochastic transitional probability matrix is defined as $Q = \left[q_{ij}\right]$ , where $1\leq i, j\leq S$ and $q_{ij}$ is the transition probability from State

to State

. In summary, $q_{ij}$ can be calculated as follows.

$\begin{displaymath} q_{ij} = \left\{ \begin{array}{rl} (N-m_i)\lambda P_{ij}P_{... ..._j = n_i + 1$;} \\ 0 & \mbox{otherwise.} \end{array}\right. \end{displaymath}$

(9)

$\begin{displaymath} q_{ij} = \left\{ \begin{array}{rl} m_j\mu & \mbox{if $m_j =... ... = n_i + 1$\ ;} \\ 0 & \mbox{otherwise.} \end{array}\right. \end{displaymath}$

(10)

$\begin{displaymath} q_{ii} = 1 - \sum_{i=1, j \not= i}^{j \leq S}q_{ij}-q_{i,DL} \end{displaymath}$

(11)

**Figure 13:** Classic Markov state diagram of RAID-1.
$\begin{figure}\centering \centerline{\hbox{\epsfig{figure=figures/markovs.eps, width=3in}}} \end{figure}$

, i.e., the primary node and backup node are always kept consistent, like in RAID-1, and a fault-free network is assumed, the model shown in Figure 11 can be simplified to the classic RAID-1 model [28], as shown in Figure 13. This is proven by the fact that numerical results generated by both models with the same set of input parameters are identical.