next up previous
Next: 4.4.2 Generation of MAP Up: 4.4 MAP fitting algorithm Previous: 4.4 MAP fitting algorithm


4.4.1 Hidden Markov model for parameter estimate

A hidden Markov model (HMM) with explicit state duration is a doubly stochastic process, whose intensity is controlled by a finite-state discrete-time Markov chain {\( J_{n}: \) \( n\in Z^{+} \)} on the state space \( \{i:1\leq i\leq m_{C}\} \) representing the set of control states. The time that has been spent in state \( J_{n} \) is denoted by \( \tau _{n} \), and the number of arrivals per time unit is denoted by \( r_{n} \), which is the observable output associated with state \( J_{n} \). It is usually assumed that the control states \( \{J_{n}\} \) and the observations \( \{r_{n}\} \) are conditionally independent with the conditional distribution of \( r_{n} \) dependent on \( J_{n} \) only.

Since this semi-Markov chain is not directly observable, the state sequence \( \{J_{n},\tau _{n}\} \) and the model parameters (i.e., the transition probability matrix P for the control states, the mean interarrival time \( \mbox{\boldmath {$\lambda$}}_{i}^{-1} \), the vector of mean sojourn times \( \mbox{\boldmath {$\mu$}}_{i}^{-1} \), the sojourn time probability vector \( {\mathbf{p}}_{i}^{\mbox{\boldmath {$\mu$}}} \) for each control state \( i \), and the control state sequence \( \{J_{n}\} \) of the data set) are estimated from the observed sequence \( \{r_{n}\} \).

The main steps of the standard procedure for HMM with explicit state duration are summarized as follows:

We refer the interested reader to [74] for an overview of the details on these standard HMM algorithms. Additional technical details can be found in [97,108,110] and the references cited therein.

Following the above procedure, we can obtain the maximum likelihood model parameters for the given observation sequence \( \{r_{n}\} \) and the state space $\{ 1, \ldots, m_{C} \}$. Let $\widehat{H}_{i}(\tau )$ denote the estimated non-parametric probability mass function for the sojourn time \( \tau \) of state $i$, and let \( \widehat{O}_{i}(r) \) denote the estimated non-parametric probability mass function for the observation \( r \) of state \( i \).

The total number of model parameters can be further reduced if the observation distribution or the state sojourn time distribution is approximated by some parametric distributions such as Gaussian, Poisson or gamma distributions [91,62]. In this case, one only needs to estimate a few parameters that specify the selected distribution functions. Ferguson [31] has shown that the parameters for the parametric sojourn time distribution \( H_{i}(\tau ) \) and the parametric observation distribution \( O_{i}(r) \) for state \( i \) can be found by maximizing \( \sum _{\tau }\widehat{H}_{i}(\tau )\ln (H_{i}(\tau )) \) and \( \sum _{r}\widehat{O}_{i}(r)\ln (O_{i}(r)) \) subject to the stochastic constraints \( \sum _{\tau }H_{i}(\tau )=1 \) and \( \sum _{r}O_{i}(r)=1 \).

If the arrival process for each control state is Poisson and the control state sojourn times follow a hyperexponential distribution, i.e.,

\begin{displaymath}
O_{i}(r)=\frac{(\mbox{\boldmath {$\lambda$}}_{i}t)^{r}}{r!}e^{-\mbox{\boldmath {$\lambda$}}_{i}t} ,
\end{displaymath} (4.2)


\begin{displaymath}
H_{i}(\tau )=\sum _{j=1}^{m_{H}}\: {\mathbf{p}}_{ij}^{\mbox{...
... {$\mu$}}_{ij}\: e^{-\mbox{\boldmath {$\mu$}}_{ij}(\tau -1)} ,
\end{displaymath} (4.3)

where \( \mbox{\boldmath {$\lambda$}}_{i}\geq 0 \), \( \mbox{\boldmath {$\mu$}}_{ij}\geq 0 \) and \( \sum _{j}{\mathbf{p}}_{ij}^{\mbox{\boldmath {$\mu$}}}=1 \), then the arrival rate \( \mbox{\boldmath {$\lambda$}}_{i} \) of the Poisson process for state \( i \) can be estimated by \( \widehat{\mbox{\boldmath {$\lambda$}}}_{i}=\sum _{r}\widehat{O}_{i}(r)r \) [31,91]. The parameters \( {\mathbf{p}}_{ij}^{\mbox{\boldmath {$\mu$}}} \) and \( \mbox{\boldmath {$\mu$}}_{ij} \) of the hyperexponential distribution can be determined numerically via Eq. (4.3).

Finally, as part of the initialization step, we set the total number of control states to a pre-specified input parameter (which is analyzed in our experiments). If this parameter is a sufficiently large integer, in the re-estimation procedure, we delete the states that are never visited. As such the value of $m_{C}$ is reduced to a smaller number of control states. This led to a maximum of $20$ control states for the data sets used in our study. We also need to initialize the elements of the transition probability matrix, the control state sojourn time distributions, and the initial control state probability vector. An often used choice is to assume that these initial values for the model parameters are uniformly distributed. In addition, we assume the initial values of the control-state arrival rates to be proportional to the state index, i.e., \( \mbox{\boldmath {$\lambda$}}_{i}=r_{max}\frac{i}{m_{C}} \) where \( i \) is the index of the control state, \( r_{max} \) is the maximum value of \( r \), and \( m_{C} \) is the total number of control states.


next up previous
Next: 4.4.2 Generation of MAP Up: 4.4 MAP fitting algorithm Previous: 4.4 MAP fitting algorithm
Alma Riska 2003-01-13