A joint renewal process used to model event based data
Decision Analytics volume 3, Article number: 2 (2016)
Abstract
In many industrial situations where systems must be monitored using data recorded throughout a historical period of observation, one cannot fully rely on sensor data, but often has only event data to work with. This holds in particular for legacy data, whose evaluation is of interest to systems analysts, reliability planners, maintenance engineers etc. Event data, herein defined as a collection of triples containing a time stamp, a failure code and possibly a descriptive text, can best be evaluated by using the paradigm of joint renewal processes. The present paper formulates a model of such a process, which proceeds by means of state dependent event rates. The system state is defined, at each point in time, as the vector of backward times, whereby the backward time of an event is the time passed since the last occurrence of this event. The paper suggests a mathematical model relating event rates linearly to the backward times. The parameters can then be estimated by means of the method of moments. In a subsequent step, these event rates can be used in a Monte Carlo simulation to forecast the numbers of occurrences of each failure in a future time interval, based on the current system state. The model is illustrated by means of an example. As forecasting system malfunctions receives more and more attention in light of modern condition-based maintenance policies, this approach enables decision makers to use existing event data to implement state dependent maintenance measures.
Model
Renewal processes were a frequent object of analysis in early studies of stochastic processes; see Cox (1962), for instance. Only recently has the idea of parallel renewal processes received more attention; see Borgelt and Picado-Muino (2012), Gaigalas (2003), Kai et al. (2014), CRC (1994), Kallen et al. (2010), Truccolo (2005) and Modir et al. (2010). Spike train analysis is an active neurobiological research area calling for parallel renewal processes. However, little emphasis has been given so far to the subject of stochastic dependence between processes, with few exceptions such as Borgelt and Picado-Muino (2012), Truccolo (2005) and Modir et al. (2010). The latter paper emphasizes stochastic dependence between point processes described by conditionally independent intensity functions. In the same spirit, stochastic dependence between events will be at the core of the present paper, in combination with a linear damage model in a condition based maintenance context. A formal representation of a parallel renewal process is given in Modir et al. (2010), whereby the authors look at the process from the point of view of an abstract Poisson process with state-dependent event rates, conditionally independent given the history of the process. Kai et al. (2014) is another example of a biomedical application of a parallel renewal process, whereby the individual occurrence rates of neural spikes depend not only on the neuron in question, but also on a set of neighboring neurons. A Monte Carlo algorithm is used to construct a parallel renewal process based on the event rates identified.
The paper is structured as follows: The remaining sections of this chapter describe the process, the event rates and general modelling assumptions. Chapter 2 deals with the multidimensional renewal equation. After presenting the most general case with stochastic dependence, the special case without stochastic dependence between processes is considered and an asymptotic result for the expected number of cumulated events is derived. We proceed to show that there is a continuous transition from the case of stochastic dependence to the case of stochastic independence with one individual parameter tending to zero. Chapter 3 deals with technical details of the estimation problem used to find the model parameters. Chapter 4 illustrates the numerical findings by means of an example.
Input data
The input data in the case of an event oriented data model consists, from the practical point of view, of a list of records, say, with each record containing a failure code, a date time object and some explanatory text. From the mathematical point of view, however, it is sufficient to

classify the different failure codes,

index each class and

transform each date time object into a real number representing the occurrence time for the respective event.
This process results in a list of occurrence times, grouped according to failure classes as shown:
The occurrence times will then be used to construct a joint renewal process.
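The three preparation steps above can be sketched in code. This is a minimal illustration rather than the authors' implementation: the record layout (failure code, time stamp, text), the function name `group_occurrence_times`, the choice of hours as the time unit and the sample records are all assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime

def group_occurrence_times(records, origin):
    """Map (failure_code, timestamp, text) records to sorted occurrence
    times in hours since `origin`, grouped by failure-class index."""
    # Steps 1 and 2: classify the failure codes and index each class.
    codes = sorted({code for code, _, _ in records})
    index_of = {code: i + 1 for i, code in enumerate(codes)}
    # Step 3: turn each date-time object into a real number.
    times = defaultdict(list)
    for code, stamp, _text in records:
        hours = (stamp - origin).total_seconds() / 3600.0
        times[index_of[code]].append(hours)
    return index_of, {i: sorted(times[i]) for i in sorted(times)}

# Hypothetical sample records, for illustration only.
records = [
    ("DOOR", datetime(2015, 1, 1, 6, 0), "door blocked"),
    ("BRAKE", datetime(2015, 1, 1, 12, 0), "brake warning"),
    ("DOOR", datetime(2015, 1, 2, 0, 0), "door blocked"),
]
index_of, occurrence_times = group_occurrence_times(records, datetime(2015, 1, 1))
print(index_of)          # {'BRAKE': 1, 'DOOR': 2}
print(occurrence_times)  # {1: [12.0], 2: [6.0, 24.0]}
```

The descriptive text is carried along in the records but plays no role in the resulting lists, in line with the remark that the mathematical treatment only needs class indices and occurrence times.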
The process
Let \(E:=\{1,\ldots ,n\}\) be the set of all failure codes, i.e. events. Then, at each point in time \(t \in {\mathrm {I\!R\!}}_{+}\) let \(X_i(t)\) be the backward time of event \(i \in E\) and define the vector of backward times as
where
The probability for an event \(i \in \textit{E}\) to occur in the time interval \((t,t+dt)\) conditionally upon trajectory X(t) is then given by
This equation assumes that events are conditionally independent of each other, given the system state. \(\lambda _{i}(X(t))\) is an event rate or intensity function as defined in Press (2007). The following stochastic differential equation can now be proven for X(t):
Theorem 1
Let \(e_i, i\in E\) be the unit vector in direction i. Then the following holds:
whereby w.p. stands for “with probability”.
Proof
Each of the events \(i \in \textit{E}\) occurs with probability \(\lambda _{i}(X(t))dt+o(dt)\), in which case all of the events except event i age by an incremental amount of time dt and the backward time of event i is reset to 0. No event, therefore, occurs with probability \(1-\sum _{i \in E} \lambda _{i}(X(t))dt+o(dt)\). More than one event occurs with probability o(dt) only. \(\square\)
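The transition described in the proof (every component ages, and the component of the occurring event is reset to 0) can be illustrated with a short sketch. The function names `backward_times` and `step` are hypothetical, and measuring the backward time from time 0 for an event that has not yet occurred is an assumption of this example, not something the text specifies.

```python
from bisect import bisect_right

def backward_times(occurrence_times, t):
    """Return X(t): for each event class, the time elapsed since its last
    occurrence at or before t (measured from time 0 if there is none)."""
    x = []
    for times in occurrence_times:      # one sorted list per event class
        k = bisect_right(times, t)      # number of occurrences <= t
        last = times[k - 1] if k > 0 else 0.0
        x.append(t - last)
    return x

def step(x, dt, event=None):
    """One transition as in the proof of Theorem 1: every component ages
    by dt and, if an event of class `event` occurs, it is reset to 0."""
    aged = [xj + dt for xj in x]
    if event is not None:
        aged[event] = 0.0
    return aged

occ = [[6.0, 24.0], [12.0]]
print(backward_times(occ, 30.0))        # [6.0, 18.0]
print(step([6.0, 18.0], 1.0, event=1))  # [7.0, 0.0]
```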
The event rates
The question now is whether a plausible functional dependence of \(\lambda\) on the system state X(t) can be found such that the parameters of this function can be efficiently estimated from the available data, and such that this relationship can be used to generate realistic simulations of the joint renewal process that serve as a reliable short-term forecast, over a horizon of up to one week, for instance. The following assumption will be used throughout this paper:
Equation (6) admits the interpretation that each event rate consists of a random component \(\lambda _{i}\) and a condition or state dependent component controlled by the parameters \(\alpha _{i,j},i\in E,j\in E\). The state dependent component is modelled such that the event rates depend linearly on the backward times (= ages) of the events, with proportionality factors given by \(\alpha _{i,j},i \in E, j \in E\). We therefore sometimes refer to the process as a process with linear damage accumulation. Figure 1 shows schematically the dependence of the event rates on the vector of backward times.
Corollary 1
If the following holds
then the individual renewal processes become independent.
Proof
Using (5) and (6) one obtains for all \(i \in E\)
In (8) no cross-dependencies between different components of the vector X(t) can be observed. \(\square\)
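To make the linear rate assumption concrete, the following sketch simulates one trajectory of the joint process and averages several runs into a forecast. It is an illustration under stated assumptions, not the authors' algorithm: the rate is read as \(\lambda _i + \sum _j \alpha _{i,j} X_j\) (the linear dependence described above), all parameters are taken to be non-negative, and the function names and parameter values are hypothetical. Because every backward time ages at unit speed between events, the total rate after s further time units is a + b·s, which lets the waiting time be drawn exactly by inverting the cumulative rate.

```python
import math
import random

def event_rates(lam, alpha, x):
    """Linear event rates: lam[i] + sum_j alpha[i][j] * x[j]."""
    return [lam[i] + sum(a * xj for a, xj in zip(alpha[i], x))
            for i in range(len(lam))]

def simulate_counts(lam, alpha, x0, horizon, rng):
    """Simulate one trajectory on [0, horizon] and return event counts.
    Assumes all lam[i] >= 0 and alpha[i][j] >= 0."""
    n = len(lam)
    x, t, counts = list(x0), 0.0, [0] * n
    b = sum(map(sum, alpha))        # growth of the total rate per unit time
    while True:
        a = sum(event_rates(lam, alpha, x))   # total rate right now
        if a <= 0.0 and b <= 0.0:
            break                   # no event can ever occur
        # Waiting time s solves a*s + b*s**2/2 = e with e ~ Exp(1).
        e = rng.expovariate(1.0)
        s = (math.sqrt(a * a + 2.0 * b * e) - a) / b if b > 0.0 else e / a
        if t + s > horizon:
            break
        t += s
        x = [xj + s for xj in x]    # every backward time ages by s
        rates = event_rates(lam, alpha, x)
        i = rng.choices(range(n), weights=rates)[0]   # which event occurred
        counts[i] += 1
        x[i] = 0.0                  # reset the backward time of event i
    return counts

rng = random.Random(0)
lam = [0.01, 0.02]                          # hypothetical values
alpha = [[0.001, 0.0005], [0.0, 0.002]]     # hypothetical values
runs = [simulate_counts(lam, alpha, [5.0, 0.0], 168.0, rng) for _ in range(200)]
forecast = [sum(r[i] for r in runs) / len(runs) for i in range(2)]
print(forecast)   # Monte Carlo estimate of expected counts over 168 time units
```

With a diagonal `alpha`, each component's rate depends only on its own backward time, which corresponds to the independent case of Corollary 1.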
The renewal equation
Again, please note that the input data sample or, equivalently, the trajectory (2) has been observed and serves as input. Also, let \(\mathcal {F}_{t}\) be the sigma algebra generated by \(X(v), v \le t\) .
Preliminaries
Conditionally upon this trajectory, the inter-event time distribution
holds. Assuming stochastic independence conditionally upon \(X(v)_{v\le t}\), (9) immediately yields
Define further
Accordingly, \(f_i(t|X)\,dt + o(dt)\) is the probability for an event of type \(i \in E\) to occur in the time interval \((t,t+dt)\) conditionally upon the trajectory \(X(v)_{v\le t}\). Also let \(\hat{C}(t|X)\) be the vector of expected numbers of renewals at time t, if the process starts in state X. The following then holds:
Theorem 2
Let
Then
Proof
Equation (13), representing the expected number of renewals at time t under the condition that the process started in state X, can be conditioned upon the first occurrence of the event. If this occurs after time t (probability \(R(t|X)\)), then the expected numbers are \((0,\ldots ,0)^T\). If it occurs during the time interval \([u,u+du)\) somewhere in the interval [0, t) and is of type \(i \in E\) (probability \(f_i(u|X)\,du\)), then state X transforms into state \(X^{(i)}(X,u)\) and therefore the expected number, seen as a vector, is equal to \(e_i +\hat{C}(t-u|X^{(i)}(X,u))\). \(\square\)
Iterative approximation
In this section the individual components of \(\hat{C}(t|X)\) will be considered one by one and the arguments \(\lambda\) and \(\alpha\) will be suppressed. Also, \(\hat{C}_i(0|X)=0\) will be assumed. Then the individual components of equation (13) can be written as
Assume (14) has an iterative solution such that, for \(k=0,1,2,\ldots\)
Then stages (1) and (2) of the iterative approximation can be written as
Equation (16) can be used to approximate the cumulated number of events for rare failure codes, and only in the near term. The advantage is, however, that those numbers take into consideration the initial condition X and therefore are in agreement with the requirements of “Condition Based Maintenance”. It will now be shown that the iteration given in (15) converges.
Lemma 1
For any \(T \in {\mathrm {I\!R\!}}_+\) such that
the following holds:
Proof
Let
Then from (15) one obtains
and therefore
as shown in Appendix 1, proving the lemma in the limit for \(k \rightarrow \infty.\) \(\square\)
Please observe that (17) expresses the condition that the process will not “explode” at any time in a finite time interval.
A special case: stochastic independence
Assume (7) holds. Let \(\tilde{C}_i(t)\) be the solution of (14) under (7). The expected cumulated numbers of events then become independent. For each \(i \in E\) the following result can be proven, whereby \(\lambda := \lambda _i\) and \(\alpha := \alpha _{ii}\) have been set.
Corollary 2
whereby \(\Phi (x)\) denotes the cumulative distribution function of the standard normal distribution.
Proof
Note that
Let
whereby the second equation above is proven in Appendix 2.
Furthermore, defining
and making use of the Laplace transformations introduced above yields
Next, we compute \(L_{\phi }(s)\) and \(L_{f}(s)\). It is easy to see that
see Appendix 3. Also, \(L_{f}(s)\) is shown to be expressed as
as shown in Appendix 4. Now, upon using (26), (27) and (28) the following is obtained:
which yields
whereby—see Cox (1962)—O(1) is a function of s bounded as \(s\rightarrow 0\). According to Cox (1962), section 1.3, \(\tilde{C}_i(t)\) then satisfies
\(\square\)
An equivalent proof can be obtained from one of the central results in renewal theory which states that
whereby \(\bar{T}\) is the expected renewal time, see chapter 4 in Cox (1962). By definition
Using substitutions of the kind shown above and properties of the incomplete Gamma function, as defined, for instance, in Abramowitz and Stegun (1972), p. 262, one proves that
Using (32) along with (34) proves the statement.
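The expected renewal time in the independent case can be checked numerically. The sketch below assumes that a single component has rate \(\lambda + \alpha u\) at backward time u, so that its survival function is \(\exp (-\lambda u - \alpha u^2/2)\); completing the square then gives the closed form \(\bar{T} = \sqrt{2\pi /\alpha }\, e^{\lambda ^2/(2\alpha )}\,(1-\Phi (\lambda /\sqrt{\alpha }))\). This derivation is ours, and whether it coincides term by term with the display that follows "one proves that" cannot be verified from the text as it stands; the function names and parameter values are hypothetical.

```python
import math

def mean_renewal_time(lam, alpha):
    """Closed form of the integral of exp(-lam*u - alpha*u**2/2) over u >= 0."""
    if alpha == 0.0:
        return 1.0 / lam                        # exponential inter-event times
    z = lam / math.sqrt(alpha)
    upper_tail = 0.5 * math.erfc(z / math.sqrt(2.0))   # 1 - Phi(z)
    return math.sqrt(2.0 * math.pi / alpha) * math.exp(0.5 * z * z) * upper_tail

def mean_renewal_time_numeric(lam, alpha, upper=200.0, steps=200_000):
    """Trapezoidal approximation of the same integral on [0, upper]."""
    h = upper / steps
    f = lambda u: math.exp(-lam * u - 0.5 * alpha * u * u)
    total = 0.5 * (f(0.0) + f(upper)) + sum(f(k * h) for k in range(1, steps))
    return total * h

lam, alpha = 0.05, 0.01                         # hypothetical values
print(mean_renewal_time(lam, alpha), mean_renewal_time_numeric(lam, alpha))
print(1.0 / mean_renewal_time(lam, alpha))      # asymptotic slope of the expected count
```

The closed form overflows for very large \(\lambda /\sqrt{\alpha }\); a production version would need a scaled complementary error function, which is outside the scope of this sketch.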
The following conclusions are now easy to draw:
Corollary 3
If \(\lambda = 0\) then
If \(\,\alpha = 0\,\) then
Proof
(35) can immediately be derived from (22) by letting \(\lambda\) tend to zero. (36) must be concluded from equation (23), as (22) has been derived under the implicit assumption that \(\alpha \ne 0\) used in dividing the exponent by \(\alpha\), but the conclusion is straightforward. \(\square\)
Equation (35) has an interesting application. Assume (6) holds and, in addition, \(\lambda =0\) can be safely assumed. In that case (35) allows \(\alpha\) to be estimated by equating the slope of \(\hat{C}(t|\alpha =0)\) with the coefficient of t.
In the context of condition based maintenance, residual lifetimes of components or residual forward times of critical events must be estimated conditionally upon the system state, which frequently is expressed by parameters such as age or backward time. Let
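The slope-matching idea can be tried on synthetic data. The sketch assumes \(\lambda = 0\) and a single component with rate \(\alpha u\), for which the expected renewal time is \(\sqrt{\pi /(2\alpha )}\) and the long-run slope of the cumulated event curve is therefore \(\sqrt{2\alpha /\pi }\); this is our reading of the renewal-theory result quoted above and is not guaranteed to match the omitted display (35) exactly. The estimator name and the simulated data are hypothetical.

```python
import math
import random

def estimate_alpha_from_slope(event_times):
    """Assumes lam = 0, so the long-run slope of the cumulated event curve
    is sqrt(2*alpha/pi); inverting gives alpha = (pi/2) * slope**2."""
    slope = len(event_times) / event_times[-1]
    return 0.5 * math.pi * slope ** 2

# Synthetic check: with rate alpha*u the inter-event times are Rayleigh
# distributed and can be drawn as sqrt(2*E/alpha) with E ~ Exp(1).
rng = random.Random(1)
true_alpha = 0.01
t, times = 0.0, []
for _ in range(20000):
    t += math.sqrt(2.0 * rng.expovariate(1.0) / true_alpha)
    times.append(t)
alpha_hat = estimate_alpha_from_slope(times)
print(true_alpha, alpha_hat)
```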
be the expected forward time conditionally upon the event that the forward time exceeds \(\tau\). Then, in close analogy with (34), the following can be proven:
A first order correction
Let
and assume, for the sake of simplicity, \(\alpha _{i,j} \ge 0, i \in E, j \in E, j \ne i\). With the definitions given in Appendix 5 one can show that, under (7)—i.e. stochastic independence between the event numbers—the renewal equation becomes
The following proposition holds:
Proposition 1
whereby
and, approximately
as well as
In (43) \(\tilde{T}_i\) is defined as in (31), however with explicit reference to a failure mode \(i \in E\). This statement is given as a proposition rather than as a theorem, because the behavior of \(\tilde{C}_i(t|X)\) is used in its asymptotic approximation.
With (13), (40) and Appendix 6 one shows that
In Appendix 7 and Appendix 8 it is shown that
and this proves (43).
The estimation problem
With respect to numerical modelling, the first task is the estimation of \(\lambda _{i}\) and \(\alpha _{i,j},i\in E,j\in E\) from the sample in (1) by providing estimates \(\hat{\lambda }_{i}, \hat{\alpha }_{i,j},i\in E,j\in E\) for the model parameters \(\lambda _{i}, \alpha _{i,j}\). As usual, there are several ways to solve this task. One of them is the well-known maximum likelihood method; alternatively, the method of moments can be used. Since (5) suggests that the inter-event times in the sample T are not necessarily stochastically independent, the likelihood of the sample cannot be computed as the product of the likelihoods of its individual members. The method of moments is therefore used. Let
be the cumulative number of events up to and including time t. Observe that the sample of lifetimes as introduced above is in a one-to-one correspondence with the trajectory of backward times as defined in (2). Let the sum of squares SSQ be defined as
whereby \(\hat{C}_i(T_{i,k} | \hat{\lambda }_{i},\hat{\alpha }_{i,j})\) is the estimated cumulative number of events of type i up to and including time \(T_{i,k}\), based on the estimates \(\hat{\lambda }_{i},\hat{\alpha }_{i,j}\). Writing
for the sake of brevity, a mathematical expression for \(\hat{C}_i(T_{i,k})\) is needed. This expression is provided by means of the renewal equation.
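A sum of squares of this kind compares the observed staircase, which takes the value k at the k-th occurrence time \(T_{i,k}\), with a model value at the same instant. The sketch below assumes an unweighted sum over all classes and occurrences, which is our reading of the omitted definition; the callable `model` is a hypothetical stand-in for the estimated cumulative number, which the paper obtains from the renewal equation, and the linear model used in the example is only a placeholder.

```python
def sum_of_squares(occurrence_times, model):
    """occurrence_times[i] is the sorted list T_{i,1} <= T_{i,2} <= ...;
    model(i, t) returns the estimated cumulative count of class i at time t."""
    ssq = 0.0
    for i, times in enumerate(occurrence_times):
        for k, t in enumerate(times, start=1):   # observed count at T_{i,k} is k
            ssq += (k - model(i, t)) ** 2
    return ssq

occ = [[1.0, 2.0, 3.0], [2.0, 4.0]]              # hypothetical occurrence times
rates = [1.0, 0.4]                               # placeholder model: C_i(t) = rates[i] * t
print(sum_of_squares(occ, lambda i, t: rates[i] * t))
```

Passing the model as a callable keeps the objective independent of how the expected count is computed, so the same function can be reused with the iterative approximation or an asymptotic formula.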
Estimating \(\lambda\) and \(\alpha\) with the least squares principle
In principle, equation (48) provides the appropriate means to find optimal estimators \(\hat{\lambda }_i, \hat{\alpha }_{i,j},i=1,\ldots ,n,j=1,\ldots ,n\) by minimizing SSQ with respect to the estimators. Using (48) unchanged, however, means that during the process of minimizing SSQ (by means of a well-tested nonlinear numerical minimization routine) the renewal equation (13) would have to be called very often and that the full sample of observations of events would be ignored. Let now \(\bar{C}_{i}(t|T,\lambda ,\alpha ), T_{i,k} \le t < T_{i,k+1}, 1 \le k \le N_i-1\) be the conditional expectation of \(C_{i}(t)\), conditional upon the sample (1). Also let \(U_i\) be the conditional time of occurrence of the next event i after \(T_{i,k+1}\), conditional upon T and parameters \(\lambda\) and \(\alpha\). Then
and
Herein the following must be observed:
and
With (53) the estimation problem requires minimizing \(\widehat{SSQ}\), which is the sum of squares of the deviations between the observed and the estimated numbers of cumulated events, with respect to \(\lambda \, \text{and} \,\alpha\), i.e.
where (52) and (53) have been used.
Functional equations
Observing that (54) is of second order in the decision variables and that positivity constraints may have to be satisfied, the appropriate technique to minimize (54) involves a nonlinear minimization algorithm. A powerful representative of this type of technique is the Fletcher-Reeves-Polak-Ribiere (FRPR) algorithm, see Press (2007), for instance. Optimal estimators with respect to the quantities \((\lambda ,\alpha )=(\lambda _{i},\alpha _{i,j},i \in E,j\in E)\) according to the method of moments are obtained by

Differentiating (54) with respect to \(\lambda \, \text{and} \, \alpha _{i,j},i\in E,j \in E\)

Setting the results equal to zero and

Solving the resulting system of equations
Preparing the numerical solution
Equation (55), when used in a minimization routine such as the FRPR method, may result in negative values for \(\lambda _{i}\) and \(\alpha _{i,j},i\in E,j\in E\). Both are undesirable effects. Negative values for \(\lambda _{i}\) would imply negative random components of the event rates, while negative values for \(\alpha _{i,j},i \in E,j\in E\) would imply an unlikely healing effect, whereby the occurrence of events becomes less likely the longer the backward time is. While such an effect is not entirely implausible, it is not considered any further in this paper.
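One way to rule out negative estimates, which the Conclusions describe as temporarily using the squares of the parameters as decision variables, is to optimize over p with \(\lambda = p^2\). The sketch below shows the idea on a deliberately simple placeholder model, \(C(t) \approx \lambda t\), and uses plain gradient descent instead of the FRPR routine named in the text; the function name, step size and data are assumptions of this example.

```python
def fit_rate_squared_param(times, steps=500, lr=1e-3, p0=0.5):
    """Least-squares fit of C(t) ~ lam*t with lam = p**2, so that the
    estimate of lam is non-negative by construction."""
    p = p0
    for _ in range(steps):
        lam = p * p
        # d/dp sum_k (k - p^2 * t_k)^2 = sum_k -4 * p * t_k * (k - p^2 * t_k)
        grad = sum(-4.0 * p * t * (k - lam * t)
                   for k, t in enumerate(times, start=1))
        p -= lr * grad               # plain gradient step (not FRPR)
    return p * p

times = [2.0, 4.0, 6.0, 8.0]         # k-th event at time 2k, i.e. rate 0.5
print(fit_rate_squared_param(times))
```

The chain rule adds a factor 2p to every derivative, which is the price of the reparameterization; note that p = 0 is a stationary point, so the starting value should be nonzero.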
Therefore, defining an array \(p_i\) through the correspondence
(54) can be written as
Differentiating (54) with respect to \(\lambda _{i}\, \text{and} \, \alpha _{i,j},i\in E,j\in E\) is now replaced by differentiating (55) with respect to \(p_{i,0},\ldots ,p_{i,1+n}\), and yields the following nonlinear system of equations:
with
Example
The problem at hand has its origins in monitoring trains in the German railroad industry. There is an abundance of historical data; however, some or most of it is discrete, event based material. Sensor data is becoming available only slowly, less for technical reasons than because of the complexity of the architectural database design. This is why, parallel to stochastic time series analysis as used in evaluating sensor data, a mathematical model is needed to deal with event based data.
Figures 2 and 3 in Appendix 9 give two examples of approximating a cumulated event curve with an event rate model as given by (6). Both figures show the typical behaviour of this model in as far as the event rate grows quadratically with the backward time. This behaviour becomes very marked when long backward times are observed. In a simulation context, this is not a critical phenomenon, because quadratically increasing failure rates make sure that untypically long inter-event times become unlikely. Figure 3 also shows some typical behaviour of the model used: if there is enough structure in the time series and enough events, i.e. short backward times, then the model approximates the staircase structure represented by the cumulated event numbers to a sufficient degree of precision. As soon as backward times become long, event rates become prohibitively large. In Fig. 2 the following parameters have been used: \(n= 10,\lambda _i = 0.0001 + 0.0001*Z\), \(\alpha [i-1,i] = 0.00001,\alpha [i,i] = 0.00001,\alpha [i,i+1] = 0.00001,\alpha [i,i+2] = 0.00001\). Figure 3 contains the approximation result of yet another example, whereby \(n = 10,\lambda _i = 0.0005 + 0.0005*Z\), \(\alpha [i-1,i] = 0.0001 ,\alpha [i,i] = 0.0002, \alpha[i,i+1] = 0.0001,\alpha [i,i+2] = 0.000001\).
In both figures Z is a uniformly distributed random variable between 0 and 1. Please note the “quadratically explosive” nature of the expected cumulated event function with increasing backward times in Fig. 3.
Appendix
Appendix 1
One can prove that
and, upon using (17) and (19) this yields \(\gamma _{k+1}^2 \le \gamma _{k}^2*\rho ^2\), or, equivalently \(\gamma _{k+1} \le \gamma _{k}*\rho\).
Appendix 2
Using the substitution \(w:=\lambda u+\alpha \frac{u^2}{2}\), \(\frac{dw}{du}= \lambda +\alpha u\), \(u=0\Rightarrow w=0\), \(u=t\Rightarrow w=\lambda t+\alpha \frac{t^2}{2}\) one obtains
Appendix 3
Use the substitution \(v =\sqrt{\alpha }t+\frac{\lambda + s}{\sqrt{\alpha }}\), \(dv = \sqrt{\alpha } dt\), \(t=0\Rightarrow v = \frac{\lambda +s}{\sqrt{\alpha }}\), \(t=\infty \Rightarrow v =\infty\).
Appendix 4
The following holds:
with
Equations (63) can be shown to be equivalent to
Inserting (64) into (62) yields the statement.
Appendix 5
Appendix 6
Defining
one proves
which is equivalent to (43).
Appendix 7
Now, for the sake of simplicity, let \(X_j=0,j \in E\). This yields
Therefore
Appendix 8
Again, by letting \(X_l=0,l \in E\) one obtains, after some regrouping
Therefore
Appendix 9: Figures
Conclusions
This paper deals with a joint renewal process whose component processes are coupled via failure rates depending linearly on the vector of backward times. It is shown that such a process can be described by a multidimensional renewal equation. In the case of stochastic independence an asymptotic approximation for the limiting cumulative number of events is derived. It is also shown how the component processes become independent with one single quantity tending to zero. The model parameters can be estimated using the least squares principle. In order to prevent parameters such as rates and damage parameters from becoming negative, one can temporarily use the squares of the parameters as the decision variables in the least squares functional equations. A numerical example shows how the cumulative number of events is approximated by a continuous function.
The vector of backward times is by no means the only possible state variable to be used in a linear model. Rather, any statistic can be used, such as, for instance, the sliding average of the cumulated event numbers over a given embedding window.
References
Abramowitz M, Stegun IA. Handbook of Mathematical Functions. New York: Dover Publications; 1972.
Borgelt C, Picado-Muino D. Finding Frequent Patterns in Parallel Point Processes. Gonzalo Gutierrez Quiros, Mieres, Spain: European Centre for Soft Computing; 2012.
Cox DR. Renewal Theory. London: Methuen; 1962.
Gaigalas R, et al. Convergence of scaled renewal processes and a packet arrival model. Bernoulli. 2003;9(4):671–703.
Kai, Xu, et al. Neural Decoding Using a Parallel Sequential Monte Carlo Method on Point Processes with Ensemble Effect. BioMed Research International. 2014;2014: Article ID 685492.
Kallen MJ, et al. Superposition of renewal processes for modelling imperfect maintenance. Reliability, Risk and Safety: Theory and Applications. 2010.
Press WH, et al. Numerical Recipes: The Art of Scientific Computing. New York: Cambridge University Press; 2007.
Topics on Regenerative Processes. Boca Raton: CRC Press; 1994.
Truccolo W, et al. A Point Process Framework for Relating Neural Spiking Activity to Spiking History, Neural Ensemble and Extrinsic Covariate Effects. J Neurophysiol. 2005;93:1074–89.
Shanechi, Maryam Modir, et al. A Parallel Point-process Filter for Estimation of Goal-directed Movements from Neural Signals. IEEE; 2010. p. 521–524.
Authors’ contribution
WM suggested the use of the paradigm of a joint renewal process as a model for event based data. SF helped in elaborating the multidimensional analogue of the renewal equation. DJ was instrumental in carrying out most of the Laplace transformations. LL programmed the least squares minimization in order to determine the model parameters. All authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Mergenthaler, W., Jaroszewski, D., Feller, S. et al. A joint renewal process used to model event based data. Decis. Anal. 3, 2 (2016). https://doi.org/10.1186/s40165-016-0019-9
DOI: https://doi.org/10.1186/s40165-016-0019-9