Title: | Recurrent Event Data Analysis |
---|---|
Description: | Contains implementations of recurrent event data analysis routines including (1) survival and recurrent event data simulation from stochastic process point of view by the thinning method proposed by Lewis and Shedler (1979) <doi:10.1002/nav.3800260304> and the inversion method introduced in Cinlar (1975, ISBN:978-0486497976), (2) the mean cumulative function (MCF) estimation by the Nelson-Aalen estimator of the cumulative hazard rate function, (3) two-sample recurrent event responses comparison with the pseudo-score tests proposed by Lawless and Nadeau (1995) <doi:10.2307/1269617>, (4) gamma frailty model with spline rate function following Fu, et al. (2016) <doi:10.1080/10543406.2014.992524>. |
Authors: | Wenjie Wang [aut, cre] , Haoda Fu [aut], Sy Han (Steven) Chiou [ctb], Jun Yan [ctb] |
Maintainer: | Wenjie Wang <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.5.4 |
Built: | 2024-11-02 04:28:21 UTC |
Source: | https://github.com/wenjie2wang/reda |
The R package reda provides functions for simulating, exploring and modeling recurrent event data.
The main functions are summarized as follows:
simEventData
: Simulating survival, recurrent event, and
multiple event data from stochastic process point of view.
mcf
: Estimating the mean cumulative function (MCF) from a
fitted gamma frailty model, or from a sample recurrent event data by using
the nonparametic MCF estimator (the Nelson-Aelen estimator of the cumulative
hazard function).
mcfDiff
: Comparing two-sample MCFs by the pseudo-score tests
and estimating their difference over time.
rateReg
: Fitting Gamma fraitly model with spline baseline rate
function.
See the package vignettes for more introduction and demonstration.
AIC,rateReg-method
is an S4 class method calculating Akaike
information criterion (AIC) for one or several rateReg
objects,
according to the formula - 2 * log-likelihood + 2 * nPar, where nPar
represents the number of parameters in the fitted model.
## S4 method for signature 'rateReg' AIC(object, ..., k = 2)
## S4 method for signature 'rateReg' AIC(object, ..., k = 2)
object |
An object used to dispatch a method. |
... |
Optionally more fitted model objects. |
k |
An optional numeric value used as the penalty per parameter. The
default |
When comparing models fitted by maximum likelihood to the same data, the
smaller the AIC, the better the fit. A friendly warning will be thrown out
if the numbers of observation were different in the model comparison.
help(AIC, stats)
for other details.
If just one object is provided, a numeric value representing
calculated AIC. If multiple objects are provided, a data frame with
rows corresponding to the objects and columns df
and AIC
,
where df
means degree of freedom, which is the number of
parameters in the fitted model.
rateReg
for model fitting;
summary,rateReg-method
for summary of a fitted model;
BIC,rateReg-method
for BIC.
## See examples given in function rateReg.
## See examples given in function rateReg.
Summarize and convert the recurrent episodes for each subjects into character strings.
## S4 method for signature 'Recur' as.character(x, ...)
## S4 method for signature 'Recur' as.character(x, ...)
x |
An Recur object. |
... |
Other arguments for future usage. |
This function is intended to be a helper function for the 'show()' method of 'Recur' objects. To be precise, the function set the maximum number of recurrent episodes for each subject to be 'max(2L, as.integer(getOption("reda.Recur.maxPrint")))'. By default, at most three recurrent episodes will be summarized for printing. When subjects having more than three recurrent episodes, the first 'getOption("reda.Recur.maxPrint") - 1' number of recurrent episodes and the last one will be summarized. One may use 'options()' to adjust the setting. For example, the default value is equivalent to 'options(reda.Recur.maxPrint = 3)'.
An S4 class generic function that returns the estimated baseline rate function.
baseRate(object, ...) ## S4 method for signature 'rateReg' baseRate(object, level = 0.95, control = list(), ...)
baseRate(object, ...) ## S4 method for signature 'rateReg' baseRate(object, level = 0.95, control = list(), ...)
object |
An object used to dispatch a method. |
... |
Other arguments for future usage. |
level |
An optional numeric value indicating the confidence level required. The default value is 0.95. |
control |
An optional list to specify the time grid
where the baseline rate function is estimated.
The availble elements of the control list include
|
A baseRate
object.
baseRate,rateReg-method
: Estiamted baseline rate from a fitted model.
rateReg
for model fitting;
summary,rateReg-method
for summary of a fitted model;
plot,baseRate.rateReg-method
for ploting method.
## See examples given in function rateReg.
## See examples given in function rateReg.
An S4 class that represents the estimated baseline rate function from model.
The function baseRate
produces objects of this class.
baseRate
A data frame.
level
A numeric value.
BIC,rateReg-method
is an S4 class method calculating
Bayesian information criterion (BIC) or so-called
Schwarz's Bayesian criterion (SBC)
for one or several rateReg
objects,
according to the formula
- 2 * log-likelihood + ln(nObs) * nPar,
where nPar represents the number of parameters in the fitted model
and nObs is the number of observations.
## S4 method for signature 'rateReg' BIC(object, ...)
## S4 method for signature 'rateReg' BIC(object, ...)
object |
An object used to dispatch a method. |
... |
More fitted model objects. |
When comparing models fitted by maximum likelihood to the same
data, the smaller the BIC, the better the fit.
help(BIC, stats)
for other details.
If just one object is provided, a numeric value representing
calculated BIC.
If multiple objects are provided, a data frame with rows
corresponding to the objects and columns df
and BIC
,
where df
means degree of freedom,
which is the number of parameters in the fitted model.
rateReg
for model fitting;
summary,rateReg-method
for summary of a fitted model;
AIC,rateReg-method
for AIC.
## See examples given in function rateReg.
## See examples given in function rateReg.
Perform several checks for recurrent event data and update object
attributions if some rows of the contained data (in the .Data
slot)
have been removed by such as na.action
.
check_Recur(x, check = c("hard", "soft", "none"))
check_Recur(x, check = c("hard", "soft", "none"))
x |
An |
check |
A character value specifying how to perform the checks for
recurrent event data. Errors or warnings will be thrown, respectively,
if the |
An Recur
object invisibly.
coef,rateReg-method
is an S4 class method that extracts estimated
coefficients of covariates from rateReg
object produced by function
rateReg
.
## S4 method for signature 'rateReg' coef(object, ...)
## S4 method for signature 'rateReg' coef(object, ...)
object |
A |
... |
Other arguments for future usage. |
A named numeric vector.
rateReg
for model fitting;
confint,rateReg-method
for confidence intervals
for covariate coefficients;
summary,rateReg-method
for summary of a fitted model.
## See examples given in function rateReg.
## See examples given in function rateReg.
confint,rateReg-method
is an S4 class method for
rateReg
object, which returns approximate confidence intervals
for all or specified covariates.
## S4 method for signature 'rateReg' confint(object, parm, level = 0.95, ...)
## S4 method for signature 'rateReg' confint(object, parm, level = 0.95, ...)
object |
A |
parm |
A specification of which parameters are to be given confidence intervals, either a vector of numbers or a vector of names. If missing, all parameters are considered. |
level |
An optional numeric value to specify the confidence level required. By default, the value is 0.95, which produces 95% confidence intervals. |
... |
Other arguments for future usage. |
Under regularity condition (Shao 2003, Theorem 4.16 and Theorem 4.17, page 287, 290), the approximate confidence intervals are constructed loosely based on Fisher information matrix and estimates of coefficients.
A numeric matrix with row names and column names.
Shao, J. (2003), Mathematical statistics, Springer texts in statistics, New York: Springer, 2nd Edition.
rateReg
for model fitting;
coef,rateReg-method
for point estimates
of covariate coefficients;
summary,rateReg-method
for summary of a fitted model.
## See examples given in function rateReg.
## See examples given in function rateReg.
Return TRUE
if the specified xect is from the Recur
class, FALSE
otherwise.
is.Recur(x)
is.Recur(x)
x |
An |
A logical value.
An S4 class generic function that returns the mean cumulative function (MCF) estimates from a fitted model or returns the nonparametric MCF estimates (by Nelson-Aalen estimator or Cook-Lawless cumulative sample mean estimator) from the sample data.
mcf(object, ...) ## S4 method for signature 'formula' mcf( object, data, subset, na.action, variance = c("LawlessNadeau", "Poisson", "bootstrap", "CSV", "none"), logConfInt = FALSE, adjustRiskset = TRUE, level = 0.95, control = list(), ... ) ## S4 method for signature 'rateReg' mcf( object, newdata, groupName, groupLevels, level = 0.95, na.action, control = list(), ... )
mcf(object, ...) ## S4 method for signature 'formula' mcf( object, data, subset, na.action, variance = c("LawlessNadeau", "Poisson", "bootstrap", "CSV", "none"), logConfInt = FALSE, adjustRiskset = TRUE, level = 0.95, control = list(), ... ) ## S4 method for signature 'rateReg' mcf( object, newdata, groupName, groupLevels, level = 0.95, na.action, control = list(), ... )
object |
An object used to dispatch a method. |
... |
Other arguments for future usage. |
data |
A data frame, list or environment containing the variables in
the model. If not found in data, the variables are taken from
|
subset |
An optional vector specifying a subset of observations to be used in the fitting process. |
na.action |
A function that indicates what should the procedure do if
the data contains |
variance |
A character specifying the method for variance estimates.
The available options are |
logConfInt |
A logical value. If |
adjustRiskset |
A logical value indicating whether to adjust the size
of risk-set. If |
level |
An optional numeric value indicating the confidence level required. The default value is 0.95. |
control |
An optional named list specifying other options. For
The option For formula method, the available named elements are given as follows:
|
newdata |
An optional data frame. If specified, the data frame should have the same column names as the covariate names appearing in the formula of original fitting. |
groupName |
An optional length-one charactor vector to specify the name
for grouping each unique row in |
groupLevels |
An optional charactor vector to specify the levels for
each unique row in |
For formula
object with Recur
object as response, the
covariate specified at the right hand side of the formula should be either
1
or any "linear" conbination of categorical variable in the data.
The former computes the overall sample MCF. The latter computes the sample
MCF for each level of the combination of the categorical variable(s)
specified, respectively.
The MCF estimates are computed on each unique time point of the sample data.
By default, the size of risk set is adjusted over time based on the at-risk
indicators, which results in the Nelson-Aalen nonparametric estimator
(Nelson 2003). If the size of risk set remains a constant (total number of
processes) over time (specified by adjustRiskset = FALSE
), the
cumulative sample mean (CSM) function introduced in Chapter 1 of Cook and
Lawless (2007) will be computed instead. The point estimate of sample MCF
at each time point does not assume any particular underlying model. The
variance estimates at each time point is computed following the Lawless and
Nadeau method (LawLess and Nadeau 1995), the Poisson process method, or the
bootstrap methods. The approximate confidence intervals are provided as
well, which are constructed based on the asymptotic normality of the MCF
itself (by default) or the logarithm of MCF.
For rateReg
object, mcf
estimates the baseline MCF and its
confidence interval at each time grid if argument newdata
is not
specified. Otherwise, mcf
estimates MCF and its confidence interval
for the given newdata
based on Delta-method.
A mcf.formula
or mcf.rateReg
object.
A brief description of the slots of a mcf.formula
object is given as
follows:
formula
: Model Formula.
data
: Processed data based on the model formula or an
empty data frame if keep.data
is set to be FALSE
.
MCF
: A data frame containing estimates for sample MCF.
origin
: Time origins.
multiGroup
: A logical value indicating whether MCF
is estimated for different groups respectively.
logConfInt
: A logical value indicating whether the
variance estimates are based on the normality of logarithm of
the MCF estimates.
level
: Confidence level specified.
Most slots of a mcf.rateReg
object are inherited from the input
rateReg
object. A brief description of other slots is given as
follows:
newdata
: Given dataset used to estimate MCF.
MCF
: A data frame containing MCF estimates.
level
: Confidence level specified.
na.action
: The way handling missing values.
control
: The control list.
multiGroup
: A logical value indicating whether MCF
is estimated for different groups respectively.
mcf,formula-method
: Sample MCF from data.
mcf,rateReg-method
: Estimated MCF from a fitted model.
Cook, R. J., and Lawless, J. (2007). The statistical analysis of recurrent events, Springer Science & Business Media.
Lawless, J. F. and Nadeau, C. (1995). Some Simple Robust Methods for the Analysis of Recurrent Events. Technometrics, 37, 158–168.
Nelson, W. B. (2003). Recurrent Events Data Analysis for Product Repairs, Disease Recurrences, and Other Applications (Vol. 10). SIAM.
rateReg
for model fitting;
mcfDiff
for comparing two-sample MCFs.
plot-method
for plotting MCF.
library(reda) ### sample MCF ## Example 1. valve-seat data ## the default variance estimates by Lawless and Nadeau (1995) method valveMcf0 <- mcf(Recur(Days, ID, No.) ~ 1, data = valveSeats) plot(valveMcf0, conf.int = TRUE, mark.time = TRUE, addOrigin = TRUE) + ggplot2::xlab("Days") + ggplot2::theme_bw() ## variance estimates following Poisson process model valveMcf1 <- mcf(Recur(Days, ID, No.) ~ 1, data = valveSeats, variance = "Poisson") ## variance estimates by bootstrap method (with 1,000 bootstrap samples) set.seed(123) valveMcf2 <- mcf(Recur(Days, ID, No.) ~ 1, data = valveSeats, variance = "bootstrap", control = list(B = 200)) ## comparing the variance estimates from different methods library(ggplot2) ciDat <- rbind(cbind(valveMcf0@MCF, Method = "Lawless & Nadeau"), cbind(valveMcf1@MCF, Method = "Poisson"), cbind(valveMcf2@MCF, Method = "Bootstrap")) ggplot(ciDat, aes(x = time, y = se)) + geom_step(aes(color = Method, linetype = Method)) + xlab("Days") + ylab("SE estimates") + theme_bw() ## comparing the confidence interval estimates from different methods ggplot(ciDat, aes(x = time)) + geom_step(aes(y = MCF)) + geom_step(aes(y = lower, color = Method, linetype = Method)) + geom_step(aes(y = upper, color = Method, linetype = Method)) + xlab("Days") + ylab("Confidence intervals") + theme_bw() ## Example 2. the simulated data simuMcf <- mcf(Recur(time, ID, event) ~ group + gender, data = simuDat, ID %in% 1 : 50) plot(simuMcf, conf.int = TRUE, lty = 1 : 4, legendName = "Treatment & Gender") ### estimate MCF difference between two groups ## one sample MCF object of two groups mcf0 <- mcf(Recur(time, ID, event) ~ group, data = simuDat) ## two-sample pseudo-score tests mcfDiff.test(mcf0) ## difference estimates over time mcf0_diff <- mcfDiff(mcf0, testVariance = "none") plot(mcf0_diff) ## or explicitly ask for the difference of two sample MCF mcf1 <- mcf(Recur(time, ID, event) ~ 1, data = simuDat, subset = group %in% "Contr") mcf2 <- mcf(Recur(time, ID, event) ~ 1, data = simuDat, subset = group %in% "Treat") ## perform two-sample tests and estimate difference at the same time mcf12_diff1 <- mcfDiff(mcf1, mcf2) mcf12_diff2 <- mcf1 - mcf2 # or equivalently using the `-` method stopifnot(all.equal(mcf12_diff1, mcf12_diff2)) mcf12_diff1 plot(mcf12_diff1) ### For estimated MCF from a fitted model, ### see examples given in function rateReg.
library(reda) ### sample MCF ## Example 1. valve-seat data ## the default variance estimates by Lawless and Nadeau (1995) method valveMcf0 <- mcf(Recur(Days, ID, No.) ~ 1, data = valveSeats) plot(valveMcf0, conf.int = TRUE, mark.time = TRUE, addOrigin = TRUE) + ggplot2::xlab("Days") + ggplot2::theme_bw() ## variance estimates following Poisson process model valveMcf1 <- mcf(Recur(Days, ID, No.) ~ 1, data = valveSeats, variance = "Poisson") ## variance estimates by bootstrap method (with 1,000 bootstrap samples) set.seed(123) valveMcf2 <- mcf(Recur(Days, ID, No.) ~ 1, data = valveSeats, variance = "bootstrap", control = list(B = 200)) ## comparing the variance estimates from different methods library(ggplot2) ciDat <- rbind(cbind(valveMcf0@MCF, Method = "Lawless & Nadeau"), cbind(valveMcf1@MCF, Method = "Poisson"), cbind(valveMcf2@MCF, Method = "Bootstrap")) ggplot(ciDat, aes(x = time, y = se)) + geom_step(aes(color = Method, linetype = Method)) + xlab("Days") + ylab("SE estimates") + theme_bw() ## comparing the confidence interval estimates from different methods ggplot(ciDat, aes(x = time)) + geom_step(aes(y = MCF)) + geom_step(aes(y = lower, color = Method, linetype = Method)) + geom_step(aes(y = upper, color = Method, linetype = Method)) + xlab("Days") + ylab("Confidence intervals") + theme_bw() ## Example 2. the simulated data simuMcf <- mcf(Recur(time, ID, event) ~ group + gender, data = simuDat, ID %in% 1 : 50) plot(simuMcf, conf.int = TRUE, lty = 1 : 4, legendName = "Treatment & Gender") ### estimate MCF difference between two groups ## one sample MCF object of two groups mcf0 <- mcf(Recur(time, ID, event) ~ group, data = simuDat) ## two-sample pseudo-score tests mcfDiff.test(mcf0) ## difference estimates over time mcf0_diff <- mcfDiff(mcf0, testVariance = "none") plot(mcf0_diff) ## or explicitly ask for the difference of two sample MCF mcf1 <- mcf(Recur(time, ID, event) ~ 1, data = simuDat, subset = group %in% "Contr") mcf2 <- mcf(Recur(time, ID, event) ~ 1, data = simuDat, subset = group %in% "Treat") ## perform two-sample tests and estimate difference at the same time mcf12_diff1 <- mcfDiff(mcf1, mcf2) mcf12_diff2 <- mcf1 - mcf2 # or equivalently using the `-` method stopifnot(all.equal(mcf12_diff1, mcf12_diff2)) mcf12_diff1 plot(mcf12_diff1) ### For estimated MCF from a fitted model, ### see examples given in function rateReg.
An S4 class that represents sample mean cumulative function (MCF) from data.
The function mcf
produces objects of this class.
formula
Formula.
data
A data frame.
MCF
A data frame.
origin
A named numeric vector.
multiGroup
A logical value.
variance
A character vector.
logConfInt
A logical value.
level
A numeric value.
An S4 class that represents estimated mean cumulative function (MCF) from
Models. The function mcf
produces objects of this class.
call
Function call.
formula
Formula.
spline
A character.
knots
A numeric vector.
degree
A nonnegative integer.
Boundary.knots
A numeric vector.
newdata
A numeric matrix.
MCF
A data frame.
level
A numeric value between 0 and 1.
na.action
A length-one character vector.
control
A list.
multiGroup
A logical value.
This function estimates the sample MCF difference between two groups. Both the point estimates and the confidence intervals are computed (Lawless and Nadeau 1995). The two-sample pseudo-score test proposed by Cook, Lawless, and Nadeau (1996) is also performed by default.
mcfDiff(mcf1, mcf2 = NULL, level = 0.95, ...) mcfDiff.test( mcf1, mcf2 = NULL, testVariance = c("robust", "Poisson", "none"), ... )
mcfDiff(mcf1, mcf2 = NULL, level = 0.95, ...) mcfDiff.test( mcf1, mcf2 = NULL, testVariance = c("robust", "Poisson", "none"), ... )
mcf1 |
A |
mcf2 |
An optional second |
level |
A numeric value indicating the confidence level required. The default value is 0.95. |
... |
Other arguments passed to |
testVariance |
A character string specifying the method for computing
the variance estimate for the pseudo-score test statistic proposed by
Cook, Lawless, and Nadeau (1996). The applicable options include
|
The function mcfDiff
estimates the two-sample MCFs' difference and
internally calls function mcfDiff.test
to perform the pseudo-score
tests by default. A -
method is available as a simple wrapper for the
function mcfDiff
for comparing two-sample MCFs from two
mcf.formula
objects. For instance, suppose mcf1
and
mcf2
are mcf.formula
objects, each of which represents the
sample MCF estimates for one group. The function call mcf1 - mcf2
is
equivalent to mcfDiff(mcf1, mcf2)
.
The null hypothesis of the two-sample pseudo-score test is that there is no
difference between the two sample MCFs, while the alternative hypothesis
suggests a difference. The test is based on a family of test statistics
proposed by Lawless and Nadeau (1995). The argument testVariance
specifies the method for computing the variance estimates of the test
statistics under different model assumption. See the document of argument
testVariance
for all applicable options. For the variance estimates
robust to departures from Poisson process assumption, both constant weight
and the linear weight function (with scaling) suggested in Cook, Lawless,
and Nadeau (1996) are implemented. The constant weight is powerful in cases
where the two MCFs are approximately proportional to each other. The linear
weight function is originally a(u) = t - u
, where u
represents
the time variable and t
is the first time point when the risk set of
either group becomes empty. It is further scaled by 1 / t
for test
statistics invariant to the unit of measurement of the time variable. The
linear weight function puts more emphasis on the difference at earily times
than later times and is more powerful for cases where the MCFs are no longer
proportional to each other, but not crossing. Also see Cook and Lawless
(2007, Section 3.7.5) for more details.
The function mcfDiff
returns a mcfDiff
object (of S4 class)
that contains the following slots:
call
: Function call.
MCF
: Estimated Mean cumulative function Difference at each time
point.
origin
: Time origins of the two groups.
variance
: The method used for variance estimates.
logConfInt
: A logical value indicating whether normality is
assumed for log(MCF)
instead of MCF itself. For mcfDiff
object, it is always FALSE
.
level
: Confidence level specified.
test
: A mcfDiff.test
object for the hypothesis test
results.
The function mcfDiff.test
returns a mcfDiff.test
object (of S4
class) that contains the following slots:
.Data
: A numeric matrix (of two rows and five columns) for
hypothesis testing results.
testVariance
: A character string (or vector of length one)
indicating the method used for the variance estimates of the test statistic.
Lawless, J. F., & Nadeau, C. (1995). Some Simple Robust Methods for the Analysis of Recurrent Events. Technometrics, 37(2), 158–168.
Cook, R. J., Lawless, J. F., & Nadeau, C. (1996). Robust Tests for Treatment Comparisons Based on Recurrent Event Responses. Biometrics, 52(2), 557–571.
Cook, R. J., & Lawless, J. (2007). The Statistical Analysis of Recurrent Events. Springer Science & Business Media.
## See examples given for function mcf.
## See examples given for function mcf.
An S4 class that represents the difference between two sample mean
cumulative functions from data. The function mcfDiff
produces objects of this class.
call
A function call.
MCF
A data frame.
origin
A named numeric vector.
variance
A character vector.
logConfInt
A logical value.
level
A numeric value.
test
A mcfDiff.test
class object.
An S4 class that represents the results of the two-sample pseudo-score tests
between two sample mean cumulative functions. The function
mcfDiff.test
produces objects of this class.
.Data
A numeric matrix.
testVariance
A character vector.
This function helps the parametrizations of covariates and covariate
coeffcients when users specify a general hazard rate function in function
simEvent
and simEventData
. It applies the specified function
(or the built-in option) FUN
to the row of the covariate
matrix
z
and the row of the coefficient matrix,
iteratively, for
from one to the number of rows of the covariate
matrix
z
.
parametrize(z, zCoef, FUN = c("exponential", "linear", "excess"), ...)
parametrize(z, zCoef, FUN = c("exponential", "linear", "excess"), ...)
z |
A numeric matrix, each row of which represents the covariate vector at one perticular time point. |
zCoef |
A numeric matrix, each row of which represents the covariate coeffcient vector at one perticular time point. |
FUN |
The parametrization of the model parameter(s) with covariates and
covariate coefficients. The built-in options include
|
... |
Other arguments that can be passed to the function |
A numeric vector.
simEvent
## time points timeVec <- c(0.5, 2) ## time-variant covariates zMat <- cbind(0.5, ifelse(timeVec > 1, 1, 0)) ## time-varying coefficients zCoefMat <- cbind(sin(timeVec), timeVec) ## the following three ways are equivalent for the exponential form, ## where the first one (using the built-in option) has the best performance parametrize(zMat, zCoefMat, FUN = "exponential") parametrize(zMat, zCoefMat, function(z, zCoef) exp(z %*% zCoef)) sapply(1 : 2, function(i) as.numeric(exp(zMat[i, ] %*% zCoefMat[i, ])))
## time points timeVec <- c(0.5, 2) ## time-variant covariates zMat <- cbind(0.5, ifelse(timeVec > 1, 1, 0)) ## time-varying coefficients zCoefMat <- cbind(sin(timeVec), timeVec) ## the following three ways are equivalent for the exponential form, ## where the first one (using the built-in option) has the best performance parametrize(zMat, zCoefMat, FUN = "exponential") parametrize(zMat, zCoefMat, function(z, zCoef) exp(z %*% zCoef)) sapply(1 : 2, function(i) as.numeric(exp(zMat[i, ] %*% zCoefMat[i, ])))
S4 class methods plotting sample MCF from data, estimated MCF, or estimated
baseline hazard rate function from a fitted model by using ggplot2
plotting system. The plots generated are thus able to be further customized
properly.
## S4 method for signature 'mcf.formula,missing' plot( x, y, lty, col, legendName, legendLevels, conf.int = FALSE, mark.time = FALSE, addOrigin = FALSE, ... ) ## S4 method for signature 'mcf.rateReg,missing' plot(x, y, conf.int = FALSE, lty, col, ...) ## S4 method for signature 'baseRate.rateReg,missing' plot(x, y, conf.int = FALSE, lty, col, ...) ## S4 method for signature 'mcfDiff,missing' plot( x, y, lty, col, legendName, legendLevels, conf.int = TRUE, addOrigin = FALSE, ... )
## S4 method for signature 'mcf.formula,missing' plot( x, y, lty, col, legendName, legendLevels, conf.int = FALSE, mark.time = FALSE, addOrigin = FALSE, ... ) ## S4 method for signature 'mcf.rateReg,missing' plot(x, y, conf.int = FALSE, lty, col, ...) ## S4 method for signature 'baseRate.rateReg,missing' plot(x, y, conf.int = FALSE, lty, col, ...) ## S4 method for signature 'mcfDiff,missing' plot( x, y, lty, col, legendName, legendLevels, conf.int = TRUE, addOrigin = FALSE, ... )
x |
An object used to dispatch a method. |
y |
An argument that should be missing and ignored now. Its existence
is just for satisfying the definition of generaic function |
lty |
An optional numeric vector indicating line types specified to different groups: 0 = blank, 1 = solid, 2 = dashed, 3 = dotted, 4 = dotdash, 5 = longdash, 6 = twodash. |
col |
An optional character vector indicating line colors specified to different groups. |
legendName |
An optional length-one charactor vector to specify the
name for grouping each unique row in |
legendLevels |
An optional charactor vector to specify the levels for
each unique row in |
conf.int |
A logical value indicating whether to plot confidence
interval. The default value is |
mark.time |
A logical value with default value |
addOrigin |
A logical value indicating whether the MCF curves start
from origin time. The default value is |
... |
Other arguments for further usage. |
A ggplot
object.
mcf
for estimation of MCF;
rateReg
for model fitting.
## See examples given in function mcf and rateReg.
## See examples given in function mcf and rateReg.
This function fits recurrent event data (event counts) by gamma frailty model with spline rate function. The default model is the gamma frailty model with one piece constant baseline rate function, which is equivalent to negative binomial regression with the same shape and rate parameter in the gamma prior. Spline (including piecewise constant) baseline hazard rate function can be specified for the model fitting.
rateReg( formula, data, subset, na.action, start = list(), control = list(), contrasts = NULL, ... ) rateReg.control( df = NULL, degree = 0L, knots = NULL, Boundary.knots = NULL, periodic = FALSE, verbose = TRUE, ... )
rateReg( formula, data, subset, na.action, start = list(), control = list(), contrasts = NULL, ... ) rateReg.control( df = NULL, degree = 0L, knots = NULL, Boundary.knots = NULL, periodic = FALSE, verbose = TRUE, ... )
formula |
|
data |
An optional data frame, list or environment containing the
variables in the model. If not found in data, the variables are taken
from |
subset |
An optional vector specifying a subset of observations to be used in the fitting process. |
na.action |
A function that indicates what should the procedure do if
the data contains |
start |
An optional list of starting values for the parameters to be estimated in the model. See more in Section details. |
control |
An optional list of parameters to control the maximization process of negative log likelihood function and adjust the baseline rate function. See more in Section details. |
contrasts |
An optional list, whose entries are values (numeric
matrices or character strings naming functions) to be used as
replacement values for the contrasts replacement function and whose
names are the names of columns of data containing factors. See
|
... |
Other arguments passed to |
df |
A nonnegative integer to specify the degree of freedom of baseline
rate function. If argument |
degree |
A nonnegative integer to specify the degree of spline bases. |
knots |
A numeric vector that represents all the internal knots of
baseline rate function. The default is |
Boundary.knots |
A length-two numeric vector to specify the boundary knots for baseline rate funtion. By default, the left boundary knot is the smallest origin time and the right one takes the largest censoring time from data. |
periodic |
A logical value indicating if periodic splines should be used. |
verbose |
A logical value with default |
Function Recur
in the formula response by default first checks
the dataset and will report an error if the dataset does not fall into
recurrent event data framework. Subject's ID will be pinpointed if its
observation violates any checking rule. See Recur
for all the
checking rules.
Function rateReg
first constructs the design matrix from
the specified arguments: formula
, data
, subset
,
na.action
and constrasts
before model fitting.
The constructed design matrix will be checked again to
fit the recurrent event data framework
if any observation with missing covariates is removed.
The model fitting process involves minimization of negative log
likelihood function, which calls function constrOptim
internally. help(constrOptim)
for more details.
The argument start
is an optional list
that allows users to specify the initial guess for
the parameter values for the minimization of
negative log likelihood function.
The available numeric vector elements in the list include
beta
: Coefficient(s) of covariates,
set to be all 0.1 by default.
theta
: Parameter in Gamma(theta, 1 / theta) for
frailty random effect, set to be 0.5 by default.
alpha
: Coefficient(s) of baseline rate function,
set to be all 0.05 by default.
The argument control
allows users to control the process of
minimization of negative log likelihood function passed to
constrOptim
and specify the boundary knots of baseline rate function.
A rateReg
object, whose slots include
call
: Function call of rateReg
.
formula
: Formula used in the model fitting.
nObs
: Number of observations.
spline
: A list contains
spline
: The name of splines used.
knots
: Internal knots specified for the baseline
rate function.
Boundary.knots
: Boundary knots specified for the
baseline rate function.
degree
: Degree of spline bases specified in
baseline rate function.
df
: Degree of freedom of the model specified.
estimates
: Estimated coefficients of covariates and
baseline rate function, and estimated rate parameter of
gamma frailty variable.
control
: The control list specified for model fitting.
start
: The initial guess specified for the parameters
to be estimated.
na.action
: The procedure specified to deal with
missing values in the covariate.
xlevels
: A list that records the levels in
each factor variable.
contrasts
: Contrasts specified and used for each
factor variable.
convergCode
: code
returned by function
optim
, which is an integer indicating why the
optimization process terminated. help(optim)
for details.
logL
: Log likelihood of the fitted model.
fisher
: Observed Fisher information matrix.
Fu, H., Luo, J., & Qu, Y. (2016). Hypoglycemic events analysis via recurrent time-to-event (HEART) models. Journal Of Biopharmaceutical Statistics, 26(2), 280–298.
summary,rateReg-method
for summary of fitted model;
coef,rateReg-method
for estimated covariate coefficients;
confint,rateReg-method
for confidence interval of
covariate coefficients;
baseRate,rateReg-method
for estimated coefficients of baseline
rate function;
mcf,rateReg-method
for estimated MCF from a fitted model;
plot,mcf.rateReg-method
for plotting estimated MCF.
library(reda) ## constant rate function (constFit <- rateReg(Recur(time, ID, event) ~ group + x1, data = simuDat)) ## six pieces' piecewise constant rate function (piecesFit <- rateReg(Recur(time, ID, event) ~ group + x1, data = simuDat, subset = ID %in% 1:50, knots = seq.int(28, 140, by = 28))) ## fit rate function with cubic spline (splineFit <- rateReg(Recur(time, ID, event) ~ group + x1, data = simuDat, knots = c(56, 84, 112), degree = 3)) ## more specific summary summary(constFit) summary(piecesFit) summary(splineFit) ## model selection based on AIC or BIC AIC(constFit, piecesFit, splineFit) BIC(constFit, piecesFit, splineFit) ## estimated covariate coefficients coef(piecesFit) coef(splineFit) ## confidence intervals for covariate coefficients confint(piecesFit) confint(splineFit, "x1", 0.9) confint(splineFit, 1, 0.975) ## estimated baseline rate function splinesBase <- baseRate(splineFit) plot(splinesBase, conf.int = TRUE) ## estimated baseline mean cumulative function (MCF) from a fitted model piecesMcf <- mcf(piecesFit) plot(piecesMcf, conf.int = TRUE, col = "blueviolet") ## estimated MCF for given new data newDat <- data.frame(x1 = rep(0, 2), group = c("Treat", "Contr")) splineMcf <- mcf(splineFit, newdata = newDat, groupName = "Group", groupLevels = c("Treatment", "Control")) plot(splineMcf, conf.int = TRUE, lty = c(1, 5)) ## example of further customization by ggplot2 library(ggplot2) plot(splineMcf) + geom_ribbon(aes(x = time, ymin = lower, ymax = upper, fill = Group), data = splineMcf@MCF, alpha = 0.2) + xlab("Days")
library(reda) ## constant rate function (constFit <- rateReg(Recur(time, ID, event) ~ group + x1, data = simuDat)) ## six pieces' piecewise constant rate function (piecesFit <- rateReg(Recur(time, ID, event) ~ group + x1, data = simuDat, subset = ID %in% 1:50, knots = seq.int(28, 140, by = 28))) ## fit rate function with cubic spline (splineFit <- rateReg(Recur(time, ID, event) ~ group + x1, data = simuDat, knots = c(56, 84, 112), degree = 3)) ## more specific summary summary(constFit) summary(piecesFit) summary(splineFit) ## model selection based on AIC or BIC AIC(constFit, piecesFit, splineFit) BIC(constFit, piecesFit, splineFit) ## estimated covariate coefficients coef(piecesFit) coef(splineFit) ## confidence intervals for covariate coefficients confint(piecesFit) confint(splineFit, "x1", 0.9) confint(splineFit, 1, 0.975) ## estimated baseline rate function splinesBase <- baseRate(splineFit) plot(splinesBase, conf.int = TRUE) ## estimated baseline mean cumulative function (MCF) from a fitted model piecesMcf <- mcf(piecesFit) plot(piecesMcf, conf.int = TRUE, col = "blueviolet") ## estimated MCF for given new data newDat <- data.frame(x1 = rep(0, 2), group = c("Treat", "Contr")) splineMcf <- mcf(splineFit, newdata = newDat, groupName = "Group", groupLevels = c("Treatment", "Control")) plot(splineMcf, conf.int = TRUE, lty = c(1, 5)) ## example of further customization by ggplot2 library(ggplot2) plot(splineMcf) + geom_ribbon(aes(x = time, ymin = lower, ymax = upper, fill = Group), data = splineMcf@MCF, alpha = 0.2) + xlab("Days")
The class rateReg
is an S4 class that represents a fitted model. The
function rateReg
produces objects of this class. See
“Slots” for details.
call
Function call.
formula
Formula.
nObs
A positive integer
spline
A list.
estimates
A list.
control
A list.
start
A list.
na.action
A character vector (of length one).
xlevels
A list.
contrasts
A list.
convergCode
A nonnegative integer.
logL
A numeric value.
fisher
A numeric matrix.
Create an S4 class object that represents formula response for recurrent event data with optional checking procedures embedded.
Recur( time, id, event, terminal, origin, check = c("hard", "soft", "none"), ... )
Recur( time, id, event, terminal, origin, check = c("hard", "soft", "none"), ... )
time |
A numerical vector representing the time of reccurence event or
censoring, or a list with elements named |
id |
Subject identificators. It can be numeric vector, character
vector, or a factor vector. If it is left unspecified, |
event |
A numeric vector that may represent the status, costs, or types of the recurrent events. Logical vector is allowed and converted to numeric vector. Non-positive values are internally converted to zero indicating censoring status. |
terminal |
A numeric vector that may represent the status, costs, or
types of the terminal events. Logical vector is allowed and converted
to numeric vector. Non-positive values are internally converted to zero
indicating censoring status. If a scalar value is specified, all
subjects will have the same status of terminal events at their last
recurrent episodes. The length of the specified |
origin |
The time origin of each subject. If a scalar value is
specified, all subjects will have the same origin at the specified
value. The length of the specified |
check |
A character value specifying how to perform the checks for
recurrent event data. Errors or warnings will be thrown, respectively,
if the |
... |
Other arguments for future usage. A warning will be thrown if any invalid argument is specified. |
This is a successor function of the deprecated function Survr
. See
the vignette by 'vignette("reda-Recur")' for details.
An Recur
object.
library(reda) with(valveSeats, Recur(Days, ID)) with(valveSeats, Recur(Days, ID, No.)) with(valveSeats, Recur(Days, ID, No., terminal = 1)) with(valveSeats, Recur(Days, ID, No., origin = 10))
library(reda) with(valveSeats, Recur(Days, ID)) with(valveSeats, Recur(Days, ID, No.)) with(valveSeats, Recur(Days, ID, No., terminal = 1)) with(valveSeats, Recur(Days, ID, No., origin = 10))
The class Recur
is an S4 that represents a formula response for
recurrent event data model. The function Recur
produces
objects of this class. See “Slots” for details.
.Data
A numeric matrix that consists of the following columns:
time1
: the beginning of time segements;
time2
: the end of time segements;
id
: Identificators
of subjects;
event
: Event indicators;
:
terminal
: Indicators of terminal events.
call
A function call producing the object.
ID
A character vector for unique original identificators of subjects.
ord
An integer vector for increasingly ordering data by id
,
time2
, and - event
. Sorting is often done in the
model-fitting steps, where the indices stored in this slot can be used
directly.
rev_ord
An integer vector for reverting the ordering of the sorted
data (by ord
) to its original ordering. This slot is provided to
easily revert the sorting.
first_idx
An integer vector indicating the first record of each subject in the sorted matrix. It helps in the data checking produce and may be helpful in model-fitting step, such as getting the origin time.
last_idx
An integer vector indicating the last record of each subject
in the sorted data. Similar to first_idx
, it helps in the data
checking produce and may be helpful in the model-fitting step, such as
locating the terminal events.
check
A character string indicating how the data checking is performed. It just records the option that users specified on data checking.
time_class
A character vector preserving the class(es) of input times.
Specify time segements or recurrent episodes by endpoints.
time1 %to% time2 time1 %2% time2
time1 %to% time2 time1 %2% time2
time1 |
The left end-points of the recurrent episodes. |
time2 |
The right end-points of the recurrent episodes. |
This function is intended to be used for specifying the argument time
in function Recur
.
A list that consists of two elements named
"time1"
and "time2"
.
S4 class methods that display objects produced from this package (similar to
S3 class print
methods).
## S4 method for signature 'Recur' show(object) ## S4 method for signature 'rateReg' show(object) ## S4 method for signature 'summary.rateReg' show(object) ## S4 method for signature 'summary.Recur' show(object) ## S4 method for signature 'mcf.formula' show(object) ## S4 method for signature 'mcf.rateReg' show(object) ## S4 method for signature 'simEvent' show(object) ## S4 method for signature 'mcfDiff' show(object) ## S4 method for signature 'mcfDiff.test' show(object)
## S4 method for signature 'Recur' show(object) ## S4 method for signature 'rateReg' show(object) ## S4 method for signature 'summary.rateReg' show(object) ## S4 method for signature 'summary.Recur' show(object) ## S4 method for signature 'mcf.formula' show(object) ## S4 method for signature 'mcf.rateReg' show(object) ## S4 method for signature 'simEvent' show(object) ## S4 method for signature 'mcfDiff' show(object) ## S4 method for signature 'mcfDiff.test' show(object)
object |
An object used to dispatch a method. |
The function simEvent
generates simulated recurrent events or
survival time (the first event time) from one stochastic process. The
function simEventData
provides a simple wrapper that calls
simEvent
internally and collects the generated survival data or
recurrent events into a data frame. More examples are available in one of
the package vignettes in addition to the function documentation.
simEvent( z = 0, zCoef = 1, rho = 1, rhoCoef = 1, rhoMax = NULL, origin = 0, endTime = 3, frailty = 1, recurrent = TRUE, interarrival = "rexp", relativeRisk = c("exponential", "linear", "excess", "none"), method = c("thinning", "inversion"), arguments = list(), ... ) simEventData(nProcess = 1, z = 0, origin = 0, endTime = 3, frailty = 1, ...)
simEvent( z = 0, zCoef = 1, rho = 1, rhoCoef = 1, rhoMax = NULL, origin = 0, endTime = 3, frailty = 1, recurrent = TRUE, interarrival = "rexp", relativeRisk = c("exponential", "linear", "excess", "none"), method = c("thinning", "inversion"), arguments = list(), ... ) simEventData(nProcess = 1, z = 0, origin = 0, endTime = 3, frailty = 1, ...)
z |
Time-invariant or time-varying covariates. The default value is
|
zCoef |
Time-invariant or time-varying coefficients of covariates. The
default value is |
rho |
Baseline rate (or intensity) function for the Poisson process.
The default is |
rhoCoef |
Coefficients of baseline rate function. The default value is
|
rhoMax |
A positive number representing an upper bound of the underlying rate function (excluding the frailty term but including the covariate effect) for the thinning method. If this argument is left unspecified, the function will try to determine an upper bound internally. |
origin |
The time origin set to be |
endTime |
The end of follow-up time set to be |
frailty |
A positive number or a function for frailty effect. The
default value is |
recurrent |
A logical value with default value |
interarrival |
A function object for randomly generating (positive)
interarrival time between two successive arrivals/events. The default
value is |
relativeRisk |
Relateive risk function for incorporating the covariates
and the covariate coefficients into the intensity function. The
applicable choices include |
method |
A character string specifying the method for generating simulated recurrent or survival data. The default method is thinning method (Lewis and Shedler 1979). Another available option is the inversion method (Cinlar 1975). When the rate function may go to infinite, the inversion method is used and a warning will be thrown out if the thinning method is initially specified. |
arguments |
A list that consists of named lists for specifying other
arguments in the corresponding functions. For example, if a function of
time named |
... |
Additional arguements passed from function |
nProcess |
Number of stochastic processes. If missing, the value will
be the number of row of the specified matrix |
For each process, a time-invariant or time-varying baseline hazard rate
(intensity) function of failure can be specified. Covariates and their
coefficients can be specified and incorporated by the specified relative
risk functions. The default is the exponential relative risk function, which
corresponds to the Cox proportional hazard model (Cox 1972) for survival
data or Andersen-Gill model (Andersen and Gill 1982) for recurrent
events. Other relative risk function can be specified through the argument
relativeRisk
. In addition, a frailty effect can be considered.
Conditional on predictors (or covariates) and the unobserved frailty effect,
the process is by default a Poisson process, where the interarrival times
between two successive arrivals/events follow exponential distribution. A
general renewal process can be specified through interarrival
for
other distributions of the interarrival times in addition to the exponential
distribution.
The thinning method (Lewis and Shedler 1979) is applied for bounded hazard rate function by default. The inversion method (Cinlar 1975) is also available for possibly unbounded but integrable rate function over the given time period. The inversion method will be used when the rate function may go to infinite and a warning will be thrown out if the thinning method is specified originally.
For the covariates z
, the covariate coefficients zCoef
, and
the baseline hazard rate function rho
, a function of time can be
specified for time-varying effect. The first argument of the input function
has to be the time variable (not need to be named as "time" though). Other
arguments of the function can be specified through a named list in
arguments
, while the first argument should not be specified.
For the frailty effect frailty
, the starting point origin
, and
the end point of the process endTime
, functions that generate random
numbers can be specified. An argument n = 1
will be implicitly
specified if the function has an argument named n
, which is designed
for those common functions generating random numbers from stats
package. Similar to z
, zCoef
, and rho
, other arguments
of the function can be specified through a named list in arguments
.
For time-varying covariates, the function simEventData
assumes
covariates can be observed only at event times and censoring times. Thus,
covariate values are returned only at these time points. If we want other
observed covariate values to be recorded, we may write a simple wrapper
function for simEvent
similar to simEventData
.
The function simEvent
returns a simEvent
S4 class object and
the function simEventData
returns a data.frame
.
Andersen, P. K., & Gill, R. D. (1982). Cox's regression model for counting processes: A large sample study. The annals of statistics, 10(4), 1100–1120.
Cinlar, Erhan (1975). Introduction to stochastic processes. Englewood Cliffs, NJ: Printice-Hall.
Cox, D. R. (1972). Regression models and life-tables. Journal of the Royal Statistical Society. Series B (Methodological), 34(2), 187–220.
Lewis, P. A., & G. S. Shedler. (1979). Simulation of Nonhomogeneous Poisson Processes by Thinning. Naval Research Logistics Quarterly, 26(3), Wiley Online Library: 403–13.
library(reda) set.seed(123) ### time-invariant covariates and coefficients ## one process simEvent(z = c(0.5, 1), zCoef = c(1, 0)) simEvent(z = 1, zCoef = 0.5, recurrent = FALSE) ## simulated data simEventData(z = c(0.5, 1), zCoef = c(1, 0), endTime = 2) simEventData(z = cbind(rnorm(3), 1), zCoef = c(1, 0)) simEventData(z = matrix(rnorm(5)), zCoef = 0.5, recurrent = FALSE) ### time-varying covariates and time-varying coefficients zFun <- function(time, intercept) { cbind(time / 10 + intercept, as.numeric(time > 1)) } zCoefFun <- function(x, shift) { cbind(sqrt(x + shift), 1) } simEvent(z = zFun, zCoef = zCoefFun, arguments = list(z = list(intercept = 0.1), zCoef = list(shift = 0.1))) ## same function of time for all processes simEventData(3, z = zFun, zCoef = zCoefFun, arguments = list(z = list(intercept = 0.1), zCoef = list(shift = 0.1))) ## same function within one process but different between processes ## use quote function in the arguments simDat <- simEventData(3, z = zFun, zCoef = zCoefFun, arguments = list( z = list(intercept = quote(rnorm(1) / 10)), zCoef = list(shift = 0.1) )) ## check the intercept randomly generated, ## which should be the same within each ID but different between IDs. unique(with(simDat, cbind(ID, intercept = round(X.1 - time / 10, 6)))) ### non-negative time-varying baseline hazard rate function simEvent(rho = function(timeVec) { sin(timeVec) + 1 }) simEventData(3, origin = rnorm(3), endTime = rnorm(3, 5), rho = function(timeVec) { sin(timeVec) + 1 }) ## specify other arguments simEvent(z = c(rnorm(1), rbinom(1, 1, 0.5)) / 10, rho = function(a, b) { sin(a + b) + 1 }, arguments = list(rho = list(b = 0.5))) simEventData(z = cbind(rnorm(3), rbinom(3, 1, 0.5)) / 10, rho = function(a, b) { sin(a + b) + 1 }, arguments = list(rho = list(b = 0.5))) ## quadratic B-splines with one internal knot at "time = 1" ## (using function 'bSpline' from splines2 package) simEvent(rho = splines2::bSpline, rhoCoef = c(0.8, 0.5, 1, 0.6), arguments = list(rho = list(degree = 2, knots = 1, intercept = TRUE, Boundary.knots = c(0, 3)))) ### frailty effect ## Gamma distribution with mean one simEvent(z = c(0.5, 1), zCoef = c(1, 0), frailty = rgamma, arguments = list(frailty = list(shape = 2, scale = 0.5))) ## lognormal with mean zero (on the log scale) set.seed(123) simEvent(z = c(0.5, 1), zCoef = c(1, 0), frailty = "rlnorm", arguments = list(frailty = list(sdlog = 1))) ## or equivalently set.seed(123) logNorm <- function(a) exp(rnorm(n = 1, mean = 0, sd = a)) simEvent(z = c(0.5, 1), zCoef = c(1, 0), frailty = logNorm, arguments = list(frailty = list(a = 1))) ### renewal process ## interarrival times following uniform distribution rUnif <- function(n, rate, min) runif(n, min, max = 2 / rate) simEvent(interarrival = rUnif, arguments = list(interarrival = list(min = 0))) ## interarrival times following Gamma distribution with scale one set.seed(123) simEvent(interarrival = function(n, rate) rgamma(n, shape = 1 / rate)) ## or equivalently set.seed(123) simEvent(interarrival = function(rate) rgamma(n = 1, shape = 1 / rate)) ### relative risk functioin set.seed(123) simEvent(relativeRisk = "linear") ## or equivalently rriskFun <- function(z, zCoef, intercept) { as.numeric(z %*% zCoef) + intercept } set.seed(123) simEvent(relativeRisk = rriskFun, arguments = list(relativeRisk = list(intercept = 1)))
library(reda) set.seed(123) ### time-invariant covariates and coefficients ## one process simEvent(z = c(0.5, 1), zCoef = c(1, 0)) simEvent(z = 1, zCoef = 0.5, recurrent = FALSE) ## simulated data simEventData(z = c(0.5, 1), zCoef = c(1, 0), endTime = 2) simEventData(z = cbind(rnorm(3), 1), zCoef = c(1, 0)) simEventData(z = matrix(rnorm(5)), zCoef = 0.5, recurrent = FALSE) ### time-varying covariates and time-varying coefficients zFun <- function(time, intercept) { cbind(time / 10 + intercept, as.numeric(time > 1)) } zCoefFun <- function(x, shift) { cbind(sqrt(x + shift), 1) } simEvent(z = zFun, zCoef = zCoefFun, arguments = list(z = list(intercept = 0.1), zCoef = list(shift = 0.1))) ## same function of time for all processes simEventData(3, z = zFun, zCoef = zCoefFun, arguments = list(z = list(intercept = 0.1), zCoef = list(shift = 0.1))) ## same function within one process but different between processes ## use quote function in the arguments simDat <- simEventData(3, z = zFun, zCoef = zCoefFun, arguments = list( z = list(intercept = quote(rnorm(1) / 10)), zCoef = list(shift = 0.1) )) ## check the intercept randomly generated, ## which should be the same within each ID but different between IDs. unique(with(simDat, cbind(ID, intercept = round(X.1 - time / 10, 6)))) ### non-negative time-varying baseline hazard rate function simEvent(rho = function(timeVec) { sin(timeVec) + 1 }) simEventData(3, origin = rnorm(3), endTime = rnorm(3, 5), rho = function(timeVec) { sin(timeVec) + 1 }) ## specify other arguments simEvent(z = c(rnorm(1), rbinom(1, 1, 0.5)) / 10, rho = function(a, b) { sin(a + b) + 1 }, arguments = list(rho = list(b = 0.5))) simEventData(z = cbind(rnorm(3), rbinom(3, 1, 0.5)) / 10, rho = function(a, b) { sin(a + b) + 1 }, arguments = list(rho = list(b = 0.5))) ## quadratic B-splines with one internal knot at "time = 1" ## (using function 'bSpline' from splines2 package) simEvent(rho = splines2::bSpline, rhoCoef = c(0.8, 0.5, 1, 0.6), arguments = list(rho = list(degree = 2, knots = 1, intercept = TRUE, Boundary.knots = c(0, 3)))) ### frailty effect ## Gamma distribution with mean one simEvent(z = c(0.5, 1), zCoef = c(1, 0), frailty = rgamma, arguments = list(frailty = list(shape = 2, scale = 0.5))) ## lognormal with mean zero (on the log scale) set.seed(123) simEvent(z = c(0.5, 1), zCoef = c(1, 0), frailty = "rlnorm", arguments = list(frailty = list(sdlog = 1))) ## or equivalently set.seed(123) logNorm <- function(a) exp(rnorm(n = 1, mean = 0, sd = a)) simEvent(z = c(0.5, 1), zCoef = c(1, 0), frailty = logNorm, arguments = list(frailty = list(a = 1))) ### renewal process ## interarrival times following uniform distribution rUnif <- function(n, rate, min) runif(n, min, max = 2 / rate) simEvent(interarrival = rUnif, arguments = list(interarrival = list(min = 0))) ## interarrival times following Gamma distribution with scale one set.seed(123) simEvent(interarrival = function(n, rate) rgamma(n, shape = 1 / rate)) ## or equivalently set.seed(123) simEvent(interarrival = function(rate) rgamma(n = 1, shape = 1 / rate)) ### relative risk functioin set.seed(123) simEvent(relativeRisk = "linear") ## or equivalently rriskFun <- function(z, zCoef, intercept) { as.numeric(z %*% zCoef) + intercept } set.seed(123) simEvent(relativeRisk = rriskFun, arguments = list(relativeRisk = list(intercept = 1)))
An S4 class that represents the simulated recurrent event or survival time
from one stochastic process. The function simEvent
produces
objects of this class.
.Data
A numerical vector of possibly length zero.
call
A function call.
z
A list.
zCoef
A list.
rho
A list.
rhoCoef
A numerical vector.
frailty
A list.
origin
A list.
endTime
A list.
censoring
A list.
recurrent
A logical vector.
interarrival
A list.
relativeRisk
A list.
method
A character vector.
A simulated data frame with covariates named
ID
, time
, event
, group
, x1
,
and gender
, where
ID
: Subjects identification;
time
: Event or censoring time;
event
: Event indicator, 1 = event, 0 = censored;
group
: Treatment group indicator;
x1
: Continuous variable.
gender
: Gender of subjects.
A data frame with 500 rows and 6 variables.
The sample dataset is originally simulated by the thinning method developed by Lewis and Shedler (1979) and further processed for a better demonstration purpose. See Fu et al. (2016) for details also.
Lewis, P. A., & Shedler, G. S. (1979). Simulation of nonhomogeneous Poisson processes by thinning. Naval Research Logistics Quarterly, 26(3), 403–413.
Fu, H., Luo, J., & Qu, Y. (2016). Hypoglycemic events analysis via recurrent time-to-event (HEART) models. Journal Of Biopharmaceutical Statistics, 26(2), 280–298.
Summary of estimated coefficients of covariates, rate function bases, and estimated rate parameter of frailty random variable, etc.
## S4 method for signature 'rateReg' summary(object, showCall = TRUE, showKnots = TRUE, ...)
## S4 method for signature 'rateReg' summary(object, showCall = TRUE, showKnots = TRUE, ...)
object |
A |
showCall |
A logic value with dafault |
showKnots |
A logic value with default |
... |
Other arguments for future usage. |
summary,rateReg-method
returns a
summary.rateReg
object,
whose slots include
covarCoef
: Estimated covariate coefficients.
frailtyPar
: Estimated rate parameter of gamma frailty.
baseRateCoef
: Estimated coeffcients of baseline
rate function.
For the meaning of other slots, see rateReg
.
summary.rateReg
object
rateReg
for model fitting;
coef,rateReg-method
for point estimates of
covariate coefficients;
confint,rateReg-method
for confidence intervals
of covariate coeffcients;
baseRate,rateReg-method
for coefficients of baseline
rate function.
## See examples given in function rateReg.
## See examples given in function rateReg.
Recur
objectSummarize an Recur
object
## S4 method for signature 'Recur' summary(object, ...)
## S4 method for signature 'Recur' summary(object, ...)
object |
An |
... |
Other arguments not used. |
summary.Recur
object.
The class summary.rateReg
is an S4 class with selective slots of
rateReg
object. See “Slots” for details. The function
summary,rateReg-method
produces objects of this class.
call
Function call.
spline
A character.
knots
A numeric vector.
Boundary.knots
A numeric vector.
covarCoef
A numeric matrix.
frailtyPar
A numeric matrix.
degree
A nonnegative integer.
baseRateCoef
A numeric matrix.
logL
A numeric value.
An S4 Class for Summarized Recur Object
call
A function call.
sampleSize
An integer representing the sample size (number of subjects).
reSize
An integer representing the number of recurrent events.
avgReSize
A numeric value representing the average number of recurrent events per subject.
propTem
A numeric value representing the proportion of subjects having terminal event.
medFU
A numeric value for median follow-up time.
medTem
A numeric value for median survival time of the terminal events.
Create an S4 class that represents formula response for recurrent event data modeled by methods based on counts and rate function. Note that the function is deprecated since version 0.5.0 and will be removed in future.
Survr(ID, time, event, origin = 0, check = TRUE, ...)
Survr(ID, time, event, origin = 0, check = TRUE, ...)
ID |
Subject identificators. It can be numeric vector, character vector, or a factor vector. |
time |
Time of reccurence event or censoring. In addition to numeric
values, |
event |
A numeric vector indicating failure cost or event indicator
taking positive values as costs ( |
origin |
The time origin of each subject or process. In addition to
numeric values, |
check |
A logical value suggesting whether to perform data checking
procedure. The default value is |
... |
Other arguments for future usage. |
This is a similar function to Survr
in package
survrec but with a more considerate checking procedure embedded for
recurrent event data modeled by methods based on counts and rate function.
The checking rules apply to each subject respectively and include that
Subject identification, event times, censoring time, and event indicator cannot be missing or contain missing values.
There has to be only one censoring time not earlier than any event time.
The time origin has to be the same and not later than any event time.
The class Survr
is an S4 that represents a formula response for
recurrent event data model. The function Survr
produces
objects of this class. See “Slots” for details.
.Data
A numeric matrix object.
ID
A charactrer vector for original subject identificator.
check
A logical value indicating whether to performance data checking.
ord
An integer vector for increasingly ordering data by ID
,
time
, and 1 - event
.
Valve seats wear out in certain diesel engines, each with 16 valve seats.
The dataset served as an example of recurrence data in Nelson (1995),
which consists of valve-seat replacements on 41 engines in a fleet.
The covariates are named
ID
, Days
, and No.
, where
ID
: The engine number;
Days
: Engine age in days;
No.
: Event indicator, '1' for a valve-seat replacement
and, '0' for the censoring age of an engine.
A data frame with 89 rows and 3 variables.
Nelson, W. (1995), Confidence Limits for Recurrence Data-Applied to Cost or Number of Product Repairs, Technometrics, 37, 147–157.