--- title: "Introduction to Formula Response Function Recur()" author: Wenjie Wang date: "`r Sys.Date()`" output: rmarkdown::html_document: toc: true toc_depth: 2 toc_float: true # bibliography: ../inst/bib/reda.bib vignette: > %\VignetteIndexEntry{Introduction to Formula Response Function Recur()} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, echo = 3:4} knitr::opts_chunk$set(comment = "") head <- utils::head library(reda) packageVersion("reda") ``` # Overview The `Recur()` function provides a flexible and widely applicable formula response interface for modeling recurrent event data with considerate data checking procedures. It combined the flexible interface of `reSurv()` (deprecated in **reReg** version 1.1.7) and the effective checking procedures embedded in the `Survr()` (deprecated in **reda** version 0.5.0). # Function Interface The function interface of `Recur()` is given below. ```R Recur(time, id, event, terminal, origin, check = c("hard", "soft", "none"), ...) ``` A high-level introduction to each argument is as follows: - `time`: event and censoring times - `id`: subject's id - `event`: recurrent event indicator, cost, or type - `terminal`: event indicator of terminal events - `origin`: time origin of subjects - `check`: how to run the data checking procedure - `"hard"`: throw errors if the `check_Recur()` finds any issue on the data structure - `"soft"`: throw warnings instead - `"none"`: not to run the checking procedure More details of arguments are provided in the function documentation by `?Recur`. # The `Recur` Object The function `Recur()` returns an S4-class `Recur` object representing model response for recurrent event data. The `Recur` class object mainly contains a numerical matrix object (in the `.Data` slot) that serves as a model response matrix. The other slots are - `call`: a function call producing the object. - `ID`: a factor storing the original subject's ID, which originally can be a character vector, a numeric vector, or a factor). It is needed to pinpoint data issues for particular subjects with their original ID's. - `ord`: indices that sort the response matrix (by rows) increasingly by `id`, `time2`, and `- event`. Sorting is often done in the model-fitting steps, where the indices stored in this slot can be used directly. - `rev_ord`: indices that revert the increasingly sorted response matrix by `ord` to its original ordering. This slot is provided to easily revert the sorting. - `first_idx`: indices that indicates the first record of each subject in the sorted matrix. It helps in the data checking produce and may be helpful in model-fitting step, such as getting the origin time. - `last_idx`: indices that indicates the last record of each subject in the sorted matrix. Similar to `first_idx`, it helps in the data checking produce and may be helpful in the model-fitting step, such as locating the terminal events. - `check`: a character string that records the specified `check` argument. It just records the option that users specified on data checking. # Usage Among all the arguments, only the argument `time` does not have default values and thus has to be specified by users. ## When only `time` is given - The function assumes that each time point is specified for each subject. - The `id` takes its default value: `seq_along(time)`. - The `event` takes its default values: `0` (censoring) at the last record of each subject, and `1` (event) before censoring. - Both `terminal` and `origin` take zero for all subjects by default. ```{r ex-1} ex1 <- Recur(3:5) head(ex1) ``` ## When `time` and `id` are given - The `event` takes its default values: `0` (censoring) at the last record of each subject, and `1` (event) before censoring. - Both `terminal` and `origin` take zero for all subjects by default. ```{r ex-2} ex2 <- Recur(6:1, id = rep(1:2, 3)) head(ex2) ## sort by id, time2, and - event head(ex2[ex2@ord, ]) ``` - The slot `ord` stores the indices that sort the response matrix by `id`, `time2`, and `- event`. ## Helper `%to%` for recurrent episodes The function `Recur()` allows users to input recurrent episodes by `time1` and `time2`, which can be specified with help of `%to%` (or its alias `%2%`) in `Recur()`. For example, ```{r ex-3} left <- c(1, 5, 7) right <- c(3, 7, 9) ex3 <- Recur(left %to% right, id = c("A1", "A1", "A2")) head(ex3) ``` Internally, the function `%to%` returns a list with element named `"time1"` and `"time2"`. Therefore, it is equivalent to specify such a list. ```{r ex-4} ex4 <- Recur(list(time1 = left, time2 = right), id = c("A1", "A1", "A2")) stopifnot(all.equal(ex3, ex4, check.attributes = FALSE)) ``` ## About `origin` and `terminal` - Both `origin` and `terminal` take a numeric vector. - The length of specified vector can be one, equal to the number of subjects, or the number of `time`. Some simple examples are given below. ```{r ex-5} ex5 <- Recur(3:5, origin = 1, terminal = 1) head(ex5) ``` ```{r ex-6} ex6 <- Recur(3:5, id = c("A1", "A1", "A2"), origin = 1:2, terminal = c(0, 1)) head(ex6) ``` ```{r ex-7} ex7 <- Recur(3:5, id = c("A1", "A1", "A2"), origin = c(1, 1, 2), terminal = c(0, 0, 1)) stopifnot(all.equal(ex6, ex7, check.attributes = FALSE)) ``` - An error message will be thrown out if the length is inappropriate. ```{r origin-terminal-err} try(Recur(1:10, origin = c(1, 2))) try(Recur(1:10, terminal = c(1, 2))) ``` ## Data Checking Rules The `Recur()` (internally calls `check_Recur()` and) checks whether the specified data fits into the recurrent event data framework by several rules if `check = "hard"` or `check = "soft"`. The existing rules and the corresponding examples are given below. 1. Every subject must have one censoring not before any event time. ```{r rule1} try(Recur(1:5, id = c(rep("A1", 3), "A2", "A3"), event = c(0, 0, 1, 0, 0))) ``` 2. Every subject must have one terminal event time. ```{r rule2} try(Recur(1:3, id = rep("A1", 3), terminal = c(0, 1, 1))) ``` 3. Event or censoring times cannot be missing. ```{r rule4} try(Recur(c(1:2, NA), id = rep("A1", 3))) ``` 4. Event times cannot be earlier than the origin time. ```{r rule5} try(Recur(3:5, id = rep("A1", 3), origin = 10)) try(Recur(3:5 %to% 1:3, id = rep("A1", 3))) ``` 5. The recurrent episode cannot be overlapped. ```{r rule6} try(Recur(c(0, 3, 5) %to% c(1, 6, 10), id = rep("A1", 3))) ``` 6. However, recurrent episode without events is allowed for possible time-varying covariates and risk-free gaps. ```{r rule3} Recur(c(0, 2, 6) %to% c(1, 3, 8), id = rep("A1", 3), event = c(0, 1, 0)) ``` # The `Show()` Method A `show()` method is added for the `Recur` object in a similar fashion to the output of the function `survival:::print.Surv()`, which internally converts the input `Recur` object to character strings representing the recurrent episodes by a dedicated `as.character()` method. For each recurrent episode, - Censoring not due to terminal is indicated by a trailing `+` sign; - Censoring due to terminal is indicated by a trailing `*` sign; - Otherwise, an event happens at the end of the recurrent episode. For a concise printing, the `show()` method takes the `getOption("reda.Recur.maxPrint")` to limit the maximum number of recurrent episodes to be printed for each process. By default, `options(reda.Recur.maxPrint = 3)` is set. ## The Valve Seats Example We may illustrate the results of the `show()` method by the example valve seats data, where terminal events are artificially added. ```{r show1} set.seed(123) term_events <- rbinom(length(unique(valveSeats$ID)), 1, 0.5) with(valveSeats, Recur(Days, ID, No., term_events)) ``` ## On Missing times The updated `show()` method preserves `NA`'s when `check = "none"`. However, `NA`'s will always appear because times are sorted internally. ```{r show-na} Recur(c(NA, 3:6, NA), id = rep(1:2, 3), check = "none") ``` ## Number of digits The `show()` method takes the value of `options("digits") - 3` to determine the largest number of digits for printing. ```{r show-dig1} op <- options() getOption("digits") Recur(pi, 1) ``` ```{r show-dig2} options(digits = 10) Recur(pi, 1) options(op) # reset (all) initial options ```