Title: | Hazard Discrimination Summary |
---|---|
Description: | Functions for calculating the hazard discrimination summary and its standard errors, as described in Liang and Heagerty (2016) <doi:10.1111/biom.12628>. |
Authors: | C. Jason Liang [aut, cre] |
Maintainer: | C. Jason Liang <[email protected]> |
License: | GPL-2 |
Version: | 0.8.1 |
Built: | 2025-03-03 04:02:34 UTC |
Source: | https://github.com/liangcj/hds |
finda
estimates the time-varying coefficients beta(t) at a single time
from a local-in-time Cox model. Think of it as a Cox model where the
the coefficients are allowed to vary with time. Further details can be found
in Cai and Sun (2003) and Tian et al. (2005).
finda(tt, times, status, covars, start = rep(0, ncol(covars)), h = 400, ...)
finda(tt, times, status, covars, start = rep(0, ncol(covars)), h = 400, ...)
tt |
Time to estimate beta(t) at |
times |
A vector of observed follow up times. |
status |
A vector of status indicators, usually 0=alive, 1=dead. |
covars |
A matrix or data frame of numeric covariate values, with a column for each covariate and each observation is on a separate row. |
start |
A vector of length p of starting values to be passed to
|
h |
A single value on the time scale representing the bandwidth to use. |
... |
Additional parameters to pass to |
The naming of the function finda
stands for "find a(t)", where "a(t)"
is the notation used in Cai and Sun (2003) to represent the time-varying
Cox model coefficients. We also refer to "a(t)" as "beta(t)" through the documentation.
The user typically will not interact with this function, as finda
is
wrapped by hdslc
.
A vector of length p, where p is the number of covariates. The vector
is the estimated beta(t) from the local-in-time Cox model at time tt
.
Cai Z and Sun Y (2003). Local linear estimation for time-dependent coefficients in Cox's regression models. Scandinavian Journal of Statistics, 30: 93-111. doi:10.1111/1467-9469.00320
Tian L, Zucker D, and Wei LJ (2005). On the Cox model with time-varying regression coefficients. Journal of the American Statistical Association, 100(469):172-83. doi:10.1198/016214504000000845
Returns hazard discimination summary (HDS) estimates at all specified evaluation times. See Liang and Heagerty (2016) for details on HDS.
hds(times, status, m, evaltimes = times[order(times)], se = TRUE)
hds(times, status, m, evaltimes = times[order(times)], se = TRUE)
times |
A vector of observed follow up times. |
status |
A vector of status indicators, usually 0=alive, 1=dead. |
m |
A matrix or data frame of covariate values, with a column for each
covariate and each observation is on a separate row. Non-numeric values
are acceptable, as the values will be transformed into a numeric model
matrix through |
evaltimes |
A vector of times at which to estimate HDS. Defaults to all
the times specified by the |
se |
TRUE or FALSE. TRUE: calculate and return standard error estimates. FALSE: do not calculate standard errors estimates and return NAs. Defaults to TRUE. May want to set to FALSE to save computation time if using this function to compute bootstrap standard errors. |
A wrapper for hds_t
. Since hds_t
only estimates HDS at one time
point, this function calls hds_t
multiple times to estimate the entire
HDS curve. hds
and hdslc
are the main functions the user will
interact with in this package.
The covariate values m
are centered for numerical stability. This is
particularly relevant for the standard error calculations.
A data frame with three columns: 1) the evaluation times, 2) the HDS estimates at each evaluation time, and 3) the standard error estimates at each evaluation time
Liang CJ and Heagerty PJ (2016). A risk-based measure of time-varying prognostic discrimination for survival models. Biometrics. doi:10.1111/biom.12628
## Not run: head(hds(times = survival::pbc[1:312, 2], status = (survival::pbc[1:312, 3]==2)*1, m = survival::pbc[1:312, 5])) hdsres <- hds(times=pbc5[,1], status=pbc5[,2], m=pbc5[,3:7]) hdslcres <- hdslc(times = pbc5[,1], status=pbc5[,2], m = pbc5[,3:7], h = 730) Survt <- summary(survival::survfit(survival::Surv(pbc5[,1], pbc5[,2])~1)) Survtd <- cbind(Survt$time, c(0,diff(1-Survt$surv))) tden <- density(x=Survtd[,1], weights=Survtd[,2], bw=100, kernel="epanechnikov") par(mar=c(2.25,2.25,0,0)+0.1, mgp=c(1.25,0.5,0)) plot(c(hdslcres[,1], hdslcres[,1]), c(hdslcres[,2] - 1.96*hdslcres[,3], hdslcres[,2] + 1.96*hdslcres[,3]), type="n", xlab="days", ylab="HDS(t)", cex.lab=.75, cex.axis=.75, ylim=c(-3,15), xlim=c(0,3650)) polygon(x=c(hdsres[,1], hdsres[312:1,1]), col=rgb(1,0,0,.25), border=NA, fillOddEven=TRUE, y=c(hdsres[,2]+1.96*hdsres[,3], (hdsres[,2]-1.96*hdsres[,3])[312:1])) polygon(x=c(hdslcres[,1], hdslcres[312:1, 1]), col=rgb(0,0,1,.25), border=NA, fillOddEven=TRUE, y=c(hdslcres[,2] + 1.96*hdslcres[,3], (hdslcres[,2] - 1.96*hdslcres[,3])[312:1])) lines(hdsres[,1], hdsres[,2], lwd=2, col="red") lines(hdslcres[,1], hdslcres[,2], lwd=2, col="blue") abline(h=1, lty=3) legend(x=1200, y=14, legend=c("Proportional hazards", "Local-in-time proportional hazards", "Time density"), col=c("red", "blue", "gray"), lwd=2, bty="n", cex=0.75) with(tden, polygon(c(x, x[length(x):1]), c(y*3/max(y)-3.5, rep(-3.5, length(x))), col="gray", border=NA, fillOddEven=TRUE)) ## End(Not run)
## Not run: head(hds(times = survival::pbc[1:312, 2], status = (survival::pbc[1:312, 3]==2)*1, m = survival::pbc[1:312, 5])) hdsres <- hds(times=pbc5[,1], status=pbc5[,2], m=pbc5[,3:7]) hdslcres <- hdslc(times = pbc5[,1], status=pbc5[,2], m = pbc5[,3:7], h = 730) Survt <- summary(survival::survfit(survival::Surv(pbc5[,1], pbc5[,2])~1)) Survtd <- cbind(Survt$time, c(0,diff(1-Survt$surv))) tden <- density(x=Survtd[,1], weights=Survtd[,2], bw=100, kernel="epanechnikov") par(mar=c(2.25,2.25,0,0)+0.1, mgp=c(1.25,0.5,0)) plot(c(hdslcres[,1], hdslcres[,1]), c(hdslcres[,2] - 1.96*hdslcres[,3], hdslcres[,2] + 1.96*hdslcres[,3]), type="n", xlab="days", ylab="HDS(t)", cex.lab=.75, cex.axis=.75, ylim=c(-3,15), xlim=c(0,3650)) polygon(x=c(hdsres[,1], hdsres[312:1,1]), col=rgb(1,0,0,.25), border=NA, fillOddEven=TRUE, y=c(hdsres[,2]+1.96*hdsres[,3], (hdsres[,2]-1.96*hdsres[,3])[312:1])) polygon(x=c(hdslcres[,1], hdslcres[312:1, 1]), col=rgb(0,0,1,.25), border=NA, fillOddEven=TRUE, y=c(hdslcres[,2] + 1.96*hdslcres[,3], (hdslcres[,2] - 1.96*hdslcres[,3])[312:1])) lines(hdsres[,1], hdsres[,2], lwd=2, col="red") lines(hdslcres[,1], hdslcres[,2], lwd=2, col="blue") abline(h=1, lty=3) legend(x=1200, y=14, legend=c("Proportional hazards", "Local-in-time proportional hazards", "Time density"), col=c("red", "blue", "gray"), lwd=2, bty="n", cex=0.75) with(tden, polygon(c(x, x[length(x):1]), c(y*3/max(y)-3.5, rep(-3.5, length(x))), col="gray", border=NA, fillOddEven=TRUE)) ## End(Not run)
hds_t
estimates HDS at time t under the PH assumption
hds_t(t, L0hat, betahat, m)
hds_t(t, L0hat, betahat, m)
t |
The time at which to calculate HDS |
L0hat |
A data frame with variable names of hazard and time. Typically
the object returned by |
betahat |
A vector of coefficient estimates from the Cox model.
Typically the |
m |
A numeric matrix of covariate values, with a column for each covariate and each observation is on a separate row. |
The user typically will not interact with this function. Rather, hds
is a wrapper for this function and is what the user typically will use.
Returns local constant HDS estimates at all specified evaluation times. See Liang and Heagerty (2016) for details on HDS.
hdslc(times, status, m, evaltimes = times[order(times)], h = 1.06 * sd(times) * (length(times)^(-0.2)), se = TRUE)
hdslc(times, status, m, evaltimes = times[order(times)], h = 1.06 * sd(times) * (length(times)^(-0.2)), se = TRUE)
times |
A vector of observed follow up times. |
status |
A vector of status indicators, usually 0=alive, 1=dead. |
m |
A matrix or data frame of covariate values, with a column for each
covariate and each observation is on a separate row. Non-numeric values
are acceptable, as the values will be transformed into a numeric model
matrix through |
evaltimes |
A vector of times at which to estimate HDS. Defaults to all
the times specified by the |
h |
A single numeric value representing the bandwdith to use, on the time scale. The default bandwidth is a very ad hoc estimate using Silverman's rule of thumb |
se |
TRUE or FALSE. TRUE: calculate and return standard error estimates. FALSE: do not calculate standard errors estimates and return NAs. Defaults to TRUE. May want to set to FALSE to save computation time if using this function to compute bootstrap standard errors. |
A local constant version of hds
. While hds
estimates HDS(t)
assuming the Cox proportional hazards model, hdslc
estimates HDS(t)
using a relaxed, local-in-time Cox model. Specifically, the hazard
ratios are allowed to vary with time. See Cai and Sun (2003) and Tian Zucker Wei (2005) for details on the
local-in-time Cox model.
Point estimates use hdslc.fast
and standard errors use
hdslcse.fast
. hdslc.fast
requires an estimate of beta(t) (time-varying
hazard ratio), which is estimated using finda()
; and subject specific
survival, which is estimated using sssf.fast(). hdslcse.fast
requires the same
and in addition standard error estimates of beta(t), which are estimated
using betahatse.fast()
.
The covariate values m
are centered for numerical stability. This is
particularly relevant for the standard error calculations.
A data frame with three columns: 1) the evaluation times, 2) the HDS estimates at each evaluation time, and 3) the standard error estimates at each evaluation time
Liang CJ and Heagerty PJ (2016). A risk-based measure of time-varying prognostic discrimination for survival models. Biometrics. doi:10.1111/biom.12628
Cai Z and Sun Y (2003). Local linear estimation for time-dependent coefficients in Cox's regression models. Scandinavian Journal of Statistics, 30: 93-111. doi:10.1111/1467-9469.00320
Tian L, Zucker D, and Wei LJ (2005). On the Cox model with time-varying regression coefficients. Journal of the American Statistical Association, 100(469):172-83. doi:10.1198/016214504000000845
## Not run: head(hdslc(times = survival::pbc[1:312, 2], status = (survival::pbc[1:312, 3]==2)*1, m = survival::pbc[1:312, 5])) hdsres <- hds(times=pbc5[,1], status=pbc5[,2], m=pbc5[,3:7]) hdslcres <- hdslc(times = pbc5[,1], status=pbc5[,2], m = pbc5[,3:7], h = 730) Survt <- summary(survival::survfit(survival::Surv(pbc5[,1], pbc5[,2])~1)) Survtd <- cbind(Survt$time, c(0,diff(1-Survt$surv))) tden <- density(x=Survtd[,1], weights=Survtd[,2], bw=100, kernel="epanechnikov") par(mar=c(2.25,2.25,0,0)+0.1, mgp=c(1.25,0.5,0)) plot(c(hdslcres[,1], hdslcres[,1]), c(hdslcres[,2] - 1.96*hdslcres[,3], hdslcres[,2] + 1.96*hdslcres[,3]), type="n", xlab="days", ylab="HDS(t)", cex.lab=.75, cex.axis=.75, ylim=c(-3,15), xlim=c(0,3650)) polygon(x=c(hdsres[,1], hdsres[312:1,1]), col=rgb(1,0,0,.25), border=NA, fillOddEven=TRUE, y=c(hdsres[,2]+1.96*hdsres[,3], (hdsres[,2]-1.96*hdsres[,3])[312:1])) polygon(x=c(hdslcres[,1], hdslcres[312:1, 1]), col=rgb(0,0,1,.25), border=NA, fillOddEven=TRUE, y=c(hdslcres[,2] + 1.96*hdslcres[,3], (hdslcres[,2] - 1.96*hdslcres[,3])[312:1])) lines(hdsres[,1], hdsres[,2], lwd=2, col="red") lines(hdslcres[,1], hdslcres[,2], lwd=2, col="blue") abline(h=1, lty=3) legend(x=1200, y=14, legend=c("Proportional hazards", "Local-in-time proportional hazards", "Time density"), col=c("red", "blue", "gray"), lwd=2, bty="n", cex=0.75) with(tden, polygon(c(x, x[length(x):1]), c(y*3/max(y)-3.5, rep(-3.5, length(x))), col="gray", border=NA, fillOddEven=TRUE)) ## End(Not run)
## Not run: head(hdslc(times = survival::pbc[1:312, 2], status = (survival::pbc[1:312, 3]==2)*1, m = survival::pbc[1:312, 5])) hdsres <- hds(times=pbc5[,1], status=pbc5[,2], m=pbc5[,3:7]) hdslcres <- hdslc(times = pbc5[,1], status=pbc5[,2], m = pbc5[,3:7], h = 730) Survt <- summary(survival::survfit(survival::Surv(pbc5[,1], pbc5[,2])~1)) Survtd <- cbind(Survt$time, c(0,diff(1-Survt$surv))) tden <- density(x=Survtd[,1], weights=Survtd[,2], bw=100, kernel="epanechnikov") par(mar=c(2.25,2.25,0,0)+0.1, mgp=c(1.25,0.5,0)) plot(c(hdslcres[,1], hdslcres[,1]), c(hdslcres[,2] - 1.96*hdslcres[,3], hdslcres[,2] + 1.96*hdslcres[,3]), type="n", xlab="days", ylab="HDS(t)", cex.lab=.75, cex.axis=.75, ylim=c(-3,15), xlim=c(0,3650)) polygon(x=c(hdsres[,1], hdsres[312:1,1]), col=rgb(1,0,0,.25), border=NA, fillOddEven=TRUE, y=c(hdsres[,2]+1.96*hdsres[,3], (hdsres[,2]-1.96*hdsres[,3])[312:1])) polygon(x=c(hdslcres[,1], hdslcres[312:1, 1]), col=rgb(0,0,1,.25), border=NA, fillOddEven=TRUE, y=c(hdslcres[,2] + 1.96*hdslcres[,3], (hdslcres[,2] - 1.96*hdslcres[,3])[312:1])) lines(hdsres[,1], hdsres[,2], lwd=2, col="red") lines(hdslcres[,1], hdslcres[,2], lwd=2, col="blue") abline(h=1, lty=3) legend(x=1200, y=14, legend=c("Proportional hazards", "Local-in-time proportional hazards", "Time density"), col=c("red", "blue", "gray"), lwd=2, bty="n", cex=0.75) with(tden, polygon(c(x, x[length(x):1]), c(y*3/max(y)-3.5, rep(-3.5, length(x))), col="gray", border=NA, fillOddEven=TRUE)) ## End(Not run)
hdslc.fast
estimates HDS at a single time using the local-in-time
proportional hazards model. See Cai and Sun (2003, Scandinavian Journal of
Statistics) for details on the local-in-time PH model.
hdslc.fast(S, betahat, m)
hdslc.fast(S, betahat, m)
S |
A vector of length |
betahat |
A p x 1 vector of coefficient estimates at time t of interest
from the local-in-time Cox model. Vector length p is the number of
covariates. Typically the output from |
m |
A numeric n x p matrix of covariate values, with a column for each covariate and each observation is on a separate row. |
The user typically will not interact with this function. Rather, hdslc
wraps this function and is what the user typically will use.
The HDS estimate at times t, where t is implied by choice of S
and betahat
passed to hdslc.fast
.
hdslcse.fast
calculates an estimate of the variance for the
local constant hazard discrimination summary estimator at a time t. The time
t is implied by S
, betahat
, and betahatse
hdslcse.fast(S, betahat, m, betahatse)
hdslcse.fast(S, betahat, m, betahatse)
S |
A vector of length |
betahat |
A p x 1 vector of coefficient estimates at time t of interest
from the local-in-time Cox model. Vector length p is the number of
covariates. Typically the output from |
m |
A numeric n x p matrix of covariate values, with a column for each covariate and each observation is on a separate row. |
betahatse |
A p x p covariance matrix for betahat at time t |
The use will typically not interact with this function directly. Instead this
function is wrapped by hdslc
.
Variance estimate that has not been normalized. To get a usable standard error estimate, divide the output of this function by the bandwidth and sample size, and then take the square root.
A cleaned up version of the Mayo PBC data from survival::pbc
. Only the
first 312 observations are used (the cases who participated in the
randomized trial). Only five of the covariates (listed below) are used.
Further, two of the covariates were log transformed.
pbc5
pbc5
A data frame with 312 rows and 7 variables:
follow up time in days
1=death, 0=censored
age in years
0=no edema, 0.5=untreated or successfully treated, 1=edema despite diuretic therapy
log serum bilirubin level (original value from
survival::pbc
is unlogged)
serum albumin
log standardized blood clotting time (original value from
survival::pbc
is unlogged)
Cleaned up version of survival::pbc