12 Common statistical models
This chapter reviews widely used statistical modeling approaches: survival analysis for time-to-event data, logistic and Poisson regression for binary and count outcomes, quantile regression for distributional effects, and principal components analysis for dimensionality reduction. It also discusses practical modeling considerations that support valid inference and robust model performance, such as covariate adjustment strategies, variable selection methods, and the handling of heteroscedastic or fan-shaped relationships in regression.
12.1 Survival analysis
12.1.0.1 Survival analysis
This section introduces the core survival analysis framework—survival, hazard, and cumulative hazard functions under random censoring—and shows how to estimate and compare survival curves nonparametrically using the Kaplan–Meier estimator and the log-rank test. It then presents parametric models (exponential/Weibull) and the Cox proportional hazards model, covering likelihood/partial-likelihood estimation, interpretation of coefficients (time ratios or hazard ratios), and key diagnostics such as the proportional hazards assumption and influential observations.
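The workflow summarized above can be sketched in a few lines of R. This is a minimal illustration, not the section's own analysis: it assumes the survival package is installed and uses its bundled lung dataset (where status == 2 marks a death) as a stand-in for whatever data the section works with. The choice of sex and age as covariates is likewise an assumption made only for the example.

```r
# Assumption: the 'survival' package is installed; 'lung' ships with it.
library(survival)

# Kaplan-Meier survival curves by sex (status == 2 marks death in 'lung')
km <- survfit(Surv(time, status == 2) ~ sex, data = lung)

# Log-rank test comparing the two survival curves
lr <- survdiff(Surv(time, status == 2) ~ sex, data = lung)

# Weibull model in accelerated failure time form;
# exponentiated coefficients are time ratios
wb <- survreg(Surv(time, status == 2) ~ sex + age, data = lung,
              dist = "weibull")
time_ratios <- exp(coef(wb))[-1]  # drop the intercept

# Cox proportional hazards model, fitted by partial likelihood;
# exponentiated coefficients are hazard ratios
cx <- coxph(Surv(time, status == 2) ~ sex + age, data = lung)
hazard_ratios <- exp(coef(cx))

# Proportional hazards check based on scaled Schoenfeld residuals
ph <- cox.zph(cx)

# dfbeta residuals: approximate change in each coefficient when
# one observation is dropped, used to flag influential observations
infl <- residuals(cx, type = "dfbeta")
```

Printing km, lr and ph shows the estimated curves and test statistics, and plot(km) draws the Kaplan-Meier curves. Large absolute values in a column of infl point to observations worth inspecting.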
12.1.0.4 Joint model with longitudinal and survival data
This section demonstrates how to fit and evaluate joint models with the JM package. The models link longitudinal biomarker trajectories (log serum bilirubin, modeled with a linear mixed-effects model) to time-to-event outcomes (Cox or competing-risks submodels). It covers proportional hazards diagnostics, dynamic survival prediction (survfitJM, assessed with AUC/ROC), and interpretation of the biomarker-risk association parameters.
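A joint model of this kind might be set up as follows. This is a sketch under stated assumptions rather than the section's actual code: it assumes the JM package is installed and uses its bundled pbc2 (long format) and pbc2.id (one row per patient) datasets. The random-intercept structure, the drug covariate, and the piecewise-constant baseline hazard are choices made only to keep the example small.

```r
# Assumption: the 'JM' package is installed (it loads nlme and survival).
library(JM)

# Longitudinal submodel: log serum bilirubin over time (linear mixed model)
lme_fit <- lme(log(serBilir) ~ year * drug, random = ~ 1 | id, data = pbc2)

# Survival submodel: Cox model; x = TRUE keeps the design matrix,
# which jointModel() requires
cox_fit <- coxph(Surv(years, status2) ~ drug, data = pbc2.id, x = TRUE)

# Joint model linking the current biomarker value to the hazard.
# timeVar names the time variable of the longitudinal submodel.
joint_fit <- jointModel(lme_fit, cox_fit, timeVar = "year",
                        method = "piecewise-PH-aGH")

# The summary reports both submodels together with the association
# parameter between the biomarker and the risk of the event
summary(joint_fit)

# Dynamic prediction: conditional survival probabilities for one
# patient, given that patient's observed biomarker history
nd <- pbc2[pbc2$id == 2, ]
pred <- survfitJM(joint_fit, newdata = nd, idVar = "id")
```

Under this parameterization, the exponentiated association coefficient can be read as the hazard ratio for a one-unit increase in the current value of log serum bilirubin.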
12.9 Save and finalize your trained model
This section demonstrates how to train, save, reload, and deploy both a linear regression model and a random forest model in R using saveRDS() and readRDS(), enabling model persistence and reproducible prediction workflows.
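The save-and-reload cycle might look like the sketch below. It assumes the randomForest package is installed. The mtcars data, the mpg ~ wt + hp formula, and the temporary file paths are illustrative choices, not taken from the section itself.

```r
# Assumption: the 'randomForest' package is installed.
library(randomForest)

set.seed(42)  # make the train/test split and the forest reproducible
train_idx <- sample(nrow(mtcars), 24)
train <- mtcars[train_idx, ]
test  <- mtcars[-train_idx, ]

# Train both models on the same predictors
lm_fit <- lm(mpg ~ wt + hp, data = train)
rf_fit <- randomForest(mpg ~ wt + hp, data = train, ntree = 200)

# Save each fitted model to its own .rds file (temporary paths here;
# a real workflow would use a stable location)
lm_path <- tempfile(fileext = ".rds")
rf_path <- tempfile(fileext = ".rds")
saveRDS(lm_fit, lm_path)
saveRDS(rf_fit, rf_path)

# Later, possibly in a fresh session: reload and predict on new data
lm_loaded <- readRDS(lm_path)
rf_loaded <- readRDS(rf_path)
lm_pred <- predict(lm_loaded, newdata = test)
rf_pred <- predict(rf_loaded, newdata = test)

# The reloaded models reproduce the original predictions
stopifnot(isTRUE(all.equal(lm_pred, predict(lm_fit, newdata = test))))
stopifnot(isTRUE(all.equal(rf_pred, predict(rf_fit, newdata = test))))
```

saveRDS() stores a single object, and readRDS() returns it so that it can be assigned to any name. When reloading the random forest in a new session, randomForest needs to be loaded again so that predict() dispatches to the correct method.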