- Very similar to linear or logistic regression models except that the dependent variable is a measure of the timing or rate of event occurrence
- Most methods of survival analysis require that the event time be measured with respect to some origin time
- Ideally the origin time is the same as the time at which observation begins, and most software programs assume that this is the case
- If it is not, you might need to take into account late entry (left truncation)
- Censoring is endemic to survival data
- Any report of survival analysis should discuss the type, cause and treatment of censoring
- The most common type of censoring is right censoring, which occurs when an observation is terminated before an individual experiences an event
- Censoring could be informative if it occurs at varying times because individuals drop out of the study
- A slightly less common type is interval censoring, where the exact event time is not known, only that it falls between two points in time
- If you know the exact time at which an event occurs, use methods that treat time as continuous
- If not, use discrete-time methods (for example, when you only know the month or the year of the event)
- For discrete-time methods you must choose between a logit model and a complementary log-log model, but in practice the choice is usually not consequential
- Logit is more appropriate for truly discrete events, that is, events that can only occur at discrete points in time
- The most popular method for regression analysis of survival data is Cox regression
- Cox regression is semiparametric
- However, parametric methods are much better at handling left censoring or interval censoring and can generate predicted times to events
- One major difference between survival regression and conventional linear regression is the possibility of time-dependent covariates
- If the data contain information on more than one event for each individual then special methods are needed to take advantage of the additional information
- Repeated events provide more statistical power
- However, there is likely to be statistical dependence among the repeated observations on the same individual
- There are four methods to correct for repeated events: 1) robust standard errors (Huber-White or sandwich estimates), 2) generalised estimating equations (GEE), 3) random-effects (mixed) models, 4) fixed-effects methods
- Stata will estimate random-effects models for Cox regression but SAS won't
- If event times are discrete, maximum likelihood estimation requires that the models be estimated simultaneously using the generalized logit model (there is no equivalent for the complementary log-log model)
- Conventional wisdom has it that there should be at least 5 (some say 10) events for each parameter in the model in order for maximum likelihood estimates to have reasonably good properties
- Missing data can be handled by multiple imputation: impute values as random draws from the predictive distribution of the missing values, generate several datasets (5 or more), each with slightly different imputed values, then combine the results into a single set of parameter estimates
- For survival analysis, imputation should only be done on the predictor variables. Cases with missing values on the dependent variable should just be deleted
- Compare non-nested models with the AIC or the BIC (also known as the SBC)
- Preference is given to models with the lowest values of those statistics, although no p-values can be calculated
- Magnitudes of beta coefficients (log hazard ratios) are difficult to interpret, so hazard ratios, exp(beta), are usually reported instead
- Hazard ratios (always positive) can be confusing because a value of 1 means no effect
- A more straightforward quantity is 100(HR - 1), the percentage change in the hazard for a one-unit increase in the predictor
- The sampling distributions of hazard ratios are asymmetric, so standard errors are not useful. Report 95% confidence intervals instead
- Other statistics worth reporting include the chi-square test for the null hypothesis that all coefficients are zero
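
The notes on reporting hazard ratios can be illustrated with a small Python sketch. It converts a coefficient and its standard error into a hazard ratio, the percentage change 100(HR - 1), and a 95% confidence interval built on the log scale and then exponentiated. The function name `hazard_ratio_summary` and the values `beta=0.25` and `se=0.10` are hypothetical, chosen only for illustration, and the normal approximation on the log scale is an assumption of this sketch rather than something the original note specifies.

```python
import math


def hazard_ratio_summary(beta, se, z=1.96):
    """Summarise one coefficient from a survival regression.

    beta: estimated coefficient (log hazard ratio)
    se:   standard error of beta
    z:    normal quantile for the interval (1.96 gives roughly 95%)
    """
    hr = math.exp(beta)
    # Build the interval on the log scale, then exponentiate the bounds.
    lower = math.exp(beta - z * se)
    upper = math.exp(beta + z * se)
    return {
        "hazard_ratio": hr,
        "percent_change": 100 * (hr - 1),
        "ci_lower": lower,
        "ci_upper": upper,
    }


# Hypothetical coefficient and standard error, for illustration only.
summary = hazard_ratio_summary(beta=0.25, se=0.10)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Because the bounds are exponentiated, the interval is not symmetric around the hazard ratio, which is why a confidence interval is more informative here than a standard error.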
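
The comparison of non-nested models by information criteria can be sketched in Python as follows, using the standard definitions AIC = -2 log L + 2k and BIC = -2 log L + k ln(n). The log-likelihoods, parameter counts, and sample size below are made-up numbers, and the model names are hypothetical. What to use as n with censored data (number of cases or number of events) is a choice the original note does not address, so it is left as a plain argument here.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: -2 log L + 2k."""
    return -2 * log_likelihood + 2 * n_params


def bic(log_likelihood, n_params, n_obs):
    """Bayesian (Schwarz) information criterion: -2 log L + k ln(n)."""
    return -2 * log_likelihood + n_params * math.log(n_obs)


# Hypothetical fitted models: (log-likelihood, number of parameters).
models = {
    "model_a": (-512.3, 3),
    "model_b": (-510.9, 5),
}
n_obs = 200  # assumed sample size for the BIC penalty

scores = {
    name: {"aic": aic(ll, k), "bic": bic(ll, k, n_obs)}
    for name, (ll, k) in models.items()
}

# Lower values are preferred; no p-value is attached to the difference.
preferred_by_aic = min(scores, key=lambda name: scores[name]["aic"])
preferred_by_bic = min(scores, key=lambda name: scores[name]["bic"])
print(scores)
print("Preferred by AIC:", preferred_by_aic)
print("Preferred by BIC:", preferred_by_bic)
```

In this made-up case the richer model has a slightly higher log-likelihood, but the penalty for its extra parameters outweighs the gain under both criteria.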
Survival analysis
Original by P.D. Allison, 2012, 12 pages
This summary note was posted on 15 September 2017, by Reinie in Credit risk Finance