A computational approach to compare regression modelling strategies in prediction research
Publication year
2016
Source
BMC Medical Research Methodology, 16, 1, (2016), article 107
Publication type
Article / Letter to editor

Organization
Health Evidence
IQ Healthcare
Journal title
BMC Medical Research Methodology
Volume
vol. 16
Issue
iss. 1
Subject
Radboudumc 18: Healthcare improvement science RIHS: Radboud Institute for Health Sciences
Abstract
BACKGROUND: It is often unclear which approach to fit, assess and adjust a model will yield the most accurate prediction model. We present an extension of an approach for comparing modelling strategies in linear regression to the setting of logistic regression and demonstrate its application in clinical prediction research.
METHODS: A framework for comparing logistic regression modelling strategies by their likelihoods was formulated using a wrapper approach. Five different strategies for modelling, including simple shrinkage methods, were compared in four empirical data sets to illustrate the concept of a priori strategy comparison. Simulations were performed in both randomly generated data and empirical data to investigate the influence of data characteristics on strategy performance. We applied the comparison framework in a case study setting. Optimal strategies were selected based on the results of a priori comparisons in a clinical data set and the performance of models built according to each strategy was assessed using the Brier score and calibration plots.
RESULTS: The performance of modelling strategies was highly dependent on the characteristics of the development data in both linear and logistic regression settings. A priori comparisons in four empirical data sets found that no strategy consistently outperformed the others. The percentage of times that a model adjustment strategy outperformed a logistic model ranged from 3.9 to 94.9 %, depending on the strategy and data set. However, in our case study setting the a priori selection of optimal methods did not result in detectable improvement in model performance when assessed in an external data set.
CONCLUSION: The performance of prediction modelling strategies is a data-dependent process and can be highly variable between data sets within the same clinical domain. A priori strategy comparison can be used to determine an optimal logistic regression modelling strategy for a given data set before selecting a final modelling approach.
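The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`brier_score`, `bernoulli_log_likelihood`, `percent_wins`) are hypothetical, and the abstract does not say how replicates were generated or how the likelihoods were compared, so the simple "higher log-likelihood wins" rule is an assumption made here for illustration. The Brier score is the standard mean squared difference between predicted probabilities and observed 0/1 outcomes.

```python
import math


def brier_score(probs, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if not probs or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and of equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def bernoulli_log_likelihood(probs, outcomes, eps=1e-12):
    """Log-likelihood of 0/1 outcomes under the predicted probabilities.

    Probabilities are clipped to [eps, 1 - eps] to avoid log(0).
    """
    if len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be of equal length")
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total


def percent_wins(loglik_strategy, loglik_reference):
    """Percentage of replicates in which the strategy beats the reference.

    Assumption (not from the source): a "win" means a strictly higher
    log-likelihood in the same replicate.
    """
    if not loglik_strategy or len(loglik_strategy) != len(loglik_reference):
        raise ValueError("inputs must be non-empty and of equal length")
    wins = sum(a > b for a, b in zip(loglik_strategy, loglik_reference))
    return 100.0 * wins / len(loglik_strategy)
```

In use, each replicate would supply one log-likelihood per strategy, computed on data not used for fitting, and `percent_wins` would summarise how often an adjusted model beats the unadjusted logistic model. A lower Brier score indicates more accurate probability predictions.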