Comparison of penalized logistic regression models for rare event case

Olmus H., Nazman E., Erbas S.



In some studies the event of interest occurs rarely, even when the sample size is large enough for a binary logistic regression (LR) model. When the sample size is not large enough, a rare event can bias the parameter estimates. The parameters of an LR model are usually estimated by maximum likelihood estimation (MLE) using the Newton-Raphson (NR) algorithm. These estimates are known to be biased in small samples, although they are asymptotically unbiased. In addition, NR-based MLE is sensitive to the initial parameter values. The aim of this study is to present an approach for assessing parameter estimation bias that uses inverse conditional distributions, based on a distributional assumption that yields the true parameter values, and to apply this approach to compare different penalized LR methods. To this end, LR, Firth LR, FLIC and FLAC were compared in terms of parameter estimation bias, predicted probability bias and root mean squared error (RMSE) across different sample sizes, event rates and correlation levels in a detailed Monte Carlo simulation study. The findings suggest that FLIC should be preferred when events are rare and samples are small.
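To make the compared estimators more concrete, the following is a minimal sketch of Firth-penalized LR and the FLIC intercept correction in Python with NumPy. It is not the authors' implementation. It assumes the standard formulations: Firth's modified score, sum_i (y_i - p_i + h_i (1/2 - p_i)) x_i, solved by Fisher-scoring steps, and FLIC as a maximum likelihood re-estimate of the intercept with the Firth slopes held fixed. The function names, the step cap `max_step`, and the simulated data (intercept -2.5, slope 1.0) are hypothetical choices for illustration; FLAC and the paper's simulation design are not reproduced.

```python
import numpy as np


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def firth_logistic(X, y, max_iter=200, tol=1e-8, max_step=5.0):
    """Firth-penalized logistic regression (sketch).

    X must contain the intercept as its first column.
    Iterates beta <- beta + I(beta)^{-1} U*(beta), where U* is
    Firth's modified score and I is the Fisher information.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        p = sigmoid(X @ beta)
        w = p * (1.0 - p)
        info = X.T @ (X * w[:, None])            # Fisher information X'WX
        info_inv = np.linalg.inv(info)
        # Hat-matrix diagonals: h_i = w_i * x_i' (X'WX)^{-1} x_i
        h = w * np.einsum("ij,jk,ik->i", X, info_inv, X)
        score = X.T @ (y - p + h * (0.5 - p))    # Firth's modified score
        step = info_inv @ score
        # Cap the largest component of the step (illustrative safeguard).
        largest = np.max(np.abs(step))
        if largest > max_step:
            step = step * (max_step / largest)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def flic(X, y, beta_firth, max_iter=100, tol=1e-10):
    """FLIC (sketch): keep the Firth slopes, re-fit only the intercept by ML."""
    offset = X[:, 1:] @ beta_firth[1:]           # fixed linear predictor part
    b0 = beta_firth[0]
    for _ in range(max_iter):
        p = sigmoid(b0 + offset)
        step = np.sum(y - p) / np.sum(p * (1.0 - p))  # 1-D Newton-Raphson step
        b0 += step
        if abs(step) < tol:
            break
    beta = beta_firth.copy()
    beta[0] = b0
    return beta


def simulate_rare_events(n=200, seed=0):
    """Hypothetical rare-event data: one covariate, low baseline event rate."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    X = np.column_stack([np.ones(n), x])
    y = rng.binomial(1, sigmoid(-2.5 + 1.0 * x))
    return X, y


if __name__ == "__main__":
    X, y = simulate_rare_events()
    b_firth = firth_logistic(X, y)
    b_flic = flic(X, y, b_firth)
    print("observed event rate:   ", y.mean())
    print("mean predicted (Firth):", sigmoid(X @ b_firth).mean())
    print("mean predicted (FLIC): ", sigmoid(X @ b_flic).mean())
```

In this sketch FLIC changes only the intercept, so its average predicted probability matches the observed event rate by construction, while the slope estimates remain those of Firth LR. This illustrates the predicted probability bias criterion named in the abstract.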