Validating machine learning models for the prediction of labour induction intervention using routine data: a registry-based retrospective cohort study at a tertiary hospital in northern Tanzania

Clifford Silver Tarimo1,2, Soumitra S Bhuyan3, Quanman Li1, Michael Johnson J Mahande4, Jian Wu1, Xiaoli Fu1

Objectives We aimed to identify the important variables for labour induction intervention and to assess the predictive performance of machine learning algorithms.

Setting We analysed the birth registry data from a referral hospital in northern Tanzania. Since July 2000, every birth at this facility has been recorded in a specific database.

Participants 21 578 deliveries between 2000 and 2015 were included. Deliveries that lacked information regarding the labour induction status were excluded.

Primary outcome Deliveries involving labour induction intervention.

Results Parity, maternal age, body mass index, gestational age and birth weight were all found to be important predictors of labour induction. The boosting method demonstrated the best discriminative performance (area under the curve, AUC=0.75; 95% CI 0.73 to 0.76), while logistic regression showed the lowest (AUC=0.71; 95% CI 0.70 to 0.73). Random forest and boosting algorithms showed the highest net benefits in the decision curve analysis.
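For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how discrimination (AUC) and net benefit at a single threshold probability are commonly computed. It is not the authors' analysis code: the function names `auc` and `net_benefit`, and the toy labels and predicted probabilities, are hypothetical and chosen only for illustration. Net benefit here follows the standard decision curve definition, TP/n - FP/n × p_t/(1 - p_t), where p_t is the threshold probability.

```python
def auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(y_true, scores, threshold):
    """Net benefit at threshold probability p_t:
    TP/n - FP/n * p_t / (1 - p_t)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    # A case is classified positive when its predicted probability
    # reaches the threshold.
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


# Hypothetical toy data (1 = induced, 0 = not induced); not from the study.
y = [1, 0, 1, 0, 0, 1, 0, 0]
p = [0.8, 0.3, 0.6, 0.4, 0.2, 0.35, 0.7, 0.1]

model_auc = auc(y, p)
model_nb = net_benefit(y, p, 0.5)
# "Treat all" reference strategy: everyone is classified as positive.
treat_all_nb = net_benefit(y, [1.0] * len(y), 0.5)
```

A decision curve repeats the net benefit calculation over a range of threshold probabilities and compares each model against the "treat all" and "treat none" (net benefit of zero) strategies; a model is useful at a given threshold only if its net benefit exceeds both.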

Conclusion All of the machine learning algorithms performed well in predicting the likelihood of labour induction intervention. Further optimisation of these classifiers through hyperparameter tuning may improve their performance. Extensive research into the performance of other classification algorithms is warranted.