LSE Research Online LSE Library Services

Predicting depression in patients with knee osteoarthritis using machine learning: model development and validation study

Nowinka, Zuzanna, Abdulhadi Alagha, M., Mahmoud, Khadija and Jones, Gareth G. (2022) Predicting depression in patients with knee osteoarthritis using machine learning: model development and validation study. JMIR Formative Research, 6 (9). ISSN 2561-326X

Text (Predicting Depression in Patients With Knee Osteoarthritis Using Machine Learning) - Published Version
Available under License Creative Commons Attribution.
Identification Number: 10.2196/36130


Background: Knee osteoarthritis (OA) is the most common form of OA and a leading cause of disability worldwide. Chronic pain and functional loss secondary to knee OA put patients at risk of developing depression, which can also impair their treatment response. However, no tools exist to assist clinicians in identifying patients at risk. Machine learning (ML) predictive models may offer a solution. We investigated whether ML models could predict the development of depression in patients with knee OA and examined which features are the most predictive.
Objective: The primary aim of this study was to develop and test ML models to predict depression in patients with knee OA at 2 years and to validate the models using an external data set. The secondary aim was to identify the most important predictive features used by the ML algorithms.
Methods: Osteoarthritis Initiative Study (OAI) data were used for model development, and external validation was performed using Multicenter Osteoarthritis Study (MOST) data. Forty-two features were selected from routinely collected demographic and clinical data, covering patient demographics, past medical history, knee OA history, baseline examination findings, and patient-reported outcome measures. Six different ML classification models were trained (logistic regression, least absolute shrinkage and selection operator [LASSO], ridge regression, decision tree, random forest, and gradient boosting machine). The primary outcome was depression at 2 years following study enrollment. The presence of depression was defined using the Center for Epidemiological Studies Depression Scale. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC) and F1 score. The most important features were extracted from the best-performing model on external validation.
Results: A total of 5947 patients were included in this study, with 2969 in the training set, 742 in the test set, and 2236 in the external validation set. On the test set, the AUC ranged from 0.673 (95% CI 0.604-0.742) to 0.869 (95% CI 0.824-0.913), with F1 scores of 0.435 to 0.490. On external validation, the AUC ranged from 0.720 (95% CI 0.685-0.755) to 0.876 (95% CI 0.853-0.899), with F1 scores of 0.456 to 0.563. LASSO modeling offered the highest predictive performance. Blood pressure, baseline depression score, knee pain and stiffness, and quality of life were the most predictive features.
Conclusions: To our knowledge, this is the first study to apply ML classification models to predict depression in patients with knee OA. Our study showed that ML models can deliver a clinically acceptable level of performance (AUC>0.7) in predicting the development of depression using routinely available demographic and clinical data. Further work is required to address the class imbalance in the training data and to evaluate the clinical utility of the models in facilitating early intervention and improved outcomes.
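
The two evaluation metrics named in the Methods, AUC and F1 score, can be illustrated with a minimal sketch. This is not the authors' code: the labels, predicted scores, and the 0.5 decision threshold below are made-up toy values, and the helper names `auc_score` and `f1_score` are hypothetical. The toy labels are deliberately imbalanced (3 positives out of 10) to show how a model can rank cases well (high AUC) while its F1 score stays noticeably lower.

```python
def auc_score(y_true, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as half a win)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data (assumption, not study data): 1 = depressed at follow-up, 0 = not.
y_true = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.45, 0.60, 0.70, 0.90]

threshold = 0.5  # hypothetical cutoff; the abstract does not state one
y_pred = [1 if s >= threshold else 0 for s in scores]

auc = auc_score(y_true, scores)
f1 = f1_score(y_true, y_pred)
print(f"AUC = {auc:.3f}, F1 = {f1:.3f}")  # AUC = 0.952, F1 = 0.667
```

AUC depends only on how the scores rank positives against negatives, whereas F1 depends on the chosen threshold and on the counts of false positives and false negatives, which is one reason the two metrics can diverge on imbalanced data.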

Item Type: Article
Additional Information: © 2022 The Authors
Divisions: LSE
Subjects: R Medicine > RA Public aspects of medicine > RA0421 Public health. Hygiene. Preventive Medicine; H Social Sciences
Date Deposited: 27 Oct 2022 14:24
Last Modified: 08 Apr 2024 04:27
