Volume 28, Issue 2 (9-2025) | jha 2025, 28(2): 53-69



Esmaeili M, Lotfnezhad Afshar H, Rahimi B, Khademvatani K, Samadzad Qushchi S, Hoseinpour V. Predicting the length of hospital stay in patients with congestive heart failure using data mining techniques. jha 2025; 28 (2) :53-69
URL: http://jha.iums.ac.ir/article-1-4586-en.html
1- Department of Medical Informatics, School of Allied Medical Sciences, Urmia University of Medical Sciences, Urmia, Iran.
2- Department of Health Information Technology, School of Allied Medical Sciences, Urmia University of Medical Sciences, Urmia, Iran. & Health and Biomedical Informatics Research Center, Urmia University of Medical Sciences, Urmia, Iran. , hadi.afshar@gmail.com
3- Department of Medical Informatics, School of Allied Medical Sciences, Urmia University of Medical Sciences, Urmia, Iran. & Health and Biomedical Informatics Research Center, Urmia University of Medical Sciences, Urmia, Iran.
4- Department of Cardiology, School of Medicine, Urmia University of Medical Sciences, Urmia, Iran.
5- Health and Biomedical Informatics Research Center, Urmia University of Medical Sciences, Urmia, Iran.
6- Department of Emergency Medicine, School of Medicine, Urmia University of Medical Sciences, Urmia, Iran.
 
Introduction
Congestive heart failure (CHF) is one of the most common and severe chronic diseases worldwide, significantly contributing to increased mortality rates and reduced quality of life [1–3]. Owing to frequent and prolonged hospitalizations, CHF imposes substantial pressure on hospital systems, including inpatient beds, medical staff, and healthcare equipment. Global estimates indicate that CHF affects approximately 64.3 million people, with a prevalence of 1%-2% among adults in developed countries and over 25 million cases worldwide [4–6]. In the United States alone, healthcare costs associated with CHF are projected to rise from $39 billion to over $153 billion by 2030 [7, 8]. The growing prevalence of CHF is particularly evident in developing countries such as Iran, where an aging population is accelerating disease incidence [9, 10]. Accurate prediction of the length of stay (LOS) for CHF patients enables healthcare providers to better estimate bed occupancy rates and optimize hospital operations. For CHF patients, LOS prediction also facilitates better discharge planning, which is critical for improving patient outcomes and minimizing the risk of readmission [11].
Data mining techniques provide innovative methods to analyze large-scale healthcare data and can be effectively used to develop LOS prediction models [12–17]. Machine learning and data mining have been applied effectively for health-related predictions, particularly with models such as support vector machines (SVM) and random forests (RF). For instance, Hachesu et al. [18] applied machine learning algorithms to predict LOS in cardiac patients, achieving 96.4% accuracy. Similarly, Turgeman et al. [17] used regression trees (Cubist) and SVM for LOS prediction, achieving 84% accuracy. However, these studies often lack external validation and focus primarily on general cardiac patients rather than CHF populations. Moreover, previous research has rarely employed association rule mining techniques, such as the Apriori algorithm, to identify specific factors influencing prolonged LOS. The Apriori algorithm enables the extraction of actionable clinical insights by identifying associations between patient features and LOS patterns [19, 20].
In the context of LOS prediction for CHF patients, a significant research gap remains in applying advanced data mining techniques to improve predictive accuracy and identify key clinical predictors. Studies by Luo et al. [21] and Daghistani et al. [22] have demonstrated the potential of algorithms such as RF for LOS prediction. However, only a few studies have specifically targeted CHF patients, and fewer still have applied a combined approach of predictive modeling and association rule mining to provide comprehensive clinical insights. To address these gaps, the present study proposes a data mining framework that combines various machine learning algorithms for accurate LOS prediction, along with the Apriori algorithm to uncover hidden associations in CHF patient data.


Methods
Data collection: This study was conducted using a retrospective cross-sectional design. Data were collected from 3,421 patients diagnosed with CHF discharged between 2018 and 2020 from Seyed Al-Shohada and Ayatollah Taleghani Hospitals in Urmia, Iran. A total of 1,690 records from Seyed Al-Shohada Hospital were used as the primary dataset (Dataset 1) for model development, while 1,719 records from Ayatollah Taleghani Hospital were used as an external validation dataset (Dataset 2).
The dataset included 27 variables covering demographic information (e.g., age, gender) and clinical characteristics such as hypertension history, length of stay (LOS), family history, diabetes, dyslipidemia, history of valve replacement, coronary artery bypass grafting, angioplasty, mitral balloon valvuloplasty, chronic pulmonary disease, asthma, stroke, atrial fibrillation, myocardial infarction, pericardial effusion, comorbidities, smoking, drug addiction, alcohol use, underlying etiology, elevated creatinine, low hemoglobin, number of CHF-related hospitalizations, and number of cardiovascular-related hospitalizations. Only patients with an ICD-10 code of I50.0 (CHF) were included.
Data cleaning: Variables such as body mass index (BMI) were excluded due to more than 70% missing data (1,183 cases), which could significantly bias model performance. For variables with less than 1% missing data, such as elevated creatinine (4 missing values) and low hemoglobin (17 missing values), mode imputation was applied. These strategies were implemented to preserve model accuracy and minimize errors from missing data.
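As a rough illustration of the mode imputation described above, the following Python sketch fills the gaps in a single categorical column with its most frequent observed value. The column name and values are hypothetical, and the study's own implementation is not shown in the source.

```python
from collections import Counter

def impute_mode(values, missing=None):
    """Replace missing entries with the most frequent observed value."""
    observed = [v for v in values if v is not missing]
    if not observed:
        # Nothing to learn the mode from; return the column unchanged.
        return list(values)
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is missing else v for v in values]

# Hypothetical binary flag column (1 = elevated creatinine) with two gaps.
creatinine_flag = [1, 0, 1, None, 1, 0, None, 1]
imputed = impute_mode(creatinine_flag)
```

Mode imputation is a simple choice that suits low missingness; with larger gaps it would distort the class proportions of the variable.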
Feature selection: In consultation with expert cardiologists and by referencing medical guidelines (e.g., ESC 2021 for heart failure), 27 out of 35 available variables were selected for analysis. This selection was validated through literature review [21, 23–27], manual review of patients’ electronic health records, and expert inputs. A panel of four cardiologists (average age: 52 years; average experience: 19 years; three males, one female) participated in this process.
Clustering: To transform LOS into a classification-ready format, K-means clustering was applied, with the silhouette coefficient (0.65) and the elbow method used to determine the optimal number of clusters. Clustering served as an unsupervised pre-processing step to uncover natural patterns in the data. Results indicated that K = 2 was optimal, consistent with prior studies that suggested a 7-day threshold to distinguish short and long hospital stays [28]. Accordingly, LOS was categorized into short stay (≤ 7 days) and long stay (> 7 days).
K-means clustering was performed after initial pre-processing (removing invalid records and imputing missing values) but before data balancing, to preserve the natural distribution of the data.
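To make the clustering step concrete, here is a minimal Python sketch of K-means with K = 2 on one-dimensional LOS values. The LOS values are toy numbers chosen for illustration, the deterministic initialisation is an assumption, and the silhouette and elbow checks used in the study are not reproduced here.

```python
def kmeans_1d(values, k=2, iters=100):
    """Lloyd's algorithm on scalar data; returns (centroids, labels)."""
    ordered = sorted(values)
    # Deterministic initialisation: spread starting centroids across the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = []
    for _ in range(iters):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centroids.append(sum(members) / len(members) if members else centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels

# Hypothetical LOS values in days (not taken from the study data).
los_days = [2, 3, 3, 4, 5, 5, 6, 7, 9, 10, 12, 14, 15, 21]
centroids, labels = kmeans_1d(los_days, k=2)
short_stay = [v for v, lab in zip(los_days, labels) if lab == 0]
long_stay = [v for v, lab in zip(los_days, labels) if lab == 1]
```

On this toy input the two clusters separate between 7 and 9 days, but the cut point depends entirely on the data, so real records could yield a different boundary.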
Balancing the data: To balance the binary LOS classes, resampling techniques including over-sampling, under-sampling, and synthetic minority over-sampling technique (SMOTE) were applied. SMOTE achieved the best performance, with an AUC of 85% and an F1-score of 78%, and provided more diverse and generalizable synthetic samples. Resampling was performed only during model training for supervised classifiers.
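The core idea of SMOTE can be sketched in a few lines of Python: each synthetic sample is placed at a random position on the line segment between a minority-class record and one of its nearest minority-class neighbours. This is a simplified sketch, not the implementation used in the study; the feature pairs (age, creatinine) are hypothetical, and plain interpolation like this is only meaningful for continuous features.

```python
import random

def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating toward nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point (excluding itself).
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(
            others, key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p))
        )[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> target
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, target)))
    return synthetic

# Hypothetical long-stay (minority) records: (age in years, creatinine in mg/dL).
long_stay = [(75.0, 1.9), (81.0, 2.3), (69.0, 1.6), (78.0, 2.0)]
new_points = smote(long_stay, n_new=6, k=3)
```

Because the synthetic points are generated from the training records only, the test set keeps its original class distribution, in line with the statement that resampling was applied only during training.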
Apriori rule mining: The Apriori algorithm was employed to identify significant associations among variables. It was applied directly to the binary-labeled dataset (short vs long LOS) derived from K-means clustering, avoiding potential biases from machine learning classification outputs.
Model training: The initial dataset consisted of 1,690 records. After preprocessing and removing incomplete cases, 1,248 records remained. The dataset was then split into 80% training (1,000 records) and 20% testing (248 records). Modeling was conducted using SPSS Clementine 12 and R. Machine learning algorithms including decision tree (DT), neural network (NN), and adaptive neuro-fuzzy inference system (ANFIS) were evaluated. Random forest (RF) outperformed other models and was fine-tuned using grid search with 10-fold cross-validation. The final optimized hyperparameters for RF included number of trees: 100, features per split: 5, maximum tree depth: 30, splitting criterion: Gini index, minimum samples per split: 2.
While all algorithms were fine-tuned, RF showed superior improvements and resistance to overfitting compared to other models.
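The tuning procedure can be pictured as follows: every hyperparameter combination in a grid is scored on each of 10 folds of the 1,000 training records. The Python sketch below only builds the folds and the candidate list; the grid values are hypothetical apart from the reported final settings (100 trees, 5 features per split, depth 30), and the folds are unshuffled for simplicity.

```python
from itertools import product

def kfold_indices(n_samples, n_folds=10):
    """Yield (train_idx, val_idx) pairs that partition range(n_samples) into folds."""
    fold_sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
                  for i in range(n_folds)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size

# Hypothetical candidate grid; only the values reported as final in the text
# (100 trees, 5 features per split, depth 30) come from the study.
grid = {
    "n_trees": [50, 100, 200],
    "features_per_split": [3, 5, 7],
    "max_depth": [10, 30],
}
candidates = [dict(zip(grid, combo)) for combo in product(*grid.values())]
folds = list(kfold_indices(1000, n_folds=10))
n_fits = len(candidates) * len(folds)  # one model fit per candidate per fold
```

With this illustrative grid, each candidate is trained on 900 records and validated on the remaining 100, ten times over.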
External validation: To evaluate the model generalizability, the trained model was validated on an independent dataset of 1,719 records from Ayatollah Taleghani Hospital. Pre-processing was applied in the same way as for Dataset 1. The BMI was excluded as well. The validation set included 1,133 patients (65.9%) with LOS ≤ 7 days and 586 patients (34.1%) with LOS > 7 days.
Evaluation: Model performance on both the training and testing datasets was assessed using the standard metrics, such as accuracy, sensitivity (recall), specificity, precision, Cohen’s kappa, F1-score, ROC curve and AUC [29-33].
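For reference, most of the listed metrics follow directly from the four cells of a binary confusion matrix. The Python sketch below uses the standard definitions; which LOS class counts as "positive" is an assumption, since the text does not state it, and the counts are hypothetical.

```python
def classification_metrics(tp, fn, fp, tn):
    """Standard binary metrics from confusion-matrix counts (all denominators assumed non-zero)."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # recall for the positive class
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_yes = ((tp + fn) / total) * ((tp + fp) / total)
    p_no = ((fp + tn) / total) * ((fn + tn) / total)
    expected = p_yes + p_no
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "kappa": kappa,
    }

# Hypothetical counts, not taken from the study's confusion matrices.
metrics = classification_metrics(tp=80, fn=20, fp=30, tn=70)
```

Reporting kappa and specificity alongside accuracy matters on imbalanced data, where a model can reach high accuracy and sensitivity while rarely identifying the smaller class.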

Results
Dataset 1 (Seyed Al-Shohada Hospital): The dataset was divided into two classes: short and long stay. Table 1 outlines the characteristics of each class.

Table 1. Comparison of clinical and demographic features (Dataset 1)
| Feature | Short-term (n=1171) | Long-term (n=519) | p-value |
| --- | --- | --- | --- |
| Mean age (years) | 68.2 ± 12.2 | 75.3 ± 10.8 | <0.001 |
| Male gender (%) | 52.1 | 58.7 | 0.013 |
| Hypertension (%) | 65.4 | 78.2 | <0.001 |
| Diabetes (%) | 32.1 | 45.6 | <0.001 |
| Atrial fibrillation (%) | 15.3 | 28.9 | <0.001 |
| Elevated creatinine (%) | 48.2 | 72.4 | <0.001 |
| Low hemoglobin (%) | 53.1 | 68.9 | <0.001 |
| History of angioplasty (%) | 8.7 | 12.5 | 0.021 |

Of the 27 variables examined, only those showing statistically significant differences between the short and long stay (p < 0.05) are reported in Table 1.
Baseline feature analysis revealed that patients with long hospital stays were significantly older and had a higher prevalence of comorbidities (diabetes, hypertension) and laboratory abnormalities (elevated creatinine, low hemoglobin).
The algorithm implementation on dataset 1 (Seyed Al-Shohada Hospital) showed that RF outperformed decision trees, ANN, and ANFIS. As shown in Table 2, RF achieved accuracy of 87.14%, sensitivity of 97.56%, specificity of 23.24%, AUC of 55.40%, and F1-score of 71.13%.

Table 2. Algorithm performance on dataset 1 (Seyed Al-Shohada Hospital)
| Algorithm | Accuracy (%) | Sensitivity (%) | Specificity (%) | AUC (%) | Kappa (%) | F1-score (%) |
| --- | --- | --- | --- | --- | --- | --- |
| Decision tree (C5.0) | 67.56 | 94.18 | 7.08 | 51.03 | 1.55 | 65.61 |
| Artificial neural network (ANN) | 64.24 | 92.24 | 26.88 | 59.56 | 20.75 | 69.36 |
| ANFIS | 67.15 | 84.44 | 27.86 | 56.15 | 13.73 | 65.82 |
| Random forest (RF) | 87.14 | 97.56 | 23.24 | 55.40 | 22.95 | 71.13 |

After obtaining the above metrics, three balancing techniques (SMOTE, over-sampling, and under-sampling) were applied. SMOTE was applied only to the training data to balance class distribution. Evaluation was performed on the original, imbalanced test data. As a result, no significant improvement in accuracy was observed, but sensitivity and F1-score improved compared to the case without SMOTE. The slight decrease in specificity reflects the model’s shifted focus toward the long-stay class. As summarized in Table 3, Figure 1, and Figure 2, in the balanced dataset, RF demonstrated the best separation as well (AUC = 0.854). Table 4 shows the important features for LOS prediction based on the final RF model. 

Table 3. Comparison of C5.0, ANFIS, and Random Forest after SMOTE balancing (Test Set - Seyed Al-Shohada Hospital)
| Algorithm | Accuracy (%) | Sensitivity (%) | Specificity (%) | AUC (%) | Kappa (%) | F1-score (%) |
| --- | --- | --- | --- | --- | --- | --- |
| C5.0 | 85.13 | 91.09 | 70.00 | 80.55 | 64.63 | 79.16 |
| ANFIS | 73.17 | 85.28 | 57.04 | 71.16 | 43.59 | 68.36 |
| Random Forest | 81.45 | 98.39 | 64.52 | 85.40 | 62.90 | 84.14 |

Figure 1. ROC curves of the algorithms (Dataset 1)

Figure 2. Confusion matrix for dataset 1


Table 4. Important predictive features
| Rank | Variable | Predictive Role | Explanation |
| --- | --- | --- | --- |
| 1 | History of CABG | Positive | Patients with a history of coronary artery bypass grafting (CABG) were more likely to have a hospital stay longer than 7 days. |
| 2 | Diabetes | Positive | Diabetic patients were more frequently found in the long-term stay group. |
| 3 | Dyslipidemia | Positive | Dyslipidemia was associated with longer hospital stays. |
| 4 | Male Gender | Positive | The proportion of male patients was higher in the >7-day stay group. |
| 5 | Hypertension | Negative | Hypertension was more common among patients with shorter stays. |
| 6 | History of PCI | Positive | Patients with a history of percutaneous coronary intervention (PCI) had longer lengths of stay. |
| 7 | Elevated Creatinine | Negative | High creatinine levels, particularly when combined with hypertension, were associated with shorter hospital stays. |
 
The Apriori algorithm was used to extract rules identifying key factors affecting length of stay. Support and confidence thresholds were selected empirically, based on literature and expert validation. The following two rules were considered most clinically meaningful:
  • Rule 1: Male patients with hypertension, no valve replacement history, and elevated creatinine are more likely to have shorter hospital stays (Support: 0.107; Confidence: 0.923)
  • Rule 2: Patients with atrial fibrillation and elevated creatinine, but no angioplasty, no stroke, and no addiction history, are more likely to have longer stays (Support: 0.104; Confidence: 0.864)
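
The support and confidence values attached to such rules can be computed as in the following Python sketch. It uses the common definitions (support is the share of records containing both antecedent and consequent; confidence is that count divided by the number of records containing the antecedent). Some tools report the antecedent's support instead, and the source does not say which convention was used. The patient records and item names are hypothetical.

```python
def rule_stats(transactions, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    # Records that satisfy the antecedent, and those that satisfy the whole rule.
    n_ante = sum(antecedent <= t for t in transactions)
    n_both = sum((antecedent | consequent) <= t for t in transactions)
    support = n_both / len(transactions)
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence

# Hypothetical patient records encoded as item sets (not study data).
patients = [
    {"male", "hypertension", "elevated_creatinine", "short_stay"},
    {"male", "hypertension", "elevated_creatinine", "short_stay"},
    {"male", "hypertension", "elevated_creatinine", "long_stay"},
    {"female", "atrial_fibrillation", "long_stay"},
    {"male", "diabetes", "short_stay"},
]
support, confidence = rule_stats(
    patients,
    antecedent={"male", "hypertension", "elevated_creatinine"},
    consequent={"short_stay"},
)
```

Read this way, Rule 1's reported confidence of 0.923 would mean that about 92% of the patients matching its conditions had short stays, while its support of 0.107 indicates that the rule covers roughly one record in ten.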

Dataset 2 (Taleghani Hospital): This dataset included 1,719 patients, divided into two groups: short stay (n = 1,133; 65.9%) and long stay (n = 586; 34.1%). This dataset was used only for evaluation of the previously developed models. Table 5, Figure 3, and Figure 4 confirm the generalizability and real-world applicability of the RF model. The consistency in accuracy (77.40%) and AUC (84.82%) supports its use in clinical environments.

 

Table 5. Algorithm performance for dataset 2 (Taleghani Hospital)
| Algorithm | Accuracy (%) | Sensitivity (%) | Specificity (%) | AUC (%) | Kappa (%) | F1-score (%) |
| --- | --- | --- | --- | --- | --- | --- |
| Random Forest (RF) | 77.40 | 94.32 | 68.61 | 84.82 | 65.40 | 73.90 |
| Decision Tree (C5.0) | 74.05 | 93.24 | 70.29 | 81.47 | 65.40 | 72.91 |
| ANFIS | 76.03 | 83.68 | 70.30 | 76.99 | 52.50 | 71.90 |

Figure 3. ROC curve for dataset 2 (Taleghani Hospital)

Figure 4. Confusion matrix for dataset 2 (Taleghani Hospital)
 
Discussion
The proposed model successfully classified the length of stay (LOS) for patients with congestive heart failure (CHF) with high accuracy by employing the random forest algorithm. The model performance was acceptable on both the internal and external validation datasets, and results indicated that machine learning-based approaches are effective tools for predicting LOS in CHF patients.
These findings are consistent with the study by Daghistani et al. [22], which employed data-driven algorithms to analyze the medical records of cardiac patients. However, that study did not report how missing data were handled, whereas in the present study, careful preprocessing and systematic handling of incomplete data were crucial for enhancing the model performance. The use of the K-means algorithm for clustering LOS facilitated a more precise grouping of patients and contributed to improved classification. This method, combined with advanced algorithms such as random forest, outperformed simpler models like decision trees or artificial neural networks (ANN), a point that should be further explored in comparison with similar studies.
In comparison with the study by Aghajani et al. [34], which focused on the factors affecting LOS in the general surgery ward in Tehran and reported a decision tree accuracy of 84.69%, the random forest model in the current study showed superior performance. Moreover, although Maharlou et al. [35] reported high performance of ANFIS for predicting LOS in ICU patients after cardiac surgery, in our study, this algorithm underperformed compared with RF and C5.0. These discrepancies may result from differences in patient populations, data characteristics, or preprocessing stages.
In a similar study, Gholipour et al. [36] employed an artificial neural network algorithm to predict trauma patients’ survival and LOS in the ward and ICU. Although their model predicted patients' clinical outcomes with good accuracy (93.33%), LOS prediction was relatively error-prone. In contrast, in the present study, the RF model accurately classified patients into short- and long-term stay groups with high accuracy and acceptable AUC. Another notable aspect was the use of the SMOTE technique for data balancing in the present study. Unlike studies that used simpler methods such as undersampling, this approach improved model accuracy. Overall, employing advanced machine learning algorithms, especially random forest, combined with proper data preprocessing and class balancing improves LOS prediction in patients.
Ultimately, the proposed model identified variables such as gender, hypertension, comorbidities, and creatinine level as key predictors of LOS in CHF patients. Specifically, higher creatinine levels and the presence of comorbidities were associated with longer hospital stays, whereas male patients with hypertension but without a history of heart valve replacement were more likely to fall into the short-stay group. These findings are consistent with studies confirming the role of comorbidities and impaired kidney function in increasing hospitalization duration. For example, Daghistani et al. [22] identified chronic diseases such as diabetes and hypertension as factors contributing to increased LOS. Moreover, previous studies have shown that impaired kidney function, through its effect on fluid and electrolyte balance, may delay the recovery process in CHF patients and increase LOS [37]. Therefore, considering these variables at the time of admission can play a key role in predicting LOS and optimizing hospital resource management.
Hypertension also emerged as a significant predictor, consistent with studies suggesting that it exacerbates CHF and, due to its association with comorbid conditions, results in longer hospital stays [26, 38]. Particularly when combined with other chronic diseases, hypertension can complicate the management of the patient's condition and delay discharge. This finding aligns with Gottlieb et al. [39], who showed that CHF patients with hypertension often remain hospitalized longer due to the need for more intensive management and the higher risk of complications.
Creatinine level was another strong predictor of LOS. Elevated creatinine level indicates impaired kidney function, which can complicate CHF treatment. Poor renal function leads to longer hospital stays because these patients require closer monitoring, more precise drug therapy, and more complex management [39].
Atrial fibrillation was also found to be associated with longer LOS. This cardiac rhythm disorder usually co-occurs with heart failure and, due to the need for monitoring, multi-drug therapy, and higher risk of complications, results in greater resource use and delayed discharge [40]. Overall, these results highlight the importance of identifying high-risk patients at admission so that accurate LOS prediction can enable more efficient hospital resource allocation.
Through two complementary approaches (random forest algorithm for LOS prediction and Apriori for association rules extraction), this study proposed a comprehensive model for analyzing LOS in CHF patients. While previous studies such as Hachesu et al. [18] and Turgeman et al. [17] focused mainly on precise LOS prediction, the present study enhanced interpretability through association rule analysis. The Apriori algorithm was applied to binary-classified data (LOS ≤ 7 and > 7 days), identifying specific combinations of patient features. For example, the combination "male patients with hypertension and no valve replacement" is associated with short stays. These rules complement the random forest model and can help interpret results and design targeted intervention programs.
The practical implications of these findings are significant for care planning and resource allocation in CHF management. Physicians can use these two approaches to identify patients at risk of longer stays early and plan targeted care accordingly. For instance, feature combinations such as atrial fibrillation and high creatinine levels, which are associated with longer LOS, can be applied to design personalized treatment pathways. Moreover, external validation using an independent dataset enhanced robustness and generalizability, indicating the applicability of this model beyond the study site.

Limitations
Despite its strengths, this study has several limitations. The data were collected from only two hospitals in Iran, which may limit the generalizability of the findings to other healthcare systems or populations. Furthermore, the dataset lacked variables such as detailed echocardiographic data or medication history, which could have provided a more comprehensive picture of patients’ condition and potentially improved prediction accuracy.
Future research should include additional clinical data, particularly imaging and medication-related variables, to enhance model performance and clinical relevance. Finally, exploring advanced machine learning techniques such as ensemble learning or deep learning may provide deeper insights into complex interactions within patient data and further improve LOS prediction in CHF and related conditions.

Conclusion
This study demonstrated the effectiveness of data mining techniques for predicting the length of stay (LOS) for patients with congestive heart failure (CHF) and highlighted its practical implications for resource management and patient care. By integrating predictive modeling with association rule mining, we proposed a comprehensive approach that can be adapted to other chronic diseases as well. Accurate LOS prediction facilitates improved planning and resource allocation, thereby enhancing the efficiency of healthcare delivery for CHF patients. The findings from this model can assist clinicians in identifying high-risk patients who may require prolonged care and facilitate timely interventions.

Declarations
Ethical considerations: This study was conducted under the ethical approval code IR.UMSU.REC.1398.012 issued by the Ethics Committee in Biomedical Research at Urmia University of Medical Sciences.
Funding: This study was part of a Master’s thesis supported by the Vice Chancellor for Research and Technology at Urmia University of Medical Sciences. The funding body had no role in data collection, analysis, or manuscript preparation.
Conflict of interest: The authors declare no conflicts of interest related to this manuscript.
Authors’ contributions: ME: Conceptualization, study design, data collection, methodology, software, validation, data analysis, data curation, writing – original draft, writing – review & editing. HLA: Conceptualization, study design, data collection, methodology, software, validation, data analysis, data curation, writing – original draft, writing – review & editing, project administration, funding acquisition. BR: Methodology, software, validation, data analysis. KKH: Data collection, methodology, validation, data analysis. ShSG: Software, writing – review & editing, visualization. VH: Methodology, validation, financing.
Consent for publication: Not applicable.

Data availability: The datasets and codes used in this study are available from the corresponding author upon reasonable request via email.
AI declaration: The English text of this article was edited using InstaText software. All content revised with the software was reviewed and approved by the authors.
Acknowledgments: The authors wish to thank all healthcare providers who supported this study at Seyed Al-Shohada and Ayatollah Taleghani Hospitals in Urmia. This article is based on a Master’s thesis titled "Predicting Length of Stay in Congestive Heart Failure Patients Using Data Mining Techniques at Seyed Al-Shohada and Ayatollah Taleghani Teaching Hospitals in Urmia," approved by Urmia University of Medical Sciences in 2020 (Project Code: 2509, Tracking code: 3144).

  
Type of Study: Research | Subject: Health Information Technology
Received: 2025/02/9 | Accepted: 2025/09/3 | Published: 2025/09/28

References
1. Alemzadeh-Ansari MJ, Ansari-Ramandi MM, Naderi N. Chronic pain in chronic heart failure: a review article. The Journal of Tehran University Heart Center. 2017;12(2):49-56. Available from: https://pmc.ncbi.nlm.nih.gov/articles/PMC5558055/
2. Keyhani D, Razavi Z, Shafiee A, Bahadoram S. Autonomic function change following a supervised exercise program in patients with congestive heart failure. ARYA Atherosclerosis. 2013;9(2):150-156. PMCID: PMC3653242. Available from: https://pmc.ncbi.nlm.nih.gov/articles/PMC3653242
3. Writing Group Members, Rosamond W, Flegal K, et al. Heart disease and stroke statistics-2009 update: a report from the American heart association statistics committee and stroke statistics subcommittee. Circulation. 2009;119(3):e21-e181 [DOI:10.1161/CIRCULATIONAHA.108.191261]
4. Ahmadi A, Soori H, Mobasheri M, Etemad K, Khaledifar A. Heart failure: the outcomes, predictive and related factors in Iran. Journal of Mazandaran University of Medical Sciences. 2014;24(118):180-188. [In Persian]. Available from: http://jmums.mazums.ac.ir/article-1-4636-en.html
5. Liu LC, Voors AA, van Veldhuisen DJ, van der Meer P. Heart failure highlights in 2012-2013. European Journal of Heart Failure. 2014;16(2):122-32. [DOI:10.1002/ejhf.43]
6. Bowen RES, Graetz TJ, Emmert DA, Avidan MS. Statistics of heart failure and mechanical circulatory support in 2020. Annals of Translational Medicine. 2020;8(13):827. [DOI:10.21037/atm-20-1127]
7. Nomali M, Mohammadrezaei R, Keshtkar AA, Roshandel G, Ghiyasvandian S, Alipasandi K, et al. Self-monitoring by traffic light color coding versus usual care on outcomes of patients with heart failure reduced ejection fraction: protocol for a randomized controlled trial. JMIR Research Protocols. 2018;7(11):e9209. [DOI:10.2196/resprot.9209]
8. Ziaeian B, Fonarow GC. Epidemiology and aetiology of heart failure. Nature Reviews Cardiology. 2016;13(6):368-78. [DOI:10.1038/nrcardio.2016.25]
9. Mirdamadi A, Shafiee A, Ansari-Ramandi M, Garakyaraghi M, Pourmoghaddas A, Bahmani A, Mahmoudi H, Gharipour M. Beneficial effects of testosterone therapy on functional capacity, cardiovascular parameters, and quality of life in patients with congestive heart failure. BioMed Research International. 2014;2014:392432. [DOI:10.1155/2014/392432]
10. Mori J, Krantz MJ, Tanner J, Horwich TB, Yancy C, Albert NM, Hernandez AF, Dai D, Fonarow GC. Influence of hospital length of stay for heart failure on quality of care. The American Journal of Cardiology. 2008;102(12):1693-1697. [DOI:10.1016/j.amjcard.2008.08.015]
11. Azari A, Janeja VP, Mohseni A. Predicting hospital length of stay (PHLOS): a multi tiered data mining approach. In: 2012 IEEE 12th International Conference on Data Mining Workshops (ICDMW). 2012. p. 17-24. [DOI:10.1109/ICDMW.2012.69]
12. Mehdipour Y, Ebrahimi S, Karimi A, Alipour J, Khammarnia M, Siasar F. Presentation a model for prediction of cerebrovascular accident using data mining algorithm. Sadra Medical Journal. 2016;4(4):255-266. Available from: https://smsj.sums.ac.ir/article_43946_en.html
13. Ristevski B, Chen M. Big data analytics in medicine and healthcare. Journal of Integrative Bioinformatics. 2018;15(3):20170030. [DOI:10.1515/jib-2017-0030]
14. Pasupathi C, Kalavakonda V. Evidence based healthcare system using big data for disease diagnosis. In: 2016 2nd International Conference on Advances in Electrical, Electronics, Information, Communication and BioInformatics (AEEICB). 2016. p. 370-4. [DOI:10.1109/AEEICB.2016.7538393]
15. Sarafi Nejad A, Saeid A, Mohammed Rose I, Rowhanimanesh A. Modeling a data mining decision tree and propose a new model for the diagnosis of skin cancer by immunohistochemical staining methods. Journal of Health and Biomedical Informatics. 2014;1(1):54-62. Available from: http://jhbmi.ir/article-1-62-en.html
16. Tekieh MH, Raahemi B. Importance of data mining in healthcare: a survey. In: Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. 2015. p. 1057-62. [DOI:10.1145/2808797.2809367]
17. Turgeman L, May JH, Sciulli R. Insights from a machine learning model for predicting the hospital length of stay at the time of admission. Expert Systems with Applications. 2017;78:376-85. [DOI:10.1016/j.eswa.2017.02.023]
18. Hachesu PR, Ahmadi M, Alizadeh S, Sadoughi F. Use of data mining techniques to determine and predict length of stay of cardiac patients. Healthcare Informatics Research. 2013;19(2):121-9. [DOI:10.4258/hir.2013.19.2.121]
19. Thuraisingham B. A primer for understanding and applying data mining. IT Professional. 2002;2(1):28-31. [DOI:10.1109/6294.819936]
20. Zhao J, Feng X, Pang Q, Fowler M, Lian Y, Ouyang M, et al. Battery safety: machine learning-based prognostics. Progress in Energy and Combustion Science. 2024;102:101142. [DOI:10.1016/j.pecs.2023.101142]
21. Luo L, Lain S, Feng C, Huang D, Zhang W. Data mining-based detection of rapid growth in length of stay on COPD patients. In: 2017 IEEE 2nd International Conference on Big Data Analysis (ICBDA). 2017. p. 319-23. [DOI:10.1109/ICBDA.2017.8078819]
22. Daghistani TA, Elshawi R, Sakr S, Ahmad A, Al-Thwayee A, Al-Mallah. Predictors of in hospital length of stay among cardiac patients: a machine learning approach. International Journal of Cardiology. 2019;288:140-7. [DOI:10.1016/j.ijcard.2019.01.046]
23. Neri L, Oberdier MT, van Abeelen KCJ, Menghini L, Tumarkin E, Tripathi H, et al. Electrocardiogram monitoring wearable devices and artificial-intelligence-enabled diagnostic capabilities: a review. Sensors. 2023;23(10):4805. [DOI:10.3390/s23104805]
24. Dai W, Brisimi TS, Adams WG, Mela T, Saligrama V, Paschalidis IC. Prediction of hospitalization due to heart diseases by supervised learning methods. International Journal of Medical Informatics. 2015;84(3):189-197. [DOI:10.1016/j.ijmedinf.2014.10.002]
25. Natale J. A strategy for reducing congestive heart failure readmissions through the use of interventions targeted by machine learning [Doctoral dissertation]. University of Akron; 2015. OhioLINK Electronic Theses and Dissertations Center. Available from: http://rave.ohiolink.edu/etdc/view?acc_num=akron1428233380
26. Messerli FH, Rimoldi SF, Bangalore S. The transition from hypertension to heart failure: contemporary update. JACC: Heart Failure. 2017;5(8):543-51. [DOI:10.1016/j.jchf.2017.04.012]
27. Berkhin P, Becher JD. Learning simple relations: theory and applications. In: Proceedings of the 2002 SIAM International Conference on Data Mining. 2002. p. 420-36. [DOI:10.1137/1.9781611972726.25]
28. Zebin T, Rezvy S, Chaussalet TJ. A deep learning approach for length of stay prediction in clinical settings from medical records. In: 2019 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB). 2019. p. 1-6. [DOI:10.1109/CIBCB.2019.8791477]
29. Flach P, Blockeel H, Ferri C, Orallo JH, Struyf J. Decision support for data mining: an introduction to ROC analysis and its applications. In: Data Mining and Decision Support: Integration and Collaboration. Springer; 2003. p. 81-90. [DOI:10.1007/978-1-4615-0286-9_7]
30. Galdi P, Tagliaferri R. Data mining: accuracy and error measures for classification and prediction. Encyclopedia of Bioinformatics and Computational Biology. 2018;1:431-6. [DOI:10.1016/B978-0-12-809633-8.20474-3]
31. Ben-David A. About the relationship between ROC curves and Cohen's kappa. Engineering Applications of Artificial Intelligence. 2008;21(6):874-81. [DOI:10.1016/j.engappai.2007.09.009]
32. Preda S, Oprea SV, Bâra A, Belciu (Velicanu) A. PV forecasting using support vector machine learning in a big data analytics context. Symmetry. 2018;10(12):748. [DOI:10.3390/sym10120748]
33. Huang J, Ling CX. Using AUC and accuracy in evaluating learning algorithms. IEEE Transactions on Knowledge and Data Engineering. 2005;17(3):299-310. [DOI:10.1109/TKDE.2005.50]
34. Levy D, Larson MG, Vasan RS, Kannel WB, Ho KK. The progression from hypertension to congestive heart failure. JAMA. 1996;275(20):1557-62. [DOI:10.1001/jama.1996.03530440037034]
35. Maharlou H, Niakan Kalhori SR, Shahbazi S, Ravangard R. Predicting length of stay in intensive care units after cardiac surgery: comparison of artificial neural networks and adaptive neuro fuzzy system. Healthcare Informatics Research. 2018;24(2):109-17. [DOI:10.4258/hir.2018.24.2.109]
36. Gholipour C, Rahim F, Fakhree A, Ziapour B. Using an artificial neural networks (ANNs) model for prediction of intensive care unit (ICU) outcome and length of stay at hospital in traumatic patients. Journal of Clinical and Diagnostic Research. 2015;9(4):OC19-23. [DOI:10.7860/JCDR/2015/9467.5828]
37. Bleumink GS, Knetsch AM, Sturkenboom MC, Straus SM, Hofman A, Deckers JW, et al. Quantifying the heart failure epidemic: prevalence, incidence rate, lifetime risk and prognosis of heart failure: the Rotterdam Study. European Heart Journal. 2004;25(18):1614-9. [DOI:10.1016/j.ehj.2004.06.038]
38. Sud M, Yu B, Wijeysundera HC, Austin PC, Ko DT, Braga J, et al. Associations between short or long length of stay and 30 day readmission and mortality in hospitalized patients with heart failure. JACC: Heart Failure. 2017;5(8):578-88. [DOI:10.1016/j.jchf.2017.03.012]
39. Gottlieb SS, Abraham W, Butler J, Forman DE, Loh E, Massie BM, et al. The prognostic importance of different definitions of worsening renal function in congestive heart failure. Journal of Cardiac Failure. 2002;8(3):136-41. [DOI:10.1054/jcaf.2002.125289]
40. Heist EK, Ruskin JN. Atrial fibrillation and congestive heart failure: risk factors, mechanisms, and treatment. Progress in Cardiovascular Diseases. 2006;48(4):256-69. [DOI:10.1016/j.pcad.2005.09.001]



Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

© 2026 CC BY-NC 4.0 | Journal of Health Administration
