Showing 7 results for Prediction
A Biglarian, E Hajizadeh, A Kazemnejad,
Volume 6, Issue 3 (12-2010)
Abstract
Background & Objective: Parametric models are a common approach in survival analysis. In recent years, artificial neural network (ANN) models have been increasingly used for survival prediction. The aim of this study was to predict the survival rate of patients with gastric cancer using parametric regression and ANN models, and to compare the two methods.
Methods: We used data on 436 gastric cancer patients from a cancer registry in Tehran for 2002-2007. All patients had a confirmed diagnosis. The data were randomly divided into two sets: training and testing (validation). We analyzed the data using parametric models (exponential, Weibull, normal, lognormal, logistic and log-logistic) and a three-layer ANN model. To compare the predictions of the two approaches, we used the area under the receiver operating characteristic curve (AUROC), the classification table and the concordance index.
Results: The prediction accuracies of the ANN and the parametric (Weibull) models were 79.45% and 73.97%, respectively. The AUROCs for the ANN and the Weibull models were 0.815 and 0.748, respectively.
Conclusions: The ANN gave better predictions than the Weibull model. The ANN model is therefore suggested for survival prediction in the field of cancer.
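The two ranking measures named in the Methods above, AUROC and the concordance index, can be illustrated with a minimal sketch. This is not the authors' code: the function names and the toy data are hypothetical, and the concordance index is assumed to be Harrell's C for right-censored data, which the abstract does not specify.

```python
def auroc(labels, scores):
    """AUROC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def concordance_index(times, events, risks):
    """Harrell's C (assumed definition) for right-censored survival data.

    A pair (i, j) is comparable when subject i had the event and did so
    before time j. It is concordant when i also has the higher risk score.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier failure
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data, not taken from the study.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
print(auroc(labels, scores))

times = [2, 4, 6, 8]
events = [1, 1, 0, 1]  # 1 = death observed, 0 = censored
risks = [0.9, 0.4, 0.5, 0.1]
print(concordance_index(times, events, risks))
```

Both functions compare every pair of cases, which is easy to read but quadratic in the number of patients; that is acceptable for a registry of a few hundred cases.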
J Hasanzadeh, F Najafi, M Moradinazar,
Volume 11, Issue 1 (6-2015)
Abstract
A time series is a collection of observations arranged in time order. The main purpose of building a time series model is to predict future values. The first step in time series analysis is to graph the data. Graphs provide general information such as upward or downward trends, seasonal patterns, cyclical behavior, and outliers. After graphing the data, a good forecast requires stationary data; differencing or decomposition methods can be used to make the data stationary. A correlogram can then be used to identify the orders of the moving average and autoregressive components of the model. The model parameters are tested using the t-test. If the parameters are significant and the residuals are independent, the predicted values can be evaluated using the mean absolute percentage error.
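Three of the steps described above (differencing, the correlogram, and the mean absolute percentage error) can be sketched in a few lines. The function names and the example numbers are hypothetical and serve only as an illustration; model fitting and the t-test on the parameters are left out.

```python
def difference(series, lag=1):
    """Lag-differenced series, used to remove a trend (lag=1) or a season."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def autocorrelation(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag (the correlogram values)."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    acf = []
    for k in range(max_lag + 1):
        num = sum((series[t] - mean) * (series[t - k] - mean)
                  for t in range(k, n))
        acf.append(num / denom)
    return acf


def mape(actual, predicted):
    """Mean absolute percentage error; actual values must be non-zero."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical monthly counts with an upward trend.
counts = [12, 15, 19, 24, 30, 37, 41, 50]
stationary = difference(counts)          # first differences
print(stationary)
print(autocorrelation(stationary, 3))    # inspect for AR/MA order
print(mape([100, 200], [110, 180]))      # error of two toy forecasts
```

In practice the correlogram is read together with the partial autocorrelation function, which this sketch does not compute.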
P Kimyaiee, M Bakhtiyari, M Mirzamoradi, S Ashrafivand, Ma Mansournia,
Volume 11, Issue 3 (11-2015)
Abstract
Background and Objectives: Gestational trophoblastic neoplasia (GTN) is a general term for an extensive range of malignant trophoblastic diseases, including invasive mole, choriocarcinoma, epithelioid trophoblastic tumors and placental site trophoblastic tumors. The aim of this study was to predict the risk of GTN in patients with molar pregnancy in Tehran.
Methods: All cases of partial or complete mole with a record of at least 4 β-hCG titers were included in this study. To calculate the area under the curve for each predictor variable, the type of relationship (linear or non-linear) was first determined using locally weighted scatterplot smoothing (Lowess Smoother) and fractional polynomial regression (Fracpoly); a model suited to the data was then fitted and used to draw the ROC curve.
Results: Nonparametric chi-square analysis indicated no significant difference between the components of high-risk molar pregnancy and GTN (P=0.39). Generally, among 201 cases of molar pregnancy, 61 (30%) had one of the components of high-risk molar pregnancy. The ROC curve with an AUC of 0.86 showed that the regression slope of β-hCG with 73% sensitivity and 88% specificity could be used as a predictor.
Conclusion: The serum β-hCG measurement 21 days after molar pregnancy evacuation and the slope of the linear regression line of β-hCG were found to be good tests for distinguishing between patients who will have spontaneous disease remission and patients who will develop GTN.
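The predictor highlighted above, the slope of the linear regression line of β-hCG, is an ordinary least-squares slope fitted to each patient's titers over time. The sketch below is only an illustration under assumptions: the function name, the follow-up days and the titer values are hypothetical, and the abstract does not say whether raw or log-transformed titers were used.

```python
def ols_slope(x, y):
    """Least-squares slope of y on x: sum(dx * dy) / sum(dx ** 2)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two or more paired observations")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all x values are identical")
    return sxy / sxx


# Hypothetical follow-up of one patient: days since evacuation and titers.
days = [0, 7, 14, 21]
titers = [100.0, 80.0, 60.0, 40.0]
print(ols_slope(days, titers))  # negative slope: titers are falling
```

A per-patient slope computed this way could then be compared against a cutoff to obtain sensitivity and specificity; the cutoff itself is not given in the abstract.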
S Setareh, M Zahiri Esfahani, M Zare Bandamiri, A Raeesi, R Abbasi,
Volume 14, Issue 1 (6-2018)
Abstract
Background and Objectives: Colon cancer is the third most common cancer in the world and the fourth most common cancer in Iran. Predicting the outcome of the cancer from basic clinical data is very important. Given the high rate of colon cancer and the benefits of data mining for survival prediction, the aim of this study was to examine two widely used machine learning algorithms, Bagging and Support Vector Machines (SVM), for predicting the outcome of colon cancer patients.
Methods: The study population was 567 patients with stage 1-4 colon cancer at Namazi Radiotherapy Center, Shiraz, in 2006-2011. Three hundred and thirty-eight patients were alive and 229 had died. We used the SVM and Bagging methods to predict the survival of patients with colon cancer. Weka software ver 3.6.10 was used for data analysis.
Results: The performance of the two algorithms was determined using the confusion matrix. The accuracy, specificity, and sensitivity of the SVM were 84.48%, 81%, and 87%, and those of Bagging were 83.95%, 78%, and 88%, respectively.
Conclusion: The results showed that both algorithms perform well in predicting the survival of patients with colon cancer, but the SVM has a higher accuracy.
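The Results above derive accuracy, specificity, and sensitivity from a confusion matrix. A minimal sketch of that calculation follows. The function name and the cell counts are hypothetical (the abstract does not report the matrices), and which outcome counts as the positive class is an assumption.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from the four matrix cells.

    tp/fn: positive cases predicted correctly / missed.
    tn/fp: negative cases predicted correctly / flagged wrongly.
    """
    total = tp + fn + tn + fp
    if total == 0 or (tp + fn) == 0 or (tn + fp) == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }


# Hypothetical counts, not taken from the study.
metrics = confusion_metrics(tp=40, fn=10, tn=45, fp=5)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Because accuracy mixes both classes, two classifiers can rank differently on accuracy and on sensitivity, as the SVM and Bagging figures above do.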
M Safari, M Sadeghifar, Gh Roshanaei, A Zahiri,
Volume 14, Issue 2 (9-2018)
Abstract
Background and Objectives: Tuberculosis is a chronic bacterial disease and a major cause of morbidity and mortality. It is caused by Mycobacterium tuberculosis. Knowing the incidence and number of new cases of the disease is valuable for revising implemented programs and development indicators. Time series and regression models are commonly used for prediction, but these methods require some assumptions. The purpose of this study was to predict new TB cases using the hidden Markov model, which does not require many assumptions.
Methods: The data used in this study were the monthly numbers of new TB cases identified and recorded in Hamedan Province during 2006-2016. The number of new TB cases was forecast with hidden Markov models using the hidden Markov package in the R software.
Results: According to the AIC and BIC criteria, a model with two states had the best fit to the data, i.e. the data of this study were a mixture of two Poisson distributions with mean numbers of events of 5.96 and 10.2, respectively. Based on the hidden Markov model, the predicted number of new cases over the next 24 months was between 8 and 9 per month.
Conclusion: The hidden Markov model is the best model for Markov chain-based prediction. In addition to identifying an appropriate model for the available data, it can determine the transition probability matrix, which can help physicians predict the future state of the disease and take preventive measures before it reaches advanced stages.
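To make the model selection step above concrete, the sketch below evaluates the log-likelihood of a Poisson hidden Markov model with the scaled forward algorithm and turns it into AIC and BIC. It is not the R code used in the study. The two state means come from the abstract, while the monthly counts, the transition matrix, the initial distribution and the parameter count of m squared (m means plus m(m-1) free transition probabilities) are assumptions made for illustration.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of observing count k with mean lam."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def hmm_log_likelihood(counts, means, trans, init):
    """Log-likelihood of a Poisson HMM via the scaled forward algorithm."""
    m = len(means)
    alpha = [init[s] * poisson_pmf(counts[0], means[s]) for s in range(m)]
    log_lik = 0.0
    for t in range(len(counts)):
        if t > 0:
            # Propagate one step through the chain, then weight by the
            # probability of the observed count in each state.
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(m))
                * poisson_pmf(counts[t], means[s])
                for s in range(m)
            ]
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_lik


def aic(log_lik, n_params):
    return 2 * n_params - 2 * log_lik


def bic(log_lik, n_params, n_obs):
    return n_params * math.log(n_obs) - 2 * log_lik


# Hypothetical monthly counts; only the two means are from the abstract.
counts = [4, 7, 11, 9, 5, 6, 12, 10]
means = [5.96, 10.2]
trans = [[0.9, 0.1], [0.2, 0.8]]   # assumed transition probabilities
init = [0.5, 0.5]                  # assumed initial distribution
ll = hmm_log_likelihood(counts, means, trans, init)
n_params = len(means) ** 2         # assumed parameter count
print(ll, aic(ll, n_params), bic(ll, n_params, len(counts)))
```

Repeating the calculation for fitted models with one, two, three or more states and picking the smallest AIC or BIC mirrors the comparison described in the Results; fitting the parameters themselves is outside this sketch.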
L Tapak, N Shirmohammadi-Khorram, O Hamidi, Z Maryanaji,
Volume 14, Issue 2 (9-2018)
Abstract
Background and Objectives: Identification of statistical models has a great impact on early and accurate detection of outbreaks of infectious diseases and timely warning in health surveillance. This study evaluated and compared the performance of three data mining techniques in time series prediction of brucellosis.
Methods: In this time series study, data on human brucellosis cases and climatology parameters of Hamadan, west of Iran, were analyzed on a monthly basis from 2004 (March/April) to 2017 (February/March). The data were split into training (80%) and test (20%) subsets. Three techniques, i.e. radial basis function (RBF) and multilayer perceptron (MLP) artificial neural networks and k-nearest neighbors (KNN), were applied to both subsets. The root mean square error (RMSE), mean absolute error (MAE), mean absolute relative error (MARE), coefficient of determination (R²) and intra-class correlation coefficient (ICC) were used for performance comparison.
Results: The RMSE (23.79), MAE (20.65) and MARE (0.25) for MLP were smaller than those of the other two models. The ICC (0.75) and R² (0.61) values were also better for this model. Thus, the MLP model outperformed the other models in predicting these data. The most important climatology variable was temperature.
Conclusion: MLP can be effectively applied to model the behavior of brucellosis over time. Further research is necessary to identify the most suitable method for predicting the trend of this disease.
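Four of the comparison measures listed in the Methods above can be written out directly. The sketch below assumes the usual definitions, with MARE taken as the mean of |actual - predicted| / |actual| and R² as one minus the ratio of residual to total sum of squares; the abstract does not state which definitions were used, and the function name and numbers are hypothetical. The ICC is omitted.

```python
import math


def error_metrics(actual, predicted):
    """RMSE, MAE, MARE and R² for paired actual and predicted values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mean_actual = sum(actual) / n
    ss_res = sum(e ** 2 for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        # Relative error is undefined when an actual value is zero.
        "mare": sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }


# Hypothetical monthly case counts and model predictions.
actual = [10, 20, 30]
predicted = [12, 18, 33]
print(error_metrics(actual, predicted))
```

Lower RMSE, MAE and MARE and a higher R² indicate a better fit, which is the direction of comparison used in the Results.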
F Feizmanesh, Aa Safaei,
Volume 14, Issue 3 (12-2018)
Abstract
Background and Objectives: Pulmonary embolism is a potentially fatal and prevalent event that has led to a gradual increase in the number of hospitalizations in recent years. For this reason, it is one of the most challenging diseases for physicians. The main purpose of this paper was to compare different data mining algorithms in order to select the most accurate model for predicting pulmonary embolism in hospitalized patients. This model would provide the knowledge needed by medical staff for better decision making.
Methods: In this research, we designed a prediction model using different machine learning methods that would best predict the probability of pulmonary embolism in patients at risk. Among data mining algorithms, Bayesian network, decision tree (J48), logistic regression (LR), and sequential minimal optimization (SMO) were used. The data used in the study included risk factors and past history of patients admitted to the Lung Department of Shariati Hospital, Tehran, Iran.
Results: The results showed that the accuracy and specificity of all prediction models were satisfactory. The Bayesian model had the highest sensitivity in predicting pulmonary embolism.
Conclusion: Although the results showed little difference in the performance of the prediction models, the Bayesian model is a more appropriate tool for predicting the occurrence of pulmonary embolism in hospitalized patients with this type of data. It can be considered a supportive approach alongside medical decisions to improve disease prediction.