Evaluation of classification methods is one of the most important steps in data mining. The most common techniques are the confusion matrix, learning curves and receiver operating characteristic (ROC) curves. The confusion matrix shows the number of correct and incorrect predictions made by the prediction model. True positives (TP) are cases in which the model predicts that the patient survived and the patient actually survived. True negatives (TN) are cases in which the model predicts that the patient expired and the patient actually expired. False positives (FP) are cases in which the model predicts that the patient survived, but the patient actually expired (Type I error). False negatives (FN) are cases in which the model predicts that the patient expired, but the patient actually survived (Type II error).
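As an illustration, the four cells of the confusion matrix can be counted directly from paired actual and predicted labels. The following is a minimal sketch, assuming that "survived" is the positive class as in the definitions above; the function name `confusion_counts` and the six example labels are hypothetical and are not taken from the study data.

```python
from collections import Counter

def confusion_counts(actual, predicted, positive="survived"):
    """Count TP, TN, FP and FN, treating `positive` as the positive class."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            # Predicted positive: correct if the patient actually survived.
            counts["TP" if a == positive else "FP"] += 1
        else:
            # Predicted negative: correct if the patient actually expired.
            counts["FN" if a == positive else "TN"] += 1
    return {k: counts[k] for k in ("TP", "TN", "FP", "FN")}

# Made-up labels for six patients, for illustration only.
actual    = ["survived", "survived", "expired", "expired",  "survived", "expired"]
predicted = ["survived", "expired",  "expired", "survived", "survived", "expired"]

cells = confusion_counts(actual, predicted)
print(cells)  # {'TP': 2, 'TN': 2, 'FP': 1, 'FN': 1}
```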
Accuracy is the proportion of all predictions that the model classifies correctly. It is calculated as the ratio of the number of correctly classified cases to the total number of cases under consideration.
Accuracy = (TP+TN) / (TP+TN+FP+FN)
Sensitivity is the proportion of positive cases that are correctly classified, i.e. the percentage of patients who survived and are correctly classified as survived.
Sensitivity = TP / (TP+FN)
Specificity is the proportion of negative cases that are correctly classified, i.e. the percentage of patients who expired and are correctly classified as expired.
Specificity = TN / (TN+FP)
where TP, TN, FP and FN denote the numbers of true positives, true negatives, false positives and false negatives, respectively.
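The three measures can be computed from the four confusion matrix counts in a few lines. The sketch below is illustrative only: the function name `classification_metrics` and the counts passed to it are invented for the example and do not come from the study. It assumes that each denominator is nonzero, i.e. that the data contain at least one positive and one negative case.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return accuracy, sensitivity and specificity from confusion matrix counts.

    Assumes tp + fn > 0 and tn + fp > 0; otherwise a ZeroDivisionError is raised.
    """
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,   # correct predictions over all cases
        "sensitivity": tp / (tp + fn),   # correctly classified positive cases
        "specificity": tn / (tn + fp),   # correctly classified negative cases
    }

# Hypothetical counts for 100 patients, for illustration only.
metrics = classification_metrics(tp=40, tn=30, fp=10, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these invented counts, accuracy is 70/100, sensitivity is 40/60 and specificity is 30/40. Accuracy alone can hide a weak sensitivity or specificity when the two classes are unbalanced, so the three values are better read together.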