Cardiovascular diseases are among the leading causes of death worldwide, and coronary artery disease (CAD) is a major type. CAD is present when the diameter narrowing of at least one of the left anterior descending, left circumflex, or right coronary arteries is equal to or greater than 50 percent. Angiography is the principal diagnostic modality for stenosis of the heart vessels. However, because of its complications and costs, researchers are looking for alternative methods such as data mining. This study applies data mining algorithms to the Z-Alizadeh Sani dataset, which was collected from 303 random visitors to Tehran's Shaheed Rajaei Cardiovascular, Medical and Research Center. In this paper, the reason for the effectiveness of a preprocessing algorithm on the dataset is investigated. This algorithm, which was only introduced in our previous work, extracts three new features from the dataset. These features are then used to enrich the primary dataset in order to achieve more accurate results. Moreover, although misclassifying a diseased patient has more serious consequences than misclassifying a healthy one, to the best of our knowledge cost-sensitive algorithms have yet to be used in this field. Therefore, in this paper cost-sensitive algorithms with the base classifiers Naïve Bayes, Sequential Minimal Optimization (SMO), K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and C4.5 were evaluated using 10-fold cross validation. As a result, the SMO algorithm yielded very high sensitivity (97.22%) and accuracy (92.09%), a combination that has not been reported simultaneously in the existing literature.
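To make the idea of cost-sensitive classification concrete, the following is a minimal Python sketch of one common approach: choosing the decision threshold that minimizes expected misclassification cost, then measuring sensitivity and accuracy. It is not the procedure used in this study. The function names, the cost values (a missed CAD case counted as five times as costly as a false alarm), and the toy probabilities are all hypothetical and chosen only for illustration.

```python
def min_cost_threshold(cost_fn, cost_fp):
    """Probability threshold that minimizes expected cost.

    Predicting "diseased" is cheaper whenever p * cost_fn >= (1 - p) * cost_fp,
    i.e. p >= cost_fp / (cost_fp + cost_fn). Correct decisions are assumed
    to cost nothing.
    """
    return cost_fp / (cost_fp + cost_fn)


def classify(probs, cost_fn, cost_fp):
    """Label each case 1 (CAD) or 0 (healthy) using the cost-based threshold."""
    threshold = min_cost_threshold(cost_fn, cost_fp)
    return [1 if p >= threshold else 0 for p in probs]


def sensitivity_and_accuracy(y_true, y_pred):
    """Return (sensitivity, accuracy) for binary labels where 1 = diseased."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    correct = sum(1 for t, p in pairs if t == p)
    return tp / (tp + fn), correct / len(pairs)


# Hypothetical toy data: true labels and a base classifier's estimated
# probability of CAD for six patients (not taken from the dataset).
y_true = [1, 1, 1, 0, 0, 1]
probs = [0.9, 0.4, 0.2, 0.3, 0.1, 0.6]

# Equal costs give the usual 0.5 threshold.
equal_cost = sensitivity_and_accuracy(y_true, classify(probs, 1, 1))

# Assumed costs: a false negative is five times as costly as a false positive.
cad_weighted = sensitivity_and_accuracy(y_true, classify(probs, 5, 1))

print("equal costs  (sensitivity, accuracy):", equal_cost)
print("FN cost = 5  (sensitivity, accuracy):", cad_weighted)
```

On this toy input, raising the false-negative cost lowers the threshold, so every diseased case is caught at the price of one additional false positive. In a real evaluation the probabilities would come from a trained base classifier and the metrics would be averaged over cross-validation folds.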