Anticipated Date of Graduation
Spring 2024
Document Type
Thesis
Degree Name
Master of Science in Mathematical Sciences
Department
Mathematical Sciences
First Advisor
Doug Darbro
Abstract
Discriminant analysis is a statistical technique for assigning observations to predefined classes. Many studies have compared the performance of different classification methods. This study compares Linear Discriminant Analysis (LDA) and Quadratic Discriminant Analysis (QDA) under varying conditions of normality and equality of covariance matrices. More precisely, it seeks to determine which of the two techniques classifies better on datasets with different properties of normality and equality of covariance matrices, and whether those properties influence the prediction performance of each method. The study processes online stores’ customer sales data. Although the data were randomly generated, they were designed to be close to reality: the generation took into account aspects such as the mean and standard deviation of purchases of a particular type of product over a given period. By varying parameters such as the mean and the standard deviation, datasets approximating real-world data were obtained. These datasets were classified using LDA and QDA, and the ROC-AUC score was used as the performance metric for each method. Statistical comparison of these scores showed which method performed better under which conditions. The results indicate that LDA performs better than QDA when classifying online stores’ customers based solely on their purchasing habits, and that LDA is insensitive to changes in both normality and equality of covariance matrices. These results can help businesses with online stores choose a classification method suited to the type of distribution in their dataset.
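The comparison described in the abstract can be illustrated with a minimal sketch. This is not the thesis's code or data. It assumes a single feature (a hypothetical purchase amount), so the covariance matrices reduce to variances, and it assumes normally distributed classes. The function names and the parameter values (means of 50 and 80, standard deviations of 10 and 30) are invented for illustration. The only difference between the two classifiers here is that LDA pools the class variances while QDA estimates one per class.

```python
import math
import random
import statistics


def log_normal_pdf(x, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


def discriminant_scores(train0, train1, test, pooled):
    """Log posterior odds of class 1 versus class 0 for each test value.

    pooled=True shares one variance across classes (LDA, univariate case);
    pooled=False keeps a separate variance per class (QDA, univariate case).
    """
    m0, m1 = statistics.fmean(train0), statistics.fmean(train1)
    v0, v1 = statistics.variance(train0), statistics.variance(train1)
    n0, n1 = len(train0), len(train1)
    if pooled:
        v0 = v1 = ((n0 - 1) * v0 + (n1 - 1) * v1) / (n0 + n1 - 2)
    log_prior_ratio = math.log(n1 / n0)
    return [
        log_prior_ratio + log_normal_pdf(x, m1, v1) - log_normal_pdf(x, m0, v0)
        for x in test
    ]


def roc_auc(scores, labels):
    """ROC-AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def run_experiment(mean0, sd0, mean1, sd1, n_train=300, n_test=200, seed=0):
    """Generate two synthetic customer classes and return (LDA AUC, QDA AUC)."""
    rng = random.Random(seed)

    def draw(mean, sd, n):
        return [rng.gauss(mean, sd) for _ in range(n)]

    train0, train1 = draw(mean0, sd0, n_train), draw(mean1, sd1, n_train)
    test = draw(mean0, sd0, n_test) + draw(mean1, sd1, n_test)
    labels = [0] * n_test + [1] * n_test
    auc_lda = roc_auc(discriminant_scores(train0, train1, test, pooled=True), labels)
    auc_qda = roc_auc(discriminant_scores(train0, train1, test, pooled=False), labels)
    return auc_lda, auc_qda


if __name__ == "__main__":
    # Hypothetical scenarios: equal versus unequal class variances.
    scenarios = [
        ("equal variances", (50, 10, 80, 10)),
        ("unequal variances", (50, 10, 80, 30)),
    ]
    for name, params in scenarios:
        lda, qda = run_experiment(*params)
        print(f"{name}: LDA AUC={lda:.3f}, QDA AUC={qda:.3f}")
```

Repeating `run_experiment` over many seeds and parameter settings would yield paired AUC samples that could then be compared statistically, in the spirit of the study. The numbers this toy setup produces say nothing about the thesis's actual findings.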
Recommended Citation
Bate-Eya, Ayuk Egbe, "Empirically comparing the performance of LDA and QDA when classifying customer sales data with different properties of normality and equality of covariance matrices" (2024). Master of Science in Mathematics. 82.
https://digitalcommons.shawnee.edu/math_etd/82