Binary Classification Results

For our initial model, we implemented a binary classifier for English and Mandarin Chinese. We took the raw training data from the Common Voice dataset, computed Fourier coefficients, and fed them into our system for training. Our first results are displayed below in a confusion matrix. Using 200 clips of each language for training, we obtained an accuracy of 87% for Mandarin Chinese but only 32% for English.
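As a rough illustration of how per-language accuracies like these can be read off a confusion matrix, here is a minimal Python sketch. The function names and the toy labels are assumptions made for this example. They are not the project's code or its results.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Return counts[true][predicted] for every pair of classes."""
    counts = {t: {p: 0 for p in classes} for t in classes}
    for t, p in zip(true_labels, predicted_labels):
        counts[t][p] += 1
    return counts


def per_class_accuracy(counts):
    """Fraction of each true class's clips that received the correct label."""
    result = {}
    for t, row in counts.items():
        total = sum(row.values())
        result[t] = row[t] / total if total else float("nan")
    return result


# Toy labels for illustration only; these are NOT the project's data.
classes = ["English", "Mandarin"]
y_true = ["English"] * 4 + ["Mandarin"] * 4
y_pred = ["English", "Mandarin", "Mandarin", "Mandarin",
          "Mandarin", "Mandarin", "Mandarin", "English"]

cm = confusion_matrix(y_true, y_pred, classes)
acc = per_class_accuracy(cm)
print(cm)   # rows are true languages, columns are predicted languages
print(acc)  # one accuracy value per language
```

Reporting accuracy per language rather than as one overall number is what exposes an imbalance like the one above, where one language is recognized far more reliably than the other.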

Binary KNN Classification, K = 3

After seeing the low accuracy for English, we manually selected higher-quality audio samples to train our model. We also systematically trimmed the audio clips after finding strange MFCC artifacts at the beginning of many of them. Finally, we applied the relieff function (discussed in Classification), which ranks features by how useful they are for classification, and kept the top predictors for model training. After applying these changes to our model, we obtained the following results.
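To give a feel for the feature-ranking step, here is a minimal Python sketch of the original two-class Relief algorithm, which uses one nearest hit and one nearest miss per sample. This is a simplified stand-in and not the relieff function we used, which averages over several neighbors. The names `relief_weights` and `top_features` and the toy data are assumptions made for this example.

```python
def relief_weights(X, y):
    """Basic Relief weights for a two-class problem (simplified, not ReliefF)."""
    n, d = len(X), len(X[0])
    # Scale each feature by its range so per-feature differences lie in [0, 1].
    spans = []
    for f in range(d):
        col = [row[f] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(a, b, f):
        return abs(a[f] - b[f]) / spans[f]

    def dist(a, b):
        return sum(diff(a, b, f) for f in range(d))

    weights = [0.0] * d
    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        hit = min(hits, key=lambda j: dist(X[i], X[j]))     # nearest same-class sample
        miss = min(misses, key=lambda j: dist(X[i], X[j]))  # nearest other-class sample
        for f in range(d):
            # Reward features that differ across classes, penalize ones that
            # differ within a class.
            weights[f] += (diff(X[i], X[miss], f) - diff(X[i], X[hit], f)) / n
    return weights


def top_features(weights, k):
    """Indices of the k highest-weighted features, best first."""
    return sorted(range(len(weights)), key=lambda f: weights[f], reverse=True)[:k]


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.9, 0.2], [1.0, 0.8], [0.8, 0.5]]
y = ["en", "en", "en", "zh", "zh", "zh"]

w = relief_weights(X, y)
selected = top_features(w, 1)
X_reduced = [[row[f] for f in selected] for row in X]
print(w, selected)
```

On this toy data the separating feature gets a positive weight and the noisy one a negative weight, so only the first is kept. In the same way, keeping only the highest-ranked predictors shrinks the feature vector that the classifier has to compare.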

Binary KNN Classification, K = 3, # Predictors = 2500

English classification improved significantly as a result of these changes. We consistently obtained accuracies of about 75% for both languages, which comfortably met the accuracy threshold we had set as our goal.

Multi-Language Classifier
