EECS 351 Language Classification
Multi-Language Classification Results
Once we were confident in our binary classification, we introduced a third language, Hindi, into our system. After integrating the third class into our code, we obtained the results below.
Multi-class KNN, K = 3, # Predictors = 2500, 47.3% Accuracy
Multi-Language Classifier: Image
As the confusion matrix above shows, adding Hindi and following the same procedure gave poor results. Classification accuracy for Mandarin and English both dropped to between 50% and 60%, but the more troubling result was Hindi, whose classification accuracy was 24%, below the roughly 33% expected from random guessing among three equally likely classes. From this point, our goal was to improve the classifier so that accuracy was comparable across all three languages. After adding extracted pitch analysis features (discussed in Data) and adjusting the training parameters, we obtained the following results.
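To make the comparison with random guessing concrete, the sketch below reads per-language accuracy off a confusion matrix and flags any class that falls below the chance level. The counts are hypothetical and chosen only for illustration; they are not the project's data. The helper names `per_class_accuracy` and `overall_accuracy` are also assumptions, and the chance level of 1/3 assumes the three classes are equally represented.

```python
def per_class_accuracy(confusion, labels):
    """Return {label: fraction of that true class predicted correctly}.

    confusion[i][j] is the number of clips whose true class is labels[i]
    and whose predicted class is labels[j].
    """
    result = {}
    for i, label in enumerate(labels):
        row_total = sum(confusion[i])
        result[label] = confusion[i][i] / row_total if row_total else 0.0
    return result


def overall_accuracy(confusion):
    """Fraction of all clips that lie on the diagonal of the matrix."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


labels = ["English", "Mandarin", "Hindi"]

# Hypothetical counts for illustration only (rows = true, columns = predicted).
confusion = [
    [55, 25, 20],
    [30, 58, 12],
    [40, 36, 24],
]

chance = 1 / len(labels)  # assumes equally represented classes
rates = per_class_accuracy(confusion, labels)
below_chance = [name for name, rate in rates.items() if rate < chance]

for name, rate in rates.items():
    print(f"{name}: {rate:.1%}")
print(f"overall: {overall_accuracy(confusion):.1%}, chance: {chance:.1%}")
print("below chance:", below_chance)
```

A class whose rate falls below `chance` is being confused with the other languages more often than a uniform guess would be, which is the situation described for Hindi above.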
KNN:
Multi-class KNN, K = 10, # Predictors = 3000, 43.33% Accuracy
Multi-Language Classifier: Image
Multi-class KNN, K = 10, No ReliefF, 43.67% Accuracy
Multi-Language Classifier: Image
Compared with our initial multi-class results, Hindi classification accuracy improved notably. With ReliefF feature selection, all three languages have comparable accuracies, rather than Hindi lagging significantly behind (see Discussion).
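The idea of ranking predictors before running KNN can be sketched in a few lines. The example below is not the project's implementation: it uses basic Relief (one nearest hit and one nearest miss per sample), a simplified relative of ReliefF, on a tiny made-up dataset whose first feature separates the classes and whose second feature is noise. The function names, the data, and the assumption that features share a comparable scale are all illustrative.

```python
import math
from collections import Counter


def relief_weights(xs, ys):
    """Basic Relief: reward features that differ at the nearest miss
    (other class) and penalize features that differ at the nearest hit
    (same class). Assumes features are on comparable scales."""
    n_features = len(xs[0])
    weights = [0.0] * n_features
    for i, (x, y) in enumerate(zip(xs, ys)):
        hits = [p for j, p in enumerate(xs) if j != i and ys[j] == y]
        misses = [p for j, p in enumerate(xs) if ys[j] != y]
        hit = min(hits, key=lambda p: math.dist(p, x))
        miss = min(misses, key=lambda p: math.dist(p, x))
        for f in range(n_features):
            weights[f] += abs(x[f] - miss[f]) - abs(x[f] - hit[f])
    return [w / len(xs) for w in weights]


def keep_features(x, indices):
    """Project one sample onto the selected feature indices."""
    return tuple(x[f] for f in indices)


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k nearest training points (Euclidean)."""
    nearest = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in nearest[:k])
    return votes.most_common(1)[0][0]


# Made-up data: feature 0 is informative, feature 1 is noise.
xs = [
    (0.0, 0.5), (0.1, 0.9), (0.2, 0.1),   # English
    (1.0, 0.4), (0.9, 0.8), (1.1, 0.2),   # Mandarin
    (2.0, 0.6), (2.1, 0.0), (1.9, 1.0),   # Hindi
]
ys = ["English"] * 3 + ["Mandarin"] * 3 + ["Hindi"] * 3

weights = relief_weights(xs, ys)
# Keep the single highest-weighted feature.
top = sorted(range(len(weights)), key=lambda f: weights[f], reverse=True)[:1]

train_selected = [keep_features(x, top) for x in xs]
query = keep_features((1.05, 0.9), top)
prediction = knn_predict(train_selected, ys, query, k=3)
print(weights, top, prediction)
```

Dropping low-weight features before the distance computation keeps noisy predictors from dominating the neighbor search, which is one plausible reason a ranked subset can even out per-class accuracy.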
Other Classifiers:
Multi-class Error-Correcting Output Codes (SVM), 35% Accuracy
Multi-Language Classifier: Image
Multi-class Binary Classification Tree, 36% Accuracy
Multi-Language Classifier: Image
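For readers unfamiliar with error-correcting output codes, the sketch below shows how a three-class decision can be assembled from binary learners. It assumes a one-vs-one coding design and hinge-loss decoding, which is one common configuration and not necessarily the one used for the results above. The binary scores are hypothetical stand-ins for SVM outputs, and `ecoc_decode` is an illustrative name.

```python
LABELS = ["English", "Mandarin", "Hindi"]

# One-vs-one coding matrix: one row per class, one column per binary learner.
# +1 = positive class for that learner, -1 = negative class, 0 = not used.
# Learner 0: English vs Mandarin, learner 1: English vs Hindi,
# learner 2: Mandarin vs Hindi.
CODING = [
    [+1, +1, 0],   # English
    [-1, 0, +1],   # Mandarin
    [0, -1, -1],   # Hindi
]


def ecoc_decode(scores):
    """Return the class whose codeword best agrees with the binary scores.

    Agreement is measured by hinge loss, averaged over the learners that
    actually involve the class (nonzero codeword entries).
    """
    best_label, best_loss = None, float("inf")
    for label, codeword in zip(LABELS, CODING):
        used = [(c, s) for c, s in zip(codeword, scores) if c != 0]
        loss = sum(max(0.0, 1.0 - c * s) for c, s in used) / len(used)
        if loss < best_loss:
            best_label, best_loss = label, loss
    return best_label


# Hypothetical binary learner scores for two clips.
print(ecoc_decode([-0.8, 0.3, 1.2]))
print(ecoc_decode([0.9, -1.1, -0.7]))
```

Each binary learner only has to separate two languages, so the difficulty of the three-class problem is pushed into the decoding step that reconciles their votes.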