EECS 351 Language Classification
Classification
Each language has its own tonalities, grammatical structures, and phonetics. Writing an algorithm to distinguish languages by hand would require a detailed understanding of each one, which is a complex and difficult task. Machine learning offers a better approach here because it can identify trends and patterns that are not readily apparent to humans. In our analysis, we tried several ML classifiers in MATLAB to group and classify the audio files by language.
For our project, we initially tested binary classification using several MATLAB classifiers to compare their performance, and then selected one classifier to build upon and improve. The three classifiers we tested were K-Nearest Neighbors, Support Vector Machine, and binary classification tree. For this stage of the project, we used the 14 MFCC features of each time window. When implementing our K-Nearest Neighbors classifier, we also used MATLAB’s relieff function to identify the features most beneficial for training. Trimming our matrix of features with relieff allowed us to improve our model accuracy while also reducing model size and computation time. In the later stage of our project, we used relieff in the same way, but passed in a combined matrix of the 14 MFCCs over time and 3 additional pitch features.
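The feature trimming described above can be sketched roughly as follows. This is a minimal illustration and not our exact script: the synthetic data, the label names langA and langB, the neighbor count of 10, and the choice to keep 8 features are all assumptions made for the example.

```matlab
% Hypothetical stand-in for the real data: 200 time windows x 14 MFCCs,
% with two made-up language labels (langA, langB).
rng(0);                                   % reproducible random data
numWindows = 200;
mfcctrain = randn(numWindows, 14);
trainLabels = categorical([repmat("langA", numWindows/2, 1); ...
                           repmat("langB", numWindows/2, 1)]);

% Shift one feature for langB so that at least one column is informative.
isB = (trainLabels == "langB");
mfcctrain(isB, 3) = mfcctrain(isB, 3) + 2;

% Rank the features with ReliefF. The neighbor count (10) is an assumed value.
% Categorical labels make relieff treat this as a classification problem.
[rankedIdx, weights] = relieff(mfcctrain, trainLabels, 10);

% Keep only the highest-ranked features (numKeep is an assumed cutoff).
numKeep = 8;
keepIdx = rankedIdx(1:numKeep);
mfcctrainTrimmed = mfcctrain(:, keepIdx);
```

The same keepIdx would need to be applied to any test features so that training and test matrices keep matching columns.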


After trimming our feature matrix, we used it to train the classifiers of interest. We then compared the accuracy of the trained classifiers to decide which one performed best for our project.
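One way such an accuracy comparison could be set up is sketched below. The synthetic data, the 25% hold-out split, and the value of 5 neighbors are assumptions for illustration; the split and settings used in our project may differ.

```matlab
% Hypothetical synthetic features and labels standing in for the real data.
rng(1);
X = [randn(150, 14); randn(150, 14) + 0.75];
y = categorical([repmat("langA", 150, 1); repmat("langB", 150, 1)]);

% Hold out 25% of the windows for testing (assumed split ratio).
cv = cvpartition(y, "HoldOut", 0.25);
Xtrain = X(training(cv), :);  ytrain = y(training(cv));
Xtest  = X(test(cv), :);      ytest  = y(test(cv));

% Train the three classifiers on the same training set.
models = {fitcknn(Xtrain, ytrain, "NumNeighbors", 5), ...
          fitcsvm(Xtrain, ytrain), ...
          fitctree(Xtrain, ytrain)};
names = ["KNN", "SVM", "Tree"];

% Accuracy = fraction of held-out windows labeled correctly.
accuracy = zeros(1, numel(models));
for i = 1:numel(models)
    predicted = predict(models{i}, Xtest);
    accuracy(i) = mean(predicted == ytest);
    fprintf("%s accuracy: %.3f\n", names(i), accuracy(i));
end
```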
fitcknn
fitcknn trains a k-nearest neighbor classification model, which labels a new observation by a vote among the k closest training observations. An example of how we used this function is shown below:
Mdl = fitcknn(mfcctrain, trainLabels, "NumNeighbors", Numneighbors);
mfcctrain = a matrix of features, with one row per time window and one column per feature
trainLabels = a vector of language labels, one per row of mfcctrain
Numneighbors = the number of nearest neighbors used to classify each observation
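A value for Numneighbors could be picked by cross-validation, as in the rough sketch below. The candidate values, the 5-fold setting, and the synthetic data are assumptions for illustration and are not necessarily how the value was chosen in our project.

```matlab
% Hypothetical synthetic features and labels standing in for the real data.
rng(2);
mfcctrain = [randn(120, 14); randn(120, 14) + 0.75];
trainLabels = categorical([repmat("langA", 120, 1); repmat("langB", 120, 1)]);

% Candidate neighbor counts to try (assumed values).
candidateK = [1 3 5 9 15];
cvError = zeros(size(candidateK));

for i = 1:numel(candidateK)
    Mdl = fitcknn(mfcctrain, trainLabels, "NumNeighbors", candidateK(i));
    cvMdl = crossval(Mdl, "KFold", 5);     % 5-fold cross-validation (assumed)
    cvError(i) = kfoldLoss(cvMdl);         % misclassification rate across folds
end

% Pick the candidate with the lowest cross-validated error.
[~, best] = min(cvError);
Numneighbors = candidateK(best);
fprintf("Selected NumNeighbors = %d\n", Numneighbors);
```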
fitcsvm
fitcsvm returns a support vector machine (SVM) classifier for one-class or binary classification. An example of how we used fitcsvm is provided below:
Mdl = fitcsvm(mfcctrain, trainLabels);
The two arguments are the same as those described for fitcknn.
fitctree
fitctree returns a fitted binary decision tree for classification. An example of how we used fitctree is provided below:
Mdl = fitctree(mfcctrain, trainLabels);
The two arguments are the same as those described for fitcknn.

