Machine Learning with Swift

Calculating the confusion matrix

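We'll use a straightforward approach to calculate the confusion matrix. Note that it is hard-coded for two classes, so it would not work for multiclass classification. Here, p stands for the predicted value and t for the ground truth; the matrix is indexed as confusionMatrix[t][p], so rows correspond to the ground truth and columns to the prediction: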

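// Pair each prediction with its ground-truth label.
let pairs: [(Int, Int)] = zip(predictions, yVecTest).map{ ($0.0, $0.1) }
// Indexed as confusionMatrix[truth][prediction].
var confusionMatrix = [[0,0], [0,0]]
for (p, t) in pairs {
    switch (p, t) {
    case (0, 0): // predicted 0, truth 0
        confusionMatrix[0][0] += 1
    case (0, _): // predicted 0, truth 1
        confusionMatrix[1][0] += 1
    case (_, 0): // predicted 1, truth 0
        confusionMatrix[0][1] += 1
    case (_, _): // predicted 1, truth 1
        confusionMatrix[1][1] += 1
    }
}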
     
let totalCount = Double(yVecTest.count) 
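
The switch statement above only handles two classes. As a minimal sketch of one way to generalize it, assuming the labels are integers in 0..<classCount, the counts can be accumulated by indexing the matrix directly. The function name confusionMatrix(predictions:truth:classCount:) and the toy data are hypothetical and not part of the book's code:

```swift
// Hypothetical helper (not from the book): builds a classCount x classCount matrix.
// Rows are indexed by the ground-truth label and columns by the predicted label,
// the same layout as the two-class code above.
func confusionMatrix(predictions: [Int], truth: [Int], classCount: Int) -> [[Int]] {
    precondition(predictions.count == truth.count, "Inputs must have equal length")
    var matrix = [[Int]](repeating: [Int](repeating: 0, count: classCount),
                         count: classCount)
    for (p, t) in zip(predictions, truth) {
        // Assumes every label lies in 0..<classCount; otherwise this traps.
        matrix[t][p] += 1
    }
    return matrix
}

// Made-up three-class data, for illustration only.
let toyPredictions = [0, 1, 2, 2, 1, 0]
let toyTruth       = [0, 1, 1, 2, 1, 2]
let toyMatrix = confusionMatrix(predictions: toyPredictions,
                                truth: toyTruth,
                                classCount: 3)
print(toyMatrix) // [[1, 0, 0], [0, 2, 1], [1, 0, 1]]
```

With classCount set to 2, this sketch should produce the same counts as the switch-based version.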

Normalize the matrix by total count:

let normalizedConfusionMatrix = confusionMatrix.map{$0.map{Double($0)/totalCount}} 

As we already know, accuracy is the number of correct predictions divided by the total number of cases.

To calculate accuracy, try using the following code:

let truePredictionsCount = pairs.filter{ $0.0 == $0.1 }.count 
let accuracy = Double(truePredictionsCount) / totalCount
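
Because correct predictions are exactly the diagonal entries of the confusion matrix, the same accuracy can also be read off the matrix. The following is a rough sketch of that relationship. The helper name diagonalAccuracy(of:) and the example counts are made up for illustration, and returning 0 for an empty matrix is an assumption rather than anything the book specifies:

```swift
// Hypothetical helper (not from the book): accuracy as the sum of the diagonal
// of a square confusion matrix divided by the sum of all its entries.
func diagonalAccuracy(of matrix: [[Int]]) -> Double {
    let total = matrix.joined().reduce(0, +)
    guard total > 0 else { return 0 } // assumption: report 0 for an empty matrix
    var correct = 0
    for i in matrix.indices {
        correct += matrix[i][i] // entries where prediction equals truth
    }
    return Double(correct) / Double(total)
}

// Made-up counts: 8 + 7 correct predictions out of 20.
let exampleMatrix = [[8, 2], [3, 7]]
print(diagonalAccuracy(of: exampleMatrix)) // 0.75
```

Equivalently, accuracy is the sum of the diagonal of the normalized confusion matrix.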

To calculate the true positive, false positive, and false negative counts, you could read the numbers from the confusion matrix, but let's count them directly from the prediction pairs. Note that this code treats class 0 as the positive class:

let truePositive = Double(pairs.filter{ $0.0 == $0.1 && $0.0 == 0 }.count) 
let falsePositive = Double(pairs.filter{ $0.0 != $0.1 && $0.0 == 0 }.count) 
let falseNegative = Double(pairs.filter{ $0.0 != $0.1 && $0.0 == 1 }.count) 

To calculate precision:

let precision = truePositive / (truePositive + falsePositive) 

To calculate recall:

let recall = truePositive / (truePositive + falseNegative) 

To calculate F1-score:

let f1Score = 2 * precision * recall / (precision + recall) 
     
return Metrics(confusionMatrix: confusionMatrix, normalizedConfusionMatrix: normalizedConfusionMatrix, accuracy: accuracy, precision: precision, recall: recall, f1Score: f1Score) 
} 
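
One edge case is worth keeping in mind with the formulas above: if the model never predicts the positive class, truePositive + falsePositive is zero, and dividing two Double zeros yields NaN rather than a crash. The sketch below shows one possible guard. The helper safeRatio and the counts are hypothetical, and returning 0 for an undefined ratio is only one common convention, not something the book prescribes:

```swift
// Hypothetical helper (not from the book): returns 0 instead of NaN
// when the denominator is 0.
func safeRatio(_ numerator: Double, _ denominator: Double) -> Double {
    return denominator == 0 ? 0 : numerator / denominator
}

// Made-up counts: the classifier never predicted the positive class.
let tpCount = 0.0
let fpCount = 0.0
let fnCount = 5.0

// 0 / 0 evaluates to NaN for Double values.
let naivePrecision = tpCount / (tpCount + fpCount)

// Guarded versions fall back to 0 by convention.
let guardedPrecision = safeRatio(tpCount, tpCount + fpCount)
let guardedRecall = safeRatio(tpCount, tpCount + fnCount)
let guardedF1 = safeRatio(2 * guardedPrecision * guardedRecall,
                          guardedPrecision + guardedRecall)

print(naivePrecision.isNaN, guardedPrecision, guardedRecall, guardedF1)
```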

Here is my result for the decision tree on iOS:

Confusion Matrix: 
[[135, 17],  
[20, 128]] 
 
Normalized Confusion Matrix: 
[[0.45000000000000001, 0.056666666666666664],  
[0.066666666666666666, 0.42666666666666669]] 
 
Accuracy: 0.876666666666667 
Precision: 0.870967741935484 
Recall: 0.888157894736842 
F1-score: 0.879478827361563 
 

And for the random forest:

Confusion Matrix: 
[[138, 14],  
[18, 130]] 
 
Normalized Confusion Matrix: 
[[0.46000000000000002, 0.046666666666666669],  
[0.059999999999999998, 0.43333333333333335]] 
 
Accuracy: 0.893333333333333 
Precision: 0.884615384615385 
Recall: 0.907894736842105 
F1-score: 0.896103896103896 
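
As a sanity check, the precision, recall, and F1-score reported above can be re-derived from the confusion matrices alone. This is a minimal sketch, assuming the matrix[truth][prediction] layout and class 0 as the positive class, as in the code earlier in this section. The function name binaryMetrics(from:) is hypothetical:

```swift
// Hypothetical helper (not from the book): derives precision, recall and F1
// from a 2x2 matrix laid out as matrix[truth][prediction], with class 0
// treated as the positive class.
func binaryMetrics(from matrix: [[Int]]) -> (precision: Double, recall: Double, f1: Double) {
    let truePositive = Double(matrix[0][0])
    let falseNegative = Double(matrix[0][1]) // truth 0, predicted 1
    let falsePositive = Double(matrix[1][0]) // truth 1, predicted 0
    let precision = truePositive / (truePositive + falsePositive)
    let recall = truePositive / (truePositive + falseNegative)
    let f1 = 2 * precision * recall / (precision + recall)
    return (precision, recall, f1)
}

// Confusion matrices copied from the results above.
let treeMetrics = binaryMetrics(from: [[135, 17], [20, 128]])
let forestMetrics = binaryMetrics(from: [[138, 14], [18, 130]])
print(treeMetrics)
print(forestMetrics)
```

For the decision tree this gives 135/155 for precision and 135/152 for recall, which agrees with the reported values.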

Congratulations! We've trained two machine learning algorithms, deployed them to iOS, and evaluated their accuracy. Interestingly, while the decision tree metrics match perfectly, the random forest performs slightly worse on Core ML. Always validate your model after any type of conversion.