All experiments in this thesis are ten-fold cross-validations. The predictive performance of the experiments is measured using the Area Under Curve (AUC) metric, which is described below. All times are reported in seconds, and details about timing measurements may be found in Section 5.1.5.
Before we describe AUC scores, we must first describe Receiver
Operating Characteristic (ROC) curves [5]. We will
use the ROC and AUC description we published in
[20]. To construct an ROC curve, the
dataset rows are sorted according to the probability a row is in the
positive class under the learned logistic model. Starting at the
graph origin, we examine the most probable row. If that row is
positive, we move up. If it is negative, we move right. In either
case we move one unit. This is repeated for the remaining rows, in
decreasing order of probability. Every point $(x, y)$
on an ROC curve represents the learner's ``favorite'' $x + y$
rows from the
dataset. Out of these favorite rows, $y$
are actually positive, and $x$
are negative.
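
The construction above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used for the experiments: the function name `roc_path` and the sample scores and outcomes are hypothetical, and ties between equal probabilities are not treated specially. Each point is recorded as (x, y), where x counts the negative rows and y the positive rows examined so far.

```python
def roc_path(scores, labels):
    """Return the ROC path as a list of (x, y) points.

    scores: predicted probabilities of the positive class.
    labels: actual outcomes, 1 for positive and 0 for negative.
    """
    # Visit rows in decreasing order of predicted probability.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    x, y = 0, 0
    path = [(x, y)]  # every curve starts at the graph origin
    for i in order:
        if labels[i] == 1:
            y += 1  # positive row: move up one unit
        else:
            x += 1  # negative row: move right one unit
        path.append((x, y))
    return path


# Hypothetical scores and outcomes, for illustration only.
scores = [0.35, 0.92, 0.60, 0.15, 0.78]
labels = [0, 1, 0, 0, 1]
print(roc_path(scores, labels))
```

The path always contains one more point than there are rows, because exactly one unit step is taken per row.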
Figure 5.2 shows an example ROC curve. Six predictions are made, ranging from 0.89 down to 0.17, and are listed in the first column of the table in the lower-left of the graph. The actual outcomes are listed in the second column. The row with the highest prediction, 0.89, belongs to the positive class. Therefore we move up from the origin, as written in the third column and shown by the dotted line in the graph moving from (0,0) to (0,1). The second favorite row was positive, and the dotted line moves up again to (0,2). The third row, however, was negative and the dotted line moves to the right one unit to (1,2). This continues until all six predictions have been examined.
Suppose a dataset had $P$
positive rows and $N$
negative rows. A
perfect learner on this dataset would have an ROC curve starting at
the origin, moving straight up to $(0, P)$,
and then straight right to
end at $(N, P)$.
The solid line in Figure 5.2 illustrates
the path of a perfect learner in our example with six predictions.
Random guessing would produce, on average, an ROC curve which
started at the origin and moved directly to the termination point
$(N, P)$. Note that all ROC curves will start at the origin and
end at $(N, P)$
because $P + N$
steps up or right must be taken, one for
each row.
As a summary of an ROC curve, we measure the area under the curve relative to the area under a perfect learner's curve. The result is denoted AUC. A perfect learner has an AUC of 1.0, while random guessing produces an AUC of 0.5. In the example shown in Figure 5.2, the dotted line representing the real learner encloses an area of 5. The solid line for the perfect learner has an area of 8. Therefore the AUC for the real learner in our example is 5/8.
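
The area computation can be checked with a short sketch. The helper name `roc_areas` is hypothetical. The outcome ordering below is inferred from the text rather than read from Figure 5.2: the first two rows are positive, the third is negative, and the remaining three rows are ordered so that the enclosed area is 5 against a perfect area of 8. Each step to the right adds a column whose height is the number of positive rows already seen.

```python
def roc_areas(ranked_labels):
    """Return (area under the ROC curve, area under the perfect curve).

    ranked_labels: actual outcomes (1 positive, 0 negative), already
    sorted by decreasing predicted probability.
    """
    positives = sum(ranked_labels)
    negatives = len(ranked_labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("AUC needs at least one row of each class")
    area = 0
    height = 0  # positive rows seen so far, i.e. the current curve height
    for label in ranked_labels:
        if label == 1:
            height += 1  # move up: no area is added
        else:
            area += height  # move right: add a column of the current height
    # A perfect learner's curve encloses the full rectangle.
    return area, positives * negatives


# Outcome ordering inferred from the text, not taken from the figure.
ranked = [1, 1, 0, 1, 0, 1]
area, perfect = roc_areas(ranked)
print(area, perfect, area / perfect)
```

Dividing the first value by the second gives the AUC as defined above.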
Whereas metrics such as precision and recall measure true positives and negatives, the AUC measures the ability of the classifier to correctly rank test points. Ranking is central to many data mining tasks, where we often want to discover the most interesting galaxies, or the most promising drugs, or the products most likely to fail. When presenting results between several classification algorithms, we will compute confidence intervals on AUC scores. For this we compute one AUC score for each fold of our 10-fold cross-validation, and report the mean and a 95% confidence interval using Student's t distribution.
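
One way such an interval might be computed is sketched below. The text does not give the exact formula, so this assumes the standard two-sided interval, mean plus or minus t times the standard error, with 9 degrees of freedom for 10 folds. The function name and the per-fold AUC values are hypothetical, and the critical value is hardcoded so that the sketch needs only the standard library.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t distribution with
# 9 degrees of freedom (10 folds minus 1), rounded to three decimals.
T_CRIT_95_DF9 = 2.262


def auc_confidence_interval(fold_aucs):
    """Return (mean, lower, upper) for ten per-fold AUC scores.

    Assumes the usual interval: mean +/- t * s / sqrt(n).
    """
    n = len(fold_aucs)
    if n != 10:
        raise ValueError("the hardcoded critical value assumes 10 folds")
    mean = statistics.mean(fold_aucs)
    # statistics.stdev is the sample standard deviation (n - 1 denominator).
    std_err = statistics.stdev(fold_aucs) / math.sqrt(n)
    half_width = T_CRIT_95_DF9 * std_err
    return mean, mean - half_width, mean + half_width


# Hypothetical per-fold AUC scores, for illustration only.
fold_aucs = [0.91, 0.93, 0.90, 0.94, 0.92, 0.89, 0.93, 0.92, 0.91, 0.95]
mean, lower, upper = auc_confidence_interval(fold_aucs)
print(f"mean AUC {mean:.3f}, 95% CI [{lower:.3f}, {upper:.3f}]")
```

For a different number of folds, the critical value would need to be replaced by the one for the matching degrees of freedom.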