[Figure: figures/perf_time_corr_all.ps]
Figure 6.5 examines the impact of attribute coupling on classification performance. We finally see KNN performing somewhat better as the coupling parameter approaches 0.45. We also see the LR classifiers doing slightly worse as we approach this extreme level of coupling. One of the three datasets in Minka [27] deliberately introduced correlations among the attributes. The author found that Newton's method performed better than CG on this dataset.
Figure 6.6 is a magnification of the upper right corner of the AUC plot from Figure 6.5. We can see that LR performance is degrading with increased coupling, and CG-MLE degrades more than the two IRLS methods. Our results seem less extreme than Minka's, perhaps because we are measuring AUC instead of likelihood. Minka concludes that one should decorrelate data before running LR. Decorrelation may be unreasonable for many high-dimensional datasets, and it does not seem worthwhile if AUC is the focus.
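To make the decorrelation step concrete, the following is a minimal sketch of one common way to decorrelate attributes before fitting LR, namely PCA whitening. It is not taken from Minka [27] or from our experiments: the function name `decorrelate`, the `eps` regularizer, and the small synthetic dataset are assumptions made purely for illustration.

```python
import numpy as np

def decorrelate(X, eps=1e-8):
    """PCA-whiten X (rows are records, columns are attributes).

    Hypothetical helper for illustration; eps guards against division
    by near-zero eigenvalues.
    """
    Xc = X - X.mean(axis=0)                 # center each attribute
    cov = np.cov(Xc, rowvar=False)          # dense d x d covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # symmetric eigendecomposition
    # Rotate onto the principal axes, then rescale each axis to unit variance.
    return Xc @ eigvecs / np.sqrt(eigvals + eps)

# Illustrative data: two strongly coupled attributes plus one independent one.
rng = np.random.default_rng(0)
z = rng.normal(size=(1000, 1))
X = np.hstack([
    z + 0.1 * rng.normal(size=(1000, 1)),
    z + 0.1 * rng.normal(size=(1000, 1)),
    rng.normal(size=(1000, 1)),
])
W = decorrelate(X)

# After whitening, the sample covariance is approximately the identity.
print(np.round(np.cov(W, rowvar=False), 3))
```

Note that this sketch forms a dense d x d covariance matrix and computes its eigendecomposition, and the whitened data is dense even when the input is sparse. That cost is one reason decorrelation may be unreasonable for high-dimensional datasets.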
The time graph on the right of Figure 6.5 suggests that KNN and LR speed may increase as the level of coupling increases. The increase appears to be at most two-fold for LR and five- to ten-fold for KNN. The other algorithms seem virtually unaffected by the increase in coupling.