next up previous contents
Next: B.4.2 dtree Up: B.4 Learners Previous: B.4 Learners   Contents


B.4.1 bc

This is an implementation of a Naive Bayesian Classifier [28,5] with binary-valued inputs attributes (all input values must be zero or one). The implementation has been optimized for speed and accuracy on very high dimensional sparse data.


$\displaystyle P(y =$   ACT$\displaystyle \vert x_1,x_2 \ldots x_m)$ $\displaystyle =$   (B.1)
$\displaystyle \frac{
P(x_1,x_2 \ldots x_m \vert y = \mbox{{\sc Act}}) P(y = \mb...
...x_2 \ldots x_m \vert y = \mbox{{\sc Not\_Act}}) (1 - P(y = \mbox{{\sc Act}}))
}$ $\displaystyle =$   (B.2)
$\displaystyle \frac{\theta_{\mbox{\scriptsize {\sc Act}}}p_{\mbox{\scriptsize {...
...heta_{\mbox{\scriptsize {\sc Not\_Act}}}(1 - p_{\mbox{\scriptsize {\sc Act}}})}$     (B.3)

where

$\displaystyle \theta_{\mbox{\scriptsize c}} = \prod_{k=1}^m P(x_k \vert y = c)$ (B.4)

under the (dubious) assumption that the input attributes $ x_1 \ldots x_m$ are conditionally independent of each other given a known activity level, and where

$\displaystyle p_{\mbox{\scriptsize {\sc Act}}}= P(Y = \mbox{{\sc Act}})$ (B.5)

Learning is trivial. From the dataset, $ p_{\mbox{\scriptsize {\sc Act}}}$ is simply estimated as the fraction of records in which $ y_i =$   ACT. And $ P(x_k = v \vert y = c)$ is estimated as

$\displaystyle \frac{ \mbox{Number of records with $x_i = v$ and $y_i = \mbox{{\sc Act}}$} }{ \mbox{Number of records with $y_i = \mbox{{\sc Act}}$} }$ (B.6)

Because of the vast numbers of attributes and because of the need to exploit sparseness, a number of computational tricks are used, including a method to permit all probabilities to be evaluated in log space, and a subtraction trick to avoid needing to ever explicitly iterate over elements of a record in who have a value of zero.

BC needs no keywords or arguments.


next up previous contents
Next: B.4.2 dtree Up: B.4 Learners Previous: B.4 Learners   Contents
Copyright 2004 Paul Komarek, komarek@cmu.edu