Next: B.4.7 super
Up: B.4 Learners
Previous: B.4.5 newknn
Contents
B.4.6 oldknn
This is an implementation of the conventional Ball-tree based
k nearest neighbor search.
A ball tree is a binary tree in which each node represents a
set of points, called Points(Node). Given a dataset, the
root node of a ball tree represents the full set of points in
the dataset. A node can be either a leaf node or a
non-leaf node.
A leaf node explicitly contains a
list of the points represented by the node.
A non-leaf node
does not explicitly contain a set of points. It has two children
nodes: Node.child1 and Node.child2, where

and
Points are organized spatially so that balls lower down the tree cover
smaller and smaller volumes. This is achieved by insisting, at tree
construction time, that
This gives the ability to bound distance from a target point t
to any point in any ball tree node. If x
Points(Node)
then we can be sure that:
The process of oldknn is described as below:
- Step 1. Build a ball tree on the training dataset.
- Step 2. Search the ball tree (Depth first recursive search) to find the
k nearest neighbors of t. Some nodes can be pruned according to
the above inequalities.
| Keyword |
Arg Type |
Arg Vals |
Default |
| Common |
| k |
int |
[0, ) |
9 |
| rmin |
int |
[1, ) |
5 |
| mbw |
float |
[1e-7, ) |
0.01 |
|
|
|
|
Common keywords and arguments:
- k int: number of nearest neighbor you are
interested.
- rmin int: the maximum number of points that
will be allowed in a tree node. Sometimes it saves memory to use
rmin
1. It even seems to speed things up, possibly because of
L1/L2 cache effects.
- mbw float: "min_ball_width" = a node will not
be split if it finds that all its points are within distance "mbw" of
the node centroid. It will not be split (and will hence be a leaf node).
Next: B.4.7 super
Up: B.4 Learners
Previous: B.4.5 newknn
Contents
Copyright 2004 Paul Komarek, komarek@cmu.edu