Hello everyone

I have implemented a kd-tree search algorithm in Mata that finds the k nearest neighbours of a p-dimensional point among a set of points. For large data sets, this can be much faster than a 'brute force' search, which compares every query point with every data point, so it could be useful for researchers doing spatial analysis.

The code is available from my GitHub repository. Simply download and run the file mata_knn.do; this will initialize all the Mata functions. Example usage:
Code:
version 15.1
mata: mata clear
mata: mata set matastrict on
run mata_knn.do
mata:
    N = 10000
    k = 5
    query_coords = runiform(N,2)
    data_coords = runiform(N,2)
    // argument order assumed: query points first, then data points
    knn(query_coords, data_coords, k, kni=., knd=.)
end
The matrices kni and knd contain, for each query point, the indices of and the distances to its k nearest data points. The query points and the data points can be the same set, in which case the first nearest neighbour of each point is always the point itself. Duplicate points in data_coords are not allowed and will throw an error.
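
To sanity-check results on a small sample, one could compare them against a plain brute-force search. The following is only a minimal sketch and is not part of mata_knn.do: knn_brute() is a hypothetical helper, and it assumes Euclidean distance and neighbours ordered from nearest to farthest, which may or may not match what knn() returns.

```mata
version 15.1
mata:
// Hypothetical brute-force k-nearest-neighbour search (for cross-checking only).
// idx and dist are overwritten with rows(query) x k result matrices.
void knn_brute(real matrix query, real matrix data, real scalar k,
               real matrix idx, real matrix dist)
{
    real scalar    i, nq
    real colvector d, ord

    nq   = rows(query)
    idx  = J(nq, k, .)
    dist = J(nq, k, .)
    for (i = 1; i <= nq; i++) {
        // Euclidean distance from query point i to every data point
        d   = sqrt(rowsum((data :- query[i, .]):^2))
        // permutation that sorts the distances in ascending order
        ord = order(d, 1)
        idx[i, .]  = ord[(1..k), 1]'
        dist[i, .] = d[ord[(1..k), 1], 1]'
    }
}

// Small example: 200 random points in the unit square
query_coords = runiform(200, 2)
data_coords  = runiform(200, 2)
knn_brute(query_coords, data_coords, 5, bi = ., bd = .)
end
```

Each query point costs a full pass over the data here, so this is only practical for small samples. If the layout assumptions above hold, bi and bd could be compared with kni and knd for the same inputs.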

So far, I have thoroughly tested it only with 2-dimensional points. If you find this useful, or if you find any bugs, kindly let me know! I am also considering uploading it to the SSC archive, but have not found the time to do so yet.

Best
Robert