Acta Automatica Sinica (自动化学报), 2012
An On-line Density-based Clustering Algorithm for Spatial Data Stream
Abstract:
We propose OLDStream (On-line density-based clustering algorithm for spatial data stream), an efficient online density-based clustering algorithm designed to discover clusters in a spatial data stream as the data arrive. When a new spatial point arrives, OLDStream updates the clustering by processing only that point and those of its neighboring points that satisfy the core-point condition, so the overall clustering result can be accessed at any moment. The algorithm has several advantages: it scales to online processing of large, incrementally growing spatial data sets; it discovers clusters of arbitrary shape and reports the overall clustering instantaneously; it is insensitive to the order in which the points of the stream arrive; and it detects all isolated points. We evaluated the effectiveness, efficiency, and scalability of the algorithm on real data and on large synthetic data sets generated with Matlab and with Thomas Brinkhoff's network-based generator. The experimental results show that the algorithm clusters new points quickly and efficiently on the basis of the previously processed points. In the reported statistics, only 4% of the points incur the worst-case running time, and the average running time is about 0.033 ms per point.
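
The idea of touching only the new point and its nearby core points on each update can be illustrated with a minimal Python sketch. It is not the published OLDStream implementation: it is an insertion-only, DBSCAN-style approximation for 2-D points, and the class name, the parameters `eps` and `min_pts`, the uniform grid index, and the union-find bookkeeping are all assumptions made for illustration.

```python
import math
from collections import defaultdict


class IncrementalDensityClusterer:
    """Hypothetical insertion-only density-based clusterer (illustrative only)."""

    def __init__(self, eps, min_pts):
        self.eps = eps                  # neighborhood radius (assumed parameter)
        self.min_pts = min_pts          # density threshold, counting the point itself
        self.points = []                # inserted points, indexed by arrival order
        self.neighbors = []             # neighbors[i]: indices within eps of point i
        self.parent = []                # union-find forest over point indices
        self.grid = defaultdict(list)   # grid cell -> indices of points in that cell

    def _cell(self, p):
        return (math.floor(p[0] / self.eps), math.floor(p[1] / self.eps))

    def _find(self, i):
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]  # path halving
            i = self.parent[i]
        return i

    def _union(self, i, j):
        ri, rj = self._find(i), self._find(j)
        if ri != rj:
            self.parent[max(ri, rj)] = min(ri, rj)  # smallest index is the root

    def _is_core(self, i):
        return len(self.neighbors[i]) + 1 >= self.min_pts

    def insert(self, p):
        """Add one point; only p and its eps-neighbors are examined."""
        idx = len(self.points)
        cx, cy = self._cell(p)
        # Cells have side eps, so the 3x3 block around p covers its eps-ball.
        near = [j
                for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                for j in self.grid.get((cx + dx, cy + dy), [])
                if math.dist(p, self.points[j]) <= self.eps]
        self.points.append(p)
        self.parent.append(idx)
        self.neighbors.append(near)
        self.grid[(cx, cy)].append(idx)

        changed = []                    # points whose core status may add new links
        for j in near:
            was_core = self._is_core(j)
            self.neighbors[j].append(idx)
            if not was_core and self._is_core(j):
                changed.append(j)
        if self._is_core(idx):
            changed.append(idx)

        # Link each new core point with the core points in its neighborhood.
        for q in changed:
            for r in self.neighbors[q]:
                if self._is_core(r):
                    self._union(q, r)
        return idx

    def labels(self):
        """Cluster label per point (root index), or -1 for isolated points."""
        out = []
        for i in range(len(self.points)):
            if self._is_core(i):
                out.append(self._find(i))
            else:
                core = [r for r in self.neighbors[i] if self._is_core(r)]
                out.append(self._find(core[0]) if core else -1)
        return out


# Two small groups and one isolated point, fed one at a time.
stream = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11), (50, 50)]
clusterer = IncrementalDensityClusterer(eps=1.5, min_pts=3)
for point in stream:
    clusterer.insert(point)
print(clusterer.labels())  # [0, 0, 0, 3, 3, 3, -1]
```

In this sketch the current labels can be read after every insertion, and because core points are merged with union-find, the grouping of core points does not depend on arrival order. The assignment of a border point that lies near two clusters may still depend on order, which is a limitation of the sketch and says nothing about OLDStream itself.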