A Design Methodology for Distributed Adaptive Stream Mining Systems
Data-driven, adaptive computations are key to enabling the deployment of accurate and efficient stream mining systems, which invoke suitably configured queries in real time on streams of input data. Due to the physical separation among data sources and computational resources, it is often necessary to deploy such stream mining systems in a distributed fashion, where local learners have access to disjoint subsets of the data to be mined and forward their intermediate results to an ensemble learner that combines them. In this paper, we develop a design methodology for integrated design, simulation, and implementation of dynamic data-driven adaptive stream mining systems. By systematically integrating considerations associated with local embedded processing, classifier configuration, data-driven adaptation, and networked communication, our approach allows for effective assessment, prototyping, and implementation of alternative distributed design methods for data-driven, adaptive stream mining systems. We demonstrate our results on a dynamic data-driven application involving patient health care monitoring.