Composite rough sets for dynamic data mining


As a soft computing tool, rough set theory has become a popular mathematical framework for pattern recognition, data mining and knowledge discovery. A given rough set model, however, handles attributes of only one type, through one specific binary relation, whereas information systems in real-life applications may contain attributes of several different types. Such information systems are called composite information systems in this paper. A composite relation is proposed to process attributes of multiple types simultaneously in composite information systems. Then, an extended rough set model, called composite rough sets, is presented. We also redefine the lower and upper approximations and the positive, boundary and negative regions in composite rough sets. By introducing the concepts of the relation matrix, the decision matrix and the basic matrix, we propose matrix-based methods for computing the approximations and the positive, boundary and negative regions in composite information systems, which are crucial for feature selection and knowledge discovery. Moreover, by combining these methods with the incremental learning technique, we propose a novel matrix-based method for fast updating of approximations in dynamic composite information systems. Extensive experiments on data sets from UCI and on user-defined data sets show that the proposed incremental method can process large data sets efficiently.
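The matrix-based idea can be illustrated with a minimal sketch. It is not the method of the paper: the abstract does not give the formal definitions, so the sketch assumes that two objects are related under the composite relation when they are related on every attribute under that attribute's own relation, and it uses the standard rough set definitions of the approximations for a single target set. All names (`related`, `relation_matrix`, `approximations`, `add_object`), the toy data and the tolerance of 0.5 are hypothetical. The last function only extends the relation matrix by one row and one column when an object arrives, instead of rebuilding it; the approximations are then recomputed in full.

```python
from typing import Callable, List, Sequence, Set, Tuple

Relation = Callable[[object, object], bool]
Row = Sequence[object]


def related(a: Row, b: Row, relations: Sequence[Relation]) -> int:
    """Return 1 if a and b are related on every attribute, else 0.

    Assumption: the composite relation is the conjunction of the
    per-attribute relations.
    """
    return int(all(rel(x, y) for rel, x, y in zip(relations, a, b)))


def relation_matrix(rows: Sequence[Row], relations: Sequence[Relation]) -> List[List[int]]:
    """Build the 0/1 matrix M with M[i][j] = 1 iff object i is related to object j."""
    return [[related(a, b, relations) for b in rows] for a in rows]


def approximations(matrix: List[List[int]], target: Set[int]) -> Tuple[Set[int], Set[int]]:
    """Lower and upper approximations of `target` (a set of object indices)."""
    g = [int(j in target) for j in range(len(matrix))]  # characteristic vector
    lower, upper = set(), set()
    for i, row in enumerate(matrix):
        overlap = sum(m * x for m, x in zip(row, g))  # i-th entry of M * g
        if overlap == sum(row):  # every object related to i lies in target
            lower.add(i)
        if overlap > 0:          # at least one object related to i lies in target
            upper.add(i)
    return lower, upper


def add_object(matrix: List[List[int]], rows: List[Row],
               relations: Sequence[Relation], new_row: Row) -> None:
    """Extend the matrix in place by one column and one row for `new_row`."""
    for i, old in enumerate(rows):
        matrix[i].append(related(old, new_row, relations))
    rows.append(new_row)
    matrix.append([related(new_row, r, relations) for r in rows])


# Toy composite data: one categorical attribute, one numerical attribute.
rows = [("red", 1.0), ("red", 1.2), ("blue", 1.1), ("red", 3.0), ("red", 1.4)]
relations = [
    lambda a, b: a == b,              # equivalence on the categorical attribute
    lambda a, b: abs(a - b) <= 0.5,   # tolerance on the numerical attribute
]

M = relation_matrix(rows, relations)
X = {0, 1, 2}
lower, upper = approximations(M, X)
boundary = upper - lower
negative = set(range(len(rows))) - upper

# A new object arrives: update the matrix, then recompute the approximations.
add_object(M, rows, relations, ("blue", 1.3))
lower_after, upper_after = approximations(M, X)
```

In this toy example the new object is related to object 2, so object 2 leaves the lower approximation of `X` after the update, which is the kind of change an incremental method has to track.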

Information Sciences