What Is a Data Mining System?

A data mining system is a system that mines interesting knowledge from large amounts of data stored in databases, data warehouses, or other information repositories. In recent years, to promote the practical application of data mining, researchers have done a great deal of work on the architecture of data mining systems.

Data mining
A well-structured data mining system should have the following characteristics: [1]
A data mining system built on a single database or data warehouse is the more mature kind of data mining application system developed so far. Many commercial data mining applications should
As network technology and distributed database technology have developed and matured, distributed databases have become more and more widely used, and centralized storage and management of data has gradually given way to distributed storage and management. This change in how data is stored inevitably drives changes in data mining technology and in the structure of data mining systems. In practical applications, data security, privacy, and confidentiality requirements, together with limited network bandwidth, make it infeasible to first gather the dispersed data into a single database and then mine it. Distributed data mining has therefore become the most feasible way to mine data held in distributed databases. [1]
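The idea of mining distributed data in place can be sketched as follows. This is a minimal illustration, not a method described in the source: each site computes a small local summary (item counts), and only those summaries are sent to a coordinator that merges them. The function names `mine_local` and `merge_summaries`, the sample transactions, and the choice of frequent-item counting as the mining task are all assumptions made for the example.

```python
from collections import Counter

def mine_local(transactions):
    """Run at each site: count how many local transactions contain each item."""
    counts = Counter()
    for transaction in transactions:
        counts.update(set(transaction))  # count an item once per transaction
    return counts, len(transactions)

def merge_summaries(summaries, min_support):
    """Run at the coordinator: combine per-site counts into global frequent items."""
    total_counts = Counter()
    total_transactions = 0
    for counts, n in summaries:
        total_counts.update(counts)
        total_transactions += n
    if total_transactions == 0:
        return {}
    return {
        item: count / total_transactions
        for item, count in total_counts.items()
        if count / total_transactions >= min_support
    }

# Hypothetical data held at two separate sites; raw rows never leave a site.
site_a = [["milk", "bread"], ["milk", "eggs"], ["bread"]]
site_b = [["milk", "bread"], ["eggs"], ["milk"]]

summaries = [mine_local(site_a), mine_local(site_b)]
frequent = merge_summaries(summaries, min_support=0.5)
print(frequent)
```

Only the per-item counts and the transaction totals cross the network, which is the property that addresses the privacy and bandwidth concerns raised above. Real distributed mining algorithms are considerably more involved than this counting example.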
Current commercial data mining software has further promoted the popularity and development of data mining technology, but many problems remain in practical applications, and many areas still call for continued research and improvement. The main research directions and development trends at present include the following aspects: [1]
(1) Enhance visualization and interactivity. A data mining system with good visualization and interactive functions lets users see and understand intuitively how data mining tasks are defined and executed. This reduces aimless mining by users and the generation of large numbers of irrelevant patterns during mining, and it improves the system's mining efficiency, user satisfaction, and the credibility of the mining results.
(2) Improve scalability. Because the user's application environment is constantly changing, scalability is very important for data mining systems. The system should support the mining of multiple data sources and the scalability of mining algorithms, allowing users to add new algorithms as needed.
(3) Combine with specific industry applications. As application environments have developed, general-purpose data mining systems can no longer meet users' needs. If users do not understand the characteristics of the mining algorithms, it is difficult for them to obtain a good model. The data mining system should therefore be closely integrated with the specific application and provide a complete solution for that application area.
(4) Follow uniform standards. Although data mining has not yet formed a complete set of industry standards, some standards have emerged, such as the data mining process standard CRISP-DM, the predictive model exchange standard PMML, and Microsoft's OLE DB for DM. Data mining systems that follow a unified standard can easily share mining algorithms and models.
(5) Support mobile environments. Combining data mining with mobile computing is a new research area, so data mining systems capable of mining data generated by mobile systems, embedded systems, and ubiquitous computing devices are an emerging development trend.
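The requirement in item (2) that users be able to add new algorithms as needed is often met with a plug-in style registry. The following is a minimal sketch under that assumption; the names `ALGORITHMS`, `register`, and `run`, and the two toy "algorithms", are hypothetical and are not taken from any particular data mining product.

```python
from typing import Callable, Dict, List

# Hypothetical registry mapping an algorithm name to its implementation.
ALGORITHMS: Dict[str, Callable[[List[float]], float]] = {}

def register(name: str):
    """Decorator that adds a function to the registry under the given name."""
    def decorator(func):
        ALGORITHMS[name] = func
        return func
    return decorator

@register("mean")
def mean(values: List[float]) -> float:
    return sum(values) / len(values)

@register("range")
def value_range(values: List[float]) -> float:
    return max(values) - min(values)

def run(name: str, values: List[float]) -> float:
    """Look up an algorithm by name and apply it to the data."""
    if name not in ALGORITHMS:
        raise KeyError(f"unknown algorithm: {name}")
    return ALGORITHMS[name](values)

print(run("mean", [1, 2, 3]))
print(run("range", [1, 5, 3]))
```

A user adds a new algorithm by writing one decorated function; the core `run` logic does not change, which is the kind of extensibility the item describes.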
