What Is Spatial Data Mining?

Spatial data mining refers to the theories, methods, and techniques used to extract hidden knowledge and implicit spatial relationships from a spatial database, and to find useful features and patterns in them. The process of spatial data mining and knowledge discovery can be roughly divided into the following steps: data preparation, data selection, data preprocessing, data reduction or transformation, determining the mining goals, choosing the knowledge discovery algorithms, data mining, pattern interpretation, and knowledge evaluation. Data mining is only one of the key steps, but for simplicity the term "spatial data mining" is often used for the whole process of spatial data mining and knowledge discovery.

Chinese name: Spatial data mining
Nature: Data mining
Attributes: Space
Spatial analysis method: Data analysis
1. Probabilistic approach. This method mines spatial knowledge by calculating the probability of uncertain attributes. The discovered knowledge is usually expressed as the conditional probability that a hypothesis is true under given conditions. When an error matrix is used to describe the uncertainty of remote sensing classification results, this conditional probability can serve as background knowledge that expresses the degree of confidence in the result.
2. Spatial analysis methods. These use comprehensive attribute data analysis, topology analysis, buffer analysis, density analysis, distance analysis, overlay analysis, network analysis, terrain analysis, trend surface analysis, predictive analysis, and similar techniques to discover association rules such as spatial connection, adjacency, and co-occurrence, or to mine knowledge such as the shortest or optimal path between targets. Commonly used spatial analysis methods currently include exploratory data analysis, spatial neighbor relationship mining algorithms, exploratory spatial analysis, exploratory inductive learning, and image analysis.
3. Statistical analysis methods. These apply statistical analysis to the limited and/or uncertain information available about a spatial object, and then evaluate and predict knowledge such as the characteristics and statistical regularities of the object's attributes. They mainly rely on the spatial autocovariance structure, the variogram, or the degree of similarity between autocovariates or local variable values to mine spatial data under uncertainty.
4. Inductive learning methods. Given certain background knowledge, these methods generalize and synthesize data in order to search for and mine general rules and patterns in a spatial database (or data warehouse). There are many inductive learning algorithms, such as the well-known C5.0 decision tree algorithm by Quinlan, the attribute-oriented induction method proposed by Han Jiawei and others, and the spatial attribute-based induction method proposed by Pei Jian and others.
5. Mining methods for spatial association rules. These are algorithms that search for and mine associations between spatial objects (and their attributes) in a spatial database (or data warehouse). The best-known association rule mining algorithm is the Apriori algorithm proposed by Agrawal; there are also the multi-level association rule mining algorithms proposed by Cheng Jihua et al. and the generalized association rule model mining methods proposed by Xu Longfei et al.
6. Cluster analysis methods. These cluster or classify entities according to their characteristics, and then discover the overall spatial distribution and typical patterns of the data set. Commonly used clustering methods include K-means, K-medoids, the R-tree-based data focusing method proposed by Ester et al., algorithms for discovering aggregation proximity and common features, and the models proposed by Zhou Chenghu et al.
7. Neural network method. A neural network is an adaptive nonlinear dynamic system composed of a large number of neurons, with capabilities such as distributed storage, associative memory, massively parallel processing, self-learning, self-organization, and self-adaptation. In spatial data mining it can be used to mine classification and clustering knowledge and features.
8. Decision tree method. This method generates and discovers rules by representing classification or decision sets in a tree structure according to different features. The basic steps for spatial data mining with a decision tree are as follows: first, use a training set of spatial entities to generate a test function; second, create the branches of the decision tree according to the different values of that function, and repeat the process within each branch subset to build lower-level nodes and branches until the tree is formed; finally, prune the tree and convert it into rules for classifying new entities.
9. Rough set theory. This is an intelligent decision analysis tool that uses an upper approximation set and a lower approximation set to handle imprecise, uncertain, and incomplete information. It is well suited to spatial data mining with attribute uncertainty.
10. Method based on fuzzy set theory. This is a series of methods that use fuzzy set theory to describe research objects with uncertainty, and analyze and deal with actual problems. The method based on fuzzy set theory has been widely used in fuzzy classification of remote sensing images, fuzzy query of GIS, uncertainty expression and processing of spatial data, and so on.
11. Spatial characterization and trend detection methods. These are spatial data mining algorithms based on the concepts of neighborhood graphs and neighborhood paths. They extract spatial rules from differences in the relative frequency of different types of attributes or objects.
12. Cloud theory approach. Cloud theory is a new theory for analyzing uncertain information. It consists of three parts: the cloud model, uncertainty reasoning, and cloud transformation. Spatial data mining based on cloud theory combines qualitative analysis with quantitative calculation to handle the randomness and fuzziness of uncertain attributes in spatial objects; it can be used for mining spatial association rules, for uncertainty queries over spatial databases, and so on.
13. Evidence theory approach. Evidence theory deals with uncertain information through a belief function (which measures the lowest degree to which existing evidence supports a hypothesis) and a plausibility function (which measures the highest degree to which existing evidence fails to refute the hypothesis). It can be used for spatial data mining with uncertain attributes.
14. Genetic algorithms. These are algorithms that simulate the process of biological evolution. They can perform an efficient, parallel, global search over the solution space of a problem, automatically acquire and accumulate knowledge about the search space during the search, and control the search process through an adaptive mechanism in order to reach the optimal solution. Many problems in spatial data mining, such as knowledge acquisition for classification, clustering, and prediction, can be solved with genetic algorithms. This method has been used for feature discovery in remote sensing image data.
15. Data visualization methods. These display spatial data through visualization technology and help people use visual analysis to find spatial knowledge such as structure, features, patterns, trends, anomalies, or correlations in the data. For the method to work well, powerful visualization tools and auxiliary analysis tools must be built.
16. Computational geometry. This method uses a computer program to calculate the Voronoi diagram of a set of planar points and then discovers spatial knowledge from it. Voronoi diagrams can be used to solve problems such as spatial topological relationships, multi-scale representation of data, automatic generalization, spatial clustering, the sphere of influence of spatial targets, the siting of public facilities, and determining the shortest path.
17. Online spatial data mining. This is a network-based, verification-oriented tool for spatial data mining and analysis. It is built on multi-dimensional views and emphasizes execution efficiency and prompt response to user commands. A spatial data warehouse is generally used as its direct data source. The method uses the query and analysis tools of a data analysis and reporting module (such as OLAP, decision analysis, and data mining) to extract information and knowledge in support of decision-making.
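To make the probabilistic approach (method 1 in the list above) more concrete, the following is a minimal sketch of how conditional probabilities might be read off an error matrix. The class names and pixel counts are invented for illustration, and the row/column convention (rows are classifier output, columns are reference data) is an assumption rather than something the text specifies.

```python
# Hypothetical error (confusion) matrix for a remote sensing classification.
# Assumed convention: rows = class assigned by the classifier,
# columns = reference (ground truth) class.
CLASSES = ["water", "forest", "urban"]
ERROR_MATRIX = [
    [45, 3, 2],   # pixels classified as water
    [4, 40, 6],   # pixels classified as forest
    [1, 7, 42],   # pixels classified as urban
]


def conditional_probabilities(matrix):
    """Return P(reference class = j | classified as i) for every row i."""
    result = []
    for row in matrix:
        total = sum(row)
        # Normalise each row so that its entries sum to 1 (or stay 0 if empty).
        result.append([count / total if total else 0.0 for count in row])
    return result


probs = conditional_probabilities(ERROR_MATRIX)
for name, row in zip(CLASSES, probs):
    print(name, [round(p, 2) for p in row])
```

Each row of `probs` can then be treated as background knowledge: for a pixel labelled "water", the first row gives the estimated probability that its true class is water, forest, or urban.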
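The association rule mining described in method 5 above can be illustrated with a small level-wise search in the style of Apriori. This is a simplified sketch, not the published algorithm in full: the transactions, the predicate names such as `near_water`, and the 0.5 support threshold are all hypothetical.

```python
from itertools import combinations

# Hypothetical transactions: each set lists the spatial predicates
# that hold for one spatial object.
TRANSACTIONS = [
    {"is_town", "near_water", "near_highway"},
    {"is_town", "near_water"},
    {"is_town", "near_highway"},
    {"near_water", "near_highway"},
    {"is_town", "near_water", "near_highway"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def apriori(transactions, min_support):
    """Level-wise search: frequent k-itemsets are joined into (k+1)-candidates."""
    items = sorted({item for t in transactions for item in t})
    level = [frozenset([i]) for i in items
             if support(frozenset([i]), transactions) >= min_support]
    frequent = {}
    while level:
        for itemset in level:
            frequent[itemset] = support(itemset, transactions)
        size = len(level[0]) + 1
        candidates = {a | b for a in level for b in level if len(a | b) == size}
        # Apriori pruning: every subset of a frequent itemset must be frequent.
        level = [c for c in candidates
                 if all(frozenset(s) in frequent for s in combinations(c, size - 1))
                 and support(c, transactions) >= min_support]
    return frequent


def confidence(antecedent, consequent, transactions):
    """Confidence of the rule antecedent -> consequent."""
    return (support(antecedent | consequent, transactions)
            / support(antecedent, transactions))


frequent = apriori(TRANSACTIONS, min_support=0.5)
rule_conf = confidence(frozenset({"is_town"}), frozenset({"near_water"}), TRANSACTIONS)
print(len(frequent), round(rule_conf, 2))
```

With these made-up data, the rule "is_town implies near_water" holds for three of the four towns, so its confidence is 0.75.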
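As an illustration of the clustering methods in method 6 above, here is a minimal pure-Python sketch of K-means (Lloyd's algorithm) on planar points. The coordinates and the initial centroids are invented; a real application would more likely use an established library and a more careful initialization.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: assign points to the nearest centroid, then recompute means."""
    centroids = list(centroids)
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for each point.
        labels = [min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


# Hypothetical point locations forming two visibly separate groups.
POINTS = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = kmeans(POINTS, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(labels, centroids)
```

The initial centroids are passed in explicitly so that the result is reproducible; K-means is sensitive to this choice, which is one reason alternatives such as K-medoids are also used.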
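The upper and lower approximation sets mentioned under rough set theory (method 9 above) can be computed directly from an attribute table. The sketch below assumes a hypothetical set of land parcels described by two attributes (elevation and soil) and a hypothetical target concept, "flood-prone"; none of these come from the original text.

```python
from collections import defaultdict

# Hypothetical land parcels described by two condition attributes
# (elevation, soil), plus the parcels assumed to be flood-prone.
PARCELS = {
    "p1": ("low", "clay"),
    "p2": ("low", "clay"),
    "p3": ("low", "sand"),
    "p4": ("high", "sand"),
    "p5": ("high", "sand"),
    "p6": ("high", "clay"),
}
FLOOD_PRONE = {"p1", "p2", "p3", "p4"}


def approximations(objects, target):
    """Return (lower, upper) approximations of target under attribute indiscernibility."""
    classes = defaultdict(set)
    for name, attributes in objects.items():
        classes[attributes].add(name)  # equal attributes => indiscernible objects
    lower, upper = set(), set()
    for block in classes.values():
        if block <= target:
            lower |= block  # every member is in the target: certainly flood-prone
        if block & target:
            upper |= block  # at least one member is in the target: possibly flood-prone
    return lower, upper


lower, upper = approximations(PARCELS, FLOOD_PRONE)
print(sorted(lower), sorted(upper), sorted(upper - lower))
```

Parcels p4 and p5 share the same attributes but differ in the target concept, so they fall in the boundary region (upper minus lower): the available attributes cannot decide whether such a parcel is flood-prone.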
