What Is a GIS Shapefile?
ESRI Shapefile (shp), or shapefile for short, is an openly specified format for spatial data developed by the Environmental Systems Research Institute (ESRI). [1] The format has become a de facto standard in the geographic information software community, which reflects ESRI's importance in the global geographic information system market. Shapefile is also an important interchange format for moving data between ESRI products and those of other vendors.
- Shapefile is a vector format that stores the positions and related attributes of geometries, but it cannot store topological information about geographic data. It was first used in version 2 of ArcView GIS in the early 1990s. Today many free and commercial programs can read shapefiles.
- Shapefile is a simple vector data storage method. The main .shp file stores only the position data of geometries; it cannot store their attribute data. A shapefile must therefore be accompanied by a two-dimensional table that holds the attribute information of each geometry. The geometries in a shapefile can represent complex geographic features and support accurate spatial computation on them.
- A shapefile is not a single file; the format consists of multiple files. Three of them are essential: the ".shp", ".shx" and ".dbf" files. All files that represent the same data should share the same filename prefix. For example, storing geometry and attribute data about a lake requires three files: lake.shp, lake.shx, and lake.dbf. The "real" shapefile has the suffix shp, but this file alone is incomplete; the other two must accompany it to form a complete set of geographic data. In addition to these three required files, there are a number of optional files that can enrich the spatial data. File names should follow the MS-DOS 8.3 convention (a prefix of up to 8 characters and a suffix of up to 3 characters, such as shapefil.shp) for compatibility with some older applications, although many newer programs support long file names. In addition, all files must be in the same directory.
- Required files:
- .shp Shape format. Stores the geometric entities of the features.
- .shx Shape index format. A positional index that records where each geometry is located in the .shp file, which speeds up seeking forward or backward to a given geometry.
- .dbf Attribute format. Stores the attribute data for each geometry as a dBase IV table.
- Other optional files:
- .prj Projection format. A text file that stores the geographic coordinate system and projection information as well-known text (WKT).
- .sbn and .sbx Spatial index of the geometry.
- .fbn and .fbx Spatial index of the geometry for read-only shapefiles.
- .ain and .aih Attribute index of the active fields in a table.
- .ixs Geocoding index for read-write shapefiles.
- .mxs Geocoding index for read-write shapefiles (ODB format).
- .atx Attribute index for the .dbf file, named in the form shapefile.columnname.atx (ArcGIS 8 and later).
- .shp.xml Metadata in XML format.
- .cpg Specifies the code page (character encoding) used by the .dbf file.
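The file-set rules above can be checked mechanically. The following is a minimal sketch in Python; the helper names `missing_components` and `is_dos_8_3` are hypothetical and chosen only for illustration. It assumes the sidecar files use lowercase extensions, so on a case-sensitive filesystem a file such as LAKE.SHX would be reported as missing.

```python
from pathlib import Path

# The three files every shapefile needs, as listed above.
REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")


def missing_components(shp_path):
    """Return the required extensions with no file beside shp_path.

    Hypothetical helper: it only tests for existence in the same
    directory and with the same filename prefix; it does not validate
    the file contents.
    """
    base = Path(shp_path)
    return [ext for ext in REQUIRED_EXTENSIONS
            if not base.with_suffix(ext).exists()]


def is_dos_8_3(filename):
    """Check a bare file name against the 8.3 length convention."""
    stem, dot, ext = filename.rpartition(".")
    return (bool(dot) and "." not in stem
            and 1 <= len(stem) <= 8 and 1 <= len(ext) <= 3)


if __name__ == "__main__":
    print(missing_components("lake.shp"))
    print(is_dos_8_3("shapefil.shp"))
```

A name such as lake.shp.xml fails the 8.3 check in this sketch because the prefix itself contains a dot, which is one reason the check is best treated as a compatibility hint rather than a validity test.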
Shapefile and topology
- Shapefile cannot store topology information. Among ESRI formats, ArcInfo coverages and personal, file, and enterprise geodatabases can store topological information about geographic features.
Shapefile spatial representation
- In a shapefile, all polylines and polygons are defined by points, with linear interpolation between them; that is, consecutive points are connected by straight line segments. The spacing between points chosen during data collection determines the scale at which the file can be used. When a shape is magnified beyond a certain scale, it appears jagged. Making shapes look smoother requires more points, which consumes more storage space. Spline functions could represent curves of different shapes accurately in relatively little space, but the shapefile format currently does not support spline curves.
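The trade-off between smoothness and storage can be made concrete with a small sketch. It approximates a circle by a regular polygon, which is an illustrative case not taken from the source, and assumes each vertex costs 16 bytes (one X and one Y stored as 8-byte doubles). The function name `polygon_approximation` is hypothetical.

```python
import math

BYTES_PER_VERTEX = 16  # assumption: X and Y as two 8-byte doubles


def polygon_approximation(radius, n_vertices):
    """Return (max_error, coordinate_bytes) for a regular polygon
    with n_vertices inscribed in a circle of the given radius.

    The largest gap between a chord and the arc it replaces is
    radius * (1 - cos(pi / n_vertices)).
    """
    max_error = radius * (1.0 - math.cos(math.pi / n_vertices))
    return max_error, n_vertices * BYTES_PER_VERTEX


if __name__ == "__main__":
    for n in (8, 64, 512):
        err, size = polygon_approximation(100.0, n)
        print(f"{n:4d} vertices: max error {err:.4f}, {size} bytes")
```

Each increase in vertex count shrinks the deviation from the true curve while the coordinate storage grows linearly, which is the behaviour the paragraph above describes.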
Shapefile data storage
- Neither the .shp file nor the .dbf file can exceed 2 GB (2^31 bytes). In other words, a shapefile can store at most about 70 million points. The number of geometries a file can hold depends on the number of vertices used by each feature.
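The "about 70 million" figure can be reproduced with a short calculation. This sketch assumes the layout given in the ESRI shapefile specification: a 100-byte file header, an 8-byte header per record, and a point record body of 20 bytes (a 4-byte shape type plus two 8-byte doubles). The function name `max_point_records` is hypothetical.

```python
MAX_FILE_BYTES = 2**31          # the 2 GB limit stated above
FILE_HEADER_BYTES = 100         # fixed-size .shp file header
RECORD_HEADER_BYTES = 8         # record number + content length
POINT_CONTENT_BYTES = 4 + 8 + 8  # shape type, X, Y


def max_point_records(max_bytes=MAX_FILE_BYTES):
    """Upper bound on point records that fit in one .shp file."""
    per_record = RECORD_HEADER_BYTES + POINT_CONTENT_BYTES
    return (max_bytes - FILE_HEADER_BYTES) // per_record


if __name__ == "__main__":
    print(max_point_records())
```

Under these assumptions the bound comes out at roughly 76 million point records, in line with the figure quoted above. Polylines and polygons carry extra per-record fields, so the number of features they allow is lower and depends on their vertex counts.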
- The .dbf attribute file is based on an older dBase standard. This database format has many inherent limitations, such as:
- It cannot store null values. This is a serious problem for quantitative data, because null values are usually replaced by 0, which distorts the results of many statistical calculations.
- Suboptimal support for Unicode in field names or stored values.
- Field names can be up to 10 characters.
- Can only have a maximum of 255 fields.
- Only the following data types are supported: floating point (13 bytes of storage), integer (4 or 9 bytes of storage), date (8 bytes of storage, with no time component), and text (up to 254 bytes of storage).
- Floating point numbers may contain rounding errors because they are saved as text.
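Two of the limits listed above, the 10-character field name and the 255-field maximum, are easy to check before exporting a table. The following is a minimal sketch; `check_dbf_field_names` is a hypothetical helper, and it counts characters as stated above rather than encoded bytes, which may differ for non-ASCII names.

```python
MAX_FIELD_NAME_LENGTH = 10  # limit stated in the list above
MAX_FIELD_COUNT = 255       # limit stated in the list above


def check_dbf_field_names(names):
    """Return a list of human-readable problems; empty means OK."""
    problems = []
    if len(names) > MAX_FIELD_COUNT:
        problems.append(
            f"{len(names)} fields exceeds the limit of {MAX_FIELD_COUNT}")
    for name in names:
        if len(name) > MAX_FIELD_NAME_LENGTH:
            problems.append(
                f"{name!r} is longer than {MAX_FIELD_NAME_LENGTH} characters")
    return problems


if __name__ == "__main__":
    for problem in check_dbf_field_names(["name", "population_density"]):
        print(problem)
```

The sketch only reports problems; how to shorten an over-long name is left to the caller, since different tools handle truncation differently.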
Shapefile mixed geometry types
- Because each geometry record carries its own geometry type, a shapefile could in theory store mixed geometry types. In practice, the specification states that all non-empty geometries in the same shapefile must be of the same type [1]. A shapefile is therefore limited to empty geometries plus one other geometry type, and that type must match the one declared in the file header. For example, a shapefile cannot contain both polyline and polygon data, so when describing real geographic features, wells (point type), rivers (polyline type), and lakes (polygon type) must be stored in three separate files.
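The geometry type declared in the file header can be read directly from the first 100 bytes of a .shp file. The sketch below follows the header layout in the ESRI shapefile specification (big-endian file code 9994 at byte 0, big-endian file length in 16-bit words at byte 24, little-endian version and shape type at bytes 28 and 32, then the bounding box). The function name `read_shp_header` is hypothetical, and only the basic two-dimensional shape types are named.

```python
import struct

# Basic 2D shape type codes from the shapefile specification.
SHAPE_TYPES = {0: "Null", 1: "Point", 3: "PolyLine",
               5: "Polygon", 8: "MultiPoint"}


def read_shp_header(data):
    """Parse the fixed 100-byte header of a .shp file from bytes."""
    if len(data) < 100:
        raise ValueError("a .shp header is 100 bytes long")
    (file_code,) = struct.unpack(">i", data[0:4])
    if file_code != 9994:
        raise ValueError("not a shapefile: unexpected file code")
    (length_words,) = struct.unpack(">i", data[24:28])
    version, shape_type = struct.unpack("<ii", data[28:36])
    xmin, ymin, xmax, ymax = struct.unpack("<4d", data[36:68])
    return {
        "file_length_bytes": length_words * 2,  # stored in 16-bit words
        "version": version,
        "shape_type": shape_type,
        "shape_name": SHAPE_TYPES.get(shape_type, "other"),
        "bbox": (xmin, ymin, xmax, ymax),
    }
```

For a file on disk, the function could be used as `read_shp_header(open("lake.shp", "rb").read(100))`, where lake.shp is the example file name used earlier. Every non-empty record in that file is then expected to have the same shape type as the header reports.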