Geospatial Data Science Quick Start Guide

上QQ阅读APP看书，第一时间看更新

From a data perspective

Having looked into the nature of location data from a technical perspective, let's also examine it from a data perspective. How is location data different than other data? In location data, we use geographic coordinates (2D) to represent the world (3D).

For example, Digital Elevation Models (DEMs) are used to represent heights and terrain surface. The first law of geography applies here as well. At a certain point of time, a particular terrain is very likely to have the same height with its relatively close surrounding, while we can expect a difference based on elevation in two areas distant from each other. As mentioned earlier, spatial autocorrelation in location data is assumed to be present in spatial data, while in other types of data, such as the statistical analysis of conventional data, we assume the independence of data points. That means location data can be categorized as stochastic, while other data is probabilistic.

Another complication in location data also arises from what we call Modifiable Area Unit Problem (MAUP), which arises from different aggregated units that produce different results. An example of this is poverty or crime estimates and aggregations. For example, areas of high poverty rates could be overestimated or underestimated depending on the boundaries of measured areas. By moving into different aggregations (that is, zip code, neighborhood, or district level), which can create different impressions and patterns created by the different scales and aggregations.