Contact: Roger.Bivand at nhh.no
Base R includes many functions that can be used for reading, visualising, and analysing spatial data. The focus in this view is on "geographical" spatial data, where observations can be identified with geographical locations, and where additional information about these locations may be retrieved if the location is recorded with care. Base R functions are complemented by contributed packages, some of which are on CRAN, while others are still in development. One active location is R-Forge, which lists "Spatial Data and Statistics" projects in its project tree. Information on R-spatial packages, especially sp, will be posted on the R-Forge rspatial project website, including a visualisation gallery.
The contributed packages address two broad areas: moving spatial data into and out of R, and analysing spatial data in R.
The R-SIG-Geo mailing list is a good place to begin for obtaining help and discussing questions about both accessing and analysing data. It is also a good place to search for information about relevant courses, and a list is hosted at the GeoDaCenter.
The packages in this view can be roughly structured into the following topics. If you think that some package is missing from the list, please let me know. Please also visit and contribute to the spatial data handling and spatial statistics pages on the R Wiki.
Classes for spatial data : Because many of the packages importing and using spatial data have had to include objects for storing data and functions for visualising it, an initiative is in progress to construct shared classes and plotting functions for spatial data. The sp package has been published on CRAN, and is discussed in a note in R News. Some other packages have become dependent on these classes, including rgdal and maptools. The vec2dtransf package provides functions for applying affine and similarity transformations to vector spatial data (sp objects). The rgeos package provides an interface to topology functions for sp objects using GEOS. rgeos is now available for Mac OSX on CRAN. The raster package is a major extension of spatial data classes to virtualise access to large rasters, permitting large objects to be analysed, and extending the analytical tools available for both raster and vector data. Used with rasterVis, it can also provide enhanced visualization and interaction. The micromap package provides linked micromaps using ggplot2. The spacetime package extends the shared classes defined in sp for spatio-temporal data.
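As a language-neutral illustration of what affine and similarity transformations do to vector coordinates, the following is a minimal Python sketch. It is not the vec2dtransf interface: the function names apply_affine and apply_similarity and the parameter layout are assumptions made for this example only.

```python
import math


def apply_affine(points, a, b, c, d, e, f):
    """Map each (x, y) to (a*x + b*y + c, d*x + e*y + f)."""
    return [(a * x + b * y + c, d * x + e * y + f) for x, y in points]


def apply_similarity(points, scale, angle_rad, tx, ty):
    """Similarity transform: uniform scaling and rotation, then translation.

    This is the special affine case with a = e = scale*cos(angle) and
    d = -b = scale*sin(angle), so angles and length ratios are preserved.
    """
    cos_t = scale * math.cos(angle_rad)
    sin_t = scale * math.sin(angle_rad)
    return apply_affine(points, cos_t, -sin_t, tx, sin_t, cos_t, ty)


# Hypothetical example data: a unit square, doubled in size, rotated by
# 90 degrees counter-clockwise and shifted by (1, 1).
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
moved = apply_similarity(square, scale=2.0, angle_rad=math.pi / 2, tx=1.0, ty=1.0)
```

A similarity transformation has four free parameters and an affine one has six, which is why the affine form can also shear or stretch a layer along one axis.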
An alternative approach to some of these issues is implemented in the PBSmapping package; PBSmodelling provides modelling support. In addition, GEOmap provides mapping facilities directed to meet the needs of geologists, and uses the geomapdata package.
Handling spatial data : A number of packages have been written using sp classes. The raster package introduces many GIS methods that now permit much to be done with spatial data without having to use GIS in addition to R. It may be complemented by gdistance, which provides calculation of distances and routes on geographic grids. geosphere permits computations of distance and area to be carried out on spatial data in geographical coordinates. The Metadata package collects and downloads a variety of open GIS datasets that can be used to characterize the surface properties of Latitude/Longitude points, using raster, rgdal and others to access data providers. The spsurvey package provides a range of sampling functions. The trip package extends sp classes to permit the accessing and manipulating of spatial data for animal tracking. The hdeco package provides hierarchical decomposition of entropy for categorical map comparisons. The GeoXp package permits interactive graphical exploratory spatial data analysis. spcosa provides spatial coverage sampling and random sampling from compact geographical strata.
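To make concrete what computing distance on data in geographical coordinates involves, here is a minimal Python sketch of the haversine great-circle distance on a sphere. It is an illustration only, not the geosphere interface; the function name haversine_km and the mean Earth radius used are assumptions of this example, and a spherical model is less accurate than an ellipsoidal one.

```python
import math

# Mean Earth radius in kilometres; an assumption of the spherical model.
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lon1, lat1, lon2, lat2, radius_km=EARTH_RADIUS_KM):
    """Great-circle distance between two points given in decimal degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above the valid range.
    return 2 * radius_km * math.asin(math.sqrt(min(1.0, a)))


# A quarter of the equator, from longitude 0 to longitude 90.
quarter_equator = haversine_km(0.0, 0.0, 90.0, 0.0)
```

Treating longitude and latitude as planar x and y coordinates would overstate east-west distances away from the equator, which is why distance functions for geographical coordinates are kept separate from planar ones.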
The UScensus2000 suite of packages (UScensus2000blkgrp, UScensus2000cdp, UScensus2000tract) makes the use of data from the 2000 US Census more convenient. An important data set, Guerry's "Moral Statistics of France", has been made available in the Guerry package, which provides data, maps and examples designed to contribute to the integration of multivariate and spatial analysis. The new cshapes and rworldmap packages make available data sets for conflict studies and environmental studies; the two are linked, as rworldmap suggests cshapes.
The landsat package with accompanying JSS paper provides tools for exploring and developing correction tools for remote sensing data. taRifx is a collection of utility and convenience functions, and some interesting spatial functions.
Reading and writing spatial data - rgdal : Maps may be vector-based or raster-based. The rgdal package provides bindings to GDAL-supported raster formats and OGR-supported vector formats. It contains functions to write raster files in supported formats. The package also provides PROJ.4 projection support for vector objects (this site provides searchable online PROJ.4 representations of projections). The Windows and Mac OSX CRAN binaries of rgdal include subsets of possible data source drivers; if others are needed, use other conversion utilities, or install from source against a version of GDAL with the required drivers.
Reading and writing spatial data - other packages : There are a number of other packages for accessing vector data on CRAN: maps (with mapdata and mapproj) provides access to the same kinds of geographical databases as S; RArcInfo allows ArcInfo v.7 binary files and *.e00 files to be read, and maptools and shapefiles read and write ArcGIS/ArcView shapefiles; for NetCDF files, ncdf may be used. The maptools package also provides helper functions for writing map polygon files to be read by WinBUGS, Mondrian, and the tmap command in Stata. It also provides interface functions between PBSmapping and spatstat and sp classes, in addition to maps databases and sp classes. There is also an interface to GSHHS shoreline databases. For visualization, the colour palettes provided in the RColorBrewer package are very useful, and may be modified or extended using the colorRampPalette function provided with R. The classInt package provides functions for choosing class intervals for thematic cartography. The gmt package gives a simple interface between GMT map-making software and R. geonames is an interface to the www.geonames.org service. If the user wishes to place a map backdrop behind other displays, the RgoogleMaps package for accessing Google Maps(TM) may be useful. ggmap may be used for spatial visualization with Google Maps and OpenStreetMap. plotKML is a package providing methods for the visualization of spatial and spatio-temporal objects in Google Earth. OpenStreetMap gives access to open street map raster images, and osmar provides infrastructure to access OpenStreetMap data from different sources, to work with the data in a common R manner, and to convert data into available infrastructure provided by existing R packages. RSurvey may be used as a processing program for spatially distributed data, and is capable of error corrections and data visualization.
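Choosing class intervals for a thematic map amounts to picking break points that split a variable into a small number of classes. The Python sketch below shows two common choices, equal-width and quantile breaks, as an illustration only. It is not the classInt interface; the function names are assumptions of this example, and the quantile rule used (linear interpolation between order statistics) is one of several definitions in use.

```python
def equal_breaks(values, n_classes):
    """Break points that divide the data range into equal-width classes."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_classes
    # Append the maximum explicitly so rounding never drops the top value.
    return [lo + i * step for i in range(n_classes)] + [hi]


def quantile_breaks(values, n_classes):
    """Break points that place roughly equal numbers of values in each class."""
    xs = sorted(values)
    n = len(xs)
    breaks = []
    for i in range(n_classes + 1):
        h = (n - 1) * i / n_classes      # fractional position in sorted data
        k = int(h)                       # lower order statistic
        frac = h - k
        upper = xs[min(k + 1, n - 1)]
        breaks.append(xs[k] + frac * (upper - xs[k]))
    return breaks


# Hypothetical skewed data: one large outlier dominates the range.
rates = [1, 2, 3, 4, 100]
eq = equal_breaks(rates, 3)
qu = quantile_breaks(rates, 4)
```

With skewed data such as this, equal-width breaks put almost every observation in the lowest class, while quantile breaks spread observations across classes at the cost of very unequal class widths.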
Integration with version 6.* of the leading open source GIS, GRASS, is provided in CRAN package spgrass6, using rgdal for exchanging data. RPyGeo is a wrapper for Python access to the ArcGIS GeoProcessor, and RSAGA is a similar shell-based wrapper for SAGA commands.
Point pattern analysis : The spatial package is a recommended package shipped with base R, and contains several core functions, including an implementation of Khat by its author, Prof. Ripley. In addition, spatstat allows freedom in defining the region(s) of interest, and makes extensions to marked processes and spatial covariates. Its strengths are model-fitting and simulation, and it has a useful homepage. It is the only package that will enable the user to fit inhomogeneous point process models with interpoint interactions. MarkedPointProcess is another contemporary point pattern package. The spatgraphs package provides graphs, graph visualization and graph based summaries to be used with spatial point pattern analysis. The splancs package also allows point data to be analysed within a polygonal region of interest, and covers many methods, including 2D kernel densities.
ecespa provides wrappers, functions and data for spatial point pattern analysis, used in the book on Spatial Ecology of the ECESPA/AEET. The functions for binning points on grids in ash may also be of interest. The ads package performs first- and second-order multi-scale analyses derived from Ripley's K-function. The aspace package is a collection of functions for estimating centrographic statistics and computational geometries from spatial point patterns. spatialkernel provides edge-corrected kernel density estimation and binary kernel regression estimation for multivariate spatial point process data. DSpat contains functions for spatial modelling for distance sampling data, and spatialsegregation provides segregation measures for multitype spatial point patterns. GriegSmith uses the Grieg-Smith method on 2-dimensional spatial data.
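For readers unfamiliar with Ripley's K-function, the Python sketch below computes its simplest estimate: the study area divided by n(n-1), times the number of ordered point pairs lying within distance r of each other. It is an illustration only and deliberately omits edge correction, which the point pattern packages named above handle in their own ways; the function name ripley_k_naive is an assumption of this example.

```python
import math


def ripley_k_naive(points, radii, area):
    """Uncorrected estimate of Ripley's K at each distance in radii.

    K(r) = area / (n * (n - 1)) * number of ordered pairs (i, j), i != j,
    whose separation is at most r. No edge correction is applied, so the
    estimate is biased downwards for r large relative to the window.
    """
    n = len(points)
    if n < 2:
        raise ValueError("at least two points are needed")
    counts = [0] * len(radii)
    for i in range(n):
        xi, yi = points[i]
        for j in range(i + 1, n):
            xj, yj = points[j]
            d = math.hypot(xi - xj, yi - yj)
            for idx, r in enumerate(radii):
                if d <= r:
                    counts[idx] += 2  # count both (i, j) and (j, i)
    return [area * c / (n * (n - 1)) for c in counts]


# Hypothetical pattern: the four corners of a unit square window.
corners = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
k_values = ripley_k_naive(corners, radii=[0.5, 1.0, 1.5], area=1.0)
```

Under complete spatial randomness the theoretical value is pi times r squared, so estimates above that curve suggest clustering at scale r and estimates below it suggest regularity.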
Geostatistics : The gstat package provides a wide range of functions for univariate and multivariate geostatistics, also for larger datasets, while geoR and geoRglm contain functions for model-based geostatistics. Variogram diagnostics may be carried out with vardiag. Automated interpolation using gstat is available in automap. This family of packages is supplemented by intamap with procedures for automated interpolation and psgp, which implements projected sparse Gaussian process kriging. A similar wide range of functions is to be found in the fields package. The spatial package is shipped with base R, and contains several core functions. The spBayes package fits Gaussian univariate and multivariate models with MCMC. ramps is a different Bayesian geostatistical modelling package. The geospt package contains some geostatistical and radial basis functions, including prediction and cross validation. It also includes functions for the design of optimal spatial sampling networks based on geostatistical modelling.
The RandomFields package provides functions for the simulation and analysis of random fields, and variogram model descriptions can be passed between geoR, gstat and this package. SpatialExtremes offers several approaches for spatial extremes modelling using RandomFields. In addition, CompRandFld, constrainedKriging, geospt and geofd provide alternative approaches to geostatistical modelling.
The sgeostat package is also available. Within the same general topical area are the deldir and tripack packages for triangulation and the akima package for spline interpolation; the MBA package provides scattered data interpolation with multilevel B-splines. In addition, there are the spatialCovariance package, which supports the computation of spatial covariance matrices for data on rectangles, the regress package building in part on spatialCovariance, and the tgp package. The Stem package provides for the estimation of the parameters of a spatio-temporal model using the EM algorithm, and the estimation of the parameter standard errors using a spatio-temporal parametric bootstrap. FieldSim is another random fields simulations package.
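Much of the geostatistical work described above starts from an empirical variogram. The Python sketch below shows the classical (Matheron) estimator: for each distance bin, half the mean squared difference between values at point pairs whose separation falls in that bin. It is an illustration only, not the interface of gstat, geoR or any other package named here; the function name and the returned tuple layout are assumptions of this example.

```python
import math


def empirical_variogram(coords, values, bin_edges):
    """Classical semivariance estimate per distance bin.

    Returns a list of (mean_distance, semivariance, n_pairs) tuples, one per
    non-empty bin. A pair with separation d belongs to bin k when
    bin_edges[k] < d <= bin_edges[k + 1].
    """
    n_bins = len(bin_edges) - 1
    sum_sq = [0.0] * n_bins
    sum_d = [0.0] * n_bins
    n_pairs = [0] * n_bins
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.hypot(coords[i][0] - coords[j][0],
                           coords[i][1] - coords[j][1])
            for k in range(n_bins):
                if bin_edges[k] < d <= bin_edges[k + 1]:
                    sum_sq[k] += (values[i] - values[j]) ** 2
                    sum_d[k] += d
                    n_pairs[k] += 1
                    break
    return [(sum_d[k] / n_pairs[k], sum_sq[k] / (2 * n_pairs[k]), n_pairs[k])
            for k in range(n_bins) if n_pairs[k] > 0]


# Hypothetical transect: four points one unit apart with a linear trend.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [0.0, 1.0, 2.0, 3.0]
variogram = empirical_variogram(coords, values, bin_edges=[0.5, 1.5, 2.5, 3.5])
```

In this example the semivariance keeps growing with distance because the data contain a trend; in practice a variogram model is fitted to such estimates before kriging.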
Disease mapping and areal data analysis : DCluster is a package for the detection of spatial clusters of diseases. It extends and depends on the spdep package, which provides basic functions for building neighbour lists and spatial weights, tests for spatial autocorrelation in areal data, such as Moran's I, and functions for fitting spatial regression models, such as SAR and CAR models. These models assume that the spatial dependence can be described by known weights. The spgwr package contains an implementation of geographically weighted regression methods for exploring possible non-stationarity. The sparr package provides another approach to relative risks. The glmmBUGS package is a helpful way of passing spatial models to WinBUGS.
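To show how known spatial weights enter a test statistic such as Moran's I, here is a minimal Python sketch of the statistic itself. It is an illustration only, not the spdep interface; the function name morans_i and the dense-matrix representation of the weights are assumptions of this example, and no significance test is included.

```python
def morans_i(x, weights):
    """Moran's I for values x and a square spatial weights matrix.

    I = (n / S0) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2,
    where S0 is the sum of all weights.
    """
    n = len(x)
    mean = sum(x) / n
    dev = [xi - mean for xi in x]
    s0 = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    denom = sum(d * d for d in dev)
    return (n / s0) * cross / denom


# Hypothetical example: four areas in a row, each a neighbour of the next.
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
trend = morans_i([1.0, 2.0, 3.0, 4.0], W)          # similar neighbours
alternating = morans_i([1.0, -1.0, 1.0, -1.0], W)  # dissimilar neighbours
```

Positive values indicate that neighbouring areas tend to have similar values and negative values that they tend to differ; the result depends on the chosen weights as much as on the data.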
Spatial regression : The choice of function for spatial regression will depend on the support available. If the data are characterised by point support and the spatial process is continuous, geostatistical methods may be used, or functions in the nlme package. If the support is areal, and the spatial process is not being treated as continuous, functions provided in the spdep package may be used. This package can also be seen as providing spatial econometrics functions, and, as noted above, provides basic functions for building neighbour lists and spatial weights, tests for spatial autocorrelation in areal data, such as Moran's I, and functions for fitting spatial regression models. It provides the full range of local indicators of spatial association, such as local Moran's I, and diagnostic tools for fitted linear models, including Lagrange Multiplier tests. Spatial regression models that can be fitted using maximum likelihood include spatial lag models, spatial error models, and spatial Durbin models. For larger data sets, sparse matrix techniques can be used for maximum likelihood fits, while spatial two stage least squares and generalised method of moments estimators are an alternative. When using GMM, sphet can be used to accommodate both autocorrelation and heteroskedasticity. Spatial count regression is provided using custom MCMC by spatcounts. The splm package provides methods for fitting spatial panel data by maximum likelihood and GM. spatialprobit makes possible Bayesian estimation of the spatial autoregressive probit model (SAR probit model).
Ecological analysis : There are many packages for analysing ecological and environmental data. They include ade4 for exploratory and Euclidean methods in the environmental sciences, the adehabitat family of packages for the analysis of habitat selection by animals (adehabitatHR, adehabitatHS, adehabitatLT, and adehabitatMA), pastecs for the regulation, decomposition and analysis of space-time series, vegan for ordination methods and other useful functions for community and vegetation ecologists, and many other functions in other contributed packages. One such is tripEstimation, which builds on the classes provided by trip. ncf has entered CRAN recently, and provides a range of spatial nonparametric covariance functions. rangeMapper is a package for manipulating species range (extent-of-occurrence) maps, mainly offering tools for easy generation of biodiversity (species richness) or life-history trait maps. ModelMap builds on other packages to create models using underlying GIS data. An off-CRAN package, Rcitrus, is for the spatial analysis of plant disease incidence. The Geneland package uses fields and RandomFields to make use of both geographic and genetic information to estimate the number of populations in a dataset and delineate their spatial organisation. The Environmetrics Task View contains a much more complete survey of relevant functions and packages.