Earlier this month, I had the privilege of participating in the National Science Foundation TeraGrid Workshop on Cyber-GIS in Washington, DC. The workshop was sponsored by the National Science Foundation (NSF) TeraGrid Science Gateway program and the Office of Cyberinfrastructure with the goal of "underpin fundamental issues of Cyber-GIS for enhancing cyberinfrastructure while advancing the next-generation GIS with synergistic high-performance, distributed, and collaborative capabilities."
Each participant in the workshop was required to submit a position paper that highlighted an issue or opportunity in Cyber-GIS. My paper, "Cyber-GIS Opportunities for High-Resolution Topography Data Access, Processing, and Analysis", highlights activities OpenTopography is currently engaged in, and also points to opportunities and challenges we are pursuing. You can download a PDF of the position paper, or read it below.
Cyber-GIS Opportunities for High-Resolution Topography Data Access, Processing, and Analysis
San Diego Supercomputer Center, University of California, San Diego, CA
High-resolution topography data acquired with lidar (light detection and ranging) technology are revolutionizing the way we study the geomorphic, biologic and anthropogenic processes acting along the Earth’s surface (e.g. Carter et al., 2007). These data, acquired from either an airborne platform or a tripod-mounted scanner, are emerging as a fundamental tool for research on a variety of topics ranging from earthquake hazards to urban modeling. Lidar topography data are powerful because they represent processes and features at a scale not previously possible yet essential for their appropriate representation. These data sets also have significant implications for earth science education and outreach because when visualized, they provide an accurate digital representation of landforms, natural hazards and processes, and the built environment.
However, along with the potential of lidar topography comes an increase in the volume and complexity of data that must be efficiently managed, archived, distributed, processed and integrated in order for them to be of use to the community. A single lidar data acquisition may generate terabytes of data in the form of point clouds, digital elevation models (DEMs), and derivative products. This massive volume of data is often difficult to manage and poses significant distribution challenges when trying to allow access to the data for a large scientific user community. Furthermore, the data sets can be technically challenging to work with and may require specific software and computing resources that are not readily available to many users.
Projects such as the National Science Foundation-funded OpenTopography Facility (http://www.opentopography.org) (e.g. Crosby et al., 2009) are successfully leveraging emerging cyberinfrastructure technologies such as portal-based data access, service-oriented architectures, high-performance parallel database systems (Nandigam et al., 2010), and optimized processing algorithms to improve internet-based access to these massive geospatial data sets. The OpenTopography system provides free and on-demand access to tens of billions of lidar point cloud measurements as well as processing tools that permit users to generate custom digital elevation models on the fly. OpenTopography’s growing user community of several thousand scientists, educators, students, government agency staff, and private sector users illustrates that cyberinfrastructure-based geospatial data access systems can have a significant impact by democratizing access to these massive data sets.
OpenTopography’s success illustrates the opportunities that exist in applying cyberinfrastructure resources to geospatial data management and processing. However, the OpenTopography effort has only scratched the surface of how routine data management and processing tasks could be enhanced with access to cloud or grid-based resources. As any regular user of high-resolution topography appreciates, many of the existing geographic information system (GIS) algorithms for processing, analysis, and visualization of point cloud and DEM data fail, or perform very slowly, when applied to lidar data. Taking a Cyber-GIS approach to lidar topography processing and analysis would allow users to carry out computationally intensive lidar data processing without needing appropriate hardware locally. Resources such as Hadoop (http://hadoop.apache.org/)-based processing in the cloud, the TeraGrid (http://www.teragrid.org/), or Condor pools (http://www.cs.wisc.edu/condor/) could allow users to “outsource” their geospatial data processing to computing resources better equipped to handle significant data volumes.
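As a rough illustration of why this kind of outsourcing suits lidar data, the sketch below splits a point cloud into square tiles and summarizes each tile independently, which is the pattern a Hadoop job or a Condor pool would distribute across machines. It is a minimal sketch, not OpenTopography's implementation: the function names (tile_key, partition_points, summarize_tile, process_tiles), the tile size, and the summary statistics are all assumptions made for illustration, and a local thread pool stands in for a real cluster scheduler.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def tile_key(x, y, tile_size):
    """Return the (column, row) index of the square tile containing (x, y)."""
    return (int(x // tile_size), int(y // tile_size))


def partition_points(points, tile_size):
    """'Map' step: group (x, y, z) points by the tile they fall in."""
    tiles = defaultdict(list)
    for x, y, z in points:
        tiles[tile_key(x, y, tile_size)].append((x, y, z))
    return dict(tiles)


def summarize_tile(item):
    """'Reduce' step: compute simple elevation statistics for one tile."""
    key, pts = item
    zs = [z for _, _, z in pts]
    stats = {
        "count": len(zs),
        "min_z": min(zs),
        "max_z": max(zs),
        "mean_z": sum(zs) / len(zs),
    }
    return key, stats


def process_tiles(points, tile_size, max_workers=4):
    """Process every tile independently.

    The thread pool is only a local stand-in (an assumption of this sketch)
    for the remote workers a grid or cloud scheduler would provide.
    """
    tiles = partition_points(points, tile_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(summarize_tile, tiles.items()))


# Hypothetical points in projected coordinates (metres).
sample_points = [(5.0, 5.0, 1.0), (7.0, 2.0, 3.0), (150.0, 20.0, 10.0)]
tile_stats = process_tiles(sample_points, tile_size=100.0)
```

Because no tile depends on any other, the per-tile work can be sent wherever the computing capacity is, and only the small summaries need to travel back to the user.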
However, effectively utilizing high-performance grid or cloud resources will require that the user community develop a new “toolkit” of algorithms and tools optimized to perform in these environments. This new toolkit should exist in the open source domain and consist of libraries that allow users to construct customized processing workflows that run in a distributed environment. Examples of necessary algorithms include those for high-performance gridding of lidar point cloud data (e.g. Kim et al., 2006); hydrologic processing of DEMs (e.g. Wallis et al., 2009), including calculations of slope, slope-aspect, stream profiles, catchment areas, and topographic roughness and curvature; geomorphic change detection analysis; feature extraction (including vegetation classification and structural analysis, and building footprint extraction); and the processing and analysis of full waveform lidar data.
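To make two of the listed toolkit components concrete, the sketch below grids a point cloud by averaging the elevations found within a search radius of each cell centre, and then computes slope from the resulting DEM with central differences. It is a simplified, single-machine sketch in the spirit of local binning, not the implementation described by Kim et al. (2006); the function names, the choice of the mean as the binning statistic, and the sample values are assumptions made for illustration.

```python
import math


def local_bin_mean(points, x_min, y_min, n_cols, n_rows, cell_size, radius):
    """Grid (x, y, z) points: each cell gets the mean z of all points
    within `radius` of its centre, or None if no point is close enough."""
    sums = [[0.0] * n_cols for _ in range(n_rows)]
    counts = [[0] * n_cols for _ in range(n_rows)]
    reach = int(math.ceil(radius / cell_size))  # cells a point can influence
    for x, y, z in points:
        col = int((x - x_min) // cell_size)
        row = int((y - y_min) // cell_size)
        for r in range(max(0, row - reach), min(n_rows, row + reach + 1)):
            for c in range(max(0, col - reach), min(n_cols, col + reach + 1)):
                cx = x_min + (c + 0.5) * cell_size
                cy = y_min + (r + 0.5) * cell_size
                if math.hypot(x - cx, y - cy) <= radius:
                    sums[r][c] += z
                    counts[r][c] += 1
    return [
        [sums[r][c] / counts[r][c] if counts[r][c] else None
         for c in range(n_cols)]
        for r in range(n_rows)
    ]


def slope_degrees(dem, row, col, cell_size):
    """Slope at an interior cell of a gap-free DEM, from central differences."""
    dz_dx = (dem[row][col + 1] - dem[row][col - 1]) / (2.0 * cell_size)
    dz_dy = (dem[row + 1][col] - dem[row - 1][col]) / (2.0 * cell_size)
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))


# Hypothetical points gridded onto a 2 x 2 DEM with 1 m cells.
sample_points = [(0.5, 0.5, 10.0), (0.6, 0.4, 12.0), (1.5, 1.5, 20.0)]
dem = local_bin_mean(sample_points, 0.0, 0.0, 2, 2, cell_size=1.0, radius=0.5)

# A synthetic plane rising 1 m per metre in x, used to check the slope code.
plane = [[float(c) for c in range(3)] for _ in range(3)]
plane_slope = slope_degrees(plane, 1, 1, cell_size=1.0)
```

Each output cell depends only on nearby points, and each slope value only on neighbouring cells, so both calculations divide naturally into tiles that can be processed in a distributed environment.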
Carter, W.E., Shrestha, R., Slatton, K.C., 2007, Geodetic Laser Scanning, Physics Today, Vol. 60, No. 12, pp. 41-47.
Crosby, C.J., Nandigam, V., Arrowsmith, R., Baru, C., 2009, Enhancing Access to High-Resolution Lidar Topography – From Point Clouds To Google Earth, Geological Society of America Abstracts with Programs, Vol. 41, No. 7, p. 384.
Kim, H., Arrowsmith, J.R., Crosby, C.J., Jaeger-Frank, E., Nandigam, V., Memon, A., Conner, J., Badden, S.B., Baru, C., 2006, An Efficient Implementation of a Local Binning Algorithm for Digital Elevation Model Generation of LiDAR/ALSM Dataset, Eos Trans. AGU, 87(52), Fall Meet. Suppl., Abstract G53C-0921.
Nandigam, V., Baru, C., Crosby, C.J., 2010, Database Design for High-Resolution LIDAR Topography Data, in preparation, International Conference on Scientific and Statistical Database Management.
Wallis, C., Watson, D., Tarboton, D., Wallace, R., 2009, Parallel Flow-Direction and Contributing Area Calculation for Hydrology Analysis in Digital Elevation Models, Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, PDPTA 2009, Las Vegas, Nevada, USA.