OpenTopography hosts hundreds of high-resolution point cloud datasets covering more than 500,000 km² globally. While our web-based data portal provides intuitive point-and-click access to these data, researchers increasingly need programmatic methods to build automated, reproducible workflows. We're excited to share a new Jupyter notebook tutorial that demonstrates how to efficiently access OpenTopography's point cloud data using tile indexes and cloud-native streaming techniques.
OpenTopography's lidar datasets are organized into tiles—individual LAZ files that each cover a small geographic area. While this tiling structure optimizes data storage, it presents a challenge for programmatic access: how do you identify and retrieve only the tiles you need for your area of interest without manually browsing through hundreds or thousands of files?
Our new tutorial introduces two complementary approaches for programmatic point cloud data access:
For each hosted point cloud dataset, OpenTopography provides a tile index—a shapefile containing the spatial extent and download URL for every tile in the dataset. There are three different methods for accessing these tile indexes:
By spatially querying the tile index against your area of interest, you can identify exactly which tiles you need before downloading any data.
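As a rough illustration of that spatial query, the sketch below filters a tile index by bounding-box overlap using only the Python standard library. The `Tile` records, the `url` field, and the example.com addresses are hypothetical stand-ins for the shapefile's attributes; the column names in a real tile index may differ.

```python
from typing import List, NamedTuple, Tuple


class Tile(NamedTuple):
    """One row of a hypothetical tile index: a bounding box plus a download URL."""
    minx: float
    miny: float
    maxx: float
    maxy: float
    url: str


def intersects(tile: Tile, aoi: Tuple[float, float, float, float]) -> bool:
    """Return True if the tile's bounding box overlaps or touches the AOI box."""
    aoi_minx, aoi_miny, aoi_maxx, aoi_maxy = aoi
    return not (
        tile.maxx < aoi_minx or tile.minx > aoi_maxx
        or tile.maxy < aoi_miny or tile.miny > aoi_maxy
    )


def select_tiles(index: List[Tile], aoi: Tuple[float, float, float, float]) -> List[str]:
    """Return the download URLs of every tile that intersects the AOI."""
    return [tile.url for tile in index if intersects(tile, aoi)]


# Made-up 2 x 2 grid of 1000 m tiles (projected coordinates, in metres).
tile_index = [
    Tile(0, 0, 1000, 1000, "https://example.com/tile_0_0.laz"),
    Tile(1000, 0, 2000, 1000, "https://example.com/tile_1_0.laz"),
    Tile(0, 1000, 1000, 2000, "https://example.com/tile_0_1.laz"),
    Tile(1000, 1000, 2000, 2000, "https://example.com/tile_1_1.laz"),
]

# Area of interest as (minx, miny, maxx, maxy); it spans the two eastern tiles.
aoi = (1200, 200, 1800, 1500)
selected_urls = select_tiles(tile_index, aoi)
print(selected_urls)
```

With a real tile index loaded in GeoPandas, the equivalent filter is typically a one-liner along the lines of `tiles[tiles.intersects(aoi_geometry)]`, provided both layers share a coordinate reference system.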

The notebook shows how to select the LAZ tiles (boundaries in blue) that intersect a given area of interest (plotted in red). In this example, points from the downloaded LAZ files are colored by classification: unclassified points in gray and ground points in brown.
Beyond simple tile identification, the tutorial showcases an advanced workflow using the Point Data Abstraction Library (PDAL) to:
This approach eliminates the need to download full tiles when you only need data from a small region, dramatically reducing bandwidth and storage requirements.
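A PDAL pipeline is described as JSON, so the crop-on-read idea can be sketched without running PDAL at all. The stage choice here (`readers.las`, `filters.crop`, `writers.las`) is an assumption for illustration and not necessarily the notebook's exact pipeline. The tile URL, output file name, and helper function name are placeholders.

```python
import json


def build_crop_pipeline(tile_url, bounds, out_path):
    """Build a PDAL pipeline spec that reads one tile, crops it, and writes LAZ.

    bounds is (minx, miny, maxx, maxy) in the tile's coordinate system.
    """
    minx, miny, maxx, maxy = bounds
    # PDAL expects 2D bounds as "([minx, maxx], [miny, maxy])".
    bounds_str = f"([{minx}, {maxx}], [{miny}, {maxy}])"
    return {
        "pipeline": [
            # Read the tile; the URL is a placeholder, not a real dataset path.
            {"type": "readers.las", "filename": tile_url},
            # Keep only the points that fall inside the area of interest.
            {"type": "filters.crop", "bounds": bounds_str},
            # Write the cropped points as a compressed LAZ file.
            {"type": "writers.las", "filename": out_path, "compression": True},
        ]
    }


spec = build_crop_pipeline(
    "https://example.com/tile_1_0.laz",  # hypothetical tile URL
    (1200, 200, 1800, 1500),             # hypothetical area of interest
    "aoi_points.laz",
)
pipeline_json = json.dumps(spec, indent=2)
print(pipeline_json)
```

With the PDAL Python bindings installed, a spec like this can usually be run with `pdal.Pipeline(pipeline_json).execute()`. How much of each remote tile is actually transferred depends on the file layout and the reader used, which this sketch does not address.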

The second part of the notebook shows how to stream data and download only the points that intersect a given area of interest. In this plot, points within the area of interest are colored by elevation and plotted with the LAZ tile boundaries in blue for reference.
The notebook walks through a complete workflow using a case study from New Zealand's Waikato region:
All code is provided with detailed explanations, making it accessible to users with basic Python experience while demonstrating advanced techniques for those ready to build more sophisticated workflows.
The complete Jupyter notebook, including all code and documentation, is available on OpenTopography's GitHub repository:
The tutorial requires Python with standard geospatial libraries (GeoPandas, Shapely, Requests) plus PDAL for the advanced streaming workflows. Installation instructions and environment setup guidance are provided in the repository.
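Before opening the notebook, a quick check like the following can confirm that the listed libraries are importable in the active environment. This is a convenience sketch and not part of the tutorial itself. The helper name `missing_modules` is made up, and `pdal` is assumed to be the import name of the PDAL Python bindings.

```python
import importlib.util


def missing_modules(module_names):
    """Return the names in module_names that cannot be found for import."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


# Import names for the libraries the tutorial lists.
REQUIRED = ["geopandas", "shapely", "requests", "pdal"]

missing = missing_modules(REQUIRED)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```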
This tutorial represents part of OpenTopography's ongoing commitment to supporting diverse data access methods. Whether you're a student exploring point cloud data for the first time through our map interface, or a researcher building automated processing pipelines, we're working to ensure OpenTopography's data remain accessible and useful for your needs. As cloud-native data formats and access patterns continue to evolve, we'll keep updating our documentation and tools to help the community leverage these technologies effectively.