Utilities¶
GridFIA provides utility classes for advanced users who need direct access to Zarr stores and location configuration.
Overview¶
| Class | Description | Use Case |
|---|---|---|
| ZarrStore | Unified Zarr access interface | Reading biomass data directly |
| LocationConfig | Geographic extent management | Custom region configuration |
ZarrStore¶
The ZarrStore class provides a unified interface for reading Zarr stores created by GridFIA, handling both Zarr v2 and v3 formats transparently. It exposes typed properties for common access patterns and supports use as a context manager for safe resource management.
Attributes¶
- path (Path): Path to the Zarr store directory.
- biomass (zarr.Array): The main biomass array with shape (species, height, width).
- species_codes (List[str]): List of 4-digit FIA species codes.
- species_names (List[str]): List of species common names.
- crs (CRS): Coordinate reference system.
- transform (Affine): Affine transformation for georeferencing.
- bounds (Tuple[float, float, float, float]): Geographic bounds (left, bottom, right, top).
- num_species (int): Number of species in the store.
- shape (Tuple[int, int, int]): Shape of the biomass array (species, height, width).
Examples¶
Basic usage:
store = ZarrStore.from_path("data/montana.zarr")
print(f"Species: {store.num_species}")
print(f"Shape: {store.shape}")
data = store.biomass[0, :, :]  # Get first species layer
Using context manager:
with ZarrStore.open("data/montana.zarr") as store:
    richness = np.sum(store.biomass[:] > 0, axis=0)
Iterating over species:
store = ZarrStore.from_path("data/forest.zarr")
for code, name in zip(store.species_codes, store.species_names):
    print(f"{code}: {name}")
Initialize ZarrStore from an open Zarr group.
Prefer using class methods from_path() or open() instead of
calling this constructor directly.
Parameters¶
- root (zarr.Group): Open Zarr group containing the biomass data.
- store (zarr.storage.LocalStore, optional): The underlying storage object (for resource management).
- path (Path, optional): Path to the Zarr store on disk.
biomass property¶
The main biomass array.
Shape is (species, height, width), where:
- species: Number of species layers (index 0 is often total biomass)
- height: Number of rows in the raster
- width: Number of columns in the raster
Values are typically in Mg/ha (megagrams per hectare).
Returns¶
zarr.Array: 3D array of biomass values.
species_codes property¶
List of 4-digit FIA species codes.
The first code ('0000') typically represents total biomass. Codes are zero-padded 4-digit strings (e.g., '0202' for Douglas-fir).
Returns¶
List[str]: Species codes in order matching the biomass array's first dimension.
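Because the codes are zero-padded strings, an integer code such as 202 has to be formatted before it can be compared against species_codes. A minimal sketch in plain Python; `normalize_species_code` is a hypothetical helper for illustration, not part of GridFIA:

```python
def normalize_species_code(code) -> str:
    """Return a zero-padded 4-digit code string (hypothetical helper)."""
    text = str(code).strip()
    # Reject anything that is not 1 to 4 decimal digits.
    if not text.isdigit() or len(text) > 4:
        raise ValueError(f"not a valid 4-digit species code: {code!r}")
    return text.zfill(4)


print(normalize_species_code(202))     # 0202
print(normalize_species_code("0202"))  # 0202
```

Normalizing once at the boundary keeps later lookups such as `species_codes.index(...)` from failing on `202` versus `'0202'`.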
species_names property¶
List of species common names.
Names correspond to species codes and are in the same order as the biomass array's first dimension.
Returns¶
List[str]: Species names (e.g., 'Douglas-fir', 'Ponderosa Pine').
crs property¶
Coordinate reference system for the data.
Returns¶
rasterio.crs.CRS: The CRS object (default is EPSG:3857 / Web Mercator if not specified).
transform property¶
Affine transformation for georeferencing.
The transform maps pixel coordinates to geographic coordinates:
geo_x, geo_y = transform * (pixel_x, pixel_y)
Returns¶
rasterio.transform.Affine: Affine transformation matrix.
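The mapping can also be written out by hand, which makes clear that pixel_x is the column index and pixel_y is the row index. The sketch below uses a plain tuple instead of the Affine class, with coefficients in the (a, b, c, d, e, f) order that Affine uses. The 30 m pixel size and the origin are made-up values for illustration, not taken from any GridFIA store:

```python
def pixel_to_geo(coeffs, col, row):
    """Apply an affine transform given as (a, b, c, d, e, f).

    x = a * col + b * row + c
    y = d * col + e * row + f
    """
    a, b, c, d, e, f = coeffs
    return (a * col + b * row + c, d * col + e * row + f)


# Hypothetical north-up raster: 30 m pixels, upper-left corner at (500000, 4200000).
coeffs = (30.0, 0.0, 500000.0, 0.0, -30.0, 4200000.0)

print(pixel_to_geo(coeffs, 0, 0))    # (500000.0, 4200000.0)
print(pixel_to_geo(coeffs, 10, 20))  # (500300.0, 4199400.0)
```

Integer pixel coordinates map to the upper-left corner of a pixel; add 0.5 to both col and row to get the pixel center.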
bounds property¶
Geographic bounds of the data.
Returns¶
Tuple[float, float, float, float]: Bounds as (left, bottom, right, top) in the store's CRS.
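For a north-up raster, bounds in this (left, bottom, right, top) order follow from the affine coefficients and the raster's height and width. This is only a sketch of that relationship under the assumption of an unrotated transform; how ZarrStore actually obtains its bounds is not shown here, and `bounds_from_transform` is a hypothetical name:

```python
def bounds_from_transform(coeffs, height, width):
    """Return (left, bottom, right, top) for an unrotated affine transform.

    coeffs is (a, b, c, d, e, f): a and e are the pixel width and height
    (e is negative for north-up rasters), c and f are the origin.
    """
    a, b, c, d, e, f = coeffs
    if b != 0 or d != 0:
        raise ValueError("rotated transforms are not handled in this sketch")
    xs = (c, c + a * width)
    ys = (f, f + e * height)
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical 100-row by 200-column raster with 30 m pixels.
coeffs = (30.0, 0.0, 500000.0, 0.0, -30.0, 4200000.0)
print(bounds_from_transform(coeffs, height=100, width=200))
# (500000.0, 4197000.0, 506000.0, 4200000.0)
```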
num_species property¶
Number of species layers in the store.
This includes the total biomass layer if present (typically at index 0).
Returns¶
int: Count of species layers.
shape property¶
Shape of the biomass array (species, height, width).
from_path classmethod¶
from_path(path: Union[str, Path], mode: str = 'r') -> ZarrStore
Create a ZarrStore from a file path.
This is the primary way to open an existing Zarr store for reading.
Parameters¶
- path (str or Path): Path to the Zarr store directory.
- mode (str, default='r'): Mode to open the store. Options:
  - 'r': Read-only (default)
  - 'r+': Read/write existing store
Returns¶
ZarrStore: Initialized ZarrStore instance.
Raises¶
- FileNotFoundError: If the path does not exist.
- InvalidZarrStructure: If the path is not a valid GridFIA Zarr store.
Examples¶
store = ZarrStore.from_path("data/forest.zarr")
print(f"CRS: {store.crs}")
open classmethod¶
open(path: Union[str, Path], mode: str = 'r') -> Iterator[ZarrStore]
Context manager for safely opening and closing a ZarrStore.
This ensures proper resource cleanup when done accessing the store.
Parameters¶
- path (str or Path): Path to the Zarr store directory.
- mode (str, default='r'): Mode to open the store ('r' for read-only, 'r+' for read/write).
Yields¶
ZarrStore: Initialized ZarrStore instance.
Examples¶
with ZarrStore.open("data/forest.zarr") as store:
    total_biomass = np.sum(store.biomass[:])
    print(f"Total biomass: {total_biomass:.2f}")
is_valid_store classmethod¶
Check if a path contains a valid GridFIA Zarr store.
This performs a quick validation without fully loading the store.
Parameters¶
- path (str or Path): Path to check.
Returns¶
bool: True if the path contains a valid GridFIA Zarr store.
Examples¶
if ZarrStore.is_valid_store("data/forest.zarr"):
    store = ZarrStore.from_path("data/forest.zarr")
close¶
Close the Zarr store and release resources.
After calling close(), the store should not be accessed.
Basic Usage¶
from gridfia.utils.zarr_utils import ZarrStore
# Open a Zarr store
store = ZarrStore.from_path("data/forest.zarr")
# Access properties
print(f"Shape: {store.shape}") # (species, height, width)
print(f"Species: {store.num_species}")
print(f"CRS: {store.crs}")
print(f"Bounds: {store.bounds}")
# Access species information
for code, name in zip(store.species_codes, store.species_names):
    print(f" {code}: {name}")
# Access biomass data
biomass = store.biomass[:] # Load all data
print(f"Biomass shape: {biomass.shape}")
print(f"Total biomass range: {biomass[0].min():.2f} - {biomass[0].max():.2f}")
# Close when done
store.close()
Context Manager Usage¶
Use the context manager for automatic resource cleanup:
from gridfia.utils.zarr_utils import ZarrStore
with ZarrStore.open("data/forest.zarr") as store:
    # Access data within context
    biomass = store.biomass[:]
    species = store.species_codes
    # Process data
    total_biomass = biomass[0]  # First layer is total
    mean_biomass = total_biomass[total_biomass > 0].mean()
    print(f"Mean biomass: {mean_biomass:.2f} Mg/ha")
# Store automatically closed
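The `-> Iterator[ZarrStore]` signature of open() is what a generator-based context manager looks like. The sketch below shows that general pattern with a stand-in class: `DemoStore` is hypothetical, opens no files, and is not the real ZarrStore implementation.

```python
from contextlib import contextmanager


class DemoStore:
    """Stand-in for a store object; not the real ZarrStore."""

    def __init__(self, path):
        self.path = path
        self.closed = False

    def close(self):
        self.closed = True

    @classmethod
    @contextmanager
    def open(cls, path):
        store = cls(path)
        try:
            yield store
        finally:
            # Runs on normal exit and when the with-body raises.
            store.close()


with DemoStore.open("data/forest.zarr") as store:
    inside = store.closed

print(inside, store.closed)  # False True
```

The try/finally around the yield is what guarantees close() is called even if the body of the with block raises.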
Working with Species Data¶
from gridfia.utils.zarr_utils import ZarrStore
import numpy as np
with ZarrStore.open("data/forest.zarr") as store:
    # Get index of specific species
    species_code = "0202"  # Douglas-fir
    try:
        idx = store.species_codes.index(species_code)
    except ValueError:
        # Only the lookup is guarded, so errors from the statistics
        # below are not misreported as a missing species.
        print(f"Species {species_code} not in store")
    else:
        species_data = store.biomass[idx]
        # Calculate statistics over pixels where the species is present
        valid_data = species_data[species_data > 0]
        if valid_data.size:
            print(f"Species {species_code}:")
            print(f" Mean biomass: {valid_data.mean():.2f} Mg/ha")
            print(f" Max biomass: {valid_data.max():.2f} Mg/ha")
            print(f" Coverage: {len(valid_data) / species_data.size * 100:.1f}%")
        else:
            print(f"Species {species_code} has no biomass in this store")
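When many species are looked up in a row, repeated list.index() calls each scan the list. One alternative is to build a code-to-index mapping once. The helper name and the example codes below are made up for illustration:

```python
def build_species_index(species_codes):
    """Map each species code to its position along the first array axis."""
    return {code: i for i, code in enumerate(species_codes)}


# Made-up code list standing in for store.species_codes.
codes = ["0000", "0131", "0202"]
index = build_species_index(codes)

print(index.get("0202"))  # 2
print(index.get("9999"))  # None
```

Using dict.get() returns None for a missing code instead of raising, so the caller can decide how to report it.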
Chunked Reading for Large Datasets¶
from gridfia.utils.zarr_utils import ZarrStore
import numpy as np
with ZarrStore.open("data/large_forest.zarr") as store:
    # Read in chunks to avoid memory issues
    chunk_size = 1000
    height, width = store.shape[1], store.shape[2]
    total_sum = 0.0
    total_count = 0
    for y in range(0, height, chunk_size):
        for x in range(0, width, chunk_size):
            y_end = min(y + chunk_size, height)
            x_end = min(x + chunk_size, width)
            # Read chunk
            chunk = store.biomass[0, y:y_end, x:x_end]
            valid = chunk[chunk > 0]
            # Accumulate in float64 so precision is not lost across chunks
            total_sum += valid.sum(dtype=np.float64)
            total_count += len(valid)
    mean_biomass = total_sum / total_count if total_count > 0 else 0
    print(f"Mean biomass: {mean_biomass:.2f} Mg/ha")
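The tiling logic in the loop above can be factored into a small generator so it is reusable and testable without a store. This is a sketch; `iter_windows` is a hypothetical helper, not a GridFIA function:

```python
def iter_windows(height, width, chunk_size):
    """Yield (row_slice, col_slice) pairs that tile a height x width raster."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for y in range(0, height, chunk_size):
        for x in range(0, width, chunk_size):
            # Edge windows are clipped to the raster extent.
            yield (slice(y, min(y + chunk_size, height)),
                   slice(x, min(x + chunk_size, width)))


windows = list(iter_windows(2500, 1800, 1000))
print(len(windows))  # 6
print(windows[-1])   # (slice(2000, 2500, None), slice(1000, 1800, None))
```

Each pair can be used directly as `store.biomass[0, rows, cols]`. Reads tend to be most efficient when chunk_size matches, or is a multiple of, the spatial chunk shape the store was written with.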
Validation¶
from gridfia.utils.zarr_utils import ZarrStore
from pathlib import Path
# Quick validation without full loading
path = Path("data/forest.zarr")
if ZarrStore.is_valid_store(path):
    print("Valid GridFIA Zarr store")
else:
    print("Invalid or non-GridFIA Zarr store")
LocationConfig¶
The LocationConfig class manages geographic extents for any US state, county, or custom region.
The constructor initializes a configuration from a YAML file or creates one from a location.
| PARAMETER | DESCRIPTION |
|---|---|
| config_path | Path to configuration YAML file |
| location_type | Type of location ("state", "county", "custom") |
from_state classmethod¶
from_state(state: str, output_path: Optional[Path] = None) -> LocationConfig
Create configuration for a specific US state.
| PARAMETER | DESCRIPTION |
|---|---|
| state | State name or abbreviation. TYPE: str |
| output_path | Path to save configuration file. TYPE: Optional[Path] |

| RETURNS | DESCRIPTION |
|---|---|
| LocationConfig | LocationConfig instance for the state |
from_county classmethod¶
from_county(county: str, state: str, output_path: Optional[Path] = None) -> LocationConfig
Create configuration for a specific county.
| PARAMETER | DESCRIPTION |
|---|---|
| county | County name. TYPE: str |
| state | State name or abbreviation. TYPE: str |
| output_path | Path to save configuration file. TYPE: Optional[Path] |

| RETURNS | DESCRIPTION |
|---|---|
| LocationConfig | LocationConfig instance for the county |
from_bbox classmethod¶
from_bbox(bbox: Tuple[float, float, float, float], name: str = 'Custom Region', crs: str = 'EPSG:4326', output_path: Optional[Path] = None) -> LocationConfig
Create configuration for a custom bounding box.
| PARAMETER | DESCRIPTION |
|---|---|
| bbox | Bounding box (xmin, ymin, xmax, ymax). TYPE: Tuple[float, float, float, float] |
| name | Name for the region. TYPE: str |
| crs | CRS of the bounding box. TYPE: str |
| output_path | Path to save configuration file. TYPE: Optional[Path] |

| RETURNS | DESCRIPTION |
|---|---|
| LocationConfig | LocationConfig instance for the custom region |
Creating Location Configurations¶
from gridfia.utils.location_config import LocationConfig
# Create configuration for a state
config = LocationConfig.from_state("Montana")
print(f"Location: {config.location_name}")
print(f"Bbox (Web Mercator): {config.web_mercator_bbox}")
# Save to file for reuse
config = LocationConfig.from_state(
    "Montana",
    output_path="config/montana.yaml"
)
from gridfia.utils.location_config import LocationConfig
# Create configuration for a county
config = LocationConfig.from_county(
    county="Wake",
    state="North Carolina"
)
print(f"Location: {config.location_name}")
print(f"Bbox: {config.web_mercator_bbox}")
# Save configuration
config = LocationConfig.from_county(
    county="Harris",
    state="Texas",
    output_path="config/harris_county.yaml"
)
from gridfia.utils.location_config import LocationConfig
# WGS84 bounding box (lon/lat)
config = LocationConfig.from_bbox(
    bbox=(-123.5, 45.0, -122.0, 46.5),
    name="Pacific Northwest Study Area",
    crs="EPSG:4326"
)
print(f"Location: {config.location_name}")
print(f"Web Mercator bbox: {config.web_mercator_bbox}")
# Save custom region
config = LocationConfig.from_bbox(
    bbox=(-123.5, 45.0, -122.0, 46.5),
    name="PNW Study Area",
    crs="EPSG:4326",
    output_path="config/pnw_study.yaml"
)
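To give a sense of what a Web Mercator bounding box looks like for a lon/lat input, the sketch below applies the standard spherical Mercator formulas used by EPSG:3857. It is an illustration only: LocationConfig may well use a projection library internally, and the function names here are hypothetical.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 semi-major axis, the sphere radius used by EPSG:3857


def lonlat_to_web_mercator(lon, lat):
    """Project WGS84 degrees to Web Mercator metres (valid for |lat| < ~85.05)."""
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


def bbox_to_web_mercator(bbox):
    """Project an (xmin, ymin, xmax, ymax) lon/lat box by its two corners."""
    xmin, ymin, xmax, ymax = bbox
    x0, y0 = lonlat_to_web_mercator(xmin, ymin)
    x1, y1 = lonlat_to_web_mercator(xmax, ymax)
    return (x0, y0, x1, y1)


print(bbox_to_web_mercator((-123.5, 45.0, -122.0, 46.5)))
```

Projecting only two corners is enough here because Web Mercator maps meridians and parallels to straight, axis-aligned lines. That shortcut does not hold for most other projections.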
Using Configurations¶
from gridfia import GridFIA
from gridfia.utils.location_config import LocationConfig
from pathlib import Path
api = GridFIA()
# Create and save configuration
config = LocationConfig.from_county(
    county="Wake",
    state="NC",
    output_path="config/wake.yaml"
)
# Use configuration for download
files = api.download_species(
    location_config="config/wake.yaml",
    species_codes=["0131", "0316"],
    output_dir="data/wake"
)
Loading Saved Configurations¶
from gridfia.utils.location_config import LocationConfig
from pathlib import Path
# Load from file
config = LocationConfig(Path("config/montana.yaml"))
print(f"Location: {config.location_name}")
print(f"Bbox: {config.web_mercator_bbox}")
Helper Functions¶
Zarr Creation¶
from gridfia.utils.zarr_utils import create_zarr_from_geotiffs, validate_zarr_store
from pathlib import Path
# Create Zarr from GeoTIFFs
create_zarr_from_geotiffs(
    output_zarr_path=Path("data/forest.zarr"),
    geotiff_paths=[
        Path("downloads/species_0202.tif"),
        Path("downloads/species_0122.tif"),
    ],
    species_codes=["0202", "0122"],
    species_names=["Douglas-fir", "Ponderosa pine"],
    chunk_size=(1, 1000, 1000),
    compression="lz4",
    compression_level=5,
    include_total=True
)
# Validate the created store
info = validate_zarr_store(Path("data/forest.zarr"))
print(f"Shape: {info['shape']}")
print(f"Species: {info['num_species']}")
print(f"Chunks: {info['chunks']}")
print(f"CRS: {info['crs']}")
Configuration Loading¶
from pathlib import Path
from gridfia.config import load_settings, save_settings, GridFIASettings
# Load from file
settings = load_settings(Path("config/production.yaml"))
# Save current settings
save_settings(settings, Path("config/backup.json"))
Integration with NumPy and Xarray¶
NumPy Integration¶
from gridfia.utils.zarr_utils import ZarrStore
import numpy as np
with ZarrStore.open("data/forest.zarr") as store:
    # Load as NumPy array
    biomass = np.asarray(store.biomass)
    # Calculate species richness.
    # If index 0 holds total biomass, slice it off first (biomass[1:])
    # so it is not counted as a species.
    presence = biomass > 0
    richness = presence.sum(axis=0)
    print(f"Max richness: {richness.max()} species")
Xarray Integration¶
from gridfia.utils.zarr_utils import ZarrStore
import xarray as xr
import numpy as np
with ZarrStore.open("data/forest.zarr") as store:
    # Create xarray DataArray
    da = xr.DataArray(
        store.biomass[:],
        dims=["species", "y", "x"],
        coords={
            "species": store.species_codes,
        },
        attrs={
            "crs": str(store.crs),
            "units": "Mg/ha"
        }
    )
    # Xarray operations
    total = da.sum(dim="species")
    mean_by_species = da.mean(dim=["y", "x"])
    print(mean_by_species)
See Also¶
- GridFIA Class - High-level API
- Data Models - API return types
- Configuration - Settings management