Utilities

GridFIA provides utility classes for advanced users who need direct access to Zarr stores and location configuration.

Overview

Class | Description | Use Case
ZarrStore | Unified Zarr access interface | Reading biomass data directly
LocationConfig | Geographic extent management | Custom region configuration

ZarrStore

The ZarrStore class provides a unified interface for reading Zarr stores created by GridFIA and handles both Zarr v2 and v3 formats transparently. It exposes typed properties for common access patterns and supports use as a context manager for safe resource management.
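To make the v2/v3 distinction concrete, here is a minimal sketch of how the two on-disk layouts can be told apart by their root metadata files. It is not GridFIA's own detection logic, which this page does not show; `detect_zarr_format` is a hypothetical helper, and the only facts relied on are that Zarr v2 stores metadata in `.zgroup`/`.zarray` files while Zarr v3 uses a single `zarr.json`.

```python
import json
import tempfile
from pathlib import Path


def detect_zarr_format(store_path: Path) -> int:
    """Hypothetical helper: return 2 or 3 based on the root metadata file."""
    if (store_path / "zarr.json").is_file():
        # Zarr v3 keeps node metadata in a single zarr.json document.
        return 3
    if (store_path / ".zgroup").is_file() or (store_path / ".zarray").is_file():
        # Zarr v2 uses hidden .zgroup / .zarray metadata files.
        return 2
    raise ValueError(f"{store_path} does not look like a Zarr store")


# Build two tiny, empty group layouts to exercise the check.
with tempfile.TemporaryDirectory() as tmp:
    v2 = Path(tmp) / "v2.zarr"
    v2.mkdir()
    (v2 / ".zgroup").write_text(json.dumps({"zarr_format": 2}))

    v3 = Path(tmp) / "v3.zarr"
    v3.mkdir()
    (v3 / "zarr.json").write_text(
        json.dumps({"zarr_format": 3, "node_type": "group"})
    )

    detected = (detect_zarr_format(v2), detect_zarr_format(v3))
```

In normal use none of this is needed, since ZarrStore is described as handling both formats for you; the sketch is only meant to show what "transparently" hides.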

Attributes

path : Path
    Path to the Zarr store directory.
biomass : zarr.Array
    The main biomass array with shape (species, height, width).
species_codes : List[str]
    List of 4-digit FIA species codes.
species_names : List[str]
    List of species common names.
crs : CRS
    Coordinate reference system.
transform : Affine
    Affine transformation for georeferencing.
bounds : Tuple[float, float, float, float]
    Geographic bounds (left, bottom, right, top).
num_species : int
    Number of species in the store.
shape : Tuple[int, int, int]
    Shape of the biomass array (species, height, width).

Examples

Basic usage:

store = ZarrStore.from_path("data/montana.zarr")
print(f"Species: {store.num_species}")
print(f"Shape: {store.shape}")
data = store.biomass[0, :, :]  # Get first species layer

Using context manager:

with ZarrStore.open("data/montana.zarr") as store:
    richness = np.sum(store.biomass[:] > 0, axis=0)

Iterating over species:

store = ZarrStore.from_path("data/forest.zarr")
for code, name in zip(store.species_codes, store.species_names):
    print(f"{code}: {name}")

Initialize ZarrStore from an open Zarr group.

Prefer using class methods from_path() or open() instead of calling this constructor directly.

Parameters

root : zarr.Group
    Open Zarr group containing the biomass data.
store : zarr.storage.LocalStore, optional
    The underlying storage object (for resource management).
path : Path, optional
    Path to the Zarr store on disk.

biomass property

biomass: Array

The main biomass array.

Shape is (species, height, width), where:
- species: Number of species layers (index 0 is often total biomass)
- height: Number of rows in the raster
- width: Number of columns in the raster

Values are typically in Mg/ha (megagrams per hectare).

Returns

zarr.Array
    3D array of biomass values.

species_codes property

species_codes: List[str]

List of 4-digit FIA species codes.

The first code ('0000') typically represents total biomass. Codes are zero-padded 4-digit strings (e.g., '0202' for Douglas-fir).

Returns

List[str]
    Species codes in order matching the biomass array's first dimension.
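Because the codes are zero-padded strings, a lookup with an integer such as 202 or an unpadded "202" will not match. The sketch below shows one way to normalize input before searching; `normalize_species_code` and `find_layer` are hypothetical helpers, not part of GridFIA, and the example list stands in for `store.species_codes`.

```python
from typing import List, Optional, Union


def normalize_species_code(code: Union[int, str]) -> str:
    """Return a zero-padded 4-digit code string, e.g. 202 -> '0202'."""
    text = str(code).strip()
    if not text.isdigit() or len(text) > 4:
        raise ValueError(f"Not a valid 4-digit species code: {code!r}")
    return text.zfill(4)


def find_layer(species_codes: List[str], code: Union[int, str]) -> Optional[int]:
    """Return the index of `code` in `species_codes`, or None if absent."""
    normalized = normalize_species_code(code)
    try:
        return species_codes.index(normalized)
    except ValueError:
        return None


# Stand-in for store.species_codes; '0000' is the (typical) total layer.
codes = ["0000", "0122", "0202"]
douglas_fir_idx = find_layer(codes, 202)
missing_idx = find_layer(codes, "0316")
```

The returned index can then be used as the first index into the biomass array, since the codes are in the same order as its first dimension.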

species_names property

species_names: List[str]

List of species common names.

Names correspond to species codes and are in the same order as the biomass array's first dimension.

Returns

List[str]
    Species names (e.g., 'Douglas-fir', 'Ponderosa Pine').

crs property

crs: CRS

Coordinate reference system for the data.

Returns

rasterio.crs.CRS
    The CRS object (default is EPSG:3857 / Web Mercator if not specified).

transform property

transform: Affine

Affine transformation for georeferencing.

The transform maps pixel coordinates to geographic coordinates: geo_x, geo_y = transform * (pixel_x, pixel_y)

Returns

rasterio.transform.Affine
    Affine transformation matrix.
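The mapping above can be worked through by hand for the common north-up case (no rotation, negative pixel height). The following is a plain-Python sketch of that arithmetic, not the rasterio/affine implementation; `NorthUpTransform` and its field names are hypothetical, and the origin and 30-unit pixel size are made-up values. Note that pixel_x is the column and pixel_y is the row, and that the result is the upper-left corner of the pixel.

```python
from typing import NamedTuple, Tuple


class NorthUpTransform(NamedTuple):
    """Hypothetical stand-in for an unrotated affine transform."""
    pixel_width: float   # x step per column (positive)
    x_origin: float      # x of the raster's left edge
    pixel_height: float  # y step per row (negative for north-up)
    y_origin: float      # y of the raster's top edge


def pixel_to_geo(t: NorthUpTransform, col: float, row: float) -> Tuple[float, float]:
    """Equivalent of `transform * (col, row)` when there is no rotation."""
    return (t.x_origin + col * t.pixel_width, t.y_origin + row * t.pixel_height)


def geo_to_pixel(t: NorthUpTransform, x: float, y: float) -> Tuple[int, int]:
    """Inverse mapping: return (row, col) of the pixel containing (x, y)."""
    col = int((x - t.x_origin) // t.pixel_width)
    row = int((y - t.y_origin) // t.pixel_height)
    return row, col


# Assumed values for illustration only.
t = NorthUpTransform(30.0, -12_000_000.0, -30.0, 6_000_000.0)
corner = pixel_to_geo(t, col=2, row=3)
row_col = geo_to_pixel(t, -11_999_925.0, 5_999_895.0)
```

The inverse direction is what you need to turn a coordinate in the store's CRS into indices for `store.biomass[layer, row, col]`.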

bounds property

bounds: Tuple[float, float, float, float]

Geographic bounds of the data.

Returns

Tuple[float, float, float, float]
    Bounds as (left, bottom, right, top) in the store's CRS.
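For a north-up raster, bounds in this order follow directly from the transform origin, the pixel size, and the array shape. The sketch below shows that relationship; how ZarrStore actually obtains its bounds is not stated here, so `bounds_from_transform` is a hypothetical helper and the numbers are illustrative.

```python
from typing import Tuple


def bounds_from_transform(
    x_origin: float,
    y_origin: float,
    pixel_width: float,
    pixel_height: float,
    width: int,
    height: int,
) -> Tuple[float, float, float, float]:
    """Return (left, bottom, right, top) for an unrotated, north-up raster.

    `pixel_height` is expected to be negative, so the bottom edge lies
    below the top-edge origin.
    """
    left = x_origin
    top = y_origin
    right = x_origin + width * pixel_width
    bottom = y_origin + height * pixel_height
    return (left, bottom, right, top)


# Assumed values: 30-unit pixels, 500 rows by 1000 columns.
bounds = bounds_from_transform(
    -12_000_000.0, 6_000_000.0, 30.0, -30.0, width=1000, height=500
)
```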

num_species property

num_species: int

Number of species layers in the store.

This includes the total biomass layer if present (typically at index 0).

Returns

int
    Count of species layers.

shape property

shape: Tuple[int, int, int]

Shape of the biomass array.

Returns

Tuple[int, int, int]
    Shape as (species, height, width).

from_path classmethod

from_path(path: Union[str, Path], mode: str = 'r') -> ZarrStore

Create a ZarrStore from a file path.

This is the primary way to open an existing Zarr store for reading.

Parameters

path : str or Path
    Path to the Zarr store directory.
mode : str, default='r'
    Mode to open the store. Options:
    - 'r': Read-only (default)
    - 'r+': Read/write existing store

Returns

ZarrStore
    Initialized ZarrStore instance.

Raises

FileNotFoundError
    If the path does not exist.
InvalidZarrStructure
    If the path is not a valid GridFIA Zarr store.

Examples

store = ZarrStore.from_path("data/forest.zarr")
print(f"CRS: {store.crs}")

open classmethod

open(path: Union[str, Path], mode: str = 'r') -> Iterator[ZarrStore]

Context manager for safely opening and closing a ZarrStore.

This ensures proper resource cleanup when done accessing the store.

Parameters

path : str or Path
    Path to the Zarr store directory.
mode : str, default='r'
    Mode to open the store ('r' for read-only, 'r+' for read/write).

Yields

ZarrStore
    Initialized ZarrStore instance.

Examples

with ZarrStore.open("data/forest.zarr") as store:
    total_biomass = np.sum(store.biomass[:])
    print(f"Total biomass: {total_biomass:.2f}")
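The `Iterator[ZarrStore]` return annotation together with a "Yields" section is the usual signature of a generator wrapped by `contextlib.contextmanager`. The sketch below shows that general pattern with a hypothetical `DemoStore` class; it is an assumption about the shape of the implementation, not GridFIA's actual code.

```python
from contextlib import contextmanager
from typing import Iterator


class DemoStore:
    """Hypothetical stand-in for a store that owns a closable resource."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.closed = False

    def close(self) -> None:
        self.closed = True

    @classmethod
    @contextmanager
    def open(cls, path: str) -> Iterator["DemoStore"]:
        store = cls(path)
        try:
            # Control passes to the body of the `with` block here.
            yield store
        finally:
            # Runs on normal exit and when the body raises.
            store.close()


with DemoStore.open("data/forest.zarr") as store:
    inside = store.closed
after = store.closed

# The store is also closed when the body raises.
try:
    with DemoStore.open("data/forest.zarr") as failing:
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass
closed_after_error = failing.closed
```

The `try`/`finally` around the `yield` is what gives the cleanup guarantee that a bare `from_path()` followed by a manual `close()` does not.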

is_valid_store classmethod

is_valid_store(path: Union[str, Path]) -> bool

Check if a path contains a valid GridFIA Zarr store.

This performs a quick validation without fully loading the store.

Parameters

path : str or Path
    Path to check.

Returns

bool
    True if the path contains a valid GridFIA Zarr store.

Examples

if ZarrStore.is_valid_store("data/forest.zarr"):
    store = ZarrStore.from_path("data/forest.zarr")

close

close() -> None

Close the Zarr store and release resources.

After calling close(), the store should not be accessed.

Basic Usage

from gridfia.utils.zarr_utils import ZarrStore

# Open a Zarr store
store = ZarrStore.from_path("data/forest.zarr")

# Access properties
print(f"Shape: {store.shape}")  # (species, height, width)
print(f"Species: {store.num_species}")
print(f"CRS: {store.crs}")
print(f"Bounds: {store.bounds}")

# Access species information
for code, name in zip(store.species_codes, store.species_names):
    print(f"  {code}: {name}")

# Access biomass data
biomass = store.biomass[:]  # Load all data
print(f"Biomass shape: {biomass.shape}")
print(f"Total biomass range: {biomass[0].min():.2f} - {biomass[0].max():.2f}")

# Close when done
store.close()

Context Manager Usage

Use the context manager for automatic resource cleanup:

from gridfia.utils.zarr_utils import ZarrStore

with ZarrStore.open("data/forest.zarr") as store:
    # Access data within context
    biomass = store.biomass[:]
    species = store.species_codes

    # Process data
    total_biomass = biomass[0]  # First layer is total
    mean_biomass = total_biomass[total_biomass > 0].mean()
    print(f"Mean biomass: {mean_biomass:.2f} Mg/ha")
# Store automatically closed

Working with Species Data

from gridfia.utils.zarr_utils import ZarrStore
import numpy as np

with ZarrStore.open("data/forest.zarr") as store:
    # Get index of specific species
    species_code = "0202"  # Douglas-fir
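    try:
        idx = store.species_codes.index(species_code)
    except ValueError:
        # list.index raises ValueError when the code is absent
        idx = None
        print(f"Species {species_code} not in store")

    if idx is not None:
        species_data = store.biomass[idx]

        # Calculate statistics over pixels where the species is present
        valid_data = species_data[species_data > 0]
        print(f"Species {species_code}:")
        if valid_data.size > 0:
            print(f"  Mean biomass: {valid_data.mean():.2f} Mg/ha")
            print(f"  Max biomass: {valid_data.max():.2f} Mg/ha")
            print(f"  Coverage: {len(valid_data) / species_data.size * 100:.1f}%")
        else:
            # Avoids calling max() on an empty array, which raises ValueError
            print("  No pixels with biomass above zero")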

Chunked Reading for Large Datasets

from gridfia.utils.zarr_utils import ZarrStore
import numpy as np

with ZarrStore.open("data/large_forest.zarr") as store:
    # Read in chunks to avoid memory issues
    chunk_size = 1000
    height, width = store.shape[1], store.shape[2]

    total_sum = 0
    total_count = 0

    for y in range(0, height, chunk_size):
        for x in range(0, width, chunk_size):
            y_end = min(y + chunk_size, height)
            x_end = min(x + chunk_size, width)

            # Read chunk
            chunk = store.biomass[0, y:y_end, x:x_end]
            valid = chunk[chunk > 0]

            total_sum += valid.sum()
            total_count += len(valid)

    mean_biomass = total_sum / total_count if total_count > 0 else 0
    print(f"Mean biomass: {mean_biomass:.2f} Mg/ha")
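The nested loops above can be factored into a small generator that yields row and column slices, which keeps the window arithmetic in one place. This is a sketch using only the standard library; `iter_windows` is a hypothetical helper, not a GridFIA function, and the raster size is made up.

```python
from typing import Iterator, Tuple


def iter_windows(
    height: int, width: int, chunk_size: int
) -> Iterator[Tuple[slice, slice]]:
    """Yield (row_slice, col_slice) pairs that tile a height x width raster."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for y in range(0, height, chunk_size):
        for x in range(0, width, chunk_size):
            # Edge windows are clipped to the raster extent.
            yield (
                slice(y, min(y + chunk_size, height)),
                slice(x, min(x + chunk_size, width)),
            )


# A 2500 x 1200 raster read in windows of up to 1000 x 1000 pixels.
windows = list(iter_windows(2500, 1200, 1000))
covered = sum(
    (rows.stop - rows.start) * (cols.stop - cols.start) for rows, cols in windows
)
```

Inside the context manager, each pair can be used directly as `store.biomass[0, rows, cols]`. Reads are generally cheapest when the window size matches the store's own chunk size, such as the 1000 x 1000 spatial chunks used in the Zarr creation example on this page.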

Validation

from gridfia.utils.zarr_utils import ZarrStore
from pathlib import Path

# Quick validation without full loading
path = Path("data/forest.zarr")
if ZarrStore.is_valid_store(path):
    print("Valid GridFIA Zarr store")
else:
    print("Invalid or non-GridFIA Zarr store")

LocationConfig

The LocationConfig class manages geographic extents for any US state, county, or custom region.

Configuration manager for any geographic location (state, county, custom region).

Initialize configuration from YAML file or create from location.

Parameter | Type | Default | Description
config_path | Optional[Path] | None | Path to configuration YAML file
location_type | str | 'state' | Type of location ("state", "county", "custom")

from_state classmethod

from_state(state: str, output_path: Optional[Path] = None) -> LocationConfig

Create configuration for a specific US state.

Parameter | Type | Default | Description
state | str | required | State name or abbreviation
output_path | Optional[Path] | None | Path to save configuration file

Returns | Description
LocationConfig | LocationConfig instance for the state

from_county classmethod

from_county(county: str, state: str, output_path: Optional[Path] = None) -> LocationConfig

Create configuration for a specific county.

Parameter | Type | Default | Description
county | str | required | County name
state | str | required | State name or abbreviation
output_path | Optional[Path] | None | Path to save configuration file

Returns | Description
LocationConfig | LocationConfig instance for the county

from_bbox classmethod

from_bbox(bbox: Tuple[float, float, float, float], name: str = 'Custom Region', crs: str = 'EPSG:4326', output_path: Optional[Path] = None) -> LocationConfig

Create configuration for a custom bounding box.

Parameter | Type | Default | Description
bbox | Tuple[float, float, float, float] | required | Bounding box (xmin, ymin, xmax, ymax)
name | str | 'Custom Region' | Name for the region
crs | str | 'EPSG:4326' | CRS of the bounding box
output_path | Optional[Path] | None | Path to save configuration file

Returns | Description
LocationConfig | LocationConfig instance for the custom region

Creating Location Configurations

from gridfia.utils.location_config import LocationConfig

# Create configuration for a state
config = LocationConfig.from_state("Montana")

print(f"Location: {config.location_name}")
print(f"Bbox (Web Mercator): {config.web_mercator_bbox}")

# Save to file for reuse
config = LocationConfig.from_state(
    "Montana",
    output_path="config/montana.yaml"
)

from gridfia.utils.location_config import LocationConfig

# Create configuration for a county
config = LocationConfig.from_county(
    county="Wake",
    state="North Carolina"
)

print(f"Location: {config.location_name}")
print(f"Bbox: {config.web_mercator_bbox}")

# Save configuration
config = LocationConfig.from_county(
    county="Harris",
    state="Texas",
    output_path="config/harris_county.yaml"
)

from gridfia.utils.location_config import LocationConfig

# WGS84 bounding box (lon/lat)
config = LocationConfig.from_bbox(
    bbox=(-123.5, 45.0, -122.0, 46.5),
    name="Pacific Northwest Study Area",
    crs="EPSG:4326"
)

print(f"Location: {config.location_name}")
print(f"Web Mercator bbox: {config.web_mercator_bbox}")

# Save custom region
config = LocationConfig.from_bbox(
    bbox=(-123.5, 45.0, -122.0, 46.5),
    name="PNW Study Area",
    crs="EPSG:4326",
    output_path="config/pnw_study.yaml"
)
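The examples above pass a WGS84 (lon/lat) box and print a Web Mercator one. For readers who want to sanity-check those numbers, here is a sketch of the standard spherical Web Mercator (EPSG:3857) forward formulas in plain Python. It is not the code LocationConfig uses, which this page does not show; `lonlat_to_web_mercator` and `bbox_to_web_mercator` are hypothetical helpers.

```python
import math
from typing import Tuple

EARTH_RADIUS_M = 6378137.0  # sphere radius used by EPSG:3857


def lonlat_to_web_mercator(lon: float, lat: float) -> Tuple[float, float]:
    """Project one lon/lat point (degrees) to Web Mercator metres."""
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


def bbox_to_web_mercator(
    bbox: Tuple[float, float, float, float]
) -> Tuple[float, float, float, float]:
    """Convert (xmin, ymin, xmax, ymax) in degrees to Web Mercator metres.

    Projecting only two corners is enough here because x depends only on
    longitude and y only on latitude in this projection.
    """
    xmin, ymin, xmax, ymax = bbox
    left, bottom = lonlat_to_web_mercator(xmin, ymin)
    right, top = lonlat_to_web_mercator(xmax, ymax)
    return (left, bottom, right, top)


merc = bbox_to_web_mercator((-123.5, 45.0, -122.0, 46.5))
```

For other source CRSs the corner shortcut does not hold in general, and a projection library such as pyproj is the safer choice.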

Using Configurations

from gridfia import GridFIA
from gridfia.utils.location_config import LocationConfig
from pathlib import Path

api = GridFIA()

# Create and save configuration
config = LocationConfig.from_county(
    county="Wake",
    state="NC",
    output_path="config/wake.yaml"
)

# Use configuration for download
files = api.download_species(
    location_config="config/wake.yaml",
    species_codes=["0131", "0316"],
    output_dir="data/wake"
)

Loading Saved Configurations

from gridfia.utils.location_config import LocationConfig
from pathlib import Path

# Load from file
config = LocationConfig(Path("config/montana.yaml"))

print(f"Location: {config.location_name}")
print(f"Bbox: {config.web_mercator_bbox}")

Helper Functions

Zarr Creation

from gridfia.utils.zarr_utils import create_zarr_from_geotiffs, validate_zarr_store
from pathlib import Path

# Create Zarr from GeoTIFFs
create_zarr_from_geotiffs(
    output_zarr_path=Path("data/forest.zarr"),
    geotiff_paths=[
        Path("downloads/species_0202.tif"),
        Path("downloads/species_0122.tif"),
    ],
    species_codes=["0202", "0122"],
    species_names=["Douglas-fir", "Ponderosa pine"],
    chunk_size=(1, 1000, 1000),
    compression="lz4",
    compression_level=5,
    include_total=True
)

# Validate the created store
info = validate_zarr_store(Path("data/forest.zarr"))
print(f"Shape: {info['shape']}")
print(f"Species: {info['num_species']}")
print(f"Chunks: {info['chunks']}")
print(f"CRS: {info['crs']}")

Configuration Loading

from gridfia.config import load_settings, save_settings, GridFIASettings
from pathlib import Path

# Load from file
settings = load_settings(Path("config/production.yaml"))

# Save current settings
save_settings(settings, Path("config/backup.json"))

Integration with NumPy and Xarray

NumPy Integration

from gridfia.utils.zarr_utils import ZarrStore
import numpy as np

with ZarrStore.open("data/forest.zarr") as store:
    # Load as NumPy array
    biomass = np.asarray(store.biomass)

    # Calculate species richness
    presence = biomass > 0
    richness = presence.sum(axis=0)

    print(f"Max richness: {richness.max()} species")

Xarray Integration

from gridfia.utils.zarr_utils import ZarrStore
import xarray as xr
import numpy as np

with ZarrStore.open("data/forest.zarr") as store:
    # Create xarray DataArray
    da = xr.DataArray(
        store.biomass[:],
        dims=["species", "y", "x"],
        coords={
            "species": store.species_codes,
        },
        attrs={
            "crs": str(store.crs),
            "units": "Mg/ha"
        }
    )

    # Xarray operations
    total = da.sum(dim="species")
    mean_by_species = da.mean(dim=["y", "x"])

    print(mean_by_species)

See Also