Contributing Guide¶
Thank you for your interest in contributing. The sections below outline how the library is structured, how to submit changes, and the conventions to follow when developing new features or improving existing functionality.
For convenience, you can copy this auto-updated LLM priming prompt with function headers and docs.
Collaboration Instructions¶
We welcome all contributions to the project! Please be respectful and work towards improving the library. To get started:
- Create an issue describing the feature or bug, or simply to ask a question. Provide relevant context, desired timeline, any assistance needed, who will be responsible for the work, anticipated results, and any other details.
- Fork the repository and create a new feature branch.
- Make your changes and add any necessary tests.
- Open a Pull Request against the main repository.
Design Philosophy¶
- Keep code concise and simple
- Adapt code for large datasets using windows, multiprocessing, progressive computations, etc.
- Keep code modular and have descriptive names
- Use PEP 8 code formatting
- Use functions that are already created when possible
- Combine similar params into one multi-value parameter
- Use the same naming conventions and input parameter formats as other functions
- Create docstrings (Google style), tests, and update the docs for new functionality
Extensible Function Types¶
Relative Radiometric Normalization (RRN) methods often differ in how images are matched, pixels are selected, and seamlines are created. This library organizes those steps into distinct Python packages, while other operations such as aligning rasters, applying masks, merging images, and calculating statistics are more consistent across techniques and are treated as standard utilities.
Matching functions¶
Used to adjust the pixel values of images to ensure radiometric consistency across scenes. These functions compute differences between images and apply transformations so that brightness, contrast, or spectral characteristics align across datasets.
Masking functions (PIF/RCS)¶
Used to define which parts of an image should be kept or discarded based on spatial criteria. These functions apply vector-based filters or logical rules to isolate regions of interest, remove clouds, or exclude invalid data from further processing.
Seamline functions¶
Used to determine optimal boundaries between overlapping image regions. These functions generate cutlines that split image footprints in a way that minimizes visible seams and balances spatial coverage, often relying on geometric relationships between overlapping areas.
Standard UI¶
Reusable types are organized into the types and validation module. Use these types directly as the types of params inside functions where applicable. Use the appropriate _resolve... function to resolve these inputs into usable variables.
Input/Output¶
The input_name parameter defines how the input files are determined and accepts either a str or a list. If given as a str, it should be either a folder path (in which case default_file_pattern must be set) or a complete glob pattern file path. Functions should default to searching for all appropriately formatted files in the input folder (for example "*.tif"). Alternatively, it can be a list of full file paths to individual input images. For example:
- input_images="/input/files/*.tif" (does not require default_file_pattern)
- input_images="/input/folder" (requires default_file_pattern to be set)
- input_images=["/input/one.tif", "/input/two.tif", ...] (does not require default_file_pattern)
The output_name parameter defines how output filenames are determined and accepts either a str or a list. If given as a str, it should be either a folder path (in which case default_file_pattern must be set) or a complete template pattern file path. Functions should default to a template made of the basename, an underscore, and the processing step (for example "$_Global"). Alternatively, it may be a list of full output paths, which must match the number of input images. For example:
- output_images="/output/files/$.tif" (does not require default_file_pattern)
- output_images="/output/folder" (requires default_file_pattern to be set)
- output_images=["/output/one.tif", "/output/two.tif", ...] (does not require default_file_pattern)
The _resolve_paths function handles creating folders for output files. Folders and files are distinguished by the presence of a "." in the basename.
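The input and output rules above can be illustrated with a minimal sketch. The helper names resolve_inputs and resolve_outputs are hypothetical and are not the library's _resolve_paths; the sketch assumes that "$" in a template stands for the input basename without its extension, and it uses the "." rule described above to tell folders from files.

```python
import glob
import os


def resolve_inputs(input_images, default_file_pattern=None):
    """Hypothetical helper: expand a folder, glob pattern, or list into file paths."""
    if isinstance(input_images, (list, tuple)):
        return list(input_images)
    # A "." in the basename marks a file pattern; otherwise treat it as a folder.
    if "." not in os.path.basename(input_images):
        if default_file_pattern is None:
            raise ValueError("default_file_pattern is required for folder inputs")
        input_images = os.path.join(input_images, default_file_pattern)
    return sorted(glob.glob(input_images))


def resolve_outputs(output_images, input_paths, default_file_pattern=None):
    """Hypothetical helper: build one output path per input image."""
    if isinstance(output_images, (list, tuple)):
        if len(output_images) != len(input_paths):
            raise ValueError("output list must match the number of input images")
        return list(output_images)
    if "." not in os.path.basename(output_images):
        if default_file_pattern is None:
            raise ValueError("default_file_pattern is required for folder outputs")
        output_images = os.path.join(output_images, default_file_pattern)
    folder, template = os.path.split(output_images)
    # Create the output folder if it does not exist yet.
    os.makedirs(folder or ".", exist_ok=True)
    stems = [os.path.splitext(os.path.basename(p))[0] for p in input_paths]
    # Assumption: "$" is replaced by the input basename, only in the file name.
    return [os.path.join(folder, template.replace("$", stem)) for stem in stems]
```

Replacing "$" only in the file name, rather than in the whole path, keeps a folder name that happens to contain "$" intact.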
Output dtype¶
The custom_output_dtype parameter specifies the data type for output rasters and defaults to the input image’s data type if not provided.
Nodata Value¶
When set, the custom_nodata_value parameter overrides the nodata value that would otherwise be read from the first raster in the input rasters.
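As a minimal sketch of this override rule, the hypothetical helper below (not part of the library) takes the nodata values already read from the input rasters and returns the value to use. It compares against None explicitly because 0 is a valid nodata value and would be skipped by a plain truthiness check.

```python
def resolve_nodata(input_nodata_values, custom_nodata_value=None):
    """Hypothetical helper: choose the nodata value for processing."""
    # Compare against None explicitly so that a custom value of 0 still counts.
    if custom_nodata_value is not None:
        return custom_nodata_value
    # Otherwise fall back to the first raster's nodata value, if there is one.
    return input_nodata_values[0] if input_nodata_values else None
```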
Debug Logs¶
The debug_logs parameter enables printing of debug information; it defaults to False. Functions should begin by printing "Start {process name}", while all other print statements should be conditional on debug_logs being True. When printing the image being processed, use the image name and not the image path.
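A minimal sketch of this logging convention follows. The function name process_images and the message text are placeholders, and the sketch assumes that "image name" means the file basename without its extension.

```python
import os


def process_images(image_paths, debug_logs=False):
    """Hypothetical function showing the print conventions only."""
    # The start message is always printed.
    print("Start example process")
    names = []
    for path in image_paths:
        # Log the image name rather than the full path.
        name = os.path.splitext(os.path.basename(path))[0]
        if debug_logs:
            print(f"    Processing {name}")
        names.append(name)
    return names
```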
Vector Mask¶
The vector_mask parameter limits statistics calculations to specific areas and is given as a tuple with two or three items: a literal "include" or "exclude" to define how the mask is applied, a string path to the vector file, and an optional field name used to match geometries based on the input image name (substring match allowed). Defaults to None for no mask.
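The tuple format can be checked and unpacked along these lines. This is an illustrative sketch with a hypothetical helper name, not the library's own validation code.

```python
def unpack_vector_mask(vector_mask):
    """Hypothetical helper: validate a vector_mask tuple and return (mode, path, field)."""
    if vector_mask is None:
        return None  # no mask requested
    if not isinstance(vector_mask, tuple) or len(vector_mask) not in (2, 3):
        raise ValueError("vector_mask must be a tuple of two or three items")
    mode, path = vector_mask[0], vector_mask[1]
    # The field name is optional; it is None when only two items are given.
    field = vector_mask[2] if len(vector_mask) == 3 else None
    if mode not in ("include", "exclude"):
        raise ValueError('the first item must be "include" or "exclude"')
    if not isinstance(path, str):
        raise ValueError("the second item must be a string path to the vector file")
    if field is not None and not isinstance(field, str):
        raise ValueError("the optional third item must be a field name string")
    return mode, path, field
```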
Parallel Workers¶
The image_parallel_workers parameter defines the parallelization strategy at the image level. It accepts a tuple such as ("process", "cpu") to enable multiprocessing across all available CPU cores, or you can use "thread" as the backend if threading is preferred. Set it to None to disable image-level parallelism. The window_parallel_workers parameter controls parallelization within each image at the window level and follows the same format. Setting it to None disables window-level parallelism. Windows should be processed one band at a time for scalability.
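One way to turn such a tuple into an executor is sketched below with the standard concurrent.futures module. The helper names are hypothetical, and accepting an integer worker count as the second item is an assumption of this sketch that the text above does not state.

```python
import os
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def resolve_workers(parallel_workers):
    """Hypothetical helper: map a (backend, count) tuple to (executor class, max_workers)."""
    if parallel_workers is None:
        return None  # parallelism disabled
    backend, count = parallel_workers
    executors = {"process": ProcessPoolExecutor, "thread": ThreadPoolExecutor}
    if backend not in executors:
        raise ValueError('backend must be "process" or "thread"')
    # "cpu" means all available cores; an integer count is an assumption here.
    max_workers = (os.cpu_count() or 1) if count == "cpu" else int(count)
    return executors[backend], max_workers


def run_per_item(func, items, parallel_workers=None):
    """Apply func to each item, serially or through the chosen executor."""
    resolved = resolve_workers(parallel_workers)
    if resolved is None:
        return [func(item) for item in items]
    executor_cls, max_workers = resolved
    with executor_cls(max_workers=max_workers) as executor:
        # map() returns results in the same order as the input items.
        return list(executor.map(func, items))
```

With the "process" backend, func and its arguments must be picklable, so module-level functions are the safer choice.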
Windows¶
The window_size parameter sets the tile size for reading and writing, using an integer for square tiles, a tuple for custom dimensions, "internal" to use the raster’s native tiling (ideal for efficient streaming from COGs), or None to process the full image at once.
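The four accepted forms can be sketched as a small window generator. The function name, the (width, height) ordering of the tuple, and the block_shape argument standing in for the raster's native tiling are all assumptions of this sketch, which works on plain image dimensions rather than an open raster.

```python
def iter_windows(width, height, window_size=None, block_shape=None):
    """Hypothetical helper: yield (col_off, row_off, win_width, win_height) tuples."""
    if window_size is None:
        # Process the full image as a single window.
        yield (0, 0, width, height)
        return
    if window_size == "internal":
        # Assumption: the caller passes the raster's native block shape.
        if block_shape is None:
            raise ValueError('block_shape is required when window_size is "internal"')
        tile_w, tile_h = block_shape
    elif isinstance(window_size, int):
        tile_w = tile_h = window_size  # square tiles
    else:
        tile_w, tile_h = window_size  # assumed (width, height) order
    for row_off in range(0, height, tile_h):
        for col_off in range(0, width, tile_w):
            # Clip the last window in each direction to the image edge.
            yield (
                col_off,
                row_off,
                min(tile_w, width - col_off),
                min(tile_h, height - row_off),
            )
```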
COGs¶
The save_as_cog parameter, when set to True, saves the output as a Cloud-Optimized GeoTIFF with correct band and block ordering.
Validate Inputs¶
The validate methods check that input parameters follow expected formats before processing begins. There are different validation methods for different scopes: some are general-purpose (e.g., Universal.validate) and others apply to specific contexts like matching (Match.validate_match). These functions raise clear errors when inputs are misconfigured, helping catch issues early and enforce consistent usage patterns across the library.
File Cleanup¶
Temporary generated files can be deleted once they are no longer needed via this command:
Docs¶
Serve docs locally¶
Runs a local dev server at http://localhost:8000.
Build static site¶
Generates the static site into the site/ folder.
Deploy to GitHub Pages¶
Deploys built site using mkdocs gh-deploy.
Versioning¶
Uses git tag to create annotated version tags and push them. This also syncs to PyPI. New versions are released when the maintainer determines that sufficient new functionality has been added.
Code Formatting¶
This project uses black for code formatting and ruff for linting.
Set Up Pre-commit Hooks (Recommended)¶
To maintain code consistency, use this hook to check and correct code formatting automatically:
Manual Formatting¶
Format code: Automatically formats all Python files with black.
Check formatting: Checks that all code is formatted (non-zero exit code if not).
Lint code: Runs ruff to catch style and quality issues.
Testing¶
pytest is used for testing. Tests run automatically when merging into main, but they can also be run locally via:
To test an individual folder or file: