Data Validation, Code Linting & Type Checking

Claude Code makes it easy to add just about any feature I can imagine. For my national parks hiking project, one conversation was all it took for Claude to create a 3D trail visualization module. After selecting a park and a trail, the script produces a nice interactive 3D trail map via Plotly.

Plotly visualization of the Guadalupe Peak Trail, Guadalupe Mountains National Park

Given how easy it is to generate code, I wanted more robust checks on both the collected data and the code itself.

I added data validation schemas to each collector script:

Pydantic to validate JSON API responses from NPS and USGS sources
Pandera to validate tabular data (GeoDataFrames) from OSM and TNM sources
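
As a rough illustration of the Pydantic side, here is a minimal sketch of a schema for one park entry in a JSON response. The model name, field names, helper function, and sample values are all hypothetical and not taken from the project or the actual NPS API; the point is only that malformed records get rejected before they reach the rest of the pipeline.

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class ParkRecord(BaseModel):
    """Hypothetical shape of one park entry in a JSON API response."""

    park_code: str
    full_name: str
    # Range constraints catch swapped or corrupted coordinates.
    latitude: float = Field(ge=-90, le=90)
    longitude: float = Field(ge=-180, le=180)


def validate_park(raw: dict) -> Optional[ParkRecord]:
    """Return a validated record, or None if the payload is malformed."""
    try:
        return ParkRecord(**raw)
    except ValidationError:
        return None


# Numeric strings are coerced to floats by default.
good = validate_park(
    {
        "park_code": "gumo",
        "full_name": "Guadalupe Mountains National Park",
        "latitude": "31.92",
        "longitude": "-104.87",
    }
)

# A latitude outside [-90, 90] fails validation.
bad = validate_park(
    {
        "park_code": "gumo",
        "full_name": "Guadalupe Mountains National Park",
        "latitude": 131.92,
        "longitude": -104.87,
    }
)
```

A real collector would probably log or raise on the validation error rather than return None; returning None just keeps the sketch short.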

I configured code formatting and linting tools to run via pre-commit hooks, including:

Black for consistent formatting
isort for import organization
mypy for static type checking
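
To give a sense of what static type checking buys, here is a small sketch of the kind of function mypy can vet. The function and constant names are made up for illustration. Because the argument is annotated as possibly None, mypy would flag the multiplication if the explicit None check were removed.

```python
from typing import Optional

METERS_PER_FOOT = 0.3048


def elevation_in_meters(elevation_ft: Optional[float]) -> Optional[float]:
    """Convert an elevation in feet to meters, passing missing values through."""
    # Without this check, mypy reports an error on the line below,
    # since elevation_ft could be None at that point.
    if elevation_ft is None:
        return None
    return elevation_ft * METERS_PER_FOOT
```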

Lastly, I expanded the pytest suite of unit tests and configured them to run automatically via a GitHub Action workflow on every push and pull request.
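
For readers new to pytest, a unit test in this style might look like the sketch below. The helper `total_elevation_gain` and its tests are hypothetical examples, not code from the project. pytest discovers functions whose names start with `test_` and treats a failed `assert` as a test failure.

```python
from typing import Sequence


def total_elevation_gain(elevations_m: Sequence[float]) -> float:
    """Sum only the uphill differences between consecutive elevation samples."""
    return sum(
        (max(b - a, 0.0) for a, b in zip(elevations_m, elevations_m[1:])),
        0.0,
    )


def test_gain_counts_only_uphill_segments() -> None:
    # +50 uphill, -30 downhill (ignored), +80 uphill
    assert total_elevation_gain([100.0, 150.0, 120.0, 200.0]) == 130.0


def test_gain_is_zero_for_flat_or_single_point_profiles() -> None:
    assert total_elevation_gain([500.0]) == 0.0
    assert total_elevation_gain([500.0, 500.0]) == 0.0
```

Running `pytest` from the project root would collect and execute both tests, which is the same command a CI workflow would typically invoke.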

Some of these tools I’ve used previously in my own professional work, but others I discovered while reading Software Engineering for Data Scientists. With these checks in place, I look forward to adding more features!