Data Validation, Code Linting & Type Checking

Implement data quality controls including validation schemas, automated code formatting and linting, and unit tests that run via GitHub Actions.
Tags: python, claude, pydantic, pandera, black, pytest, pre-commit, github actions
Published: January 28, 2026

Claude Code makes it easy to add just about any feature I can imagine. For my national parks hiking project, one conversation was all it took for Claude to create a 3D trail visualization module. After selecting a park and a trail, the script produces a nice interactive 3D trail map via Plotly.

Plotly visualization of the Guadalupe Peak Trail, Guadalupe Mountains National Park

Because generating code is now this easy, I wanted more robust checks on both the collected data and the code itself.

I added data validation schemas to each collector script.
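The project's actual schemas are not shown here, so the following is only a minimal sketch of what a Pydantic schema for one collector's output could look like. The `TrailRecord` model, its field names, the `validate_records` helper, and the sample values are all hypothetical and chosen for illustration. Pandera plays a similar role when the data is a pandas DataFrame rather than individual records.

```python
from pydantic import BaseModel, Field, ValidationError


class TrailRecord(BaseModel):
    """Hypothetical shape of one record produced by a collector script."""

    park_code: str = Field(min_length=4, max_length=4)
    trail_name: str = Field(min_length=1)
    length_miles: float = Field(gt=0)
    latitude: float = Field(ge=-90, le=90)
    longitude: float = Field(ge=-180, le=180)


def validate_records(raw_records: list[dict]) -> tuple[list[TrailRecord], list[str]]:
    """Split raw dicts into validated records and readable error messages."""
    valid, errors = [], []
    for i, raw in enumerate(raw_records):
        try:
            valid.append(TrailRecord(**raw))
        except ValidationError as exc:
            # Keep only the first problem per record to keep the report short.
            errors.append(f"record {i}: {exc.errors()[0]['msg']}")
    return valid, errors


# Illustrative sample data (made up for this sketch, not real collector output).
records = [
    {"park_code": "gumo", "trail_name": "Guadalupe Peak Trail",
     "length_miles": 8.4, "latitude": 31.89, "longitude": -104.86},
    {"park_code": "gumo", "trail_name": "Bad Row",
     "length_miles": -1, "latitude": 31.89, "longitude": -104.86},
]
valid, errors = validate_records(records)
```

With this sample, the first record passes and the second is rejected because its length is not positive. Collecting errors instead of raising on the first failure is one possible design choice; a collector could just as reasonably fail fast.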

I configured code formatting and linting tools to run via pre-commit hooks.
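The exact hook list for this project is not shown, so the fragment below is only a sketch of a `.pre-commit-config.yaml` that runs Black plus two generic cleanup hooks. The choice of hooks and the pinned `rev` values are assumptions; pin them to whatever versions you actually use.

```yaml
# .pre-commit-config.yaml (illustrative; hooks and revs are assumptions)
repos:
  - repo: https://github.com/psf/black
    rev: 24.2.0
    hooks:
      - id: black            # auto-format Python files
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
```

After adding a file like this, `pre-commit install` registers the hooks so they run on every commit, and `pre-commit run --all-files` checks the whole repository once.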

Lastly, I expanded the pytest suite of unit tests and configured them to run automatically via a GitHub Action workflow on every push and pull request.
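As a rough idea of the kind of unit test involved, here is a small pytest-style sketch. The `elevation_gain_m` helper is hypothetical and not taken from the project; it stands in for any small, pure function in a trail-processing module.

```python
def elevation_gain_m(elevations_m: list[float]) -> float:
    """Sum of the positive elevation changes between consecutive points."""
    return sum(max(b - a, 0.0) for a, b in zip(elevations_m, elevations_m[1:]))


def test_gain_counts_only_climbs():
    # Climb 50 m, descend 20 m, climb 30 m: total gain is 80 m.
    assert elevation_gain_m([100.0, 150.0, 130.0, 160.0]) == 80.0


def test_flat_or_empty_profile_has_zero_gain():
    assert elevation_gain_m([200.0, 200.0, 200.0]) == 0.0
    assert elevation_gain_m([]) == 0.0
```

pytest discovers functions whose names start with `test_`, so running `pytest` locally or as a step in a GitHub Actions workflow would pick these up without extra configuration.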

Some of these tools I’ve used previously in my own professional work, but others I discovered while reading Software Engineering for Data Scientists. With these checks in place, I look forward to adding more features!