Tools#
This section highlights tools that support reproducible analysis and research. It includes both general software development tools and bespoke packages developed for government analysis. Tools developed or contributed to within government are marked with the abbreviation (gov).
If you have developed a package for use in analysis, or recommend one that is not included here, please add it to the list. You can request a new tool by creating an issue on GitHub or contacting us by email. Alternatively, you can add it directly to the project by creating a pull request, using the “Suggest edit” link under the GitHub logo at the top of this page. Please include a link and a brief description with any request.
The tools included on this page generally follow the good quality assurance practices described in this guidance. However, as with any software, they may still contain bugs or limitations. Please apply your own judgement when using them. If you feel a tool should no longer be included in this list, please suggest an edit or get in touch.
Data manipulation and analysis#
Manipulating and analysing data.
Python#
R#
Publishing#
afcolours (Python and R) (gov) - makes it easier to use the Analysis Function colour palettes for visually accessible graphs
Quarto - reproducible documents for Python and R
a11ytables (R only) (gov) - creating reproducible, accessible spreadsheets
gptables (Python and R) (gov) - creating reproducible, accessible spreadsheets
Testing#
Tools for implementing automated code testing.
Python#
pytest - common testing framework
unittest - testing framework included in the Python standard library
hypothesis - property-based testing
chispa (PySpark) - helper for testing PySpark code
coverage - measuring test coverage
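As a rough illustration of how the first two frameworks above differ in style, the sketch below tests one small function both ways. The `proportion` function and the test names are hypothetical and exist only for this example.

```python
import unittest


def proportion(part, total):
    """Return part as a proportion of total, rejecting a zero total."""
    if total == 0:
        raise ValueError("total must be non-zero")
    return part / total


# pytest style: a plain function whose name starts with "test_".
def test_proportion_of_half():
    assert proportion(1, 2) == 0.5


# unittest style: methods on a TestCase subclass.
class TestProportion(unittest.TestCase):
    def test_zero_total_raises(self):
        with self.assertRaises(ValueError):
            proportion(1, 0)
```

Saved in a file such as `test_proportion.py`, these tests can be run with `pytest`, which also collects `unittest.TestCase` classes. Running `python -m unittest` would pick up only the `TestCase` class.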
R#
Dependency management#
venv (Python) - manage packages using virtual environments
pyenv (Python) - manage independent Python versions for different projects
renv (R) - virtual environments for managing packages
conda - manage environments, language versions and packages for Python, R and other languages
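Virtual environments are usually created from the command line (for example `python -m venv .venv`), but the same standard library module can be called from Python. The sketch below is a minimal example under that assumption. The `create_environment` helper and the `.venv` folder name are illustrative choices, not a convention required by the tool.

```python
import tempfile
import venv
from pathlib import Path


def create_environment(project_dir):
    """Create a virtual environment in a ".venv" folder inside project_dir."""
    env_dir = Path(project_dir) / ".venv"
    # with_pip=False keeps the sketch fast; use with_pip=True to install packages.
    venv.create(env_dir, with_pip=False)
    return env_dir


# Demonstrate in a throwaway directory standing in for a project folder.
with tempfile.TemporaryDirectory() as project_dir:
    env_dir = create_environment(project_dir)
    # Every environment created by venv contains a pyvenv.cfg file.
    created = (env_dir / "pyvenv.cfg").exists()
```

Keeping one environment per project means package versions installed for one analysis do not affect another.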
Version control#
Git - common open source version control system
pre-commit - trigger checks (e.g. linters and formatters) before Git commits are created
Project templates#
govcookiecutter (Python) (gov) - template project for reproducible analysis
Rgovcookiecutter (R) (gov) - template project for reproducible analysis
Code Linters#
Analysing code for stylistic errors, and sometimes bugs.
Python#
R#
lintr - check code style
Code Formatters#
Automated code formatters. These check code style, like linters, but also actively make changes to your code to meet a particular style.
Python#
R#
Packaging Code#
Creating and releasing code as a package.
Python#
twine - utility for publishing Python packages to the Python Package Index (PyPI)
R#
goodpractice - gives advice on the quality of your R packages
fusen - builds R packages from R Markdown file specifications
Pipeline Orchestration#
Apache Airflow - workflow management platform
targets (R) - defining and executing pipelines in R
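The core idea behind these tools is to declare each step with its dependencies and let the tool decide the execution order. The sketch below illustrates that idea using only the Python standard library. It is not the API of Apache Airflow or targets, and the step names and functions are hypothetical.

```python
from graphlib import TopologicalSorter


def load():
    return [3, 1, 2]


def clean(raw):
    return sorted(raw)


def summarise(cleaned):
    return {"n": len(cleaned), "max": max(cleaned)}


# Each hypothetical step maps to (function, names of the steps it depends on).
steps = {
    "load": (load, []),
    "clean": (clean, ["load"]),
    "summarise": (summarise, ["clean"]),
}


def run_pipeline(steps):
    """Run every step once, after all the steps it depends on."""
    graph = {name: deps for name, (_, deps) in steps.items()}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        func, deps = steps[name]
        # Pass the outputs of the upstream steps as positional arguments.
        results[name] = func(*(results[dep] for dep in deps))
    return results


results = run_pipeline(steps)
```

Dedicated orchestration tools add features this sketch leaves out, such as scheduling, retries and skipping steps whose inputs have not changed.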