Binary file added docs/papers/cset_ui.png
Member Author:
Any suggestions for nicer images for CSET?

56 changes: 56 additions & 0 deletions docs/papers/paper.bib
@@ -0,0 +1,56 @@
@article{cylc8,
doi = {10.21105/joss.00737},
url = {https://doi.org/10.21105/joss.00737},
year = {2018},
publisher = {The Open Journal},
volume = {3},
number = {27},
pages = {737},
author = {Oliver, Hilary J. and Shin, Matthew and Sanders, Oliver},
title = {Cylc: A Workflow Engine for Cycling Systems},
journal = {Journal of Open Source Software}
}

@software{metplus,
author = {Prestopnik, J. and Opatz, J. and Gotway, J.Halley and Jensen, T. and Vigh, J. and Row, M. and Kalb, C. and Fisher, H. and Goodrich, L. and Adriaansen, D. and Win-Gildenmeister, M. and McCabe, G. and Frimel, J. and Blank, L. and Arbetter, T.},
date = {2025},
title = {The METplus Version 6.1.0 User’s Guide},
publisher = {Developmental Testbed Center},
url = {https://github.com/dtcenter/METplus/releases},
language = {en}
}

@software{scitools_iris,
author = {{Iris contributors}},
doi = {10.5281/zenodo.595182},
date = {2025},
license = {BSD-3-Clause},
title = {{Iris}},
url = {https://github.com/SciTools/iris}
}

@article{lfric,
title = {LFRic: Meeting the challenges of scalability and performance portability in Weather and Climate models},
journal = {Journal of Parallel and Distributed Computing},
volume = {132},
pages = {383-396},
year = {2019},
issn = {0743-7315},
doi = {10.1016/j.jpdc.2019.02.007},
url = {https://www.sciencedirect.com/science/article/pii/S0743731518305306},
author = {S.V. Adams and R.W. Ford and M. Hambley and J.M. Hobson and I. Kavčič and C.M. Maynard and T. Melvin and E.H. Müller and S. Mullerworth and A.R. Porter and M. Rezny and B.J. Shipway and R. Wong},
keywords = {Separation of concerns, Domain specific language, Exascale, Numerical weather prediction},
abstract = {This paper describes LFRic: the new weather and climate modelling system being developed by the UK Met Office to replace the existing Unified Model in preparation for exascale computing in the 2020s. LFRic uses the GungHo dynamical core and runs on a semi-structured cubed-sphere mesh. The design of the supporting infrastructure follows object-oriented principles to facilitate modularity and the use of external libraries where possible. In particular, a ‘separation of concerns’ between the science code and parallel code is imposed to promote performance portability. An application called PSyclone, developed at the STFC Hartree centre, can generate the parallel code enabling deployment of a single source science code onto different machine architectures. This paper provides an overview of the scientific requirement, the design of the software infrastructure, and examples of PSyclone usage. Preliminary performance results show strong scaling and an indication that hybrid MPI/OpenMP performs better than pure MPI.}
}

@software{esmvaltool,
author = {Andela, Bouwe and Broetz, Bjoern and de Mora, Lee and Drost, Niels and Eyring, Veronika and Koldunov, Nikolay and Lauer, Axel and Mueller, Benjamin and Predoi, Valeriu and Righi, Mattia and Schlund, Manuel and Vegas-Regidor, Javier and Zimmermann, Klaus and Adeniyi, Kemisola and Arnone, Enrico and Bellprat, Omar and Berg, Peter and Billows, Chris and Blockley, Ed and Bock, Lisa and Bodas-Salcedo, Alejandro and Caron, Louis-Philippe and Carvalhais, Nuno and Cionni, Irene and Cortesi, Nicola and Corti, Susanna and Crezee, Bas and Davin, Edouard Leopold and Davini, Paolo and Deser, Clara and Diblen, Faruk and Docquier, David and Dreyer, Laura and Ehbrecht, Carsten and Earnshaw, Paul and Geddes, Theo and Gier, Bettina and Castellani, Giulia and Gillett, Ed and Gonzalez-Reviriego, Nube and Goodman, Paul and Hagemann, Stefan and Hardacre, Catherine and von Hardenberg, Jost and Hassler, Birgit and Heuer, Helge and Hogan, Emma and Hunter, Alasdair and Kadow, Christopher and Kindermann, Stephan and Koirala, Sujan and Kuehbacher, Birgit and Lledó, Llorenç and Lejeune, Quentin and Lembo, Valerio and Little, Bill and Loosveldt-Tomas, Saskia and Lorenz, Ruth and Lovato, Tomas and Lucarini, Valerio and Malinina, Elizaveta and Massonnet, François and Mohr, Christian Wilhelm and Amarjiit, Pandde and Parsons, Naomi and Pérez-Zanón, Núria and Phillips, Adam and Proft, Max and Russell, Joellen and Sandstad, Marit and Sellar, Alistair and Senftleben, Daniel and Serva, Federico and Sillmann, Jana and Stacke, Tobias and Swaminathan, Ranjini and Tomkins, Katherine and Torralba, Verónica and Weigel, Katja and Schulze, Kirsten and Sarauer, Ellen and Roberts, Charles and Kalverla, Peter and Alidoost, Sarah and Verhoeven, Stefan and Vreede, Barbara and Smeets, Stef and Soares Siqueira, Abel and Kazeroni, Rémi and Potter, Jerry and Winterstein, Franziska and Beucher, Romain and Kraft, Jeremy and Ruhe, Lukas and Bonnet, Pauline and Munday, Gregory and Chun, Felicity},
doi = {10.5281/zenodo.3401363},
license = {Apache-2.0},
month = sep,
title = {{ESMValTool}},
url = {https://github.com/ESMValGroup/ESMValTool/},
version = {v2.13.0},
year = {2025}
}
145 changes: 145 additions & 0 deletions docs/papers/paper.md
@@ -0,0 +1,145 @@
---
title: "CSET: Toolkit for evaluation of weather and climate models"
date: 17 September 2025
bibliography: paper.bib
tags:
Member Author:
Any additional keywords to suggest?

- Python
- Cylc
- Weather
- Climate
- Atmospheric Science
authors:
Member Author:
To start with I've ordered the authors by amount of code/commits, with a cut-off excluding those who have changed fewer than 10 lines of code (which is the same set as those who have changed fewer than 100 lines).

I'm not very experienced with this aspect of publication, so would welcome any advice.

Additionally, we need to get everyone listed here to sign off on the paper, and make sure we have their preferred name and affiliation.

- name: James Frost
orcid: 0009-0009-8043-3802
affiliation: 1
- name: James Warner
orcid:
affiliation: 1
- name: Sylvia Bohnenstengel
orcid:
affiliation: 1
- name: David L. A. Flack
orcid: 0000-0001-7262-4937
affiliation: 1
- name: Huw Lewis
orcid:
affiliation: 1
- name: Dasha Shchepanovska
orcid:
affiliation: 1
- name: Jon Shonk
orcid:
affiliation: 1
- name: Bernard Claxton
orcid:
affiliation: 1
- name: Jorge Bornemann
orcid:
affiliation: 2
- name: Carol Halliwell
orcid:
affiliation: 1
- name: Magdalena Gruziel
orcid:
affiliation: 3
- name: Pluto ???
orcid:
affiliation: 4
- name: John M Edwards
orcid:
affiliation: 1
affiliations:
Member Author:
Should have "MetOffice@Reading" for Reading folk.

- name: Met Office, United Kingdom
index: 1
ror: 01ch2yn61
- name: National Institute of Water and Atmospheric Research, New Zealand
index: 2
ror: 04hxcaz34
- name: Interdisciplinary Centre for Mathematical and Computational Modelling, Poland
index: 3
- name: Centre for Climate Research Singapore, Meteorological Service Singapore, Singapore
index: 4
ror: 025sv2d63
---
<!-- TODO: Get people's agreement on authorship, and their preferred names and ORCIDs. -->

# Summary

<!-- A summary describing the high-level functionality and purpose of the software for a diverse, non-specialist audience. -->

The _Convective- [and turbulence-] Scale Evaluation Toolkit_ (**CSET**) is a community-driven open source library, command line tool, and workflow designed to support the evaluation of weather and climate models at convective and turbulent scales.
Contributor:
Is it better to put convective and turbulent scales as something more accessible like: km and 100 m scales or kilometre and hectometric scales?

Member Author:
That sounds good, especially as we already mention the convective and turbulence scales in CSET's name. Hectometric is probably too jargony, so maybe

support the evaluation of weather and climate models at kilometre and hundred metre scales.

Developed by the Met Office in collaboration with the [Momentum® Partnership][momentum_partnership] and broader research community, CSET provides a reproducible, modular, and extensible framework for model diagnostics and verification.
Contributor:
Not clear what "broader research community" means here - are we talking industry, academic partners, something else?

Member Author:
Essentially anyone who is not in the Momentum Partnership. Currently this is just academics.

It analyses output from numerical weather prediction (NWP) and climate models, including the next-generation LFRic model [@lfric], as well as machine learning (ML) models and observational data, and visualises the results in an easily shareable static website, allowing the development of a coherent evaluation story for weather and climate models across time and spatial scales.

# Statement of need

<!-- A Statement of need section that clearly illustrates the research purpose of the software and places it in the context of related work. -->

Evaluation is essential for the model development process in atmospheric sciences.
Typically, an evaluation includes both context and justification to demonstrate the benefit of model changes against other models or previous model versions.
The verification provides the context or baseline for understanding the model’s performance through comparison against observations.
The evaluation then demonstrates the benefit through comparison against theoretical expectations or previous or different version of the model and other models for similar application areas using diagnostics derived from model output to explain the context.
Contributor:
Suggested change:
- The evaluation then demonstrates the benefit through comparison against theoretical expectations or previous or different version of the model and other models for similar application areas using diagnostics derived from model output to explain the context.
+ The evaluation then demonstrates the benefit through comparison against theoretical expectations or known characteristics of previous or different versions of the same model, and (or) other models for similar applications using diagnostics derived from model output to explain the context.

I'm getting a bit lost in this sentence. I've altered it a little, but see what you think, I'm still not fully happy with what I have suggested. So, I think the main message is that there needs to be some further iteration on this sentence.

Member Author:
I'll have a crack at it.


# Contribution to the field

CSET addresses the need for an evaluation system that supports consistent and comparable evaluation.
Contributor:
Please don't start a sentence with an acronym.

It gives users easy access to a wide selection of peer-reviewed diagnostics, including spatial plots, time series, vertical profiles, probability density functions, and aggregated analysis over multiple model simulations, replacing bespoke evaluation scripts.
To cater for the full evaluation process, CSET provides a range of verification diagnostics to compare against observations and derived diagnostics based on model output, allowing for both physical process-based and impact-based understanding.

<!-- TODO: Find a better image. -->
Contributor:
Is it worth looking at one of our more bespoke diagnostics rather than a histogram?

Member Author:
Definitely. I wanted an image of the output, and mainly put it in here to figure out the layout, but this was just a random screenshot I had to hand. I would welcome suggestions for what would make an interesting screenshot.

Contributor:
Something like Age of air, CAPE ratio, or even the Beaufort Scale (if we have it with UM data) could be good alongside a difference plot, as they are colourful and show a bit more of CSET's bespoke capability. If using Age of air or CAPE ratio you will also be able to reference the paper the diagnostic was introduced in. If including the description, think about which ones have a clear description on the webpage that can aid interpretation of the diagnostic.

![The website produced by CSET. The left column allows for navigating and selecting the displayed diagnostic(s). The main region of the interface displays a particular diagnostic, and documentation to aid interpretation.](cset_ui.png)

<!-- TODO: Should METplus be mentioned given it isn't integrated yet? -->
The verification side of CSET utilises the Model Evaluation Tools (METplus) verification system [@metplus] to provide a range of verification metrics that are aligned with operational verification best practices.
The justification side of CSET consists of a range of diagnostics derived from model output.
These derived diagnostics include process-based diagnostics for specific atmospheric phenomena and impact-based diagnostics that can be used to understand how model changes will affect customers.

Compared to alternative open source evaluation tools, such as ESMValTool [@esmvaltool], CSET is focused on weather-relevant time scales and evaluating models towards a goal of operational usage.
Contributor:
It would be worth referencing the ESMValTool papers rather than just the repository. Or is this standard for JOSS?

Member Author:
Nothing JOSS specific.

I chose that reference as ESMValTool has a CITATION.cff file in their repository, which is used to indicate the preferred citation.

There are plenty of other good ESMValTool papers¹ though, so I'm not really sure which one to pick.

Footnotes

  1. Such as https://gmd.copernicus.org/articles/18/4009/2025/ and https://gmd.copernicus.org/articles/18/1169/2025/

Contributor:
I'd be more inclined to go with https://gmd.copernicus.org/articles/9/1747/2016/, and the equivalent one(s) for version 2. Usually when citing software the full history of papers is used; you don't have to be limited to using one reference.


# Design

CSET is built using operators, recipes and a workflow:
Member Author:
Is there a way to start the sentence without the acronym?

Contributor:
If you feel there is no way to start it without the acronym, it should be spelt out in full rather than using the acronym (even if the acronym has already been defined).


* **Operators** are small Python functions performing a single task, such as reading, writing, filtering, executing a calculation, stratifying, or plotting (see the sketch after this list).
* **Recipes** are YAML files that compose operators together to produce diagnostics, such as a wind speed difference plot between two model configurations.
* The **Workflow** runs the recipes across a larger number of models, variables, model domains, and dates; it collates the result into a website.
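
As a purely illustrative sketch of the operator concept (the function below is hypothetical and does not reflect CSET's actual operator API), an operator is simply a small, single-purpose Python function that a recipe can chain with read, filter, and plot steps:

```python
# Illustrative sketch only: not CSET's real operator API.
# An operator does one job; a recipe (YAML) chains several such
# operators together, and the workflow runs the recipe at scale.
import iris.cube


def wind_speed(u: iris.cube.Cube, v: iris.cube.Cube) -> iris.cube.Cube:
    """Derive wind speed from its horizontal components."""
    # Iris cube arithmetic keeps coordinates and units consistent.
    speed = (u**2 + v**2) ** 0.5
    speed.rename("wind_speed")
    return speed
```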

The design provides flexible software that is easily adaptable by scientists to address model evaluation questions while maintaining traceability.

![Graph view of a wind speed difference recipe, as produced by `cset graph`. Each node represents an operator, with the arrows showing the flow of data.](wind_speed_difference_graph.svg)

The recipes and operators within CSET are well-documented, tested, and peer reviewed, increasing discoverability and giving confidence to users.
The documentation covers information on the applicability and interpretation of diagnostics, ensuring they are appropriately used.

CSET has been built with portability in mind.
Contributor:
Should not be starting a sentence with an acronym.

It can run on a range of platforms, from laptops to supercomputers, and can be easily installed from conda-forge.
It is built on a modern software stack that is underpinned by Cylc (a workflow engine for complex computational tasks) [@cylc8], Python 3, and Iris (a Python library for meteorological data analysis) [@scitools_iris].
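
As a minimal, generic sketch of the kind of Iris-based data handling this stack provides (the file and variable names below are hypothetical and not CSET code):

```python
# Generic Iris usage sketch; file and variable names are hypothetical.
import iris

cubes = iris.load("model_output.nc")                 # load all fields as a CubeList
temperature = cubes.extract_cube("air_temperature")  # select one field by name
print(temperature.summary(shorten=True))             # one-line summary of the cube
```
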
CSET is open source under the Apache-2.0 licence, and actively developed on GitHub, with extensive automatic unit and integration testing.
Contributor:
Should not be starting a sentence with an acronym.

It aims to be a community-based toolkit, so contributing to CSET is made easy and actively encouraged, with clear developer guidelines to help.

# Research usage

<!-- Mention (if applicable) a representative set of past or ongoing research projects using the software and recent scholarly publications enabled by it. -->

Recently, CSET has been the tool of choice in the development and evaluation of the Regional Atmosphere Land Configuration RAL3-LFRic in the Met Office and across the Momentum® Partnership (a cooperative partnership of institutions sharing a seamless modelling framework for weather and climate science and services), as part of the Met Office’s Next Generation Modelling System (NGMS) programme to transition from the Unified Model to LFRic.
It has helped us to characterise the regional configuration and led to improvements in our model.

# Conclusion

CSET shows the benefits of open source evaluation software.
Contributor:
Should not be starting a sentence with an acronym.

It reduces redundant evaluation diagnostics development and supports easier collaboration across organisations involved in atmospheric model evaluation, helping to build a clear and consistent understanding of model characteristics and model improvement benefits.
Major items on CSET's development roadmap are integrating METplus verification into the workflow, and increasing the number of supported observation sources.

The CSET documentation is hosted at <https://metoffice.github.io/CSET>.

# Acknowledgements

<!-- Acknowledgement of any financial support. -->

We acknowledge contributions and support from the Met Office and Momentum® Partnership for this project.
Contributor:
You should also be acknowledging funding, e.g. WCSSP projects that are contributing towards (or have contributed towards) people's time, and naming who is being supported by them.

Member Author:
Do we have a good list? Off the top of my head:

  • WCSSP South East Asia
  • WCSSP South Africa
  • NGMS, ML-INT (Not sure as these are internal Met Office programs.)


# References

<!-- A list of key references, including to other software addressing related needs. Note that the references should include full names of venues, e.g., journals and conferences, not abbreviations only understood in the context of a specific discipline. -->

[momentum_partnership]: https://www.metoffice.gov.uk/research/approach/collaboration/momentum-partnership