docs/papers/paper.bib
+31 lines changed: 31 additions & 0 deletions
@@ -11,3 +11,34 @@ @article{
   title = {Cylc: A Workflow Engine for Cycling Systems},
   journal = {Journal of Open Source Software}
 }
+
+@software{metplus,
+  author = {Prestopnik, J. and Opatz, J. and Gotway, J. Halley and Jensen, T. and Vigh, J. and Row, M. and Kalb, C. and Fisher, H. and Goodrich, L. and Adriaansen, D. and Win-Gildenmeister, M. and McCabe, G. and Frimel, J. and Blank, L. and Arbetter, T.},
…
+  author = {S.V. Adams and R.W. Ford and M. Hambley and J.M. Hobson and I. Kavčič and C.M. Maynard and T. Melvin and E.H. Müller and S. Mullerworth and A.R. Porter and M. Rezny and B.J. Shipway and R. Wong},
+  keywords = {Separation of concerns, Domain specific language, Exascale, Numerical weather prediction},
+  abstract = {This paper describes LFRic: the new weather and climate modelling system being developed by the UK Met Office to replace the existing Unified Model in preparation for exascale computing in the 2020s. LFRic uses the GungHo dynamical core and runs on a semi-structured cubed-sphere mesh. The design of the supporting infrastructure follows object-oriented principles to facilitate modularity and the use of external libraries where possible. In particular, a ‘separation of concerns’ between the science code and parallel code is imposed to promote performance portability. An application called PSyclone, developed at the STFC Hartree centre, can generate the parallel code enabling deployment of a single source science code onto different machine architectures. This paper provides an overview of the scientific requirement, the design of the software infrastructure, and examples of PSyclone usage. Preliminary performance results show strong scaling and an indication that hybrid MPI/OpenMP performs better than pure MPI.}
docs/papers/paper.md

title: "CSET: Toolkit for evaluation of weather and climate models"
tags:
  - Python
  - Cylc
  - Weather
  - Climate
  - Atmospheric Science
authors:
  - name: James Frost
    orcid: 0009-0009-8043-3802
    affiliation: 1
  - name: James Warner
    orcid:
    affiliation: 1
  - name: Sylvia Bohnenstengel
    orcid:
    affiliation: 1
  - name: David Flack
    orcid:
    affiliation: 1
  - name: Huw Lewis
    orcid:
    affiliation: 1
  - name: Dasha Shchepanovska
    orcid:
    affiliation: 1
  - name: Jon Shonk
    orcid:
    affiliation: 1
  - name: Bernard Claxton
    orcid:
    affiliation: 1
  - name: Jorge Bornemann
    orcid:
    affiliation: 2
  - name: Carol Halliwell
    orcid:
    affiliation: 1
  - name: Magdalena Gruziel
    orcid:
    affiliation: 3
  - name: Pluto ???
    orcid:
    affiliation: 4
  - name: John M Edwards
    orcid:
    affiliation: 1
affiliations:
  - name: Met Office, United Kingdom
    index: 1
    ror: 01ch2yn61
  - name: NIWA, New Zealand
    index: 2
    ror: 01ch2yn61
  - name: Interdisciplinary Centre for Mathematical and Computational Modelling, Poland
    index: 3
  - name: Centre for Climate Research Singapore, Meteorological Service Singapore, Singapore
    index: 4
    ror: 025sv2d63
date: 17 September 2025
bibliography: paper.bib
---

# CSET: Toolkit for evaluation of weather and climate models

-<!-- TODO: Recopy paragraphs from Word doc, as it is still being updated. -->

## Summary

<!-- A summary describing the high-level functionality and purpose of the software for a diverse, non-specialist audience. -->

-The Convective- [and turbulence-] Scale Evaluation Toolkit (**CSET**) is an open source library, command line tool, and workflow for evaluation of weather and climate models. It can analyse model and observational data and visualises the output in a website to allow the development of a coherent evaluation story for numerical weather prediction, climate, and machine learning models across time and spatial scales.
+The _Convective- [and turbulence-] Scale Evaluation Toolkit_ (**CSET**) is a community-driven open source library, command line tool, and workflow designed to support the evaluation of weather and climate models at convective and turbulent scales.
+Developed by the Met Office in collaboration with the [Momentum® Partnership][momentum_partnership] and the broader research community, CSET provides a reproducible, modular, and extensible framework for model diagnostics and verification.
+It analyses numerical weather prediction (NWP) and climate model output, including from the next-generation LFRic model [@lfric], machine learning models, and observational data, and visualises the results in an easily sharable static website, allowing a coherent evaluation story for weather and climate models to be developed across time and spatial scales.

## Statement of need

<!-- A Statement of need section that clearly illustrates the research purpose of the software and places it in the context of related work. -->

-Evaluating weather and climate models is essential for the model development process and has applications in various research domains. Typically, an evaluation includes both context and justification to demonstrate the benefit of model changes compared to other models or previous model versions. The verification provides the context or baseline for understanding the model’s performance through comparison against observation. The evaluation then demonstrates the benefit through comparison against theoretical expectations or previous or different version of the model and other models for similar application areas using diagnostics derived from model output to explain the context.
-
-Historically, evaluation has typically been done with bespoke scripts. These scripts are rarely portable, and the results of evaluation at different institutions are therefore difficult to compare. The writing of these scripts for each evaluation takes significant effort, and they are often poorly maintained, with little in the way of testing or documentation.
+Evaluation is essential to the model development process in the atmospheric sciences.
+Typically, an evaluation includes both context and justification to demonstrate the benefit of model changes against other models or previous model versions.
+Verification provides the context, or baseline, for understanding a model’s performance through comparison against observations.
+Evaluation then demonstrates the benefit through comparison against theoretical expectations, previous or different versions of the model, and other models for similar application areas, using diagnostics derived from model output to explain the context.

## Contribution to the field

-The toolkit aims to cater for the full evaluation process, providing a range of verification diagnostics and diagnostics derived from model output that allow for both process-based and impact-based understanding. The verification side of CSET utilises the Model Evaluation Tools (METplus) verification system [@metplus] to provide a range of verification metrics that are aligned with operational verification best practices. The justification side of CSET consists of a range of diagnostics derived from model output. The diagnostics include process-based diagnostics for specific phenomena. Impact-based diagnostics that can be used to provide meaning to changes for customers are also included.
+CSET addresses the need for an evaluation system that supports consistent and comparable evaluation.
+It gives users easy access to a wide selection of peer-reviewed diagnostics, including spatial plots, time series, vertical profiles, probability density functions, and aggregated analysis over multiple model simulations, replacing bespoke evaluation scripts.
+To cater for the full evaluation process, CSET provides a range of verification diagnostics for comparison against observations, and derived diagnostics based on model output, allowing for both physical process-based and impact-based understanding.
+
+<!-- TODO: Should we include a figure of the CSET web UI? -->
+
+<!-- TODO: Should METplus be mentioned given it isn't integrated yet? -->
+The verification side of CSET utilises the Model Evaluation Tools (METplus) verification system [@metplus] to provide a range of verification metrics aligned with operational verification best practices.
+The justification side of CSET consists of a range of diagnostics derived from model output.
+These derived diagnostics include process-based diagnostics for specific atmospheric phenomena and impact-based diagnostics that can be used to understand how model changes will affect customers.
+
+## Design
+
+CSET is built using operators, recipes, and a workflow:
-The diagnostics within CSET are well-documented, tested, and peer reviewed, allowing confidence for users and increased discoverability. Furthermore, CSET provides a legacy for diagnostics via a clear maintenance infrastructure. The documentation covers diagnostic applicability allowing for confidence in their use. By building around composable operators CSET’s evaluation code can be adapted to user needs while maintaining traceability, putting customers at the heart of evaluation.
+* **Operators** are small Python functions performing a single task, such as reading, writing, filtering, executing a calculation, stratifying, or plotting.
+* **Recipes** are YAML files that compose operators together to produce diagnostics, such as a wind speed difference plot between two model configurations.
+* The **workflow** runs the recipes across a larger number of models, variables, model domains, and dates, collating the results into a website.
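The operator/recipe pattern above can be sketched in a few lines of Python. This is an illustrative mock, not CSET's actual API: the operator names (`read_data`, `filter_above`, `mean`) and the recipe layout are hypothetical, and in CSET the recipe would be a YAML file rather than a Python dict.

```python
# Illustrative sketch of composing small operators via a recipe.
# Operator names and recipe structure are hypothetical, not CSET's real API.

def read_data(data, filename=None):
    """Stand-in for a reading operator; returns fixed sample values."""
    return [2.0, 4.0, 6.0, 8.0]

def filter_above(data, threshold=0.0):
    """Filtering operator: keep only values above a threshold."""
    return [x for x in data if x > threshold]

def mean(data):
    """Calculation operator: arithmetic mean of the values."""
    return sum(data) / len(data)

# Registry mapping the names used in recipes to operator functions.
OPERATORS = {"read_data": read_data, "filter_above": filter_above, "mean": mean}

# In CSET this recipe would live in a YAML file; a dict stands in here.
recipe = {
    "title": "Mean of values above a threshold",
    "steps": [
        {"operator": "read_data", "filename": "example.nc"},
        {"operator": "filter_above", "threshold": 3.0},
        {"operator": "mean"},
    ],
}

def run_recipe(recipe):
    """Run each step in order, threading each output into the next step."""
    data = None
    for step in recipe["steps"]:
        kwargs = {k: v for k, v in step.items() if k != "operator"}
        data = OPERATORS[step["operator"]](data, **kwargs)
    return data

print(run_recipe(recipe))  # prints 6.0
```

Because each operator does one task and only communicates through its return value, the same operators can be recombined into many different diagnostics simply by editing the recipe.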
-Technically, CSET has been built with portability in mind. It can run on a range of platforms, from laptops to supercomputers, and can be easily installed from conda-forge. It is built on a modern software stack that is underpinned by Cylc (a workflow engine for complex computational tasks) [@cylc8], Python 3, Iris (a Python library for meteorological data analysis) [@iris], and METplus (a verification system for weather and climate models). The toolkit is open source and actively developed in the open on GitHub, with extensive automatic unit and integration testing. It aims to be a community-based toolkit, thus contributing to CSET is made easy and actively encouraged with clear developer guidelines to help.
+This design provides flexible software that scientists can easily adapt to address model evaluation questions while maintaining traceability.
+
+![CSET Workflow](workflow.svg)
+
+The recipes and operators within CSET are well-documented, tested, and peer reviewed, increasing discoverability and giving users confidence.
+The documentation covers the applicability and interpretation of diagnostics, ensuring they are used appropriately.
+
+CSET has been built with portability in mind.
+It can run on a range of platforms, from laptops to supercomputers, and can be easily installed from conda-forge.
+It is built on a modern software stack underpinned by Cylc (a workflow engine for complex computational tasks) [@cylc8], Python 3, and Iris (a Python library for meteorological data analysis) [@scitools_iris].
+CSET is open source under the Apache-2.0 licence and actively developed on GitHub, with extensive automatic unit and integration testing.
+It aims to be a community-based toolkit, so contributing to CSET is made easy and actively encouraged, with clear developer guidelines to help.

## Research usage

<!-- Mention (if applicable) a representative set of past or ongoing research projects using the software and recent scholarly publications enabled by it. -->

-In the Met Office and across the Momentum® Partnership (a cooperative partnership of institutions sharing a seamless modelling framework for weather and climate science and services) [@momentum_partnership], CSET has been the tool of choice for understanding the regional configuration of the next-generation numerical weather prediction and climate model LFRic [@lfric]. It has helped us to characterise the regional configuration and led to improvements in our model.
+Recently, CSET has been the tool of choice in the development and evaluation of the Regional Atmosphere Land Configuration RAL3-LFRic in the Met Office and across the Momentum® Partnership (a cooperative partnership of institutions sharing a seamless modelling framework for weather and climate science and services), as part of the Met Office’s Next Generation Modelling System (NGMS) programme to transition from the Unified Model to LFRic.
+It has helped us to characterise the regional configuration and has led to improvements in our model.

-## Related software packages
-
-<!-- TODO: Discuss alternatives, such as ESMValTool. -->
+## Conclusion
+
+CSET shows the benefits of open source evaluation software.
+It reduces redundant development of evaluation diagnostics and supports easier collaboration across organisations involved in atmospheric model evaluation, helping to build a clear and consistent understanding of model characteristics and the benefits of model improvements.
+Major items on CSET's development roadmap are integrating METplus verification into the workflow and increasing the number of supported observation sources.
+
+The CSET documentation is hosted at <https://metoffice.github.io/CSET>.
## Acknowledgements

@@ -110,6 +143,7 @@ We acknowledge contributions and support from the Met Office and Momentum® Partnership