Limit where dynamic crops and transient PFTs are allowed #229

@billsacks

(From NCAR Teamwork site, 2015-11-11)

I found one area with large potential for performance gains: limiting where we allow dynamic crops. I did some performance runs with (a) CLM5BGC, (b) CLM5BGCCROP, and (c) CLM5BGCCROP, but only allocating memory for non-zero-weight crop columns (according to either year-1850 or year-2000 crop distributions; the difference between those two, in terms of performance, was much less than I expected).

When we include crops, the impact of adding 0-weight columns is much greater than the ~10% I found when I was previously investigating the impact of dynamic landunits on performance: when running with crops, these 0-weight columns cause a 1.59x increase in cost. I think part of the reason I didn't consider this before is that these 0-weight columns had always been added, even before I started my dynamic landunits work.

Currently, the impact of including crops is a 2.31x cost increase. This could be reduced to as low as ~1.5x if we only allocated memory where needed, according to the crop distribution in year-2000. In practice, we would want to do something more than that, to allow for other changes with future scenarios. And things get harder if we want to allow coupling to an integrated assessment model, which could change flanduse_timeseries in unpredictable ways. But we could still achieve significant cost savings if we restrict where dynamic crops are possible.

Note that these large cost savings come from restricting the types of crops that can grow in each grid cell, in addition to restricting where any crops can be present at all. If, instead, we say: "anywhere crops can grow, allocate memory for all possible crop types", then the savings are smaller (at best we reduce the 2.31x to 2.0x).
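As a quick sanity check on the numbers above, here is a small sketch of the arithmetic. It assumes that the 2.31x figure is measured relative to the CLM5BGC run and that the 1.59x figure is the ratio of run (b) to run (c); neither assumption is stated explicitly in the timings, and the function names are made up for illustration.

```python
# Cost multipliers quoted above. The baseline (CLM5BGC = 1.0) is an assumption.
COST_WITH_CROPS = 2.31        # CLM5BGCCROP with all crop columns allocated
ZERO_WEIGHT_PENALTY = 1.59    # assumed ratio of run (b) to run (c)
COST_ALL_TYPES_PER_CELL = 2.0 # best case if every crop cell gets all crop types


def cost_without_zero_weight(cost_with_crops: float, penalty: float) -> float:
    """Estimated crop-run cost if 0-weight crop columns were not allocated."""
    return cost_with_crops / penalty


def fractional_saving(before: float, after: float) -> float:
    """Fraction of total run cost removed by going from `before` to `after`."""
    return (before - after) / before


per_type = cost_without_zero_weight(COST_WITH_CROPS, ZERO_WEIGHT_PENALTY)
print(f"restrict crop types per cell: {per_type:.2f}x")
print(f"  saving: {fractional_saving(COST_WITH_CROPS, per_type):.0%}")
print(f"all types wherever crops exist: {COST_ALL_TYPES_PER_CELL:.2f}x")
print(f"  saving: {fractional_saving(COST_WITH_CROPS, COST_ALL_TYPES_PER_CELL):.0%}")
```

Under those assumptions, dividing 2.31 by 1.59 gives roughly 1.45, which is consistent with the "~1.5x" estimate, and the per-type restriction saves well over twice as much of the total cost as the per-cell restriction.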

So how could we do this?

(1) The easiest thing from a software perspective would be to have a field on the surface dataset (or elsewhere) saying which crops are allowed to exist in each grid cell, ever. If someone were able to come up with such a dataset, the code modifications needed to make use of this new field would be easy; I've pretty much already done them for the sake of these timing runs. The main downside of this approach is needing to come up with this field, particularly if we want to allow coupling to an integrated assessment model that may change flanduse_timeseries in unpredictable ways.
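To make option (1) concrete, here is a minimal sketch of how such an "allowed crops" field could be derived when the full land-use timeseries is known in advance: take the union, over all years, of the crops with non-zero weight in each grid cell. This is purely illustrative. The data layout, the crop names, and the function names are hypothetical, and it is not the code used for the timing runs.

```python
from typing import Dict, List, Set

# Hypothetical layout: weights[year][cell_index][crop_name] is the fractional
# area of that crop in that grid cell for that year.
Weights = Dict[int, List[Dict[str, float]]]


def allowed_crops(weights: Weights, num_cells: int) -> List[Set[str]]:
    """For each grid cell, collect crops with non-zero weight in any year."""
    allowed: List[Set[str]] = [set() for _ in range(num_cells)]
    for cells in weights.values():
        for cell_index, crop_weights in enumerate(cells):
            for crop, weight in crop_weights.items():
                if weight > 0.0:
                    allowed[cell_index].add(crop)
    return allowed


def count_columns(allowed: List[Set[str]]) -> int:
    """Number of crop columns that would need memory under this mask."""
    return sum(len(crops) for crops in allowed)


# Two grid cells, two crop types, two years (made-up values).
example: Weights = {
    1850: [{"corn": 0.0, "wheat": 0.2}, {"corn": 0.0, "wheat": 0.0}],
    2000: [{"corn": 0.1, "wheat": 0.3}, {"corn": 0.0, "wheat": 0.0}],
}
mask = allowed_crops(example, num_cells=2)
print(count_columns(mask), "columns allocated instead of", 2 * 2)
```

In this toy case only the first cell ever has crops, so two columns are allocated rather than four. The sketch also shows the limitation noted above: the mask is only valid if the timeseries it was built from covers every change that can occur during the run.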

(2) An alternative would be to give more consideration to the idea of resizing arrays in memory as needed. It's possible that we could achieve this through something like init_interp; however, it feels like that would still take significant code rework to allow reinitialization of the model at runtime. So I would not consider this a viable solution within the CESM2 timeframe.

Labels: enhancement (new capability or improved behavior of existing capability); performance (idea or PR to improve performance, e.g. throughput, memory)
