For testing add a framework to bypass parts of the code #3425

ekluzek · 2025-08-21T23:23:57Z

Description of changes

This adds a framework that lets you tell the model to bypass parts of the code for different types of testing. It's controlled by a namelist read as part of the SelfTestDriver code. Methods that control the bypass functionality are then called mainly in clm_initialize and clm_driver.

I think some more work should happen before this comes in, but this gives the rough structure. I needed to get it to this point so that I can use this branch in my other ones.

This does part of #3301 but in a cleaner way.

Specific notes

Contributors other than yourself, if any: @samsrabin and others for ideas

CTSM Issues Fixed (include github issue #):
This enables work in #2995
Starts #3276
Some work here #3295

Are answers expected to change (and if so in what way)? No

Any User Interface Changes (namelist or namelist defaults changes)? Yes
New namelist items for self tests

Does this create a need to change or add documentation? Did you do so? Maybe? But, no to second

Testing performed, if any: So far running a self test case. Will run the decomp_init testlist as well as aux_clm

Definition of done:

Get feedback on the general structure
Apply the suggested changes
Figure out what the error checking should look like
Decide on what things above should come in now, vs opened in an issue and done later in separate PR's
Have CTSM SE group review it
Add error checking in build-namelist
Add error checking in the code
Changes to the logical functions for robustness
Add some documentation around this

samsrabin · 2025-08-22T16:54:07Z

@ekluzek This is the PR I was asking about earlier. Is it ready for review or still under development?

ekluzek · 2025-08-22T17:26:44Z

> @ekluzek This is the PR I was asking about earlier. Is it ready for review or still under development?

Ahh, actually both. I want to get some feedback on it now, but I think there are several things I need to do before a final review. I want feedback on whether this even looks like a good direction to pursue; if not, I'll brainstorm a different direction. If it does look promising, I also want feedback now on the list of things to do before merging it into b4b-dev.

samsrabin

This is exactly how I was envisioning you implementing the self tests, so feel free to continue!

bld/namelist_files/namelist_definition_ctsm.xml

Fix spelling from review. Co-authored-by: Sam Rabin <[email protected]>

ekluzek

Some notes I'm making on things to do.

ekluzek · 2025-08-21T23:31:10Z

bld/CLMBuildNamelist.pm

  push @groups, "clm_canopy_inparm";
  push @groups, "prigentroughness";
  push @groups, "zendersoilerod";
+  push @groups, "for_testing_options";


I think I should add more error checking to the build-namelist for these bypass options. I also think that these testing options should maybe do things like ensure that history and restart files are off, and the like. There might also be some error checking that some of the advanced options are NOT turned on with the bypass options, and that sort of thing. And it will be important to make it clear when these bypass and testing options are turned on, and that they don't get turned on accidentally. It will take some thinking to figure that out.

cime_config/testdefs/testmods_dirs/clm/run_self_tests/user_nl_clm

ekluzek · 2025-08-21T23:33:19Z

src/main/clm_driver.F90

    ! CalcIrrigationNeeded. Simply declaring this variable makes the ICE go away.
    real(r8), allocatable :: dummy1_to_make_pgi_happy(:)
    !-----------------------------------------------------------------------
+    if ( for_testing_bypass_run_except_clock_advance() ) return


Because this short-circuits the code, it's worth thinking about whether this should be a simple return statement or a more explicit if structure.

@samsrabin this is one thing I'd like to hear feedback on. Let me know what you think about this.

Well this one isn't just an early return; it actually short-circuits the entire subroutine. So why not just wrap its call in an if statement?

I'm seeing now that the other returns are the same way, so same question there.

I really like your idea @samsrabin. For performance it's of course better to do at the higher level. For readability it might depend on where you want these things to be seen.

But, in this case it would remove this strange testing option from code that's a mix of science/software infrastructure to the higher level in the NUOPC cap that is pretty much just for SE's. So it helps with readability as well as separation of concerns.
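The caller-side alternative discussed above can be sketched roughly as follows. This is only an illustration: `for_testing_bypass_run_except_clock_advance` is the function name from the diff, but the module `bypass_sketch`, the setter `set_bypass_run`, the stand-in `driver_body`, and the counters are hypothetical names invented for this example, not the actual NUOPC cap code.

```fortran
module bypass_sketch
  ! Hypothetical stand-in for the self-test driver module.
  implicit none
  private
  public :: set_bypass_run, for_testing_bypass_run_except_clock_advance
  public :: driver_body, steps_run

  logical :: bypass_run = .false.   ! stand-in for the namelist flag
  integer :: steps_run  = 0         ! how many times the driver body actually ran

contains

  subroutine set_bypass_run(flag)
    logical, intent(in) :: flag
    bypass_run = flag
  end subroutine set_bypass_run

  logical function for_testing_bypass_run_except_clock_advance()
    for_testing_bypass_run_except_clock_advance = bypass_run
  end function for_testing_bypass_run_except_clock_advance

  subroutine driver_body()
    ! Stand-in for the science/infrastructure code; it knows nothing about the bypass.
    steps_run = steps_run + 1
  end subroutine driver_body

end module bypass_sketch

program caller_side_bypass
  use bypass_sketch
  implicit none
  integer :: clock, nstep

  clock = 0
  call set_bypass_run(.true.)
  do nstep = 1, 3
     ! The testing switch lives in the caller, so the driver itself stays untouched.
     if (.not. for_testing_bypass_run_except_clock_advance()) then
        call driver_body()
     end if
     clock = clock + 1   ! the clock still advances on every step
  end do

  print *, 'clock =', clock, ' steps_run =', steps_run
  if (clock /= 3 .or. steps_run /= 0) error stop 'unexpected bypass behavior'
end program caller_side_bypass
```

With the flag on, the clock advances three times while the driver body never runs, which is the separation of concerns described in the comment above.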

ekluzek · 2025-08-21T23:34:28Z

src/main/clm_initializeMod.F90

       call bgc_vegetation_inst%Init2(bounds_proc, NLFilename)
    end if

+    if ( .not. for_testing_bypass_init_after_self_tests() )then


I need to indent the lines below. I left them unindented until I saw whether it could run and whether this is the spot where it should be.

ekluzek · 2025-08-21T23:34:46Z

src/main/clm_initializeMod.F90

    if (nsrest == nsrContinue ) then
       call htapes_fieldlist()
    end if
+    end if


Indent lines above.

ekluzek · 2025-08-21T23:37:00Z

src/main/clm_initializeMod.F90

       call hist_htapes_build()
    end if

+    if ( .not. for_testing_bypass_init_after_self_tests() )then


Same thing about the indents.

It's also not completely clear how many of these types of things there should be.

One reason that I need to do this here, rather than having return statements, is that I need the timers to work. If a timer has been started in the code above, I can't return in the middle without that timer being messed up. So it needs to execute the part of the code where the timer is stopped.

@samsrabin this and the one above about the return statements are things I'd like to hear from you on. In here they need to be if statements as I say above, but in the above section they could be either.

Although, maybe because this is a weird pathway in the code, the return statements should be preferred so that it doesn't disrupt the code flow as much as if statements would. It's easy to miss returns in the code flow, but that's only important when it's something that might happen commonly enough. There was a code standard in my past that recommended not having return statements in the midst of subroutines, because they are easy to miss. And I do see that point...
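The timer constraint mentioned above can be illustrated with a small sketch. The names here (`timer_sketch`, `t_start`, `t_stop`, `open_timers`, and the two `init_with_*` routines) are hypothetical stand-ins, not the model's real timer calls; the point is only that an early return between a start and a stop leaves the timer unbalanced, while an if block does not.

```fortran
module timer_sketch
  ! Hypothetical minimal timer bookkeeping, for illustration only.
  implicit none
  private
  public :: t_start, t_stop, open_timers

  integer :: open_timers = 0   ! number of timers currently running

contains

  subroutine t_start()
    open_timers = open_timers + 1
  end subroutine t_start

  subroutine t_stop()
    open_timers = open_timers - 1
  end subroutine t_stop

end module timer_sketch

program timer_balance
  use timer_sketch
  implicit none

  ! Early return: the stop call is skipped, so one timer is left running.
  call init_with_return(.true.)
  print *, 'open timers after early return:', open_timers
  if (open_timers /= 1) error stop 'expected one timer left running'

  ! If block: the bypassed work is skipped but the timer is still stopped.
  open_timers = 0
  call init_with_if_block(.true.)
  print *, 'open timers after if block:', open_timers
  if (open_timers /= 0) error stop 'timer left running'

contains

  subroutine init_with_return(bypass)
    logical, intent(in) :: bypass
    call t_start()
    if (bypass) return   ! leaves the timer started above unbalanced
    call t_stop()
  end subroutine init_with_return

  subroutine init_with_if_block(bypass)
    logical, intent(in) :: bypass
    call t_start()
    if (.not. bypass) then
       continue          ! placeholder for the initialization being bypassed
    end if
    call t_stop()        ! always reached, so the timer stays balanced
  end subroutine init_with_if_block

end program timer_balance
```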

I prefer early returns as much as possible, especially since as you said these are rare cases. Avoiding returns by using if statements creates its own readability issues: How far indented am I right now? Under what conditions?

The bigger readability problem with this module is how enormous the subroutines are. I'd prefer to see things like initialize2 getting refactored before we start worrying about ifs vs. returns. Maybe that could be something you do before this PR? I.e., refactor that subroutine to create at least two new ones:

1. Everything inside the first new if

2. Everything inside the second new if

Maybe also (3) everything between the new ifs.

I wouldn't want to see that done in this PR, though, in the interest of keeping things easy to review and test.

Refactoring could also avoid return statements too: Break subroutines into two smaller subroutines, with the call of the first being wrapped in an if statement.

All good points, thanks for the comments. I'll be thinking about all this. I can't do a lot of work in refactoring clm_initialize and won't do it here. But maybe it would help to do some simple things in a different small PR. Hmmm...

ekluzek · 2025-08-21T23:38:36Z

src/main/clm_instMod.F90


    ! Initialize urban time varying data
-    call urbantv_inst%Init(bounds, NLFilename)
+    if ( .not. for_testing_bypass_init_after_self_tests() )then


In some of my decompInit testing work, I found that the urbantv call specifically was problematic, so I'm explicitly avoiding just this one call here.

Actually, this isn't where the problem was.

So I should probably remove this one.

ekluzek · 2025-08-21T23:39:22Z

src/self_tests/SelfTestDriver.F90

+    character(len=*), parameter :: nmlname = 'for_testing_options'
+    !-----------------------------------------------------------------------
+
+    namelist /for_testing_options/ for_testing_bypass_init, for_testing_bypass_run


Some of the other for_testing namelist options should move to here as well.

ekluzek · 2025-08-21T23:39:50Z

src/self_tests/SelfTestDriver.F90

+    call shr_mpi_bcast (for_testing_bypass_init, mpicom)
+    call shr_mpi_bcast (for_testing_bypass_run, mpicom)
+
+    if (masterproc) then


There also should be some error checking done here in the Fortran code.
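As a rough idea of what Fortran-side checking could look like, here is a self-contained sketch that reads the `for_testing_options` group, aborts on a read error, and prints a loud notice whenever a bypass flag is on. It is an assumption-laden illustration: it reads from an in-memory string rather than the real namelist file, omits the masterproc read and `shr_mpi_bcast` steps shown in the diff, and uses `error stop` where the model would presumably call `endrun`.

```fortran
program read_for_testing_options
  implicit none
  logical :: for_testing_bypass_init = .false.
  logical :: for_testing_bypass_run  = .false.
  integer :: ierr
  character(len=256) :: iomsg_text
  character(len=128) :: nl_text
  namelist /for_testing_options/ for_testing_bypass_init, for_testing_bypass_run

  ! Stand-in for the namelist file contents (assumption: the real code reads a file unit).
  nl_text = '&for_testing_options for_testing_bypass_init=.true. /'

  iomsg_text = ' '
  read(nl_text, nml=for_testing_options, iostat=ierr, iomsg=iomsg_text)
  if (ierr /= 0) then
     print *, 'ERROR reading for_testing_options: ', trim(iomsg_text)
     error stop 1
  end if

  ! Make it obvious in the log whenever any testing bypass is active.
  if (for_testing_bypass_init .or. for_testing_bypass_run) then
     print *, 'WARNING: for_testing bypass options are ON; this is not a science run'
     print *, '  for_testing_bypass_init = ', for_testing_bypass_init
     print *, '  for_testing_bypass_run  = ', for_testing_bypass_run
  end if

  ! Sanity checks on this example's own input string.
  if (.not. for_testing_bypass_init) error stop 'expected bypass_init to be true'
  if (for_testing_bypass_run) error stop 'expected bypass_run to keep its default'
end program read_for_testing_options
```

Which combinations of options should actually be rejected is still an open question in this PR, so the sketch deliberately checks only the read status and reports the settings.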

ekluzek · 2025-08-21T23:42:08Z

src/self_tests/SelfTestDriver.F90

+  !-----------------------------------------------------------------------
+
+  logical function for_testing_bypass_init_after_self_tests()
+    ! Determine if should exit initialization early after having run the self tests


Both of these should get more sophisticated and do things like NOT return true until after the self-tests are run, and ensure that the self_test namelist was read in, or else abort. The run phase bypass should maybe NOT return true until after the init phase has passed, and things like that.
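One possible shape for the more robust logical function is sketched below. Everything except the idea itself is hypothetical: the module `self_test_state_sketch`, the routines `read_options` and `mark_self_tests_done`, and the state variables are invented for this example, and `error stop` stands in for the `endrun` call the real code would likely use.

```fortran
module self_test_state_sketch
  ! Hypothetical state tracking for the bypass query function.
  implicit none
  private
  public :: read_options, mark_self_tests_done, bypass_init_after_self_tests

  logical :: namelist_was_read = .false.
  logical :: self_tests_done   = .false.
  logical :: bypass_init_flag  = .false.

contains

  subroutine read_options(bypass_init)
    ! Stand-in for the namelist read; records that the options were set.
    logical, intent(in) :: bypass_init
    bypass_init_flag  = bypass_init
    namelist_was_read = .true.
  end subroutine read_options

  subroutine mark_self_tests_done()
    self_tests_done = .true.
  end subroutine mark_self_tests_done

  logical function bypass_init_after_self_tests()
    ! Abort if asked before the namelist was read, since the answer would be meaningless.
    if (.not. namelist_was_read) then
       error stop 'for_testing_options namelist was not read'
    end if
    ! Only report true once the self tests have actually run.
    bypass_init_after_self_tests = bypass_init_flag .and. self_tests_done
  end function bypass_init_after_self_tests

end module self_test_state_sketch

program bypass_state_demo
  use self_test_state_sketch
  implicit none

  call read_options(bypass_init=.true.)
  if (bypass_init_after_self_tests()) error stop 'bypass active too early'

  call mark_self_tests_done()
  if (.not. bypass_init_after_self_tests()) error stop 'bypass should be active now'

  print *, 'bypass becomes active only after the self tests have run'
end program bypass_state_demo
```

A run-phase function could follow the same pattern with an extra "init phase finished" flag.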

ekluzek added 5 commits August 21, 2025 14:07

Add for_testing options to namelist handling to bypass init and run

d3178e1

Turn the bypass init and run logicals for testing on

d65ecfb

Add a namelist read and some logical settings to bypass init and run …

96e0c94

…phases as well as logical functions to do that

Merge remote-tracking branch 'escomp/b4b-dev' into for_testing_bypass…

f6272ef

…_framework

Add use of abortutils so can make endrun calls

7006740

ekluzek requested a review from samsrabin August 21, 2025 23:23

ekluzek self-assigned this Aug 21, 2025

ekluzek added the enhancement new capability or improved behavior of existing capability label Aug 21, 2025

ekluzek added this to LMWG: Sprint Planning Board Aug 21, 2025

ekluzek added testing additions or changes to tests bfb bit-for-bit devops Development Operations to improve development throughput, E.g., adding GitHub Workflows labels Aug 21, 2025

github-project-automation bot moved this to Todo in LMWG: Sprint Planning Board Aug 21, 2025

ekluzek changed the base branch from master to b4b-dev August 21, 2025 23:24

ekluzek mentioned this pull request Aug 22, 2025

Work with self_tests so that there's an initialization only option #3301

Draft

6 tasks

Add bypassing the run phase in the for_testing tests, and remove it f…

93628b2

…rom run_self_tests which inherits from it

samsrabin reviewed Aug 22, 2025

bld/namelist_files/namelist_definition_ctsm.xml (outdated, resolved)

Update bld/namelist_files/namelist_definition_ctsm.xml

28834c9

Fix spelling from review. Co-authored-by: Sam Rabin <[email protected]>

ekluzek commented Aug 22, 2025

ekluzek added 9 commits September 23, 2025 01:35

Merge branch 'b4b-dev' into for_testing_bypass_framework

256e692

Merge branch 'for_testing_bypass_framework' of github.com:ekluzek/CTS…

0137dc2

…M into for_testing_bypass_framework

Merge branch 'b4b-dev' into for_testing_bypass_framework

69792af

Add namelist controls for self testing

de1ca06

Conflicts: bld/namelist_files/namelist_definition_ctsm.xml src/cpl/nuopc/lnd_comp_nuopc.F90 src/main/clm_varctl.F90 src/self_tests/SelfTestDriver.F90

Add unit_test_shr directory to the main model build

c1c7ca3

Merge remote-tracking branch 'escomp/b4b-dev' into decomp_init_for_te…

db4551c

…sting_work Conflicts: cime_config/testdefs/ExpectedTestFails.xml

Merge remote-tracking branch 'escomp/b4b-dev' into decomp_init_for_te…

cb8e7ec

…sting_work Conflicts: cime_config/testdefs/testmods_dirs/clm/for_testing_fastsetup_bypassrun/user_nl_clm cime_config/testdefs/testmods_dirs/clm/run_self_tests/user_nl_clm

Balance check doesn't take time, so adjust the timers again for part3

09aa5ac

Add another timer within part3, and also turn off some of the history…

bf498ab

… stuff in it when use_noio is TRUE Work on reconciling timers and for_testing bypass code from the mpi_scan branch. Conflicts: src/main/clm_initializeMod.F90

ekluzek added 15 commits September 29, 2025 19:10

Add timers for clm_initialize2 that cover the whole subroutine

3c54060

Change the test grid total size to 384 so can be divisible by either …

ce2d68b

…128 for Derecho or 48 for Izumi

Don't do the abort testing if not serial as different tasks won't be …

2fc723f

…in sync and doing so was not working

Change a test to make it valid for clump_pproc or not

d8d656b

Just do the checking over the local processor clumps and not all the …

4ce6b5f

…global clumps

Resolve the conflicts

7dc7dcd

Remove some of the previous bypassing changes that aren't needed here

2a46724

Move bypass code around a bit so that most timers aren't half in/half…

c95b886

… out, and so that the self-tests can run to completion afterwards Conflicts: src/cpl/nuopc/lnd_comp_nuopc.F90 src/main/clm_initializeMod.F90

Move the get_proc_bounds to inside the bypass

dac0ae0

Add asserts for scalars and also text scalars

d19b894

Revert most of 2fd081b so removing the changes regarding the addition…

a5d5b5c

… of atm_present and adjustments to how send_to_atm was done

Move some for_testing namelist items into the selftests driver namelist

b3185c0

Remove the uneeded timers and get back to the 3 part timers as they s…

1acf630

…hould be Conflicts: src/main/clm_initializeMod.F90

ekluzek changed the title ~~[WIP] For testing add a framework to bypass parts of the code~~ For testing add a framework to bypass parts of the code Oct 1, 2025

ekluzek added 4 commits October 1, 2025 11:07

Remove some changes from the baseline code that aren't needed especia…

2636975

…lly some timers accidentally brought in again

Remove TestDecompInit for now, bring it in, in another PR

def2c97

Remove the update to Assertions and bring it in, in another PR

bf947fb

Remove update in DecompInitMod for now

80192dc

ekluzek mentioned this pull request Oct 3, 2025

Decomp init for testing work #3412

Open

ekluzek added 2 commits October 3, 2025 17:40

Merge branch 'b4b-dev' into for_testing_bypass_framework

df19381

Merge branch 'b4b-dev' into for_testing_bypass_framework

417922d

ekluzek moved this from Todo to In Progress in LMWG: Sprint Planning Board Oct 6, 2025

ekluzek moved this from In Progress to Stalled in LMWG: Sprint Planning Board Oct 8, 2025
