Race Condition When Writing To Local Result File #691

@LennartPurucker

When running AMLB in local mode in parallel (e.g., on a cluster or a large machine with an NFS), this code https://github.com/openml/automlbenchmark/blob/master/amlb/benchmark.py#L473 has a race condition in rare edge cases.

When two processes try to append to the same local file (session_dir/scores/results.csv) at the same time, one of the append operations gets dropped.

The resulting file will look like this (note that the row for fold 8 is missing):

id,task,framework,constraint,fold,type,result,metric,mode,version,params,app_version,utc,duration,training_duration,predict_duration,models_count,seed,info,acc,balacc,logloss,models_ensemble_count
openml.org/t/2073,yeast,AutoGluonDefault14022025,4h8c,7,multiclass,-0.974113,neg_logloss,local,1.2.1b20250215,"","",2025-02-15T12:53:58,107.3,79.7,0.1,69,2031772959,,0.635135,0.509885,0.974113,4

openml.org/t/2073,yeast,AutoGluonDefault14022025,4h8c,9,multiclass,-1.07679,neg_logloss,local,1.2.1b20250215,"","",2025-02-15T12:54:02,111.4,82.1,0.1,69,818337949,,0.608108,0.556688,1.07679,5

But it should look like this:

id,task,framework,constraint,fold,type,result,metric,mode,version,params,app_version,utc,duration,training_duration,predict_duration,models_count,seed,info,acc,balacc,logloss,models_ensemble_count
openml.org/t/2073,yeast,AutoGluonDefault14022025,4h8c,7,multiclass,-0.974113,neg_logloss,local,1.2.1b20250215,"","",2025-02-15T12:53:58,107.3,79.7,0.1,69,2031772959,,0.635135,0.509885,0.974113,4
openml.org/t/2073,yeast,AutoGluonDefault14022025,4h8c,8,multiclass,-1.15063,neg_logloss,local,1.2.1b20250215,"","",2025-02-15T12:54:02,107.4,79.4,0.1,69,1120882699,,0.540541,0.47974,1.15063,4
openml.org/t/2073,yeast,AutoGluonDefault14022025,4h8c,9,multiclass,-1.07679,neg_logloss,local,1.2.1b20250215,"","",2025-02-15T12:54:02,111.4,82.1,0.1,69,818337949,,0.608108,0.556688,1.07679,5

The results are still saved correctly to the global results file, and the race does not crash the process. Fixing it would require either more unique session IDs, so that parallel processes do not share a session directory, or a lock around writes to the local results file.
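To illustrate the lock option, here is a minimal sketch using only the standard library. It does not reproduce the code in `benchmark.py`: the names `file_lock` and `append_result_row`, the `.lock` suffix, and the timeout values are all hypothetical. It assumes the file system makes exclusive file creation (`O_CREAT | O_EXCL`) atomic, which should be checked for the NFS setup in question.

```python
import os
import time
from contextlib import contextmanager


@contextmanager
def file_lock(lock_path, timeout=30.0, poll_interval=0.05):
    """Hold an exclusive lock represented by the existence of lock_path.

    Hypothetical helper, not part of AMLB.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so only one
            # process (or thread) can create the lock file at a time.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire lock {lock_path}")
            time.sleep(poll_interval)
    try:
        yield
    finally:
        # Release the lock by closing and deleting the lock file.
        os.close(fd)
        os.remove(lock_path)


def append_result_row(csv_path, header, row):
    """Append one CSV line under the lock, writing the header if the file is new."""
    with file_lock(csv_path + ".lock"):
        is_new = not os.path.exists(csv_path) or os.path.getsize(csv_path) == 0
        with open(csv_path, "a", newline="") as f:
            if is_new:
                f.write(header + "\n")
            f.write(row + "\n")
            # Push the data to disk before the lock is released.
            f.flush()
            os.fsync(f.fileno())
```

A caller would wrap every write to `session_dir/scores/results.csv` in `append_result_row` (or in `file_lock` directly). One known limitation of this sketch: if a process dies while holding the lock, the stale lock file has to be removed by hand, or the other writers time out.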

Labels: bug (Something isn't working)
