rdagent/scenarios/data_science/dev/runner/prompts.yaml
Lines changed: 12 additions & 18 deletions
@@ -25,13 +25,10 @@ DSCoSTEER_eval:
 3. Confirm that the prediction file (`submission.csv`) is generated using only the test dataset, and its format matches the sample submission.
 If the code does not satisfy the requirements:
 - Set "acceptable" to false.
-- Set "final_decision" to false.
-{% if enable_hyperparameter_tuning_check %}- set "hyperparameter_tuning_decision" to false.
-- Set "hyperparameter_tuning_suggestion" to an empty string.
 If the code satisfy the requirements:
 - Set "acceptable" to true.
-- Proceed to the next evaluation.
 
+{% if enable_hyperparameter_tuning_check %}
 # Evaluation 2: Hyperparameter
 ## Evaluation Description
 The user will provide you the time spent on the whole code execution and the timeout of the code execution. You should decide whether the hyperparameter is reasonable based on the time.
@@ -45,7 +42,6 @@ DSCoSTEER_eval:
 3. Your suggestion should have a strong chance of improving the model's performance. Focus on the most obvious and impactful opportunities for quick improvement by leveraging more training time. Don't explore hyperparameters with low confidence. If there are no obvious and impactful opportunities and the code runs well, please accept it.
 If the code satisfy the requirements:
 - Set "hyperparameter_tuning_decision" to true.
-- Set "final_decision" to false.
 - Provide a reasonable suggestion in "hyperparameter_tuning_suggestion". The "hyperparameter_tuning_suggestion" should begin with a clear observation, followed by your suggestion. For example: "[Observation] The maximum number of epochs was reached, but the validation loss is still going down and early stopping was not activated. Only 15% of the allowed time was used. [Suggestion] We recommend increasing epochs to 100 to avoid underfitting and further improve model performance."
 If the code does not satisfy the requirements:
 - Set "hyperparameter_tuning_decision" to false.
@@ -59,10 +55,11 @@ DSCoSTEER_eval:
 "execution": "Describe whether the whole code base executed successfully and generating the final submission. Include any errors or issues encountered, and retain all error messages and traceback details.",
 "return_checking": "Verify the generated files, particularly the submission file. Ensure that its format matches the sample submission",
 "code": "Provide feedback on code quality, readability, and adherence to the given specifications.",
-"acceptable": <true/false: if the solution has paased execution, return_checking, and code verification, then it is a valid solution and acceptable. Otherwise it is not acceptable.>,{% if enable_hyperparameter_tuning_check %}
+"acceptable": <true/false: if the solution has passed execution, return_checking, and code verification, then it is a valid solution and acceptable. Otherwise it is not acceptable.>,
+{% if enable_hyperparameter_tuning_check %}
 "hyperparameter_tuning_decision": <true/false>,
-"hyperparameter_tuning_suggestion": <suggestion in plain text for hyperparameter tuning>,{% endif %}
-"final_decision": <true/false>,
+"hyperparameter_tuning_suggestion": <suggestion in plain text for hyperparameter tuning>,
+{% endif %}
 }
 ```
 {% else %}
@@ -101,14 +98,13 @@ DSCoSTEER_eval:
 "acceptable": <true/false: if the solution has paased execution, return_checking, and code verification, then it is a valid solution and acceptable. Otherwise it is not acceptable.>,
 {% if enable_hyperparameter_tuning_check %}"hyperparameter_tuning_decision": <true/false>,
 "hyperparameter_tuning_suggestion": <suggestion in plain text for hyperparameter tuning>,{% endif %}
-"final_decision": <true/false>,
 }
 ```
 {% endif %}
 # NOTE: when is_sub_enabled == False, we don't have any checking about the return. So it is just placeholder currently
 
 user: |-
-# Code base
+# Current Code base
 {{ code }}
 
 ## Stdout of code execution and testing
@@ -121,10 +117,9 @@ DSCoSTEER_eval:
 
 {% if queried_former_failed_knowledge|length != 0 %}
 # Evolving History
-{% for former_failed_knowledge in queried_former_failed_knowledge %}## Attempt {{ loop.index }}:
+{% for former_failed_knowledge in queried_former_failed_knowledge %}## Attempt {{ loop.index }}:
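
With `final_decision` dropped from the response schema in this diff, a consumer of the evaluator's reply only needs `acceptable` plus, when `enable_hyperparameter_tuning_check` is on, the two hyperparameter fields. The sketch below shows one way such a reply could be validated. `EvalFeedback` and `parse_feedback` are hypothetical names used for illustration and are not taken from RD-Agent; the sketch also assumes the reply arrives as plain JSON.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class EvalFeedback:
    """Hypothetical container for the evaluator's JSON reply (not RD-Agent's own class)."""

    execution: str
    return_checking: str
    code: str
    acceptable: bool
    # Only present when the hyperparameter tuning check is enabled in the prompt.
    hyperparameter_tuning_decision: Optional[bool] = None
    hyperparameter_tuning_suggestion: Optional[str] = None


def parse_feedback(raw: str, tuning_check_enabled: bool) -> EvalFeedback:
    """Parse a JSON reply and check that the keys the prompt asks for are present."""
    data = json.loads(raw)

    required = ["execution", "return_checking", "code", "acceptable"]
    if tuning_check_enabled:
        required += [
            "hyperparameter_tuning_decision",
            "hyperparameter_tuning_suggestion",
        ]

    missing = [key for key in required if key not in data]
    if missing:
        raise ValueError(f"evaluator reply is missing keys: {missing}")

    # Keep only known fields, so a stale "final_decision" key from a reply
    # written against the old prompt is silently ignored.
    known = {f.name for f in fields(EvalFeedback)}
    return EvalFeedback(**{k: v for k, v in data.items() if k in known})
```

Filtering on known field names rather than rejecting unknown keys is a design choice made here so that replies produced under the pre-change prompt still parse; a stricter consumer could raise on unexpected keys instead.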