Description
Overall, the campaign class does not have an easy way to export the generated data once the campaign has finished running.
Use case:
I want to plot a custom graph which requires rearranging my data, adding other columns, etc., and then adding specific plotting features.
Issue:
The campaign class currently has campaign.generate_graph(...), which offers a straightforward way to generate graphs from x, y and hue params (seaborn/pandas style). This may be enough for simple cases, but more customized graphs require adding pre/post callbacks. Examples: composing several graphs, or choosing between FacetGrid and non-FacetGrid plots.
Possible solutions
- (manually) Adding something like campaign.get_data() to get the raw generated data in pandas DataFrame form. Example:
import seaborn as sns

output_data, gen_path = campaign.get_data() # <-- proposed API; here we can also add a data_frame callback, similarly to the current generate_graph approach
processed_output = output_data # rearrange the data, add other columns, etc.
# adding custom plot
g = sns.catplot(data=processed_output, kind='bar', x='..', y='..', hue='..', palette='..', ...)
g.fig.get_axes()[0].set_title("Title")
g.set(ylabel="...", xlabel="...")
g.fig.get_axes()[0].set_yscale('log')
# saving using the output path generated by benchkit/campaign
g.fig.savefig(f"{gen_path}.png", transparent=False)
print(f'[INFO] Saving campaign figure in "{gen_path}.png"')
g.fig.savefig(f"{gen_path}.pdf", transparent=False)
print(f'[INFO] Saving campaign figure in "{gen_path}.pdf"')
-- PROS: allows post-processing within the campaign
-- CONS: mix of responsibilities. The campaign class already depends on Seaborn/Pandas for generating graphs. Maybe campaign.get_data() should only return CSV data rather than a pandas DataFrame.
- (add more complexity into campaign.generate_graph) Adding more callbacks (pre/post) to insert specific calls into the pipeline:
campaign.generate_graph(
plot_name="catplot",
kind="bar",
orient='v',
x="...",
y="...",
hue="...",
palette="...",
...,
process_dataframe=df_callback,
graph_callback=post_graph_callback,  # newly proposed callback
)
-- PROS: this mechanism is already used in benchkit, so no new methods are needed
-- CONS: adding more callbacks means adding more complexity. We cannot keep generating wrappers of wrappers to support custom plots. Generating graphs with campaign.generate_graph should not be more complex than the standard Seaborn/Matplotlib way.
- (out of benchkit) Post-process the generated csv/json files afterwards. This seems to be a fair solution, but someone could argue that it would be good to have a single pipeline within benchkit.
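
To make the last option concrete, here is a minimal sketch of post-processing outside of benchkit, using only the Python standard library. It assumes the campaign results are available as a plain comma-separated file. The sample data, the column names (nb_threads, lock, duration) and the helpers load_rows and add_speedup are hypothetical and are not part of benchkit.

```python
import csv
import io

# Stand-in for a results file written by a campaign.
# The column names and values are hypothetical.
SAMPLE_CSV = """nb_threads,lock,duration
1,mutex,2.0
1,spin,1.0
2,mutex,3.0
2,spin,4.0
"""


def load_rows(csv_file):
    """Read the records and convert the numeric columns."""
    rows = []
    for record in csv.DictReader(csv_file):
        rows.append(
            {
                "nb_threads": int(record["nb_threads"]),
                "lock": record["lock"],
                "duration": float(record["duration"]),
            }
        )
    return rows


def add_speedup(rows, baseline_lock="mutex"):
    """Add a derived column: baseline duration / row duration, per thread count."""
    baseline = {
        row["nb_threads"]: row["duration"]
        for row in rows
        if row["lock"] == baseline_lock
    }
    return [
        dict(row, speedup=baseline[row["nb_threads"]] / row["duration"])
        for row in rows
    ]


# With a real file, open(path, newline="") would replace io.StringIO(...).
rows = add_speedup(load_rows(io.StringIO(SAMPLE_CSV)))
for row in rows:
    print(row)
```

The resulting list of dicts can be passed to pandas.DataFrame(rows) and then to sns.catplot as in the first option, which keeps the custom plotting code entirely outside the campaign class.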