This repository provides code to replicate the experiments in the paper "Fast Convex Optimization for Two-Layer ReLU Networks: Equivalent Model Classes and Cone Decompositions" by Aaron Mishkin, Arda Sahiner, and Mert Pilanci.
The code requires Python 3.8 or newer.
Clone the repository using
git clone https://github.com/pilanci_lab/scnn_experiments.git
We provide a script for easy setup on Unix systems. Run the setup.sh file with
./setup.sh
This will:
- Create a virtual environment in .venv and install the project dependencies.
- Install scaffold in development mode. This library contains infrastructure for running our experiments.
- Create the data, figures, tables, and results directories.
After running setup.sh, activate the virtual environment using
source .venv/bin/activate
The experiments are run via a command-line interface. Most experiments can be replicated with a single command, but some require a sequence of experiments to be executed in the correct order.
First, make sure that the virtual environment is active.
Running which python in bash will show you where the active Python binary is; it will point to a file in code/.venv/bin if the virtual environment is active.
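The same check can be made from inside Python. The snippet below is a minimal sketch, not part of the repository: the helper name in_virtualenv is hypothetical, and it assumes the environment was created with the standard venv module (or a recent virtualenv), in which case sys.prefix differs from sys.base_prefix.

```python
import sys


def in_virtualenv() -> bool:
    """Return True if the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    # sys.executable is the path of the interpreter that is actually running.
    print("interpreter:", sys.executable)
    print("virtual environment active:", in_virtualenv())
```

If the second line prints False, activate the environment again before running any experiments.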
Experiments are executed by calling scripts/run_experiment.py with the -E flag to specify an experiment name, as in the following:
python scripts/run_experiment.py -E "test"
There are several command line arguments that can be passed to run_experiment.py, such as -V for verbose execution, -F to force re-runs, etc.
Try
python scripts/run_experiment.py --help
to see all the available options.
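When several experiments need to be launched from a script, it can help to assemble the command line programmatically. The following is a minimal sketch rather than part of the repository: build_command and run are hypothetical helpers, and the sketch assumes that -V and -F are plain switches that take no value (check the --help output to confirm).

```python
import subprocess
import sys
from typing import List


def build_command(experiment: str, verbose: bool = False, force: bool = False) -> List[str]:
    """Assemble the argument list for scripts/run_experiment.py."""
    # Use the current interpreter so the active virtual environment is reused.
    cmd = [sys.executable, "scripts/run_experiment.py", "-E", experiment]
    if verbose:
        cmd.append("-V")  # assumed to be a switch: verbose execution
    if force:
        cmd.append("-F")  # assumed to be a switch: force re-runs
    return cmd


def run(experiment: str, **flags: bool) -> None:
    """Run one experiment from the repository root; raise if it fails."""
    subprocess.run(build_command(experiment, **flags), check=True)


if __name__ == "__main__":
    # Show the command that would be executed for the "test" experiment.
    print(" ".join(build_command("test", verbose=True)))
```

Passing the arguments as a list avoids shell quoting issues, and check=True stops a batch of runs as soon as one experiment fails.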
Experiment configurations are located in scripts/exp_configs.
The configuration scripts/exp_configs/test.py is provided so you can
familiarize yourself with the execution system.
Pre-defined sbatch configurations for Slurm are provided in sbatch_scripts for transparency and convenience.
These are the exact configurations used to run the original experiments on the Sherlock Cluster.
A flag specifying which sbatch script to use can be passed to run_experiment.py.
The script is used to submit Slurm jobs only if the -N parameter (which specifies the number of nodes)
is also passed.
All experiments are named to correspond with the figure or table that they
generate in the paper.
For example, Figure 1 can be generated by running the figure_1 experiment
defined in scripts/exp_configs/figure_1.py and then using
python scripts/make_figure_1_6.py
to generate Figures 1 and 6. Check each experiment file to see the exact experiment names. As noted above, some experiments must be executed in a specific order; these are:
- Table 2:
  - Run the experiments in table_2_gs.py first.
  - Run extract_table_2_best_params.py.
  - Run the experiments in table_2_final.py.
- Table 3:
  - Run the experiments in table_3_gs.py first. Note that table_3_nc_relu_gs must be run after table_3_relu_gs, since the latter experiment is used to determine the widths of the non-convex networks. You must update a file path used to load experiment results before running table_3_nc_relu_gs.
  - Run extract_table_3_best_params.py.
  - Run the experiments in table_3_final.py. Again, there are several file paths to update, and table_3_nc_relu_final must be run after table_3_relu_support.
- Figure 4:
  - Run the experiments in table_4.py. Again, you must run figure_4_relu before figure_4_nc_relu, and it is necessary to update a file path before running the latter.
The remaining experiments are straightforward to replicate.
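The ordering constraints listed above can be written down explicitly and checked before launching a batch of runs. This is a minimal sketch, not part of the repository: MUST_RUN_AFTER and order_violations are hypothetical names, and only the three pairwise constraints stated above are encoded. The manual file path updates are not handled.

```python
from typing import Dict, List

# Each key must be run after its value (constraints taken from the list above).
MUST_RUN_AFTER: Dict[str, str] = {
    "table_3_nc_relu_gs": "table_3_relu_gs",
    "table_3_nc_relu_final": "table_3_relu_support",
    "figure_4_nc_relu": "figure_4_relu",
}


def order_violations(planned: List[str]) -> List[str]:
    """Return one message per experiment scheduled before its prerequisite."""
    position = {name: index for index, name in enumerate(planned)}
    problems = []
    for later, earlier in MUST_RUN_AFTER.items():
        # Only compare pairs where both experiments are in the plan; a
        # prerequisite missing from the plan may have been run previously.
        if later in position and earlier in position:
            if position[earlier] > position[later]:
                problems.append(f"{later} is scheduled before {earlier}")
    return problems


if __name__ == "__main__":
    plan = ["figure_4_nc_relu", "figure_4_relu"]
    for message in order_violations(plan):
        print(message)
```

An empty list means that the plan respects every encoded constraint.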
Please open an issue if you experience any bugs or have trouble replicating the experiments.