This repository contains code for the following paper:
Var-CNN: A Data-Efficient Website Fingerprinting Attack Based on Deep Learning (PETS 2019)
Sanjit Bhat, David Lu, Albert Kwon, and Srini Devadas.
- Ensure that you have a machine with an NVIDIA GPU; the model will take significantly longer to run on a CPU.
- Make sure you have the TensorFlow/Keras deep learning stack installed. For detailed instructions, see this link under the "Software Setup" section. For our experiments, we used Ubuntu 16.04 LTS, CUDA 8.0, CuDNN v6, and TensorFlow 1.3.0 as a backend for Keras 2.0.8.
- To install all required Python packages, simply issue the following command:
```
pip install -r requirements.txt
```
The first step in running our model is to place an adequate number
of raw packet sequences in the `data_dir` folder. Each monitored
website needs to have at least `num_mon_inst_train` + `num_mon_inst_test`
instances, and there need to be at least
`num_unmon_sites_train` + `num_unmon_sites_test` unmonitored sites.
If you use the Wang et al. data format (i.e., each line representing a
new packet with the relative time and direction separated by a space),
then `wang_to_varcnn.py` supports it out of the box. Otherwise, you will
need to modify `wang_to_varcnn.py`, or you can write your own glue code
to convert to the Wang et al. format.
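For reference, here is a minimal sketch of parsing one trace in the Wang et al. format. It is illustrative only: `wang_to_varcnn.py` implements the actual conversion, and the `parse_wang_trace` helper below is a hypothetical name, not part of the repository.

```python
import numpy as np

def parse_wang_trace(path):
    """Parse one Wang et al. trace file: each line holds a relative
    timestamp and a direction (+1 outgoing, -1 incoming), separated
    by a space."""
    times, directions = [], []
    with open(path) as f:
        for line in f:
            fields = line.split()
            if len(fields) != 2:
                continue  # skip malformed lines
            times.append(float(fields[0]))
            directions.append(int(float(fields[1])))
    return np.array(times), np.array(directions)

# e.g., monitored instance 3 of site 0 in Wang et al.'s naming:
# times, directions = parse_wang_trace('data_dir/0-3')
```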
After setting up the data and specifying the parameters in `config.json`,
you can run all parts of our code just by issuing a `python run_model.py`
command. After that, our programs will be called in the following sequence:
- `wang_to_varcnn.py`: This parses the `data_dir` folder; extracts direction, time, metadata, and labels; and stores all the monitored and unmonitored traces in `all_closed_world.npz` and `all_open_world.npz`, respectively, in the `data_dir` folder.
- `preprocess_data.py`: This uses the data in `all_closed_world.npz` to pick a random `num_mon_inst_train` and `num_mon_inst_test` instances of each of the `num_mon_sites` monitored sites for the training and test sets, respectively. It also performs a similar random split for the unmonitored sites (using the `all_open_world.npz` file) and preprocesses all of these traces to scale the metadata, change to inter-packet timing, etc. (a minimal sketch of these transforms appears after this list). Finally, it saves the direction data, time data, metadata, and labels to `.h5` files to conserve RAM during the training process.
- `run_model.py`: This is the main file that first calls the prior two files. Next, it loads the model architectures from either `var_cnn.py` or `df.py`, trains the models, saves their predictions, and calls `evaluate.py` for evaluation.
- During training, `data_generator.py` generates new batches of data in parallel. Since large datasets can contain hundreds of thousands of traces, `data_generator.py` uses `.h5` files to access the traces for one batch without loading the entire dataset into memory (see the batching sketch after this list).
- `evaluate.py`: This first calculates metrics for each of the in-training combinations specified in `mixture`. Then, it averages each of their predictions together and reports metrics for the overall out-of-training ensemble. It saves all metrics to the `job_result.json` file.
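As a concrete illustration of the two preprocessing steps mentioned above (converting relative timestamps to inter-packet times, and scaling metadata to zero mean and unit variance), here is a minimal NumPy sketch; `preprocess_data.py` may implement these differently:

```python
import numpy as np

def to_inter_packet_times(rel_times):
    """Turn relative timestamps (time since the first packet) into
    inter-packet times (gaps between consecutive packets).
    Requires NumPy >= 1.16 for the `prepend` argument."""
    return np.diff(rel_times, prepend=0.0)

def standardize(metadata):
    """Scale each metadata feature to zero mean and unit variance,
    guarding against zero-variance features."""
    std = metadata.std(axis=0)
    return (metadata - metadata.mean(axis=0)) / np.where(std == 0, 1.0, std)
```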
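And, as referenced in the `data_generator.py` item, here is a minimal sketch of the h5-backed batching pattern, assuming a Keras version that provides `keras.utils.Sequence`. The dataset keys `'dir_seq'` and `'labels'` are illustrative, not the exact layout of the `.h5` files our code produces.

```python
import math
import h5py
import numpy as np
from keras.utils import Sequence

class H5BatchGenerator(Sequence):
    """Yield batches by slicing an .h5 file on demand, so the full
    dataset never has to fit in memory."""

    def __init__(self, h5_path, batch_size=50):
        self.h5_path, self.batch_size = h5_path, batch_size
        with h5py.File(h5_path, 'r') as f:
            self.num_traces = f['dir_seq'].shape[0]

    def __len__(self):
        # Number of batches per epoch.
        return int(math.ceil(self.num_traces / float(self.batch_size)))

    def __getitem__(self, idx):
        start = idx * self.batch_size
        end = min(start + self.batch_size, self.num_traces)
        # Re-opening the file per batch keeps the generator safe to
        # use with Keras's multiprocessing workers.
        with h5py.File(self.h5_path, 'r') as f:
            return np.array(f['dir_seq'][start:end]), np.array(f['labels'][start:end])
```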
`config.json` provides the configuration settings to all the
other programs. We describe its parameters in further detail below:
- `data_dir`: This relative path provides the location of the "raw" packet sequences (e.g., the "0", "1", "0-0", "0-1" files in Wang et al.'s dataset). Also, it later stores the `all_closed_world.npz` and `all_open_world.npz` files generated by `wang_to_varcnn.py` and the `.h5` data files generated by `preprocess_data.py`.
- `predictions_dir`: After training the model, `run_model.py` generates predictions for the test set and stores them in this directory. `evaluate.py` later uses them to calculate test metrics.
- `num_mon_sites`: The number of monitored websites. Each of the `num_mon_sites` sites in `data_dir` must have at least `num_mon_inst_train` + `num_mon_inst_test` instances.
- `num_mon_inst_train`: The number of monitored instances used for training.
- `num_mon_inst_test`: The number of monitored instances used for testing.
- `num_unmon_sites_train`: The number of unmonitored sites used for training. Each site has one instance.
- `num_unmon_sites_test`: The number of unmonitored sites used for testing. Each site has one instance, and these unmonitored websites are different from those used for training.
- `model_name`: The model name: either "var-cnn" or "df".
- `batch_size`: The batch size used during training. For Var-CNN, we found that a batch size of 50 works well. The recommended batch size for DF is 128.
- `mixture`: The mixture of ensembles used during training and evaluation. Each of the inner arrays represents models combined in-training. `run_model` will save the predictions for every such in-training combination. Subsequently, `evaluate_ensemble` will report metrics for these individual models as well as for the overall out-of-training ensemble (i.e., the average of the individual predictions; see the ensemble-averaging sketch after this list). Note: this functionality only works with Var-CNN (in fact, deep fingerprinting will automatically default to using `[["dir"]]`). Also, do not use two in-training combinations with the same components, as their prediction files will be overwritten. Default: `[["dir", "metadata"], ["time", "metadata"]]` for Var-CNN.
- `seq_length`: The length of the input sequence fed into the CNN (default: 5000). We use this parameter right from the start when scraping the raw data.
- `df_epochs`: The number of epochs used to train DF (default: 30).
- `var_cnn_max_epochs`: The maximum number of epochs used to train Var-CNN (default: 150). The `EarlyStopping` callback often cuts off training much sooner -- whenever validation accuracy fails to increase.
- `var_cnn_base_patience`: The "patience" (i.e., the number of epochs without validation accuracy improvement) before we decrease the learning rate of Var-CNN and, eventually, stop training (default: 5). We implement this functionality with the `ReduceLROnPlateau` and `EarlyStopping` callbacks inside `var_cnn.py`; see the callbacks sketch after this list.
- `dir_dilations`: Whether to use dilations with the direction ResNet (default: true).
- `time_dilations`: Whether to use dilations with the time ResNet (default: true).
- `inter_time`: Whether to use inter-packet time (i.e., the time between two consecutive packets) or relative time (i.e., the time since the first packet) for the timing data (default: true, i.e., we do use inter-packet time).
- `scale_metadata`: Whether to scale the metadata to zero mean and unit variance (default: true).
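Putting these together, a `config.json` consistent with the defaults above might look like the following. The directory paths and dataset sizes are illustrative; set them to match your own data.

```json
{
  "data_dir": "data/",
  "predictions_dir": "predictions/",
  "num_mon_sites": 100,
  "num_mon_inst_train": 90,
  "num_mon_inst_test": 10,
  "num_unmon_sites_train": 9000,
  "num_unmon_sites_test": 1000,
  "model_name": "var-cnn",
  "batch_size": 50,
  "mixture": [["dir", "metadata"], ["time", "metadata"]],
  "seq_length": 5000,
  "df_epochs": 30,
  "var_cnn_max_epochs": 150,
  "var_cnn_base_patience": 5,
  "dir_dilations": true,
  "time_dilations": true,
  "inter_time": true,
  "scale_metadata": true
}
```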
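As mentioned in the `mixture` description, the out-of-training ensemble is simply the average of the individual in-training predictions. A minimal sketch of that averaging (the prediction file names here are hypothetical; check `predictions_dir` for the actual files `run_model.py` writes):

```python
import numpy as np

# Hypothetical file names -- one prediction matrix per in-training
# combination, each of shape (num_test_traces, num_classes).
dir_preds = np.load('predictions/dir_metadata.npy')
time_preds = np.load('predictions/time_metadata.npy')

# Out-of-training ensemble: average the softmax outputs, then take
# the most likely class per trace.
ensemble = (dir_preds + time_preds) / 2.0
predicted_labels = ensemble.argmax(axis=1)
```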
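Similarly, the schedule behind `var_cnn_base_patience` can be expressed with standard Keras callbacks. This is a sketch only: the monitored metric, learning-rate reduction factor, and patience multiplier used in the actual `var_cnn.py` may differ.

```python
from keras.callbacks import EarlyStopping, ReduceLROnPlateau

base_patience = 5  # var_cnn_base_patience from config.json

callbacks = [
    # Lower the learning rate after `base_patience` epochs without
    # validation-accuracy improvement (factor is illustrative)...
    ReduceLROnPlateau(monitor='val_acc', factor=0.1,
                      patience=base_patience, verbose=1),
    # ...and stop training entirely after a longer stall.
    EarlyStopping(monitor='val_acc', patience=2 * base_patience),
]
# model.fit(x, y, validation_data=..., callbacks=callbacks)
```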
If you find Var-CNN useful in your research, please consider citing:
```
@article{bhat19,
  title={{Var-CNN: A Data-Efficient Website Fingerprinting Attack Based on Deep Learning}},
  author={Bhat, Sanjit and Lu, David and Kwon, Albert and Devadas, Srinivas},
  journal={Proceedings on Privacy Enhancing Technologies},
  volume={2019},
  number={4},
  pages={292--310},
  year={2019}
}
```
sanjit.bhat (at) gmail.com
davidboxboro (at) gmail.com
kwonal (at) mit.edu
devadas (at) mit.edu
Any discussions, suggestions, and questions are welcome!