The source code and data for our EMNLP 2022 paper "Tiny-NewsRec: Effective and Efficient PLM-based News Recommendation".

## Requirements
- PyTorch == 1.6.0
- TensorFlow == 1.15.0
- horovod == 0.19.5
- transformers == 3.0.2
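For reference, one possible way to install these pinned versions with pip is sketched below. This is an assumption, not the authors' setup script; it presumes a Python environment compatible with TensorFlow 1.15 (e.g., Python 3.7), and Horovod may need different build flags for your CUDA/MPI setup.

```bash
# Hypothetical environment setup with the versions listed above
pip install torch==1.6.0 tensorflow==1.15.0 transformers==3.0.2
# Horovod is compiled by pip; enable the framework integrations you need, e.g.:
HOROVOD_WITH_PYTORCH=1 pip install horovod==0.19.5
```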
## Prepare Data
You can download and unzip the public MIND dataset with the following command:
```bash
# Under Tiny-NewsRec/
mkdir MIND && mkdir log_all && mkdir model_all
cd MIND
wget https://mind201910small.blob.core.windows.net/release/MINDlarge_train.zip
wget https://mind201910small.blob.core.windows.net/release/MINDlarge_dev.zip
wget https://mind201910small.blob.core.windows.net/release/MINDlarge_test.zip
unzip MINDlarge_train.zip -d MINDlarge_train
unzip MINDlarge_dev.zip -d MINDlarge_dev
unzip MINDlarge_test.zip -d MINDlarge_test
cd ../
```
Then, run `python split_file.py` under `Tiny-NewsRec/` to prepare the training data. Set `N` in line 13 of `split_file.py` to the number of available GPUs. This script will construct the training samples and split them into `N` files for multi-GPU training.
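For example, with 4 GPUs (the value is illustrative):

```bash
# Under Tiny-NewsRec/ -- after setting N = 4 in line 13 of split_file.py,
# this constructs the training samples and writes them into 4 split files.
python split_file.py
```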
## Run Experiments
Our Tiny-NewsRec method consists of the following four steps.

### Step 1
Run the notebook `Domain-specific_Post-train.ipynb` to domain-specifically post-train the PLM-based news encoder. This will generate a checkpoint named `DP_12_layer_{step}.pt` every `args.T` steps under `Tiny-NewsRec/`. Then set the variable `ckpt_paths` to the paths of the last $M$ checkpoints and run the remaining cells. For each checkpoint, this will generate two `.pkl` files named `teacher_title_emb_{idx}.pkl` and `teacher_body_emb_{idx}.pkl`, which are used for the post-training stage knowledge distillation in our method.
### Step 2
Run the notebook `Post-train_KD.ipynb` to perform the post-training stage knowledge distillation in our method. Modify `args.num_hidden_layers` to change the number of Transformer layers in the student model. This will generate a checkpoint of the student model under `Tiny-NewsRec/`.
### Step 3
Use the script `Tiny-NewsRec/PLM-NR/demo.sh` to finetune the $M$ teacher models post-trained in Step 1 on the news recommendation task. Remember to set `use_pretrain_model` to `True` and set `pretrain_model_path` to the path of each of these teacher models in turn.
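A minimal sketch of this step is shown below; the `train` mode argument and the example checkpoint name are assumptions borrowed from the Step 4 script, so check `PLM-NR/demo.sh` for the exact parameter names.

```bash
# Hypothetical: finetune one teacher. In PLM-NR/demo.sh set
#   use_pretrain_model=True
#   pretrain_model_path=<path to a Step 1 checkpoint, e.g. ../DP_12_layer_{step}.pt>
# then launch the script, and repeat once per teacher checkpoint.
cd PLM-NR
bash demo.sh train
```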
### Step 4
Run `bash demo.sh get_teacher_emb` under `Tiny-NewsRec/Tiny-NewsRec` to generate the news embeddings of the $M$ teacher models finetuned in Step 3, which are used for the finetuning stage knowledge distillation in our method. Set `teacher_ckpts` to the paths of these teacher models (separated by spaces).

Then use the script `Tiny-NewsRec/Tiny-NewsRec/demo.sh` to run the finetuning stage knowledge distillation. Modify the value of `num_student_layers` to change the number of Transformer layers in the student model, and set `bert_trainable_layers` to the indexes of its last two layers (starting from 0). Set `use_pretrain_model` to `True` and `pretrain_model_path` to the path of the student model checkpoint generated in Step 2. Then you can start training with `bash demo.sh train`.
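Putting Step 4 together, a possible sequence of commands is sketched below; the variable values are illustrative placeholders, not the shipped defaults.

```bash
# Under Tiny-NewsRec/Tiny-NewsRec/ -- in demo.sh set, for example:
#   teacher_ckpts="<teacher_1.pt> <teacher_2.pt> ..."   # Step 3 checkpoints, space-separated
#   num_student_layers=4
#   bert_trainable_layers="2 3"   # last two layers of a 4-layer student, counting from 0
#   use_pretrain_model=True
#   pretrain_model_path=<student checkpoint from Step 2>
bash demo.sh get_teacher_emb   # dump the teacher news embeddings
bash demo.sh train             # finetuning-stage knowledge distillation
```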