TensorFlow v1 to v2 migration for NCF models #668

Merged
merged 1 commit into from
Apr 21, 2025

qtuantruong
Member

@qtuantruong qtuantruong commented Apr 19, 2025

TensorFlow v1 to v2 Migration for NCF models

This PR migrates the Neural Collaborative Filtering (NCF) models from TensorFlow v1 to TensorFlow v2. The changes are outlined below.

Key Changes

1. Removed TensorFlow v1 Compatibility Mode

  • Removed import tensorflow.compat.v1 as tf and tf.disable_v2_behavior()
  • Now using native TensorFlow v2 APIs with import tensorflow as tf
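
As a rough illustration of what dropping compatibility mode means in practice, the sketch below uses the plain TF2 import and relies on eager execution. The tensors are made-up values, not taken from the NCF code.

```python
# Before (TF1 compatibility mode), kept here only as comments:
#   import tensorflow.compat.v1 as tf
#   tf.disable_v2_behavior()

import tensorflow as tf

# TF2 executes eagerly by default: ops return concrete values right away,
# with no Session.run() call and no feed_dict.
x = tf.constant([[1.0, 2.0]])
w = tf.constant([[3.0], [4.0]])
y = tf.matmul(x, w)

eager = tf.executing_eagerly()
result = float(y[0, 0])  # 1*3 + 2*4
```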

2. Session-based to Keras Model-based Approach

  • Replaced session-based execution with Keras models
  • Removed placeholders and feed dictionaries
  • Renamed _build_graph_tf to _build_model_tf to reflect the Keras model approach
  • Updated the model saving and loading functionality to use Keras model weights
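
A minimal sketch of the placeholder-to-Keras shift, assuming a GMF-style model with one user and one item embedding. The sizes and names (`NUM_USERS`, `user_ids`, and so on) are hypothetical and do not come from the PR; the actual `_build_model_tf` may be structured differently.

```python
import tensorflow as tf

# Hypothetical sizes, chosen only for illustration.
NUM_USERS, NUM_ITEMS, EMB_SIZE = 5, 7, 4

# Keras Inputs take the role that tf.placeholder played in the TF1 graph.
user_ids = tf.keras.Input(shape=(), dtype="int32", name="user_ids")
item_ids = tf.keras.Input(shape=(), dtype="int32", name="item_ids")

user_vec = tf.keras.layers.Embedding(NUM_USERS, EMB_SIZE)(user_ids)
item_vec = tf.keras.layers.Embedding(NUM_ITEMS, EMB_SIZE)(item_ids)

# Element-wise product of the two embeddings, then a sigmoid output unit.
interaction = tf.keras.layers.Multiply()([user_vec, item_vec])
score = tf.keras.layers.Dense(1, activation="sigmoid")(interaction)

model = tf.keras.Model(inputs=[user_ids, item_ids], outputs=score)

# Calling the model directly replaces sess.run(..., feed_dict={...}).
scores = model([tf.constant([0, 1]), tf.constant([2, 3])])
```

Each call returns one score per (user, item) pair, so no feed dictionary or session handle needs to be passed around.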

3. Updated Backend Implementation

  • Created Keras Layer classes for GMF and MLP in backend_tf.py
  • Updated the embedding implementation to use Keras Embedding layers
  • Updated the dense layer implementation to use Keras Dense layers
  • Updated the regularization loss collection method
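
The sketch below shows one way such a layer could look. `GMFSketch` is a hypothetical stand-in, not the `GMFLayer` class from backend_tf.py, and the constructor arguments are assumptions. It also shows how regularization terms can be read from the layer's `losses` property, which is the usual TF2 replacement for the TF1 regularization-loss graph collection.

```python
import tensorflow as tf


class GMFSketch(tf.keras.layers.Layer):
    """Hypothetical GMF-style layer: element-wise product of two embeddings."""

    def __init__(self, num_users, num_items, emb_size, reg=0.01, **kwargs):
        super().__init__(**kwargs)
        self.user_emb = tf.keras.layers.Embedding(
            num_users, emb_size,
            embeddings_regularizer=tf.keras.regularizers.L2(reg),
        )
        self.item_emb = tf.keras.layers.Embedding(
            num_items, emb_size,
            embeddings_regularizer=tf.keras.regularizers.L2(reg),
        )

    def call(self, inputs):
        user_ids, item_ids = inputs
        return self.user_emb(user_ids) * self.item_emb(item_ids)


layer = GMFSketch(num_users=5, num_items=7, emb_size=4)
out = layer((tf.constant([0, 1]), tf.constant([2, 3])))

# Regularization terms from the nested Embedding layers are exposed
# through layer.losses once the layer has been called.
reg_loss = tf.add_n(layer.losses)
```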

4. Updated Training Loop

  • Replaced session-based training with eager execution using GradientTape
  • Updated the optimizer creation to use Keras optimizers
  • Updated the loss function to work with Keras models
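
For readers unfamiliar with the pattern, here is a minimal sketch of an eager training step with `GradientTape`. It uses a toy one-layer model and synthetic data rather than the NCF models, and the optimizer and loss choices (Adam, binary cross-entropy) are assumptions made for the example.

```python
import tensorflow as tf

tf.random.set_seed(0)

# Toy stand-in for an NCF model: a single sigmoid unit.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, activation="sigmoid")])
optimizer = tf.keras.optimizers.Adam(learning_rate=0.05)
bce = tf.keras.losses.BinaryCrossentropy()

# Synthetic, linearly separable data.
x = tf.random.normal((64, 8))
y = tf.cast(tf.reduce_sum(x, axis=1, keepdims=True) > 0, tf.float32)


def train_step(features, labels):
    # Record the forward pass so gradients can be computed afterwards.
    with tf.GradientTape() as tape:
        pred = model(features, training=True)
        loss = bce(labels, pred)
        if model.losses:  # add regularization terms, if any layer defines them
            loss += tf.add_n(model.losses)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss


losses = [float(train_step(x, y)) for _ in range(50)]
```

The loop body replaces the TF1 pattern of running a train op inside a session; the loss is an ordinary tensor that can be logged directly.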

5. Updated Pretrained Model Handling

  • Updated the pretrained model loading logic to work with Keras models
  • Implemented weight transfer between models using Keras layer weights
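
Weight transfer between Keras layers can be sketched with `get_weights` and `set_weights`. The two embedding layers below are hypothetical placeholders for a pretrained model's layer and the matching layer in a new model; how recom_neumf.py actually maps pretrained GMF and MLP weights into NeuMF is not shown in this PR description.

```python
import numpy as np
import tensorflow as tf

# Hypothetical layers standing in for a pretrained model and a new model.
pretrained = tf.keras.layers.Embedding(10, 4, name="pretrained_user_embedding")
target = tf.keras.layers.Embedding(10, 4, name="target_user_embedding")

# Call each layer once so its weights are created.
ids = tf.constant([0, 1, 2])
pretrained(ids)
target(ids)

# Copy the weights; shapes must match between the two layers.
target.set_weights(pretrained.get_weights())

copied = np.allclose(pretrained.get_weights()[0], target.get_weights()[0])
```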

File-specific Changes

backend_tf.py

  • Created GMFLayer and MLPLayer classes that extend tf.keras.layers.Layer
  • Updated activation functions to use TensorFlow v2 APIs
  • Updated loss function to use TensorFlow v2 regularization loss collection
  • Replaced TensorFlow v1 optimizers with TensorFlow v2 Keras optimizers

recom_ncf_base.py

  • Updated _fit_tf to use eager execution with GradientTape
  • Implemented a proper _score_tf method that uses the Keras model
  • Updated model saving and loading to use Keras model weights
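
A weights-only save and load round trip might look like the sketch below. The toy model, the `build_model` helper, and the file name are all hypothetical, and the file format used in recom_ncf_base.py is not stated in this PR; the `.weights.h5` suffix is used here because recent Keras versions require it for `save_weights`.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


def build_model():
    # Hypothetical toy architecture; both instances must share the same structure.
    return tf.keras.Sequential(
        [tf.keras.Input(shape=(3,)), tf.keras.layers.Dense(2)]
    )


trained = build_model()
restored = build_model()  # freshly initialized, so its weights differ at first

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "ncf_sketch.weights.h5")
    trained.save_weights(path)
    restored.load_weights(path)

same = all(
    np.allclose(a, b)
    for a, b in zip(trained.get_weights(), restored.get_weights())
)
```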

recom_gmf.py, recom_mlp.py, recom_neumf.py

  • Updated model building to use Keras functional API
  • Removed session-based code and placeholders
  • Updated pretrained model handling in recom_neumf.py

@qtuantruong qtuantruong requested a review from hieuddo April 19, 2025 18:47
@hieuddo
Member

hieuddo commented Apr 21, 2025

I ran ncf_example.py and the results look fine.

| Model | NDCG@50 | Recall@50 | Train (s) | Test (s) |
|---|---|---|---|---|
| GMF_tensorflow | 0.0409 | 0.1175 | 90.5001 | 13.2401 |
| MLP_tensorflow | 0.0387 | 0.1146 | 99.0240 | 18.8367 |
| NeuMF_tensorflow | 0.0407 | 0.1273 | 108.9428 | 27.2931 |
| NeuMF_pretrained_tensorflow | 0.0390 | 0.1156 | 88.7559 | 27.8466 |
| GMF_pytorch | 0.0407 | 0.1167 | 83.3929 | 1.6627 |
| MLP_pytorch | 0.0391 | 0.1105 | 67.2847 | 2.1375 |
| NeuMF_pytorch | 0.0397 | 0.1155 | 68.0123 | 2.4424 |
| NeuMF_pretrained_pytorch | 0.0407 | 0.1167 | 68.1153 | 2.4476 |

The environment setup was quite troublesome on my side. I had version conflicts between numpy, tensorflow, pytorch, and cuda. I ended up replicating the Google Colab environment, and it worked.

Shall we split the cornac/models/ncf/requirements.txt into two separate files:

  • requirements_pt.txt: torch>=0.4.1
  • requirements_tf.txt: numpy==2.0.2, tensorflow==2.18.0 (for reference, could be less strict using >=)

Also, what do you think about updating ncf_example.py to run both backends?

@qtuantruong
Member Author

@hieuddo could you try setting up a new env with the following: tensorflow>=2.12.0, torch>=0.4.1? I didn't face any conflict issues with that. Regarding numpy, we might not want to specify a version because it'll be installed as a dependency of tensorflow/torch anyway, and we want them to be compatible.

@hieuddo
Member

hieuddo commented Apr 21, 2025

Yeah, it worked. I guess it was because I installed the packages one by one, and conda couldn't find a feasible set of dependencies once some versions were already in place.

Also, what do you think about updating ncf_example.py to run both backends?

@qtuantruong
Member Author

> Also, what do you think about updating ncf_example.py to run both backends?

Looks like the example already indicates that there are two backend options. I'm open to updating it to run both if you think it's necessary.

@hieuddo
Member

hieuddo commented Apr 21, 2025

It's not really necessary. I just thought it might be helpful to see a comparison between the two backends in one run.

Feel free to merge!

@qtuantruong qtuantruong merged commit eb58278 into PreferredAI:master Apr 21, 2025
16 checks passed
@qtuantruong qtuantruong deleted the ncf-tf2 branch April 21, 2025 18:12