65 changes: 50 additions & 15 deletions tfx_addons/firebase_publisher/README.md
@@ -7,10 +7,10 @@

**Your company/organization:** Individual (ML GDE)

**Project name:** [Firebase ML Publisher](https://github.com/tensorflow/tfx-addons/issues/59)
**Project name:** [FirebasePublisher](https://github.com/tensorflow/tfx-addons/issues/59)

## Project Description
This project defines a custom TFX component to publish/update ML models from TFX Pusher to [Firebase ML](https://firebase.google.com/products/ml). The input model from TFX Pusher is assumed to be a TFLite format.
This project defines a custom TFX component to publish/update ML models to [Firebase ML](https://firebase.google.com/products/ml).
> **Contributor:** Maybe add a line to denote why pushing models to Firebase ML is useful?
>
> **Contributor Author:** That is noted in the Project Use-Case(s) section :)

## Project Category
Component
@@ -21,25 +21,60 @@ This project helps users to publish trained models directly from TFX Pusher component…
With Firebase ML, we can guarantee that mobile devices are equipped with the latest ML model without explicitly embedding the model binary at application build time. We can even A/B test different versions of a model with Google Analytics once the model is published on Firebase ML.

## Project Implementation
Firebase ML Publisher component will be implemented as Python function-based component. You can find the [actual source code](https://github.com/sayakpaul/Dual-Deployments-on-Vertex-AI/blob/main/custom_components/firebase_publisher.py) in my personal project.
The Firebase ML Publisher component will be implemented as a Python class-based component. You can find the [actual source code](https://github.com/deep-diver/complete-mlops-system-workflow/tree/feat/firebase-publisher/training_pipeline/pipeline/components/pusher/FirebasePublisher) in my personal project.

The implementation details (a minimal sketch of this variant follows the list):
- Define a custom Python function-based TFX component. It takes the following parameters from a previous component.
  - The URI of the pushed model from the TFX Pusher component.
  - Requirements from Firebase ML (credential JSON file path, Firebase temporary-use GCS bucket). Please find more information in the [Before you begin section](https://firebase.google.com/docs/ml/manage-hosted-models#before_you_begin) of the official Firebase documentation.
  - Meta information to manage the published model on Firebase ML, such as `display name` and `tags`.
- Download the Firebase credential file and the pushed TFLite model file.
- Initialize Firebase Admin with the credential and the Firebase temporary-use GCS bucket.
- Search whether any model with the same `display name` has already been published.
  - If yes, update the existing Firebase ML model, then publish it.
  - If no, create a new Firebase ML model, then publish it.
- Return `tfx.dsl.components.OutputDict(result=str)` to indicate whether the job succeeded, and whether it created a new Firebase ML model or updated an existing one.
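
As a rough illustration of this function-based variant, the sketch below shows what the component skeleton could look like; it is not the final implementation, and names such as `credential_uri` and `firebase_bucket` are illustrative placeholders.

```python
# Illustrative sketch only; parameter names are placeholders, not the final API.
from tfx import v1 as tfx


@tfx.dsl.components.component
def FirebaseMLPublisher(
    pushed_model: tfx.dsl.components.InputArtifact[
        tfx.types.standard_artifacts.PushedModel],
    display_name: tfx.dsl.components.Parameter[str],
    tags: tfx.dsl.components.Parameter[str],  # e.g. comma-separated tags
    credential_uri: tfx.dsl.components.Parameter[str],
    firebase_bucket: tfx.dsl.components.Parameter[str],
) -> tfx.dsl.components.OutputDict(result=str):
    """Publishes the pushed TFLite model to Firebase ML."""
    # 1. Download the credential file and the pushed TFLite model.
    # 2. Initialize Firebase Admin with the credential and the GCS bucket.
    # 3. Create a new Firebase ML model or update the existing one, then publish.
    result = "created or updated"  # placeholder for the real outcome string
    return {"result": result}
```
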
- This component behaves similarly to the [Pusher](https://www.tensorflow.org/tfx/api_docs/python/tfx/v1/components/Pusher) component, but it pushes/hosts the model to Firebase ML instead. To this end, FirebasePublisher inherits from Pusher, and it takes the following inputs:

```python
FirebasePublisher(
    model: types.BaseChannel = None,
    model_blessing: Optional[types.BaseChannel] = None,
    custom_config: Optional[Dict[str, Any]] = None,
)
```

- The inputs (a usage sketch follows the configuration example below):
  - `model` : the model from an upstream TFX component such as [Trainer](https://www.tensorflow.org/tfx/api_docs/python/tfx/v1/components/Trainer)
  - `model_blessing` : the `blessing` output of the Evaluator component, indicating whether the given `model` is good enough to be pushed
  - `custom_config` : additional information to initialize and configure the [Firebase Admin SDK](https://firebase.google.com/docs/reference/admin/python). `FIREBASE_ML_MODEL_NAME` and `FIREBASE_ML_MODEL_TAGS` correspond to the `display_name` and `tags`, respectively, of the Firebase-hosted model. `FIREBASE_CREDENTIALS` is an optional parameter that indicates the GCS location where a Service Account key (JSON) file is stored. If this parameter is not given, [Application Default Credentials](https://cloud.google.com/docs/authentication/production) will be used in a GCP environment.

```python
FIREBASE_ML_ARGS = {
    "FIREBASE_ML": {
        "FIREBASE_CREDENTIALS": ...,
        "FIREBASE_ML_MODEL_NAME": ...,
        "FIREBASE_ML_MODEL_TAGS": ["tag1", ...],
        "OPTIONS": {
            # Passed directly into the `options` argument of firebase_admin.initialize_app.
            # The mandatory option is `storageBucket`, where Firebase ML temporarily stores the model:
            # https://firebase.google.com/docs/reference/admin/python/firebase_admin#initialize_app
            "storageBucket": ...,
        },
    }
}
```

> **Member:** For this you may want to instead use the output PushedModel location instead of asking the user to provide a bucket. Or is this bucket Firebase specific?
>
> **Contributor Author:** Yes, this is Firebase specific. For now, it doesn't let us host a model without this.
>
> **Member (@casassg, Aug 23, 2022):** Can you add a reference to what `storageBucket` refers to? The link provided also doesn't seem to explain what a `storageBucket` is, which makes it pretty confusing.
>
> **Contributor Author:** It seems the document itself is not enough, but the source may be: https://github.com/firebase/firebase-admin-python/blob/44a8fde5672828232ffa68267a71eedb270dbb16/firebase_admin/ml.py#L491
> `storageBucket` is the key to use when initializing an app, and it is used in the next step by `TFLiteGCSModelSource.from_tflite_model_file` or `from_saved_model`. As you can see, it uploads files to the designated GCS bucket.
> I hoped this could refer to the output from the Trainer, but it seems like it has to upload. I may be wrong about the word "temporary", though; this could be where the actual model is hosted when the model is published in Firebase.
>
> **Member:** So it seems you can just provide it a `gcs_tflite_uri`, isn't that correct? https://github.com/firebase/firebase-admin-python/blob/44a8fde5672828232ffa68267a71eedb270dbb16/firebase_admin/ml.py#L478
> That said, I agree it may be good to ask the user to provide one if they want to overwrite what gets used.
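
For reference, a minimal usage sketch (not part of the proposal text) of how the component could be wired into a pipeline, assuming `trainer` and `evaluator` components are already defined and `FIREBASE_ML_ARGS` is the dict shown above:

```python
# Hypothetical wiring; `trainer` and `evaluator` are assumed upstream components.
firebase_publisher = FirebasePublisher(
    model=trainer.outputs['model'],
    model_blessing=evaluator.outputs['blessing'],
    custom_config=FIREBASE_ML_ARGS,
)
```
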

- It outputs the following information via the [`_MarkPushed`](https://github.com/tensorflow/tfx/blob/3b5290aa77c2df52a4791715cfd761be7696fe81/tfx/components/pusher/executor.py#L222) method of the Pusher component (a small sketch of how these values are formed follows):
  - `pushed` : indicates whether the model was pushed without any issue, i.e., whether the model was blessed.
  - `pushed_destination` : URL string for easy access to the model from the Firebase Console, such as `f"https://console.firebase.google.com/u/1/project/{PROJECT_ID}/ml/custom"`
  - `pushed_version` : version string of the pushed model. This is determined in the same manner as Pusher, by `str(int(time.time()))`.
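
To make the two derived values concrete, a tiny sketch of how they are formed, with `PROJECT_ID` as a placeholder for the Firebase project id:

```python
import time

PROJECT_ID = "my-gcp-project"  # placeholder Firebase/GCP project id

pushed_version = str(int(time.time()))  # same versioning scheme as Pusher
pushed_destination = (
    f"https://console.firebase.google.com/u/1/project/{PROJECT_ID}/ml/custom"
)
```
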

- The detailed behaviour of this component (a condensed sketch follows the list):

  - Initialize the Firebase app with `firebase_admin.initialize_app` from the [Firebase Admin SDK](https://firebase.google.com/docs/admin/setup), using the inputs passed in `custom_config`. When `FIREBASE_CREDENTIALS` is given, the credential file is downloaded first. Otherwise, all the values in `OPTIONS` are passed to the `options` parameter.

  - Download the model from the upstream TFX component if the model is blessed. Unfortunately, Firebase ML only lets us upload/host a model from local storage, so this step is required. Along the way, if the model is in `TFLite` format, the local flag `is_tflite` is set to `True`.

  - If the model is a `SavedModel` (which can be determined when `is_tflite` is `False`), `TFLiteGCSModelSource.from_saved_model(model_path)` is called. This function converts the `SavedModel` to `TFLite` and temporarily stores it in the GCS bucket specified by `storageBucket` in `custom_config`. Otherwise, `TFLiteGCSModelSource.from_tflite_model_file(model_path)` is used to directly upload the given `TFLite` model file.

  - Search the list of models whose `display_name` is the same as the `FIREBASE_ML_MODEL_NAME` in `custom_config`. If the list is empty, a new model is created and hosted. If the list is non-empty, the existing model is updated.
    - In either case, the tags are updated with the `FIREBASE_ML_MODEL_TAGS` in `custom_config`. In addition, a tag with the model version is added automatically.
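
The condensed sketch below illustrates this create-or-update flow with the Firebase Admin SDK; the paths, names, and bucket are placeholders, and the blessing check, credential download, and error handling are omitted:

```python
import firebase_admin
from firebase_admin import ml

# Placeholders standing in for values resolved from the component inputs.
model_path = "/tmp/model.tflite"       # local copy of the (converted) TFLite model
display_name = "my_firebase_model"     # FIREBASE_ML_MODEL_NAME
tags = ["tag1"]                        # FIREBASE_ML_MODEL_TAGS

# Uses Application Default Credentials when no explicit credential is given.
firebase_admin.initialize_app(options={"storageBucket": "my-project.appspot.com"})

# Upload the model to the temporary-use GCS bucket; use from_saved_model(...)
# instead when the input is a SavedModel.
source = ml.TFLiteGCSModelSource.from_tflite_model_file(model_path)
model_format = ml.TFLiteFormat(model_source=source)

# Create a new Firebase ML model, or update the existing one with the same name.
existing = list(
    ml.list_models(list_filter=f"display_name={display_name}").iterate_all())
if existing:
    model = existing[0]
    model.tags = tags
    model.model_format = model_format
    model = ml.update_model(model)
else:
    model = ml.create_model(
        ml.Model(display_name=display_name, tags=tags, model_format=model_format))

ml.publish_model(model.model_id)
```
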

## Project Dependencies
The implementation will use the following libraries.
- [Firebase Admin Python SDK](https://github.com/firebase/firebase-admin-python)
- [Python Client for Google Cloud Storage](https://github.com/googleapis/python-storage)

## Project Team
**Project Leader** : Chansung Park, deep-diver, [email protected]
1. Sayak Paul, sayakpaul, [email protected]