This repository was archived by the owner on Jan 15, 2024. It is now read-only.

Conversation

@eric-haibin-lin (Member) commented Sep 7, 2019

Description

This PR ports the DistilBERT model from https://arxiv.org/abs/1910.01108.
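
As a hedged sketch (not code from this PR): once merged, the model should be loadable through the GluonNLP model zoo. The model and dataset names below are assumptions based on GluonNLP's naming convention and may differ in the final code.

    # Hypothetical usage sketch; model/dataset names are assumed.
    import mxnet as mx
    import gluonnlp as nlp

    # 6 layers, 768 hidden units, 12 attention heads.
    model, vocab = nlp.model.get_model(
        'distilbert_6_768_12',  # assumed model zoo name
        dataset_name='distilbert_book_corpus_wiki_en_uncased',  # assumed
        pretrained=True)

    # DistilBERT has no token-type embeddings, so the forward pass is
    # assumed to take only token ids and valid lengths.
    tokens = ['[CLS]', 'hello', 'world', '[SEP]']
    ids = mx.nd.array([vocab[tokens]])
    valid_len = mx.nd.array([len(tokens)])
    seq_encoding = model(ids, valid_len)
    print(seq_encoding.shape)  # (1, 4, 768)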

Checklist

Essentials

  • PR's title starts with a category (e.g. [BUGFIX], [MODEL], [TUTORIAL], [FEATURE], [DOC], etc)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • Code is well-documented

Changes

  • Feature1, tests, (and when applicable, API doc)
  • Feature2, tests, (and when applicable, API doc)

Comments

  • If this change is backward incompatible, explain why it must be made.
  • Interesting edge cases to note here

@codecov bot commented Sep 7, 2019

Codecov Report

❗ No coverage uploaded for pull request head (convert@087dbcd).
The diff coverage is n/a.

@codecov bot commented Sep 7, 2019

Codecov Report

Merging #922 into master will decrease coverage by 0.52%.
The diff coverage is 100%.


@@            Coverage Diff            @@
##           master    #922      +/-   ##
=========================================
- Coverage   88.32%   87.8%   -0.53%     
=========================================
  Files          67      71       +4     
  Lines        6330    6723     +393     
=========================================
+ Hits         5591    5903     +312     
- Misses        739     820      +81
Impacted Files Coverage Δ
src/gluonnlp/data/question_answering.py 100% <100%> (ø) ⬆️
src/gluonnlp/model/seq2seq_encoder_decoder.py 50% <0%> (-30%) ⬇️
src/gluonnlp/model/utils.py 70.76% <0%> (-6.93%) ⬇️
src/gluonnlp/model/transformer.py 86.85% <0%> (-4.81%) ⬇️
src/gluonnlp/model/attention_cell.py 87.7% <0%> (-4.47%) ⬇️
src/gluonnlp/model/block.py 51.06% <0%> (-2.13%) ⬇️
src/gluonnlp/calibration/__init__.py 100% <0%> (ø)
src/gluonnlp/data/datasetloader.py 85.34% <0%> (ø)
src/gluonnlp/calibration/collector.py 26.66% <0%> (ø)
src/gluonnlp/data/dataloader.py 87.93% <0%> (ø)
... and 4 more

@mli (Member) commented Sep 7, 2019

Job PR-922/1 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-922/1/index.html

@szha requested a review from a team September 8, 2019 03:52
@eric-haibin-lin requested a review from a team as a code owner January 25, 2020 01:23
@eric-haibin-lin added the "release focus" label (Progress focus for release) Jan 25, 2020
@eric-haibin-lin changed the title from "[WIP] port distilBERT" to "[Model] DistilBERT" Jan 25, 2020
@mli (Member) commented Jan 25, 2020

Job PR-922/5 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-922/5/index.html

@mli (Member) commented Jan 25, 2020

Job PR-922/6 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-922/6/index.html

@mli (Member) commented Jan 26, 2020

Job PR-922/7 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-922/7/index.html

@mli (Member) commented Jan 26, 2020

Job PR-922/8 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-922/8/index.html

@mli (Member) commented Jan 26, 2020

Job PR-922/9 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-922/9/index.html

@@ -0,0 +1,211 @@
# coding: utf-8

Member commented:

How do you intend to maintain and verify this script?

@eric-haibin-lin (Member, Author) replied Jan 27, 2020:

They're currently not maintained and not documented. Ideally we should test them in CI (adding fairseq, tf, and pytorch-transformers as dependencies), but I don't have the bandwidth to do that now...

@mli (Member) commented Jan 31, 2020

Job PR-922/10 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-922/10/index.html

@mli (Member) commented Jan 31, 2020

Job PR-922/11 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-922/11/index.html


Usage:

pip3 install pytorch-transformers

@leezu (Contributor) commented Jan 31, 2020:

That package no longer seems to be maintained. Why not convert from the transformers package instead? This can be addressed in a separate PR.

@eric-haibin-lin (Member, Author) replied:

At the time this script was written, only the pytorch-transformers package existed. I think it's still useful to users who started using BERT with pytorch-transformers.
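
For context, a hedged sketch of the pytorch-transformers side that such a converter reads from. The DistilBERT classes were available in pytorch-transformers 1.2+ and 'distilbert-base-uncased' is the standard checkpoint name, but this is an illustration, not the PR's conversion script.

    # Illustration of the source weights a converter maps to Gluon;
    # not the PR's conversion script itself.
    from pytorch_transformers import DistilBertModel, DistilBertTokenizer

    tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-uncased')
    model = DistilBertModel.from_pretrained('distilbert-base-uncased')
    model.eval()

    # A converter walks this state dict and copies each PyTorch tensor
    # into the Gluon parameter with the corresponding name.
    for name, param in model.state_dict().items():
        print(name, tuple(param.shape))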

@mli (Member) commented Feb 1, 2020

Job PR-922/13 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-922/13/index.html

@mli (Member) commented Feb 1, 2020

Job PR-922/12 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-922/12/index.html

@mli (Member) commented Feb 2, 2020

Job PR-922/14 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-922/14/index.html

@mli (Member) commented Feb 3, 2020

Job PR-922/15 is complete.
Docs are uploaded to http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR-922/15/index.html

@leezu merged commit acee226 into dmlc:master Feb 3, 2020
@cnlewis3 commented May 4, 2020

@mli Super happy DistilBERT was added as a model option!
Is exporting a symbol file from the DistilBERT model an option? It looks like the only choices for that are the base and large BERT models. I wasn't sure if I should open an issue to ask.

@leezu (Contributor) commented May 4, 2020

It's a HybridBlock, so you can call .export() to get the symbol.
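
For reference, a minimal export sketch, assuming the model zoo names used above. HybridBlock.export needs a hybridized forward pass first and then writes a -symbol.json and a -0000.params file.

    # Hedged export sketch; the model name is assumed, while .hybridize()
    # and .export() are standard MXNet Gluon HybridBlock methods.
    import mxnet as mx
    import gluonnlp as nlp

    model, _ = nlp.model.get_model(
        'distilbert_6_768_12',  # assumed model zoo name
        dataset_name='distilbert_book_corpus_wiki_en_uncased',
        pretrained=True)

    model.hybridize(static_alloc=True)
    # One forward pass builds the cached graph that export serializes.
    model(mx.nd.ones((1, 8)), mx.nd.array([8]))

    # Writes distilbert-symbol.json and distilbert-0000.params.
    model.export('distilbert', epoch=0)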
