@Mddct Mddct commented Apr 19, 2023

changes:

  • stack the fbank frames directly instead of stacking features after subsampling (the BEST-RQ paper stacks fbank)
  • replace compute_mask_indices_v2 with compute_mask_indices (v2 comes from wav2vec2 and computes the mask on the CPU); the v2 version takes padding into account and is less likely to over-mask short sequences
  • apply CMVN to the fbank; re-normalize the stacked fbanks with a non-learnable LayerNorm to avoid codebook collapse
  • add L2 regularization on the fbank, which can stabilize training
  • log other information about training
  • re-initialize the encoder parameters BERT-style: truncated normal with std == 0.02
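
The frame stacking and re-normalization items above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name `stack_and_renorm`, the stack factor of 4, and the 80-dim fbank shape are assumptions made for the example. It calls `torch.nn.functional.layer_norm` without weight or bias, which is one way to get a LayerNorm with no learnable parameters.

```python
import torch
import torch.nn.functional as F


def stack_and_renorm(fbank: torch.Tensor, stack: int = 4) -> torch.Tensor:
    """Stack consecutive fbank frames, then layer-normalize without parameters.

    fbank: (batch, time, feat_dim), assumed to be CMVN-normalized already.
    Returns: (batch, time // stack, feat_dim * stack).
    """
    batch, time, dim = fbank.shape
    # Drop trailing frames that do not fill a complete stack.
    usable = (time // stack) * stack
    stacked = fbank[:, :usable, :].reshape(batch, usable // stack, dim * stack)
    # No weight/bias is passed, so nothing here is trainable.
    return F.layer_norm(stacked, (dim * stack,))


feats = torch.randn(2, 103, 80)  # hypothetical batch of 80-dim fbank features
out = stack_and_renorm(feats)
print(out.shape)  # torch.Size([2, 25, 320])
```

Because the normalization has no affine parameters, the targets fed to the quantizer keep a fixed scale regardless of how the rest of the model trains.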

@Mddct Mddct force-pushed the Mddct-bestrq-stable-training branch from 6bce0c8 to dbe7148 Compare April 19, 2023 09:30
@Mddct Mddct force-pushed the Mddct-bestrq-stable-training branch from dbe7148 to 0768fb1 Compare April 19, 2023 09:36
@Mddct Mddct requested review from robin1001 and xingchensong April 19, 2023 09:42
Mddct commented Apr 19, 2023

BTW: the lint error "./test/test_tokenize.py:108:16: C419 Unnecessary list comprehension passed to all() prevents short-circuiting - rewrite as a generator." does not come from this PR.
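
For context, C419 is a flake8-comprehensions rule that flags calls like the first one below. A minimal illustration with made-up data (the `tokens` list is hypothetical and not taken from test_tokenize.py):

```python
tokens = ["a", "", "b"]  # hypothetical data

# Flagged by C419: the whole list is built before all() sees any element.
flagged = all([len(t) > 0 for t in tokens])

# Preferred: a generator lets all() stop at the first falsy value.
preferred = all(len(t) > 0 for t in tokens)

print(flagged, preferred)  # False False
```

Both forms return the same result; only the generator version can stop early.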

@Mddct Mddct marked this pull request as ready for review April 19, 2023 09:44
@Mddct Mddct mentioned this pull request Apr 19, 2023
@Mddct Mddct closed this Apr 20, 2023
@Mddct Mddct reopened this Apr 20, 2023
@Mddct Mddct marked this pull request as draft April 20, 2023 04:03
@Mddct Mddct force-pushed the Mddct-bestrq-stable-training branch from d05e917 to 7be7a19 Compare April 20, 2023 04:09
@Mddct Mddct force-pushed the Mddct-bestrq-stable-training branch from 7be7a19 to a3837af Compare April 20, 2023 04:12
@Mddct Mddct marked this pull request as ready for review April 20, 2023 04:19
@robin1001 robin1001 merged commit e657753 into main Apr 22, 2023
@robin1001 robin1001 deleted the Mddct-bestrq-stable-training branch April 22, 2023 02:04
xs = xs * self.signal_istd
input = xs

features_pen: Optional[torch.Tensor] = None

Hi, I guess xs here is just the log-mel fbank feature from the dataloader, so computing features_pen does not seem meaningful in this case.
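
To make the concern concrete: in wav2vec 2.0-style models, a penalty of this kind is typically the mean squared value of the output of a learned feature extractor. A minimal sketch assuming that definition (the function name `l2_features_penalty` and the tensor shape are illustrative, not taken from this PR):

```python
import torch


def l2_features_penalty(features: torch.Tensor) -> torch.Tensor:
    """Mean squared value of the features, returned as a scalar tensor."""
    return features.float().pow(2).mean()


# Hypothetical fbank batch straight from a dataloader: a plain tensor with
# no trainable module in front of it.
xs = torch.randn(2, 100, 80)
pen = l2_features_penalty(xs)
print(pen.requires_grad)  # False
```

If xs really is the raw fbank, the penalty depends on no model parameter, so adding it to the loss would contribute no gradient.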
