It would be really useful to see the training code used for the experiments in the paper, so that the hyperparameters and training approach can be fully understood and reconstructed. I am particularly interested in the audio modelling and generation experiments (Section 4.4 of [1]).
I can't find the code in this repository. Is it available somewhere else? If not, would you consider making it available?
Many thanks,
L
[1] Gu, A., & Dao, T. (2024). Mamba: Linear-Time Sequence Modeling with Selective State Spaces. arXiv:2312.00752. https://doi.org/10.48550/arXiv.2312.00752