incorrect audio shape #87

data_svc/waves-16k/
data_svc/whisper
>>>>>>>>>>speaker0<<<<<<<<<<
(639, 1024)
>>>>>>>>>>speaker1<<<<<<<<<<
Traceback (most recent call last):
  File "/content/lora-svc/prepare/preprocess_ppg.py", line 54, in <module>
    pred_ppg(whisper, f"{wavPath}/{spks}/{file}.wav", f"{ppgPath}/{spks}/{file}.ppg")
  File "/content/lora-svc/prepare/preprocess_ppg.py", line 26, in pred_ppg
    ppg = whisper.encoder(mel.unsqueeze(0)).squeeze().data.cpu().float().numpy()
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/lora-svc/whisper/model.py", line 154, in forward
    assert len_x <= len_e, "incorrect audio shape"
AssertionError: incorrect audio shape

Any idea what the issue is? speaker0 is my recorded voice, around 11 sec long, and speaker1 is a song, around 57 sec long.
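
A possible explanation, offered as an assumption rather than a confirmed diagnosis: the assertion in whisper/model.py compares the encoder input length (len_x) against the positional embedding length (len_e). In stock Whisper that embedding covers 30 seconds of audio, so if this repo keeps that limit, the 57 sec song would exceed it while the 11 sec recording passes, which matches the log above. If so, splitting long files into shorter clips before running preprocess_ppg.py may avoid the error. The sketch below uses only the Python standard library; the names chunk_bounds and split_wav and the 15-second chunk length are hypothetical choices for illustration, not part of lora-svc.

```python
import wave
from pathlib import Path


def chunk_bounds(num_samples, sample_rate=16000, max_seconds=15.0):
    """Return (start, end) sample indices that cover the clip
    in consecutive chunks of at most max_seconds each."""
    step = int(sample_rate * max_seconds)
    if step <= 0:
        raise ValueError("max_seconds is too small for this sample rate")
    return [(s, min(s + step, num_samples)) for s in range(0, num_samples, step)]


def split_wav(src, out_dir, max_seconds=15.0):
    """Write src as numbered chunk files in out_dir and return their paths.

    Hypothetical helper: assumes an uncompressed PCM .wav file,
    which is what the standard-library wave module can read.
    """
    src = Path(src)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    with wave.open(str(src), "rb") as reader:
        params = reader.getparams()
        bounds = chunk_bounds(reader.getnframes(), reader.getframerate(), max_seconds)
        for i, (start, end) in enumerate(bounds):
            # Frames are read sequentially, so each read continues
            # where the previous chunk ended.
            frames = reader.readframes(end - start)
            out_path = out_dir / f"{src.stem}_{i:03d}.wav"
            with wave.open(str(out_path), "wb") as writer:
                writer.setparams(params)
                # The frame count in the header is corrected on close.
                writer.writeframes(frames)
            paths.append(out_path)
    return paths
```

For example, calling split_wav on the song inside the speaker1 folder, with that same folder as out_dir, would write the chunks next to it; the original long file would then need to be moved out so the preprocessing script does not pick it up again. Cutting at fixed intervals can split a note or word in half, so cutting at silences may give cleaner clips.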
