librosa
json5
torch
torchaudio
huggingface_hub
transformers<=4.75.1
omegaconf
einops
descript-audiotools==0.7.2
tokenizer
eval
munch
accelerate
wetext