Releases: wolfgitpr/HubertFA
v0.0.7: ONNX model for Mandarin, Japanese, and English singing voice, with automatic identification of AP and EP.
Code version: v0.0.7
- Languages: Mandarin Chinese, romanized Japanese (Mandarin & Japanese)
- Dictionaries: ds-zh-pinyin-lite, japanese_dict_full, ds_cmudict-07b
- Scope of application: Singing voice
- Automatically recognized non-speech phonemes: AP, EP (not recommended)
- Release date: 2025-12-30
Migrate CSV Transcription Files
For DiffSinger datasets that have already been built, use this script to migrate the CSV annotations from the old dictionary to the new dictionary.
python scripts/migrate_dict.py csv [TRANSCRIPTION_CSV] \
    --source-dict [SOURCE_DICT] \
    --target-dict [TARGET_DICT] \
    [--save-path [OUTPUT_CSV]] \
    [--overwrite]
- `TRANSCRIPTION_CSV`: Path to the CSV file with transcriptions.
- `--source-dict`: Path to source dictionary file.
- `--target-dict`: Path to target dictionary file.
- `--save-path`: Path to save migrated file (defaults to original file).
- `--overwrite`: Overwrite existing file (optional flag).
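When several CSV files need migrating, it can be convenient to assemble the command from Python. The following is a minimal sketch, not part of HubertFA: `build_migrate_command` and `run_migration` are hypothetical helper names, the paths are placeholders, and the script is assumed to be run from the repository root. Only the flags documented above are used.

```python
import subprocess
import sys
from typing import List, Optional


def build_migrate_command(
    csv_path: str,
    source_dict: str,
    target_dict: str,
    save_path: Optional[str] = None,
    overwrite: bool = False,
) -> List[str]:
    """Assemble the argv list for scripts/migrate_dict.py (hypothetical helper)."""
    cmd = [
        sys.executable, "scripts/migrate_dict.py", "csv", csv_path,
        "--source-dict", source_dict,
        "--target-dict", target_dict,
    ]
    # Optional flags are only added when requested, so the script's own
    # defaults apply otherwise.
    if save_path is not None:
        cmd += ["--save-path", save_path]
    if overwrite:
        cmd.append("--overwrite")
    return cmd


def run_migration(*args, **kwargs) -> None:
    """Run the migration; raises CalledProcessError if the script fails."""
    subprocess.run(build_migrate_command(*args, **kwargs), check=True)


if __name__ == "__main__":
    # Placeholder paths; replace them with real files before running.
    cmd = build_migrate_command(
        "data/transcriptions.csv",
        "dictionaries/old.txt",
        "dictionaries/new.txt",
        save_path="data/transcriptions_migrated.csv",
    )
    print(" ".join(cmd))
```

Writing to a separate `--save-path` first keeps the original CSV untouched until the migrated output has been checked.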
Data contributors
- 白烁
- 风羽翼Tsubasa
- 烛曦遥Haruka
- 夜燐Yarin
- 芸青岩
- 绮萱
Full Changelog: v0.0.6...v0.0.7
v0.0.6: ONNX model for Mandarin, Japanese, and English singing voice, with automatic identification of AP and EP.
Code version: v0.0.6
- Languages: Mandarin Chinese, romanized Japanese (Mandarin & Japanese)
- Dictionaries: opencpop-expression, japanese_dict_full, ds_cmudict-07b
- Scope of application: Singing voice
- Automatically recognized non-speech phonemes: AP, EP (not recommended)
- Release date: 2025-12-17
ONNX inference
pip install -r requirements_onnx.txt
python onnx_infer.py --onnx_path xxx --wav_folder xxx_wav --language zh ...
Arguments:
- `--onnx_path` / `-m`: Path to the ONNX models.
- `--wav_folder` / `-wf`: Input folder path (default: segments).
- `--out_path` / `-o`: Path for the output labels.
- `--language` / `-l`: Target language: zh, ja, en, or yue (default: zh).
- `--non_speech_phonemes` / `-np`: Non-speech phonemes (default: AP; AP,EP is also accepted).
- `--pad_times` / `-pt`: Number of times blank audio is padded before inference (default: 1).
- `--pad_length` / `-pl`: Maximum length of the blank audio padded before inference (default: 5).
- `--dictionary` / `-d`: Path to a custom dictionary.
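The arguments above can also be assembled from Python, for example when inference is part of a larger pipeline. This is a rough sketch under stated assumptions: `build_infer_command` is a hypothetical helper, not part of HubertFA; the non-speech phonemes are assumed to be passed as one comma-separated string such as `AP,EP`; and the command is assumed to run from the directory containing `onnx_infer.py`.

```python
import sys
from typing import List, Optional

# Language codes listed for --language in the release notes.
SUPPORTED_LANGUAGES = ("zh", "ja", "en", "yue")


def build_infer_command(
    onnx_path: str,
    wav_folder: str,
    language: str = "zh",
    out_path: Optional[str] = None,
    non_speech_phonemes: Optional[str] = None,  # assumed form: "AP" or "AP,EP"
    pad_times: Optional[int] = None,
    pad_length: Optional[int] = None,
    dictionary: Optional[str] = None,
) -> List[str]:
    """Assemble the argv list for onnx_infer.py (hypothetical helper)."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    cmd = [
        sys.executable, "onnx_infer.py",
        "--onnx_path", onnx_path,
        "--wav_folder", wav_folder,
        "--language", language,
    ]
    # Optional arguments are omitted when unset so the script's defaults apply.
    optional = [
        ("--out_path", out_path),
        ("--non_speech_phonemes", non_speech_phonemes),
        ("--pad_times", pad_times),
        ("--pad_length", pad_length),
        ("--dictionary", dictionary),
    ]
    for flag, value in optional:
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


if __name__ == "__main__":
    # Placeholder paths; pass the result to subprocess.run(cmd, check=True) to execute.
    print(" ".join(build_infer_command("models/onnx", "segments", language="ja")))
```

Rejecting an unknown language code before launching the script avoids discovering the mistake only after the model has loaded.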
Data contributors
- 白烁
- 风羽翼Tsubasa
- 烛曦遥Haruka
- 夜燐Yarin
- 芸青岩
- 绮萱
Full Changelog: v0.0.5...v0.0.6
v0.0.5 fixed: ONNX model for Mandarin, Japanese, and English singing voice, with automatic identification of AP and EP.
Code version: v0.0.5 fixed
- Languages: Mandarin Chinese, romanized Japanese (Mandarin & Japanese)
- Dictionaries: opencpop-expression, japanese_dict_full, ds_cmudict-07b.txt
- Scope of application: Singing voice
- Automatically recognized non-speech phonemes: AP, EP (not recommended)
- Release date: 2025-12-08
ONNX inference
pip install -r requirements_onnx.txt
python onnx_infer.py --onnx_path xxx --wav_folder xxx_wav --language zh ...
Arguments:
- `--onnx_path` / `-m`: Path to the ONNX models.
- `--wav_folder` / `-wf`: Input folder path (default: segments).
- `--out_path` / `-o`: Path for the output labels.
- `--language` / `-l`: Target language: zh, ja, en, or yue (default: zh).
- `--non_speech_phonemes` / `-np`: Non-speech phonemes (default: AP; AP,EP is also accepted).
- `--pad_times` / `-pt`: Number of times blank audio is padded before inference (default: 1).
- `--pad_length` / `-pl`: Maximum length of the blank audio padded before inference (default: 5).
- `--dictionary` / `-d`: Path to a custom dictionary.
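Because `--language` takes one value per run, a dataset that mixes languages needs one invocation per language folder. The sketch below illustrates that with the short flags listed above; `build_batch_commands` is a hypothetical helper, and the folder layout (one folder of WAV files per language) is an assumption made for the example.

```python
import sys
from typing import Dict, List

# Language codes listed for --language / -l in the release notes.
SUPPORTED_LANGUAGES = ("zh", "ja", "en", "yue")


def build_batch_commands(
    onnx_path: str, folders_by_language: Dict[str, str]
) -> List[List[str]]:
    """Build one onnx_infer.py argv list per language folder (hypothetical helper)."""
    commands = []
    for language, wav_folder in folders_by_language.items():
        if language not in SUPPORTED_LANGUAGES:
            raise ValueError(f"unsupported language: {language!r}")
        commands.append([
            sys.executable, "onnx_infer.py",
            "-m", onnx_path,    # --onnx_path
            "-wf", wav_folder,  # --wav_folder
            "-l", language,     # --language
        ])
    return commands


if __name__ == "__main__":
    # Placeholder folders; each command can be passed to subprocess.run(cmd, check=True).
    for cmd in build_batch_commands("models/onnx", {"zh": "segments_zh", "ja": "segments_ja"}):
        print(" ".join(cmd))
```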
Data contributors
- 白烁
- 风羽翼Tsubasa
- 烛曦遥Haruka
- 夜燐Yarin
- 芸青岩
- 绮萱