ai/SpeechRecognition/requirements.txt

# ============================================
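# Speech recognition + speaker diarization: project dependencies
# ============================================
# Usage:
# 1. Create a virtual environment: python -m venv funasr_env
# 2. Activate the virtual environment: funasr_env\Scripts\activate
# 3. Install the dependencies: pip install -r requirements.txt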
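# Note (assumption: step 2 above uses the Windows activation path; on Linux or
# macOS the standard venv layout would be activated like this instead):
#   source funasr_env/bin/activate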
# ============================================
# ---------- Core framework ----------
torch>=2.7.0
torchaudio>=2.7.0
# ---------- FunASR speech recognition ----------
funasr>=1.3.0
modelscope>=1.36.0
transformers>=5.7.0
# ---------- 3D-Speaker speaker diarization ----------
# Note: 3D-Speaker must be cloned manually into the project directory:
# git clone https://github.com/alibaba-damo-academy/3D-Speaker.git
speakerlab>=0.0.6
# ---------- Audio processing ----------
soundfile>=0.12.0
librosa>=0.11.0
scipy>=1.15.0
numpy>=1.26.0
# ---------- Core machine learning libraries ----------
scikit-learn>=1.7.0
numba>=0.65.0
pandas>=2.3.0
# ---------- Clustering algorithms ----------
hdbscan>=0.8.42
umap-learn>=0.5.0
fastcluster>=1.2.0
# ---------- Deep learning components ----------
pytorch-lightning>=2.6.0
lightning>=2.6.0
pyannote.audio>=3.4.0
# ---------- Data processing ----------
datasets>=3.0.0,<4.0.0
pyarrow>=24.0.0
sentencepiece>=0.2.1
# ---------- Utilities ----------
tqdm>=4.67.0
pyyaml>=6.0
simplejson>=3.19.0
sortedcontainers>=2.4.0
addict>=2.4.0
jieba>=0.42.0
# ---------- Optional: ONNX acceleration ----------
# onnxruntime-gpu>=1.23.0
# ---------- Optional: Web API service ----------
Flask>=3.1.0
waitress>=3.0.0
# SQLAlchemy>=2.0.0