# ============================================
# Speech recognition + speaker diarization: project dependencies
# ============================================
# Usage:
# 1. Create a virtual environment: python -m venv funasr_env
# 2. Activate the virtual environment (Windows): funasr_env\Scripts\activate
# 3. Install the dependencies: pip install -r requirements.txt
# ============================================

# ---------- Core framework ----------
torch>=2.7.0
torchaudio>=2.7.0
torchvision>=0.22.0

# ---------- FunASR speech recognition ----------
funasr>=1.3.0
modelscope>=1.36.0
transformers>=5.7.0

# ---------- 3D-Speaker speaker diarization ----------
# Note: 3D-Speaker must be cloned manually into the project directory
# git clone https://github.com/alibaba-damo-academy/3D-Speaker.git
speakerlab>=1.0.0

# ---------- Audio processing ----------
soundfile>=0.12.0
librosa>=0.11.0
scipy>=1.15.0
numpy>=2.2.0

# ---------- Core machine learning libraries ----------
scikit-learn>=1.7.0
numba>=0.65.0
pandas>=2.3.0

# ---------- Clustering algorithms ----------
hdbscan>=0.8.42
umap-learn>=0.5.0
fastcluster>=1.2.0

# ---------- Deep learning components ----------
pytorch-lightning>=2.6.0
lightning>=2.6.0
pyannote.audio>=3.4.0

# ---------- Data processing ----------
datasets>=4.8.0
pyarrow>=24.0.0
sentencepiece>=0.2.1

# ---------- Utilities ----------
tqdm>=4.67.0
pyyaml>=6.0
simplejson>=3.19.0
sortedcontainers>=2.4.0
addict>=2.4.0
jieba>=0.42.0

# ---------- Optional: ONNX acceleration ----------
# onnxruntime-gpu>=1.23.0

# ---------- Optional: Web API service ----------
# Flask>=3.1.0
# SQLAlchemy>=2.0.0