# =============================================================================
# Qolda-AVL — Python dependencies (curated for the audio training pipeline)
# =============================================================================
# This project is a fork of ms-swift, whose full upstream dependency lists live
# under ./requirements/ (framework.txt, eval.txt, ...) and are also wired into
# setup.py, so `pip install -e .` pulls them in.
#
# The list below (which pulls in requirements/framework.txt) is sufficient to run
# the Qolda-AVL audio training pipeline. Install a CUDA-matched PyTorch build
# (torch / torchaudio / torchvision) and the bundled transformers fork first,
# then run `pip install -r requirements.txt`.
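#
# A possible install sequence, as a sketch. The cu121 index URL is only an
# example (an assumption, not a project requirement); pick the index that
# matches your CUDA version from pytorch.org:
#   pip install torch torchaudio torchvision --index-url https://download.pytorch.org/whl/cu121
#   pip install -e ./transformers
#   pip install -r requirements.txt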
# -----------------------------------------------------------------------------
# --- ms-swift framework dependencies (inherited from upstream) ---
-r requirements/framework.txt
# --- core deep-learning stack ---
# Install the build that matches your CUDA version, e.g. from pytorch.org
torch>=2.4
torchaudio>=2.4
torchvision
# NOTE: the Qwen3AVL model requires the bundled transformers fork in this repo:
# pip install -e ./transformers
# Do NOT install `transformers` from PyPI — it does not know about qwen3_avl.
accelerate>=0.34
deepspeed
peft>=0.11
# --- audio branch (Whisper encoder + audio DeepStack) ---
librosa # audio loading / log-mel feature extraction
soundfile # wav/flac I/O
av # robust audio/video decoding backend
# --- vision / video (inherited from Qwen3-VL) ---
qwen-vl-utils
decord
# --- FlashAttention (optional in general, strongly recommended for speed) ---
# The training scripts pass --attn_impl flash_attn, so running them as-is
# requires FlashAttention.
# It needs a CUDA toolchain to build; install separately if the wheel fails:
# pip install flash-attn --no-build-isolation
flash-attn; platform_system == "Linux"