© logo nt & Chiara Vischioni
Note
NEW:
Added the mode of the coverage to better visualize coverage changes across the chromosomes;
Reduced the y-axis limit to better emphasize differences: it is now 2.5 × the median value, excluding bins with less than 5X coverage;
Added coverage segmentation with the DNAcopy package.
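As a rough illustration of the y-axis rule described above, the following sketch computes 2.5 × the median of the per-bin coverage after dropping bins below 5X. It is not the pipeline's own plotting code: the helper name `ylim_from_bins` and the one-value-per-line input format are assumptions made for this example only.

```shell
# Hypothetical helper (not part of SppComp): reads one per-bin coverage
# value per line on stdin and prints 2.5 x the median of bins with >= 5X.
ylim_from_bins() {
  awk '$1 >= 5' | sort -n | awk '
    { v[NR] = $1 }
    END {
      if (NR == 0) { print "NA"; exit }
      # median of the sorted values (mean of the two middle ones if NR is even)
      m = (NR % 2) ? v[(NR + 1) / 2] : (v[NR / 2] + v[NR / 2 + 1]) / 2
      print 2.5 * m
    }'
}

# Toy input: the bins at 2X and 4X are ignored, the median of 10, 20, 30 is 20.
printf '%s\n' 2 4 10 20 30 | ylim_from_bins
```

Filtering the low-coverage bins first keeps poorly mapped regions from pulling the median down and shrinking the plotted range.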
An automated, modular computational framework for a rapid glimpse of the species composition and genomic features of Saccharomyces yeasts from large datasets of paired-end Illumina reads, adapted to handle different computational resources.
SppComp is released as part of the SGRP5 project.
SppComp allows the detection of:
- hybrid composition,
- hybrid ploidy,
- introgressed DNA,
- aneuploidies and copy number variations (CNVs) (relative copy number).
The species composition of Saccharomyces strains plays a major role in biological studies: it provides valuable insights into the evolutionary history of the genus and is exploited to improve industrial phenotypes. Short-read sequencing is the most popular choice for large-scale genomics projects due to its rapid processing and affordable cost. Saccharomyces yeasts are leading model organisms and have been extensively sequenced on Illumina short-read platforms. SppComp combines chromosome-level, end-to-end genome assemblies from ScRAPdb with competitive short-read mapping, as implemented and described in MuLo-YDH, to assess the species composition of Saccharomyces yeasts from large datasets of paired-end Illumina reads. SppComp is written in Bash and R. It relies on state-of-the-art software, functional programming and vectorized code to limit computational slowdowns.
| Ploidy | Baseline CN | Gain | Fold change | log2 |
|---|---|---|---|---|
| Haploid | 1 | 2 | 2× | 1.00 |
| Diploid | 2 | 3 | 1.5× | ~0.58 |
| Triploid | 3 | 4 | 1.33× | ~0.42 |
| Tetraploid | 4 | 5 | 1.25× | ~0.32 |
| Ploidy | Baseline CN | Loss | Fold change | log2 |
|---|---|---|---|---|
| Diploid | 2 | 1 | 0.5× | -1.00 |
| Triploid | 3 | 2 | 0.67× | ~-0.58 |
| Tetraploid | 4 | 3 | 0.75× | ~-0.42 |
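The values in the two tables above follow from fold change = observed CN / baseline CN and log2 = log2(fold change). The sketch below reproduces them for a gain or loss of one copy. The helper name `cn_log2` is hypothetical and used only for this illustration.

```shell
# Hypothetical helper: prints baseline CN, observed CN, fold change and
# log2(fold change), tab-separated, for one baseline/observed pair.
cn_log2() {
  awk -v base="$1" -v obs="$2" 'BEGIN {
    fc = obs / base
    printf "%d\t%d\t%.2f\t%.2f\n", base, obs, fc, log(fc) / log(2)
  }'
}

# Baseline copy number 1 (haploid) to 4 (tetraploid).
for base in 1 2 3 4; do
  cn_log2 "$base" "$((base + 1))"      # gain of one copy
  if [ "$base" -gt 1 ]; then
    cn_log2 "$base" "$((base - 1))"    # loss of one copy
  fi
done
```

The same one-copy change yields a smaller log2 shift at higher ploidy, which is why gains and losses are harder to see in triploid and tetraploid strains.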
git clone https://github.com/nicolo-tellini/sppComp.git
cd sppComp
git clone https://github.com/nicolo-tellini/rust_cov_bed
cd rust_cov_bed
cargo build --release
cd ..
📂 Directory structure:
.
├── rep
├── misc
├── tmp
├── rust_cov_bed
├── scr
└── seq
mamba create -n sppcomp \
minimap2=2.28 \
samtools=1.21 \
genmap=1.3.0 \
bedtools=2.31.1 \
gawk=5.3.1 \
parallel=20240722 \
r-base=4.4.1 \
r-ggplot2=3.5.2 \
r-data.table=1.15.4 \
r-tidyverse=2.0.0 \
bioconductor-biostrings=2.74.0 \
bioconductor-genomicranges=1.58.0 \
bioconductor-dnacopy \
-c conda-forge -c bioconda
Activate the environment:
mamba activate sppcomp
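With the environment active, a quick sanity check can confirm that the command-line tools installed above are on the `PATH`. This is an optional sketch, not a step of the pipeline. The function name `missing_tools` is hypothetical, and `Rscript` is assumed to be the executable provided by `r-base`.

```shell
# Hypothetical helper: prints the name of every command not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Executables expected from the packages listed in the mamba command.
missing=$(missing_tools minimap2 samtools genmap bedtools gawk parallel Rscript)
if [ -n "$missing" ]; then
  printf 'Missing tools:\n%s\n' "$missing" >&2
else
  echo "All required tools found."
fi
```

R packages such as DNAcopy are not covered by this check, since they are libraries rather than executables.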
You can now edit the config file inside the scr directory.
✔️ Testing
CBS 2834, a triploid (3n) S. cer × S. kud × S. uva sample with complex aneuploidies and species combinations.
Signal segmentation for CBS 2834
- v1.0.1 Released in 2026
- v1.0.0 Released in 2023
If you use this pipeline, please cite this repository.
