nicolo-tellini/sppComp


sppComb logo

© logo nt & Chiara Vischioni


Note

NEW:

Added the mode of the coverage to better visualize coverage changes across the chromosomes;

Reduced the y-axis limit to better emphasize differences: it is now 2.5 × the median value, excluding bins with less than 5X coverage;

Added coverage segmentation with the DNAcopy package.
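The y-axis rule above can be illustrated with a small shell sketch. It is not the pipeline's own code: `coverage_ylim` is a hypothetical helper, the input is assumed to be a file with one per-bin coverage value per line, and the median is assumed to be taken over the bins with at least 5X coverage.

```shell
# Sketch only: compute 2.5 x the median coverage, ignoring bins below 5X.
# $1: file with one per-bin coverage value per line (assumed input format).
coverage_ylim() {
    LC_ALL=C awk '$1 >= 5' "$1" | LC_ALL=C sort -n | LC_ALL=C awk '
        { v[NR] = $1 }                      # values arrive sorted
        END {
            if (NR == 0) { print "NA"; exit }
            # median: middle value, or mean of the two middle values
            m = (NR % 2) ? v[(NR + 1) / 2] : (v[NR / 2] + v[NR / 2 + 1]) / 2
            printf "%.2f\n", 2.5 * m
        }'
}

# Usage (bins.txt is a hypothetical file name):
#   coverage_ylim bins.txt
```

Filtering before taking the median keeps low-coverage bins from pulling the limit down.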

Saccharomyces species composition (sppComp)

An automated, modular computational framework for a rapid glimpse of the species composition and genomic features of Saccharomyces yeasts from large datasets of paired-end Illumina reads, adapted to handle different computational resources.

SppComp is released as part of the SGRP5 project.

SppComp allows the detection of:

  • hybrid composition,
  • hybrid ploidy,
  • introgressed DNA,
  • aneuploidies and copy number variations (CNVs) (relative copy number).

Description

The species composition of Saccharomyces strains plays a major role in biological studies, providing valuable insights into the evolutionary history of the genus while being exploited to improve industrial phenotypes. Short-read sequencing is the most popular choice for large-scale genomics projects due to its rapid processing and affordable cost. As leading model organisms, Saccharomyces yeasts have been massively sequenced using Illumina short-read platforms. SppComp takes advantage of chromosome-level, end-to-end genome assemblies from the ScRAPdb and of competitive short-read mapping, as implemented and described in MuLo-YDH, to assess the species composition of Saccharomyces yeasts from large datasets of paired-end Illumina reads. SppComp is written in Bash and R. By relying on state-of-the-art software, functional programming and vectorized code, SppComp reduces computational slowdowns.

Interpreting segmentation table

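| Ploidy     | Baseline CN | CN after gain | Fold change | log2  |
|------------|-------------|---------------|-------------|-------|
| Haploid    | 1           | 2             | 2×          | 1.00  |
| Diploid    | 2           | 3             | 1.5×        | ~0.58 |
| Triploid   | 3           | 4             | 1.33×       | ~0.42 |
| Tetraploid | 4           | 5             | 1.25×       | ~0.32 |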

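| Ploidy     | Baseline CN | CN after loss | Fold change | log2   |
|------------|-------------|---------------|-------------|--------|
| Diploid    | 2           | 1             | 0.5×        | -1.00  |
| Triploid   | 3           | 2             | 0.67×       | ~-0.58 |
| Tetraploid | 4           | 3             | 0.75×       | ~-0.42 |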
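The values in the tables follow from fold change = new CN / baseline CN and log2 = log2(fold change). A minimal shell sketch that recomputes them is shown below; `cn_ratio` is a hypothetical helper introduced only for illustration and is not part of sppComp.

```shell
# Print fold change and log2 ratio (tab-separated) for a copy-number change.
# Usage: cn_ratio BASELINE_CN NEW_CN   (cn_ratio is a hypothetical helper name)
cn_ratio() {
    LC_ALL=C awk -v base="$1" -v new="$2" 'BEGIN {
        if (base <= 0 || new <= 0) { print "NA\tNA"; exit }   # log2 undefined
        fold = new / base
        printf "%.2f\t%.2f\n", fold, log(fold) / log(2)       # awk log() is natural log
    }'
}

# Recompute the one-copy gain and loss rows for ploidies 1 to 4.
for base in 1 2 3 4; do
    printf 'gain\t%s\t%s\n' "$base" "$(cn_ratio "$base" $((base + 1)))"
    if [ "$base" -gt 1 ]; then
        printf 'loss\t%s\t%s\n' "$base" "$(cn_ratio "$base" $((base - 1)))"
    fi
done
```

The same helper can be used to check other changes, for example a two-copy gain in a diploid with `cn_ratio 2 4`.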

Download

git clone https://github.com/nicolo-tellini/sppComp.git

cd sppComp

git clone https://github.com/nicolo-tellini/rust_cov_bed

cd rust_cov_bed

cargo build --release

cd ..

Content

📂 :

.
├── rep
├── misc
├── tmp
├── rust_cov_bed
├── scr
└── seq

Installation

mamba create -n sppcomp \
    minimap2=2.28 \
    samtools=1.21 \
    genmap=1.3.0 \
    bedtools=2.31.1 \
    gawk=5.3.1 \
    parallel=20240722 \
    r-base=4.4.1 \
    r-ggplot2=3.5.2 \
    r-data.table=1.15.4 \
    r-tidyverse=2.0.0 \
    bioconductor-biostrings=2.74.0 \
    bioconductor-genomicranges=1.58.0 \
    bioconductor-dnacopy \
    -c conda-forge -c bioconda

Activate the environment:

mamba activate sppcomp
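Before running the pipeline, it may help to confirm that the command-line tools are visible in the activated environment. The sketch below is an optional check, not part of sppComp: `check_tools` is a hypothetical helper, and the executable names are assumed from the package list above.

```shell
# Report which of the given executables are missing from PATH.
# Returns 0 when all are found, 1 otherwise. (check_tools is a hypothetical helper.)
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Executable names assumed from the mamba package list.
check_tools minimap2 samtools genmap bedtools gawk parallel Rscript \
    && echo "all tools found" \
    || echo "some tools are missing; is the sppcomp environment active?"
```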

You can now edit the config file inside scr.

more about ...

⚠️ The Assemblies

✔️ Competitive mapping

⚠️ The Implementation

✔️ Testing

⚠️ Organisation of the directories

Example

CBS 2834, a triploid (3n) S. cer × S. kud × S. uva hybrid sample with complex aneuploidies and species combinations.


Signal segmentation CBS 2834


Release history

  • v1.0.1 Released in 2026
  • v1.0.0 Released in 2023

Citations

If you use this pipeline, please cite this repository.

About

sppComp describes the species composition of Saccharomyces samples and allows the raw detection of genomic signatures such as introgressions, CNVs and aneuploidies.
