The SMaHT SNV pipeline detects somatic single-nucleotide variants (SNVs) using data from multiple sequencing technologies. It integrates four somatic SNV callers: three short-read-based callers (Strelka2, Mutect2, and RUFUS) and one long-read-based caller (longcallD).
Raw calls generated by the individual tools are merged and then processed through hierarchical filtering and cross-evidence validation to produce high-confidence SNV calls. The pipeline is designed to run on each tissue sample separately while leveraging donor-level information to validate and refine candidate variants.
All sequencing libraries generated by multiple Genome Characterization Centers (GCCs) for each sample are merged prior to analysis and provided as high-depth input (~300X short-read coverage) to the variant callers.
PacBio HiFi data is used for long-read variant calling when available.
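
As a rough illustration of the merge step described above, the following Python sketch combines per-caller call sets into a single table that records which callers support each candidate SNV. It is not the pipeline's actual implementation: the function name `merge_calls`, the `(chrom, pos, ref, alt)` variant key, and the toy coordinates are all assumptions made for the example, and the hierarchical filtering and cross-evidence validation stages are not modeled.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Hypothetical variant key: (chromosome, 1-based position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def merge_calls(calls_by_caller: Dict[str, Iterable[Variant]]) -> Dict[Variant, Set[str]]:
    """Union the raw calls from all callers, tracking the supporting callers per variant."""
    merged: Dict[Variant, Set[str]] = defaultdict(set)
    for caller, variants in calls_by_caller.items():
        for variant in variants:
            merged[variant].add(caller)
    return dict(merged)


# Toy, made-up calls for illustration only (not real pipeline output).
raw_calls = {
    "Strelka2": [("chr1", 1000, "A", "T"), ("chr2", 500, "G", "C")],
    "Mutect2": [("chr1", 1000, "A", "T")],
    "RUFUS": [],
    "longcallD": [("chr1", 1000, "A", "T"), ("chr3", 42, "C", "G")],
}

merged = merge_calls(raw_calls)
for variant, callers in sorted(merged.items()):
    print(variant, sorted(callers))
```

Keeping the set of supporting callers alongside each candidate is one simple way to make the short-read and long-read evidence available to later filtering steps; the real pipeline may represent this differently.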