After cross-technology validation, candidate variants undergo donor-level refinement. This step evaluates evidence across multiple tissues from the same donor to identify variants that are consistently detected across independent samples.
The goal of this stage is to increase confidence in mosaic variant calls by identifying variants that are supported in more than one tissue from the same donor.
Candidate variants are evaluated using strict pileup analysis across all available Illumina sequencing data from the donor.
Variants located in high-confidence genomic regions (“easy” regions) are evaluated for support across tissues from the same donor.
A variant is considered supported in a tissue if the short-read pileup for that tissue contains enough alternate reads to meet the following criterion:
For each tissue, a binomial test (using the Poisson approximation) evaluates whether the observed alternate read count exceeds the count expected under a 0.1% sequencing error model; the tissue supports the variant when p < 1e-5.
Tissues meeting this threshold are recorded in the VCF annotation field:
TISSUE_SR_VAFs
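
The per-tissue test described above can be sketched as follows. This is a minimal illustration, not the pipeline's actual implementation. It assumes a one-sided upper-tail test with Poisson mean equal to depth × error rate, and it assumes that alternate read counts and depths per tissue are already extracted from the pileup. The function names, the input layout, and the tissue names are hypothetical, and the exact encoding of the TISSUE_SR_VAFs field is not specified here, so the sketch simply returns a tissue-to-VAF mapping.

```python
import math

ERROR_RATE = 0.001   # 0.1% sequencing error model (from the text)
P_THRESHOLD = 1e-5   # significance cutoff (from the text)


def poisson_pmf(i, lam):
    """P(X = i) for X ~ Poisson(lam), computed in log space."""
    return math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1))


def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    if lam <= 0:
        return 0.0
    if k <= lam:
        # The tail is large here, so the complement of the lower sum is accurate.
        return max(0.0, 1.0 - sum(poisson_pmf(i, lam) for i in range(k)))
    # For k > lam the terms shrink monotonically; sum the tail directly so that
    # very small p-values are not lost to floating-point cancellation.
    term = poisson_pmf(k, lam)
    total = 0.0
    i = k
    while term > total * 1e-17:
        total += term
        i += 1
        term *= lam / i
    return total


def tissue_is_supported(alt_reads, depth,
                        error_rate=ERROR_RATE, p_threshold=P_THRESHOLD):
    """True if the alternate read count is unlikely under the error model."""
    lam = depth * error_rate  # expected number of error-derived alternate reads
    return poisson_upper_tail(alt_reads, lam) < p_threshold


def supported_tissue_vafs(pileup_counts):
    """Map tissue -> VAF for tissues that pass the test.

    pileup_counts: dict of tissue name -> (alt_reads, depth); hypothetical layout.
    """
    return {
        tissue: alt / depth
        for tissue, (alt, depth) in pileup_counts.items()
        if depth > 0 and tissue_is_supported(alt, depth)
    }


if __name__ == "__main__":
    counts = {"brain": (9, 1000), "liver": (3, 1000), "heart": (0, 800)}
    print(supported_tissue_vafs(counts))
```

With these illustrative counts, only the tissue with 9 alternate reads at depth 1000 passes: at an expected error count of 1, the upper-tail probability for 9 or more reads is about 1.1e-6, whereas for 3 or more reads it is about 0.08.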
Variants supported in two or more tissues from the same donor are labeled with the tag CrossTissue.
This annotation indicates that the variant is detected independently in multiple tissues, providing additional evidence that the variant represents a true mosaic event rather than a sequencing artifact.
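
The labeling rule can be sketched in a few lines. This is an illustrative example under stated assumptions rather than the repository's code: it assumes the list of supporting tissues for each variant has already been determined, and the function name, variant keys, and tissue names are hypothetical.

```python
def cross_tissue_tag(supported_tissues, min_tissues=2):
    """Return "CrossTissue" if enough distinct tissues support the variant.

    supported_tissues: iterable of tissue names that passed the per-tissue test.
    min_tissues: minimum number of distinct supporting tissues (two, per the text).
    """
    distinct = set(supported_tissues)  # guard against duplicate samples of one tissue
    return "CrossTissue" if len(distinct) >= min_tissues else None


if __name__ == "__main__":
    # Hypothetical variants keyed by "chrom:pos:ref>alt".
    variants = {
        "chr1:12345:A>G": ["brain", "liver"],
        "chr2:67890:C>T": ["brain"],
    }
    for key, tissues in variants.items():
        print(key, cross_tissue_tag(tissues))
```

Counting distinct tissue names is a design choice in this sketch: if a donor has several samples from the same tissue, they are treated as one line of evidence. Whether the pipeline counts samples or tissues is not stated in the text.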
All the relevant code can be accessed in the GitHub repository: