pipelines-docs

Alignment with pbmm2

The pipeline uses pbmm2 to align each unaligned BAM file to the reference genome. The software also sorts the reads by genomic coordinates, strips unnecessary tags, and links methylation tags if present. An integrity check is then performed on the resulting BAM file.

pbmm2 aligns both the full-length non-chimeric (FLNC) reads and the consensus transcripts generated by cluster2.

Aligning and Sorting

Align and sort reads
pbmm2 align --preset ISOSEQ --sort --strip --unmapped reference.fasta unaligned.bam aligned.bam

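Arguments:

--preset ISOSEQ: applies pbmm2's alignment parameter preset for Iso-Seq data.
--sort: sorts the aligned records by genomic coordinate.
--strip: removes kinetics and extra quality-value tags from the output records.
--unmapped: includes unmapped records in the output.
reference.fasta: the reference genome to align against.
unaligned.bam: the input unaligned BAM file.
aligned.bam: the output aligned BAM file.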
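The alignment command above could be driven from Python roughly as follows. This is a minimal sketch, not the pipeline's actual wrapper: the function names build_pbmm2_command and run_alignment are hypothetical, and it assumes pbmm2 is available on the PATH.

```python
import subprocess
from pathlib import Path


def build_pbmm2_command(reference, unaligned_bam, aligned_bam):
    """Return the argument list for the pbmm2 alignment step shown above.

    Hypothetical helper: only the flags documented in this section are used.
    """
    return [
        "pbmm2", "align",
        "--preset", "ISOSEQ",  # Iso-Seq alignment preset
        "--sort",              # coordinate-sort the output
        "--strip",             # drop kinetics and extra QV tags
        "--unmapped",          # keep unmapped records in the output
        str(reference),
        str(unaligned_bam),
        str(aligned_bam),
    ]


def run_alignment(reference, unaligned_bam, aligned_bam):
    """Run pbmm2 and return the output path; raises if pbmm2 exits non-zero."""
    cmd = build_pbmm2_command(reference, unaligned_bam, aligned_bam)
    subprocess.run(cmd, check=True)
    return Path(aligned_bam)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when file paths contain spaces. Calling run_alignment("reference.fasta", "unaligned.bam", "aligned.bam") would execute the same command as the one shown above.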

Integrity Check

To confirm the integrity of the aligned BAM file, in-house Python code checks that the file ends with the 28-byte empty BGZF block that serves as the end-of-file (EOF) marker in the BAM format. A missing marker indicates that the file was truncated.
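A check of this kind might look like the sketch below. It is an illustration, not the pipeline's in-house code: the function name has_bam_eof is hypothetical, and the byte sequence is the empty BGZF block given in the SAM/BAM format specification.

```python
import os

# The 28-byte empty BGZF block that terminates a complete BAM file,
# as given in the SAM/BAM format specification.
BAM_EOF = bytes.fromhex(
    "1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43"
    " 02 00 1b 00 03 00 00 00 00 00 00 00 00 00"
)


def has_bam_eof(path):
    """Return True if the file at `path` ends with the BAM EOF marker.

    Hypothetical helper name; only the last 28 bytes are read.
    """
    if os.path.getsize(path) < len(BAM_EOF):
        return False
    with open(path, "rb") as handle:
        handle.seek(-len(BAM_EOF), os.SEEK_END)
        return handle.read(len(BAM_EOF)) == BAM_EOF
```

Because only the final 28 bytes are inspected, the check is cheap even for large files, but it detects truncation only; it does not validate the records inside the file.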

Implementation with pbmm2

The pipeline uses pbmm2 version 1.13.0, which wraps minimap2 version 2.26. Note that pbmm2 sets some minimap2 defaults that differ from those of standalone minimap2.

Defaults set by pbmm2 for minimap2:

Note: Due to multi-threading, the order of output alignments can differ between runs with the same input and parameters. This can happen even with the --sort option, for records that align to the same target sequence, at the same position within that target, and in the same orientation, because these are the only fields that samtools sort uses.
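The effect of such ties can be illustrated with a small sketch. It is a toy model, not pbmm2 or samtools code: the Record type and both key functions are hypothetical, and adding the read name as a tie-breaker is only an assumption about how two runs could be compared, not something the pipeline does.

```python
from typing import NamedTuple


class Record(NamedTuple):
    """Toy stand-in for an alignment record (hypothetical type)."""
    name: str
    target: str
    position: int
    reverse: bool


def coordinate_key(rec):
    """Sort key using only target, position, and orientation."""
    return (rec.target, rec.position, rec.reverse)


def canonical_key(rec):
    """Same key plus the read name, so ties are broken deterministically."""
    return coordinate_key(rec) + (rec.name,)


# Two runs emit the same records, but threads finish in a different order.
run_a = [
    Record("read1", "chr1", 100, False),
    Record("read2", "chr1", 100, False),
    Record("read3", "chr1", 50, True),
]
run_b = [run_a[1], run_a[0], run_a[2]]

# read1 and read2 tie on the coordinate key, so their input order survives.
sorted_a = [r.name for r in sorted(run_a, key=coordinate_key)]
sorted_b = [r.name for r in sorted(run_b, key=coordinate_key)]

# With the read name as a tie-breaker, both runs give the same order.
canonical_a = [r.name for r in sorted(run_a, key=canonical_key)]
canonical_b = [r.name for r in sorted(run_b, key=canonical_key)]
```

Here sorted_a and sorted_b differ only in the relative order of the two tied records, while canonical_a and canonical_b are identical. When comparing outputs from two runs, it is therefore safer to compare the set of records, or a canonically ordered copy, than to compare the files byte for byte.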

