
GENCODE

The GENCODE project [1] provides comprehensive annotation of gene features for the human genome, including coding and non-coding genes, pseudogenes, and other significant genomic elements.

The version used here is GENCODE Release 47 (GRCh38.p14), which is built on the Genome Reference Consortium Human Build 38 (GRCh38). The comprehensive annotation file for this release is fetched by the download command in the next section.

Collapsing GENCODE Annotation

Download comprehensive gene annotation
wget https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_47/gencode.v47.annotation.gtf.gz
Decompress the annotation so that the file name matches the next command
gunzip gencode.v47.annotation.gtf.gz
Collapse gene annotation
python3 collapse_annotation.py \
    --collapse_only gencode.v47.annotation.gtf \
    gencode.v47.genes.gtf
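
As a quick sanity check on the files named in the commands above, the sketch below tallies GTF records by feature type (column 3) and gene records by their gene_type attribute. It is not part of collapse_annotation.py. The helper names open_gtf and summarize_gtf are hypothetical, and the code assumes the standard 9-column, tab-separated GTF layout with attributes written as key "value"; pairs.

```python
import gzip
from collections import Counter


def open_gtf(path):
    """Open a GTF file as text, handling optional gzip compression."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def summarize_gtf(path):
    """Return (feature_counts, gene_type_counts) for one GTF file.

    Hypothetical helper for illustration only.
    """
    feature_counts = Counter()
    gene_type_counts = Counter()
    with open_gtf(path) as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue  # skip header comments and blank lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 9:
                continue  # not a well-formed 9-column GTF record
            feature = fields[2]
            feature_counts[feature] += 1
            if feature == "gene":
                # Attributes look like: key "value"; key "value";
                for item in fields[8].split(";"):
                    key, _, value = item.strip().partition(" ")
                    if key == "gene_type":
                        gene_type_counts[value.strip('"')] += 1
                        break
    return feature_counts, gene_type_counts
```

Calling summarize_gtf on gencode.v47.annotation.gtf and on gencode.v47.genes.gtf and comparing the "gene", "transcript", and "exon" counts gives a rough picture of how the collapse step changed the annotation.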

Source code for the collapse_annotation.py script [2] is available here.

1: Frankish A, et al. GENCODE: reference annotation for the human and mouse genomes in 2023. Nucleic Acids Res., Volume 51, Issue D1, 6 January 2023, Pages D942–D949. doi: 10.1093/nar/gkac1071
2: Original author: Francois Aguet

