Molecule Tutorials - Herong's Tutorial Examples - v1.26, by Herong Yang
Reference Genome Sequence Data File
This section provides a tutorial example on how to download the Reference Genome Sequence Data File provided by NCBI (National Center for Biotechnology Information).
What Is Reference Genome Sequence Data File? - The Reference Genome Sequence Data File is a file in FASTA format that contains the reference human genome sequences provided by NCBI (National Center for Biotechnology Information).
Here is what I did to download the Reference Human Genome data file.
1. Get the data file with "curl" command:
herong$ curl ftp://ftp.ncbi.nlm.nih.gov/refseq/H_sapiens/annotation/GRCh38_latest/refseq_identifiers/GRCh38_latest_genomic.fna.gz > genome.gz
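The download step can also be wrapped in a small shell function that stops on a failed transfer and checks the archive before it is used. This is only a sketch: the function name fetch_gz is hypothetical, and the curl options shown are one reasonable choice, not something the original command requires.

```shell
# Hypothetical helper (not part of the tutorial): download a gzip file
# with curl and confirm that the archive is intact.
fetch_gz() {
  url="$1"   # source URL, e.g. the NCBI FTP address used in step 1
  out="$2"   # local file name, e.g. genome.gz
  # --fail makes curl exit non-zero on a server error instead of saving
  # an error page; --location follows redirects.
  curl --fail --silent --show-error --location --output "$out" "$url" || return 1
  # gzip -t tests the compressed data without extracting it.
  gzip -t "$out"
}
```

It would be called with the URL from step 1 as the first argument and genome.gz as the second. A non-zero exit status means either the transfer failed or the downloaded file is not a valid gzip archive.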
2. Unzip and verify the file.
herong$ gunzip genome.gz
herong$ head -100 genome
>NC_000001.11 Homo sapiens chromosome 1, GRCh38.p13 Primary Assembly
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
...
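If disk space is a concern, the same check can be done without writing the decompressed copy to disk. The sketch below assumes a hypothetical helper named peek_fasta_gz; it tests the archive first, then streams only the first few lines.

```shell
# Hypothetical helper (not part of the tutorial): show the first N lines
# of a gzip-compressed FASTA file without decompressing it to disk.
peek_fasta_gz() {
  file="$1"         # compressed file, e.g. genome.gz
  lines="${2:-5}"   # number of lines to show, 5 by default
  # Stop early if the archive is truncated or not gzip data at all.
  gzip -t "$file" || return 1
  # -d decompresses, -c writes to standard output and leaves the file as is.
  gzip -dc "$file" | head -n "$lines"
}
```

For example, "peek_fasta_gz genome.gz 100" prints the same lines as the "head -100 genome" command, but it has to be run before "gunzip" replaces genome.gz with genome.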
3. List the chromosome records in the data file. The output shows 25 records: chromosomes 1 to 22, chromosomes X and Y, and the mitochondrial genome.
herong$ grep ">NC_" genome
>NC_000001.11 Homo sapiens chromosome 1, GRCh38.p13 Primary Assembly
>NC_000002.12 Homo sapiens chromosome 2, GRCh38.p13 Primary Assembly
>NC_000003.12 Homo sapiens chromosome 3, GRCh38.p13 Primary Assembly
>NC_000004.12 Homo sapiens chromosome 4, GRCh38.p13 Primary Assembly
>NC_000005.10 Homo sapiens chromosome 5, GRCh38.p13 Primary Assembly
>NC_000006.12 Homo sapiens chromosome 6, GRCh38.p13 Primary Assembly
>NC_000007.14 Homo sapiens chromosome 7, GRCh38.p13 Primary Assembly
>NC_000008.11 Homo sapiens chromosome 8, GRCh38.p13 Primary Assembly
>NC_000009.12 Homo sapiens chromosome 9, GRCh38.p13 Primary Assembly
>NC_000010.11 Homo sapiens chromosome 10, GRCh38.p13 Primary Assembly
>NC_000011.10 Homo sapiens chromosome 11, GRCh38.p13 Primary Assembly
>NC_000012.12 Homo sapiens chromosome 12, GRCh38.p13 Primary Assembly
>NC_000013.11 Homo sapiens chromosome 13, GRCh38.p13 Primary Assembly
>NC_000014.9 Homo sapiens chromosome 14, GRCh38.p13 Primary Assembly
>NC_000015.10 Homo sapiens chromosome 15, GRCh38.p13 Primary Assembly
>NC_000016.10 Homo sapiens chromosome 16, GRCh38.p13 Primary Assembly
>NC_000017.11 Homo sapiens chromosome 17, GRCh38.p13 Primary Assembly
>NC_000018.10 Homo sapiens chromosome 18, GRCh38.p13 Primary Assembly
>NC_000019.10 Homo sapiens chromosome 19, GRCh38.p13 Primary Assembly
>NC_000020.11 Homo sapiens chromosome 20, GRCh38.p13 Primary Assembly
>NC_000021.9 Homo sapiens chromosome 21, GRCh38.p13 Primary Assembly
>NC_000022.11 Homo sapiens chromosome 22, GRCh38.p13 Primary Assembly
>NC_000023.11 Homo sapiens chromosome X, GRCh38.p13 Primary Assembly
>NC_000024.10 Homo sapiens chromosome Y, GRCh38.p13 Primary Assembly
>NC_012920.1 Homo sapiens mitochondrion, complete genome
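To get a number instead of a listing, the grep pattern can be combined with the -c option, and a short awk script can total the sequence length of each record. The function names count_nc_records and fasta_lengths below are hypothetical; this is a minimal sketch that assumes a plain FASTA file where every header line starts with ">".

```shell
# Hypothetical helpers (not part of the tutorial).

# Count the records whose header line starts with ">NC_".
count_nc_records() {
  grep -c '^>NC_' "$1"
}

# Print "accession<TAB>length" for every record in a FASTA file.
fasta_lengths() {
  awk '
    # Header line: print the previous record, then start a new one.
    /^>/ {
      if (name != "") print name "\t" len
      name = substr($1, 2)   # first word of the header without ">"
      len = 0
      next
    }
    # Sequence line: add its length to the current record.
    { len += length($0) }
    # Print the last record at end of input.
    END { if (name != "") print name "\t" len }
  ' "$1"
}
```

Run against the decompressed file from step 2, "count_nc_records genome" prints a single count, and "fasta_lengths genome | grep '^NC_'" limits the length report to the NC_ records. Note that the lengths include the "N" placeholder characters seen at the start of chromosome 1.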