Glossary
Adapter: A short oligonucleotide attached to the DNA to be sequenced. An adapter can provide a priming site for both amplification and sequencing of the adjoining, unknown nucleic acid.
Alignment: The mapping of a raw sequence read to a location within a reference genome. The mapping occurs because the sequences within the raw read match or align to sequences within the reference genome.
Assembly: The joining of fragment sequences into higher-order structures based on their overlap and, where appropriate, a reference sequence.
BAM: Binary version of the SAM format. It is a compressed format for storing alignment data and a typical output of the secondary phase of data analysis.
BED: Text file format for storing genomic intervals, e.g. genes or peak regions.
Coverage: The proportion of an analysed sequence, relative to its length, that is covered by reads, usually expressed as a percentage; the term is sometimes also used for read depth.
Coverage depth: The number of nucleotides from reads that are mapped to a given position of the reference genome.
De novo genome assembly: Assembly of sequence reads into a genome when no reference sequence is available.
FASTA: Simple text format for storing raw sequence data.
FASTQ: Text format for storing raw sequence data with quality scores for each base.
Paired-end sequencing: Sequencing process where both ends of a single DNA or RNA fragment are sequenced, but the intermediate region is not. Particularly useful for identifying structural rearrangements, including gene fusions.
Read: Data output from the analysis of a single fragment (sequence).
Read-accuracy: Indicates the occurrence of errors (in %) after primary analysis.
Read-depth: The number of sequence reads that pile up at the same genomic location. For example, 30X read-depth coverage indicates that the genomic location is covered by 30 independent sequencing reads. Increased read-depth translates into higher confidence for calling genomic variants.
Read-length: The number of base pairs that are sequenced in an individual sequence read.
SAM: Text file format for storing sequence alignments against a reference genome. See also BAM.
SNP Calling: Process of detecting Single Nucleotide Polymorphisms in the sequences obtained.
Variant Calling: Process of detection of sequence variants in the sequences obtained.
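The Coverage, Coverage depth and Read-depth entries above can be illustrated with a small calculation. The following is a minimal sketch in Python, not taken from any particular tool: the function names and the toy reads are hypothetical, and alignments are assumed to be given as 0-based, end-exclusive (start, end) pairs, the convention used for BED intervals.

```python
def depth_per_position(alignments, ref_length):
    """Return a list where entry i is the number of reads covering position i.

    alignments: iterable of (start, end) pairs, 0-based and end-exclusive
    (an assumption made for this sketch).
    """
    depth = [0] * ref_length
    for start, end in alignments:
        # Clip each alignment to the reference so out-of-range reads are ignored.
        for pos in range(max(start, 0), min(end, ref_length)):
            depth[pos] += 1
    return depth


def breadth_of_coverage(depth):
    """Percentage of reference positions covered by at least one read."""
    if not depth:
        return 0.0
    covered = sum(1 for d in depth if d > 0)
    return 100.0 * covered / len(depth)


def mean_depth(depth):
    """Average number of reads per reference position."""
    return sum(depth) / len(depth) if depth else 0.0


# Toy data: three hypothetical reads aligned to a 10-base reference.
reads = [(0, 5), (3, 8), (4, 6)]
depth = depth_per_position(reads, 10)

print(depth)                       # per-position read depth
print(breadth_of_coverage(depth))  # coverage as a percentage of the reference
print(mean_depth(depth))           # average depth across the reference
```

In this toy case the last two reference positions are not covered by any read, so coverage (as a percentage) is below 100 even though some positions have a depth greater than one.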
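To make the FASTQ entry concrete: a FASTQ record consists of four lines (a header starting with "@", the sequence, a separator starting with "+", and one quality character per base). The sketch below decodes such a record, assuming the common Phred+33 quality encoding; the function name and the example record are hypothetical.

```python
def parse_fastq_record(lines):
    """Parse one four-line FASTQ record into (name, sequence, quality scores).

    Assumes Phred+33 encoding, where the quality score of a base is the
    ASCII code of its quality character minus 33.
    """
    header, sequence, separator, quality = (line.rstrip("\n") for line in lines)
    if not header.startswith("@") or not separator.startswith("+"):
        raise ValueError("not a FASTQ record")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality lengths differ")
    scores = [ord(ch) - 33 for ch in quality]
    return header[1:], sequence, scores


# Hypothetical record: four bases with decreasing quality.
record = ["@read1", "ACGT", "+", "II5!"]
name, seq, scores = parse_fastq_record(record)

print(name, seq, scores)
```

A FASTA record, by contrast, carries only a header line and the sequence, with no per-base quality scores.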