Male mallard duck, Anas platyrhynchos. © Catherine MADZAK

Sequencing the duck genome

Next steps after sequencing

How can a genome be recovered after sequencing? Modern sequencers are very fast, but they produce millions of very short sequences that must then be pieced back together to recover an intact genome. This reconstruction is time-consuming and draws on a whole range of methods. Identifying the position and function of genes takes even longer.

By Pascale Mollier, translated by Inge Laino
Updated on 10/23/2013
Published on 07/23/2013

From sequencing to physical recovery of genomes

Modern sequencing methods produce millions of very short sequences (100 to 1,000 base pairs), which then need to be put back in order and assigned to chromosomes. Bioinformatic processing detects where the short sequences overlap and merges them into larger fragments. At this stage, the duck yields about 230,000 fragments with an average size of 4,700 base pairs, from which a genome of about 1.26 billion base pairs spread over 40 pairs of chromosomes (1) must be reconstructed. Solving this enormous puzzle requires several methods, such as:

- comparison with the chicken genome, very similar to that of the duck, which makes it easier to order some fragments according to chromosomes.

- radiation hybrid (irradiated hybrid) mapping: this method allows researchers to detect proximity between genome fragments produced when radiation breaks up the chromosomes. Duck cells are irradiated and then fused with hamster cells, each of which retains a random subset of the broken duck chromosome fragments. To estimate the physical distance between two fragments of the duck genome, researchers count the hybrid cells that contain both, on the premise that the closer two fragments are, the less likely they are to be separated by the radiation, and therefore the more likely they are to be retained together in the same hybrid cells.

Obtaining irradiated hybrids. For publication in INRA journal Productions animales, special report on foie gras producing ducks and geese, estimated release end-2013. © INRA, Alain Vignal

- physical marker maps: sequence fragments are tagged with molecular markers (2). Fragments that carry the same markers overlap, which allows scientists to order them along the chromosomes.

- genetic maps: built by crossing two individuals with different genotypes. The frequency of recombination between two markers in the offspring allows scientists to deduce the genetic distance between them (in centimorgans).
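Two of the methods above, hybrid mapping and genetic mapping, come down to counting how often two markers stay together. The Python sketch below illustrates that idea on invented data. The function names, the toy panel and the toy offspring are all hypothetical. The article does not say which statistical model or mapping function was used for the duck, so the Haldane function is used here purely as a common textbook choice.

```python
import math

def co_retention(panel, marker_a, marker_b):
    """Fraction of hybrid cell lines that retained both markers.

    `panel` maps a hybrid line name to the set of markers detected in it.
    A high value suggests the two markers lie close together.
    """
    both = sum(1 for markers in panel.values()
               if marker_a in markers and marker_b in markers)
    return both / len(panel)

def recombination_fraction(gametes, parental):
    """Share of gametes whose two-marker combination is not a parental one."""
    recombinants = sum(1 for combo in gametes if combo not in parental)
    return recombinants / len(gametes)

def haldane_cm(r):
    """Convert a recombination fraction (0 <= r < 0.5) to centimorgans.

    Haldane's mapping function is an assumption made for this sketch.
    """
    return -50.0 * math.log(1.0 - 2.0 * r)

# Invented hybrid panel: five hybrid lines and the markers found in each.
panel = {
    "h1": {"A", "B"},
    "h2": {"A", "B", "C"},
    "h3": {"C"},
    "h4": {"A"},
    "h5": {"B", "C"},
}
ab = co_retention(panel, "A", "B")  # A and B are retained together often
ac = co_retention(panel, "A", "C")  # A and C are retained together rarely

# Invented cross: ten gametes scored at two markers, one of them recombinant.
parental = {("A1", "B1"), ("A2", "B2")}
gametes = [("A1", "B1")] * 5 + [("A2", "B2")] * 4 + [("A1", "B2")]
r = recombination_fraction(gametes, parental)
distance_cm = haldane_cm(r)
print(ab, ac, r, round(distance_cm, 2))
```

Real radiation hybrid and linkage analyses rely on statistical models and many more markers. The point here is only that both kinds of map start from simple co-occurrence counts.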

Functional genomics

A DNA sequence is nothing more than a succession of four letters - A, C, G, and T - which make up the genetic code. But the genes and their function still have to be identified.

In addition to the physical map of a genome, the position of each gene in the genome must be determined. Using bioinformatic tools, scientists have estimated that ducks have between 15,000 and 19,000 genes. Bioinformatic processing has also helped predict theoretical coding sequences by identifying open reading frames in the DNA.
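To give a feel for what looking for an open reading frame means, here is a minimal Python sketch that scans a DNA string for stretches running from a start codon (ATG) to the next stop codon in the same frame. It is a deliberate simplification and not the pipeline used for the duck: it reads the forward strand only and ignores introns, which real gene prediction in birds must handle. The function name, the `min_codons` parameter and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code

def find_orfs(dna, min_codons=3):
    """Return (start, end, frame) for each ATG...stop stretch, forward strand only.

    `start` and `end` are 0-based slice positions, so dna[start:end]
    is the ORF including its stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate reading frame
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close it and look for the next start codon
    return orfs

# Invented toy sequence containing one short ORF in the third frame.
example = "CCATGAAATTTTAGGG"
orfs = find_orfs(example)
for start, end, frame in orfs:
    print(frame, example[start:end])
```

In practice, predicted frames such as these are checked against transcript and protein data.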

To study actual gene expression, transcripts (messenger RNA) and proteins are sequenced for different tissues and conditions.

Several methods of so-called functional genomics must then be used to determine the function of genes: searching for homology with known genes, studying mutations, reverse genetics, searching for QTL (3), transgenesis, etc.

Physical and genetic mapping complement each other: sequence polymorphisms (physical markers) can be associated with variations in physiological traits, which allows scientists to pinpoint on the chromosomes the genes involved in the expression of these traits.
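As a rough illustration of associating a polymorphism with a trait, the Python sketch below groups animals by their genotype at a single marker and compares the average trait value of each group. The data, the genotype labels and the function name are invented for the example. A real QTL search would test many markers and use proper statistical tests rather than a simple comparison of means.

```python
from statistics import mean

def mean_trait_by_genotype(records):
    """Average trait value for each genotype at one marker.

    `records` is a list of (genotype, trait_value) pairs, one per animal.
    """
    groups = {}
    for genotype, value in records:
        groups.setdefault(genotype, []).append(value)
    return {genotype: mean(values) for genotype, values in groups.items()}

# Invented measurements (arbitrary units) for six animals typed at one SNP.
records = [
    ("AA", 10.0), ("AA", 12.0),
    ("AG", 13.0), ("AG", 15.0),
    ("GG", 17.0), ("GG", 19.0),
]
means = mean_trait_by_genotype(records)
for genotype, value in sorted(means.items()):
    print(genotype, value)
```

If the group averages differ consistently between genotypes, the marker is a candidate for lying near a gene that influences the trait.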

Several levels of mapping are necessary to order sequence fragments on chromosomes. Transcript and protein sequencing are also necessary to detect genes and determine their structure (structural annotation). For publication in INRA journal Productions animales, special report on foie gras producing ducks and geese, estimated release end-2013. © INRA, Alain Vignal

(1) The number of chromosomes is determined by cytogenetic techniques by establishing the karyotype.

(2) Markers: defined DNA sequences, which can be gene fragments or sequences presenting a polymorphism (microsatellite markers, SNPs, etc.).

(3) QTL: quantitative trait locus. A region of the genome, containing one or more genes, that contributes to variation in a quantitative phenotypic trait.