MSU Evolutionary Genomics Laboratory
This laboratory was established as part of a scientific research project supported with a monetary grant awarded by the Government of the Russian Federation under a grant competition designed to provide governmental support to scientific research projects implemented under the supervision of the world's leading scientists at Russian institutions of higher learning, research institutions of the governmental academies of sciences and governmental research centers of the Russian Federation (Resolution of the RF Government No.220 of April 9, 2010).
Host institution of higher learning:
State educational institution of higher professional education "Moscow State University named after M. V. Lomonosov"
Scientific research area:
Comparison of the genomes of closely related species and the genotypes of closely related individuals.
Key project objectives:
1. To study complex forms of natural selection in which the relative fitnesses of alleles at different loci depend on one another;
2. To study epistasis as it applies to positive selection, which favours new, rare phenotypes, and to negative selection, which favours ancestral, common phenotypes.
Anticipated project outputs:
1. The project will help determine the role of the key factors of biological evolution: the mutation process and natural selection;
2. The project will help identify new features of how viral proteins function.
Leading scientist's full name: Kondrashov, Alexey Simonovich
Academic degree and title:
Candidate of Biological Sciences, professor
Professor at the Institute of Biological Sciences and the Department of Ecology and Evolutionary Biology of the University of Michigan (USA).
Field of scientific interests:
Key scientific achievements:
- The deterministic mutation hypothesis, which explains the evolution of sexual reproduction;
- Advancement of the theory of sympatric speciation;
- Estimation of mutation rates;
- Assessment of natural selection in the evolution of nucleotide sequences and proteins.
Theory and bioinformatics
The overall purpose of our theoretical research is to study natural selection at the genomic level. Over the past two years, we have been mostly working on five objectives that were formulated in our project proposal and grant application. These five objectives and brief descriptions of the results achieved to date are provided below.
1. To develop a method for detecting positive selection that occurred in the past using data on present-day negative selection.
This method has been developed. It rests on a simple idea: once a beneficial allele has displaced a deleterious allele from a population, the selection that favoured the beneficial allele ceases to be positive and becomes negative, because the beneficial allele is no longer rare but common. Applying this method to data on human genetic variation, we found that before the Ponginae–Homininae divergence approximately 50% of the amino acid substitutions in the human ancestral lineage were beneficial, whereas at later stages of the evolution of our species such substitutions were much rarer. By contrast, throughout the evolution of the Drosophila melanogaster ancestral lineage the share of adaptive amino acid substitutions remained at about 50% (Bazykin and Kondrashov 2011). We also found that longer-lasting selection plays a greater role in the evolution of conserved segments of protein-coding genes (Bazykin and Kondrashov 2012).
2. To study the properties of adaptive landscapes in sequence space using data on the fixation of beneficial mutations in genome sequences that have recently undergone significant changes.
We demonstrated that an in-frame deletion or insertion within a protein-coding gene is usually followed by an "adaptive walk" consisting of several amino acid substitutions driven by positive selection (Leushkin et al. 2012a). The lengths of these walks are close to those predicted by the general theory of adaptive landscapes. These walks take place against the backdrop of complex interactions between mutation, which usually produces deletions, and selection (Leushkin et al. 2012b) and gene conversion (Leushkin and Bazykin 2012), which favour insertions.
3. To study the correlations between substitutions at neighbouring sites of coding and non-coding sequences.
We described strong small-scale heterogeneity in the parameters of the mutation process in animal genomes. This heterogeneity affects both the overall mutation rate and the relative frequencies of different kinds of point mutations, first and foremost transitions and transversions (Seplyarskiy et al. 2012). Taking this heterogeneity into account, we studied correlated substitutions at neighbouring nucleotide sites in non-coding sequences of vertebrate and Drosophila genomes. We demonstrated that this phenomenon is much more common than previously believed and that it is caused primarily by simultaneous or nearly simultaneous substitutions of multiple nucleotides. Such multiple substitutions span no more than 10 neighbouring nucleotide sites, but many of them involve more than two nucleotides. Multiple substitutions may be explained by either mutation or selection, but mutation appears much more likely (Terekhanova et al. 2012).
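The distinction between transitions and transversions mentioned above can be illustrated with a minimal Python sketch. It is not the laboratory's pipeline: the function names `classify_substitution` and `ts_tv_ratio` are hypothetical, and the input is assumed to be a list of (ancestral, derived) nucleotide pairs that have already been inferred elsewhere.

```python
# Illustrative sketch only; names and input format are assumptions.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
NUCLEOTIDES = PURINES | PYRIMIDINES


def classify_substitution(ancestral, derived):
    """Return 'transition' or 'transversion' for a single point substitution."""
    a, d = ancestral.upper(), derived.upper()
    if a not in NUCLEOTIDES or d not in NUCLEOTIDES:
        raise ValueError("expected one of A, C, G, T")
    if a == d:
        raise ValueError("ancestral and derived nucleotides are identical")
    # A transition stays within purines or within pyrimidines;
    # a transversion switches between the two classes.
    same_class = {a, d} <= PURINES or {a, d} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def ts_tv_ratio(substitutions):
    """Ratio of transitions to transversions in a list of (ancestral, derived) pairs."""
    counts = {"transition": 0, "transversion": 0}
    for ancestral, derived in substitutions:
        counts[classify_substitution(ancestral, derived)] += 1
    if counts["transversion"] == 0:
        return float("inf")
    return counts["transition"] / counts["transversion"]
```

Computing this ratio separately for short genomic windows is one simple way to see whether the relative frequencies of the two mutation types vary along a genome.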
4. To study the preservation of orthology between genome segments whose sequences no longer show apparent similarity.
We developed a method for detecting selective constraints on the evolution of unalignable sequences. The method requires identifying a large number of orthologous yet unalignable genome segments. Introns whose orthology is unambiguously determined by their protein-coding context are convenient for this purpose. Next, a quartet of species is selected, consisting of two pairs such that within each pair some orthologous introns share significant local alignments, while no such alignments exist between the pairs. One example of such a quartet is ((Homo sapiens, Gallus gallus), (Ciona intestinalis, Ciona savignyi)). The correlation between significant local alignments within the two pairs of the quartet is then analysed. We demonstrated (Vakhrusheva et al. 2012) that such a correlation does exist. This means that selective constraints have acted throughout the entire evolution of the quartet from its common ancestor, even though the divergence of the distant species has left their sequences unalignable.
5. To study the increasing strength of negative selection in favour of a new allele following an allele replacement, and to apply it to analysing the properties of adaptive landscapes.
We studied how the fitness contribution of an amino acid depends on the time elapsed since that amino acid appeared in the protein. This dependence proved very strong. Data on reverse amino acid substitutions show that a recently replaced amino acid is often restored, and that such reversals become less and less frequent over time. This effect is primarily explained by the declining fitness contribution of the replaced amino acid, apparently a consequence of epistatic interactions between the new amino acid and the amino acids that subsequently arise at other sites of the protein (Naumenko et al. 2012a). We also demonstrated that the correlation between the rate of evolution and the number of acceptable amino acids is unexpectedly weak, both at the level of individual sites and at the level of whole proteins. These are therefore two distinct parameters of the evolutionary process (Naumenko et al. 2012b).
In addition, we started researching three more theoretical problems:
6. Epistasis in the evolution of the influenza A virus
We are studying the acceleration of the evolution of influenza A virus proteins following reassortment, in which the viral genome acquires one or several proteins from a different, often phylogenetically distant virus. This acceleration points to epistatic interactions between the selection processes that affect different proteins.
7. Population genomics of the hypervariable organism Ciona savignyi
We are conducting a comparative analysis of the genotypes of six Ciona savignyi individuals caught in three different regions of the Pacific Ocean. Because of its record-high variability, this species is an ideal subject for the study of natural selection.
8. Analysis of selection against loss-of-function alleles in Drosophila populations
We study the accumulation of nonsynonymous and synonymous substitutions in alleles of protein-coding Drosophila loci that have been inactivated by nonsense mutations. The neutrality of substitutions in the inactivated alleles makes it possible to determine their age and, therefore, the strength of the selection that maintains the functional allele.
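The reasoning that neutral changes in an inactivated allele reveal its age can be made concrete with a deliberately simplified Python sketch. It assumes that every change observed since inactivation is neutral, that the per-site mutation rate is known and constant, and that no site has been hit more than once. The function name `estimate_allele_age` and all parameter names are hypothetical and do not describe the laboratory's actual procedure.

```python
# Simplified sketch under the assumptions stated above; not the lab's method.
def estimate_allele_age(n_substitutions, n_sites, mutation_rate):
    """Estimate the age of an inactivated allele in generations.

    Under neutrality the expected number of changes per site after t
    generations is mutation_rate * t, so t is estimated as the observed
    changes per site divided by the mutation rate.
    """
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    if mutation_rate <= 0:
        raise ValueError("mutation_rate must be positive")
    if n_substitutions < 0:
        raise ValueError("n_substitutions cannot be negative")
    changes_per_site = n_substitutions / n_sites
    return changes_per_site / mutation_rate
```

For example, `estimate_allele_age(3, 1000, 3e-9)` corresponds to three changes over 1000 sites at an assumed rate of 3e-9 per site per generation, which gives roughly one million generations. The rate used here is a placeholder value, not a measured one.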
Acquisition and analysis of our own data
In addition to purely theoretical and bioinformatics research, we have also begun work that involves acquiring and analysing our own data:
9. Sequencing, annotation, and analysis of the minimal genome of a flowering plant
Genlisea margaretae and G. aurea have the smallest genomes among the angiosperms – only 64 million nucleotides. We study the mechanisms responsible for genome miniaturization by comparing these genomes with those of relatively closely related species. We have established that the G. aurea genome contains fewer genes than that of any other plant species studied to date and that its non-coding segments are significantly shorter.
See details here: http://www.biomedcentral.com/1471-2164/14/476/
10. Genomic analysis of the adaptation of the three-spined stickleback, Gasterosteus aculeatus, to fresh water
We compare the genotypes of stickleback individuals from populations in the White Sea and in neighbouring freshwater lakes. This comparison makes it possible to identify the genes responsible for adaptation to fresh water and to assess the strength of the corresponding selection. This work is carried out in cooperation with N. S. Muge (Laboratory of Population Biology at the Russian National Research Institute of Fisheries and Oceanography).
11. Study of the fine structure of recombination sites following interspecies crossing in the basidiomycete Schizophyllum commune
We crossed two haploid Schizophyllum commune individuals whose genomes differ by a synonymous distance of more than 20% and sequenced the genotypes of 20 haploid offspring. A detailed analysis of the 35 crossover points we identified makes it possible to study their properties, including those that cannot be analysed in the offspring of less genetically distant parents.
12. Analysis of biological diversity of the White Sea; search for new model systems with a high level of genetic polymorphism
We systematically study intra-population variability in populations of macroscopic animals and algae found in the White Sea. To date, we have studied more than 100 populations and discovered seven sibling species and one case of mitochondrial introgression. This work is carried out in cooperation with numerous zoologists and phycologists from the White Sea Biological Station of Moscow State University.
13. Genomic polymorphism of the ascomycete Geomyces pannorum isolated from extreme habitats
We have sequenced 14 genotypes of Geomyces pannorum, an ascomycete that has lost the ability to reproduce sexually and that lives at low temperatures, including in permafrost. A comparative analysis of these genomes has revealed complex phylogenetic relationships between organisms from different habitats, as well as a number of cases of horizontal gene transfer. This work is carried out in cooperation with S. M. Ozerskaya (Russian National Collection of Microorganisms of the Institute of Biochemistry and Physiology of Microorganisms named after G. K. Skryabin).
14. Sequencing, annotation, and analysis of the genomes of wild and cultivated buckwheat
We have sequenced the genome of the wild ancestor of one of the two cultivated buckwheat species, Fagopyrum tataricum ssp. potanini, and are now analysing it. We will then sequence and analyse the genomes of Fagopyrum tataricum cultivars, as well as of the wild ancestor and cultivars of the other cultivated buckwheat species, Fagopyrum esculentum.
15. Population genomics of the hypervariable basidiomycete Schizophyllum commune
We have shown that intra-population variability of Schizophyllum commune genotypes is high, amounting to almost 7%. We sequenced the genotypes of 42 haploid individuals of this species from several regions of Russia and the USA and are now carrying out their comparative analysis.
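One common way to quantify intra-population variability is nucleotide diversity, the average fraction of sites that differ between two sequences drawn from the population. The Python sketch below illustrates that statistic for pre-aligned haploid sequences of equal length. It is an assumption that this is the relevant measure here, and the function name `nucleotide_diversity` is hypothetical; the sketch ignores gaps, missing data and restriction to particular site classes.

```python
# Illustrative sketch: average pairwise difference per site among aligned sequences.
from itertools import combinations


def nucleotide_diversity(sequences):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    if len(sequences) < 2:
        raise ValueError("at least two sequences are required")
    length = len(sequences[0])
    if length == 0:
        raise ValueError("sequences must not be empty")
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")

    total = 0.0
    n_pairs = 0
    for first, second in combinations(sequences, 2):
        # Count positions at which the two sequences carry different nucleotides.
        differences = sum(1 for x, y in zip(first, second) if x != y)
        total += differences / length
        n_pairs += 1
    return total / n_pairs
```

With three toy sequences `["AAAA", "AAAT", "AATT"]` the pairwise differences per site are 1/4, 2/4 and 1/4, so the function returns their mean, 1/3.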
16. Sequencing, annotation, and analysis of the genome of the tetraploid species Capsella bursa-pastoris
Shepherd's purse, Capsella bursa-pastoris, is the product of a recent whole-genome duplication. We have sequenced its genome and are now studying the initial stages of its post-duplication evolution. This work is carried out in cooperation with V. Yu. Makeyev and A. S. Kasyanov (Laboratory of Systems Biology and Computational Genetics of the Institute of General Genetics).
17. Genomics of the adaptation of the ascomycete Podospora anserina to a new habitat
O. A. Kudryavtseva (Department of Mycology and Algology of the Faculty of Biology of Moscow State University) is conducting a long-term experiment in which the ascomycete Podospora anserina is cultivated under artificial conditions. We sequence the ancestral genotypes and the genotypes of the lines derived from them and examine the changes caused by adaptation of the species to the culture conditions.
18. Evolution of proteins during the speciation of Baikal Amphipoda
The amphipods (Amphipoda) of Lake Baikal form one of the most remarkable species flocks. We sequence the transcriptomes of different species to study the processes that accompanied the rapid cladogenesis that gave rise to this species flock. This work is carried out in cooperation with L. Yu. Yampolsky (Department of Biology, East Tennessee State University).