Ups J Med Sci. 2020; 125(1): 1–9.
What animals can teach us about evolution, the human genome, and human disease
Kerstin Lindblad-Toh
aDepartment for Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden;
bBroad Institute of MIT and Harvard, Cambridge, MA, USA
Received 2020 January 22; Accepted 2020 January 23.
Abstract
During the past twenty years, since I started as a postdoc, the world of genetics and genomics has changed dramatically. My main research goal throughout my career has been to understand human disease genetics, and I have developed comparative genomics and comparative genetics to generate resources and tools for understanding human disease. Through comparative genomics I have worked to sequence enough mammals to understand the functional potential of each base in the human genome, as well as selected vertebrates to study the evolutionary changes that have given many species their key traits. Through comparative genetics, I have developed the dog as a model for human disease, characterising the genome itself and determining a list of germ-line loci and somatic mutations causing complex diseases and cancer in the dog. Pulling all these findings and resources together opens new doors for understanding genome evolution, the genetics of complex traits, and cancer in man and his best friend.
Keywords: Canine genetics, comparative genomics, genome sequencing, human genetics
Introduction
Human disease: early studies
Human diseases can largely be divided into infectious diseases and genetic diseases. In many cases diseases arise as a result of both genetic predisposition and environmental factors. In the early years, diseases dependent on a single gene were analysed with laborious methods, using large families to see if simple sequence length polymorphism (SSLP) markers segregated with disease. These single-gene diseases included, for example, cystic fibrosis (CFTR) (1) and Huntington's disease (IT15) (2), while Down syndrome was found to depend on an extra chromosome (trisomy 21) (3), leading to a more complex phenotype related to many genes. More common complex diseases such as diabetes, schizophrenia, and rheumatoid arthritis were far too complex to understand. On top of this, cancer was postulated to have both inherited mutations and mutations arising in the tumour (including the Knudson two-hit hypothesis) (4), making the tumour become increasingly malignant (Figure 1).

Professor Kerstin Lindblad-Toh, winner of the Medical Faculty of Uppsala University Rudbeck Award 2019, 'for her excellent research in comparative genomics and for developing the dog as a model for biomedical research'.
Tools have transformed genetics
Over the past twenty years enormous changes related to genome sequencing and gene mapping have occurred, mostly as collaborative efforts striving to develop new technologies, large-scale resources, and computational approaches. As I will discuss below, while DNA was described more than half a century ago, the understanding of how genetic diseases arise is still incomplete, but the field makes continuous progress every day. Many of the tools and analysis methods, and some of the knowledge gathered, are also already being used to understand the biology of disease. Over time, this will allow more detailed diagnosis with better treatment options and, in the long term, personalised medicine.
Comparative genomics
Sequencing the human genome
The human genome has long been a source of wonder, with only a partial understanding of how it works, despite enormous efforts to generate different types of data. After the structure of DNA was described in the 1950s, more active research on the human genome and the genes and regulatory elements hidden therein took a long time to follow. In the beginning, studies often focussed on a single gene for a single disease. Several types of maps relying on different types of markers (SSLPs and radiation hybrid markers) allowed the mapping of monogenic diseases such as cystic fibrosis and Huntington's disease. It was recognised that the world of disease genetics would greatly improve if the whole human genome could be sequenced. For much of the 1990s, more than 2,800 researchers in a world-wide consortium worked on sequencing the human genome, with different regions or chromosomes divided between labs. This public effort, using Sanger sequencing (5), led to the first human draft genome (6), but also coincided with the publishing of a different human genome (7), which used a novel approach: whole-genome shotgun sequencing, where segments are randomly sequenced and then put together as a giant puzzle. Both genome assemblies covered most of the genome, but struggled with more complex regions such as complex gene families, difficult repeats, and centromeres and telomeres. Thus, the human genome project continued to fill in gaps for an additional 5 years or so. While the human genome was declared finished in 2003 (8), there still remained gaps that were unfilled. In the last couple of years, using novel long-read technologies, it has finally become possible to sequence complex regions of the genome such as segmental duplications and repeats, for example in centromeric regions (9), thus enabling new and complete assemblies of chromosomes that also span repeat regions, from telomere to telomere (10).
Prior to the human genome sequence, common lore had it that the human genome contained ∼100,000 protein-coding genes. When first analysed, the human genome was found to contain ∼50% repetitive sequences, for many years thought of as 'junk DNA' (11). In the rest of the genome, scientists struggled to identify the protein-coding genes that must be there. They used available RNA expression data and known protein-coding sequence. This allowed them to extrapolate their findings to ∼40,000 protein-coding genes, assuming that there were still genes they could not find. Difficult regions included complex gene families (such as olfactory receptors) or genes with an especially GC-rich sequence.
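The "giant puzzle" idea behind whole-genome shotgun sequencing can be illustrated with a toy example. The sketch below is not the method used by either genome project; it is a minimal greedy overlap merger, assuming error-free reads, a single strand, and no repeats (exactly the conditions under which real assemblies struggled least). The function names and the `min_len` parameter are hypothetical and chosen for illustration.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Repeatedly merge the pair of reads with the longest overlap.

    Toy model only: assumes error-free reads from one strand and no repeats.
    Returns the list of contigs left when no overlaps remain.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing overlaps any more; remaining contigs stay separate
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


if __name__ == "__main__":
    reads = ["GCAATCCGA", "ATGGCGTG", "GCGTGCAAT"]  # shuffled fragments
    print(greedy_assemble(reads))
```

With repeats longer than the reads, several merges become equally plausible, which is one way to see why repeat-rich regions such as centromeres remained as gaps until long-read technologies became available.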
Mouse and rat genomes
The second mammalian genome to be sequenced was the mouse (12). The mouse is the best laboratory animal model for human disease, as it is small and easy to manipulate in captivity, and hence it was deemed of high importance to generate a reference sequence for it. For the reference the C57BL/6J strain was used, despite the fact that multiple strains exist and are used for different disease studies. Intriguingly, the haplotype structure of the mouse genome (as determined by whole-genome sequencing of multiple strains) was quite blocky (long stretches of sequence were inherited together when looking at multiple strains), and when multiple strains were analysed and compared to the two founder mice, Mus musculus domesticus and Mus musculus musculus, it was seen that most laboratory strains were hybrids between those two founder strains (13). This finding agreed with the fact that early mice were kept as pet mice based on different phenotypes, such as colouring or 'dancing' mice, in Japan and China (M. musculus domesticus) and Europe (M. musculus musculus) (14). In addition to studying the genetics of different strains, mice have been used both with knock-out mutations and transgenes. In the current era of CRISPR editing (15), the analysis of mutations in mice has become even easier.
Rats are similar to mice as laboratory animals, but their larger size makes them more expensive to house, while their physiology is more similar to that of humans. The Brown Norway rat genome was sequenced and published in 2004 (16). Following this, the rat has been widely used to map complex traits by a combination of sequencing and mapping strategies. For instance, one study of outbred rats identified 355 quantitative trait loci for 122 phenotypes including anxiety, heart disease, and multiple sclerosis (17).
Dog genome
The dog, man's best friend, was the fifth mammal to be sequenced. At 2.4 Gb the dog genome is somewhat smaller than the human genome, owing to a lower amount of lineage-specific repeat sequence (334 Mb versus 609 Mb, respectively) (18). Previously, the coding-gene count in mammalian genomes had not been precisely determined, but by using conserved synteny between four mammals (human, mouse, rat, and dog) we could revise the number of mammalian genes to ∼20,000. This number varies slightly between species, primarily based on lineage-specific gene family expansions and contractions, but about 14,000 genes are 1:1:1 orthologs across human, mouse, and dog (18).
Much research has gone into trying to understand the early history of dogs and wolves. Different studies suggest different times and places for the domestication. While the question remains open, it would seem logical that wolves domesticated themselves in multiple places and at different times (10,000–40,000 years ago) (19). While studies are ongoing to try to understand the correlations and adaptations making dogs into dogs, one clear genetic result is the duplication of the amylase (AMY2B) gene (20), which has only one pair of copies in the wolf, while most dogs have many, roughly five, pairs of copies. This gene is important for the digestion of starch, and its expansion can be coupled to the changing diet, involving considerably more starch, in agrarian societies. The fact that dogs living in the Arctic do not have the additional AMY2B copies is likely due to their living on a meat-rich diet (21).
Monodelphis genome and vertebrate evolution
Following the sequencing of a number of useful placental mammals, we moved outside the placental mammals and sequenced the first marsupial, the opossum (Monodelphis domestica) (22). Perhaps the most intriguing finding when comparing the opossum to placental mammals was that innovations in protein-coding regions are relatively rare, while 20% of eutherian conserved non-coding elements (CNEs) are recent innovations. A substantial proportion of these eutherian-specific CNEs have arisen from sequences inserted by transposable elements (repeats), pointing to transposons as a major creative force in the evolution of mammalian gene regulation.
The chicken (red jungle fowl, Gallus gallus), an important food source, was the first bird to be sequenced, with the genome published in 2004 (23). The chicken genome is roughly 1 Gb in size, roughly one-third of the human genome, which correlates with a lower number of repeat elements and segmental duplications. In addition to 6 pairs of macrochromosomes and 1 pair of sex chromosomes (the female is the heterogametic sex), chickens also have 32 pairs of intermediate chromosomes or microchromosomes. Microchromosomes are small, GC-rich, and gene-rich. The analysis of the chicken genome in multiple populations has allowed the identification of loci for both morphological traits (24) and traits related to egg and meat production (25).
Anolis carolinensis was the first lizard to be sequenced (26). Lizards, like chickens, have substantial numbers of microchromosomes, and they both rely on eggs for reproduction. The evolution of the amniotic egg was one of the great evolutionary innovations in the history of life, freeing vertebrates from an obligatory connection to water, and thus permitting the conquest of terrestrial environments. A. carolinensis microchromosomes are highly syntenic with chicken microchromosomes; however, they do not exhibit the high GC and low repeat content that are characteristic of avian microchromosomes. Also, A. carolinensis mobile elements are very young and diverse, more so than in any other sequenced amniote genome, which perhaps has allowed the novel innovations underlying the rapid radiation of 400 lizard species. These species have radiated, often convergently, into a variety of ecological niches with attendant morphological adaptations, providing one of the best examples of adaptive radiation (27).
Despite the sequencing of a large number of land-living creatures, it was still a mystery how the first animal crawled onto land. Curiously enough, a rarely seen fish called the coelacanth (Latimeria chalumnae), living in the deep ocean, for example off the east coast of Africa, was reported to have four lobe-finned limbs similar to those of many land-living vertebrates. Based on material from a stranded coelacanth, we sequenced the coelacanth genome (28), together with the transcriptome of the lungfish. The lungfish also has four limbs, but as it has an extremely large genome (estimated at 40–100 Gb: http://www.genomesize.com), containing a lot of transposable elements (29), we could not afford to sequence it at that time. Careful analysis of the genes in coelacanth and lungfish showed that the lungfish was more closely related to land-living animals, supporting the primitive notion that both lungs and legs are a great advantage on land.
Overall, modification of gene-regulatory elements may underlie a significant proportion of phenotypic changes on animal lineages. To investigate the gain of regulatory elements throughout vertebrate evolution, we identified a genome-wide set of putative regulatory regions for five vertebrates, including human, and looked for signs of gains. In early vertebrate times regulatory gains occurred frequently near transcription factors and developmental genes, but this trend was then replaced by innovations near extra-cellular signalling genes, and finally, in the last 100 million years (during the mammalian radiation), innovations near post-translational protein modifiers (30). This suggests that the complexity of regulation and function of protein-coding genes has increased continuously (Figure 2).

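Vertebrate genome sequencing projects shed light on genome evolution, domestication, and adaptation. Many of the first vertebrate whole-genome projects represented model species (e.g. mouse and rat), but over time, additional resources representing natural model species have been added. Highlighted in this tree are some of the studies that have been undertaken, within and across lineages, to study the processes of natural adaptation (marked A; for example, stickleback adaptation to extreme aquatic environments), domestication (marked D; for instance, genetic signatures separating domestic dogs and wolves), and genome evolution (marked GE; for example, exaptation changes in a regulatory sequence function between human and Monodelphis). As well as indicating the genetic distances between representative vertebrate species, this tree also illustrates the time periods when novel regulatory innovations arose. In particular, regulatory elements near transcription factors (red box) and developmental genes (yellow box) evolved rapidly in early vertebrate history, followed by cell communication (green box) and protein modification (blue box) in the more recent past. As whole-genome sequencing becomes substantially cheaper and more accessible, the expansion of reference genomes within each clade is set to increase, with the publication of 200 mammals, 300 birds, and more than 100 fish expected by the close of 2020. Image adapted with permission from Meadows & Lindblad-Toh, Nature Reviews Genetics (63).
Stickleback, cichlids, and herring: good examples of environmental adaptations
Sticklebacks are small fish that were originally marine. They have colonised and adapted to thousands of streams and lakes formed since the last ice age in North America and Europe (31). Typical changes in the freshwater adaptations included body shape, length, depth, fin position, spine length, eye size, and armour plate number. An early study generating a genome sequence also involved the sequencing of twenty individuals from locations spanning both freshwater and saline environments globally (32). The study identified 90 genomic regions that consistently varied between fresh and salt water. We also noted the re-use of globally shared standing genetic variation, including chromosomal inversions, allowing for repeated evolution of distinct marine and freshwater sticklebacks.
A later study showed similar patterns of adaptation to salinity for the herring, a common food source in Scandinavia, whose range spans the brackish Baltic and the salty North Atlantic (33). The genome sequence, complemented with whole-genome sequencing of many populations, identified >500 regions related to adaptations to brackish water. Further analysis also identified >100 loci that varied between spring and autumn breeding populations (34). These studies also suggest that the adaptations can depend on both protein-coding and regulatory changes, and that haplotype blocks spanning multiple genes are selected, suggesting that multiple variants in a region might underlie genomic adaptations.
The tilapia, present in the Nile and again a common food source, was used as a backbone for the analysis of the adaptations present in hundreds of cichlid species in the African Lakes of Victoria, Malawi, and Tanganyika. Analysis of four fish from the Eastern lineage showed gene duplications, an abundance of non-coding element divergence, accelerated coding sequence evolution, expression divergence associated with transposable element insertions, and regulation by novel microRNAs compared to the tilapia and other teleost genomes (35). Later studies have also identified strong selection on colour schemes and morphology related to where in the lakes the different species live. Deeper sequencing of 5 Lake Malawi species followed by genotyping in a diverse collection of ∼160 species from across Africa identified ∼200 genic and non-genic SNPs varying across species. We observed segregating polymorphisms outside of the Malawi lineage for more than 50% of these loci, suggesting that river cichlids have transported polymorphisms between lakes (36).
Mammals help annotate the human genome
In 2008, sequencing was still relatively expensive, but the whole-genome shotgun approach enabled the use of low-coverage sequencing of mammalian genomes to better understand the human genome. In 2011, we published a paper including 2× sequencing (whole-genome shotgun sequencing where random sequences are generated so that each base in the genome is sequenced on average 2-fold) of 18 mammals added to the 11 existing (Sanger sequenced at 7×) mammals (37). This project allowed the detection of evolutionary constraint at the resolution of 12-bp elements, resulting in the identification of >3 million constrained elements, encompassing 4.2% of the genome. Protein-coding sequence covers only ∼1% of the genome, thus suggesting a flood of novel candidate regulatory elements. The data also allowed us to look at synonymous constraint elements where regulatory elements overlap coding sequence, constraint patterns in promoters, and accelerated regions in humans and primates, which are hallmarks of positive selection for human adaptations. Recent work has shown that human accelerated elements cover regulatory elements such as well-conserved enhancers for developmental genes (38).
All is not protein-coding genes
In addition to protein-coding genes (1%), other regulatory entities encompass at least three times as much space. These regions include non-coding RNA transcripts, such as thousands of long intergenic non-coding RNAs (lincRNAs) (39) and microRNAs (miRNAs) (40). LincRNAs are RNA molecules larger than 200 nucleotides and are more or less conserved across species, presumably varying in the strength of function. Still, they have a widespread role in gene regulation and other cellular processes including cell-cycle regulation, apoptosis, and establishment of cell identity (41). MiRNAs are short (20 to 24 nucleotides), non-coding RNA molecules composed of a single-stranded sequence. They predominantly act as negative regulators of gene expression (40), but are functionally involved in virtually all physiologic processes, including differentiation and proliferation, metabolism, hemostasis, apoptosis, and inflammation.
To better catalogue the regulatory landscape, the Encyclopedia of DNA Elements (ENCODE) project (https://www.encodeproject.org) was formed to map functional non-coding elements. Initially, ChIP-seq was performed to find the location of individual transcription factor binding sites in many tissues. Over time, this analysis has expanded to look at differential methylation and acetylation of genomic bases and their binding proteins. Recently, Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq) (42) has been developed to detect open chromatin. Moreover, the 3C technology has been developed into Hi-C (43), which can detect topologically associating domains (TADs), allowing scientists to infer what portions of the genome are within specific regulatory regions in specific tissues (44). In addition, the GTEx effort (https://www.gtexportal.org/home/) works to couple genome variation, such as single nucleotide polymorphisms (SNPs), to differences in gene expression.
More than 240 mammals for single base resolution constraint in mammalian genomes
As Illumina sequencing became affordable, we put together a project with the goal of detecting human single base evolutionary constraint using >240 mammalian genomes (45). Of these, 131 genomes were generated by us using DISCOVAR de novo (46), and these were combined with the 110 mammalian genomes available in NCBI in March 2017. As this data set is analysed it will enable: 1) construction of the largest eutherian nuclear genome phylogeny; 2) genotype–phenotype correlations across many mammalian species; 3) study of the evolution of genome structure; 4) reference genomes that can be utilised for species conservation; and, finally, 5) a detailed map of evolutionary constraint, which can be used with human genome-wide association study (GWAS) catalogues and other species data sets to investigate patterns of constraint in disease-associated regions in any of the 241 genomes. The data set will also make possible the study of accelerated regions under positive selection in any of the sequenced mammalian genomes.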
Mammals assist annotate the homo genome
In 2008, sequencing was still relatively expensive, but the whole-genome shotgun approach enabled the use of low-coverage genome sequencing of mammalian genomes to better sympathize the human genome. In 2011, we published a paper including 2× sequencing (whole-genome shotgun sequencing where random sequences are generated so that each base in the genome is sequenced on average 2-fold) of eighteen mammals added to the xi existing (Sanger sequenced at 7×) mammals (37). This project allowed the identification of evolutionary constraint of 12-bp elements, resulting in the identification of >3 million constraint elements, encompassing 4.ii% of the genome. The poly peptide-coding sequence just covers ∼i% of the genome, thus suggesting a flood of novel candidate regulatory elements. The data also allowed us to await at synonymous constraint elements where regulatory elements overlap coding sequence, constraint patterns in promoters, and accelerated regions in humans and primates—hallmarks of positive selection for human adaptations. Recent work has shown that human accelerated elements cover regulatory elements such as well conserved enhancers for developmental genes (38).
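To make the difference between 2× and 7× coverage concrete, the sketch below uses the classical idealisation that reads land uniformly at random, so the depth at any base is approximately Poisson-distributed with mean equal to the coverage. This model is an assumption added here for illustration and is not taken from the study itself; the function names are hypothetical, and real data deviate from the model because of GC bias and repeats.

```python
import math


def fraction_covered(coverage: float) -> float:
    """Expected fraction of bases hit by at least one read.

    Assumes read depth per base ~ Poisson(coverage), i.e. uniformly random reads.
    """
    return 1.0 - math.exp(-coverage)


def prob_depth_at_least(coverage: float, k: int) -> float:
    """Probability that a given base is sequenced at least `k` times."""
    below = sum(
        math.exp(-coverage) * coverage**i / math.factorial(i) for i in range(k)
    )
    return 1.0 - below


if __name__ == "__main__":
    for c in (2.0, 7.0):
        print(f"{c:.0f}x: covered = {fraction_covered(c):.4f}, "
              f"depth >= 2 = {prob_depth_at_least(c, 2):.4f}")
```

Under this model a 2× genome leaves roughly one base in seven unsequenced, which is tolerable when many species are aligned and constraint is estimated across all of them, since a base missing in one species is usually present in others.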
All is non poly peptide-coding genes
In addition to poly peptide-coding genes (1%) other regulatory entities encompass at to the lowest degree three times equally much infinite. These regions include non-coding RNA transcripts, such as thousands of long intergenic non-coding RNAs (lincRNAs) (39) and microRNAs (miRNAs) (xl). LincRNAs are RNA molecules larger than 200 nucleotides and are more or less conserved across species, presumably varying in the strength of role. Still, they have a widespread role in cistron regulation and other cellular processes including prison cell-cycle regulation, apoptosis, and establishment of cell identity (41). MiRNAs are brusk (20 to 24 nucleotides), non-coding RNA molecules composed of a single-stranded sequence. They predominantly act as negative regulators of gene expression (forty), but are functionally involved in virtually all physiologic processes, including differentiation and proliferation, metabolism, hemostasis, apoptosis, and inflammation.
To better catalogue the regulatory mural, the Encyclopaedia of Deoxyribonucleic acid Elements (ENCODE) projection (https://www.encodeproject.org) was formed to map functional non-coding elements. Initially, ChIP-seq was performed to find the location of private transcription factor binding sites in many tissues. Over time, this assay has expanded to look at differential methylation and acetylation of genomic bases and their bounden proteins. Recently, Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq) (42) has been adult to detect open chromatin. Moreover, the three C technology has been adult into HiC (43), which can observe topologically associating domains (TADs) allowing scientists to infer what portions of the genome are within specific regulatory regions in specific tissues (44). In addition, the GTEx effort (https://www.gtexportal.org/home/) works to couple genome variation, such as single nucleotide polymorphisms (SNPs), to differences in cistron expression.
More than 240 mammals for unmarried base resolution constraint in mammalian genomes
Equally Illumina sequencing became affordable, we put together a project with the goal of detecting human single base evolutionary constraint using >240 mammalian genomes (45). Of these, 131 genomes were generated by u.s. using DISCOVAR-de novo (46), combined with the 110 mammalian genomes in NCBI in March 2017. Every bit this data prepare is analysed it will permit the study of: 1) the largest Eutherian nuclear genome phylogeny; 2) the ability to perform genotype–phenotype correlations across many mammalian species; 3) the evolution of genome structure; 4) reference genomes that tin be utilised for species conservation; and, finally, 5) a detailed map of evolutionary constraint, which can exist used with human being genome-broad clan (GWAS) catalogues and other species data sets to investigate patterns of constraint in disease-associated regions in any of the 241 genomes. The data ready will also make possible the report of accelerated regions under positive selection in whatsoever of the sequenced mammalian genomes.
Comparative genetics
Complexity of human disease genetics
After the generation of the human genome, a great deal of effort went into detecting variation in the human genome. SNPs, indels, and larger structural variations were discovered, and SNPs were put into genotyping panels to allow for genome-wide association studies (GWAS) to map disease loci. The initial thoughts were that common diseases were caused by common variants. Although tens of thousands of patients and controls have been genotyped for many complex traits, the loci identified have only accounted for a fraction of the heritability of the disease. As an example, a study of 36,989 cases with schizophrenia and 113,075 controls identified 108 genome-wide significantly associated loci (47). These loci are estimated to explain 30–50% of the heritability for schizophrenia. There could be multiple reasons why only a smaller portion of the heritability has been detected, such as the need for larger sample sizes, environmental factors, epigenetics, or a bigger proportion of rare variants. For some diseases such as autism, where individuals with the disease reproduce less frequently, the fraction of novel mutations or rare variants of high effect may be larger (48). To find individual rare variants, much larger sample sizes are needed; however, burden testing of specific gene regions or pathways will enable the summing of multiple mutations. While currently the gene plus a unified flanking region is included in the analysis, TADs and GTEx data should be helpful in performing burden tests on the regions that are distinctly important to each gene.
The dog as a model for human disease
The strongest reason for sequencing the canine genome was to harness the genetics related to the enormous diversity among breeds (49). Pet dogs are special because they share our environment, and also share the same diseases as humans, including autoimmune disease, neurological disease, cardiovascular disease, and cancer. In fact, roughly 30% of dogs get cancer, which is similar to the frequency observed in man (50). On top of this, there has been very strong selection for morphological traits and behaviour, suggesting that rare variants with strong effect may have become more common in certain breeds, leading to disease. The bottlenecks at breed creation may also have allowed drift to make some alleles much more common in some breeds. The recent creation of breeds (in the last 200 years) also means that haplotype blocks are long within breeds and short across breeds. This allows for a mapping strategy where high-risk breeds are first used to find the rough location of a disease mutation (by case-control GWAS), and then other breeds are added in to fine-map the region to find the functional mutation (Figure 3) (18). Based on this, monogenic traits can be mapped with ∼20 cases and 20 controls, while complex traits, such as cancers, can be mapped with a few hundred cases and controls. In humans, many thousands of patients and controls are needed. Loci for many diseases, such as osteosarcoma (51), canine systemic lupus erythematosus (52), and canine compulsive disorder (CCD), have now been identified and, in several cases, been translated to the corresponding human disease.
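The idea of summing multiple rare mutations in a burden test can be sketched in a few lines. The example below is one simple variant of the approach, a carrier-collapsing test followed by a one-sided Fisher's exact test; it is an illustration under stated assumptions, not the specific method used in the studies cited here. The input format (one list of rare-allele counts per individual for the variants in a gene region) and the function names are hypothetical.

```python
from math import comb


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """P(X >= a) for the 2x2 table [[a, b], [c, d]] with fixed margins.

    Rows: cases, controls. Columns: carriers, non-carriers.
    X follows a hypergeometric distribution under the null hypothesis.
    """
    n_cases, n_carriers, n = a + b, a + c, a + b + c + d
    upper = min(n_cases, n_carriers)
    numerator = sum(
        comb(n_cases, x) * comb(n - n_cases, n_carriers - x)
        for x in range(a, upper + 1)
    )
    return numerator / comb(n, n_carriers)


def burden_test(case_genotypes, control_genotypes) -> float:
    """Collapse rare variants in one region into carrier / non-carrier status.

    Each argument is a list with one entry per individual; an entry is a list
    of rare-allele counts (0, 1 or 2) for the variants in the region.
    Returns the one-sided p-value for an excess of carriers among cases.
    """
    case_carriers = sum(1 for g in case_genotypes if any(g))
    control_carriers = sum(1 for g in control_genotypes if any(g))
    return fisher_one_sided(
        case_carriers,
        len(case_genotypes) - case_carriers,
        control_carriers,
        len(control_genotypes) - control_carriers,
    )


if __name__ == "__main__":
    cases = [[1, 0, 0], [0, 0, 1], [0, 1, 0], [0, 0, 0]]
    controls = [[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]]
    print(burden_test(cases, controls))  # 3 of 4 cases carry a different rare variant
```

No single variant in the toy data is seen more than once, so a per-variant test has nothing to work with, whereas the collapsed count does. Choosing which bases count as "the region" is where TAD and expression data could replace a fixed flanking window.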
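The first step of the mapping strategy, a case-control comparison at each SNP within one breed, can be sketched as a simple allelic chi-square test. This is a minimal illustration, assuming biallelic SNPs, unrelated individuals, and non-zero row and column totals; real canine GWAS typically use mixed models to correct for relatedness within breeds. The function name and the allele counts in the example are invented for illustration.

```python
import math


def allelic_chi2(case_alt: int, case_ref: int, ctrl_alt: int, ctrl_ref: int):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Returns (chi2, p_value). Assumes every row and column total is non-zero.
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Survival function of a chi-square with 1 d.f.: P(Z^2 > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical SNP typed in 20 cases and 20 controls (40 alleles per group)
    chi2, p = allelic_chi2(case_alt=30, case_ref=10, ctrl_alt=10, ctrl_ref=30)
    print(f"chi2 = {chi2:.1f}, p = {p:.2e}")
```

Because long within-breed haplotype blocks mean far fewer independent markers need to be tested, the multiple-testing threshold is less stringent than in human GWAS, which is part of why such small cohorts can suffice for strong-effect alleles.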

Genome-wide association (GWAS) is easier in dogs than in humans. Monogenic traits in dogs can be mapped with fewer SNP markers and fewer individuals than in humans. GWAS in dogs utilises the long linkage disequilibrium (LD) within dog breeds, followed by fine-mapping in multiple breeds with the same phenotype (panel a). In humans the LD is short, requiring the use of many SNP markers already in the GWAS step (panel b). The number of SNP markers required for different types of traits in dogs is lower, as is the number of loci contributing to each trait in dogs (panel c), while in humans most traits are more complex and require more samples (panel d).
Obsessive-compulsive disorder (OCD) shares a common aetiology between dogs and humans
CCD shows strong clinical similarities with human OCD; both species perform certain normal behaviours in excess and often repetitively. To investigate the genetic causes of CCD, we first performed GWAS in 92 cases and 68 controls of the Doberman Pinscher breed, identifying a single genome-wide significant locus (53). This locus was near the cadherin 2 (CDH2) gene, for which the protein is located in the synapse. Secondly, careful reanalysis of the data identified multiple regions of suggestive association as well as regions of fixation in the Doberman Pinscher breed. Thirdly, we performed targeted resequencing of all these regions and identified a number of genes with increased numbers of mutations in cases versus controls (54). Many of these genes were active in the synapse. Finally, we used the genes and pathways found in dogs, combined them with known functionally important OCD genes from humans and mice, and used the combined gene set to perform targeted sequencing of human OCD cases and controls (55). Altogether, we analysed 608 genes in 592 cases and 560 controls and identified four genes as strongly associated (one at genome-wide significance). Two of these genes, NRXN1 and HTR2A, were enriched for protein-coding mutations in cases, while two genes, CTTNBP2 (synapse maintenance) and REEP3 (vesicle trafficking), had only regulatory mutations in this study. This might suggest that these two proteins with regulatory mutations have such critical functions that they cannot tolerate coding mutations. Larger GWAS studies are now being performed in humans, and it will be interesting to see if the link to CCD will be further strengthened.
Taking the next step
Sequencing technologies change the way we can analyse most species on earth
Taking the side by side step
Sequencing technologies change the way we can analyse most species on globe
As the cost of generating a high-quality genome with long-read sequencing technologies is finally coming down (and population data can be generated cheaply with short reads), the ability to generate genomes from many species changes dramatically. Nevertheless, one of the remaining challenges is access to high-quality DNA samples, which are necessary for generating a reference genome with long-read sequencing, and also to samples from a sufficient number of individuals to allow the generation of population data from different regions of the world. Multiple zoos (e.g. the San Diego Frozen Zoo) have collected samples and cell lines which are potentially useful for generating genome sequences or studying variation, but also potentially for in vitro reproduction of endangered species. One such case is the Northern White Rhino, which is close to extinction, with only two individuals still alive and neither of them able to carry a pregnancy. Attempts will now be made for a Southern White Rhino to be a surrogate mother (56).
About a year ago the Earth BioGenome Project (https://www.earthbiogenome.org) (57) started with the enormous aim of generating high-quality genomes for each of 10–15 million eukaryotic species on Earth in the next 10 years. To achieve this, almost every step of the process needs to be scaled up: sample collection, sequencing and assembly, annotation, and standardised analysis, as well as species-specific analysis. On top of this comes the generation and analysis of population data and transcriptomics for annotation of genomes. Although this is still a moonshot, we are getting closer. Importantly, to save the diversity of life on Earth, sequencing must be combined with more practical conservation efforts such as protection of habitats and prevention of poaching.
Mammalian constraint and its use for understanding disease in many mammals
As mentioned before, most GWAS loci fall outside protein-coding regions of the genome, thus requiring various ways of annotating single variants for the likelihood that they are a causative mutation for disease. Several techniques are being developed to address this crucial issue. The 200-mammals data reach single base constraint for all bases in the human genome, allowing a triage of each position's likelihood of being functional, irrespective of the tissues in which the affected gene is active. As novel methods such as massively parallel reporter assays and genome editing at large scale (58) become possible, they will allow the comparison of overall evolutionary functional constraint with functionality in individual tissues.
To add to this, additional annotations of all types of transcripts in many healthy and diseased tissues, as well as many types of functional annotations (both experimental, like ChIP-seq, Regulome, and Hi-C, and bioinformatic, like Hidden Markov Models [HMMs] (59) and machine learning methodologies), will aid our understanding of how normal and diseased tissues are affected by each gene or mutation. To more clearly understand the tissue-specific effects, single cell sequencing (60) has also become more frequent and can decipher cells in, for example, the immune system and brain, where a large number of different cell types live in close proximity and symbiosis. Cultured organoids and spatial transcriptomics (61) in key tissues allow further dissection of transcriptomics and functional effects.
Utilising the canine model for clinical trials
The clinical similarity between disease in dogs and humans has been studied for many years. Based on Multiple GWAS data sets, neoplasm mutations and expression data accept shown also a molecular similarity between dogs and humans in many diseases. Based on the shorter life-span in dogs compared to humans, the consequence from clinical trials is probable to be informative more quickly. Mouse models on the other manus, may requite rapid results, but are frequently induced and are rarely spontaneous. Already, trials for canine ALS (https://vhc.missouri.edu/small-animal-infirmary/neurology-neurosurgery/electric current-clinical-trials/) and multiple cancers are underway (https://trials.vet.tufts.edu/clinical-trials/?fwp_species=dog&fwp_veterinary_specialties=oncology). Also, the analysis of cell costless Deoxyribonucleic acid (cfDNA) and circulating neoplasm DNA (ctDNA) for monitoring disease progression in liquid biopsies (62) in the canis familiaris should exist informative.
Conclusion
During the past two decades, the understanding of vertebrate evolution as well as of the human genome, and consequently human disease, has expanded at an exceptional pace. We have increased our understanding of evolutionary principles and the content of the human genome. Loci associated with a specific human disease can number in the hundreds, detected with tens of thousands of individuals, while still explaining only a fraction of the disease risk. Thus, we have so far only scratched the surface when it comes to understanding human disease.
The next decade is likely to generate exponentially more data to help protect endangered species by having both a reference genome and population data. It will also increase the understanding of the human genome, including non-coding mutations and rare variants. This will require both an understanding of every base in the human genome and large sample sizes to fully map human disease. Increasing use of pet dogs for disease gene identification as well as for clinical trials is likely to help propel the biological understanding into canine and human personalised medicine.
Biography
Professor Lindblad-Toh received her PhD in human genetics at the Karolinska Institute. After a short postdoc at the Whitehead Institute/MIT Genome Center, she went on to lead many vertebrate genome projects, as well as canine disease projects. She was one of the founders of SciLifeLab. She is currently a Professor at Uppsala University and the Scientific Director of Vertebrate Genomics at the Broad Institute.
Funding Statement
Financial support has been received from the Knut and Alice Wallenberg Foundation, NIH, Cancerfonden, and FORMAS. The author is a Distinguished Professor at the Swedish Research Council.
Acknowledgements
I am indebted to all the people in my groups in Sweden and the United States, as well as a large number of collaborators around the globe for their hard work on these projects. This science would not have been possible without you! I thank Elisabeth Sundström, Ann-Catherine Lindblad, and Sue Wincent-Dodd for useful commentary and proof-reading of the manuscript and Kai Siang Toh for assistance with illustrations.
Disclosure statement
No potential conflict of interest was reported by the author(s).
References
1. Riordan JR, Rommens JM, Kerem B, Alon N, Rozmahel R, Grzelczak Z, et al. Identification of the cystic fibrosis gene: cloning and characterization of complementary DNA. Science 1989;245:1066–73.
2. A novel gene containing a trinucleotide repeat that is expanded and unstable on Huntington's disease chromosomes. The Huntington's Disease Collaborative Research Group. Cell 1993;72:971–83.
3. Patterson D. Molecular genetic analysis of Down syndrome. Hum Genet. 2009;126:195–214.
4. Knudson AG Jr. Mutation and cancer: statistical study of retinoblastoma. Proc Natl Acad Sci USA. 1971;68:820–3.
5. Sanger F, Coulson AR. A rapid method for determining sequences in DNA by primed synthesis with DNA polymerase. J Mol Biol. 1975;94:441–8.
6. Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, et al. Initial sequencing and analysis of the human genome. Nature 2001;409:860–921.
7. Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, et al. The sequence of the human genome. Science 2001;291:1304–51.
8. Rogers J. The finished genome sequence of Homo sapiens. Cold Spring Harb Symp Quant Biol. 2003;68:1–11.
9. Vollger MR, Logsdon GA, Audano PA, Sulovari A, Porubsky D, Peluso P, et al. Improved assembly and variant detection of a haploid human genome using single-molecule, high-fidelity long reads. Ann Hum Genet 2019. [Epub ahead of print].
10. Miga KH. Completing the human genome: the progress and challenge of satellite DNA assembly. Chromosome Res. 2015;23:421–6.
12. Mouse Genome Sequencing Consortium, Waterston RH, Lindblad-Toh K, Birney E, Rogers J, Abril JF, et al. Initial sequencing and comparative analysis of the mouse genome. Nature. 2002;420:520–62.
13. Wade CM, Kulbokas EJ 3rd, Kirby AW, Zody MC, Mullikin JC, Lander ES, et al. The mosaic structure of variation in the laboratory mouse genome. Nature. 2002;420:574–8.
14. Beck JA, Lloyd S, Hafezparast M, Lennon-Pierce M, Eppig JT, Festing MF, et al. Genealogies of mouse inbred strains. Nat Genet. 2000;24:23–5.
15. Pickar-Oliver A, Gersbach CA. The next generation of CRISPR-Cas technologies and applications. Nat Rev Mol Cell Biol. 2019;20:490–507.
16. Gibbs RA, Weinstock GM, Metzker ML, Muzny DM, Sodergren EJ, Scherer S, et al. Genome sequence of the Brown Norway rat yields insights into mammalian evolution. Nature. 2004;428:493–521.
17. Rat Genome Sequencing and Mapping Consortium, Baud A, Hermsen R, Guryev V, Stridh P, Graham D, et al. Combined sequence-based and genetic mapping analysis of complex traits in outbred rats. Nat Genet. 2013;45:767–75.
18. Lindblad-Toh K, Wade CM, Mikkelsen TS, Karlsson EK, Jaffe DB, Kamal M, et al. Genome sequence, comparative analysis and haplotype structure of the domestic dog. Nature. 2005;438:803–19.
19. Larson G, Karlsson EK, Perri A, Webster MT, Ho SY, Peters J, et al. Rethinking dog domestication by integrating genetics, archeology, and biogeography. Proc Natl Acad Sci USA. 2012;109:8878–83.
20. Axelsson E, Ratnakumar A, Arendt ML, Maqbool K, Webster MT, Perloski M, et al. The genomic signature of dog domestication reveals adaptation to a starch-rich diet. Nature. 2013;495:360–4.
21. Arendt M, Cairns KM, Ballard JW, Savolainen P, Axelsson E. Diet adaptation in dog reflects spread of prehistoric agriculture. Heredity (Edinb). 2016;117:301–6.
22. Mikkelsen TS, Wakefield MJ, Aken B, Amemiya CT, Chang JL, Duke S, et al. Genome of the marsupial Monodelphis domestica reveals innovation in non-coding sequences. Nature. 2007;447:167–77.
23. International Chicken Genome Sequencing Consortium. Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution. Nature. 2004;432:695–716.
25. Qanbari S, Rubin CJ, Maqbool K, Weigend S, Weigend A, Geibel J, et al. Genetics of adaptation in modern chicken. PLoS Genet. 2019;15:e1007989.
26. Alfoldi J, Di Palma F, Grabherr M, Williams C, Kong L, Mauceli E, et al. The genome of the green anole lizard and a comparative analysis with birds and mammals. Nature. 2011;477:587–91.
27. Vidal N, Hedges SB. The molecular evolutionary tree of lizards, snakes, and amphisbaenians. C R Biol. 2009;332:129–39.
28. Amemiya CT, Alföldi J, Lee AP, Fan S, Philippe H, Maccallum I, et al. The African coelacanth genome provides insights into tetrapod evolution. Nature. 2013;496:311–6.
29. Metcalfe CJ, Casane D. Accommodating the load: the transposable element content of very large genomes. Mob Genet Elements. 2013;3:e24775.
30. Lowe CB, Kellis M, Siepel A, Raney BJ, Clamp M, Salama SR, et al. Three periods of regulatory innovation during vertebrate evolution. Science. 2011;333:1019–24.
31. Colosimo PF. Widespread parallel evolution in sticklebacks by repeated fixation of Ectodysplasin alleles. Science. 2005;307:1928–33.
32. Jones FC, Grabherr MG, Chan YF, Russell P, Mauceli E, Johnson J, et al. The genomic basis of adaptive evolution in threespine sticklebacks. Nature. 2012;484:55–61.
33. Martinez Barrio A, Lamichhaney S, Fan G, Rafati N, Pettersson M, Zhang H, et al. The genetic basis for ecological adaptation of the Atlantic herring revealed by genome sequencing. Elife. 2016;5:e12081.
34. Lamichhaney S, Fuentes-Pardo AP, Rafati N, Ryman N, McCracken GR, Bourne C, et al. Parallel adaptive evolution of geographically distant herring populations on both sides of the North Atlantic Ocean. Proc Natl Acad Sci USA. 2017;114:E3452–61.
35. Brawand D, Wagner CE, Li YI, Malinsky M, Keller I, Fan S, et al. The genomic substrate for adaptive radiation in African cichlid fish. Nature. 2014;513:375–81.
36. Loh YH, Bezault E, Muenzel FM, Roberts RB, Swofford R, Barluenga M, et al. Origins of shared genetic variation in African cichlids. Mol Biol Evol. 2013;30:906–17.
37. Lindblad-Toh K, Garber M, Zuk O, Lin MF, Parker BJ, Washietl S, et al. A high-resolution map of human evolutionary constraint using 29 mammals. Nature. 2011;478:476–82.
38. Capra JA, Erwin GD, McKinsey G, Rubenstein JL, Pollard KS. Many human accelerated regions are developmental enhancers. Phil Trans R Soc B. 2013;368:20130025.
40. Neudecker V, Brodsky KS, Kreth S, Ginde AA, Eltzschig HK. Emerging roles for microRNAs in perioperative medicine. Anesthesiology. 2016;124:489–506.
41. Marques AC, Ponting CP. Catalogues of mammalian long noncoding RNAs: modest conservation and incompleteness. Genome Biol. 2009;10:R124.
42. Buenrostro JD, Giresi PG, Zaba LC, Chang HY, Greenleaf WJ. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat Methods. 2013;10:1213–8.
43. Nagano T, Lubling Y, Stevens TJ, Schoenfelder S, Yaffe E, Dean W, et al. Single-cell Hi-C reveals cell-to-cell variability in chromosome structure. Nature. 2013;502:59–64.
44. Sauerwald N, Singhal A, Kingsford C. Analysis of the structural variability of topologically associated domains as revealed by Hi-C. NAR Genom Bioinform. 2020;2.
45. Genereux DP, Serres A, Armstrong J, Johnson J, Marinescu VD. A comparative genomics multitool for scientific discovery and conservation. Nature. 2020; forthcoming.
46. Love RR, Weisenfeld NI, Jaffe DB, Besansky NJ, Neafsey DE. Evaluation of DISCOVAR de novo using a mosquito sample for cost-effective short-read genome assembly. BMC Genomics. 2016;17:187.
47. Schizophrenia Working Group of the Psychiatric Genomics Consortium, Ripke S, Neale BM, Corvin A, Walters JTR, Farh K-H. Biological insights from 108 schizophrenia-associated genetic loci. Nature. 2014;511:421–7.
48. Sanders SJ, He X, Willsey AJ, Ercan-Sencicek AG, Samocha KE, Cicek AE, et al. Insights into autism spectrum disorder genomic architecture and biology from 71 risk loci. Neuron. 2015;87:1215–33.
49. Parker HG, Ostrander EA. Canine genomics and genetics: running with the pack. PLoS Genet. 2005;1:e58.
50. Pang LY, Argyle DJ. Veterinary oncology: biology, big data and precision medicine. Vet J. 2016;213:38–45.
51. Karlsson EK, Sigurdsson S, Ivansson E, Thomas R, Elvers I, Wright J, et al. Genome-wide analyses implicate 33 loci in heritable dog osteosarcoma, including regulatory variants near CDKN2A/B. Genome Biol. 2013;14:R132.
52. Wilbe M, Jokinen P, Truvé K, Seppala EH, Karlsson EK, Biagi T, et al. Genome-wide association mapping identifies multiple loci for a canine SLE-related disease complex. Nat Genet. 2010;42:250–4.
53. Dodman NH, Karlsson EK, Moon-Fanelli A, Galdzicka M, Perloski M, Shuster L, et al. A canine chromosome 7 locus confers compulsive disorder susceptibility. Mol Psychiatry. 2010;15:8–10.
54. Tang R, Noh HJ, Wang D, Sigurdsson S, Swofford R, Perloski M, et al. Candidate genes and functional noncoding variants identified in a canine model of obsessive-compulsive disorder. Genome Biol. 2014;15:R25.
55. Noh HJ, Tang R, Flannick J, O'Dushlaine C, Swofford R, Howrigan D, et al. Integrating evolutionary and regulatory information with a multispecies approach implicates genes and pathways in obsessive-compulsive disorder. Nat Commun. 2017;8:774.
56. Tunstall T, Kock R, Vahala J, Diekhans M, Fiddes I, Armstrong J, et al. Evaluating recovery potential of the northern white rhinoceros from cryopreserved somatic cells. Genome Res. 2018;28:780–8.
57. Lewin HA, Robinson GE, Kress WJ, Baker WJ, Coddington J, Crandall KA, et al. Earth BioGenome Project: sequencing life for the future of life. Proc Natl Acad Sci USA. 2018;115:4325–33.
58. Kircher M, Xiong C, Martin B, Schubach M, Inoue F, Bell RJA, et al. Saturation mutagenesis of twenty disease-associated regulatory elements at single base-pair resolution. Nat Commun. 2019;10:3583.
59. Li N, Stephens M. Modeling linkage disequilibrium and identifying recombination hotspots using single-nucleotide polymorphism data. Genetics. 2003;165:2213–33.
60. Shalek AK, Satija R, Adiconis X, Gertner RS, Gaublomme JT, Raychowdhury R, et al. Single-cell transcriptomics reveals bimodality in expression and splicing in immune cells. Nature. 2013;498:236–40.
61. Stahl PL, Salmén F, Vickovic S, Lundmark A, Navarro JF, Magnusson J, et al. Visualization and analysis of gene expression in tissue sections by spatial transcriptomics. Science. 2016;353:78–82.
62. Crowley E, Di Nicolantonio F, Loupakis F, Bardelli A. Liquid biopsy: monitoring cancer-genetics in the blood. Nat Rev Clin Oncol. 2013;10:472–84.
63. Meadows JRS, Lindblad-Toh K. Dissecting evolution and disease using comparative vertebrate genomics. Nat Rev Genet. 2017;18:624–36.
Articles from Upsala Journal of Medical Sciences are provided here courtesy of Upsala Medical Society
Source: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7054949/