Abstract
The mouse (Mus musculus) is the premier animal model for understanding human disease and development. Here we show that a comprehensive understanding of mouse biology is only possible with the availability of a finished, high-quality genome assembly. The finished clone-based assembly of the mouse strain C57BL/6J reported here has over 175,000 fewer gaps and over 139 Mb more of novel sequence, compared with the earlier MGSCv3 draft genome assembly. In a comprehensive analysis of this revised genome sequence, we are now able to define 20,210 protein-coding genes, over a thousand more than predicted in the human genome (19,042 genes). In addition, we identified 439 long, non-protein-coding RNAs with evidence for transcribed orthologs in human. We analyzed the complex and repetitive landscape of 267 Mb of sequence that was missing or misassembled in the previously published assembly, and we provide insights into the reasons for its resistance to sequencing and assembly by whole-genome shotgun approaches. Duplicated regions within newly assembled sequence tend to be of more recent ancestry than duplicates in the published draft, correcting our initial understanding of recent evolution on the mouse lineage. These duplicates appear to be largely composed of sequence regions containing transposable elements and duplicated protein-coding genes; of these, some may be fixed in the mouse population, but at least 40% of segmentally duplicated sequences are copy number variable even among laboratory mouse strains. Mouse lineage-specific regions contain 3,767 genes drawn mainly from rapidly changing gene families associated with reproductive functions. The finished mouse genome assembly, therefore, greatly improves our understanding of rodent-specific biology and allows the delineation of ancestral biological functions that are shared with human from derived functions that are not.
Original language | English |
---|---|
Article number | e1000112 |
Journal | PLoS Biology |
Volume | 7 |
Issue number | 5 |
DOIs | 10.1371/journal.pbio.1000112 |
State | Published - May 2009 |
Lineage-specific biology revealed by a finished genome assembly of the mouse. / Church, Deanna M.; Goodstadt, Leo; Hillier, Ladeana W.; Zody, Michael C.; Goldstein, Steve; She, Xinwe; Bult, Carol J.; Agarwala, Richa; Cherry, Joshua L.; DiCuccio, Michael; Hlavina, Wratko; Kapustin, Yuri; Meric, Peter; Maglott, Donna; Birtle, Zoë; Marques, Ana C.; Graves, Tina; Zhou, Shiguo; Teague, Brian; Potamousis, Konstantinos; Churas, Christopher; Place, Michael; Herschleb, Jill; Runnheim, Ron; Forrest, Daniel; Amos-Landgraf, James; Schwartz, David C.; Cheng, Ze; Lindblad-Toh, Kerstin; Eichler, Evan E.; Ponting, Chris P.; Muzny, Donna M.; Dugan-Rocha, Shannon; Ding, Yan; Scherer, Steven E.; Buhay, Christian J.; Cree, Andrew; Hernandez, Judith; Holder, Michael; Hume, Jennifer; Jackson, Laronda R.; Kovar, Christie; Lee, Sandra L.; Lewis, Lora R.; Metzker, Michael L.; Narareth, Lynne V.; Sabo, Aniko; Sodergren, Erica; Gibbs, Richard A.; FitzGerald, Michael; Cook, April; Jaffe, David B.; Garber, Manuel; Zimmer, Andrew R.; Pirun, Mono; Russell, Lyndsey; Sharpe, Ted; Chaturvedi, Michael Kamal Kabir; Wilkinson, Jane; LaButti, Kurt; Yang, Xiaoping; Bessette, Daniel; Allen, Nicole R.; Nguyen, Cindy; Nguyen, Thu; Dunbar, Chelsea; Lubonja, Rakela; Matthews, Charles; Liu, Xiaohong; Benamara, Mostafa; Negash, Tamrat; Lokyitsang, Tashi; Decktor, Karin; Piqani, Bruno; Munson, Glen; Tenzin, Pema; Stone, Sabrina; Macdonald, Pendexter; Arachchi, Harindra; Abouelleil, Amr; Lui, Annie; Priest, Margaret; Gearin, Gary; Brown, Adam; Aftuck, Lynne; Shea, Terrance; Sykes, Sean; Berlin, Aaron; Chu, Jeff; Dooley, Kathleen; Hagopian, Daniel; Hall, Jennifer; Hafez, Nabil; Smith, Cherylyn L.; Olandt, Peter; Miller, Karen; Ventkataraman, Vijay; Rachupka, Anthony; Dorris, Lester; Ayotte, Laura; Mabbitt, Richard; Erickson, Jeffrey; Horn, Andrea; An, Peter; Naylor, Jerome W.; Settipalli, Sampath; Lander, Eric S.; Wilson, Richard K.; Graves, Tina A.; Fulton, Robert S.; Rock, Susan M.; Chinwalla, Asif T.; Bernard, 
Kelly; Courtney, Laura P.; Fronick, Catrina; Fulton, Lucinda L.; O'Laughlin, Michelle; Kremitzki, Colin L.; Minx, Patrick J.; Nelson, Joanne O.; Schatzkamer, Kyriena L.; Strong, Cynthia; Wollam, Aye M.; Weinstock, George M.; Yang, Shiaw Pyng; Rogers, Jane; Grafham, Darren; Humphray, Sean; Nicholson, Christine; Bird, Christine; Brown, Andrew J.; Burton, John; Clee, Chris; Hunt, Adrienne; Jones, Matt C.; Lloyd, Christine; Matthews, Lucy; Mclaren, Karen; Mclaren, Stuart; McLay, Kirsten; Palmer, Sophie A.; Plumb, Robert; Shownkeen, Ratna; Sims, Sarah; Quail, Mike A.; Whitehead, Siobhan L.; Willey, David L.; Deschamps, Stephane; Kenton, Steven; Song, Lin; Do, Trang; Roe, Bruce; Bouffard, Gerard G.; Blakesley, Robert W.; Green, Eric D.; Kucherlapati, Raju; Grills, George; Li, Li; Montgomery, Kate T.; Kramer, Melissa; Speigel, Lori; McCombie, W. Richard; Lucas, Susan; Terry, Astrid; Gordon, Laurie; Stubbs, Lisa; Denny, Paul; Brown, Steve D.M.; Mallon, Anne Marie; Campbell, R. Duncan; Botherby, Marc R.M.; Jackson, Ian J.; Rubenfield, Marc J.; Rogosin, Andrea M.; Smith, Douglas R.
In: PLoS Biology, Vol. 7, No. 5, e1000112, 05.2009.

Research output: Contribution to journal › Article › peer-review
TY - JOUR
T1 - Lineage-specific biology revealed by a finished genome assembly of the mouse
AU - Church, Deanna M.
AU - Goodstadt, Leo
AU - Hillier, Ladeana W.
AU - Zody, Michael C.
AU - Goldstein, Steve
AU - She, Xinwe
AU - Bult, Carol J.
AU - Agarwala, Richa
AU - Cherry, Joshua L.
AU - DiCuccio, Michael
AU - Hlavina, Wratko
AU - Kapustin, Yuri
AU - Meric, Peter
AU - Maglott, Donna
AU - Birtle, Zoë
AU - Marques, Ana C.
AU - Graves, Tina
AU - Zhou, Shiguo
AU - Teague, Brian
AU - Potamousis, Konstantinos
AU - Churas, Christopher
AU - Place, Michael
AU - Herschleb, Jill
AU - Runnheim, Ron
AU - Forrest, Daniel
AU - Amos-Landgraf, James
AU - Schwartz, David C.
AU - Cheng, Ze
AU - Lindblad-Toh, Kerstin
AU - Eichler, Evan E.
AU - Ponting, Chris P.
AU - Muzny, Donna M.
AU - Dugan-Rocha, Shannon
AU - Ding, Yan
AU - Scherer, Steven E.
AU - Buhay, Christian J.
AU - Cree, Andrew
AU - Hernandez, Judith
AU - Holder, Michael
AU - Hume, Jennifer
AU - Jackson, Laronda R.
AU - Kovar, Christie
AU - Lee, Sandra L.
AU - Lewis, Lora R.
AU - Metzker, Michael L.
AU - Narareth, Lynne V.
AU - Sabo, Aniko
AU - Sodergren, Erica
AU - Gibbs, Richard A.
AU - FitzGerald, Michael
AU - Cook, April
AU - Jaffe, David B.
AU - Garber, Manuel
AU - Zimmer, Andrew R.
AU - Pirun, Mono
AU - Russell, Lyndsey
AU - Sharpe, Ted
AU - Chaturvedi, Michael Kamal Kabir
AU - Wilkinson, Jane
AU - LaButti, Kurt
AU - Yang, Xiaoping
AU - Bessette, Daniel
AU - Allen, Nicole R.
AU - Nguyen, Cindy
AU - Nguyen, Thu
AU - Dunbar, Chelsea
AU - Lubonja, Rakela
AU - Matthews, Charles
AU - Liu, Xiaohong
AU - Benamara, Mostafa
AU - Negash, Tamrat
AU - Lokyitsang, Tashi
AU - Decktor, Karin
AU - Piqani, Bruno
AU - Munson, Glen
AU - Tenzin, Pema
AU - Stone, Sabrina
AU - Macdonald, Pendexter
AU - Arachchi, Harindra
AU - Abouelleil, Amr
AU - Lui, Annie
AU - Priest, Margaret
AU - Gearin, Gary
AU - Brown, Adam
AU - Aftuck, Lynne
AU - Shea, Terrance
AU - Sykes, Sean
AU - Berlin, Aaron
AU - Chu, Jeff
AU - Dooley, Kathleen
AU - Hagopian, Daniel
AU - Hall, Jennifer
AU - Hafez, Nabil
AU - Smith, Cherylyn L.
AU - Olandt, Peter
AU - Miller, Karen
AU - Ventkataraman, Vijay
AU - Rachupka, Anthony
AU - Dorris, Lester
AU - Ayotte, Laura
AU - Mabbitt, Richard
AU - Erickson, Jeffrey
AU - Horn, Andrea
AU - An, Peter
AU - Naylor, Jerome W.
AU - Settipalli, Sampath
AU - Lander, Eric S.
AU - Wilson, Richard K.
AU - Graves, Tina A.
AU - Fulton, Robert S.
AU - Rock, Susan M.
AU - Chinwalla, Asif T.
AU - Bernard, Kelly
AU - Courtney, Laura P.
AU - Fronick, Catrina
AU - Fulton, Lucinda L.
AU - O'Laughlin, Michelle
AU - Kremitzki, Colin L.
AU - Minx, Patrick J.
AU - Nelson, Joanne O.
AU - Schatzkamer, Kyriena L.
AU - Strong, Cynthia
AU - Wollam, Aye M.
AU - Weinstock, George M.
AU - Yang, Shiaw Pyng
AU - Rogers, Jane
AU - Grafham, Darren
AU - Humphray, Sean
AU - Nicholson, Christine
AU - Bird, Christine
AU - Brown, Andrew J.
AU - Burton, John
AU - Clee, Chris
AU - Hunt, Adrienne
AU - Jones, Matt C.
AU - Lloyd, Christine
AU - Matthews, Lucy
AU - Mclaren, Karen
AU - Mclaren, Stuart
AU - McLay, Kirsten
AU - Palmer, Sophie A.
AU - Plumb, Robert
AU - Shownkeen, Ratna
AU - Sims, Sarah
AU - Quail, Mike A.
AU - Whitehead, Siobhan L.
AU - Willey, David L.
AU - Deschamps, Stephane
AU - Kenton, Steven
AU - Song, Lin
AU - Do, Trang
AU - Roe, Bruce
AU - Bouffard, Gerard G.
AU - Blakesley, Robert W.
AU - Green, Eric D.
AU - Kucherlapati, Raju
AU - Grills, George
AU - Li, Li
AU - Montgomery, Kate T.
AU - Kramer, Melissa
AU - Speigel, Lori
AU - McCombie, W. Richard
AU - Lucas, Susan
AU - Terry, Astrid
AU - Gordon, Laurie
AU - Stubbs, Lisa
AU - Denny, Paul
AU - Brown, Steve D.M.
AU - Mallon, Anne Marie
AU - Campbell, R. Duncan
AU - Botherby, Marc R.M.
AU - Jackson, Ian J.
AU - Rubenfield, Marc J.
AU - Rogosin, Andrea M.
AU - Smith, Douglas R.
PY - 2009/5
Y1 - 2009/5
UR - http://www.scopus.com/inward/record.url?scp=66249148986&partnerID=8YFLogxK
U2 - 10.1371/journal.pbio.1000112
DO - 10.1371/journal.pbio.1000112
M3 - Article
C2 - 19468303
AN - SCOPUS:66249148986
VL - 7
JO - PLoS Biology
JF - PLoS Biology
SN - 1544-9173
IS - 5
M1 - e1000112
ER -