The first Japanese genome
Nature Genetics | doi:10.1038/ng.691
Whole-genome sequencing and comprehensive variant analysis of a Japanese individual using massively parallel sequencing
Akihiro Fujimoto¹,², Hidewaki Nakagawa¹, Naoya Hosono¹, Kaoru Nakano¹, Tetsuo Abe¹, Keith A Boroevich¹, Masao Nagasaki³, Rui Yamaguchi³, Tetsuo Shibuya³, Michiaki Kubo¹, Satoru Miyano²,³, Yusuke Nakamura¹,³ & Tatsuhiko Tsunoda¹,²
We report the analysis of the genome of a Japanese male, sequenced to 40× coverage using high-throughput sequencing. More than 99% of the sequence reads were mapped to the reference human genome. Using a Bayesian decision method, we identified 3,132,608 single nucleotide variations (SNVs). Comparison with six previously reported genomes revealed an excess of singleton nonsense and nonsynonymous SNVs, as well as singleton SNVs in conserved non-coding regions. We also identified 5,319 deletions smaller than 10 kb with high accuracy, in addition to copy number variations and rearrangements. De novo assembly of the unmapped sequence reads generated around 3 Mb of novel sequence, which showed high similarity to non-reference human genomes and the human herpesvirus 4 genome. Our analysis suggests that considerable variation remains undiscovered in the human genome and that whole-genome sequencing is an invaluable tool for obtaining a complete understanding of human genetic variation.
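The abstract does not describe the Bayesian decision method in detail. As a generic illustration of how Bayesian SNV calling can work at a single site, the sketch below computes genotype posteriors from reference and alternate read counts. It is not the authors' method: the binomial read model, the error rate, the priors, the posterior threshold, and all function names are assumptions made only for this example.

```python
import math


def genotype_posteriors(ref_count, alt_count, error_rate=0.01, priors=None):
    """Posterior probability of genotypes RR, RA, AA at one site.

    Assumes a simple binomial read model; this is an illustrative
    stand-in, not the model used in the paper.
    """
    if priors is None:
        # Hypothetical priors, chosen only for illustration.
        priors = {"RR": 0.998, "RA": 0.001, "AA": 0.001}

    # Probability that one read shows the alternate allele under each genotype.
    p_alt = {"RR": error_rate, "RA": 0.5, "AA": 1.0 - error_rate}

    # Log of prior times likelihood; the binomial coefficient is the same
    # for every genotype, so it cancels and is omitted.
    log_joint = {}
    for genotype, prior in priors.items():
        log_lik = (alt_count * math.log(p_alt[genotype])
                   + ref_count * math.log(1.0 - p_alt[genotype]))
        log_joint[genotype] = math.log(prior) + log_lik

    # Normalise in log space to avoid underflow at high coverage.
    peak = max(log_joint.values())
    total = sum(math.exp(v - peak) for v in log_joint.values())
    return {g: math.exp(v - peak) / total for g, v in log_joint.items()}


def call_genotype(ref_count, alt_count, threshold=0.99, **kwargs):
    """Return (genotype, posteriors); genotype is None below the threshold."""
    posteriors = genotype_posteriors(ref_count, alt_count, **kwargs)
    best = max(posteriors, key=posteriors.get)
    call = best if posteriors[best] >= threshold else None
    return call, posteriors
```

Under these assumptions, a site covered by 20 reference and 20 alternate reads is called heterozygous (`RA`), while a site with only 2 reads of each allele falls below the threshold and is left uncalled, which illustrates why deep coverage helps confident variant calling.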