To expand our knowledge of the soybean genome and to create a useful DNA repeat sequence database, over 24 000 DNA fragments from a soybean [Glycine max (L.) Merr.] cv. Williams 82 genomic shotgun library were sequenced. Additional sequences came from over 29 000 bacterial artificial chromosome (BAC) end sequences derived from a BstI library of the cv. Williams 82 genome. Analysis of these sequences identified 348 different DNA repeats, many of which appear to be novel. To extend the utility of the work, a pilot study was also conducted using methylation filtration to estimate the size of the hypomethylated soybean gene space. A comparison between 8366 sequences obtained from a filtered library and 23 788 sequences from an unfiltered library indicates a gene enrichment of ∼3.2-fold in the hypomethylated sequences. Given the 1.1-Gb soybean genome, our analysis predicts a ∼343-Mb hypomethylated, gene-rich space.
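
The predicted gene-space size is numerically consistent with dividing the genome size by the fold enrichment. The abstract does not state the formula, so the following is only a minimal sketch under that assumption; the function name `estimate_gene_space_mb` is hypothetical and the inputs are the rounded values quoted above.

```python
def estimate_gene_space_mb(genome_size_mb: float, fold_enrichment: float) -> float:
    """Estimate hypomethylated gene space as genome size / fold enrichment.

    Assumption (not stated in the abstract): if filtration enriches genes
    by a factor F, the gene-rich fraction of the genome is roughly 1/F.
    """
    if fold_enrichment <= 0:
        raise ValueError("fold_enrichment must be positive")
    return genome_size_mb / fold_enrichment


# Values quoted in the abstract: 1.1-Gb genome, ~3.2-fold enrichment.
estimate = estimate_gene_space_mb(1100.0, 3.2)
print(f"Estimated gene space: {estimate:.2f} Mb")
```

With these rounded inputs the result lies close to the ∼343 Mb reported; any small difference would reflect rounding of the enrichment factor and genome size.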