Nucleotide sequences for the long terminal repeat (LTR), gag, the protease gene, and pol of a human T-lymphotropic virus type 1 (HTLV-1) isolate of probable Caribbean origin (HTLV-1CH) and a Zairian isolate (HTLV-1EL) were determined, providing complete proviral sequences for these isolates. These sequences were compared with those available from previously analyzed isolates. Nucleotide sequence differences of 1.2-3.3% were identified among isolates for which complete genetic information was available. Nucleotide sequence diversity was distributed relatively evenly over the genome, with 1.3-5.2% differences in the LTR, 1.1-2.9% differences in gag, 0.7-2.1% differences in the protease gene, 0.9-2.5% differences in pol, 0.9-2.4% differences in env, 0.0-1.4% differences in rex, and 0.1-2.6% differences in tax. There were 1.2-2.3% amino acid differences overall, with 0.8-1.6% nonconservative amino acid alterations. Nucleotide differences were not found in regions of the LTR that are important for transcriptional activity or Tax response. Within the Rex-response element, nucleotide differences were found predominantly in loop rather than stem structures, thus maintaining the overall secondary structure necessary for Rex activity. Evolutionary tree analysis of the sequence differences suggests a predominant clustering of different HTLV-1 strains according to geographical origin. An open reading frame, situated between the env and rex/tax genes, was also identified on the minus DNA strand; it exhibits 0.1-6.9% nucleotide sequence variation among HTLV-1 strains. The limited sequence variation among HTLV-1 strains is in striking contrast to the extensive heterogeneity seen among human immunodeficiency virus (HIV) strains.
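
The percent-difference ranges reported above are pairwise comparisons between aligned sequences. As a minimal sketch of how such a range can be computed, the Python below counts mismatches over aligned, ungapped positions and takes the minimum and maximum over all isolate pairs. It is an illustration only, not the method used in the study: the function names, the choice to skip gapped positions, and the toy sequences are all assumptions, and the toy sequences are not HTLV-1 data.

```python
from itertools import combinations


def percent_difference(seq_a, seq_b):
    """Percent of aligned, ungapped positions at which two sequences differ.

    Assumes both sequences come from the same alignment (equal length) and
    that '-' marks a gap. Gapped positions are skipped (an assumption; other
    conventions count a gap as a difference).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip positions where either sequence has a gap
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * mismatches / compared


def difference_range(aligned):
    """Return (min, max) percent difference over all pairs of sequences."""
    values = [
        percent_difference(aligned[x], aligned[y])
        for x, y in combinations(aligned, 2)
    ]
    return min(values), max(values)


# Hypothetical toy alignment for illustration only (not real isolate data).
toy_alignment = {
    "isolate_A": "ATGGCCAAGTTCCTAGGACA",
    "isolate_B": "ATGGCCAAGTTCCTAGGGCA",
    "isolate_C": "ATGACCAAGTTC-TAGGGCA",
}

low, high = difference_range(toy_alignment)
print(f"{low:.1f}-{high:.1f}% nucleotide differences")
```

Applying the same function to a single region of the alignment (for example, only the columns covering the LTR or env) would give a per-region range in the same form as those quoted in the abstract.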