TY - JOUR
T1 - Plosive/fricative distinction: The voiceless case
AU - Weigelt, La Deana F.
AU - Sadoff, Steven J.
AU - Miller, James D.
PY - 1990/6
Y1 - 1990/6
AB - Using only three measures of the waveform, the zero-crossing rate, the logarithm of the root-mean-square (rms) energy, and the derivative of the log rms energy with respect to time [termed rate of rise (ROR)], voiceless plosives (including affricates) can be distinguished from voiceless fricatives in word-initial, medial, and final positions. Peaks in the ROR contour are considered for significance to the plosive/fricative distinction by examining the log rms energy and zero-crossing rate. Then, the magnitude of the first significant peak in the ROR contour is used as the primary classifier. The algorithm was tested on 1364 tokens (720 word-initial tokens produced by four female and four male speakers; 360 word-medial tokens produced by two males and two females; 320 word-final tokens produced by two males and two females). Data from two male and two female speakers (360 word-initial tokens) were used as a training set, and the remaining data were used as a test set. The overall rate of correct classification was 96.8%. Implications of this result are discussed.
UR - http://www.scopus.com/inward/record.url?scp=0025145947&partnerID=8YFLogxK
U2 - 10.1121/1.399063
DO - 10.1121/1.399063
M3 - Article
C2 - 2373806
AN - SCOPUS:0025145947
SN - 0001-4966
VL - 87
SP - 2729
EP - 2737
JO - Journal of the Acoustical Society of America
JF - Journal of the Acoustical Society of America
IS - 6
ER -