Toshie Hatano, Yasuo Horiuchi, Akira Ichikawa
EUROSPEECH 2003 - 8th European Conference on Speech Communication and Technology, pp. 149-152, January 1, 2003
In this study, we introduced a new model of how humans understand speech in real time and performed a cognitive experiment to investigate the unit used for processing and understanding speech. In the model, humans first segment the acoustic signal into acoustic units, and then access the mental lexicon and search it for the segmented units. We believe that prosodic information must be used for this segmentation. To investigate how humans segment speech using only prosody, we performed an experiment in which participants listened to pairs of segmented speech materials. The two materials in each pair were produced by dividing the same speech material at different segmentation positions, and participants judged which of the two sounded more natural. The results suggest that humans tend to segment speech according to the accent rules of Japanese, which supports the introduced model.