Current speech recognition inevitably suffers performance degradation from external factors present in real usage environments, such as noise and reverberation. Much research has been carried out to overcome this problem; however, because different tasks and different evaluation data have been used, comparing the performance of the proposed methods has been very difficult. To address this, a working group on the evaluation of noisy speech recognition was organized in October 2001 under the IPSJ Special Interest Group on Spoken Language Processing (SIG-SLP), and it has been creating and distributing standard evaluation corpora and a standard back end. This paper describes the current status, future road map, and aims of this activity.
With recent advances in information processing technology, research on handling human sensibility (kansei) by computer, a topic the field has rarely dealt with, has become increasingly active. For anthropomorphic agents and sensibility robots to behave like humans, they must recognize human emotion and express emotions of their own; the robot "ifbot" is one example of a sensibility robot that recognizes and expresses emotions. We are studying emotion recognition technology for application to such robots. However, research on emotion recognition is still in its early stages, and few language corpora are available for it. Such corpora must be constructed manually, yet there is no unified method for annotating emotion information or format for the data, so the current environment is inadequate for building corpora and advancing this research. We are therefore developing a system that supports the construction of language corpora for kansei information processing. This paper proposes a system that supports the creation of a natural-language corpus tagged with emotion information and outlines its development.
Keywords are a fundamental element of information retrieval, used for everything from searching for a document to describing it. Typically, keyword extraction algorithms require a document collection in order to extract keywords, and extracting keywords without such a collection is gaining importance. Previous research has addressed this problem, but two issues remain: 1) the quality of the extracted keywords was not evaluated by how well they perform in IR tasks, and 2) the methods were designed for a single language. This paper proposes a new algorithm that is applicable to multiple languages and extracts effective keywords.
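As a point of reference for the problem setting (the paper's own algorithm is not reproduced here), the following minimal Python sketch extracts keyword candidates from a single document without any reference collection, ranking tokens by raw in-document frequency. The regular-expression tokenizer and the frequency ranking are illustrative assumptions; languages without whitespace delimiters, such as Chinese or Japanese, would first need a word segmenter.

import re
from collections import Counter

def extract_keywords(text, top_k=10, min_len=2):
    # Illustrative baseline only: score candidate keywords of a single
    # document by term frequency, with no document collection involved.
    tokens = [t.lower() for t in re.findall(r"\w+", text) if len(t) >= min_len]
    counts = Counter(tokens)
    # Rank candidates by raw frequency; a method of the kind proposed in the
    # paper would instead tie the scoring to IR task performance.
    return [w for w, _ in counts.most_common(top_k)]

print(extract_keywords("Keywords are a fundamental part of information retrieval."))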
Numeral-quantifier expressions are a constant source of errors in Chinese-Japanese machine translation. This study proposes a method that classifies quantifiers according to their grammatical features and processes them accordingly. First, example sentences containing numeral-quantifier expressions collected from a Chinese-Japanese parallel corpus were morphologically analyzed; statistics were then taken on the types of quantifiers obtained and on the semantic features of the nouns they modify, and translation rules for numeral-quantifier expressions were acquired from the differences in quantifier type and position between Chinese and Japanese. The experimental translation system consists of two modules: one decides whether a given quantifier should be translated, and the other selects the translation form when it is. An evaluation experiment on Chinese-Japanese quantifier translation was conducted using the acquired rules. Finally, the adaptability of the experimental data was verified and the effectiveness of the proposed method was demonstrated.
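To make the two-module organization concrete, here is a minimal sketch that assumes a hand-written rule table; the classifier entries and Japanese counters below are invented examples for illustration, not the rules acquired from the corpus in the paper.

# Illustrative two-module flow for quantifier handling.
QUANTIFIER_RULES = {
    # classifier -> (translate?, Japanese counter used when translated); toy entries
    "个": (False, None),   # often dropped in Japanese (hypothetical rule)
    "本": (True, "冊"),     # books, e.g. "一本书" -> "一冊の本" (hypothetical rule)
    "张": (True, "枚"),     # flat objects (hypothetical rule)
}

def translate_quantifier(classifier, numeral):
    # Module 1 decides whether to translate; module 2 picks the surface form.
    translate, counter = QUANTIFIER_RULES.get(classifier, (False, None))
    if not translate:
        return None
    return f"{numeral}{counter}の"

print(translate_quantifier("本", "一"))   # -> 一冊の

In the paper the rule table is derived from corpus statistics on quantifier type, position, and the semantic features of the modified noun rather than written by hand.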
In Japanese, a causative is expressed as X (causer) ga Y (causee) ni/o V-saseru, where saseru attaches to the irrealis form of the verb. In Chinese, causatives take the form X (causer) jiao/rang/shi Y (causee) V: the causative words 叫 (jiao), 让 (rang), and 使 (shi) combine with a verb to express the meaning of saseru. In machine translation, failure to correctly recognize that Chinese causatives are expressed as "causative word + verb" becomes a major obstacle when translating into Japanese. In this study, a large number of example sentences were collected from textbooks and web pages, causative expressions and related information were extracted and analyzed, and, based on an examination of the characteristics of causative expressions, translation rules for causatives in Chinese-Japanese machine translation are proposed.
Super-Function (SF) based machine translation is a corpus-based translation method that does not require full syntactic or semantic analysis, so translation is very fast; and because SFs are created from a bilingual corpus, the output is very fluent. In this research, the translation system was built on the Web so that as many users as possible could evaluate SF-based machine translation. This paper describes the structure of the constructed translation system, discusses the problems that became clear during its construction, and proposes methods for solving them.
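As a rough illustration of how SF-based lookup avoids syntactic and semantic analysis, the sketch below matches an input sentence against source-side sentence patterns with a noun slot and fills the corresponding target pattern. The SF table and the bilingual noun dictionary are toy examples, not the Super-Functions used by the actual system, and the Web front end is omitted.

import re

# Toy Super-Function table: (source pattern, target pattern) pairs with one noun slot.
SF_TABLE = [
    (re.compile(r"^(.+)を予約したい$"), "I would like to reserve {0}."),
    (re.compile(r"^(.+)はどこですか$"), "Where is {0}?"),
]
NOUN_DICT = {"ホテル": "a hotel", "駅": "the station"}  # toy bilingual dictionary

def sf_translate(sentence):
    # Match the sentence against SF source patterns and fill the target side.
    for pattern, target in SF_TABLE:
        m = pattern.match(sentence)
        if m and m.group(1) in NOUN_DICT:
            return target.format(NOUN_DICT[m.group(1)])
    return None  # no SF matched; a real system would have far wider coverage

print(sf_translate("ホテルを予約したい"))   # -> I would like to reserve a hotel.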
Determining the semantic structure of a sentence after parsing is important, yet difficult for Chinese. In this paper, we propose a method for automatically annotating the Penn Chinese Treebank with semantic dependency structure. First, a small portion of the treebank was manually annotated with headwords and semantic dependency relations to create gold-standard data. Two supervised machine learning algorithms were then applied to these data with different feature sets to learn the semantic relations. Finally, a set of preference rules based on characteristics of Chinese was created to disambiguate problematic tree structures found in the original corpus. Evaluation results show that the proposed algorithms are effective for determining Chinese semantic dependency structure automatically.
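A minimal sketch of the supervised step, assuming a generic feature-vector classifier: the feature names, relation labels, and training examples below are invented for illustration and do not correspond to the feature sets, relation inventory, or learning algorithms evaluated in the paper.

from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

# Toy (features, semantic relation) pairs for headword-dependent pairs.
train = [
    ({"head_pos": "VV", "dep_pos": "NN", "dep_before_head": True},  "agent"),
    ({"head_pos": "VV", "dep_pos": "NN", "dep_before_head": False}, "patient"),
    ({"head_pos": "NN", "dep_pos": "JJ", "dep_before_head": True},  "modifier"),
]
vec = DictVectorizer()
X = vec.fit_transform([feats for feats, _ in train])
y = [label for _, label in train]

clf = LogisticRegression(max_iter=1000).fit(X, y)
test = vec.transform([{"head_pos": "VV", "dep_pos": "NN", "dep_before_head": True}])
print(clf.predict(test))   # prints the relation predicted for this toy example

The preference rules of the final step, which resolve ambiguous tree structures, would then be applied on top of such predictions.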
In recent years, IP telephone use has spread rapidly thanks to the development of VoIP (Voice over IP) technology. However, an unavoidable problem of the IP telephone is the deterioration of speech due to packet loss, which often occurs on wireless networks. To overcome this problem, we propose a novel packet loss concealment algorithm using speech recognition and synthesis. The proposed method uses linguistic information and can deal with the loss of syllable units, which conventional methods are unable to handle. We conducted subjective and objective evaluation experiments, and the results showed the effectiveness of the proposed method. Although the proposed method incurs a processing delay, we believe that it will open up new applications for speech recognition and speech synthesis technology.
This paper describes CENSREC-3, an evaluation database for in-car speech recognition, together with baseline evaluation results obtained with the standard evaluation scripts, as an outcome of the IPSJ SIG-SLP Noisy Speech Recognition Evaluation Working Group. CENSREC-3, a sequel to AURORA-2J, is designed as an evaluation framework for isolated word recognition in a car under real driving conditions. Speech data were recorded with two microphones, a close-talking microphone and a hands-free (distant) microphone, under 16 environments combining three driving speeds and six in-car conditions. CENSREC-3 provides six evaluation settings defined over the speech data recorded in these various environments.
In this paper, we propose using Simple Principal Component Analysis (SPCA) for dimensionality reduction in the vector space model of information retrieval. SPCA is a fast, data-oriented method that does not require computing the variance-covariance matrix. Because SPCA estimates principal components iteratively, we also propose a criterion for determining convergence, with which the optimal number of iterations for each principal component can be determined. Experiments show that the SPCA-based method outperforms the conventional SVD-based method despite requiring far less computation. This advantage of SPCA can be attributed to its iterative procedure, which resembles clustering methods such as k-means. The proposed method, which orthogonalizes the basis vectors, also achieved much higher accuracy than the conventional random projection method based on k-means clustering.
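For readers unfamiliar with Simple PCA, the sketch below shows the usual iterative estimation scheme: repeatedly project the data onto the current axis estimate, re-normalize, and then deflate the data before estimating the next component, all without forming the variance-covariance matrix. The convergence test shown (cosine agreement between successive estimates) is only a stand-in for the criterion proposed in the paper, and the additional orthogonalization of basis vectors mentioned above is not shown.

import numpy as np

def simple_pca(X, n_components, tol=1e-6, max_iter=100):
    # Estimate principal axes iteratively; the covariance matrix is never formed.
    X = X - X.mean(axis=0)                 # center the document vectors
    components = []
    for _ in range(n_components):
        a = X[0] / np.linalg.norm(X[0])    # initialize with a data vector
        for _ in range(max_iter):
            a_new = (X @ a) @ X            # sum_i (x_i . a) x_i
            a_new /= np.linalg.norm(a_new)
            converged = 1.0 - abs(a @ a_new) < tol
            a = a_new
            if converged:                  # stand-in convergence criterion
                break
        components.append(a)
        X = X - np.outer(X @ a, a)         # deflate: remove this component
    return np.array(components)

rng = np.random.default_rng(0)
docs = rng.random((50, 20))                # toy document-term matrix
print(simple_pca(docs, n_components=3).shape)   # -> (3, 20)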
To meet the demand for efficiently acquiring necessary information from large volumes of electronic text, question answering (QA) technology, which automatically returns a concise answer to a question asked in the user's natural language, has attracted wide attention in recent years. Although research on QA systems for Chinese started later than in Western countries and Japan, it has recently been attracting more and more attention. In this paper, we propose a question-answering architecture that combines answer retrieval for frequently asked questions based on common knowledge with document retrieval for sightseeing information. To improve answer accuracy, we use a combined model based on a statistical vector space model (VSM) and shallow semantic analysis, and we limit the domain to sightseeing information. A Chinese QA system for sightseeing based on the proposed method has been built. Evaluation experiments show that high accuracy can be achieved when a retrieval result is counted as correct if the correct answer appears among the top three candidates ranked by similarity. The experiments demonstrate the effectiveness of our method and show that it is feasible to develop question-answering technology on this basis.
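A minimal sketch of the VSM retrieval component, assuming TF-IDF weighting and cosine similarity with a top-three cutoff as in the evaluation; the FAQ entries are invented English examples, whereas the actual system works on Chinese text (which additionally requires word segmentation) and combines this retrieval with shallow semantic analysis.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

faq_questions = [
    "What time does the temple open?",
    "How do I get to the castle from the station?",
    "Is there an entrance fee for the garden?",
]
vectorizer = TfidfVectorizer()
faq_matrix = vectorizer.fit_transform(faq_questions)

def top3(query):
    # Rank stored FAQ entries by cosine similarity and keep the best three.
    sims = cosine_similarity(vectorizer.transform([query]), faq_matrix)[0]
    return [faq_questions[i] for i in sims.argsort()[::-1][:3]]

print(top3("When does the temple open"))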
This paper reports an evaluation of the European Telecommunications Standards Institute (ETSI) standard Distributed Speech Recognition (DSR) front-end through continuous speech recognition on a Japanese speech corpus, and proposes Bias Removal Methods (BRMs) that reduce the distortion between the feature parameters and the VQ codebook. Experimental results show that (1) using non-quantized features when training the acoustic model can improve the recognition performance of DSR front-end features, and (2) broadening the analysis band can improve recognition performance at the same bit rate. The proposed method improves recognition performance under DSR conditions; notably, we observed an 18% relative reduction in error rate with the proposed method under mismatched channel conditions.
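The exact definition of the Bias Removal Methods is not given in this abstract. The sketch below shows one plausible reading, in which a bias is estimated as the average offset between feature vectors and their nearest VQ codewords and then subtracted; it should be read as an illustration of the general idea, not as the method evaluated in the paper.

import numpy as np

def remove_bias(features, codebook, n_iter=3):
    # Illustrative bias removal (an assumption, not necessarily the paper's BRM):
    # iteratively estimate the mean offset between frames and their nearest
    # codewords and compensate for it before quantization.
    shifted = features.copy()
    for _ in range(n_iter):
        # Assign each frame to its nearest codeword (Euclidean distance).
        d = np.linalg.norm(shifted[:, None, :] - codebook[None, :, :], axis=2)
        nearest = codebook[d.argmin(axis=1)]
        bias = (shifted - nearest).mean(axis=0)   # average quantization offset
        shifted = shifted - bias                  # compensate the bias
    return shifted

rng = np.random.default_rng(0)
codebook = rng.normal(size=(64, 13))              # toy cepstral codebook
features = rng.normal(size=(200, 13)) + 0.5       # toy frames with a channel bias
print(np.abs(remove_bias(features, codebook) - features).mean())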