Research Achievements

黒岩 眞吾

クロイワ シンゴ  (Shingo Kuroiwa)

Basic Information

Affiliation
Professor, Graduate School of Engineering, Chiba University
Degree
Doctorate (Department of Electronic Engineering, Graduate School of Electro-Communications, The University of Electro-Communications)

Researcher Number
20333510
J-GLOBAL ID
200901017262764603
researchmap Member ID
1000356498

External Links

Career

 1

Papers

 125
  • Masato Nakayama, Takanobu Nishiura, Yuki Denda, Norihide Kitaoka, Kazumasa Yamamoto, Takeshi Yamada, Satoru Tsuge, Chiyomi Miyajima, Masakiyo Fujimoto, Tetsuya Takiguchi, Satoshi Tamura, Tetsuji Ogawa, Shigeki Matsuda, Shingo Kuroiwa, Kazuya Takeda, Satoshi Nakamura
    INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5 968-+ 2008  Peer-reviewed
  • Shota Sato, Taro Kimura, Yasuo Horiuchi, Masafumi Nishida, Shingo Kuroiwa, Akira Ichikawa
    INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5 545-+ 2008  Peer-reviewed
    In this paper, we describe a speech re-synthesis tool using the fundamental frequency (F0) generation model proposed by Fujisaki et al. and STRAIGHT, designed by Kawahara, which can be used for listening experiments by modifying F0 model parameters. To create the tool, we first established a method for automatically estimating F0 model parameters by using genetic algorithms. Next, we combined the proposed method and STRAIGHT. We can change the prosody of input speech by manually modifying the F0 model parameters with the tool and evaluate the relation between human perception and F0 model parameters. We confirmed the ability of this tool to make natural speech data that have various prosodic parameters.
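The abstract above estimates Fujisaki F0 model parameters with a genetic algorithm. The sketch below is only an illustration of that idea under simplifying assumptions: a single phrase command and a single accent command, invented parameter ranges, and a toy GA (truncation selection plus Gaussian mutation). It is not the authors' implementation, and the STRAIGHT re-synthesis step is not included.

```python
# Illustrative sketch: fitting simplified Fujisaki F0 model parameters with a
# genetic algorithm. Parameter names, ranges, and GA settings are assumptions
# made for this example only.
import numpy as np

rng = np.random.default_rng(0)

def fujisaki_f0(t, ln_fb, ap, t0, aa, t1, t2, alpha=3.0, beta=20.0):
    """Log-F0 contour with one phrase command and one accent command."""
    def gp(x):  # phrase control mechanism impulse response
        return np.where(x >= 0, alpha**2 * x * np.exp(-alpha * x), 0.0)
    def ga(x):  # accent control mechanism step response (ceiling at 0.9)
        return np.where(x >= 0, np.minimum(1 - (1 + beta * x) * np.exp(-beta * x), 0.9), 0.0)
    return ln_fb + ap * gp(t - t0) + aa * (ga(t - t1) - ga(t - t2))

def fitness(params, t, target_lnf0):
    return -np.mean((fujisaki_f0(t, *params) - target_lnf0) ** 2)  # negative MSE

def run_ga(t, target_lnf0, pop=60, gens=200):
    lo = np.array([np.log(60),  0.1, -0.5, 0.1, 0.0, 0.2])   # lower bounds per parameter
    hi = np.array([np.log(200), 1.0,  0.2, 1.0, 0.8, 1.5])   # upper bounds per parameter
    population = rng.uniform(lo, hi, size=(pop, 6))
    for _ in range(gens):
        scores = np.array([fitness(ind, t, target_lnf0) for ind in population])
        parents = population[np.argsort(scores)[::-1][: pop // 2]]   # truncation selection
        children = parents[rng.integers(0, len(parents), pop - len(parents))].copy()
        children += rng.normal(0, 0.02, children.shape)              # Gaussian mutation
        population = np.clip(np.vstack([parents, children]), lo, hi)
    return population[np.argmax([fitness(ind, t, target_lnf0) for ind in population])]

# recover known toy parameters from a synthetic target contour
t = np.linspace(0, 1.5, 150)
true_params = (np.log(120), 0.5, -0.2, 0.6, 0.3, 0.9)
best = run_ga(t, fujisaki_f0(t, *true_params))
print("recovered parameters:", np.round(best, 3))
```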
  • Junko Minato, David B. Bracewell, Fuji Ren, Shingo Kuroiwa
    Engineering Letters 16(1) 172-177 2008  Peer-reviewed
  • Jiajun Yan, David B. Bracewell, Fuji Ren, Shingo Kuroiwa
    Engineering Letters 16(1) 166-171 2008  Peer-reviewed
  • David B. Bracewell, Fuji Ren, Shingo Kuroiwa
    Engineering Letters 16(1) 160-165 2008  Peer-reviewed
  • Dapeng Yin, Min Shao, Fuji Ren, Shingo Kuroiwa
    IEEJ TRANSACTIONS ON ELECTRICAL AND ELECTRONIC ENGINEERING 3(1) 106-112 January 2008  Peer-reviewed
    Research on Chinese-Japanese machine translation has been ongoing for many years, and the field is becoming increasingly refined. In practical machine translation systems, simple and short Chinese sentences are processed reasonably well. However, the translation of complex, long Chinese sentences remains difficult. For example, these systems are still unable to solve the translation problem of complex 'BA' sentences. In this article, a new method of parsing 'BA' sentences for machine translation based on valency theory is proposed. A 'BA' sentence is one that contains the prepositional word 'BA'. The structural characteristic of a 'BA' sentence is that the original verb is placed after the object word. The object word after the 'BA' preposition is used as an adverbial modifier of an active word. First, a large number of grammar items are collected from Chinese grammar books, and elementary judgment rules are set by classifying and consolidating the collected grammar items. Then, these judgment rules are applied to actual Chinese text and modified by immediately checking their results. Rules are checked and modified using statistical information from an actual corpus. A five-segment model for 'BA' sentence translation is then derived from the above analysis. Finally, we applied the proposed model in our machine translation system and evaluated the experimental results. It achieved an accuracy rate of 91.3%, and this result verified the effectiveness of our five-segment model for 'BA' sentence translation. (C) 2007 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.
  • Kyoko Osaka, Tetsuya Tanioka, Shuichi Ueno, Chiemi Kawanishi, Toshiko Tada, Shingo Kuroiwa, Fuji Ren
    International Journal for Human Caring Vol.12(No.1) 7-16 January 2008  Peer-reviewed
    We presume that the measurement of electroencephalographic (EEG) changes, which are considered physiological indicators, enables an objective understanding of changes in the emotions of those who have difficulty expressing them through facial expression or physical action. Generally, EEG is used in hospitals to examine encephalopathy and brain disorders. Using an electroencephalograph device to acquire digital data, we propose a method to objectively capture changes in the recognition state of people from changes in EEG activities (action potential), and a way to apply it in a clinical situation.
  • Kyoko Osaka, Seiji Tsuchiya, Fuji Ren, Shingo Kuroiwa, Tetsuya Tanioka, Rozzano C. Locsin
    INFORMATION-AN INTERNATIONAL INTERDISCIPLINARY JOURNAL 11(1) 55-68 January 2008  Peer-reviewed
    We aim to develop a mechanism that can sympathize with humans, with the goal of reading the emotions a person feels from brain waves. As an initial stage of this research, we verify whether it can be judged from brain waves that a subject is impressed. Concretely, we use an electroencephalograph (EEG) to investigate whether the brain is in an active state when the subject declares having been impressed. Three kinds of evaluation methods are used in this research. The first is a statistical evaluation based on the strength of potentials. The second is an objective evaluation based on where brain-wave activity occurs. The third compares the subject's subjective reports with changes in the EEG. Because only two subjects participated this time and their attributes are biased, questions remain about the validity of the results. However, the results also make clear that a subject's state of being impressed can indeed be judged from the activity of brain waves.
  • Yu Zhang, Zhuoming Li, Fuji Ren, Shingo Kuroiwa
    Research in Computing Science Vol.32 330-340 November 2007  Peer-reviewed
    There have been some studies of spoken natural language dialog, and most of them have been developed successfully within specified domains. However, current human-computer interfaces only acquire the data needed to run their programs. Aiming at developing an affective dialog system, we have been exploring how to incorporate emotional aspects of dialog into existing dialog processing techniques. As a preliminary step toward this goal, we build a Chinese emotion classification model which is used to recognize the main affective attribute of a sentence or a text. Finally, we conducted experiments to evaluate our model.
  • Peilin Jiang, Ran Li, Fuji Ren, Shingo Kuroiwa, Nanning Zheng
    Research in Computing Science Vol.32 374-381 November 2007  Peer-reviewed
    Human-computer interface technology has faced the challenge of actively understanding the user's mind. Speaker detection is a primary technique in applications of human-computer interfaces (HCI) and in other applications such as surveillance systems, video conferencing, and multimedia database management in computer vision and speech recognition. This paper describes a novel method to detect the speaker with a probabilistic model of speaking behavior. After human face recognition, the special components of the lip under a nonlinear transformation in color space represent the specific mouth region and are then combined with groups of coherent motions. Next, the simple movements in the mouth region are modeled by hidden Markov models. The experimental results demonstrate that the model representing speaking is efficient and successful when applied to a driver video surveillance system.
  • Mohamed Abdel Fattah, David B. Bracewell, Fuji Ren, Shingo Kuroiwa
    COMPUTER SPEECH AND LANGUAGE 21(4) 594-608 October 2007  Peer-reviewed
    Parallel corpora have become an essential resource for work in multilingual natural language processing. However, sentence aligned parallel corpora are more efficient than non-aligned parallel corpora for cross-language information retrieval and machine translation applications. In this paper, we present two new approaches to align English-Arabic sentences in bilingual parallel corpora based on probabilistic neural network (P-NNT) and Gaussian mixture model (GMM) classifiers. A feature vector is extracted from the text pair under consideration. This vector contains text features such as length, punctuation score, and cognate score values. A set of manually prepared training data was assigned to train the probabilistic neural network and Gaussian mixture model. Another set of data was used for testing. Using the probabilistic neural network and Gaussian mixture model approaches, we could achieve error reductions of 27% and 50%, respectively, over the length based approach when applied on a set of parallel English-Arabic documents. In addition, the results of (P-NNT) and (GMM) outperform the results of the combined model which exploits length, punctuation and cognates in a dynamic framework. The GMM approach outperforms Melamed and Moore's approaches too. Moreover these new approaches are valid for any language pair and are quite flexible since the feature vector may contain more, less or different features, such as a lexical matching feature and Hanzi characters in Japanese-Chinese texts, than the ones used in the current research. (c) 2007 Elsevier Ltd. All rights reserved.
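As a rough illustration of the feature-vector idea above (length, punctuation, and cognate scores classified per class), the sketch below uses simplified feature definitions and scikit-learn's GaussianMixture fitted separately to aligned and misaligned pairs. The real features, training data, and the P-NNT variant from the paper are not reproduced.

```python
# Illustrative sketch of sentence-pair classification from a small feature
# vector; feature definitions and training data are stand-ins, not the paper's.
import re
import numpy as np
from sklearn.mixture import GaussianMixture

def features(src: str, tgt: str) -> np.ndarray:
    len_ratio = min(len(src), len(tgt)) / max(len(src), len(tgt))
    punct = lambda s: re.findall(r"[.,;:!?]", s)
    p_src, p_tgt = punct(src), punct(tgt)
    punct_score = (min(len(p_src), len(p_tgt)) + 1) / (max(len(p_src), len(p_tgt)) + 1)
    # crude "cognate" score: shared 4-character token prefixes
    pre = lambda s: {w[:4].lower() for w in s.split() if len(w) >= 4}
    cognate = len(pre(src) & pre(tgt)) / (len(pre(src) | pre(tgt)) + 1e-9)
    return np.array([len_ratio, punct_score, cognate])

# X_pos / X_neg would come from manually aligned (and deliberately misaligned)
# training pairs; random values stand in here just to keep the sketch runnable.
rng = np.random.default_rng(1)
X_pos = rng.normal([0.8, 0.9, 0.3], 0.1, size=(200, 3))
X_neg = rng.normal([0.4, 0.5, 0.05], 0.1, size=(200, 3))

gmm_pos = GaussianMixture(n_components=2, random_state=0).fit(X_pos)
gmm_neg = GaussianMixture(n_components=2, random_state=0).fit(X_neg)

def is_aligned(src: str, tgt: str) -> bool:
    x = features(src, tgt).reshape(1, -1)
    return gmm_pos.score(x) > gmm_neg.score(x)   # compare per-class log-likelihoods

print(is_aligned("The committee approved the report.", "Approved: the committee report."))
```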
  • Jiajun Yan, David B. Bracewell, Shingo Kuroiwa, Fuji Ren
    ACM Transactions on Speech and Language Processing 4(2) 5 May 1, 2007  Peer-reviewed
    Semantic analysis is a standard tool in the Natural Language Processing (NLP) toolbox with widespread applications. In this article, we look at tagging part of the Penn Chinese Treebank with semantic dependency. Then we take this tagged data to train a maximum entropy classifier to label the semantic relations between headwords and dependents to perform semantic analysis on Chinese sentences. The classifier was able to achieve an accuracy of over 84%. We then analyze the errors in classification to determine the problems and possible solutions for this type of semantic analysis. © 2007 ACM.
  • Lei Yu, Jia Ma, Fuji Ren, Shingo Kuroiwa
    SNPD 2007: EIGHTH ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCE, NETWORKING, AND PARALLEL/DISTRIBUTED COMPUTING, VOL 2, PROCEEDINGS 574-+ 2007  Peer-reviewed
    The rapid growth of the Internet has resulted in enormous amounts of information that have become more difficult to access efficiently. The primary goal of this research is to create an efficient tool that is able to summarize large documents automatically. We propose concept chains that link semantically-related concepts based on the HowNet knowledge database to improve the performance of text summarization and to suit Chinese text. Lexical chains are a technique for identifying semantically-related terms in text. The resulting concept chains are then used to identify candidate sentences useful for extraction. Moreover, another method based on structural features, which makes the summary of the text more general in content and more balanced, is also proposed. The final experimental results proved the effectiveness of our methods.
  • Peilin Jiang, Hua Xiang, Fuji Ren, Shingo Kuroiwa, Nanning Zheng
    MICAI 2007: ADVANCES IN ARTIFICIAL INTELLIGENCE 4827 1046-+ 2007  Peer-reviewed
    Human Computer Interaction (HCI) technology has emerged in different fields and applications in computer vision and recognition systems, such as virtual environments, video games, e-business and multimedia management. In this paper we propose a framework for designing the Mental State Transition (MST) of a human being or virtual character. The expressions of human emotion can easily be observed in facial expressions, gestures, sound and other visual characteristics, but the underlying MST models in affective data are usually hidden. We analyze the framework of MST, employ DBNs to construct the MST networks, and finally implement an experiment to derive the ground truth of the data and verify the effectiveness.
  • Kazuyuki Matsumoto, Fuji Ren, Shingo Kuroiwa, Seiji Tsuchiya
    MICAI 2007: ADVANCES IN ARTIFICIAL INTELLIGENCE 4827 1035-+ 2007  Peer-reviewed
    Emotion recognition aims to make computers understand the ambiguous information of human emotion. Recently, research on emotion recognition has been progressing actively in various fields such as natural language processing, speech signal processing, image data processing and brain wave analysis. We propose a method to recognize emotion in dialogue text by using an originally created Emotion Word Dictionary. The words in the dictionary are weighted according to their occurrence rates in an existing emotion expression dictionary. We also propose a method to judge the object of emotion and emotion expressivity in dialogue sentences. The experiment using 1,190 sentences showed an accuracy of about 80%.
  • Shingo Kuroiwa, Masashi Takashina, Satoru Tsuge, Ren Fuji
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4 1045-1048 2007  Peer-reviewed
    In this paper, we propose a non-realtime speech bandwidth extension method using HMM-based speech recognition and HMM-based speech synthesis. In the proposed method, first, the phoneme-state sequence is estimated from the bandlimited speech signals using the speech recognition technique. Next, for estimating spectrum envelopes of lost high-frequency components, an HMM-based speech synthesis technique generates a synthetic speech signal (spectrum sequence) according to the predicted phoneme-state sequence. Since both speech recognition and speech synthesis take into account dynamic feature vectors, we can obtain a smoothly varying spectrum sequence. For evaluating the proposed method, we conducted subjective and objective experiments. The experimental results show the effectiveness of the proposed method for bandwidth extension. However, the proposed method needs more improvement in speech quality.
  • Norihide Kitaoka, Kazumasa Yamamoto, Tomohiro Kusamizu, Seiichi Nakagawa, Takeshi Yamada, Satoru Tsuge, Chiyomi Miyajima, Takanobu Nishiura, Masato Nakayama, Yuki Denda, Masakiyo Fujimoto, Tetsuya Takiguchi, Satoshi Tamura, Shingo Kuroiwa, Kazuya Takeda, Satoshi Nakamura
    2007 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, VOLS 1 AND 2 607-+ 2007  Peer-reviewed
    Voice activity detection (VAD) plays an important role in speech processing, including speech recognition, speech enhancement, and speech coding in noisy environments. We developed an evaluation framework for VAD in such environments, called Corpus and Environment for Noisy Speech Recognition 1 Concatenated (CENSREC-1-C). This framework consists of noisy continuous digit utterances and evaluation tools for VAD results. By adopting two evaluation measures, one for frame-level detection performance and the other for utterance-level detection performance, we provide the evaluation results of a power-based VAD method as a baseline. When VAD is used in a speech recognizer, the detected speech segments are extended to avoid the loss of speech frames and the pause segments are then absorbed by a pause model. We investigate the balance of an explicit segmentation by VAD and an implicit segmentation by a pause model using an experimental simulation of segment extension and show that a small extension improves speech recognition.
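A minimal sketch of a power-based VAD with the segment-extension idea discussed above; the frame size, threshold, and extension length are illustrative choices, not CENSREC-1-C settings.

```python
# Toy power-based VAD: frames above a threshold relative to the peak frame
# energy are marked as speech, then each detected segment is extended by a few
# frames on both sides. All constants are illustrative.
import numpy as np

def frame_energy_db(signal, frame_len=400, hop=160):
    frames = [signal[i:i + frame_len] for i in range(0, len(signal) - frame_len, hop)]
    return np.array([10 * np.log10(np.mean(f ** 2) + 1e-12) for f in frames])

def vad(signal, threshold_db=20.0, extend_frames=5):
    energy = frame_energy_db(signal)
    speech = energy > (energy.max() - threshold_db)   # within 20 dB of the peak
    extended = speech.copy()
    for i in np.flatnonzero(speech):                  # extend each speech frame
        lo, hi = max(0, i - extend_frames), min(len(speech), i + extend_frames + 1)
        extended[lo:hi] = True
    return extended

# toy signal: silence, a burst of "speech-like" noise, silence
rng = np.random.default_rng(0)
sig = np.concatenate([rng.normal(0, 0.01, 8000),
                      rng.normal(0, 0.5, 16000),
                      rng.normal(0, 0.01, 8000)])
decision = vad(sig)
print(f"{decision.sum()} of {decision.size} frames marked as speech")
```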
  • David B. Bracewell, Fuji Ren, Shingo Kuroiwa
    International Conference on Artificial Intelligence and Pattern Recognition, AIPR-07, Orlando, Florida, USA, July 9-12, 2007 22-27 2007  Peer-reviewed
  • Jiajun Yan, David B. Bracewell, Fuji Ren, Shingo Kuroiwa
    International Conference on Artificial Intelligence and Pattern Recognition, AIPR-07, Orlando, Florida, USA, July 9-12, 2007 17-21 2007  Peer-reviewed
  • Shingo Kuroiwa, Satoru Tsuge, Masahiko Kita, Fuji Ren
    IJCLCLP 12(3) 2007  Peer-reviewed
  • Ye Yang, Song Liu, Shingo Kuroiwa, Fuji Ren
    PROCEEDINGS OF THE 2007 IEEE INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING (NLP-KE'07) 361-+ 2007  Peer-reviewed
    This paper constructs a question answering system for the Confucian Analects. As a result of context change and differences in word connotations between modern Chinese and ancient Chinese, the accuracy of content-based retrieval and category-based retrieval in classical literature is quite low. In view of this, the paper establishes a categories and pragmatics information base for the Confucian Analects. It also proposes a retrieval method based on pragmatics information and categories. To increase accuracy and efficiency, a category keyword collection and a question type keyword table are established as well. When the system recognizes the type and category of the user's question, it uses keyword semantic matching. Namely, the category keyword collection and the question type keyword table are separately used to decide the category and the type. The experiments evidenced the effectiveness of the answer extraction approach based on pragmatics information, especially for queries with deep meaning.
  • Mohamed Abdel Fattah, Fuji Ren, Shingo Kuroiwa
    INTERNATIONAL JOURNAL OF NEURAL SYSTEMS 16(6) 423-434 December 2006  Peer-reviewed
    Parallel corpora have become an essential resource for work in multilingual natural language processing. However, sentence aligned parallel corpora are more efficient than non-aligned parallel corpora for cross-language information retrieval and machine translation applications. In this paper, we present a new approach to align sentences in bilingual parallel corpora based on a feed-forward neural network classifier. A feature parameter vector is extracted from the text pair under consideration. This vector contains text features such as length, punctuation score, and cognate score values. A set of manually prepared training data has been assigned to train the feed-forward neural network. Another set of data was used for testing. Using this new approach, we could achieve an error reduction of 60% over the length-based approach when applied on English-Arabic parallel documents. Moreover, this new approach is valid for any language pair and is quite flexible, since the feature parameter vector may contain more, fewer, or different features than those we used in our system, such as a lexical match feature.
  • MA Fattah, FJ Ren, S Kuroiwa
    INFORMATION PROCESSING & MANAGEMENT 42(4) 1003-1016 July 2006  Peer-reviewed
    Arabic is a morphologically rich language that presents significant challenges to many natural language processing applications because a word often conveys complex meanings decomposable into several morphemes (i.e. prefix, stem, suffix). By segmenting words into morphemes, we could improve the performance of English/Arabic translation pair extraction from parallel texts. This paper describes two algorithms and their combination to automatically extract an English/Arabic bilingual dictionary from parallel texts that exist in the Internet archive, after using an Arabic light stemmer as a preprocessing step. Before using the Arabic light stemmer, the total system precision and recall were 88.6% and 81.5% respectively; after applying the Arabic light stemmer to the Arabic documents, the system precision and recall increased to 91.6% and 82.6% respectively. The algorithms have certain variables whose values can be changed to control the system precision and recall. As with most such systems, the accuracy of our system is directly proportional to the number of sentence pairs used. However, our system is able to extract translation pairs from a very small parallel corpus. This new system can extract translations from only two sentences in one language and two sentences in the other language if the requirements of the system are fulfilled. Moreover, this system is able to extract word pairs that are translations of each other, synonyms, and explanations of a word in the other language. By controlling the system variables, we could achieve 100% precision for the output bilingual dictionary with a small recall. (c) 2005 Elsevier Ltd. All rights reserved.
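The abstract above does not spell out its two extraction algorithms, so the sketch below only illustrates the general idea of mining candidate translation pairs from sentence-aligned text with a standard co-occurrence (Dice coefficient) score. It is a textbook technique standing in for the paper's own algorithms; the Arabic light stemmer is not reproduced, and the toy corpus uses French rather than Arabic purely for readability.

```python
# Generic Dice-coefficient mining of candidate translation pairs from
# sentence-aligned text. Thresholds and data are invented for the example.
from collections import Counter
from itertools import product

def dice_pairs(aligned_sentences, min_score=0.8):
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src, tgt in aligned_sentences:
        src_words, tgt_words = set(src.split()), set(tgt.split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        pair_count.update(product(src_words, tgt_words))
    pairs = {}
    for (s, t), c in pair_count.items():
        dice = 2 * c / (src_count[s] + tgt_count[t])   # co-occurrence score
        if dice >= min_score:
            pairs[(s, t)] = round(dice, 2)
    return pairs

corpus = [("the cat sleeps", "le chat dort"),
          ("the dog sleeps", "le chien dort"),
          ("the cat eats", "le chat mange")]
print(dice_pairs(corpus))
```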
  • HQ Hu, PL Jiang, FJ Ren, S Kuroiwa
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS E89D(6) 1848-1859 June 2006  Peer-reviewed
    In this paper, we propose the construction of a web-based Question Answering (QA) system for a restricted domain, which combines three resource information databases for the retrieval mechanism, including a Question&Answer database, a special domain documents database and the web resources retrieved by the Google search engine. We describe a new retrieval technique that integrates a probabilistic technique based on OkapiBM25 and a semantic analysis based on the ontology of the HowNet knowledge base and a special-domain HowNet created for the restricted domain. Furthermore, we provide a method of question expansion by computing word semantic similarity. The system is first developed for a middle-size domain of sightseeing information. The experiments proved the efficiency of our method for the restricted domain and showed that it is feasible to transfer to other domains expediently using the proposed method.
  • MA Fattah, F Ren, S Kuroiwa
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS E89D(5) 1712-1719 May 2006  Peer-reviewed
    In the European Telecommunication Standards Institute (ETSI), Distributed Speech Recognition (DSR) front-end, the distortion added due to feature compression on the front end side increases the variance flooring effect, which in turn increases the identification error rate. The penalty incurred in reducing the bit rate is the degradation in speaker recognition performance. In this paper, we present a nontraditional solution for the previously mentioned problem. To reduce the bit rate, a speech signal is segmented at the client, and the most effective phonemes (determined according to their type and frequency) for speaker recognition are selected and sent to the server. Speaker recognition occurs at the server. Applying this approach to YOHO corpus, we achieved an identification error rate (ER) of 0.05% using an average segment of 20.4% for a testing utterance in a speaker identification task. We also achieved an equal error rate (EER) of 0.42% using an average segment of 15.1% for a testing utterance in a speaker verification task.
  • S Kuroiwa, Y Umeda, S Tsuge, F Ren
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS E89D(3) 1074-1081 March 2006  Peer-reviewed
    In this paper, we propose a distributed speaker recognition method using a nonparametric speaker model and Earth Mover's Distance (EMD). In distributed speaker recognition, the quantized feature vectors are sent to a server. The Gaussian mixture model (GMM), the traditional method used for speaker recognition, is trained using the maximum likelihood approach. However, it is difficult to fit continuous density functions to quantized data. To overcome this problem, the proposed method represents each speaker model with a speaker-dependent VQ code histogram designed by registered feature vectors and directly calculates the distance between the histograms of speaker models and testing quantized feature vectors. To measure the distance between each speaker model and testing data, we use EMD which can calculate the distance between histograms with different bins. We conducted text-independent speaker identification experiments using the proposed method. Compared to results using the traditional GMM, the proposed method yielded relative error reductions of 32% for quantized data.
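A sketch of the non-parametric matching idea above: each speaker is represented by a VQ code histogram, and histograms are compared with the Earth Mover's Distance using the distance between codewords as the ground cost. The codebook size, feature dimensionality, and data below are toy values, and the EMD is solved with a generic linear program rather than the paper's implementation.

```python
# EMD between VQ code histograms, solved as a transportation linear program.
# All model sizes and "features" are invented for the example.
import numpy as np
from scipy.cluster.vq import kmeans2
from scipy.optimize import linprog
from scipy.spatial.distance import cdist

def emd(p, q, ground_cost):
    """EMD between histograms p (m,) and q (n,), each summing to 1."""
    m, n = len(p), len(q)
    A_eq = np.zeros((m + n, m * n))
    for i in range(m):
        A_eq[i, i * n:(i + 1) * n] = 1    # flow out of source bin i equals p[i]
    for j in range(n):
        A_eq[m + j, j::n] = 1             # flow into target bin j equals q[j]
    b_eq = np.concatenate([p, q])
    # drop one redundant constraint to keep the equality system full rank
    res = linprog(ground_cost.reshape(-1), A_eq=A_eq[:-1], b_eq=b_eq[:-1], bounds=(0, None))
    return res.fun

rng = np.random.default_rng(0)
codebook, _ = kmeans2(rng.normal(size=(2000, 12)), 64, minit='++', seed=0)   # shared codebook

def vq_histogram(features):
    codes = cdist(features, codebook).argmin(axis=1)
    hist = np.bincount(codes, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

# enrolled speaker model vs. two test utterances (toy cepstral-like features)
speaker_model = vq_histogram(rng.normal(0.0, 1.0, size=(500, 12)))
same_speaker  = vq_histogram(rng.normal(0.0, 1.0, size=(300, 12)))
other_speaker = vq_histogram(rng.normal(0.8, 1.2, size=(300, 12)))

cost = cdist(codebook, codebook)          # ground distance between codewords
print("same-speaker EMD :", emd(speaker_model, same_speaker, cost))
print("other-speaker EMD:", emd(speaker_model, other_speaker, cost))
```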
  • Lei Yu, Mengge Liu, Fuji Ren, Shingo Kuroiwa
    PACLIC 20: PROCEEDINGS OF THE 20TH PACIFIC ASIA CONFERENCE ON LANGUAGE, INFORMATION AND COMPUTATION 426-429 2006  Peer-reviewed
    The large amount of lengthy on-line information does not fit well on mobile devices. To solve this problem, we propose a method which collects original news text from on-line sources and extracts summary sentences from it automatically. On this basis, we adopt WML (Wireless Markup Language) to build a news website that lets mobile devices browse the news summaries. The system is mainly made up of Automatic News Collection and Auto Text Summarization. Our experimental results proved the effectiveness of the method.
  • Mohamed Abdel Fattah, Fuji Ren, Shingo Kuroiwa
    PACLIC 20: PROCEEDINGS OF THE 20TH PACIFIC ASIA CONFERENCE ON LANGUAGE, INFORMATION AND COMPUTATION 370-373 2006  Peer-reviewed
    In the present study, we present different approaches for extracting transliterated proper noun pairs from parallel corpora, based on different similarity measures between the English and Romanized Arabic proper nouns under consideration. The strength of our new system is that it works well for low-frequency words. We evaluate the presented new approaches using an English-Arabic parallel corpus. Most of our results outperform previously published results in terms of precision, recall and F-Measure.
  • Shingo Kuroiwa, Satoru Tsuge, Masahiko Kita, Fuji Ren
    CHINESE SPOKEN LANGUAGE PROCESSING, PROCEEDINGS 4274 539-+ 2006  Peer-reviewed
    In this paper, we present the evaluation results of our proposed text-independent speaker recognition method based on the Earth Mover's Distance (EMD) using the ISCSLP2006 Chinese speaker recognition evaluation corpus developed by the Chinese Corpus Consortium (CCC). The EMD based speaker recognition (EMD-SR) was originally designed to apply to a distributed speaker identification system, in which the feature vectors are compressed by vector quantization at a terminal and sent to a server that executes a pattern matching process. In this structure, we had to train speaker models using quantized data, so we utilized a non-parametric speaker model and EMD. From the experimental results on a Japanese speech corpus, EMD-SR showed higher robustness to the quantized data than the conventional GMM technique. Moreover, it achieved higher accuracy than the GMM even if the data were not quantized. Hence, we have taken the challenge of the ISCSLP2006 speaker recognition evaluation by using EMD-SR. Since the identification tasks defined in the evaluation were on an open-set basis, we introduce a new speaker verification module in this paper. Evaluation results showed that EMD-SR achieves a 99.3% Identification Correctness Rate in a closed-channel speaker identification task.
  • Shingo Kuroiwa, Satoru Tsuge, Fuji Ren
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5 1105-1108 2006  Peer-reviewed
    In recent years, IP telephone service has spread rapidly. However, an unavoidable problem of IP telephone service is deterioration of speech due to packet loss, which often occurs on wireless networks. To overcome this problem, we propose a novel lost speech reconstruction method using speech recognition based on Missing Feature Theory and HMM-based speech synthesis. The proposed method uses linguistic information and can deal with the lack of syllable units which conventional methods are unable to handle. We conducted subjective and objective evaluation experiments under speaker independent conditions. These results showed the effectiveness of the proposed method. Although there is a processing delay in the proposed method, we believe that this method will open up new applications for speech recognition and speech synthesis technology.
  • Dapeng Yin, Min Shao, Peilin Jiang, Fuji Ren, Shingo Kuroiwa
    COMPUTATIONAL INTELLIGENCE, PT 2, PROCEEDINGS 4114 930-935 2006  Peer-reviewed
    Quantifiers and numerals often cause mistakes in Chinese-Japanese machine translation. In this paper, an approach is proposed based on syntactic features after classification. Using the differences in type and position of quantifiers between Chinese and Japanese, quantifier translation rules were acquired. Evaluation was conducted using the acquired translation rules. Finally, the adaptability of the experimental data was verified and the methods achieved an accuracy of 90.75%, which showed that they were effective in processing quantifiers and numerals.
  • Junko Minato, David B. Bracewell, Fuji Ren, Shingo Kuroiwa
    COMPUTATIONAL INTELLIGENCE, PT 2, PROCEEDINGS 4114 924-929 2006  Peer-reviewed
    In this paper, we build a Japanese emotion corpus and perform statistical analysis on it. We manually entered about 1,200 example dialogue sentences. We collected statistical information from the corpus to analyze the way emotion is expressed in Japanese dialogue. Such statistics should prove useful for dealing with emotion in natural language. We believe the collected statistics accurately describe emotion in Japanese dialogue.
  • David B. Bracewell, Junko Minato, Fuji Ren, Shingo Kuroiwa
    COMPUTATIONAL INTELLIGENCE, PT 2, PROCEEDINGS 4114 918-923 2006  Peer-reviewed
    Authors of news stories, through their choice of words and phrasing, inject an underlying emotion into their stories. A story about the same event or person can have radically different emotions depending on the author, newspaper, and nationality. In this paper we propose a system to judge the emotion of a news article based on emotion word, idiom and modifier dictionaries. This type of system allows one to judge world opinion on varying topics by looking at the emotion used within news articles about the topic.
  • Kazuyuki Matsumoto, Ren Fuji, Shingo Kuroiwa
    COMPUTATIONAL INTELLIGENCE, PT 2, PROCEEDINGS 4114 902-911 2006  Peer-reviewed
    Conventional approaches to emotion estimation from text have mainly estimated superficial emotion expressions. However, emotions may be contained in a human utterance even if no emotion expression appears in it. In this paper, we propose an emotion estimation algorithm for conversation sentences. We assigned rules of emotion occurrence to 1,616 sentence patterns. In addition, we developed a dictionary consisting of emotional words and emotional idioms. The proposed method estimates the emotions in a sentence by matching the sentence against the emotion occurrence patterns and their rules. Furthermore, two or more emotions contained in a sentence can be obtained by calculating emotion parameters. We constructed an experimental system based on the proposed method for evaluation. We analyzed weblog data containing 253 sentences with the system and conducted an experiment to evaluate emotion estimation accuracy. As a result, we obtained an estimation accuracy of about 60%.
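A toy illustration of the dictionary-driven part of the method above: emotion words carry weights per emotion category, and a sentence accumulates the weights of the words it contains. The dictionary entries and weights are invented for the example; the paper's 1,616 sentence patterns and occurrence rules are not reproduced.

```python
# Dictionary-based emotion scoring with invented entries and weights.
from collections import defaultdict

EMOTION_DICT = {
    "happy":     {"joy": 0.9},
    "delighted": {"joy": 1.0},
    "angry":     {"anger": 1.0},
    "furious":   {"anger": 1.0, "disgust": 0.3},
    "sad":       {"sadness": 0.9},
    "tears":     {"sadness": 0.6, "joy": 0.2},
}

def estimate_emotions(sentence: str) -> dict:
    scores = defaultdict(float)
    for word in sentence.lower().split():
        for emotion, weight in EMOTION_DICT.get(word.strip(".,!?"), {}).items():
            scores[emotion] += weight
    return dict(sorted(scores.items(), key=lambda kv: -kv[1]))

print(estimate_emotions("She was delighted, almost in tears."))
# e.g. {'joy': 1.2, 'sadness': 0.6}
```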
  • Jiajun Yan, David B. Bracewell, Fuji Ren, Shingo Kuroiwa
    COMPUTATIONAL INTELLIGENCE, PT 2, PROCEEDINGS 4114 893-901 2006  Peer-reviewed
    In this paper we present a semantic analyzer for aiding emotion recognition in Chinese. The analyzer uses a decision tree to assign semantic dependency relations between headwords and modifiers. It is able to achieve an accuracy of 83.5%. The semantic information is combined with rules for Chinese verbs containing emotion to describe the emotion of the people in the sentence. The rules give information on how to assign emotion to agents, receivers, etc. depending on the verb in the sentence.
  • Mohamed Abdel Fattah, Fuji Ren, Shingo Kuroiwa
    COMPUTATIONAL INTELLIGENCE, PT 2, PROCEEDINGS 4114 748-753 2006  Peer-reviewed
    In this paper, we present a new approach to align sentences in bilingual parallel corpora based on the use of the linguistic information of the text pair in a Gaussian mixture model (GMM) classifier. A feature parameter vector is extracted from the text pair under consideration. This vector contains text features such as length, punctuation score, cognate score and a bilingual lexicon extracted from the parallel corpus under consideration. A set of manually prepared training data has been assigned to train the Gaussian mixture model. Another set of data was used for testing. Using the Gaussian mixture model approach, we could achieve an error reduction of 160% over the length-based approach when applied on English-Arabic parallel documents. In addition, the results of the GMM outperform the results of the combined model which exploits length, punctuation, cognates and a bilingual lexicon in a dynamic framework.
  • Satoru Tsuge, Masami Shishibori, Kenji Kita, Fuji Ren, Shingo Kuroiwa
    2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13 397-400 2006  Peer-reviewed
    In this paper, we describe a Japanese speech corpus collected for investigating the speech variability of a specific speaker over short and long time periods, and then report the variability of speech recognition performance over short and long time periods. Even when speakers use a speaker-dependent speech recognition system, it is known that speech recognition performance varies depending on when the utterance was spoken. This is because speech quality varies by occasion even if the speaker and utterance remain constant. However, the relationships between intra-speaker speech variability and speech recognition performance are not clear. Hence, we have been collecting speech data to investigate these relationships since November 2002. In this paper, we introduce our speech corpus and report speech recognition experiments using our corpus. Experimental results show that the variability of recognition performance across different days is larger than the variability of recognition performance within a day.
  • Jiajun Yan, David B. Bracewell, Fuji Ren, Shingo Kuroiwa
    Proceedings of the Nineteenth International Florida Artificial Intelligence Research Society Conference, Melbourne Beach, Florida, USA, May 11-13, 2006 782-786 2006  Peer-reviewed
  • HQ Hu, FJ Ren, S Kuroiwa, SW Zhang
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING 3878 458-469 2006  Peer-reviewed
    In this paper, we propose the construction of a Question Answering (QA) system, which synthesizes answer retrieval from a frequently asked questions database and a documents database, for a special domain of sightseeing information. A speech interface for the special domain was implemented along with the text interface, using an HMM acoustic model, a pronunciation lexicon, and an FSN language model built on the basis of the features of Chinese sentence patterns. We consider a synthetic model based on a statistical VSM and shallow language analysis for sightseeing information. Experimental results showed that high accuracy can be achieved for the special domain and that the speech interface works for frequently asked questions about sightseeing information.
  • MA Fattah, F Ren, S Kuroiwa
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING 3878 97-100 2006  Peer-reviewed
    In this paper, we present a new approach to align sentences in bilingual parallel corpora based on a probabilistic neural network (P-NNT) classifier. A feature parameter vector is extracted from the text pair under consideration. This vector contains text features such as length, punctuation score, and cognate score values. A set of manually aligned training data was used to train the probabilistic neural network. Another set of data was used for testing. Using the probabilistic neural network approach, an error reduction of 27% was achieved over the length based approach when applied on English-Arabic parallel documents.
  • Mohamed Abdel Fattah, Fuji Ren, Shingo Kuroiwa
    Int. Arab J. Inf. Technol. 3(1) 28-34 2006  Peer-reviewed
  • David B. Bracewell, Fuji Ren, Shingo Kuroiwa
    Engineering Letters 13(2) 216-224 2006  Peer-reviewed
  • M. Fujimoto, S. Nakamura, K. Takeda, S. Kuroiwa, T. Yamada, N. Kitaoka, K. Yamamoto, M. Mizumachi, T. Nishiura, A. Sasou, C. Miyajima, T. Endo
    Proc. International Workshop on Realworld Multimedia Corpora in Mobile Environment (RWCinME2005) 53-60 April 2005  Peer-reviewed
  • S Nakamura, K Takeda, K Yamamoto, T Yamada, S Kuroiwa, N Kitaoka, T Nishiura, A Sasou, M Mizumachi, C Miyajima, M Fujimoto, T Endo
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS E88D(3) 535-544 March 2005  Peer-reviewed
    This paper introduces an evaluation framework for Japanese noisy speech recognition named AURORA-2J. Speech recognition systems must still be improved to be robust to noisy environments, but this improvement requires development of the standard evaluation corpus and assessment technologies. Recently, the Aurora 2, 3 and 4 corpora and their evaluation scenarios have had significant impact on noisy speech recognition research. The AURORA-2J is a Japanese connected digits corpus and its evaluation scripts are designed in the same way as Aurora 2 with the help of European Telecommunications Standards Institute (ETSI) AURORA group. This paper describes the data collection, baseline scripts, and its baseline performance. We also propose a new performance analysis method that considers differences in recognition performance among speakers. This method is based on the word accuracy per speaker, revealing the degree of the individual difference of the recognition performance. We also propose categorization of modifications, applied to the original HTK baseline system, which helps in comparing the systems and in recognizing technologies that improve the performance best within the same category.
  • Shingo Kuroiwa, Yoshiyuki Umeda, Satoru Tsuge, Fuji Ren
    INTERSPEECH 2005 - Eurospeech, 9th European Conference on Speech Communication and Technology, Lisbon, Portugal, September 4-8, 2005 3085-3088 2005  Peer-reviewed
  • Masakiyo Fujimoto, Satoshi Nakamura, Kazuya Takeda, Shingo Kuroiwa, Takeshi Yamada, Norihide Kitaoka, Kazumasa Yamamoto, Mitsunori Mizumachi, Takanobu Nishiura, Akira Sasou, Chiyomi Miyajima, Toshiki Endo
    Proceedings - International Workshop on Biomedical Data Engineering, BMDE2005 2005 1208 2005  Peer-reviewed
    This paper introduces a common database, an evaluation framework, and its baseline recognition results for in-car speech recognition, CENSREC-3, as an outcome of the IPSJ-SIG SLP Noisy Speech Recognition Evaluation Working Group. CENSREC-3, which is a sequel to AURORA-2J, is designed as an evaluation framework for isolated word recognition in real driving car environments. Speech data was collected using 2 microphones, a close-talking microphone and a hands-free microphone, under 16 carefully controlled driving conditions, i.e., combinations of 3 car speeds and 5 car conditions. CENSREC-3 provides 6 evaluation environments which are designed using speech data collected in these car conditions. © 2005 IEEE.
  • PL Jiang, H Xiang, F Ren, S Kuroiwa
    EMBEDDED AND UBIQUITOUS COMPUTING - EUC 2005 3824 1026-1035 2005  Peer-reviewed
    The study of human-computer interaction is now one of the most popular research domains across computer science and psychology. Many of the essential issues recently focus not only on physical computing but also on affective computing. The emotional states of human beings can dramatically affect their actions, so it is important for a computer to understand what people feel at a given time. In this paper, we propose a novel method to predict the future emotional state of a person from the current emotional state and affective factors, using an advanced mental state transition network [1]. A psychological experiment with about 100 participants was conducted to obtain the structure and the coefficients of the model. A test experiment was also conducted to verify the prediction validity of this model.
  • T Endo, S Kuroiwa, S Nakamura
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS E87D(5) 1119-1126 May 2004  Peer-reviewed
    This paper addresses problems involved in performing speech recognition over mobile and IP networks. The main problem is speech data loss caused by packet loss in the network. We present two missing-feature-based approaches that recover lost regions of speech data. These approaches are based on the reconstruction of missing frames or on marginal distributions. For comparison, we also use a packing method, which skips lost data. We evaluate these approaches with packet loss models, i.e., random loss and Gilbert loss models. The results show that the marginal-distribution-based technique is most effective for a packet loss environment; the degradation of word accuracy is only 5% when the packet loss rate is 30% and only 3% when the mean burst loss length is 24 frames in the case of the DSR front-end. The simple data imputation method is also effective in the case of clean speech.
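A minimal sketch of the marginal-distribution idea above: with a diagonal-covariance Gaussian mixture, the likelihood of a partially observed frame can be computed by evaluating each Gaussian only on the feature dimensions that survived packet loss. The model values below are toy numbers, not a trained acoustic model.

```python
# Marginalising missing feature dimensions out of a diagonal-covariance GMM.
import numpy as np
from scipy.stats import norm

# toy 2-component diagonal GMM over 4-dimensional features
weights = np.array([0.6, 0.4])
means   = np.array([[0.0, 1.0, -1.0, 0.5],
                    [2.0, -1.0, 0.0, 1.5]])
stds    = np.array([[1.0, 0.5, 1.0, 0.8],
                    [0.7, 1.0, 0.5, 1.0]])

def marginal_loglik(frame, present_mask):
    """Log-likelihood of a frame using only the observed dimensions."""
    obs = np.flatnonzero(present_mask)
    if obs.size == 0:
        return 0.0                          # nothing observed: likelihood of 1
    comp_loglik = [
        np.log(w) + norm.logpdf(frame[obs], means[k, obs], stds[k, obs]).sum()
        for k, w in enumerate(weights)
    ]
    return np.logaddexp.reduce(comp_loglik)

frame = np.array([0.1, 0.9, -1.2, 0.4])
full  = marginal_loglik(frame, np.array([1, 1, 1, 1], dtype=bool))
lossy = marginal_loglik(frame, np.array([1, 0, 0, 1], dtype=bool))   # dims 1, 2 lost
print(f"log-likelihood, all dims: {full:.2f}   observed dims only: {lossy:.2f}")
```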
  • MA Fattah, F Ren, S Kuroiwa
    ITCC 2004: INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY: CODING AND COMPUTING, VOL 2, PROCEEDINGS 298-302 2004  Peer-reviewed
    A parallel corpus is a very important tool for constructing a good machine translation system or conducting natural language processing research for cross-language information retrieval. The Internet archive is a good source of parallel documents in different languages. In order to construct a good parallel corpus from the Internet archive, a bilingual dictionary that contains word pairs which may not exist in commercial dictionaries is a must. Extracting a bilingual dictionary from Internet parallel documents is important for adding words that are absent from traditional dictionaries. This paper describes two algorithms to automatically extract an English/Arabic bilingual dictionary from parallel texts that exist in the Internet archive. The system should preferably be useful for many different language pairs. As with most such systems, the accuracy of our system is directly proportional to the number of sentence pairs used. By controlling the system parameters, we could achieve 100% precision for the output bilingual dictionary, but the size of the dictionary will be smaller.

MISC

 590

Presentations

 30

Works

 5

Research Projects (Joint Research and Competitive Funding)

 17