Researcher Achievements

川本 一彦

Kazuhiko Kawamoto

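Basic Information

Affiliation
Professor, Graduate School of Informatics, Chiba University
Degree
Doctor of Engineering (March 2002, Chiba University)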

Contact
kawafaculty.chiba-u.jp
ORCID ID
 https://orcid.org/0000-0003-3701-1961
J-GLOBAL ID
201101069474935716
researchmap Member ID
B000000393

External Links

Papers

 136
  • Kosuke Sumiyasu, Kazuhiko Kawamoto, Hiroshi Kera
    Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 6017-6026 June 2024  Peer-reviewed
  • Yusuke Marumo, Kazuhiko Kawamoto, Hiroshi Kera
    CV4Animals Workshop in conjunction with CVPR June 2024  Peer-reviewed
  • Katsuya KOSUKEGAWA, Kazuhiko KAWAMOTO
    IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E107-A(4) 666-669 April 2024  Peer-reviewed, Last author, Corresponding author
  • Nariki Tanaka, Hiroshi Kera, Kazuhiko Kawamoto
    Computer Vision and Image Understanding 240(103936) March 2024  Peer-reviewed, Corresponding author
  • Nan Wu, Hiroshi Kera, Kazuhiko Kawamoto
    Advanced Computational Intelligence and Intelligent Informatics, Communications in Computer and Information Science 1932 18-28 October 30, 2023  Peer-reviewed
  • Kosuke Sumiyasu, Kazuhiko Kawamoto, Hiroshi Kera
    ICCV 2023 Workshop on Uncertainty Quantification for Computer Vision October 2023  Peer-reviewed
  • Tomoyasu Nanaumi, Kazuhiko Kawamoto, Hiroshi Kera
    ICCV 2023 Workshop on Uncertainty Quantification for Computer Vision October 2023  Peer-reviewed
  • Nan Wu, Hiroshi Kera, Kazuhiko Kawamoto
    Applied Intelligence 53(20) 24142-24156 July 17, 2023  Peer-reviewed, Corresponding author
  • Takeshi Haga, Hiroshi Kera, Kazuhiko Kawamoto
    Sensors 23(5) 2515-2515 February 24, 2023  Peer-reviewed, Corresponding author
  • Katsuya Kosukegawa, Yasukuni Mori, Hiroki Suyari, Kazuhiko Kawamoto
    Scientific Reports 13(1) February 9, 2023  Peer-reviewed, Corresponding author
  • Takuto Otomo, Hiroshi Kera, Kazuhiko Kawamoto
    IEEE International Conference on Systems, Man, and Cybernetics (SMC) 676-681 October 9, 2022  Peer-reviewed
  • Takaaki Azakami, Hiroshi Kera, Kazuhiko Kawamoto
    IEEE International Conference on Systems, Man, and Cybernetics (SMC) 682-687 October 9, 2022  Peer-reviewed
  • Chun Yang Tan, Kazuhiko Kawamoto, Hiroshi Kera
    ECCV 2022 Workshop on Adversarial Robustness in the Real World October 2022  Peer-reviewed
  • Nariki Tanaka, Hiroshi Kera, Kazuhiko Kawamoto
    Proceedings of the AAAI Conference on Artificial Intelligence 36(2) 2335-2343 June 28, 2022  Peer-reviewed
    Skeleton-based action recognition models have recently been shown to be vulnerable to adversarial attacks. Compared to adversarial attacks on images, perturbations to skeletons are typically bounded to a lower dimension of approximately 100 per frame. This lower-dimensional setting makes it more difficult to generate imperceptible perturbations. Existing attacks resolve this by exploiting the temporal structure of the skeleton motion so that the perturbation dimension increases to thousands. In this paper, we show that adversarial attacks can be performed on skeleton-based action recognition models, even in a significantly low-dimensional setting without any temporal manipulation. Specifically, we restrict the perturbations to the lengths of the skeleton's bones, which allows an adversary to manipulate only approximately 30 effective dimensions. Experiments on the NTU RGB+D and HDM05 datasets demonstrate that the proposed attack deceives models using small perturbations, with a success rate that sometimes exceeds 90%. Furthermore, we discovered an interesting phenomenon: in our low-dimensional setting, adversarial training with the bone length attack shares a similar property with data augmentation, and it not only improves the adversarial robustness but also improves the classification accuracy on the original data. This is an interesting counterexample of the trade-off between adversarial robustness and clean accuracy, which has been widely observed in studies on adversarial training in the high-dimensional regime.
  • Kazuma Fujii, Hiroshi Kera, Kazuhiko Kawamoto
    IEEE Access 10 59534-59543 June 2022  Peer-reviewed, Corresponding author
    Unsupervised domain adaptation, which involves transferring knowledge from a label-rich source domain to an unlabeled target domain, can be used to substantially reduce annotation costs in the field of object detection. In this study, we demonstrate that adversarial training in the source domain can be employed as a new approach for unsupervised domain adaptation. Specifically, we establish that adversarially trained detectors achieve improved detection performance in target domains that are significantly shifted from source domains. This phenomenon is attributed to the fact that adversarially trained detectors can be used to extract robust features that are in alignment with human perception and worth transferring across domains while discarding domain-specific non-robust features. In addition, we propose a method that combines adversarial training and feature alignment to ensure the improved alignment of robust features with the target domain. We conduct experiments on four benchmark datasets and confirm the effectiveness of our proposed approach on large domain shifts from real to artistic images. Compared to the baseline models, the adversarially trained detectors improve the mean average precision by up to 7.7%, and further by up to 11.8% when feature alignments are incorporated. Although our method degrades performance for small domain shifts, quantification of the domain shift based on the Fréchet distance allows us to determine whether adversarial training should be conducted.
  • Fei Yan, Nan Wu, Abdullah M. Iliyasu, Kazuhiko Kawamoto, Kaoru Hirota
    Applied Intelligence 52(8) 9406-9422 June 2022  Peer-reviewed
    In addition to the almost five million lives lost and millions more than that in hospitalisations, efforts to mitigate the spread of the COVID-19 pandemic, which has disrupted every aspect of human life, deserve the contributions of all and sundry. Education is one of the areas most affected by the COVID-imposed abhorrence to physical (i.e., face-to-face (F2F)) communication. Consequently, schools, colleges, and universities worldwide have been forced to transition to different forms of online and virtual learning. Unlike F2F classes where the instructors could monitor and adjust lessons and content in tandem with the learners’ perceived emotions and engagement, in online learning environments (OLE), such tasks are daunting to undertake. In our modest contribution to ameliorate disruptions to education caused by the pandemic, this study presents an intuitive model to monitor the concentration, understanding, and engagement expected of a productive classroom environment. The proposed apposite OLE (i.e., AOLE) provides an intelligent 3D visualisation of the classroom atmosphere (CA), which could assist instructors in adjusting and tailoring both content and instruction for maximum delivery. Furthermore, individual learner status could be tracked via visualisation of his/her emotion curve at any stage of the lesson or learning cycle. Considering the enormous emotional and psychological toll caused by COVID and the attendant shift to OLE, the emotion curves could be progressively compared through the duration of the learning cycle and the semester to track learners’ performance through to the final examinations. In terms of learning within the CA, our proposed AOLE is assessed within a class of 15 students and three instructors. Correlation of the outcomes reported with those from administered questionnaires validates the potential of our proposed model as a support for learning and counselling during these unprecedented times in which we find ourselves.
  • Shun Kimura, Kazuhiko Kawamoto
    Proc. of the 7th International Workshop on Advanced Computational Intelligence and Intelligent Informatics November 2021  Peer-reviewed
  • Kazuma Fujii, Kazuhiko Kawamoto
    Array 11 100071-100071 September 2021  Peer-reviewed, Corresponding author
  • Nan Wu, Kazuhiko Kawamoto
    Sensors 21(11) 3793-3793 May 30, 2021  Peer-reviewed, Corresponding author
    Large datasets are often used to improve the accuracy of action recognition. However, very large datasets are problematic as, for example, the annotation of large datasets is labor-intensive. This has encouraged research in zero-shot action recognition (ZSAR). Presently, most ZSAR methods recognize actions according to each video frame. These methods are affected by light, camera angle, and background, and most methods are unable to process time series data. The accuracy of the model is reduced owing to these reasons. In this paper, in order to solve these problems, we propose a three-stream graph convolutional network that processes both types of data. Our model has two parts. One part can process RGB data, which contains extensive useful information. The other part can process skeleton data, which is not affected by light and background. By combining these two outputs with a weighted sum, our model predicts the final results for ZSAR. Experiments conducted on three datasets demonstrate that our model has greater accuracy than a baseline model. Moreover, we also prove that our model can learn from human experience, which can make the model more accurate.
  • Kodai Uchiyama, Kazuhiko Kawamoto
    IEEE Access 9 50106-50111 2021  Peer-reviewed, Corresponding author
  • Kazuma Kurisaki, Kazuhiko Kawamoto
    IEEE Access 9 3269-3277 2021  Peer-reviewed, Corresponding author
  • Wataru Okamoto, Kazuhiko Kawamoto
    2020 Joint 11th International Conference on Soft Computing and Intelligent Systems and 21st International Symposium on Advanced Intelligent Systems (SCIS-ISIS) December 5, 2020  Peer-reviewed
  • Nan Wu, Kazuhiko Kawamoto
    2020 Joint 11th International Conference on Soft Computing and Intelligent Systems and 21st International Symposium on Advanced Intelligent Systems (SCIS-ISIS) December 5, 2020  Peer-reviewed
  • Shinobu Takahashi, Kazuhiko Kawamoto
    Proc. of the 9th International Symposium on Computational Intelligence and Industrial Applications 1-5 November 2020  Peer-reviewed
  • Calvin Janitra Halim, Kazuhiko Kawamoto
    Sensors 20(15) 4195-4195 July 28, 2020  Peer-reviewed, Corresponding author
    Recent approaches to time series forecasting, especially forecasting spatiotemporal sequences, have leveraged the approximation power of deep neural networks to model the complexity of such sequences, specifically approaches that are based on recurrent neural networks. Still, as spatiotemporal sequences that arise in the real world are noisy and chaotic, modeling approaches that utilize probabilistic temporal models, such as deep Markov models (DMMs), are favorable because of their ability to model uncertainty, increasing their robustness to noise. However, approaches based on DMMs do not maintain the spatial characteristics of spatiotemporal sequences, with most of the approaches converting the observed input into 1D data halfway through the model. To solve this, we propose a model that retains the spatial aspect of the target sequence with a DMM that consists of 2D convolutional neural networks. We then show the robustness of our method to data with large variance compared with naive forecast, vanilla DMM, and convolutional long short-term memory (LSTM) using synthetic data, even outperforming the DNN models over a longer forecast period. We also point out the limitations of our model when forecasting real-world precipitation data and the possible future work that can be done to address these limitations, along with additional future research potential.
  • Calvin Janitra Halim, Kazuhiko Kawamoto
    Advances in Intelligent Systems and Computing 1128 AISC 37-44 2020  Peer-reviewed
  • Masatoshi Nakano, Nobuyoshi Komuro, Kazuhiko Kawamoto
    2019 IEEE 8th Global Conference on Consumer Electronics (GCCE) 1160-1163 October 2019  Peer-reviewed
  • Yuki Nakahira, Kazuhiko Kawamoto
    2019 IEEE International Conference on Image Processing (ICIP) 749-753 September 2019  Peer-reviewed
  • 中村伊吹, 川本一彦, 岡本一志
    電子情報通信学会論文誌D J102-D(8) 506-513 August 2019  Peer-reviewed, Corresponding author
  • 河野曜平, 川本一彦
    電子情報通信学会論文誌D J102-D(7) 491-493 July 2019  Peer-reviewed, Corresponding author
  • Yuta Segawa, Kazuhiko Kawamoto, Kazushi Okamoto
    EURASIP Journal on Image and Video Processing 2018(1) December 2018  Peer-reviewed, Corresponding author
  • Shumpei Kobayashi, Kazuhiko Kawamoto
    ISCIIA and ITCA 2018 - 8th International Symposium on Computational Intelligence and Industrial Applications and 12th China-Japan International Workshop on Information Technology and Control Applications 1-5 November 2018  Peer-reviewed
    We propose a deep multi-task network for recognizing a user’s activities using first-person, or egocentric, images. The proposed network is composed of three subnetworks: an action recognition network, an object recognition network, and a hand segmentation network. The first two networks are trained to recognize the actions and objects composing the activities in a multi-task framework. The hand segmentation network is used to directly capture hand appearance in the deep network. In this paper, we experimentally explore the best deep architecture with the three subnetworks. We show that the activity recognition accuracy increases by 5–12% compared to a baseline model.
  • Yuki Nakahira, Kazuhiko Kawamoto
    2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) 1276-1281 2018  Peer-reviewed
    Generative adversarial networks (GANs) have been successfully applied to generating high-quality natural images and have been extended to the generation of RGB videos and 3D volume data. In this paper, we consider the task of generating RGB-D videos, which is less extensively studied and still challenging. We explore deep GAN architectures suitable for the task and develop four GAN architectures based on existing video-based GANs. With a facial expression database, we experimentally find that an extended version of the motion and content decomposed GAN, known as MoCoGAN, provides the highest-quality RGB-D videos. We discuss several applications of our GAN to content creation and data augmentation, and also discuss its potential applications in behavioral experiments.
  • Ryosuke Nakaishi, Kazuhiko Kawamoto
    IWACIII 2017 - 5th International Workshop on Advanced Computational Intelligence and Intelligent Informatics 1-5 November 2017  Peer-reviewed
    The purpose of this research is to construct a pedestrian tracking model using data assimilation. We use a floor field to express the behavior of pedestrians. We acquire pedestrian data and build a new floor field. Next, we try semantic segmentation using a deep convolutional neural network (DCNN). We experiment with discriminating between the background and pedestrians and consider the effectiveness of semantic segmentation. Finally, we compare the accuracy of pedestrian tracking using background subtraction with that of pedestrian tracking using semantic segmentation.
  • Kazushi OKAMOTO, Kazuhiko KAWAMOTO
    Journal of Japan Society for Fuzzy Theory and Intelligent Informatics 29(2) 574-578 April 2017  Peer-reviewed
  • Kazuhiko Kawamoto, Yoshiyuki Tomura, Kazushi Okamoto
    2017 Joint 17th World Congress of International Fuzzy Systems Association and 9th International Conference on Soft Computing and Intelligent Systems (IFSA-SCIS) 1-4 2017  Peer-reviewed
    We investigate the use of kriging, a spatial statistical tool for random fields, for modeling pedestrian dynamics with interaction between pedestrians. In particular, we focus on a comparison of prediction and interpolation of pedestrian movements, discuss the advantages and disadvantages of these two computations, and demonstrate them with a publicly available dataset.
  • 川本 一彦, 古閑 勇祐, 岡本 一志
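    知能と情報 28(6) 932-941 December 2016  Peer-reviewed
    In this study, we propose a method for learning a probabilistic cellular automaton that simulates pedestrian movement from pedestrian trajectory data, and we apply it to tracking multiple people in video. Because it is generally difficult to collect trajectory data that is sufficient and dense over the entire target space, we introduce Dirichlet smoothing to compensate, in a Bayesian manner, for missing or insufficient data. Furthermore, for pedestrian tracking, we present a sequential data assimilation algorithm that uses images to sequentially update the pedestrian movement simulation produced by the learned probabilistic cellular automaton. Tracking experiments on real video show that the proposed probabilistic cellular automaton improves tracking performance compared with a method that is not based on data.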
  • Yuta Segawa, Kazuhiko Kawamoto, Kazushi Okamoto
    ISCIIA 2016 - 7th International Symposium on Computational Intelligence and Industrial Applications 1-6 November 2016  Peer-reviewed
    We propose a method for recognizing first-person activities based on image classification using DCNN (deep convolutional neural network) fine-tuning. It is well known that fine-tuning a pretrained DCNN provides high accuracy in image classification. However, collecting labeled training data for first-person activities is costly. On the other hand, in first-person activity videos, objects associated with the activity often come into sight, and their appearance does not change very much over the video sequences. Therefore, training images of first-person activities that involve handling objects can be generated without complex processes. In this study, we artificially generate training datasets for fine-tuning and apply them to recognizing first-person reading activities. For classification, we use a DCNN model pre-trained on ImageNet and retrain only the final-layer weights of the model on our artificial datasets. In our experiments, we evaluate the F-measure of the DCNN by applying it to the classification of real first-person activity images. As baseline approaches, we also use DCNN models built from scratch, a conventional NN (nearest neighbor) classifier, and an SVM (support vector machine). In addition, we consider how the artificially generated images affect DCNN features by visualizing them. The fine-tuned DCNN model achieved the best F-measure, 99.1%, in the experiments.
  • 古谷佳大, 堀内靖雄, 川本一彦, 下元正義, 眞崎浩一, 黒岩眞吾, 鈴木広一
    電子情報通信学会論文誌D J99-D(1) 90-92 January 2016  Peer-reviewed
  • Kazuhiko Kawamoto, Yoshiyuki Tomura, Kazushi Okamoto
    2016 IEEE/ACIS 15th International Conference on Computer and Information Science (ICIS) 921-924 2016  Peer-reviewed
    This paper proposes a method for learning pedestrian dynamics with kriging, which is a spatial interpolation method in geosciences. Pedestrian dynamics is generally restricted by other pedestrians, and this restriction is caused by social interaction between them. In the proposed method, the social interaction is represented by the spatio-temporal correlation of pedestrian dynamics, and the correlation is estimated by kriging. As an application of the proposed method, the prediction of pedestrian movement is examined and its performance is evaluated with a publicly available benchmark dataset. The experimental results show that 10-step-ahead prediction is successful for more than 80% of the trajectories in the dataset if a distance error of 2.0 m is allowed.
  • Kazuhiko Kawamoto, Yoshiyuki Tomura, Kazushi Okamoto
    2015 International Workshop on Advanced Computational Intelligence and Intelligent Informatics, IWACIII 2015 94-97 2015  Peer-reviewed
    This paper proposes a regression-based method for modeling social interaction of pedestrians. Pedestrian movements are generally restricted by other pedestrians, and this restriction is caused by social interaction between them. In the proposed method, the social interaction is represented by the spatiotemporal correlation of pedestrian movements, and the correlation is estimated by kriging, which is a spatiotemporal regression method. As an application of the proposed method, the prediction of pedestrian movement is examined and its performance is evaluated with a publicly available benchmark dataset. The experimental results show that a successful prediction rate of more than 80% is achieved for hundreds of pedestrian trajectories if a distance error of 1.5 m is allowed.
  • Kazushi Okamoto, Kazuhiko Kawamoto
    2015 International Workshop on Advanced Computational Intelligence and Intelligent Informatics, IWACIII 2015 142-145 2015  Peer-reviewed
    This paper validates kernel size parameters of CNNs (convolutional neural networks) for omnidirectional images to determine the best parameters for high human detection accuracy and low computational cost of training. The assumed CNN has an architecture with two hidden layers. The effects of changing the kernel size of each layer, measured as false detection rate, non-detection rate, and training time, are validated in a human detection experiment with real omnidirectional images. The experimental results suggest that a larger kernel size for the 1st hidden layer and a kernel size for the 2nd hidden layer slightly smaller than that of the 1st layer are recommended.
  • 浅沼 仁, 岡本 一志, 川本 一彦
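    知能と情報 27(5) 813-825 2015  Peer-reviewed
    We propose a human detector for omnidirectional images, which exhibit distortion and position-dependent changes in appearance, that does not require estimating camera parameters. The detector is based on a deep convolutional neural network, and we also propose a method for generating a large number of training samples by applying translation, scaling, rotation, and brightness-change transformations to a small number of training samples. In human detection experiments using images from an omnidirectional camera installed in a real environment, a detector combining HOG and Real AdaBoost yields a non-detection rate of 77.5% at a false detection rate of 0.001, whereas the proposed method yields a non-detection rate of 28.2% at the same false detection rate. We also verify that the proposed training sample generation method contributes to the improvement in accuracy, and we examine the relationship between the visualized feature maps and detection accuracy. The proposed approach thereby improves the accuracy of human detection from omnidirectional images even when camera parameters are difficult to estimate.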
  • Kazushi Okamoto, Hitoshi Asanuma, Kazuhiko Kawamoto
    2014 World Automation Congress (WAC): Emerging Technologies for a New Paradigm in System of Systems Engineering 415-420 2014  Peer-reviewed
    A graph-based data mining method that automatically discovers usage patterns from user-to-user and user-to-object interactions in a collaborative learning space is proposed. The proposal mathematically describes the observed users, objects, and their interactions at a given time as a set of graphs (a usage pattern), in which each node is a user or an object and edges are assigned according to the physical distance between two nodes. It is validated that the proposal can provide useful data for interview planning and evidence for interview results. In the validation, detection of frequent local usage patterns, detection of rare spatial layouts among usage patterns, and grouping of hours containing similar local usage patterns are demonstrated with 324 pictures taken at the collaborative learning space in Chiba University Library.
  • Jiro Nakajima, Akihiro Sugimoto, Kazuhiko Kawamoto
    Image and Video Technology, PSIVT 2013 8333 468-480 2014  Peer-reviewed
    The saliency map has been proposed to identify regions that draw human visual attention. Differences of features from the surroundings are hierarchically computed for an image or an image sequence in multiple resolutions and they are fused in a fully bottom-up manner to obtain a saliency map. A video usually contains sounds, and not only visual stimuli but also auditory stimuli attract human attention. Nevertheless, most conventional methods discard auditory information and image information alone is used in computing a saliency map. This paper presents a method for constructing a visual saliency map by integrating image features with auditory features. We assume a single moving sound source in a video and introduce a sound source feature. Our method detects the sound source feature using the correlation between audio signals and sound source motion, and computes its importance in each frame in a video using an auditory saliency map. The importance is used to fuse the sound source feature with image features to construct a visual saliency map. Experiments using subjects demonstrate that a saliency map by our proposed method reflects human visual attention more accurately than that by a conventional method.
  • Kazushi Okamoto, Mayu Horiuchi, Kazuhiko Kawamoto
    Proc. 14th International Symposium on Advanced Intelligent Systems 1-11 2013  Peer-reviewed
  • Fu-Cheng Lin, Kazuhiko Kawamoto, Kazushi Okamoto
    Proc. 3rd International Workshop on Advanced Computational Intelligence and Intelligent Informatics 1-5 2013  Peer-reviewed
  • Hayato Itoh, Tomoya Sakai, Kazuhiko Kawamoto, Atsushi Imiya
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 7953 LNCS 26-42 2013  Peer-reviewed
    In this paper, we experimentally evaluate the validity of dimension-reduction methods for the computation of the similarity in pattern recognition. Image pattern recognition uses pattern recognition techniques for the classification of image data. For the numerical achievement of image pattern recognition techniques, images are sampled using an array of pixels. This sampling procedure derives vectors in a higher-dimensional metric space from image patterns. For the accurate achievement of pattern recognition techniques, the dimension reduction of data vectors is an essential methodology, since the time and space complexities of data processing depend on the dimension of data. However, dimension reduction causes information loss of geometrical and topological features of image patterns. The desired dimension-reduction method selects an appropriate low-dimensional subspace that preserves the information used for classification.
  • Hayato Itoh, Tomoya Sakai, Kazuhiko Kawamoto, Atsushi Imiya
    Image Analysis, SCIA 2013 7944 195-204 2013  Peer-reviewed
    In this paper, we experimentally evaluate the validity of dimension-reduction methods which preserve topology for image pattern recognition. Image pattern recognition uses pattern recognition techniques for the classification of image data. For the numerical achievement of image pattern recognition techniques, images are sampled using an array of pixels. This sampling procedure derives vectors in a higher-dimensional metric space from image patterns. For the accurate achievement of pattern recognition techniques, the dimension reduction of data vectors is an essential methodology, since the time and space complexities of data processing depend on the dimension of data. However, the dimension reduction causes information loss of geometrical and topological features of image patterns. The desired dimension-reduction method selects an appropriate low-dimensional subspace that preserves the topological information of the classification space.
  • Kazuhiko Kawamoto, Hikaru Kazama, Kazushi Okamoto
    2013 Second International Conference on Robot, Vision and Signal Processing (RVSP) 160-163 2013  Peer-reviewed
    We propose an image-retrieval-based method for visual localization in indoor scenes, provided that a geotagged image database of indoor environments is given. For image retrieval, we introduce a voting-based image similarity that is robust to geometric image transformations and occlusions. In order to further improve the performance of image retrieval, we introduce two additional procedures: multiple voting and a ratio test. These two procedures are effective in increasing the true positives and in decreasing the false positives, respectively. In addition, we introduce a particle filter to smoothly estimate the trajectory of a moving camera used for visual localization. In experiments with real images captured at a university library, we show that the proposed method outperforms a structure-from-motion-based method.

MISC

 217

Lectures and Oral Presentations

 36

Professional Memberships

 5

Research Projects (Joint Research, Competitive Funding, etc.)

 13

Industrial Property Rights

 1