A Multimodal Associative Recommendation System

Overall Concept

Recommendation underlies many internet and web services. We develop novel recommendation techniques that simulate human cognitive memory, specifically crossmodal associative recall between vision and language. Machine learning models trained on a corpus of articles containing images convert between image and text. Combined with user lifelog and social data, this technology provides personalized crossmodal recommendation services in mobile environments on smartphones and tablets. This work is supported by the IT R&D Program of the Ministry of Knowledge Economy under the Korea Evaluation Institute of Industrial Technology (KEIT).
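To make the idea of crossmodal associative recall concrete, the following is a minimal sketch and not the project's actual model. It assumes a toy corpus in which each article is reduced to a set of text words and a set of image tags, and it stores plain co-occurrence counts as the "memory". The corpus contents and the function names build_associations and recall are hypothetical, and simple counting stands in here for the hypernetwork models named in the publications.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: each article pairs text words with image tags.
CORPUS = [
    ({"beach", "summer", "travel"}, {"sea", "sand", "sky"}),
    ({"summer", "fashion"}, {"dress", "sky"}),
    ({"travel", "mountain"}, {"snow", "sky"}),
    ({"beach", "resort"}, {"sea", "sand"}),
]

def build_associations(corpus):
    """Count how often each text word co-occurs with each image tag."""
    text_to_image = defaultdict(Counter)
    image_to_text = defaultdict(Counter)
    for words, tags in corpus:
        for w in words:
            for t in tags:
                text_to_image[w][t] += 1
                image_to_text[t][w] += 1
    return text_to_image, image_to_text

def recall(cues, memory, top_k=3):
    """Sum the association counts of all cues and return the strongest targets."""
    scores = Counter()
    for cue in cues:
        scores.update(memory.get(cue, {}))
    # Sort by descending score, then alphabetically for a deterministic order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [target for target, _ in ranked[:top_k]]

text_to_image, image_to_text = build_associations(CORPUS)
print(recall({"beach"}, text_to_image))        # text -> image tags: ['sand', 'sea', 'sky']
print(recall({"sea", "sand"}, image_to_text))  # image tags -> text: ['beach', 'resort', 'summer']
```

The same memory is queried in both directions, which is the sense in which the recall is crossmodal: a text cue retrieves image tags, and image tags retrieve text words.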

R&D Objectives (year by year)

Research on Information Extraction in Multimodal Richmedia

  • Attribute definition and relation summarization for complex information in image, movie, and text data
  • Framework development for compounding descriptors
  • Machine learning-based mutual generation between image and text using cross-modal information


Research on Context-based Information Extraction in Richmedia

  • Methods for context-based extraction of compound information and descriptor generation
  • Cross-modal context analysis in images and movies
  • Multimodal topic modeling with compound information extracted from richmedia
  • Test service for an online article-mall connection system


Research on Information Extraction of Richmedia in Dynamic Environments

  • Learning and modeling compound information in time-varying and space-varying data
  • Incremental, interactive analysis of compound information in richmedia
  • Incremental social analysis by multimodal topic models and its application to microblog analysis


Research on Recommendation Methods based on Multimodal Associativity

  • Multimodal-associative modeling of user preferences in richmedia
  • Interactive recommendation methods in dynamic richmedia environment
  • Multimodal interactive article-mall connection system with user preference capture and context recognition


Development of MARS and Its Application to Adaptive Recommendation

  • Recommendation engine based on multimodal association and user preference modeling
  • Construction of the framework for the multimodal associative recommendation system (MARS)
  • Personalized/adaptive richmedia recommendation system
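As an illustration of how multimodal association and user preference modeling might be combined in a recommendation engine, here is a minimal sketch under stated assumptions. It is not the MARS design. Items, the user's viewing history, and the current context are all reduced to bags of tags, and the score is a weighted blend of two cosine similarities. The catalog, the history, the alpha weight, and the function names are hypothetical.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse tag-weight vectors."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_profile(viewed_items):
    """Aggregate the tags of previously viewed items into a preference vector."""
    profile = Counter()
    for tags in viewed_items:
        profile.update(tags)
    return profile

def recommend(context_tags, profile, catalog, alpha=0.5, top_k=2):
    """Rank items by a blend of context association and user preference."""
    context = Counter(context_tags)
    scored = []
    for name, tags in catalog.items():
        item = Counter(tags)
        score = alpha * cosine(context, item) + (1 - alpha) * cosine(profile, item)
        scored.append((score, name))
    # Highest score first; ties are broken alphabetically by item name.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [name for _, name in scored[:top_k]]

# Hypothetical catalog and viewing history (assumed data, for illustration only).
CATALOG = {
    "sunglasses": ["beach", "summer", "fashion"],
    "ski jacket": ["snow", "mountain", "fashion"],
    "swimsuit": ["beach", "sea", "summer"],
}
VIEWED = [["fashion", "summer"], ["fashion", "dress"]]

profile = build_profile(VIEWED)
# Context tags could come from the article the user is currently reading.
print(recommend(["beach", "sea"], profile, CATALOG))
print(recommend(["beach", "sea"], profile, CATALOG, alpha=1.0))
```

With alpha=0.5 the fashion-heavy history lifts "sunglasses" above "swimsuit". With alpha=1.0 the history is ignored and the ranking follows the context alone, so "swimsuit" comes first.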


  Target system to be developed: MARS  


Publications

  • Layered hypernetwork models for cross-modal associative text and image keyword generation in multimodal information retrieval, J.-W. Ha, B.-H. Kim, B. Lee, and B.-T. Zhang, Proceedings of the Eleventh Pacific Rim International Conference on AI (PRICAI2010), 2010.
  • Visual query expansion via incremental hypernetwork models of image and text, M.-O. Heo, M.-G. Kang, and B.-T. Zhang, Proceedings of the Eleventh Pacific Rim International Conference on AI (PRICAI2010), 2010.
  • Text-to-image cross-modal retrieval of magazine articles based on higher-order pattern recall by hypernetworks, J.-W. Ha, B.-H. Kim, H.-W. Kim, W. C. Yoon, J.-H. Eom, and B.-T. Zhang, The 10th International Symposium on Advanced Intelligent Systems (ISIS 2009), pp. 274-277, 2009. (Best paper award)
  • Evolutionary hypernetworks for learning to generate music from examples, H.-W. Kim, B.-H. Kim, and B.-T. Zhang, IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2009), pp. 47-52, 2009.
  • Gender classification with cortical thickness measurement, J.-W. Ha, J.H. Jang, D.-H. Kang, W.H. Jung, J.S. Kwon, and B.-T. Zhang, IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2009), pp. 41-46, 2009.
  • Learning word sense disambiguation in biomedical text with difference between training and test distributions, J.-W. Son and S.-B. Park, In Proc. of the ACM Third International Workshop on Data and Text Mining in Bioinformatics, 2009.
  • A weighting scheme for tag recommendation in social bookmarking systems, S. Ju and K.-B. Hwang, In Proceedings of ECML/PKDD Discovery Challenge 2009. (To appear)
  • Evolutionary hypernetwork classifiers for protein-protein interaction sentence filtering, J. Bootkrajang, S. Kim, and B.-T. Zhang, The Genetic and Evolutionary Computation Conference (GECCO 2009), pp. 185-191, 2009.
  • An empirical study of choosing efficient discriminative seeds for oligonucleotide design, W.-H. Chung and S.-B. Park, In Proc. of the Eighth International Conference on Bioinformatics, 2009.
  • Auto-tagging method for unlabeled item images with hypernetworks for article-related item recommender systems, J.-W. Ha, B.-H. Kim, B. Lee, and B.-T. Zhang, Journal of the Korean Institute of Information Scientists and Engineers: Computing Practices and Letters, 16(10):1010-1014, 2010.
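  • Article-product associative search using a Bi-Source topic model, B.-H. Kim, B. Lee, S.-J. Ha, N. I. Cho, and B.-T. Zhang, Proceedings of the Korean Institute of Information Scientists and Engineers Fall Conference, 37 2(A), pp. 74-75, November 2010. (In Korean)
  • Hypernetwork-based automatic tagging of product images for a magazine-article-related product recommendation service, J.-W. Ha, B.-H. Kim, B. Lee, and B.-T. Zhang, Proceedings of the 2010 Korea Computer Congress (KCC2010), 37 1(A), pp. 123-124, June 2010. (In Korean)
  • Sentence generation from sequential multimodal data using random hypergraph models, W. C. Yoon and B.-T. Zhang, Proceedings of the 2010 Korea Computer Congress (KCC2010), 37 1(C), pp. 376-379, June 2010. (In Korean)
  • Music learning and crossover music generation using evolutionary hypernetworks, H.-W. Kim, B.-H. Kim, and B.-T. Zhang, Proceedings of the 2009 Korea Computer Congress, 36 1(A), pp. 134-138, July 2009. (In Korean)
  • A hypernetwork memory-based generative model of infant language learning, J.-H. Lee, E.-S. Lee, and B.-T. Zhang, Proceedings of the 2009 Korea Computer Congress, 36 1(A), pp. 128-129, July 2009. (In Korean)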

Project Title

Multimodal Information Extraction and Recommendation Technologies for Next-Generation Customized Services Based on Machine Learning


Project Period

March 2009 - February 2014


Sponsors

The Ministry of Knowledge Economy &
Korea Evaluation Institute of Industrial Technology (KEIT)

Principal Investigator

Prof. Byoung-Tak Zhang


Participants

Prof. Nam Ik Cho

Byoung-Hee Kim

WoongChang Yoon

Seong-Jong Ha

Jung-Woo Ha

Bado Lee

Collaborative Labs

ISPL Laboratory, Seoul National University
Digital Design House Co. Ltd (Storysearch & Storyshop)


Contact

Byoung-Hee Kim


bhkim -at- bi snu ac kr