LaText
Text Mining based on Latent Variable Models
▒ Research Goals ▒
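This research is carried out as part of the Brain Neuroinformatics program of the Ministry of Science and Technology. The ultimate goal of the Inference and Learning Technology team is to develop models of the cognitive and neural mechanisms of human memory and learning, to build on them neural-network-based inference and learning techniques that are both accurate and flexible, and to develop application systems that put these techniques to engineering use.
The Information Search team, to which this research group belongs, studies machine learning techniques grounded in cognitive psychology for information classification, filtering, and extraction, together with web content mining techniques. It applies them to the development of neural-network-based large-scale information retrieval systems, with the ultimate aim of developing Neuro-IR, a large-capacity, high-performance information retrieval system.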
▒ Research Plan and Methods ▒
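| Year | R&D objective | R&D content and scope |
| --- | --- | --- |
| Year 1 | Development of latent-variable neural network models for text information analysis | Study of latent-variable neural network models for analyzing and classifying text documents (multiple-cause models, PLSA, LSA, NMF, ICA, HMM, etc.); study of document indexing techniques based on latent-variable neural networks; development of latent-variable neural network models for extracting topic words from documents |
| Year 1 | Study of methods for analyzing, classifying, and filtering diverse web content | Study of analysis methods for the content of various web sites; study of neural-network-based methods for analyzing, classifying, and filtering web content |
| Year 1 | Exploration of information classification systems and development of cognitive-psychological and mathematical-psychological models | Study of human information classification and categorization through cognitive psychology experiments; development of behavioral and mathematical models of human information classification schemes; study of cognitive mechanisms specific to text processing; study of individual differences in information classification and categorization |
| Year 2 | Development of information retrieval technology based on latent-variable neural network learning | Study of automatic learning techniques for information retrieval neural network models; development of analysis, classification, and filtering technology for large-scale text documents; basic performance test on 10 GB of document data |
| Year 2 | Development of neural-network-based web content information extraction technology | Study of technology for extracting analyzed web content that matches a user's needs or preferences |
| Year 2 | Study of how to build systems suited to humans and how to resolve limitations that arise during construction | Study of the feasibility of implementing the proposed models and of implementation techniques; study of implementation methods for systems that exploit individual differences |
| Year 3 | Implementation and evaluation of Neuro-IR, a high-performance information retrieval system based on latent-variable neural network models | Development of text mining technology based on latent-variable neural networks; development of Neuro-IR, reaching 105% of the performance of the top-ranked groups on TREC ad-hoc retrieval; performance evaluation of Neuro-IR on a news assistant handling 100 GB of documents |
| Year 3 | Database construction and system integration with other projects | Construction of a product information database; verification of the database's utility; integration with the systems of other projects |
| Year 3 | Actual implementation of the stage-2 system and evaluation of the implemented system | Development of implementation methods for the models; comparative study of the developed system against other baseline models; study of the performance of the system that exploits individual differences |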
▒ Publications ▒
- International Journal
- Word Sense Disambiguation by Learning Decision Trees from Unlabeled Data, Seong-Bae Park and Byoung-Tak Zhang, Applied Intelligence, vol. 19, pp. 27-38, 2003
- Genetic Mining of HTML Structures for Effective Web-Document Retrieval, Sun Kim and Byoung-Tak Zhang, Applied Intelligence, vol. 18, no. 3, pp. 243-256, 2003
- Gene Expression Pattern Analysis via Latent Variable Models Coupled with Topographic Clustering, Jeong-Ho Chang, Sung Wook Chi, and Byoung-Tak Zhang, Genomics and Informatics, vol. 1, no. 1, pp. 34-40, 2003 (to appear)
- An Empirical Study on Dimensionality Optimization in Text Mining for Linguistic Knowledge Acquisition, Yu-Seop Kim, Jeong-Ho Chang, and Byoung-Tak Zhang, Lecture Notes in Artificial Intelligence, vol. 2637, pp. 111-116, 2003
- Large Scale Unstructured Document Classification Using Unlabeled Data and Syntactic Information, Seong-Bae Park and Byoung-Tak Zhang, Lecture Notes in Artificial Intelligence, vol. 2637, pp. 88-99, 2003.
- A Bayesian Evolutionary Approach to the Design and Learning of Heterogeneous Neural Trees, Byoung-Tak Zhang, Integrated Computer-Aided Engineering, vol. 9, no. 1, pp. 73-86, 2002
- Topic Extraction from Text Documents using Multiple-cause Networks, Jeong-Ho Chang, Jae Won Lee, Yuseop Kim, and Byoung-Tak Zhang, Lecture Notes in Artificial Intelligence, vol. 2417, pp. 434-443, 2002
- Construction of Large-Scale Bayesian Networks by Local to Global Search, Kyu-Baek Hwang, Jae Won Lee, Seung-Woo Chung, and Byoung-Tak Zhang, Lecture Notes in Artificial Intelligence, vol. 2417, pp. 375-383, 2002
- Target Word Selection using WordNet and Data-driven Models in Machine Translation, Yu-Seop Kim, Jeong-Ho Chang, and Byoung-Tak Zhang, Lecture Notes in Artificial Intelligence, vol. 2417, p. 607, 2002
- Customer Data Mining and Visualization by Generative Topographic Mapping Methods, Jin-San Yang and Byoung-Tak Zhang, Data Mining and Knowledge Discovery, 2002 (submitted)
- Domestic Journal
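- Bayesian-Network-Based Microarray Data Analysis via an Efficient Structure Learning Algorithm and Data Dimensionality Reduction, Kyu-Baek Hwang, Jeong-Ho Chang, and Byoung-Tak Zhang, Journal of the Korea Information Science Society: Software and Applications, vol. 29, no. 11/12, 2002
- Information Extraction from Web Documents using Self-Organizing HMMs, Jae-Hong Eom and Byoung-Tak Zhang, Journal of the Korea Information Science Society: Software and Applications, 2002 (submitted)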
- International Conference
- Classification of the Risk Types of Human Papilloma Virus by Decision Trees, Seong-Bae Park, Sohyun Hwang, and Byoung-Tak Zhang, The Fourth International Conference on Intelligent Data Engineering and Automated Learning (IDEAL03), 2003 (accepted)
- Automatic Webpage Classification Enhanced by Unlabeled Data, Seong-Bae Park and Byoung-Tak Zhang, The Fourth International Conference on Intelligent Data Engineering and Automated Learning (IDEAL03), 2003 (accepted)
- Analysis of Gene Expression Profiles and Drug Activity Patterns by Clustering and Bayesian Network Learning, Jeong-Ho Chang, Kyu-Baek Hwang, and Byoung-Tak Zhang, In Methods of Microarray Data Analysis II (Papers from CAMDA'01), Kluwer Academic Publishers, pp. 169-184, 2002
- A Boosted Maximum Entropy Model for Learning Text Chunking, Seong-Bae Park and Byoung-Tak Zhang, In Proceedings of 19th International Conference on Machine Learning (ICML'02), pp. 482-489, 2002
- Stock Trading System using Reinforcement Learning with Cooperative Agents, Jang-Min O, Jae Won Lee, and Byoung-Tak Zhang, In Proceedings of 19th International Conference on Machine Learning (ICML'02), pp. 451-458, 2002
- A Comparative Evaluation of Data-driven Models in Translation Selection of Machine Translation, Yuseop Kim, Jeong-Ho Chang, and Byoung-Tak Zhang, Proceedings of the 19th International Conference on Computational Linguistics (COLING2002), vol. 1, pp. 453-459, 2002
- Concurrent Evolution of Neural Networks and Their Data Sets, Je-Gun Joung and Byoung-Tak Zhang, In Proceedings of 8th International Conference on Neural Information Processing (ICONIP'01), pp. 115-120, 2001
- Domestic Conference
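- Document Similarity Measurement using Semantic Kernels Based on Helmholtz Machine Learning, Jeong-Ho Chang, Yu-Seop Kim, and Byoung-Tak Zhang, Proceedings of the Korea Information Science Society Spring Conference (B), pp. 440-442, 2003
- Classification of Gene Expression Data by Ensemble Bayesian Networks, Kyu-Baek Hwang, Jeong-Ho Chang, and Byoung-Tak Zhang, Proceedings of the Korea Information Science Society Spring Conference (B), pp. 434-436, 2003
- Double Clustering of Gene Expression Data by the Information Bottleneck Method, Byoung-Hee Kim, Kyu-Baek Hwang, Jeong-Ho Chang, and Byoung-Tak Zhang, Proceedings of the Korea Information Science Society Spring Conference (B), pp. 362-364, 2003
- Classification of Human Papillomavirus by Cost-Sensitive Learning, Sohyun Hwang, Seong-Bae Park, and Byoung-Tak Zhang, Proceedings of the Korea Information Science Society Spring Conference (B), pp. 401-403, 2003
- Molecular Neural Networks Based on Synaptic Potential Activity, Ho-Jin Jung, Dong-Yeon Cho, and Byoung-Tak Zhang, Proceedings of the Korea Information Science Society Spring Conference (B), pp. 416-418, 2003
- Natural Language Parsing using Evolutionary Computation, Dong-Min Kim, Seong-Bae Park, and Byoung-Tak Zhang, Proceedings of the Korea Information Science Society Spring Conference (B), pp. 419-421, 2003
- Part-of-Speech Disambiguation using a Boosted Maximum Entropy Model, Seong-Bae Park and Byoung-Tak Zhang, Proceedings of the Korea Information Science Society Spring Conference (B), pp. 522-524, 2003
- Risk-Group Classification of Human Papillomavirus by Decision Trees, Sohyun Hwang, Seong-Bae Park, and Byoung-Tak Zhang, Proceedings of the Korea Data Mining Society Fall Conference, pp. 148-160, 2002
- Document Recommendation using Latent Variable Models, Jong-Woo Lee and Byoung-Tak Zhang, Proceedings of the Korea Intelligent Information Systems Society Fall Conference, pp. 514-519, 2002
- Prepositional Phrase Attachment Disambiguation using a Boosted Maximum Entropy Model, Seong-Bae Park and Byoung-Tak Zhang, Proceedings of the Korea Information Science Society Fall Conference (II), vol. 29, no. 2, pp. 670-672, 2002
- Extraction of Higher-Order Interactions among Yeast Proteins via Hierarchical Clustering, Jae-Hong Eom and Byoung-Tak Zhang, Proceedings of the Korea Information Science Society Fall Conference (II), vol. 29, no. 2, pp. 364-366, 2002
- Document Classification using Co-Trained Support Vector Machines, Seong-Bae Park and Byoung-Tak Zhang, Proceedings of the Korea Information Science Society Spring Conference (B), vol. 29, no. 1, pp. 259-261, 2002
- Translation Selection by Word Similarity Based on Latent Semantic Structure, Jeong-Ho Chang, Yu-Seop Kim, and Byoung-Tak Zhang, Proceedings of the Korea Information Science Society Spring Conference (B), vol. 29, no. 1, pp. 502-504, 2002
- Text Information Extraction using S-HMM, Jae-Hong Eom and Byoung-Tak Zhang, Proceedings of the Korea Information Science Society Spring Conference (B), vol. 29, no. 1, pp. 328-330, 2002
- A Comparative Study of Text Learning Based on Latent Variable Models, Jeong-Ho Chang and Byoung-Tak Zhang, Proceedings of the Korean Brain Society Conference, pp. 120-121, 2002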
This page is maintained by Jeong-Ho Chang (jhchang@bi.snu.ac.kr). Last updated: April 28, 2003.