Text Processing in NLP and its Applications
  • 2012 Spring Semester Undergraduate Course for Computer Science and Engineering
  • Instructor: Michael Strube
  • TA: not assigned
  • Classroom: 302-209
  • Lecture Homepage: http://michael.kimstrube.de/tnlpa.php
  • Time: not assigned
  • Text:
    • [1] Jurafsky, Daniel & James H. Martin (2008). Speech and Language Processing, 2nd ed. Upper Saddle River, N.J.: Prentice Hall.
    • [2] Bird, Steven, Ewan Klein & Edward Loper (2009). Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit. O'Reilly.
  • References:
  • Evaluation:
    • Project and project presentation, possibly in small groups (30%)
    • Paper presentation (20%)
    • Questions submitted to instructor (30%)
    • Participation in discussion (20%)
  • Course Description:
    • A text is more than a sequence of sentences. To understand a text, the reader needs to infer semantic and pragmatic relations between the sentences. In Computational Linguistics, methods have been developed for capturing this specific character of text: models of local and global coherence, coreference resolution algorithms, and theories describing the rhetorical, temporal, causal, and argumentative structure of text. After a short introduction to Natural Language Processing (NLP), these models, methods, and algorithms will be discussed. However, their benefit over simpler approaches can only be evaluated within NLP applications. Therefore, the class will also cover applications that process text, such as information extraction, question answering, automatic summarization, and sentiment analysis.
  • Goal:
    • Students should understand which NLP application requires which discourse processing component. Students should be able to extend applications with such components and evaluate them. (A short NLTK warm-up sketch follows below.)
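
    As a first taste of the Week 2 hands-on session with NLTK (textbook [2]), here is a minimal warm-up sketch. It is an illustration only, not part of the official course materials; it assumes a current NLTK release with the Brown corpus downloaded (e.g. via nltk.download('brown')):

        # Minimal NLTK warm-up sketch (illustration only, not official course code).
        # Assumes NLTK is installed and the Brown corpus has been fetched,
        # e.g. with nltk.download('brown').
        import nltk
        from nltk.corpus import brown

        # Lower-cased alphabetic tokens from the 'news' category of the Brown corpus
        words = [w.lower() for w in brown.words(categories='news') if w.isalpha()]

        # A frequency distribution: the kind of basic corpus statistic covered in Week 2
        fdist = nltk.FreqDist(words)
        print(fdist.most_common(10))

        # A sentence-level view of the same texts, the unit that the coherence,
        # coreference, and segmentation topics later in the course operate on
        sents = brown.sents(categories='news')
        print(len(sents), "sentences; first one:", " ".join(sents[0]))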

  • Lecture Schedule
  • Week / Topics
    Week 1
    • Introduction to NLP
    Week 2
    • Hands-on introduction to NLTK, tools, corpora
    Week 3
       a.  Text structure: local and global coherence
       b.  Model: Centering
    Week 4
       a.  Application: Anaphora resolution with Centering
       b.  Application: Evaluating readability with Centering
    Week 5
       a.  Application: Information ordering with Centering
       b.  Application: Information ordering with language models, LSA, machine learning
    Week 6
       a.  Method: Information structure/information status
       b.  Application: Using information status for generating pitch accent; anaphoricity and information status
    Week 7
       a.  Model: Lexical cohesion, lexical chains
       b.  Application: Automatic summarization using lexical chains
    Week 8
       a.  Model: Entity grid
       b.  Application: ...using the entity grid
    Week 9
       a.  Model: Introduction to coreference resolution (task, linguistic issues, corpora, evaluation)
       b.  Model: Machine learning for coreference resolution
    Week 10
       a.  Model: Graph-based approaches to coreference resolution
       b.  Application: Coreference resolution for automatic summarization; coreference resolution for question answering
    Week 11
       a.  Model: Text segmentation (lexical approaches)
       b.  Model: Text segmentation (unsupervised approaches)
    Week 12
       a.  Application: Segmentation for automatic summarization
       b.  Model: Discourse structure (RST)
    Week 13
       a.  Model: Discourse structure (DLTAG)
       b.  Application: Discourse structure and automatic summarization
    Week 14
       a.  Application: Discourse structure and question answering
       b.  Application: Discourse structure and sentiment analysis
    Week 15
    • Project presentations
    Week 16
       a.  Future Directions
       b.  Wrap-up

This page is maintained by Beom-Jin Lee
Last update: 2012.03.12.