Topic Detection and Tracking

Valentin Jijkoun & Maarten de Rijke
Informatics Institute
University of Amsterdam
http://ilps.science.uva.nl/Teaching/II0506
March 6, 2006

Internet Information/MIKII6, March 6, 2006                   Valentin Jijkoun, Maarten de Rijke

Outline
• Topic detection and tracking
• Overview of TDT 2004




TDT…
• Introduction…
       – http://www.nist.gov/speech/tests/tdt/

Topic Detection and Tracking
Terabytes of unorganized data
• 5 TDT Applications
       – Story segmentation*
       – Topic Tracking
       – Topic Detection
       – First Story Detection
       – Link Detection
* Not evaluated in 2004




TDT’s Research Domain
• Technology challenge
       – Develop applications that organize and locate relevant stories from a continuous feed of news stories
• Research driven by evaluation tasks
• Composite applications built from
       – Document Retrieval
       – Speech-to-Text (STT) – not included in 2004
       – Story Segmentation – not included in 2004

Definitions
• An event is …
       – A specific thing that happens at a specific time and place along with all necessary preconditions and unavoidable consequences.
• A topic is …
       – an event or activity, along with all directly related events and activities
• A broadcast news story is …
       – a section of transcribed text with substantive information content and a unified topical focus
TDT 2004
• TDT Evaluation Overview
• Changes in 2004
• 2004 TDT Evaluation Result Summaries
       – New Event Detection
       – Link Detection
       – Topic Tracking
       – Experimental Tasks:
               • Supervised Adaptive Topic Tracking
               • Hierarchical Topic Detection

Evaluation Corpus
                          TDT4 (2003 corpus)                 TDT5 (2004 corpus)
Collection Dates          Oct 1, 2000 to Jan 31, 2001        April 1, 2003 to Sep 30, 2003
Newswire Sources          3 Arabic, 2 English, 2 Mandarin    6 Arabic, 7 English, 4 Mandarin
Broadcast News Sources    2 Arabic, 5 English, 5 Mandarin    NONE
Story Counts              90735 news, 7513 non-news          407503 news, 0 non-news
Annotated topics          80                                 250
Average topic size        79 stories                         40 stories

• 2004: same languages as 2003
• Summary of differences
       – New time period
       – No broadcast news
               • No non-news stories
       – 4.5 times more stories
       – 3.1 times more topics
       – Topics have half as many on-topic stories
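The "summary of differences" figures can be rechecked from the corpus table. The short sketch below does that arithmetic; the dictionary keys are illustrative names, not part of the TDT data format. Note that the 4.5 figure matches when only news stories are counted (an assumption about how the slide computed it).

```python
# Corpus statistics copied from the Evaluation Corpus table.
# The key names are hypothetical labels chosen for this sketch.
tdt4 = {"news_stories": 90735, "topics": 80, "avg_topic_size": 79}
tdt5 = {"news_stories": 407503, "topics": 250, "avg_topic_size": 40}

story_ratio = tdt5["news_stories"] / tdt4["news_stories"]
topic_ratio = tdt5["topics"] / tdt4["topics"]
size_ratio = tdt5["avg_topic_size"] / tdt4["avg_topic_size"]

print(f"stories: {story_ratio:.1f}x")          # stories: 4.5x
print(f"topics: {topic_ratio:.1f}x")           # topics: 3.1x
print(f"average topic size: {size_ratio:.1f}x")  # average topic size: 0.5x
```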




Topic Size Distribution
[Figure: number of on-topic stories per topic (log scale, 1 to 1000); topics sorted by language and size; series: Arabic, Mandarin, English]
Topics per language combination:
       Arb+Eng+Man   35
       Arb+Eng        7
       Eng+Man       21
       Eng           63
       Man           62
       Arb           62

Multilingual Topic overlap
[Figure: story overlap between topics, labeled with topic IDs and counts of common and unique stories. Panels: Single Overlap Topics; Multiply Overlap Topics (topics on terrorism, including 107: Casablanca bombs and 71: Demonstrations in Casablanca)]




Topic labels

Single Overlap Topics
 72 Court indicts Liberian President
 89 Liberian former president arrives in exile

 29 Swedish Foreign Minister killed
125 Sweden rejects the Euro

151 Egyptian delegation in Gaza
189 Palestinian public uprising suspended for three months

 69 Earthquake in Algeria
145 Visit of Morocco Minister of Foreign Affairs to Algeria

186 Press conference between Lebanon and US foreign ministers
193 Colin Powell Plans to visit Middle East and Europe

Multiple Overlap Topics
105 UN official killed in attack
126 British soldiers attacked in Basra
215 Jerusalem: Bus suicide bombing
227 Bin Laden Videotape
171 Morocco: death sentences for bombing suspects

107 Casablanca bombs
 71 Demonstrations in Casablanca

106 Bombing in Riyadh, Saudi Arabia
118 World Economic Forum in Jordan
154 Saudi suicide bomber dies in shootout
 60 Saudi King has eye surgery

 80 Spanish Elections

Participation by Task: Showing the Number of Submitted System Runs
(- = no runs submitted)

                                                New Event   Hierarchical   Topic Tracking             Link
Sites                                           Detection   Topic          Traditional   Supervised   Detection
                                                            Detection                    Adaptation
Domestic
  CMU     Carnegie Mellon Univ.                 1           -              6             8            10
  IBM     International Business Machines       4           -              -             -            -
  SHAI    Stottler Henke Associates, Inc.       5           -              -             -            -
  UIowa   Univ. of Iowa                         -           -              -             -            4
  UMd     Univ. of Maryland                     -           -              1             2            -
  UMass   Univ. Massachusetts                   4           6              5             7            4
Foreign
  CUHK    Chinese Univ. of Hong Kong            -           1              -             -            -
  ICT     Institute of Computing Technology,
          Chinese Academy of Sciences           -           11             1             -            -
  NEU     Northeastern University in China      -           -              2             -            2
  TNO     Netherlands Org for Applied
          Scientific Research                   -           8              -             -            -

New Event Detection Task
• System Goal:
       – To detect the first story that discusses each topic
[Figure: a stream of stories from two topics (Topic 1, Topic 2), with the first story on each topic marked and the remaining stories labeled "Not First Stories"]

TDT Evaluation Methodology
• Tasks are modeled as detection tasks
       – Systems are presented with many trials and must answer the question: “Is this example a target trial?”
       – Systems respond:
               • YES this is a target, or NO this is not
               • Each decision includes a likelihood score indicating the system’s confidence in the decision
• System performance measured by linearly combining the system’s missed detection rate and false alarm rate
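The two error rates used in the evaluation can be computed directly from the YES/NO decisions once the true trial labels are known. The following is a minimal sketch, assuming decisions and ground truth are given as parallel lists of booleans; the function name `error_rates` and the toy data are illustrative and not part of any TDT tooling.

```python
def error_rates(decisions, is_target):
    """Return (missed detection rate, false alarm rate).

    decisions: list of bool, True where the system answered YES.
    is_target: list of bool, True where the trial really is a target.
    Assumes at least one target and one non-target trial.
    """
    n_target = sum(is_target)
    n_nontarget = len(is_target) - n_target
    # A miss is a target trial answered NO.
    misses = sum(1 for d, y in zip(decisions, is_target) if y and not d)
    # A false alarm is a non-target trial answered YES.
    false_alarms = sum(1 for d, y in zip(decisions, is_target) if d and not y)
    return misses / n_target, false_alarms / n_nontarget


# Toy example: 2 target trials (one missed), 3 non-target trials (one false alarm).
toy_decisions = [True, False, True, False, False]
toy_truth = [True, True, False, False, False]
p_miss, p_fa = error_rates(toy_decisions, toy_truth)
print(p_miss, round(p_fa, 3))  # 0.5 0.333
```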




Detection Evaluation Methodology
   • Performance is measured in terms of Detection Cost
           – CDet = CMiss * PMiss * Ptarget + CFA * PFA * (1 - Ptarget)
           – Constants:
                   • CMiss = 1 and CFA = 0.1 are preset costs
                   • Ptarget = 0.02 is the a priori probability of a target
           – System performance estimates
                   • PMiss and PFA
           – Normalized Detection Cost generally lies between 0 and 1:
                   • (CDet)Norm = CDet / min{CMiss * Ptarget, CFA * (1 - Ptarget)}
   • Detection Error Tradeoff (DET) curves graphically depict the
     performance tradeoff between PMiss and PFA
           – Makes use of likelihood scores attached to the YES/NO decisions
   • Two important scores per system
           – Actual Normalized Detection Cost
                   • Based on the YES/NO decision threshold
           – Minimum Normalized DET point
                   • Based on the DET curve: Minimum score with proper threshold

Performance Measures Example
   [Figure: DET curve (bottom left is better) with two marked points:
    P(miss) = 5.5%, P(fa) = 1.1%, Min DET Norm Cost = 0.11;
    P(miss) = 0.7%, P(fa) = 1.5%, Min DET Norm Cost = 0.08]
   [Figure: bar chart of Actual Normalized Detection Cost and Minimum DET
    Normalized Cost for English and Mandarin; y-axis: Detection Cost
    (ticks at 1, 0.1, 0.01)]
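The cost formulas can be checked with a few lines of Python. This is a minimal sketch, not NIST's scoring tool: the constants are the ones stated on the slide, while the function names, the `scores`/`labels` inputs, and the convention that a story is answered YES when its score is at or above the threshold are assumptions made for illustration.

```python
from typing import Sequence, Tuple

C_MISS = 1.0      # preset cost of a miss (from the slide)
C_FA = 0.1        # preset cost of a false alarm (from the slide)
P_TARGET = 0.02   # a priori probability of a target (from the slide)


def detection_cost(p_miss: float, p_fa: float) -> float:
    """CDet = CMiss * PMiss * Ptarget + CFA * PFA * (1 - Ptarget)."""
    return C_MISS * p_miss * P_TARGET + C_FA * p_fa * (1.0 - P_TARGET)


def normalized_cost(p_miss: float, p_fa: float) -> float:
    """CDet divided by the cost of the better trivial (all-NO or all-YES) system."""
    return detection_cost(p_miss, p_fa) / min(C_MISS * P_TARGET,
                                              C_FA * (1.0 - P_TARGET))


def error_rates(scores: Sequence[float], labels: Sequence[bool],
                threshold: float) -> Tuple[float, float]:
    """Return (PMiss, PFA), answering YES when score >= threshold (assumed).

    Requires at least one target and one non-target among the labels.
    """
    n_target = sum(labels)
    n_nontarget = len(labels) - n_target
    misses = sum(1 for s, y in zip(scores, labels) if y and s < threshold)
    false_alarms = sum(1 for s, y in zip(scores, labels) if not y and s >= threshold)
    return misses / n_target, false_alarms / n_nontarget


def minimum_normalized_cost(scores: Sequence[float],
                            labels: Sequence[bool]) -> float:
    """Lowest normalized cost over all thresholds (the minimum DET point)."""
    # Every observed score is a candidate threshold; +inf means "always NO".
    candidates = sorted(set(scores)) + [float("inf")]
    return min(normalized_cost(*error_rates(scores, labels, t))
               for t in candidates)


# The two operating points quoted on the "Performance Measures Example" slide:
print(round(normalized_cost(0.055, 0.011), 2))  # 0.11
print(round(normalized_cost(0.007, 0.015), 2))  # 0.08
```

The actual normalized cost is `normalized_cost` evaluated at the system's own YES/NO threshold, whereas `minimum_normalized_cost` sweeps the threshold after the fact, so the gap between the two reflects how well the threshold was chosen.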





Primary New Event Detection Results
Newswire, English Texts
   [Figure: bar chart of Actual Norm(Cost) and Minimum Norm(Cost) per system;
    y-axis: Normalized Cost (ticks at 1, 0.1); annotation: 2003’s best score]

TDT Link Detection Task
   System Goal:
           – To detect whether a pair of stories discuss the same topic.
             (Can be thought of as a “primitive operator” to build a variety of
             applications)

Primary Link Detection Results
Newswire, Multilingual links, 10-file deferral period
   [Figure: bar chart of Actual Norm(Cost) and Minimum Norm(Cost) per system;
    y-axis: Normalized Cost (ticks at 1, 0.1, 0.01)]
   Scores are better than last year!

Topic Tracking Task
   • System Goal:
           – To detect stories that discuss the target topic, in
             multiple source streams
                   • Supervised Training
                           – Given Nt sample stories that discuss a given target topic
                   • Testing
                           – Find all subsequent stories that discuss the target topic
   [Figure: story stream split into training data and test data;
    legend: on-topic, unknown]




Primary Tracking Results
Newswire, Multilingual Texts, 1 English Training Story
   [Figure: bar chart of Actual Norm(Cost) and Minimum Norm(Cost) per system;
    y-axis: Normalized Cost (ticks at 1, 0.1, 0.01); annotation: 2003’s best score]

Supervised Adaptive Tracking Task
   • Variation of Topic Tracking system goal:
           – To detect stories that discuss the target topic when
             a human provides feedback to the system
                   • System receives human judgment (on or off-topic)
                     for every retrieved story
           – Same task as TREC 2002 Adaptive Filtering
   [Figure: story stream split into training data and test data;
    legend: on-topic, unknown, un-retrieved, retrieved on-topic, retrieved off-topic]




Supervised Adaptive Tracking Metrics
   • Normalized Detection Cost
           – Same measure as for basic Tracking task
   • Linear Utility Measure
           – As defined for TREC 2002 Filtering Track
             (Robertson & Soboroff)
           – Measures value of the stories sent to the user:
                   • Credit for relevant stories, debit for non-relevant stories
                   • Equivalent to thresholding based on estimated probability
                     of relevance
           – No penalty for missing relevant stories
             (i.e. all precision, no recall)
           – Implication: Challenge is to beat the “do-nothing” baseline
             (i.e. a system that rejects all stories)

Supervised Adaptive Tracking Metrics
   • Linear Utility Measure Computation:
           – Basic formula: U = Wrel × R - NR
                   • R = number of relevant stories retrieved
                   • NR = number of non-relevant stories retrieved
                   • Wrel = relative weight of relevant vs non-relevant
                     (set to 10, by analogy with CMiss vs. CFA weights for CDet)
           – Normalization across topics:
                   • Divide by maximum possible utility score for each topic
           – Scaling across topics:
                   • Define arbitrary minimum possible score, to avoid having
                     average dominated by a few topics with huge NR counts
                   • Corresponds to application scenario in which user stops looking
                     at stories when system exceeds some tolerable false alarm rate
           – Scaled, normalized value:
                   Uscale = [ max(Unorm, Umin) - Umin ] / [ 1 - Umin ]
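The utility computation can be sketched in Python as follows. Wrel = 10 comes from the slide; the value Umin = -0.5 is an assumption borrowed from the TREC 2002 Filtering Track, and the scaling uses the TREC 2002 form in which Umin is subtracted in the numerator. Under those assumptions the "do-nothing" system scores 1/3, which matches the 0.33 baseline quoted later in these slides. All function names are hypothetical.

```python
W_REL = 10.0   # relative weight of relevant vs non-relevant (from the slide)
U_MIN = -0.5   # ASSUMPTION: minimum normalized utility, as in TREC 2002


def linear_utility(r: int, nr: int, w_rel: float = W_REL) -> float:
    """U = Wrel * R - NR for one topic."""
    return w_rel * r - nr


def normalized_utility(r: int, nr: int, n_relevant: int,
                       w_rel: float = W_REL) -> float:
    """Divide U by the maximum possible utility for the topic.

    The maximum is reached by retrieving all n_relevant relevant stories
    and no non-relevant ones; n_relevant must be positive.
    """
    return linear_utility(r, nr, w_rel) / (w_rel * n_relevant)


def scaled_utility(u_norm: float, u_min: float = U_MIN) -> float:
    """Clip at u_min, then map the range [u_min, 1] onto [0, 1]."""
    return (max(u_norm, u_min) - u_min) / (1.0 - u_min)


# A system that rejects every story retrieves nothing: R = 0, NR = 0.
do_nothing = scaled_utility(normalized_utility(0, 0, n_relevant=20))
print(round(do_nothing, 2))  # 0.33
```

Because of the clipping, a topic with a very large number of false alarms contributes a scaled utility of 0 rather than an arbitrarily large negative value, so a few such topics cannot dominate the average.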


Supervised Adaptive Tracking
Best Two Submissions per Site
Newswire, Multilingual Texts, 1 English Training Story
   [Figure: results for the best two submissions per site;
    annotation: Best 2004 standard tracking result!]

Effect of Supervised Adaptation
   • CMU4 is a simple cosine similarity tracker
           – Contrastive run submitted without supervised adaptation
   [Figure: bar chart of Minimum Norm(Cost) for Tracking vs. SA Tracking;
    y-axis: Normalized Cost (ticks at 1, 0.1, 0.01)]
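To make the idea of a cosine similarity tracker with and without supervised adaptation concrete, here is a minimal sketch. It is not the CMU4 system, whose details the slides do not give: the whitespace tokenizer, raw term-frequency vectors, the threshold of 0.2, and the rule of adding every confirmed on-topic story to the topic vector are all assumptions chosen for illustration.

```python
import math
from collections import Counter
from typing import Iterable


def vectorize(text: str) -> Counter:
    """Bag-of-words term frequencies (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm = (math.sqrt(sum(w * w for w in a.values()))
            * math.sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm else 0.0


class CosineTracker:
    """Tracks one topic, initialised from the Nt training stories."""

    def __init__(self, training_stories: Iterable[str],
                 threshold: float = 0.2) -> None:  # threshold is hypothetical
        self.topic = Counter()
        for story in training_stories:
            self.topic.update(vectorize(story))
        self.threshold = threshold

    def score(self, story: str) -> float:
        return cosine(self.topic, vectorize(story))

    def is_on_topic(self, story: str) -> bool:
        """The YES/NO decision; the score itself feeds the DET curve."""
        return self.score(story) >= self.threshold

    def feedback(self, story: str, on_topic: bool) -> None:
        """Supervised adaptation: fold confirmed on-topic stories into the topic."""
        if on_topic:
            self.topic.update(vectorize(story))


tracker = CosineTracker(["earthquake hits turkey"])
print(tracker.is_on_topic("earthquake damage in turkey"))  # True
print(tracker.is_on_topic("stock markets rally"))          # False
```

Calling `feedback` only for stories the tracker actually retrieved mirrors the task definition, in which the system receives a human judgment for every retrieved story; skipping the call gives the contrastive run without adaptation.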




                          Supervised Adaptive Tracking
                          Utility vs. Detection cost                                                                                                                                                                                                              Hierarchical Topic Detection
                                                                                                          Actual Normalized DET Cost

                                                                                                                                                                                                                                                                  • System goal:
                                                                                                          Min. Normalized DET Cost
                                                                                                                                                                                                       Minimum DET Cost vs. Scaled Utility
                             1                                                                            Scaled Utility
                                                                                                                                                                                                   0.8
                                                                                                                                                                                                                                                                         – To detect topics in terms of the (clusters of) stories
    System Performance




                                                                                                                                                                                                   0.7

                         0.1
                                                                                                                                                                                                   0.6                                                                     that discuss them
                                                                                                                                                                                  Scaled Utility




                                                                                                                                                                                                   0.5
                                                                                                                                                                                                   0.4                                                            • Problems with past Topic Detection evaluations:
              0.01
                                                                                                                                                                                                   0.3
                                                                                                                                                                                                   0.2
                                                                                                                                                                                                                           y = 1.0398x + 0.2942                          – Topics are at different levels of granularity,
                                                                                                                                                                                                                                     2

                                                                                                                                                                                                                                                                           yet systems had to choose single operating point
                                 CMU6

                                        CMU2

                                               CMU1

                                                      CMU5

                                                             CMU3-TrecUtl

                                                                            CMU4

                                                                                   CMU7

                                                                                          CMU8-dbg

                                                                                                     UMass2

                                                                                                              UMass1

                                                                                                                       UMass3

                                                                                                                                UMass4

                                                                                                                                         UMass7

                                                                                                                                                  UMD1

                                                                                                                                                         UMD2




    for creating a new cluster
  – Stories may pertain to multiple topics, yet systems had to assign each to only one cluster

[Figure: Minimum DET Cost vs. Scaled Utility, linear fit with R² = 0.2349]
• Performance on Utility measure:
  – 2/3 of systems surpassed baseline scaled utility score (0.33)
  – Most systems optimized for detection cost, not utility
• Detection Cost and Utility are only weakly correlated: R² of 0.23
  – Even for CMU3, which was tuned for utility
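The baseline scaled utility of 0.33 quoted above can be reproduced with a small sketch. It assumes the TREC 2002 form of the scaled measure, (max(Unorm, Umin) - Umin) / (1 - Umin), with Umin = -0.5; neither the value of Umin nor the subtraction in the numerator is stated on this slide, so both are assumptions. Wrel = 10 is the weight given earlier in the deck, and the topic size of 40 is only an illustrative value. The function name is hypothetical.

```python
def scaled_utility(relevant_retrieved, nonrelevant_retrieved, total_relevant,
                   w_rel=10.0, u_min=-0.5):
    """Scaled, normalized linear utility for a single topic.

    w_rel = 10 follows the deck; u_min = -0.5 and the "- u_min" term in the
    numerator are assumptions borrowed from the TREC 2002 filtering measure.
    """
    if total_relevant <= 0:
        raise ValueError("total_relevant must be positive")
    # Basic formula: U = Wrel * R - NR
    utility = w_rel * relevant_retrieved - nonrelevant_retrieved
    # Normalize by the best achievable utility for this topic.
    u_norm = utility / (w_rel * total_relevant)
    # Clip at u_min so a few topics with huge NR counts cannot dominate.
    return (max(u_norm, u_min) - u_min) / (1.0 - u_min)


# A "do-nothing" system retrieves no stories at all.
baseline = scaled_utility(0, 0, total_relevant=40)
perfect = scaled_utility(40, 0, total_relevant=40)
flooded = scaled_utility(0, 1000, total_relevant=40)
print(round(baseline, 2), perfect, flooded)
```

Under these assumptions a system that rejects every story scores one third, which is why beating 0.33 is the bar for this measure.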




Topic Hierarchy Solves Problems
• System operation:
  – Unsupervised topic training: no topic instances as input
  – Assign each story to one or more clusters
  – Clusters may overlap or include other clusters
  – Clusters must be organized as directed acyclic graph (DAG) with single root
  – Treated as retrospective search
• Semantics of topic hierarchy:
  – Root = entire collection
  – Leaf nodes = the most specific topics
  – Intermediate nodes represent different levels of granularity
• Performance assessment:
  – Given a topic, find matching cluster with lowest cost
[Figure: example single-root DAG with vertices a to j, edges, and story IDs s1 to s16; legend: Vertex, Edge, Story IDs]

Hierarchical Topic Detection Observations
• All systems structured hierarchy as a tree: each vertex has one parent
• Travel cost has very little effect on finding the best cluster
  – Setting WDET to 1.0 has little effect on topic mapping
• Cost parameters favor false alarms
  – Average mapped cluster sizes are between 1262 and 7757 stories
  – Average topic size is 40 stories
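The performance assessment step ("given a topic, find matching cluster with lowest cost") can be sketched as follows. This is a simplified illustration, not the official scorer: it uses only the normalized detection cost with the constants given earlier in the deck (CMiss = 1, CFA = 0.1, Ptarget = 0.02) and ignores travel cost entirely. The toy hierarchy, the story IDs and all function names are hypothetical; the graph is not the one in the figure.

```python
C_MISS, C_FA, P_TARGET = 1.0, 0.1, 0.02  # constants from the detection-cost slide


def cluster_stories(vertex, children, stories, cache):
    """Stories of a vertex = its own stories plus those of all descendants."""
    if vertex not in cache:
        members = set(stories.get(vertex, ()))
        for child in children.get(vertex, ()):
            members |= cluster_stories(child, children, stories, cache)
        cache[vertex] = members
    return cache[vertex]


def normalized_detection_cost(cluster, topic, n_stories):
    """Normalized CDet of using `cluster` as the answer for `topic`."""
    p_miss = len(topic - cluster) / len(topic)
    p_fa = len(cluster - topic) / (n_stories - len(topic))
    cost = C_MISS * p_miss * P_TARGET + C_FA * p_fa * (1 - P_TARGET)
    return cost / min(C_MISS * P_TARGET, C_FA * (1 - P_TARGET))


def best_cluster(root, children, stories, topic):
    """Return (vertex, cost) for the lowest-cost cluster reachable from root."""
    cache = {}
    all_stories = cluster_stories(root, children, stories, cache)
    costs = {v: normalized_detection_cost(m, topic, len(all_stories))
             for v, m in cache.items()}
    best = min(costs, key=costs.get)
    return best, costs[best]


# Hypothetical single-root DAG: vertex "d" has two parents ("a" and "b").
children = {"root": ["a", "b"], "a": ["c", "d"], "b": ["d", "e"]}
stories = {"c": {"s1", "s2"}, "d": {"s3", "s4"}, "e": {"s5", "s6"}}

exact_vertex, exact_cost = best_cluster("root", children, stories, {"s3", "s4"})
partial_vertex, partial_cost = best_cluster("root", children, stories,
                                            {"s1", "s2", "s3"})
print(exact_vertex, exact_cost, partial_vertex, round(partial_cost, 3))
```

In this tiny collection false alarms are expensive because there are so few off-topic stories; on a corpus of several hundred thousand stories the same constants make large clusters cheap, which fits the observation that the cost parameters favor false alarms.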
Summary
• Eleven research groups participated in five evaluation tasks
• Error rates increased for new event detection
  – Why?
• Error rates decreased for tracking
• Error rates decreased for link detection
• Dry run of hierarchical topic detection completed
  – Solves previous problems with topic detection task, but raises new issues
  – Questions to consider:
    • Is the specified hierarchical structure (single-root DAG) appropriate?
    • Is the minimal cost metric appropriate?
    • If so, is the normalization right?
• Dry run of supervised adaptive tracking completed
  – Promising results for including relevance feedback
  – Questions to consider:
    • Should we continue the task?
    • If so, should we continue using both metrics?

What do teams use?
• TNO (HDT at TDT 2004)
  – Focus on scalability
  – Agglomerative clustering scalable for large document collections
    • Take a sample
    • Build a hierarchical cluster structure of this sample
    • Optimize resulting binary tree for minimal cost metric
      – Detection cost, travel cost
    • Assign remaining docs from the corpus to clusters in the structure obtained from the sample
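The sample-then-assign idea in the TNO bullets can be illustrated with a minimal sketch. It is not TNO's system: the bag-of-words vectors, cosine similarity, centroid-style merging and nearest-leaf assignment are all assumptions chosen for brevity, the tree optimization step for the minimal cost metric is left out, and every function name and document below is hypothetical.

```python
import math
import random
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts (an assumed, very simple representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def take_sample(n_docs, k, seed=0):
    """Step 1: draw a reproducible random sample of document indices."""
    return sorted(random.Random(seed).sample(range(n_docs), k))


def agglomerate(vectors):
    """Step 2: greedy agglomerative clustering, returned as a binary tree.

    Leaves are positions in `vectors`; inner nodes are 2-tuples.
    """
    clusters = [(i, vec) for i, vec in enumerate(vectors)]
    while len(clusters) > 1:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = max(pairs, key=lambda p: cosine(clusters[p[0]][1],
                                               clusters[p[1]][1]))
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])  # summed term counts
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


def assign_to_leaves(sample_vectors, other_vectors):
    """Last step: attach each remaining document to its most similar leaf."""
    return [max(range(len(sample_vectors)),
                key=lambda i: cosine(sample_vectors[i], v))
            for v in other_vectors]


docs = ["bomb attack casablanca", "casablanca bomb suspects",
        "euro vote sweden", "sweden rejects euro",
        "attack in casablanca", "sweden euro referendum"]
sample_ids = [0, 1, 2, 3]  # fixed here for readability; take_sample() would be used in practice
sample_vectors = [vectorize(docs[i]) for i in sample_ids]
rest_vectors = [vectorize(d) for i, d in enumerate(docs) if i not in sample_ids]

tree = agglomerate(sample_vectors)
assignments = assign_to_leaves(sample_vectors, rest_vectors)
print(tree, assignments)
```

The expensive pairwise step only ever sees the sample, while the rest of the corpus costs one similarity pass per document, which is the scalability argument made on the slide.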




UMass at TDT 2004
• Hierarchical topic detection
       – Topic detection classifies stories into different
         topics
       – Two-step algorithm
               • 1-NN for event formation
                       – Stories from the same source are selected and ordered by time
                       – Stories are processed one by one; each incoming story
                         is compared to (a certain number of) the stories before it
               • Agglomerative clustering for building the hierarchy
                       – Events are sorted by the time stamp of their first story,
                         then a bounded agglomerative clustering is applied
                         to the events
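The 1-NN event formation step can be sketched as below. This is a hedged illustration rather than the UMass implementation: the cosine similarity over word counts, the look-back `window` and the `threshold` are assumed values, the function names are hypothetical, and the second step (bounded agglomerative clustering of the events) is not shown.

```python
import math
from collections import Counter


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def form_events(stories, window=3, threshold=0.3):
    """Assign an event id to each story, in time order.

    stories: list of (timestamp, text) pairs from a single source.
    window, threshold: assumed parameters, not taken from the slides.
    """
    ordered = sorted(stories, key=lambda s: s[0])
    vectors = [Counter(text.lower().split()) for _, text in ordered]
    event_of = []
    next_event = 0
    for i, vec in enumerate(vectors):
        # Compare the incoming story only to the `window` stories before it.
        best_j, best_sim = None, 0.0
        for j in range(max(0, i - window), i):
            sim = cosine(vec, vectors[j])
            if sim > best_sim:
                best_j, best_sim = j, sim
        if best_j is not None and best_sim >= threshold:
            event_of.append(event_of[best_j])  # join nearest neighbour's event
        else:
            event_of.append(next_event)        # start a new event
            next_event += 1
    return event_of


stories = [(1, "bomb attack in casablanca"),
           (2, "sweden rejects the euro"),
           (3, "casablanca bomb suspects arrested"),
           (4, "euro vote result in sweden")]
events = form_events(stories)
print(events)
```

Bounding the comparison to a recent window keeps the cost per incoming story constant, at the price of missing links to stories that fall outside the window.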

