Nuggeteer: Automatic Nugget-Based Evaluation Using Descriptions and Judgements
Author(s)
Marton, Gregory
Download: nuggeteer.pdf (230.8 Kb)
Other Contributors
Infolab
Advisor
Boris Katz
Abstract
TREC Definition and Relationship questions are evaluated on the basis of information nuggets that may be contained in system responses. Human evaluators provide informal descriptions of each nugget, and judgements (assignments of nuggets to responses) for each response submitted by participants. The best present automatic evaluation for these kinds of questions is Pourpre. Pourpre uses a stemmed unigram similarity between responses and nugget descriptions, yielding an aggregate result that is difficult to interpret but useful for relative comparison. Nuggeteer, by contrast, uses both the human descriptions and the human judgements, and makes binary decisions about each response, so that the end result is as interpretable as the official score. I explore n-gram length, use of judgements, stemming, and term weighting, and provide a new algorithm that is quantitatively comparable to, and qualitatively better than, the state of the art.
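For illustration, the following is a minimal Python sketch of the stemmed-unigram-overlap idea the abstract attributes to Pourpre; it is not Pourpre's published implementation. The crude_stem helper is a crude stand-in for a real stemmer, and the recall-oriented scoring and example strings are illustrative assumptions.

    # Minimal sketch of stemmed unigram overlap between a system response
    # and a nugget description (illustrative only; not Pourpre's code).
    import re

    def crude_stem(word):
        """Crude suffix stripper standing in for a real (e.g. Porter) stemmer."""
        for suffix in ("ing", "ed", "es", "s"):
            if word.endswith(suffix) and len(word) > len(suffix) + 2:
                return word[:-len(suffix)]
        return word

    def stemmed_unigrams(text):
        """Lowercase, tokenize, and stem, returning the set of unigram stems."""
        return {crude_stem(w) for w in re.findall(r"[a-z0-9]+", text.lower())}

    def unigram_similarity(response, nugget_description):
        """Fraction of the nugget description's stems found in the response."""
        nugget = stemmed_unigrams(nugget_description)
        if not nugget:
            return 0.0
        return len(nugget & stemmed_unigrams(response)) / len(nugget)

    # Hypothetical example strings, not taken from TREC data.
    nugget = "The Hubble telescope was launched in 1990"
    response = "NASA launched the Hubble telescope into orbit in April 1990"
    print(f"similarity = {unigram_similarity(response, nugget):.2f}")

Per the abstract, Nuggeteer goes further by using the human judgements to turn such scores into a binary per-response decision about each nugget; that calibration step is omitted from this sketch.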
Date issued
2006-01-09
Series/Report no.
Massachusetts Institute of Technology Computer Science and Artificial Intelligence Laboratory
Keywords
natural language, question answering