LSE Research Online

Using lexical patterns in the Google Web 1T corpus to deduce semantic relations between nouns

Nulty, Paul and Costello, Fintan J. (2009) Using lexical patterns in the Google Web 1T corpus to deduce semantic relations between nouns. In: DEW '09 Proceedings of the Workshop on Semantic Evaluations: Recent Achievements and Future Directions, 2009-06-01, Boulder CO, United States.

Full text not available from this repository.


This paper investigates methods for using lexical patterns in a corpus to deduce the semantic relation that holds between the two nouns in a noun-noun compound phrase such as "flu virus" or "morning exercise". Much of the previous work in this area has used automated queries to commercial web search engines. In our experiments we use the Google Web 1T corpus. This corpus contains every 2-, 3-, 4- and 5-gram occurring more than 40 times in Google's index of the web, and has the advantage of being available to researchers directly rather than through a web interface. This paper evaluates the performance of the Web 1T corpus on the task compared to similar systems in the literature, and also investigates which kinds of lexical patterns are most informative when trying to identify a semantic relation between two nouns.
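
As a rough illustration of what "lexical patterns" joining two nouns might look like in an n-gram corpus, the following minimal Python sketch counts the word sequences that occur between a noun pair. It is not the authors' implementation. It assumes each record is an n-gram followed by a tab and a frequency count, and it only considers n-grams that begin with one noun and end with the other; that restriction, the function name `joining_patterns`, and the sample records with their counts are all invented for illustration.

```python
from collections import Counter


def joining_patterns(ngram_lines, noun1, noun2):
    """Count the word sequences that join two nouns in n-gram records.

    Assumption: each record looks like 'w1 w2 ... wn<TAB>count'.
    Only n-grams that start with one noun and end with the other are used.
    """
    patterns = Counter()
    for line in ngram_lines:
        text, _, count = line.rstrip("\n").rpartition("\t")
        if not text or not count.isdigit():
            continue  # skip malformed records
        tokens = text.lower().split()
        if len(tokens) < 3:
            continue  # no joining words between the two nouns
        middle = " ".join(tokens[1:-1])
        ends = (tokens[0], tokens[-1])
        if ends == (noun1, noun2):
            key = f"N1 {middle} N2"
        elif ends == (noun2, noun1):
            key = f"N2 {middle} N1"
        else:
            continue
        # Weight each pattern by the n-gram's corpus frequency.
        patterns[key] += int(count)
    return patterns


# Hypothetical records; the counts are made up for this example.
sample = [
    "virus that causes flu\t120",
    "flu caused by virus\t75",
    "virus causes flu\t300",
    "flu and the virus\t50",
    "morning exercise routine\t90",
]

flu_virus = joining_patterns(sample, "flu", "virus")
```

The resulting pattern counts could then serve as features for a classifier that predicts the semantic relation, though the choice of features and classifier here is left open and is not taken from the paper.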

Item Type: Conference or Workshop Item (Paper)
Additional Information: © 2009 Association for Computational Linguistics
Divisions: Methodology
Subjects: C Auxiliary Sciences of History > C Auxiliary sciences of history (General)
Q Science > QA Mathematics > QA76 Computer software
JEL classification: C - Mathematical and Quantitative Methods > C6 - Mathematical Methods and Programming > C63 - Computational Techniques
Date Deposited: 08 Jul 2014 16:08
Last Modified: 29 Mar 2024 05:27
