{"id":401,"date":"2015-07-26T22:13:16","date_gmt":"2015-07-26T22:13:16","guid":{"rendered":"https:\/\/elex.link\/elex2015\/?page_id=401"},"modified":"2015-08-10T15:25:08","modified_gmt":"2015-08-10T15:25:08","slug":"paper-26","status":"publish","type":"page","link":"https:\/\/elex.link\/elex2015\/conference-proceedings\/paper-26\/","title":{"rendered":"paper-26"},"content":{"rendered":"<h3><strong>Longest\u2013commonest Match<\/strong><\/h3>\n<p><strong>Authors:<\/strong> Adam Kilgarriff, V\u00edt Baisa, Pavel Rychl\u00fd, Milo\u0161 Jakub\u00ed\u010dek<\/p>\n<p style=\"text-align: justify;\"><strong>Abstract:<\/strong><br \/>\nFinding two-word collocations is a well-studied task within natural language processing. The result of this task for a given headword is usually a list of collocations sorted by a salience score. In the corpus manager Sketch Engine, these pairs are extracted from data using word\u00a0sketch grammar relation rules and log-dice statistics, resulting in a sorted list of triples &lt;headword, grammar-relation, collocate&gt;. The longest\u2013commonest match is a straightforward extension of these two-word collocations into multiword expressions. The resulting expressions\u00a0are also very useful for representing the most common realisation of the collocational pair and\u00a0for facilitating the interpretation of the raw triple, because for such a triple it is sometimes not clear what texts it comes from. We present here the algorithm behind the longest\u2013commonest match together with a simple evaluation. The longest\u2013commonest match is already implemented in Sketch Engine.<\/p>\n<p><strong>Keywords:<\/strong> multiword expression; collocation; word sketch; Sketch Engine<\/p>\n<p><strong>Reference:<\/strong> In Kosem, I., Jakub\u00ed\u010dek, M., Kallas, J.,\u00a0Krek, S. (eds.) <em>Electronic lexicography in the 21st century: linking lexical data in the digital age. 
Proceedings of the eLex 2015 conference, 11-13 August 2015, Herstmonceux Castle, United Kingdom<\/em>. Ljubljana\/Brighton: Trojina, Institute for Applied Slovene Studies\/Lexical Computing Ltd., pp. 397-404.<\/p>\n<p><strong>URL:<\/strong> <a href=\"https:\/\/elex.link\/elex2015\/proceedings\/eLex_2015_26_Kilgarriff+etal.pdf\">https:\/\/elex.link\/elex2015\/proceedings\/eLex_2015_26_Kilgarriff+etal.pdf<\/a><\/p>\n<p><strong>Published:<\/strong> 2015<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Longest\u2013commonest Match Authors: Adam Kilgarriff, V\u00edt Baisa, Pavel Rychl\u00fd, Milo\u0161 Jakub\u00ed\u010dek Abstract: Finding two-word collocations is a well-studied task within natural language processing. The result of this task for a given headword is usually a list of collocations sorted by &hellip; <a class=\"more-link\" href=\"https:\/\/elex.link\/elex2015\/conference-proceedings\/paper-26\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":4,"featured_media":0,"parent":327,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-401","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/elex.link\/elex2015\/wp-json\/wp\/v2\/pages\/401","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/elex.link\/elex2015\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/elex.link\/elex2015\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/elex.link\/elex2015\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/elex.link\/elex2015\/wp-json\/wp\/v2\/comments?post=401"}],"version-history":[{"count":5,"href":"https:\/\/elex.link\/elex2015\/wp-json\/wp\/v2\/pages\/401\/revisions"}],"predecessor-version":[{"id":580,"href":"https:\/\/elex.link\/elex2015\/wp-json\/wp\/v2\/pages\/401\/revisions\/580"}],"up":[{"embeddable":true,"href":"http
s:\/\/elex.link\/elex2015\/wp-json\/wp\/v2\/pages\/327"}],"wp:attachment":[{"href":"https:\/\/elex.link\/elex2015\/wp-json\/wp\/v2\/media?parent=401"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}