{"id":749,"date":"2019-12-05T17:22:16","date_gmt":"2019-12-05T16:22:16","guid":{"rendered":"http:\/\/tecnolengua.uma.es\/?page_id=749"},"modified":"2021-04-12T11:48:42","modified_gmt":"2021-04-12T10:48:42","slug":"sentiecon","status":"publish","type":"page","link":"https:\/\/tecnolengua.uma.es\/?page_id=749","title":{"rendered":"SentiEcon"},"content":{"rendered":"<div id=\"dslc-theme-content\"><div id=\"dslc-theme-content-inner\"><p class=\"qtranxs-available-languages-message qtranxs-available-languages-message-es\">Sorry, this entry is only available in <a href=\"https:\/\/tecnolengua.uma.es\/index.php?rest_route=%2Fwp%2Fv2%2Fpages%2F749&lang=en\" class=\"qtranxs-available-language-link qtranxs-available-language-link-en\" title=\"English\">English<\/a>.<\/p><p>SentiEcon (<a href=\"http:\/\/www.islrn.org\/resources\/314-817-285-706-3\/\">ISLRN: 314-817-285-706-3<\/a>) is a large, comprehensive, domain-specific computational lexicon for Economy and Finance designed for sentiment analysis applications.<\/p>\n<p>SentiEcon was created as a plug-in lexicon for the sentiment analysis tool <a href=\"http:\/\/tecnolengua.uma.es\/?dslc_projects=lingmotif\">Lingmotif<\/a>, and thus it follows its data structure requirements and presupposes the availability of a general-language core sentiment lexicon that covers non-specific sentiment-carrying terms and phrases. It contains 6,470 entries, both single words and multi-word expressions, each with tags denoting their semantic orientation and intensity. SentiEcon is formatted as a tab-separated UTF-8 file (e.g., money &lt;launder&gt; VB neg 3).<\/p>\n<p>We evaluated SentiEcon\u2019s performance by comparing results in a sentence classification task using exclusively sentiment words as features. 
This sentence dataset was extracted from business news texts and included keywords known to recurrently convey strong semantic orientation, such as \u201cdebt\u201d, \u201cinflation\u201d or \u201cmarkets\u201d. The results show that performance improves significantly when SentiEcon is added to a general-language sentiment lexicon.<\/p>\n<table border=\"1\">\n<colgroup>\n<col \/>\n<col \/> <\/colgroup>\n<tbody>\n<tr>\n<td><strong>Data field<\/strong><\/td>\n<td><strong>Example\/List<\/strong><\/td>\n<\/tr>\n<tr>\n<td>Word form<\/td>\n<td>launder, haircut, european central bank<\/td>\n<\/tr>\n<tr>\n<td>PoS<\/td>\n<td>[ALL, NN, JJ, VB, RB, UH, IN]<\/td>\n<\/tr>\n<tr>\n<td>Polarity<\/td>\n<td>[POS, NEG, NEU]<\/td>\n<\/tr>\n<tr>\n<td>Intensity<\/td>\n<td>[0, 1, 2, 3]<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Table 1: SentiEcon\u2019s data fields<\/p>\n<table style=\"height: 168px;\" border=\"1\" width=\"217\">\n<colgroup>\n<col \/>\n<col \/>\n<col \/>\n<col \/> <\/colgroup>\n<tbody>\n<tr>\n<td><strong>Polarity<\/strong><\/td>\n<td><strong>Words<\/strong><\/td>\n<td><strong>MWE<\/strong><\/td>\n<td><strong>Total<\/strong><\/td>\n<\/tr>\n<tr>\n<td>POS<\/td>\n<td>343<\/td>\n<td>1022<\/td>\n<td>1365<\/td>\n<\/tr>\n<tr>\n<td>NEG<\/td>\n<td>736<\/td>\n<td>1708<\/td>\n<td>2444<\/td>\n<\/tr>\n<tr>\n<td>NEU<\/td>\n<td>309<\/td>\n<td>2352<\/td>\n<td>2661<\/td>\n<\/tr>\n<tr>\n<td>Total<\/td>\n<td>1388<\/td>\n<td>5082<\/td>\n<td>6470<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Table 2: Count and distribution of entries in SentiEcon<\/p>\n<h2>SentiEcon GS-1000<\/h2>\n<p>SentiEcon GS-1000 (<a href=\"http:\/\/www.islrn.org\/resources\/524-008-163-978-0\/\">ISLRN: 524-008-163-978-0<\/a>) is a manually annotated gold standard dataset consisting of 1,000 sentences initially compiled to evaluate the performance of SentiEcon. Two domain experts annotated the dataset by classifying each sentence as belonging to one of three categories: POSITIVE, NEGATIVE, and NONE. 
They were instructed to annotate each sentence taking into account only the information available in it. The annotators worked independently and were then asked to reach a consensus where their labels differed.<\/p>\n<h1 class=\"title-header\">License request<\/h1>\n<p>SentiEcon will soon be available under the ELRA license, and GS-1000 will be released under Creative Commons BY-NC 3.0. You can request a trial version of SentiEcon and\/or SentiEcon GS-1000 for academic purposes. Simply send an email to <strong>tecnolengua [a] uma.es<\/strong> from an academic account with the following information and we will respond shortly:<\/p>\n<ul>\n<li>Full name<\/li>\n<li>Institution and department \/ position<\/li>\n<li>Brief description of your needs<\/li>\n<\/ul>\n<h1>Citing SentiEcon<\/h1>\n<p>If you use SentiEcon, please cite us:<\/p>\n<p>Moreno-Ortiz, A., Fern\u00e1ndez-Cruz, J., &amp; P\u00e9rez-Hern\u00e1ndez, C. (2020). Design and Evaluation of SentiEcon: A fine-grained Economic\/Financial Sentiment Lexicon from a Corpus of Business News. Proceedings of the 12th Language Resources and Evaluation Conference, Marseille, 5067-5074. <span class=\"s1\">http:\/\/www.lrec-conf.org\/proceedings\/lrec2020\/pdf\/2020.lrec-1.623.pdf<\/span><\/p><\/div><\/div>","protected":false},"excerpt":{"rendered":"<p>Sorry, this entry is only available in English. SentiEcon (ISLRN: 314-817-285-706-3) is a large, comprehensive, domain-specific computational lexicon for Economy and Finance designed for sentiment analysis applications. 
SentiEcon was created as a plug-in lexicon for the sentiment analysis tool Lingmotif, and thus it follows its data structure requirements and presupposes the availability of a general-language [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"parent":8,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-749","page","type-page","status-publish","hentry"],"jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/tecnolengua.uma.es\/index.php?rest_route=\/wp\/v2\/pages\/749","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/tecnolengua.uma.es\/index.php?rest_route=\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/tecnolengua.uma.es\/index.php?rest_route=\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/tecnolengua.uma.es\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/tecnolengua.uma.es\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=749"}],"version-history":[{"count":11,"href":"https:\/\/tecnolengua.uma.es\/index.php?rest_route=\/wp\/v2\/pages\/749\/revisions"}],"predecessor-version":[{"id":875,"href":"https:\/\/tecnolengua.uma.es\/index.php?rest_route=\/wp\/v2\/pages\/749\/revisions\/875"}],"up":[{"embeddable":true,"href":"https:\/\/tecnolengua.uma.es\/index.php?rest_route=\/wp\/v2\/pages\/8"}],"wp:attachment":[{"href":"https:\/\/tecnolengua.uma.es\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=749"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}