Sentiment Analysis Using Deep Learning
This study aimed to analyze the advantages of deep learning methods over baseline machine learning methods on the task of sentiment analysis in Twitter. All techniques were evaluated on a set of English tweets classified on a five-point ordinal scale, provided by the SemEval-2017 organizers. For the implementation, we used two open-source Python libraries. The results and conclusions of the study are discussed.
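Evaluation on a five-point ordinal scale differs from plain classification because predicting a neighbouring class is a smaller error than predicting a distant one. A minimal sketch of one ordinal measure, macro-averaged mean absolute error, follows. The abstract does not say which measure the study used, so the choice of measure, the function name `macro_mae`, and the label encoding from -2 to 2 are assumptions made for illustration.

```python
from collections import defaultdict


def macro_mae(gold, predicted):
    """Macro-averaged mean absolute error for ordinal labels.

    The absolute error is averaged within each gold class first, and the
    per-class means are then averaged, so rare classes count as much as
    frequent ones.
    """
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and equally long")
    errors_by_class = defaultdict(list)
    for g, p in zip(gold, predicted):
        errors_by_class[g].append(abs(g - p))
    per_class = [sum(errs) / len(errs) for errs in errors_by_class.values()]
    return sum(per_class) / len(per_class)


# Hypothetical labels on a five-point scale from -2 (very negative) to 2.
gold = [-2, -1, 0, 0, 1, 2]
predicted = [-1, -1, 0, 1, 1, 0]
score = macro_mae(gold, predicted)
```

Lower values are better, and a perfect prediction yields 0. Because each class contributes equally, a system that ignores the rare extreme classes is penalized more than it would be under plain accuracy.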
With the popularization of social media, a vast amount of textual content with additional geo-located and time-stamped information is generated directly by people every day. Both the meaning of a tweet and its extended message information can be analyzed to explore variations in public mood within certain time periods. This paper describes the development of a program for public mood monitoring based on sentiment analysis of Twitter content in Russian. Machine learning (a naive Bayes classifier) and natural language processing techniques were used to implement the program. The result is a client-server program in which the server-side application collects tweets via the Twitter API and analyzes them with a naive Bayes classifier, and the client-side web application visualizes the public mood using Google Charts libraries. The mood visualization consists of a geo chart of mood across Russia, a plot of mood changes through the day, and a plot of mood changes through the week. Cloud computing services were used in two ways. First, the program was deployed on Google App Engine, which completely abstracts away the infrastructure, so no server administration is required. Second, the data is stored in Google Cloud Datastore, a highly scalable NoSQL document database that is fully integrated with Google App Engine.
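The classification step can be illustrated with a minimal multinomial naive Bayes sketch. It is not the program described in the abstract: the class name `NaiveBayesSentiment`, the whitespace tokenization, the add-one smoothing, and the English toy tweets are all assumptions chosen to keep the example self-contained.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesSentiment:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocabulary = set()
        for text, label in zip(texts, labels):
            tokens = text.lower().split()
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in text.lower().split():
                if token not in self.vocabulary:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data; the original program worked on Russian tweets.
train_texts = ["what a great day", "i love this", "this is terrible", "i hate mondays"]
train_labels = ["positive", "positive", "negative", "negative"]
model = NaiveBayesSentiment().fit(train_texts, train_labels)
mood = model.predict("i love this great day")
```

Working in log space avoids numerical underflow on longer texts, and the smoothing term keeps a single unseen word-class pair from forcing a probability of zero.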
The Semantic Evaluation (SemEval) series of workshops focuses on the evaluation and comparison of systems that can analyse diverse semantic phenomena in text with the aim of extending the current state of the art in semantic analysis and creating high quality annotated datasets in a range of increasingly challenging problems in natural language semantics. SemEval provides an exciting forum for researchers to propose challenging research problems in semantics and to build systems/techniques to address such research problems. SemEval-2016 is the tenth workshop in the series of International Workshops on Semantic Evaluation Exercises. The first three workshops, SensEval-1 (1998), SensEval-2 (2001), and SensEval-3 (2004), focused on word sense disambiguation, each time growing in the number of languages offered, in the number of tasks, and also in the number of participating teams. In 2007, the workshop was renamed to SemEval, and the subsequent SemEval workshops evolved to include semantic analysis tasks beyond word sense disambiguation. In 2012, SemEval turned into a yearly event. It currently runs every year, but on a two-year cycle, i.e., the tasks for SemEval-2016 were proposed in 2015. SemEval-2016 was co-located with the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT’2016) in San Diego, California. 
It included the following 14 shared tasks organized in five tracks:
• Text Similarity and Question Answering Track
  – Task 1: Semantic Textual Similarity: A Unified Framework for Semantic Processing and Evaluation
  – Task 2: Interpretable Semantic Textual Similarity
  – Task 3: Community Question Answering
• Sentiment Analysis Track
  – Task 4: Sentiment Analysis in Twitter
  – Task 5: Aspect-Based Sentiment Analysis
  – Task 6: Detecting Stance in Tweets
  – Task 7: Determining Sentiment Intensity of English and Arabic Phrases
• Semantic Parsing Track
  – Task 8: Meaning Representation Parsing
  – Task 9: Chinese Semantic Dependency Parsing
• Semantic Analysis Track
  – Task 10: Detecting Minimal Semantic Units and their Meanings
  – Task 11: Complex Word Identification
  – Task 12: Clinical TempEval
• Semantic Taxonomy Track
  – Task 13: TExEval-2 – Taxonomy Extraction
  – Task 14: Semantic Taxonomy Enrichment
This volume contains both Task Description papers that describe each of the above tasks and System Description papers that describe the systems that participated in the above tasks. A total of 14 task description papers and 198 system description papers are included in this volume. We are grateful to all task organisers as well as the large number of participants whose enthusiastic participation has made SemEval once again a successful event. We are thankful to the task organisers who also served as area chairs, and to task organisers and participants who reviewed paper submissions. These proceedings have greatly benefited from their detailed and thoughtful feedback. We also thank the NAACL 2016 conference organizers for their support. Finally, we most gratefully acknowledge the support of our sponsor, the ACL Special Interest Group on the Lexicon (SIGLEX).
The SemEval-2016 organizers, Steven Bethard, Daniel Cer, Marine Carpuat, David Jurgens, Preslav Nakov and Torsten Zesch
Product reviews on e-commerce sites are a valuable resource for evaluating customers' behavior, preferences, and needs. This paper presents an approach to sentiment analysis of product reviews in Russian using convolutional neural networks. We use pre-trained Word2Vec vectors as inputs to the neural networks. The approach uses no hand-crafted features or sentiment lexicons. The training dataset was collected from reviews of top-ranked goods on a major e-commerce site in Russia, with the user-assigned scores used as class labels. The system achieved an F-measure of up to 75.45% in three-class classification. The collected training dataset and word embeddings are available to the research community.
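To show how a convolutional network reads a sentence of word vectors, the sketch below applies one convolution filter with ReLU and max-over-time pooling. It is a generic illustration, not the architecture of the paper, which the abstract does not specify. The function `conv1d_max_pool`, the two-dimensional toy vectors standing in for Word2Vec embeddings, and the hand-set filter weights are all hypothetical; in a trained network the weights are learned.

```python
def conv1d_max_pool(embedded, filt, bias=0.0):
    """Slide one filter over a sentence matrix and max-pool over time.

    embedded: list of word vectors (one per token), each of dimension d.
    filt: list of `width` weight vectors, each of dimension d.
    Returns the largest ReLU activation over all window positions.
    """
    width = len(filt)
    if len(embedded) < width:
        raise ValueError("sentence is shorter than the filter width")
    activations = []
    for start in range(len(embedded) - width + 1):
        window = embedded[start:start + width]
        total = bias
        for word_vec, weight_vec in zip(window, filt):
            total += sum(x * w for x, w in zip(word_vec, weight_vec))
        activations.append(max(0.0, total))  # ReLU
    return max(activations)


# Toy 2-dimensional vectors standing in for pre-trained Word2Vec embeddings.
toy_vectors = {
    "the": [0.0, 0.0],
    "phone": [0.1, 0.1],
    "is": [0.0, 0.1],
    "not": [-1.0, 0.5],
    "good": [1.0, 0.5],
}
sentence = ["the", "phone", "is", "not", "good"]
embedded = [toy_vectors[token] for token in sentence]

# A hypothetical bigram filter that responds to "not" followed by "good".
negation_filter = [[-1.0, 0.0], [1.0, 0.0]]
feature = conv1d_max_pool(embedded, negation_filter)
```

A full classifier would run many such filters of several widths, concatenate the pooled features, and pass them to a softmax layer. Max pooling makes the feature independent of where in the review the pattern occurs.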
Volatility prediction—an essential concept in financial markets—has recently been addressed using sentiment analysis methods. We investigate the sentiment of annual disclosures of companies in stock markets to forecast volatility. We specifically explore the use of recent Information Retrieval (IR) term weighting models that are effectively extended by related terms using word embeddings. In parallel to textual information, factual market data have been widely used as the mainstream approach to forecast market risk. We therefore study different fusion methods to combine text and market data resources. Our word embedding-based approach significantly outperforms state-of-the-art methods. In addition, we investigate the characteristics of the reports of the companies in different financial sectors.
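One way to read "term weighting extended by related terms" is that the weight of a lexicon term counts not only its own occurrences but also occurrences of terms whose embeddings are close to it, discounted by their similarity. The sketch below follows that reading. The function `extended_term_frequency`, the similarity threshold of 0.7, and the toy vectors are assumptions for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def extended_term_frequency(term, tokens, vectors, threshold=0.7):
    """Count `term` plus similarity-weighted counts of related terms."""
    counts = Counter(tokens)
    score = float(counts[term])
    if term not in vectors:
        return score
    for other, count in counts.items():
        if other == term or other not in vectors:
            continue
        similarity = cosine(vectors[term], vectors[other])
        if similarity >= threshold:
            score += similarity * count
    return score


# Hypothetical embeddings: "deficit" is close to "loss", "growth" is not.
toy_vectors = {
    "loss": [1.0, 0.0],
    "deficit": [0.8, 0.6],
    "growth": [0.0, 1.0],
}
report = "loss deficit deficit growth".split()
weight = extended_term_frequency("loss", report, toy_vectors)
```

The resulting extended frequency can then be fed into a standard term weighting scheme in place of the raw count. The threshold controls how aggressively the lexicon is expanded.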
This study is motivated by the need to measure the degree of statistical interdependence between text messages available on the Internet and the financial results of companies. The main purpose of the work is to identify features of text messages that help differentiate successful organizations from organizations in crisis.
In this paper, we describe a deep learning system for emotion detection in textual conversations that participated in SemEval-2019 Task 3 “EmoContext”. We designed a bidirectional LSTM architecture that not only learns semantic and sentiment feature representations but also captures user-specific conversation features. To fine-tune the word embeddings using distant supervision, we additionally collected a large amount of emotional text. The system achieved a micro-averaged F1 score of 72.59% for the emotion classes on the test dataset, significantly outperforming the officially released baseline. The word embeddings and the source code were released for the research community.
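The reported measure, a micro-averaged F1 score over the emotion classes only, can be sketched as follows. True positives, false positives, and false negatives are pooled across the emotion classes, and the non-emotion class contributes only through errors that involve an emotion class. The function name `micro_f1` and the label names in the example are assumptions made for illustration and are not stated in the abstract.

```python
def micro_f1(gold, predicted, target_classes):
    """Micro-averaged F1 restricted to `target_classes`."""
    true_pos = false_pos = false_neg = 0
    for g, p in zip(gold, predicted):
        if p in target_classes:
            if g == p:
                true_pos += 1
            else:
                false_pos += 1  # predicted an emotion that was wrong
        if g in target_classes and g != p:
            false_neg += 1  # missed a gold emotion
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels; "others" stands for the non-emotion class.
gold = ["happy", "sad", "angry", "others", "others", "happy"]
predicted = ["happy", "others", "angry", "sad", "others", "angry"]
score = micro_f1(gold, predicted, {"happy", "sad", "angry"})
```

Restricting the average to the emotion classes keeps a dominant non-emotion class from inflating the score, since correctly predicting that class adds nothing to the true positive count.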