Fake opinion detection: how similar are crowdsourced datasets to real data?

Fornaciari T.; Cagnina L.; Rosso P.; Poesio M.

doi:10.1007/s10579-020-09486-5

Language Resources and Evaluation. 2020. Vol. 54. No. 4. P. 1019–1058.

Fornaciari T., Cagnina L., Rosso P., Poesio M.

Identifying deceptive online reviews is a challenging task for Natural Language Processing (NLP). Collecting corpora for the task is difficult because it is normally not possible to know whether reviews are genuine. A common workaround involves collecting (supposedly) truthful reviews online and adding them to a set of deceptive reviews obtained through crowdsourcing services. Models trained this way are generally successful at discriminating between 'genuine' online reviews and the crowdsourced deceptive reviews. It has been argued that the deceptive reviews obtained via crowdsourcing are very different from real fake reviews, but the claim has never been properly tested. In this paper, we compare (false) crowdsourced reviews with a set of 'real' fake reviews published online. We evaluate their degree of similarity and their usefulness in training models for the detection of untrustworthy reviews. We find that the deceptive reviews collected via crowdsourcing are significantly different from the fake reviews published online. In the case of the artificially produced deceptive texts, it turns out that their domain similarity with the targets affects the models' performance much more than their untruthfulness does. This suggests that the use of crowdsourced datasets for opinion spam detection may not result in models applicable to the real task of detecting deceptive reviews. As an alternative way to create large datasets for the fake review detection task, we propose methods based on the probabilistic annotation of unlabeled texts, relying on the meta-information generally available on e-commerce sites. Such methods are independent of the content of the reviews and allow reliable models for the detection of fake reviews to be trained.

Research target: Computer Science

Priority areas: humanitarian IT and mathematics

Language: English

Keywords: crowdsourcing, deception detection, ground truth, probabilistic labeling

Tencent and Open Source: How Does China's Most Valuable Brand Treat Open-Source Software?

Silakov D., Системный администратор 2026 No. 5 P. 46–51

In the previous article on open source in China [1], we covered Alibaba, a major corporation ranked thirtieth among the world's most significant brands for 2025 [2]. That is an honorable place, but not the highest among Chinese companies: thirteenth place went to Tencent, the developer of WeChat and a number of other products widely used by our eastern neighbors. Tencent ...

Added: July 14, 2026

2026 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

IEEE, 2026.

Added: July 13, 2026

Mathematical Optimization Theory and Operations Research, 25th International Conference, MOTOR 2026 Irkutsk, Russia, July 6–11, 2026 Proceedings

Switzerland: Springer, 2026.

This volume contains the refereed proceedings of the 25th International Conference on Mathematical Optimization Theory and Operations Research (MOTOR 2026), held during July 6–11 in a picturesque place near Lake Baikal, Irkutsk, Russia. The MOTOR conference is a direct successor and scientific inheritor of several prominent events on mathematical programming, combinatorial and stochastic optimization, ...

Added: July 12, 2026

Infinite Regular Realizability Problems

Шиманогов И. Н., Vyalyi M., Дискретный анализ и исследование операций 2025 Vol. 32 No. 4 P. 213–230

A well-studied class of algorithmic problems is that of regular realizability: checking the non-emptiness of the intersection of a regular language with a given language. This problem has a natural algebraic interpretation: verifying whether an element of a Boolean algebra belongs to the kernel of a certain homomorphism. This motivates the consideration of an analogous ...

Added: July 12, 2026

Improving Differential Equation Solving in Compact Language Models via Activation Steering and Reinforcement Learning

Surkov A., Ignatenko V., Koltcov Sergei, Computers, Materials and Continua 2026

Large language models have recently demonstrated promising capabilities in mathematical reasoning; however, their performance on tasks requiring strict symbolic manipulation, such as solving differential equations, remains limited, especially for compact models. In this work, we investigate whether activation steering combined with reinforcement learning can improve the quality of solutions generated by pretrained language models without ...

Added: July 8, 2026

Computational Science and Its Applications – ICCSA 2026 Workshops

Springer, 2027.

The series Lecture Notes in Computer Science (LNCS), including its subseries Lecture Notes in Artificial Intelligence (LNAI) and Lecture Notes in Bioinformatics (LNBI), has established itself as a medium for the publication of new developments in computer science and information technology research, teaching, and education. LNCS enjoys close cooperation with the computer science R & ...

Added: July 8, 2026

Conference Proceedings: 2026 IEEE Ural-Siberian Conference on Biomedical Engineering, Radioelectronics and Information Technology (USBEREIT), 14-15 May 2026

IEEE, 2026.

The purpose of the 2026 IEEE Ural-Siberian Conference on Biomedical Engineering, Radioelectronics and Information Technology (USBEREIT) is to bring together researchers and practitioners from multiple areas of radio science, including biomedical engineering, radioelectronics, microelectronics, information technology, smart energy, information security and others. ...

Added: July 8, 2026

Modeling Specialized Routing Algorithms in Networks-on-Chip Represented by Series of Families of Circulant Topologies

Маликов М. А., Монахова Э. А., Rzaev E. et al., Ученые записки Казанского университета. Серия: Физико-математические науки 2026 Vol. 168 No. 2 P. 269–286

This article examines series of families of two-dimensional circulant networks with rectangular L-shapes, optimal in diameter, as network-on-chip topologies with a minimal number of crossings between the links and a bounded length of the maximum link that does not depend on the network size. New network-on-chip routing algorithms, which use the coordinates of three adjacent zeros in the ...

Added: July 8, 2026

Algorithmic overlaps as thermodynamic variables: From local to cluster Monte Carlo dynamics in critical phenomena

Pilé I., Shchur L., Deng Y., Physical Review B: Condensed Matter and Materials Physics 2026 Vol. 114 Article 014101

We investigate the spatial overlap of successive spin configurations in Markov chain Monte Carlo simulations using the local Metropolis algorithm and the Swendsen-Wang and Wolff cluster algorithms. We examine the dynamics of these algorithms for models in different universality classes: Ising model, Potts model with three components, and four-state Potts model. The overlap of two ...

Added: July 6, 2026

Telecommunications (Телекоммуникации) journal, No. 1, 2026

Moscow: Наука и технологии, 2026.

Telecommunications is a monthly peer-reviewed industrial, information-analytical, and educational-methodological journal published since July 2000. It is intended for managers and employees in industry, research and design institutes, and higher education institutions, for postgraduate and undergraduate students, and for specialists who develop, manufacture, and operate telecommunications equipment. It covers development and production news, development forecasts, information security, and regulatory, reference, analytical, and educational-methodological materials. Transition to the global information ...

Added: July 4, 2026

"Труды МФТИ" (Proceedings of MIPT), Vol. 17, No. 4 (68) (2025)

MIPT, 2025.

The work of the editorial office, editorial board, and editorial council of the scientific journal Proceedings of the Moscow Institute of Physics and Technology (Proceedings of MIPT for short) is carried out in accordance with the Regulations approved by the rector of the institute. The editorial board includes heads of the institute, its faculties, and institute and faculty departments. The editor-in-chief is N. N. Kudryavtsev, President of MIPT and Corresponding Member of the Russian Academy of Sciences. The journal is included in the RSCI (Russian Science Citation Index) database and is available in the electronic ...

Added: July 4, 2026

Modulation Recognition for Industrial Internet of Things Communication Signals Under Few-Shot Conditions Based on Attention Mechanism and Relation Network

Hualin M., Jie Z., Jerome Y. et al., Journal of Internet Technology 2026 Vol. 27 No. 3 P. 367–382

In open, interference-prone scenarios, the scarcity of precisely annotated signal samples limits the application of deep learning–based modulation identification, which generally relies on extensive labeled data for stability. Relation Networks, as an emerging class of deep learning models, exhibit rapid convergence in few-shot learning tasks. Motivated by the fast convergence property of relation-based learning and ...

Added: July 3, 2026

Code Constructions Based on Generalized Concatenated Codes for Communication Systems Using Order-Statistics-Based Reception

Osipov D., Информационно-управляющие системы 2026 No. 3 P. 49–62

Introduction: In many communication systems under construction and those yet to be created, power control and channel estimation techniques developed for previous-generation communication systems fail to provide the desired precision. One way to solve this problem is to use order-statistics-based reception techniques that do not need channel estimation or power control. To ensure the desired ...

Added: July 3, 2026

Graph Games and Logic Design. Recent Developments and Further Directions. (TREN, volume 66)

Springer, 2026.

This book presents established and new research on the close connections between graph games and systems of logic, particularly existing and newly designed modal logics. The volume utilizes two graph games – the sabotage game and the hide-and-seek game – to demonstrate the natural interplay between designing new graph games and exploring new kinds of ...

Added: June 30, 2026

The 12th International Conference on Information Technology and Quantitative Management (ITQM 2025)

Netherlands: ScienceDirect, 2025.

No ...

Added: June 28, 2026

Object-centric process management: A research manifesto

Seidel A., Weske M., Montali M. et al., Information Systems 2026 Vol. 141 Article 102728

Business process management employs process models and event logs to represent the behavior of the information systems under study. Traditional case-centric notions consider the order of activities and events in isolated process instances. The emerging field of object-centric processes challenges this assumption by putting objects in the center. Object-centric process mining and modeling approaches identify ...

Added: June 27, 2026

2024 26th International Conference on Digital Signal Processing and its Applications (DSPA)

IEEE, 2024.

The A.S. Popov Russian Science and Technical Society, with support from the V.A. Trapeznikov Institute of Control Sciences, the V.A. Kotelnikov Institute of Radio Engineering and Electronics, and Autex Ltd., is leading the XXVIII International Conference "Digital Signal Processing and its Applications" (DSPA-2024) ...

Added: June 27, 2026

Developing Methods for Assessing the Quality of Experience (QoE) of Streaming Video

Ivchenko A., Дворкович А. В., Телекоммуникации 2020 Vol. 12 P. 2–11

Dynamic Adaptive Streaming over HTTP (DASH) technology powers most multimedia services. Its specific features (re-buffering, quality switching, etc.) necessitate the development of specialized methods for assessing user subjective quality of experience (QoE) based on objective parameters. This article examines the impact of various metrics on QoE and presents assessment models with Spearman correlation coefficients up ...

Added: June 27, 2026

Event-Driven Platform for Machine Vision Component Integration with Operation Center

Gadzhimirzaev S., Хельвас А. В., 2023 3rd International Conference on Innovative Research in Applied Science, Engineering and Technology (IRASET) Mohammedia, Morocco 2023 P. 1–6

The article proposes an architecture for an event-driven Emergency Operation Center with a Machine Vision Component. Sources of information are analyzed, and approaches to using machine vision events for detecting and estimating tactical situations are discussed. Messages from Machine Vision Components are converted to the Common Alerting Protocol and processed by the Operation Center environment for tactical situation recognition. ...

Added: June 26, 2026

Discrete Modeling of the Restoration Repair Process for a Road Section

Gadzhimirzaev S., Хельвас А. В., Компьютерные исследования и моделирование 2022 Vol. 14 No. 6 P. 1255–1268

This work describes the results of modeling the process of maintaining the readiness of a section of the road network under strikes with specified parameters. A one-dimensional section of road up to 40 km long with a total number of strikes up to 100 during the work of the brigade is ...

Added: June 26, 2026

An Approach to Assessing the Dynamics of the Industry Consolidation Level

P. P. Lukianchenko, Danilov A. M., Bugaev A. S. et al., Computer Research and Modeling 2023 Vol. 15 No. 1 P. 129–140

In this article we propose a new approach to the analysis of econometric industry parameters for the industry consolidation level. The research is based on a simple automatic control model of the industry. The state of the industry is measured by econometric parameters obtained quarterly from each company in the industry and provided by the tax control regulator. An approach ...

Added: June 26, 2026

Digital Twin of a Fully Automated Warehouse with Deep Racks

Gadzhimirzaev S., Хельвас А. В., International Frequency Sensor Association (IFSA) Publishing, 19-21 February 2025 Granada, Spain 2025 P. 172–176

The paper presents models for an innovative fully robotic warehouse for storing boxed goods. A discrete multiagent simulation of the movement of shuttles in a warehouse for a given sequence of pallet shipments has been implemented. Different strategies for placement of boxes in various areas of a warehouse are evaluated, as well as optimal routing ...

Added: June 26, 2026

Growth in noncommutative algebras and entropy in derived categories

Piontkovski D., Series arXiv "math". 2026.

A noncommutative projective variety is defined, following Artin and Zhang, by a graded coherent algebra 𝐴. The category of coherent sheaves is then the quotient qgr(𝐴) of the category of finitely presented graded modules by the subcategory of torsion modules. We consider the categorical and polynomial entropies of the Serre twist, that is, of the ...

Added: June 23, 2026

Multilinear nilalgebras and the Jacobian theorem

Piontkovski D., Series arXiv "math". 2025.

If a symmetric multilinear algebra is weakly nil, then it is Engel. This result may be regarded as an infinite-dimensional analogue of the well-known Jacobian theorem, which states that if a polynomial mapping has a polynomial inverse, then its Jacobian matrix is invertible. This refines a theorem of Gerstenhaber and partially answers a question posed ...

Added: June 23, 2026