Learning Alternative Name Spellings

L. E. Zhukov; Sukharev J.; Popescul A.

Information Retrieval. 2014.

Name matching is a key component of systems for entity resolution or record linkage. Alternative spellings of the same names are a common occurrence in many applications. We use the largest collection of genealogy person records in the world together with user search query logs to build name matching models. We outline the procedure for building a crowd-sourced training set and present our method. We cast the problem of learning alternative spellings as a machine translation problem at the character level. We use information retrieval evaluation methodology to show that, on our data, this method substantially outperforms a number of standard, well-known phonetic and string similarity methods in terms of precision and recall. Additionally, we rigorously compare the standard methods with one another. Our results can have a significant practical impact in entity resolution applications.
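
As a rough illustration of the evaluation setup the abstract describes, the following minimal sketch scores one standard string similarity baseline with precision and recall. It assumes Levenshtein edit distance as the baseline; the function names, the distance threshold, and the sample names are hypothetical and are not taken from the authors' data or from their character-level translation model.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insert, delete, and substitute."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def similar_names(query: str, vocabulary: set, max_distance: int = 1) -> set:
    """Names within max_distance edits of the query (hypothetical baseline)."""
    return {name for name in vocabulary
            if name != query and levenshtein(query, name) <= max_distance}


def precision_recall(suggested: set, relevant: set) -> tuple:
    """Precision and recall of suggested spellings against a labeled set."""
    hits = len(suggested & relevant)
    precision = hits / len(suggested) if suggested else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Illustrative data only: a toy vocabulary and hand-labeled alternatives.
vocabulary = {"smith", "smyth", "smythe", "schmidt", "jones"}
relevant = {"smyth", "smythe", "schmidt"}
suggested = similar_names("smith", vocabulary, max_distance=1)
precision, recall = precision_recall(suggested, relevant)
```

With a tight threshold, a baseline like this tends to be precise but misses spellings that differ by more than one edit, which is the kind of trade-off a precision and recall comparison makes visible.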

Priority areas: IT and mathematics; mathematics

Language: English

Keywords: information retrieval; Computation and Language

Advances in Information Retrieval

Kuznetsov S., Serdyukov P., Segalovich I. et al., L.: Springer, 2013.

Higher School of Economics (HSE) and supported by the Information Retrieval Specialist Group at the British Computer Society (BCS–IRSG). The conference was held during March 24–27, 2013, in Moscow, Russia – the easternmost location in the history of the ECIR series. ECIR 2013 received a total of 287 submissions in three categories: 191 full papers, ...

Added: April 15, 2013

Formal Concept Analysis Meets Information Retrieval 2013

Aachen: CEUR Workshop Proceedings, 2013.

Formal Concept Analysis (FCA) is a mathematically well-founded theory aimed at data analysis and classification, introduced and detailed in the book of Bernhard Ganter and Rudolf Wille, "Formal Concept Analysis", Springer 1999. The area came into being in the early 1980s and has since then spawned over 10000 scientific publications and a variety of practically ...

Added: October 10, 2013

Forrester’s Concept in Modeling Heart Dynamics

Shmid A., Novopashin M. A., Berezin A. A., IOSR Journal of Computer Engineering (IOSR-JCE) 2017 Vol. 19 No. 3 P. 113–121

The paper deals with Forrester’s approach to the analysis of heart electrical dynamics, based on the hypothesis that the heart belongs to the class of complex systems and that its dynamics can be described by coupled Van der Pol differential equations with a time lag. The chain of such equations suggested by Ginzburg and Landau was used ...

Added: June 13, 2018

Logic in Central and Eastern Europe: History, Science, and Discourse

Lanham: University Press of America, 2012.

The history of logic and analytic philosophy in Central and Eastern Europe is still known to very few people. As an exception to the rule, only two scientific schools became internationally popular: the Vienna Circle and the Lvov-Warsaw School. Nevertheless, the countries included in this region have not only a joint history, but also joint cultural ...

Added: February 13, 2013

Integrated Models and Soft Computing in Artificial Intelligence. Proceedings of the 7th International Scientific and Practical Conference (Kolomna, May 20-22, 2013)

Moscow: Fizmatlit, 2013.

The conference is devoted to the application of integrated models and soft computing in artificial intelligence. ...

Added: May 26, 2013

Proceedings of 2017 4th International Conference on Control, Decision and Information Technologies (CoDIT'17) / April 5-7, 2017

Barcelona: IEEE, 2017.

International Conference on Control, Decision and Information Technologies. ...

Added: January 17, 2018

On artifacts associated with computer visualization of high-frequency oscillations.

Zaitsev A. A., Panov P. A., Izvestiya Vysshikh Uchebnykh Zavedenii. Geodeziya i Aerofotosyemka 2015 No. 5 P. 115–119

The paper considers the problem of computer visualization of high-frequency oscillations, which is relevant in instrument engineering, electrical engineering, and wave optics and has numerous other applications. A series of examples demonstrating the difficulties that arise in solving this problem is analyzed. A number of factors that lead to undesirable effects when constructing the corresponding images are identified. Recommendations are given for depicting high-frequency oscillations adequately. ...

Added: January 23, 2017

On some slowly terminating term rewriting systems

Beklemishev L. D., Onoprienko A. A., Matematicheskii Sbornik 2015 Vol. 206 No. 9 P. 3–20

We formulate some term rewriting systems in which the number of computation steps is finite for each output, but this number cannot be bounded by a provably total computable function in Peano arithmetic PA. Thus, the termination of such systems is unprovable in PA. These systems are derived from an independent combinatorial result known as the Worm ...

Added: March 13, 2016

Complete complexity dichotomy for 7-edge forbidden subgraphs in the edge coloring problem

Malyshev D., Journal of Applied and Industrial Mathematics (translation of the journals "Sibirskii Zhurnal Industrial'noi Matematiki" and "Diskretnyi Analiz i Issledovanie Operatsii") 2020 Vol. 14 No. 4 P. 706–721

The edge coloring problem for a graph is to minimize the number of colors that are sufficient to color all edges of the graph so that all adjacent edges receive distinct colors. The computational complexity of the problem is known for all graph classes defined by forbidden subgraphs with at most 6 edges. We improve ...

Added: January 30, 2021

Trends in Biomathematics: Modeling Cells, Flows, Epidemics, and the Environment. Selected Works from the BIOMAT Consortium Lectures, Szeged, Hungary, 2019

Springer, 2020.

This volume offers a collection of carefully selected, peer-reviewed papers presented at the BIOMAT 2019 International Symposium, which was held at the University of Szeged, Bolyai Institute and the Hungarian Academy of Sciences, Hungary, October 21st-25th, 2019. The topics covered in this volume include tumor and infection modeling; dynamics of co-infections; epidemic models on networks; ...

Added: March 11, 2021

Maximization of Submodular Functions: Theory and Enumeration Algorithms

Goldengorin B. I., European Journal of Operational Research 2009 Vol. 198 No. 1 P. 102–112

Added: July 31, 2012

Measures of uncertainty in market network analysis

Kalyagin V.A., Koldanov A.P., Koldanov P.A. et al., Physica A: Statistical Mechanics and its Applications 2014 Vol. 413 No. 1 P. 59–70

A general approach to measure statistical uncertainty of different filtration techniques for market network analysis is proposed. Two measures of statistical uncertainty are introduced and discussed. One is based on conditional risk for multiple decision statistical procedures and another one is based on average fraction of errors. It is shown that for some important cases ...

Added: July 19, 2014

VIII Conference of Young Scientists "Fundamental and Applied Space Research"

Moscow: IKI RAN, 2011.

Added: March 26, 2013

A graph reduction method and its applications

Sirotkin D., Malyshev D., Diskretnaya Matematika 2017 Vol. 29 No. 3 P. 114–125

The independent set problem for a given simple graph consists in computing the size of a largest set of its pairwise nonadjacent vertices. A new graph reduction method is proposed. Using it, a new proof of the NP-completeness of the independent set problem in the class of planar graphs is obtained, and the NP-completeness of this problem is proved in the class of plane graphs having only triangular interior faces, with maximum degree ...

Added: September 7, 2017

"Aerospace Technologies" (AKT-2014): Abstracts of the First Round of the XV All-Russian Scientific and Technical Conference and School of Young Scientists, Postgraduates, and Students

OOO Firma "Elist", 2014.

The book presents abstracts of the papers from the first round of the XV All-Russian Scientific and Technical Conference and School of Young Scientists, Postgraduates, and Students. ...

Added: October 17, 2014

Fifth International Conference "System Analysis and Information Technologies" SAIT-2013 (September 19–25, 2013, Krasnoyarsk, Russia): Conference Proceedings. In 2 vols.

Krasnoyarsk: IVM SO RAN, 2013.

Proceedings of the Fifth International Conference "System Analysis and Information Technologies" SAIT-2013 (September 19–25, 2013, Krasnoyarsk, Russia): ...

Added: November 18, 2013

The complexity of the 3-colorability problem in the absence of a pair of small forbidden induced subgraphs

Malyshev D., Discrete Mathematics 2015 Vol. 338 No. 11 P. 1860–1865

We completely determine the complexity status of the 3-colorability problem for hereditary graph classes defined by two forbidden induced subgraphs with at most five vertices. ...

Added: April 7, 2014

Numerical simulation of alloy solidification under intense conjugate heat transfer

Marshirov V. V., Marshirova L. E., Sibirskii Zhurnal Industrial'noi Matematiki 2013 Vol. XVI No. 4 P. 111–120

The paper considers the problem of determining the cooling rate of a metal during solidification as it crosses the liquidus temperature under intense heat removal from the surface. Solving this problem is necessary to determine the process conditions and the boundary and initial conditions for which it is possible to get new ...

Added: November 17, 2013

Increasing the performance of a Mobile Ad-hoc Network using a game-theoretic approach to drone positioning

Blakeway S., Gromov D., Gromova E. et al., Vestnik Sankt-Peterburgskogo Universiteta, Prikladnaya Matematika, Informatika, Protsessy Upravleniya 2019 Vol. 15 No. 1 P. 22–38

We describe a novel game-theoretic formulation of the optimal mobile agents’ placement problem which arises in the context of Mobile Ad-hoc Networks (MANETs). This problem is modelled as a sequential multistage game. The definitions of both the Nash equilibrium and cooperative solution are given. A modification was proposed to ensure the existence of a Nash ...

Added: March 13, 2020

Classes of planar graphs with a polynomially solvable independent set problem

Malyshev D., Alekseev V., Diskretnyi Analiz i Issledovanie Operatsii 2008 Vol. 15 No. 1 P. 3–10

The polynomial solvability of the independent set problem is proved for an infinite family of subsets of the class of planar graphs. ...

Added: August 31, 2012

Distribution’s template estimate with Wasserstein metrics

Boissard E., Le Gouic T., Loubes J., Bernoulli: a journal of mathematical statistics and probability 2015 P. 740–759

In this paper, we tackle the problem of comparing distributions of random variables and defining a mean pattern between a sample of random events. Using barycenters of measures in the Wasserstein space, we propose an iterative version as an estimation of the mean distribution. Moreover, when the distributions are a common measure warped by a ...

Added: October 13, 2018

On one-dimensional projections of polytopes of discrete optimization problems

Vyalyi M., Diskretnaya Matematika 1991 Vol. 3 No. 3 P. 35–45

Added: October 17, 2014

Improving the teaching of mathematical-cycle disciplines on the basis of invariants required for teaching the course «Econometrics» to bachelor's students in economics

Kotelnikova M. V., Aistov A., Vestnik Nizhegorodskogo Universiteta im. N.I. Lobachevskogo. Seriya: Sotsial'nye Nauki 2019 Vol. 55 No. 3 P. 183–189

The article describes a method that makes it possible to improve the content of disciplines of the mathematical cycle by dividing them into invariant (general) and variable parts. The invariants were identified for such disciplines as «Linear algebra», «Mathematical analysis», and «Probability theory and mathematical statistics» taught to bachelor's program students of economics at several universities. Based on ...

Added: January 28, 2020

Agent-based modelling of interactions between air pollutants and greenery using a case study of Yerevan, Armenia

Akopov A. S., Beklaryan L. A., Saghatelyan A. K., Environmental Modelling and Software 2019 Vol. 116 P. 7–25

Urban greenery such as trees can effectively reduce air pollution in a natural and eco-friendly way. However, how to spatially locate and arrange greenery in an optimal way remains a challenging task. We developed an agent-based model of air pollution dynamics to support the optimal allocation and configuration of tree clusters in a city. The Pareto ...

Added: February 24, 2019