Intelligent Data Engineering and Automated Learning – IDEAL 2019
We define a most specific generalization of a fuzzy set of
topics assigned to leaves of the rooted tree of a domain taxonomy. This
generalization lifts the set to its “head subject” node in the higher ranks
of the taxonomy tree. The head subject is supposed to “tightly” cover
the query set, possibly bringing in some errors referred to as “gaps” and
“offshoots”. Our method, ParGenFS, globally minimizes a penalty function
that combines the numbers of head subjects, gaps, and offshoots, each
with its own weight. Two applications are considered: (1) analysis of
research tendencies in Data Science; (2) audience extension for programmatic
targeted online advertising. The former involves a taxonomy
of Data Science derived from the celebrated ACM Computing Classification
System 2012. We derive fuzzy clusters of leaf topics in learning,
retrieval and clustering. The head subjects of these clusters inform us
of some general tendencies of the research. The latter involves the publicly
available IAB Tech Lab Content Taxonomy.
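
The penalty trade-off described above can be illustrated with a minimal sketch. It is not the ParGenFS algorithm itself but a simplified crisp variant written under several assumptions: membership values are ignored (a leaf is either in the query set or not), each head subject costs 1, and the weights `gap_w` and `off_w`, the function name `lift`, and the toy taxonomy are all hypothetical. At every internal node the sketch compares two scenarios, making the node a head subject (and paying for the gaps beneath it) or deferring to the best solutions found in its children, and keeps the cheaper one.

```python
from typing import Dict, List, Set, Tuple

# (penalty, head subjects, gaps, offshoots)
Solution = Tuple[float, List[str], List[str], List[str]]


def lift(tree: Dict[str, List[str]], root: str, query: Set[str],
         gap_w: float = 0.25, off_w: float = 0.75) -> Solution:
    """Crisp, simplified lifting sketch (weights are illustrative assumptions)."""

    def visit(node: str) -> Tuple[bool, List[str], Solution]:
        # Returns: (subtree contains a query leaf,
        #           maximal query-free nodes in the subtree,
        #           cheapest solution with heads at or below this node)
        children = tree.get(node, [])
        if not children:  # leaf
            if node in query:
                # An uncovered query leaf is an offshoot.
                return True, [], (off_w, [], [], [node])
            return False, [node], (0.0, [], [], [])

        results = [visit(child) for child in children]
        if not any(hit for hit, _, _ in results):
            # Whole subtree is query-free: one maximal gap if an ancestor is a head.
            return False, [node], (0.0, [], [], [])

        gaps_here = [g for _, gaps, _ in results for g in gaps]

        # Scenario A: this node is a head subject; pay 1 plus its gaps.
        head_penalty = 1.0 + gap_w * len(gaps_here)

        # Scenario B: defer to the children's best solutions.
        defer_penalty = sum(best[0] for _, _, best in results)

        if head_penalty < defer_penalty:
            best: Solution = (head_penalty, [node], gaps_here, [])
        else:  # ties keep the more specific (lower) solution
            best = (
                defer_penalty,
                [h for _, _, b in results for h in b[1]],
                [g for _, _, b in results for g in b[2]],
                [o for _, _, b in results for o in b[3]],
            )
        return True, gaps_here, best

    return visit(root)[2]


# Toy taxonomy (hypothetical): two branches under a common root.
taxonomy = {
    "root": ["A", "B"],
    "A": ["a1", "a2", "a3"],
    "B": ["b1", "b2", "b3", "b4", "b5"],
}
result = lift(taxonomy, "root", {"a1", "a2", "b1"})
print(result)
```

In this toy run the query set is lifted to the head subject `A` with one gap (`a3`), while `b1` stays an offshoot because covering it through `B` or `root` would introduce too many gaps. Changing `gap_w` and `off_w` shifts that balance, which is the role the differently weighted terms play in the penalty.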