10. WSDM 2018: Marina Del Rey, CA, USA

Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, WSDM 2018, Marina Del Rey, CA, USA, February 5-9, 2018. ACM 【DBLP Link】

Paper Num: 115 || Session Num: 7

Demonstrations 4
Doctoral Presentations 7
Keynote Talks 6
Technical Presentations 81
Tutorials 8
WSDM Cup 2018 1
Workshops 8

Keynote Talks 6

1. A Call to Arms: Embrace Assistive AI Systems!

【Paper Link】【Pages】:1

【Authors】: Andrei Z. Broder

【Abstract】: A quarter-century ago Web search stormed the world: within a few years the Web search box became a standard tool of daily life ready to satisfy informational, transactional, and navigational queries needed for some task completion. However, two recent trends are dramatically changing the box's role: first, the explosive spread of smartphones brings significant computational resources literally into the pockets of billions of users; second, recent technological advances in machine learning and artificial intelligence, and in particular in speech processing, led to the wide deployment of assistive AI systems, culminating in personal digital assistants. Along the way, the "Web search box" has become an "assistance request box" (implicit, in the case of voice-activated assistants) and likewise, many other information processing systems (e.g., e-mail, navigation, personal search, etc.) have adopted assistive aspects. Formally, the assistive systems can be viewed as a selection process within a base set of alternatives driven by some user input. The output is either one alternative or a smaller set of alternatives, maybe subject to future selection. Hence, classic IR is a particular instance of this formulation, where the input is a textual query and the selection process is relevance ranking over the corpus. In increasing order of selection capabilities, assistive systems can be classified into three categories: Subordinate: systems where the selection is fully specified by the request; if this results in a singleton the system provides it, otherwise the system provides a random alternative from the result set. Therefore, the challenge for subordinate systems consists only in the correct interpretation of the user request (e.g., weather information, simple personal schedule management, a "play jazz" request). Conducive: systems that reduce the set of alternatives to a smaller set, possibly via an interactive process (e.g., the classic ten blue links, the three "smart replies" in Gmail, interactive recommendations, etc.). Decisive: systems that make all necessary decisions to reach the desired goal (in other words, select a single alternative from the set of possibilities) including resolving ambiguities and other substantive decisions without further input from the user (e.g., typical translation systems, self-driving cars). The main goal of this talk is to examine these developments and to urge the WSDM community to increase its focus on assistive AI solutions that are becoming pertinent to a wide variety of information processing problems. I will mostly present ideas and work in progress, and there will be many more open questions than definitive answers.

【Keywords】:

2. On the Power of Massive Text Data.

【Paper Link】【Pages】:2

【Authors】: Jiawei Han

【Abstract】: The real-world big data is largely unstructured, dynamic, and interconnected, in the form of natural language text. It is highly desirable to transform such massive unstructured data into structured knowledge. Many researchers and practitioners rely on labor-intensive labeling and curation to extract knowledge from unstructured text data. However, such approaches may not be scalable to web-scale or adaptable to new domains, especially considering that a lot of text corpora are highly dynamic and domain-specific. We argue that massive text data itself contains a large body of hidden patterns, structures, and knowledge. Equipped with domain-independent and domain-specific knowledge-bases, a promising direction is to develop more systematic data mining methods to turn massive unstructured text data into structured knowledge. We introduce a set of methods developed recently in our own group on exploration of the power of big text data, including mining quality phrases using unsupervised, weakly supervised and distantly supervised approaches, recognition and typing of entities and relations by distant supervision, meta-pattern-based entity-attribute-value extraction, set expansion and local embedding-based multi-faceted taxonomy discovery, allocation of text documents into multi-dimensional text cubes, construction of heterogeneous information networks from text cube, and eventually mining multi-dimensional structured knowledge from massive text data. We show that massive text data itself can be powerful at disclosing patterns, structures and hidden knowledge, and it is promising to explore the power of massive, interrelated text data for transforming such unstructured data into structured knowledge.

【Keywords】: data mining; data to knowledge; heterogeneous information networks; multi-dimensional text cube; text mining

3. Theoretical Impediments to Machine Learning With Seven Sparks from the Causal Revolution.

【Paper Link】【Pages】:3

【Authors】: Judea Pearl

【Abstract】: Current machine learning systems operate, almost exclusively, in a statistical, or model-blind mode, which entails severe theoretical limits on their power and performance. Such systems cannot reason about interventions and retrospection and, therefore, cannot serve as the basis for strong AI. To achieve human level intelligence, learning machines need the guidance of a model of reality, similar to the ones used in causal inference. To demonstrate the essential role of such models, I will present a summary of seven tasks which are beyond reach of current machine learning systems and which have been accomplished using the tools of causal inference.

【Keywords】: keynote talk

4. Conversations, Machine Learning and Privacy: LinkedIn's Path Towards Transforming Interaction with Its Members.

【Paper Link】【Pages】:4

【Authors】: Igor Perisic

【Abstract】: At LinkedIn, we believe that having the right conversations with our members is key to unlocking economic opportunity for them. For us, these conversations are in a broader context than traditionally defined dialogues. A typical dialogue usually only considers a limited time-window as context and is trying to satisfy an immediate intent. Advanced dialogue systems allow a user to take a number of turns, in that short time-window, to get clear on the user's intent. However, our members are having conversations with us over long periods of time about their long-term goals, such as staying informed, growing a professional network, advancing a career, getting a job, finding qualified leads, etc. These conversational goals are often hierarchical. For example, getting a great job is a key part of advancing your career. Our goal at LinkedIn is to be able to have simultaneous conversations with our members on all of these levels. To do this, we have to build machine learning systems that understand that there are multiple multi-level conversations going on. We have made strong headway in building components of this conversational vision by learning how to approximate long-term member value and defining an optimization framework that can incorporate multiple conflicting objectives. These problems consider the states of these conversations when interacting with our members and actively make decisions that optimize this ongoing dialogue. We have a challenging and interesting road ahead. In this talk, Igor will present the current state of LinkedIn's machine-learning efforts towards building robust, long-term conversational systems. He will then discuss the potential privacy and ethical issues surrounding having these conversational interactions through an ever-increasing number of touchpoints with our members.

【Keywords】:

5. From Search to Research: Direct Answers, Perspectives and Dialog.

【Paper Link】【Pages】:5

【Authors】: Harry Shum

【Abstract】: Advances in artificial intelligence have improved machine understanding of speech, images, and natural language. This in turn has allowed us to greatly enhance the intelligence of products such as Bing and Cortana. This keynote describes our continuing journey beyond keyword-driven systems, into dialog and intelligent agent functionality, helping our users "research more, search less". Modern systems attempt to provide concise direct answers, which can fit on a small screen or become a spoken response. To find such answers, Microsoft can draw from a uniquely broad inventory of data sources such as the Bing Web & Knowledge graphs, the workplace graph of Office 365, and the Microsoft Academic Graph. Since these graphs contain a lot of text information, we apply machine reading and comprehension technology to extract concise answers. Microsoft has entries frequently topping the leaderboards in the community's machine reading contests. To select the right answers, we use deep multi-task learning to develop a vector representation that is usable across multiple data sources and scenarios. This is combined with a large-scale data processing and serving infrastructure. We use this not only to find a single answer, but also to find multiple answers in cases where multiple valid perspectives exist. In the case of numeric answers, we provide some context to help users understand what the numbers mean. This is part of our effort to consider not just IQ but EQ in our conversational systems, where the chatbot Xiaoice leads the way in establishing a human connection, to develop long and sustained conversations. These advances improve product quality, enable new user experiences and have challenged us to rethink the entire intelligent search platform at Microsoft.

【Keywords】:

6. Scalable Algorithms in the Age of Big Data and Network Sciences: Characterization, Primitives, and Techniques.

【Paper Link】【Pages】:6-7

【Authors】: Shang-Hua Teng

【Abstract】: In the age of network sciences and machine learning, efficient algorithms are in higher demand than ever before. Big Data fundamentally challenges the classical notion of efficient algorithms: Algorithms that used to be considered efficient, according to polynomial-time characterization, may no longer be adequate for solving today's problems. It is not just desirable, but essential, that efficient algorithms should be scalable. In other words, their complexity should be nearly linear or sub-linear with respect to the problem size. Thus, scalability, not just polynomial-time computability, should be elevated as the central complexity notion for characterizing efficient computation. In this talk, I will highlight a family of fundamental algorithmic techniques for designing provably-good scalable algorithms: (1) scalable primitives and scalable reduction, (2) spectral approximation of graphs and matrices, (3) sparsification by multilevel structures, (4) advanced sampling, (5) local network exploration. For the first, I will focus on the emerging Laplacian Paradigm, which has led to breakthroughs in scalable algorithms for several fundamental problems in network analysis, machine learning, and scientific computing. I will then illustrate these algorithmic techniques with four recent applications: (1) sampling from graphical models, (2) network centrality approximation, (3) social-influence analysis, (4) local clustering. Mathematical and algorithmic solutions to these problems exemplify the fusion of combinatorial, numerical, and statistical thinking in data and network analysis.

【Keywords】: advanced sampling; big data; graph sparsification; local algorithms; machine learning; network sciences; scalable algorithms

WSDM Cup 2018 1

7. WSDM Cup 2018: Music Recommendation and Churn Prediction.

【Paper Link】【Pages】:8-9

【Authors】: Yian Chen ; Xing Xie ; Shou-De Lin ; Arden Chiu

【Abstract】: An excellent recommendation system helps users retrieve content they like and, more importantly, content they might like but are not yet aware of. It will further increase the satisfaction of users and indirectly increase the retention rate and conversion rate. While the public is now listening to all kinds of music, recommendation algorithms still struggle in key areas. Without enough historical data, how would an algorithm know if listeners will like a new song or a new artist? And how would it know what songs to recommend to brand new users? In WSDM Cup 2018, the first task is to solve the abovementioned challenges to build a better music recommendation system. The second task in the Cup focuses on churn prediction. For a subscription business, accurately predicting churn is critical to long-term success. Even slight variations in churn can drastically affect profits. In this task, participants are asked to build an algorithm that predicts whether a user will churn after their subscription expires. The competition data and award are provided by KKBOX, a leading music streaming service in Taiwan.

【Keywords】: wsdm cup 2018 recommendation

Technical Presentations 81

8. Performance Analysis of a Privacy Constrained kNN Recommendation Using Data Sketches.

【Paper Link】【Pages】:10-18

【Authors】: Armita Afsharinejad ; Neil Hurley

【Abstract】: This paper evaluates two algorithms, BLIP and JLT, for creating differentially private data sketches of user profiles, in terms of their ability to protect a kNN collaborative filtering algorithm from an inference attack by third-parties. The transformed user profiles are employed in a user-based top-N collaborative filtering system. For the first time, a theoretical analysis of the BLIP is carried out, to derive expressions that relate its parameters to its performance. This allows the two techniques to be fairly compared. The impact of deploying these approaches on the utility of the system---its ability to make good recommendations, and on its privacy level---the ability of third-parties to make inferences about the underlying user preferences, is examined. An active inference attack is evaluated, that consists of the injection of a number of tailored sybil profiles into the system database. User profile data of targeted users is then inferred from the recommendations made to the sybils. Although the differentially private sketches are designed to allow the transformed user profiles to be published without compromising privacy, the attack we examine does not use such information and depends only on some pre-existing knowledge of some user preferences as well as the neighbourhood size of the kNN algorithm. Our analysis therefore assesses in practical terms a relatively weak privacy attack, which is extremely simple to apply in systems that allow low-cost generation of sybils. We find that, for a given differential privacy level, the BLIP injects less noise into the system, but for a given level of noise, the JLT offers a more compact representation.

【Keywords】: collaborative filtering; data sketching; privacy preservation; user-based top-n recommender

【Paper Link】【Pages】:19-27

【Authors】: Nazanin Alipourfard ; Peter G. Fennell ; Kristina Lerman

【Abstract】: We investigate how Simpson's paradox affects analysis of trends in social data. According to the paradox, the trends observed in data that has been aggregated over an entire population may be different from, and even opposite to, those of the underlying subgroups. Failure to take this effect into account can lead analysis to wrong conclusions. We present a statistical method to automatically identify Simpson's paradox in data by comparing statistical trends in the aggregate data to those in the disaggregated subgroups. We apply the approach to data from Stack Exchange, a popular question-answering platform, to analyze factors affecting answerer performance, specifically, the likelihood that an answer written by a user will be accepted by the asker as the best answer to his or her question. Our analysis confirms a known Simpson's paradox and identifies several new instances. These paradoxes provide novel insights into user behavior on Stack Exchange.

【Keywords】: collaborative and social computing systems and tools; ecological fallacies; exploratory data analysis; regression analysis; simpson's paradox; statistical paradigms
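
The comparison of aggregate and subgroup trends described in this abstract can be illustrated with a minimal sketch. It is not the authors' statistical method: it assumes a trend is simply the sign of a least-squares slope, and the names `slope`, `sign`, `is_simpsons_reversal` and the toy data are hypothetical.

```python
def slope(points):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in points)
    var = sum((x - mean_x) ** 2 for x, _ in points)
    return cov / var


def sign(value):
    """Return -1, 0 or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def is_simpsons_reversal(groups):
    """True if every subgroup trend has the same sign and the pooled trend has the opposite sign."""
    pooled = [p for pts in groups.values() for p in pts]
    aggregate = sign(slope(pooled))
    subgroup_signs = {sign(slope(pts)) for pts in groups.values()}
    return aggregate != 0 and subgroup_signs == {-aggregate}


# Toy data (assumed): y falls with x inside each group, yet rises when pooled.
toy = {
    "a": [(1, 10), (2, 9), (3, 8)],
    "b": [(6, 20), (7, 19), (8, 18)],
}
print(is_simpsons_reversal(toy))
```

A real analysis would also test whether each trend is statistically significant before declaring a reversal; the sketch only compares signs.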

【Paper Link】【Pages】:28-36

【Authors】: Saeid Balaneshin Kordan ; Alexander Kotov

【Abstract】: Recent advances in deep learning and distributed representations of images and text have resulted in the emergence of several neural architectures for cross-modal retrieval tasks, such as searching collections of images in response to textual queries and assigning textual descriptions to images. However, the multi-modal retrieval scenario, when a query can be either a text or an image and the goal is to retrieve both a textual fragment and an image, which should be considered as an atomic unit, has been significantly less studied. In this paper, we propose a gated neural architecture to project image and keyword queries as well as multi-modal retrieval units into the same low-dimensional embedding space and perform semantic matching in this space. The proposed architecture is trained to minimize structured hinge loss and can be applied to both cross- and multi-modal retrieval. Experimental results for six different cross- and multi-modal retrieval tasks obtained on publicly available datasets indicate superior retrieval accuracy of the proposed architecture in comparison to the state-of-the-art baselines.

【Keywords】: cross-modal ir; deep neural networks; multi-modal ir

11. A Discrete Choice Model for Subset Selection.

【Paper Link】【Pages】:37-45

【Authors】: Austin R. Benson ; Ravi Kumar ; Andrew Tomkins

【Abstract】: Multinomial logistic regression is a classical technique for modeling how individuals choose an item from a finite set of alternatives. This methodology is a workhorse in both discrete choice theory and machine learning. However, it is unclear how to generalize multinomial logistic regression to subset selection, allowing the choice of more than one item at a time. We present a new model for subset selection derived from the perspective of random utility maximization in discrete choice theory. In our model, the quality of a subset is determined by the quality of its elements, plus an optional correction. Given a budget on the number of subsets that may receive correction, we develop a framework for learning the quality scores for each item, the choice of subsets, and the correction for each subset. We show that, given the subsets to receive correction, we can efficiently and optimally learn the remaining model parameters jointly. We show further that learning the optimal subsets is both NP-hard and non-submodular, but there are efficient heuristics that perform well in practice. We combine these pieces to provide an overall learning solution and apply it to subset prediction tasks. We find that with reasonably-sized budgets, there are significant gains in average per-choice likelihood ranging from 7% to 8x depending on the dataset and also substantial improvements over a determinantal point process model.

【Keywords】: discrete choice; logit; subset; subset selection

12. Latent Cross: Making Use of Context in Recurrent Recommender Systems.

【Paper Link】【Pages】:46-54

【Authors】: Alex Beutel ; Paul Covington ; Sagar Jain ; Can Xu ; Jia Li ; Vince Gatto ; Ed H. Chi

【Abstract】: The success of recommender systems often depends on their ability to understand and make use of the context of the recommendation request. Significant research has focused on how time, location, interfaces, and a plethora of other contextual features affect recommendations. However, in using deep neural networks for recommender systems, researchers often ignore these contexts or incorporate them as ordinary features in the model. In this paper, we study how to effectively treat contextual data in neural recommender systems. We begin with an empirical analysis of the conventional approach to context as features in feed-forward recommenders and demonstrate that this approach is inefficient in capturing common feature crosses. We apply this insight to design a state-of-the-art RNN recommender system. We first describe our RNN-based recommender system in use at YouTube. Next, we offer "Latent Cross," an easy-to-use technique to incorporate contextual data in the RNN by embedding the context feature first and then performing an element-wise product of the context embedding with the model's hidden states. We demonstrate the improvement in performance by using this Latent Cross technique in multiple experimental settings.

【Keywords】: contextual recommendation; recommender system; recurrent neural network
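
The element-wise product described in this abstract can be sketched in a few lines. This is only an illustration of the operation as the abstract states it, not the system in use at YouTube: the embedding table `CONTEXT_EMBEDDINGS`, its values, and the function name `latent_cross` are assumptions.

```python
# Hypothetical context-embedding table: one vector per context value,
# with the same dimensionality as the RNN hidden state.
CONTEXT_EMBEDDINGS = {
    "mobile": [0.5, 2.0, 1.0],
    "desktop": [1.0, 1.0, 0.0],
}


def latent_cross(hidden_state, context):
    """Element-wise product of the hidden state with the context embedding."""
    embedding = CONTEXT_EMBEDDINGS[context]
    if len(embedding) != len(hidden_state):
        raise ValueError("context embedding and hidden state must have equal length")
    return [h * c for h, c in zip(hidden_state, embedding)]


hidden = [1.0, 2.0, -1.0]  # assumed hidden state from one RNN step
crossed = latent_cross(hidden, "mobile")
print(crossed)
```

In a trained model the embedding table would be learned jointly with the network; here it is fixed so the arithmetic is easy to follow.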

13. Consistent Transformation of Ratio Metrics for Efficient Online Controlled Experiments.

【Paper Link】【Pages】:55-63

【Authors】: Roman Budylin ; Alexey Drutsa ; Ilya Katsev ; Valeriya Tsoy

【Abstract】: We study ratio overall evaluation criteria (user behavior quality metrics) and, in particular, average values of non-user level metrics, that are widely used in A/B testing as an important part of modern Internet companies' evaluation instruments (e.g., abandonment rate, a user's absence time after a session). We focus on the problem of sensitivity improvement of these criteria, since there is a large gap between the variety of sensitivity improvement techniques designed for user level metrics and the variety of such techniques for ratio criteria. We propose a novel transformation of a ratio criterion to the average value of a user level (randomization-unit level, in general) metric that creates an opportunity to directly use a wide range of sensitivity improvement techniques designed for the user level that make A/B tests more efficient. We provide theoretical guarantees on the novel metric's consistency in terms of preservation of two crucial properties (directionality and significance level) w.r.t. the source ratio criteria. The experimental evaluation of the approach, done on hundreds of large-scale real A/B tests run at one of the most popular global search engines, reinforces the theoretical results and demonstrates up to +34% of sensitivity rate improvement achieved by the transformation combined with the best known regression adjustment.

【Keywords】: a/b test; delta method; directionality; linearization; non-user level metric; online controlled experiment; ratio oec; sensitivity

14. Neural Graph Learning: Training Neural Networks Using Graphs.

【Paper Link】【Pages】:64-71

【Authors】: Thang D. Bui ; Sujith Ravi ; Vivek Ramavajjala

【Abstract】: Label propagation is a powerful and flexible semi-supervised learning technique on graphs. Neural networks, on the other hand, have proven track records in many supervised learning tasks. In this work, we propose a training framework with a graph-regularised objective, namely Neural Graph Machines, that can combine the power of neural networks and label propagation. This work generalises previous literature on graph-augmented training of neural networks, enabling it to be applied to multiple neural architectures (Feed-forward NNs, CNNs and LSTM RNNs) and a wide range of graphs. The new objective allows the neural networks to harness both labeled and unlabeled data by: (a) allowing the network to train using labeled data as in the supervised setting, (b) biasing the network to learn similar hidden representations for neighboring nodes on a graph, in the same vein as label propagation. Such architectures with the proposed objective can be trained efficiently using stochastic gradient descent and scaled to large graphs, with a runtime that is linear in the number of edges. The proposed joint training approach convincingly outperforms many existing methods on a wide range of tasks (multi-label classification on social graphs, news categorization, document classification and semantic intent classification), with multiple forms of graph inputs (including graphs with and without node-level features) and using different types of neural networks.

【Keywords】: graph; neural network; semi-supervised learning

15. Sketch 'Em All: Fast Approximate Similarity Search for Dynamic Data Streams.

【Paper Link】【Pages】:72-80

【Authors】: Marc Bury ; Chris Schwiegelshohn ; Mara Sorella

【Abstract】: Recommender systems are an integral part of many web applications. With increasingly larger user bases, scalability has become an important issue. Many of the most scalable algorithms with respect to both space and running times are based on locality sensitive hashing. However, a significant drawback is that these methods are only able to handle insertions to user profiles and tend to perform poorly when items may be removed. We initiate the study of scalable locality sensitive hashing (LSH) for dynamic input. Specifically, using the Jaccard index as similarity measure, we design (1) a sketching algorithm for similarity estimation via a black box reduction to $\ell_0$ norm estimation and (2) a locality sensitive hashing scheme maintainable in fully dynamic data streams that quickly filters out low-similarity pairs. Our algorithms have little to no overhead in terms of running time compared to previous LSH approaches for the insertion only case, and drastically outperform previous algorithms in case of deletions.

【Keywords】: dynamic data streams; jaccard index; locality sensitive hashing

16. Fast Coreset-based Diversity Maximization under Matroid Constraints.

【Paper Link】【Pages】:81-89

【Authors】: Matteo Ceccarello ; Andrea Pietracaprina ; Geppino Pucci

【Abstract】: Max-sum diversity is a fundamental primitive for web search and data mining. For a given set S of n elements, it returns a subset of k ≪ n representatives maximizing the sum of their pairwise distances, where distance models dissimilarity. An important variant of the primitive prescribes that the desired subset of representatives satisfies an additional orthogonal requirement, which can be specified as a matroid constraint (i.e., a feasible solution must be an independent set of size k). While unconstrained max-sum diversity admits efficient coreset-based strategies, the only known approaches dealing with the additional matroid constraint are inherently sequential and are based on an expensive local search over the entire input set. We devise the first coreset constructions for max-sum diversity under various matroid constraints, together with efficient sequential, MapReduce and Streaming implementations. By running the local-search on the coreset rather than on the entire input, we obtain the first practical solutions for large instances. Technically, our coresets are subsets of S containing a feasible solution which is no more than a factor 1-ε away from the optimal solution, for any fixed ε < 1, and, for spaces of bounded doubling dimension, they have a small size independent of n. Extensive experiments show that, with respect to full-blown local search, our coreset-based approach yields solutions of comparable quality, with improvements of up to two orders of magnitude in the running time, also for input sets of unknown dimensionality.

【Keywords】: approximation algorithms; coresets; diversity maximization; doubling dimension; doubling spaces; mapreduce; matroids; streaming
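
The objective defined in this abstract (pick k elements maximizing the sum of pairwise distances, subject to a feasibility constraint) can be made concrete with a brute-force sketch. This is not the paper's coreset or local-search method and only works for tiny inputs; the names `diversity`, `max_sum_diversity`, `one_per_category` and the toy data are hypothetical, and the constraint shown is an assumed partition matroid with at most one element per category.

```python
from itertools import combinations


def diversity(subset, dist):
    """Sum of pairwise distances among the elements of subset."""
    return sum(dist(a, b) for a, b in combinations(subset, 2))


def max_sum_diversity(items, k, dist, feasible=lambda subset: True):
    """Exhaustively find a feasible size-k subset with maximum diversity.

    Raises ValueError if no size-k subset is feasible.
    """
    candidates = [s for s in combinations(items, k) if feasible(s)]
    return max(candidates, key=lambda s: diversity(s, dist))


# Toy instance (assumed): points on a line with absolute difference as distance.
points = [0, 1, 2, 10]
dist = lambda a, b: abs(a - b)

# Assumed partition-matroid constraint: at most one point per category.
category = {0: "a", 10: "a", 1: "b", 2: "b"}


def one_per_category(subset):
    labels = [category[p] for p in subset]
    return len(labels) == len(set(labels))


print(max_sum_diversity(points, 2, dist))
print(max_sum_diversity(points, 2, dist, one_per_category))
```

Enumerating all size-k subsets takes time exponential in k, which is the cost that approximation strategies such as the coresets described above are meant to avoid.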

17. Putting Data in the Driver's Seat: Optimizing Earnings for On-Demand Ride-Hailing.

【Paper Link】【Pages】:90-98

【Authors】: Harshal A. Chaudhari ; John W. Byers ; Evimaria Terzi

【Abstract】: On-demand ride-hailing platforms like Uber and Lyft are helping reshape urban transportation, by enabling car owners to become drivers for hire with minimal overhead. Although there are many studies that consider ride-hailing platforms holistically, e.g., from the perspective of supply and demand equilibria, little emphasis has been placed on optimization for the individual, self-interested drivers that currently comprise these fleets. While some individuals drive opportunistically either as their schedule allows or on a fixed schedule, we show that strategic behavior regarding when and where to drive can substantially increase driver income. In this paper, we formalize the problem of devising a driver strategy to maximize expected earnings, describe a series of dynamic programming algorithms to solve these problems under different sets of modeled actions available to the drivers, and exemplify the models and methods on a large scale simulation of driving for Uber in NYC. In our experiments, we use a newly-collected dataset that combines the NYC taxi rides dataset along with Uber API data, to build time-varying traffic and payout matrices for a representative six-month time period in greater NYC. From this input, we can reason about prospective itineraries and payoffs. Moreover, the framework enables us to rigorously reason about and analyze the sensitivity of our results to perturbations in the input data. Among our main findings is that repositioning throughout the day is key to maximizing driver earnings, whereas 'chasing surge' is typically misguided and sometimes a costly move.

【Keywords】: on-demand ride-hailing; sharing economy; surge pricing; uncertainty

18. Improving Negative Sampling for Word Representation using Self-embedded Features.

【Paper Link】【Pages】:99-107

【Authors】: Long Chen ; Fajie Yuan ; Joemon M. Jose ; Weinan Zhang

【Abstract】: Although the word-popularity based negative sampler has shown superb performance in the skip-gram model, the theoretical motivation behind oversampling popular (non-observed) words as negative samples is still not well understood. In this paper, we start from an investigation of the gradient vanishing issue in the skip-gram model without a proper negative sampler. By performing an insightful analysis from the stochastic gradient descent (SGD) learning perspective, we demonstrate, both theoretically and intuitively, that negative samples with larger inner product scores are more informative than those with lower scores for the SGD learner in terms of both convergence rate and accuracy. Understanding this, we propose an alternative sampling algorithm that dynamically selects informative negative samples during each SGD update. More importantly, the proposed sampler accounts for multi-dimensional self-embedded features during the sampling process, which essentially makes it more effective than the original popularity-based (one-dimensional) sampler. Empirical experiments further verify our observations and show that our fine-grained samplers gain significant improvement over the existing ones without increasing computational complexity.

【Keywords】:

19. Sequential Recommendation with User Memory Networks.

【Paper Link】【Pages】:108-116

【Authors】: Xu Chen ; Hongteng Xu ; Yongfeng Zhang ; Jiaxi Tang ; Yixin Cao ; Zheng Qin ; Hongyuan Zha

【Abstract】: User preferences are usually dynamic in real-world recommender systems, and a user's historical behavior records may not be equally important when predicting his/her future interests. Existing recommendation algorithms -- including both shallow and deep approaches -- usually embed a user's historical records into a single latent vector/representation, which may have lost the per item- or feature-level correlations between a user's historical records and future interests. In this paper, we aim to express, store, and manipulate users' historical records in a more explicit, dynamic, and effective manner. To do so, we introduce the memory mechanism to recommender systems. Specifically, we design a memory-augmented neural network (MANN) integrated with the insights of collaborative filtering for recommendation. By leveraging the external memory matrix in MANN, we store and update users' historical records explicitly, which enhances the expressiveness of the model. We further adapt our framework to both item- and feature-level versions, and design the corresponding memory reading/writing operations according to the nature of personalized recommendation scenarios. Compared with state-of-the-art methods that consider users' sequential behavior for recommendation, e.g., sequential recommenders with recurrent neural networks (RNN) or Markov chains, our method achieves significantly and consistently better performance on four real-world datasets. Moreover, experimental analyses show that our method is able to extract the intuitive patterns of how users' future actions are affected by previous behaviors.

【Keywords】: collaborative filtering; memory networks; sequential recommendation

【Paper Link】【Pages】:117-125

【Authors】: Sreyasi Nag Chowdhury ; Niket Tandon ; Hakan Ferhatosmanoglu ; Gerhard Weikum

【Abstract】: The social media explosion has populated the Internet with a wealth of images. There are two existing paradigms for image retrieval: 1) content-based image retrieval (CBIR), which has traditionally used visual features for similarity search (e.g., SIFT features), and 2) tag-based image retrieval (TBIR), which has relied on user tagging (e.g., Flickr tags). CBIR now gains semantic expressiveness by advances in deep-learning-based detection of visual labels. TBIR benefits from query-and-click logs to automatically infer more informative labels. However, learning-based tagging still yields noisy labels and is restricted to concrete objects, missing out on generalizations and abstractions. Click-based tagging is limited to terms that appear in the textual context of an image or in queries that lead to a click. This paper addresses the above limitations by semantically refining and expanding the labels suggested by learning-based object detection. We consider the semantic coherence between the labels for different objects, leverage lexical and commonsense knowledge, and cast the label assignment into a constrained optimization problem solved by an integer linear program. Experiments show that our method, called VISIR, improves the quality of the state-of-the-art visual labeling tools like LSDA and YOLO.

【Keywords】: background knowledge; image tagging; semantic coherence

21. Convolutional Neural Networks for Soft-Matching N-Grams in Ad-hoc Search.

【Paper Link】【Pages】:126-134

【Authors】: Zhuyun Dai ; Chenyan Xiong ; Jamie Callan ; Zhiyuan Liu

【Abstract】: This paper presents Conv-KNRM, a Convolutional Kernel-based Neural Ranking Model that models n-gram soft matches for ad-hoc search. Instead of exact matching query and document n-grams, Conv-KNRM uses Convolutional Neural Networks to represent n-grams of various lengths and soft matches them in a unified embedding space. The n-gram soft matches are then utilized by the kernel pooling and learning-to-rank layers to generate the final ranking score. Conv-KNRM can be learned end-to-end and fully optimized from user feedback. The learned model's generalizability is investigated by testing how well it performs in a related domain with small amounts of training data. Experiments on English search logs, Chinese search logs, and TREC Web track tasks demonstrated consistent advantages of Conv-KNRM over prior neural IR methods and feature-based methods.

【Keywords】: n-gram soft match; neural ir; relevance ranking
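
The kernel pooling step mentioned in this abstract can be illustrated with a small sketch. It assumes the usual RBF-kernel formulation from the kernel-based ranking literature: each kernel softly counts how many query-document similarities fall near its mean, and the log of that count is summed over query positions. The function name, the kernel means, and the width `sigma` are illustrative assumptions, and the convolutional n-gram encoder is omitted entirely.

```python
import math

def kernel_pooling(sim_matrix, mus, sigma=0.1):
    """Turn a (query positions x document positions) similarity matrix
    into one soft-match feature per RBF kernel mean in `mus`."""
    features = []
    for mu in mus:
        total = 0.0
        for row in sim_matrix:
            # Soft count of document positions whose similarity is near mu.
            soft_count = sum(
                math.exp(-((s - mu) ** 2) / (2 * sigma ** 2)) for s in row
            )
            # Clamp before the log so an empty kernel does not give -inf.
            total += math.log(max(soft_count, 1e-10))
        features.append(total)
    return features

# One query position that matches two document positions exactly.
feats = kernel_pooling([[1.0, 1.0]], mus=[1.0, 0.0])
```

In a full model these features would feed a learning-to-rank layer; here they only show that the kernel centred at 1.0 responds to exact matches while the kernel centred at 0.0 does not.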

22. Demographics and Dynamics of Mechanical Turk Workers.

【Paper Link】【Pages】:135-143

【Authors】: Djellel Eddine Difallah ; Elena Filatova ; Panos Ipeirotis

【Abstract】: We present an analysis of the population dynamics and demographics of Amazon Mechanical Turk workers based on the results of the survey that we conducted over a period of 28 months, with more than 85K responses from 40K unique participants. The demographics survey is ongoing (as of November 2017), and the results are available at http://demographics.mturk-tracker.com: we provide an API for researchers to download the survey data. We use techniques from the field of ecology, in particular, the capture-recapture technique, to understand the size and dynamics of the underlying population. We also demonstrate how to model and account for the inherent selection biases in such surveys. Our results indicate that there are more than 100K workers available in Amazon's crowdsourcing platform, the participation of the workers in the platform follows a heavy-tailed distribution, and at any given time there are more than 2K active workers. We also show that the half-life of a worker on the platform is around 12-18 months and that the rate of arrival of new workers balances the rate of departures, keeping the overall worker population relatively stable. Finally, we demonstrate how we can estimate the biases of different demographics to participate in the survey tasks, and show how to correct such biases. Our methodology is generic and can be applied to any platform where we are interested in understanding the dynamics and demographics of the underlying user population.

【Keywords】: amazon mechanical turk; capture-recapture; crowdsourcing; demographics; dynamics; selection bias; surveys
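
As a rough illustration of the capture-recapture idea named in this abstract, the textbook two-sample estimators can be sketched as below. This is the classical Lincoln-Petersen estimate and Chapman's bias-corrected variant, under the assumption of a closed population with equal capture probability; the paper's own models, which handle selection bias and population turnover, are not reproduced here. The worker IDs are made up.

```python
def lincoln_petersen(first_sample, second_sample):
    """Estimate population size as n1 * n2 / m, where m is the number
    of individuals seen in both samples."""
    first, second = set(first_sample), set(second_sample)
    recaptured = len(first & second)
    if recaptured == 0:
        raise ValueError("no recaptures; the estimate is undefined")
    return len(first) * len(second) / recaptured

def chapman(first_sample, second_sample):
    """Chapman's variant, which stays finite when there are no recaptures."""
    first, second = set(first_sample), set(second_sample)
    recaptured = len(first & second)
    return (len(first) + 1) * (len(second) + 1) / (recaptured + 1) - 1

# Hypothetical survey waves: 100 respondents each, 20 seen in both.
wave_1 = range(0, 100)
wave_2 = range(80, 180)
estimate = lincoln_petersen(wave_1, wave_2)
corrected = chapman(wave_1, wave_2)
```

The fewer respondents reappear in the second wave, the larger the implied population, which is the intuition behind estimating a worker pool far larger than the number of unique survey participants.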

23. Joint Generative-Discriminative Aggregation Model for Multi-Option Crowd Labels.

【Paper Link】【Pages】:144-152

【Authors】: Kamran Ghasedi Dizaji ; Yanhua Yang ; Heng Huang

【Abstract】: Although some crowdsourcing aggregation models have been introduced to aggregate noisy crowd labels, these models mostly consider single-option (i.e. discrete) crowd labels as the input variables, and are not compatible with multi-option (i.e. non-deterministic) crowd data. In this paper, we propose a novel joint generative-discriminative aggregation model, which is able to efficiently deal with both single-option and multi-option crowd labels. Considering the confidence of workers for each option as the input data, we first introduce a new discriminative aggregation model, called Constrained Weighted Majority Voting (CWMVL1), which improves the performance of the majority voting method. CWMVL1 considers flexible reliability parameters for crowd workers, employs an L1-norm loss function to deal with noisy crowd data, and includes optimization constraints to have probabilistic outputs. We prove that our objective is convex, and derive an efficient optimization algorithm. Moreover, we integrate the discriminative CWMVL1 model with a generative model, resulting in a powerful joint aggregation model. The combination of these sub-models is obtained in a probabilistic framework rather than a heuristic way. For our joint model, we derive an efficient optimization algorithm, which alternates between updating the parameters and estimating the potential true labels. Experimental results indicate that the proposed aggregation models achieve superior or competitive results in comparison with the state-of-the-art models on single-option and multi-option crowd datasets, while having faster convergence rates and more reliable predictions.

【Keywords】: discriminative aggregation model; joint generative-discriminative aggregation model; multi-option crowd labels

24. Predicting Audio Advertisement Quality.

【Paper Link】【Pages】:153-161

【Authors】: Samaneh Ebrahimi ; Hossein Vahabi ; Matthew Prockup ; Oriol Nieto

【Abstract】: Online audio advertising is a particular form of advertising used abundantly in online music streaming services. In these platforms, which tend to host tens of thousands of unique audio advertisements (ads), providing high quality ads ensures a better user experience and results in longer user engagement. Therefore, the automatic assessment of these ads is an important step toward audio ads ranking and better audio ads creation. In this paper we propose one way to measure the quality of the audio ads using a proxy metric called Long Click Rate (LCR), which is defined by the amount of time a user engages with the follow-up display ad (that is shown while the audio ad is playing) divided by the impressions. We later focus on predicting the audio ad quality using only acoustic features such as harmony, rhythm, and timbre of the audio, extracted from the raw waveform. We discuss how the characteristics of the sound can be connected to concepts such as the clarity of the audio ad message, its trustworthiness, etc. Finally, we propose a new deep learning model for audio ad quality prediction, which outperforms the other discussed models trained on hand-crafted features. To the best of our knowledge, this is the first large-scale audio ad quality prediction study.

【Keywords】: acoustic features; ad quality; advertising; audio ads; cnn; dnn

25. Cognitive Biases in Crowdsourcing.

【Paper Link】【Pages】:162-170

【Authors】: Carsten Eickhoff

【Abstract】: Crowdsourcing has become a popular paradigm in data curation, annotation and evaluation for many artificial intelligence and information retrieval applications. Considerable efforts have gone into devising effective quality control mechanisms that identify or discourage cheat submissions in an attempt to improve the quality of noisy crowd judgments. Besides purposeful cheating, there is another source of noise that is often alluded to but insufficiently studied: Cognitive biases. This paper investigates the prevalence and effect size of a range of common cognitive biases on a standard relevance judgment task. Our experiments are based on three sizable publicly available document collections and note significant detrimental effects on annotation quality, system ranking and the performance of derived rankers when task design does not account for such biases.

【Keywords】: cognitive biases; crowdsourcing; human computation; relevance assessment

26. User Profiling through Deep Multimodal Fusion.

【Paper Link】【Pages】:171-179

【Authors】: Golnoosh Farnadi ; Jie Tang ; Martine De Cock ; Marie-Francine Moens

【Abstract】: User profiling in social media has gained a lot of attention due to its varied set of applications in advertising, marketing, recruiting, and law enforcement. Among the various techniques for user modeling, there is fairly limited work on how to merge multiple sources or modalities of user data - such as text, images, and relations - to arrive at more accurate user profiles. In this paper, we propose a deep learning approach that extracts and fuses information across different modalities. Our hybrid user profiling framework utilizes a shared representation between modalities to integrate three sources of data at the feature level, and combines the decision of separate networks that operate on each combination of data sources at the decision level. Our experimental results on more than 5K Facebook users demonstrate that our approach outperforms competing approaches for inferring age, gender and personality traits of social media users. We get highly accurate results with AUC values of more than 0.9 for the task of age prediction and 0.95 for the task of gender prediction.

【Keywords】: age and gender prediction; deep neural networks; personality prediction; social media; user modeling

27. Orienteering Algorithms for Generating Travel Itineraries.

【Paper Link】【Pages】:180-188

【Authors】: Zachary Friggstad ; Sreenivas Gollapudi ; Kostas Kollias ; Tamás Sarlós ; Chaitanya Swamy ; Andrew Tomkins

【Abstract】: We study the problem of automatically and efficiently generating itineraries for users who are on vacation. We focus on the common case, wherein the trip duration is more than a single day. Previous efficient algorithms based on greedy heuristics suffer from two problems. First, the itineraries are often unbalanced, with excellent days visiting top attractions followed by days of exclusively lower-quality alternatives. Second, the trips often re-visit neighborhoods repeatedly in order to cover increasingly low-tier points of interest. Our primary technical contribution is an algorithm that addresses both these problems by maximizing the quality of the worst day. We give theoretical results showing that this algorithm's competitive factor is within a factor two of the guarantee of the best available algorithm for a single day, across many variations of the problem. We also give detailed empirical evaluations using two distinct datasets: (a) anonymized Google historical visit data and (b) Foursquare public check-in data. We show first that the overall utility of our itineraries is almost identical to that of algorithms specifically designed to maximize total utility, while the utility of the worst day of our itineraries is roughly twice that obtained from other approaches. We then turn to evaluation based on human raters who score our itineraries only slightly below the itineraries created by human travel experts with deep knowledge of the area.

【Keywords】: approximation algorithms; orienteering; travel itineraries

28. Unsubscription: A Simple Way to Ease Overload in Email.

【Paper Link】【Pages】:189-197

【Authors】: Iftah Gamzu ; Liane Lewin-Eytan ; Natalia Silberstein

【Abstract】: The constant growth of machine-generated mail, which today constitutes more than 90% of non-spam mail traffic, is a major contributor to information overload in email, where users become overwhelmed with a flood of messages from commercial entities. A large part of this traffic is often junk mail that the user would prefer not to receive. Surprisingly, nearly 95% of this traffic is in fact solicited by the users themselves in the form of subscriptions to mailing services. These subscriptions are often unintentional. Although an unsubscription option from such services is enforced by commercial laws, it is hardly ever used. We perform a large-scale study of unsubscribable traffic, namely, messages that provide an unsubscription option to users. We consider users' behavior over such traffic in the Yahoo Web mail service, and demonstrate a significant gap between users' low interest in this traffic and their lack of active behavior in decreasing its load. We conjecture that the cause of this gap is the lack of an efficient and easily accessible mechanism that would help users to unsubscribe. We validate our conjecture with an online large-scale experiment, where we provide users with a novel mail feature for managing unsubscribable traffic, based on personalized recommendations. The experiment demonstrates the imminent need that exists for such a mechanism.

【Keywords】: machine-generated mail; mail mining; unsubscription recommendations

29. Offline A/B Testing for Recommender Systems.

【Paper Link】【Pages】:198-206

【Authors】: Alexandre Gilotte ; Clément Calauzènes ; Thomas Nedelec ; Alexandre Abraham ; Simon Dollé

【Abstract】: Online A/B testing evaluates the impact of a new technology by running it in a real production environment and testing its performance on a subset of the users of the platform. It is a well-known practice to run a preliminary offline evaluation on historical data to iterate faster on new ideas, and to detect poor policies in order to avoid losing money or breaking the system. For such offline evaluations, we are interested in methods that can compute offline an estimate of the potential uplift of performance generated by a new technology. Offline performance can be measured using estimators known as counterfactual or off-policy estimators. Traditional counterfactual estimators, such as capped importance sampling or normalised importance sampling, exhibit unsatisfying bias-variance compromises when experimenting on personalized product recommendation systems. To overcome this issue, we model the bias incurred by these estimators rather than bound it in the worst case, which leads us to propose a new counterfactual estimator. We provide a benchmark of the different estimators showing their correlation with business metrics observed by running online A/B tests on a large-scale commercial recommender system.

【Keywords】: counterfactual estimation; importance sampling; off-policy evaluation; recommender system
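
The two traditional estimators this abstract names, capped and normalised importance sampling, can be sketched on logged data as follows. This is a generic illustration of the baselines only, not the new estimator the paper proposes; the function names, the cap value, and the toy log are assumptions.

```python
def importance_weights(target_probs, logging_probs):
    """Ratio of the new policy's probability of the logged action to the
    logging policy's probability of that same action."""
    return [p / q for p, q in zip(target_probs, logging_probs)]

def capped_is(rewards, weights, cap):
    """Capped importance sampling: clip each weight at `cap`, trading
    lower variance for some bias."""
    return sum(r * min(w, cap) for r, w in zip(rewards, weights)) / len(rewards)

def normalised_is(rewards, weights):
    """Normalised (self-normalised) importance sampling: divide by the
    sum of weights instead of the number of samples."""
    return sum(r * w for r, w in zip(rewards, weights)) / sum(weights)

# Toy log of four impressions (hypothetical numbers).
rewards = [1.0, 0.0, 1.0, 1.0]
logging_probs = [0.5, 0.5, 0.25, 0.1]
target_probs = [0.5, 0.25, 0.5, 0.5]

w = importance_weights(target_probs, logging_probs)  # about [1, 0.5, 2, 5]
capped = capped_is(rewards, w, cap=3.0)
normalised = normalised_is(rewards, w)
```

The last impression has weight 5 because the logging policy rarely chose that action; capping limits its influence, which is where the bias-variance compromise discussed in the abstract comes from.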

【Paper Link】【Pages】:207-215

【Authors】: Ido Guy ; Alexander Nus ; Dan Pelleg ; Idan Szpektor

【Abstract】: With mobile devices, users are taking ever-growing numbers of photos every day. These photos are uploaded to social sites such as Facebook and Flickr, often automatically. Yet, the portion of these uploaded photos being publicly shared is low, and on a constant decline. Deciding which photo to share takes considerable time and attention, and many users would rather forfeit the social interaction and engagement than sift through their piles of uploaded photos. In this paper, we introduce a novel task of recommending socially-engaging photos to their creators for public sharing. This will turn a tedious manual chore into a quick, software-assisted process. We provide extensive analysis over a large-scale dataset from the Flickr photo sharing website, which reveals some of the traits of photo sharing in such sites. Additionally, we present a ranking algorithm for the task that comprises three steps: (a) grouping of near-duplicate photos; (b) ranking the photos in each group by their "shareability"; and (c) ranking the groups by their likelihood to contain a shareable photo. A large-scale experiment allows us to evaluate our algorithm and show its benefits compared to competitive baselines and algorithmic alternatives.

【Keywords】: content sharing; flickr; image ranking; learning to rank; photo sharing; social recommender systems

31. Identifying Informational vs. Conversational Questions on Community Question Answering Archives.

【Paper Link】【Pages】:216-224

【Authors】: Ido Guy ; Victor Makarenkov ; Niva Hazon ; Lior Rokach ; Bracha Shapira

【Abstract】: Questions on community question answering websites usually reflect one of two intents: learning information or starting a conversation. In this paper, we revisit this fundamental classification task of informational versus conversational questions, which was originally introduced and studied in 2009. We use a substantially larger dataset of archived questions from Yahoo Answers, which includes the question's title, description, answers, and votes. We replicate the original experiments over this dataset, point out where our results agree with and differ from the original ones, and present a broad set of characteristics that distinguish the two question types. We also develop new classifiers that make use of additional data types, advanced machine learning, and a large dataset of unlabeled data, which achieve enhanced performance.

【Keywords】: community question answering; label propagation; long short term memory networks; user intent; yahoo answers

32. Robust Transfer Learning for Cross-domain Collaborative Filtering Using Multiple Rating Patterns Approximation.

【Paper Link】【Pages】:225-233

【Authors】: Ming He ; Jiuling Zhang ; Peng Yang ; Kaisheng Yao

【Abstract】: Collaborative filtering techniques are a common approach for building recommendations, and have been widely applied in real recommender systems. However, collaborative filtering usually suffers from limited performance due to the sparsity of user-item interaction. To address this issue, auxiliary information is usually used to improve the performance. Transfer learning provides the key idea of using knowledge from auxiliary domains. An assumption of transfer learning in collaborative filtering is that the source domain is a full rating matrix, which may not hold in many real-world applications. In this paper, we investigate how to leverage rating patterns from multiple incomplete source domains to improve the quality of recommender systems. First, by exploiting the transferred learning, we compress the knowledge from the source domain into a cluster-level rating matrix. The rating patterns in the low-level matrix can be transferred to the target domain. Specifically, we design a knowledge extraction method to enrich rating patterns by relaxing the full rating restriction on the source domain. Finally, we propose a robust multiple-rating-pattern transfer learning model for cross-domain collaborative filtering, which is called MINDTL, to accurately predict missing values in the target domain. Extensive experiments on real-world datasets demonstrate that our proposed approach is effective and outperforms several alternative methods.

【Keywords】: collaborative filtering; cross-domain; recommender system; transfer learning

33. Ballpark Crowdsourcing: The Wisdom of Rough Group Comparisons.

【Paper Link】【Pages】:234-242

【Authors】: Tom Hope ; Dafna Shahaf

【Abstract】: Crowdsourcing has become a popular method for collecting labeled training data. However, in many practical scenarios traditional labeling can be difficult for crowdworkers (for example, if the data is high-dimensional or unintuitive, or the labels are continuous). In this work, we develop a novel model for crowdsourcing that can complement standard practices by exploiting people's intuitions about groups and relations between them. We employ a recent machine learning setting, called Ballpark Learning, that can estimate individual labels given only coarse, aggregated signal over groups of data points. To address the important case of continuous labels, we extend the Ballpark setting (which focused on classification) to regression problems. We formulate the problem as a convex optimization problem and propose fast, simple methods with an innate robustness to outliers. We evaluate our methods on real-world datasets, demonstrating how useful constraints about groups can be harnessed from a crowd of non-experts. Our methods can rival supervised models trained on many true labels, and can obtain considerably better results from the crowd than a standard label-collection process (for a lower price). By collecting rough guesses on groups of instances and using machine learning to infer the individual labels, our lightweight framework is able to address core crowdsourcing challenges and train machine learning models in a cost-effective way.

【Keywords】: ballpark learning; crowdsourcing; group comparisons; learning from label proportions; machine learning; weak supervision

34. Collaborative Filtering via Additive Ordinal Regression.

【Paper Link】【Pages】:243-251

【Authors】: Jun Hu ; Ping Li

【Abstract】: Accurately predicting user preferences/ratings over items is crucial for many Internet applications, e.g., recommender systems and online advertising. In current mainstream algorithms for the rating prediction problem, discrete rating scores are often viewed as either numerical values or (nominal) categorical labels. Practically, viewing user rating scores as numerical values or categorical labels cannot precisely reflect the exact degree of user preferences. It is expected that for each user, the quantitative distance/scale between any pair of adjacent rating scores could be different. In this paper, we propose a new ordinal regression approach. We view ordered preference scores in an additive way, where we are able to model users' internal rating patterns. Specifically, we model and learn the quantitative distances/scales between any pair of adjacent rating scores. In this way, we can generate a mapping from users' assigned discrete rating scores to the exact magnitude/degree of user preferences for items. In the application of rating prediction, we combine our newly proposed ordinal regression method with matrix factorization, forming a new ordinal matrix factorization method. Through extensive experiments on benchmark datasets, we show that our method significantly outperforms existing ordinal methods, as well as other popular collaborative filtering methods in terms of the rating prediction accuracy.

【Keywords】: collaborative filtering; matrix factorization; ordinal regression

【Paper Link】【Pages】:252-260

【Authors】: Wenjian Hu ; Krishna Kumar Singh ; Fanyi Xiao ; Jinyoung Han ; Chen-Nee Chuah ; Yong Jae Lee

【Abstract】: Content popularity prediction has been extensively studied due to its importance and interest for both users and hosts of social media sites like Facebook, Instagram, Twitter, and Pinterest. However, existing work mainly focuses on modeling popularity using a single metric such as the total number of likes or shares. In this work, we propose Diffusion-LSTM, a memory-based deep recurrent network that learns to recursively predict the entire diffusion path of an image through a social network. By combining user social features and image features, and encoding the diffusion path taken thus far with an explicit memory cell, our model predicts the diffusion path of an image more accurately compared to alternate baselines that either encode only image or social features, or lack memory. By mapping individual users to user prototypes, our model can generalize to new users not seen during training. Finally, we demonstrate our model's capability of generating diffusion trees, and show that the generated trees closely resemble ground-truth trees.

【Keywords】: diffusion path prediction; online social networks; recurrent neural networks

36. Listening to Chaotic Whispers: A Deep Learning Framework for News-oriented Stock Trend Prediction.

【Paper Link】【Pages】:261-269

【Authors】: Ziniu Hu ; Weiqing Liu ; Jiang Bian ; Xuanzhe Liu ; Tie-Yan Liu

【Abstract】: Stock trend prediction plays a critical role in seeking maximized profit from stock investment. However, precise trend prediction is very difficult owing to the highly volatile and non-stationary nature of the stock market. Exploding information on the Internet together with the advancing development of natural language processing and text mining techniques have enabled investors to unveil market trends and volatility from online content. Unfortunately, the quality, trustworthiness, and comprehensiveness of online content related to the stock market vary drastically, and a large portion consists of low-quality news, comments, or even rumors. To address this challenge, we imitate the learning process of human beings facing such chaotic online news, driven by three principles: sequential content dependency, diverse influence, and effective and efficient learning. In this paper, to capture the first two principles, we design a Hybrid Attention Network (HAN) to predict the stock trend based on the sequence of recent related news. Moreover, we apply the self-paced learning mechanism to imitate the third principle. Extensive experiments on real-world stock market data demonstrate the effectiveness of our framework. A further simulation illustrates that a straightforward trading strategy based on our proposed framework can significantly increase the annualized return.

【Keywords】: deep learning; stock trend prediction; text mining

37. Exploring Expert Cognition for Attributed Network Embedding.

【Paper Link】【Pages】:270-278

【Authors】: Xiao Huang ; Qingquan Song ; Jundong Li ; Xia Hu

【Abstract】: Attributed network embedding has been widely used in modeling real-world systems. The obtained low-dimensional vector representations of nodes preserve their proximity in terms of both network topology and node attributes, upon which different analysis algorithms can be applied. Recent advances in explanation-based learning and human-in-the-loop models show that by involving experts, the performance of many learning tasks can be enhanced. This is because experts have a better cognition of latent information such as domain knowledge, conventions, and hidden relations. It motivates us to employ experts to transform their meaningful cognition into concrete data to advance network embedding. However, learning and incorporating the expert cognition into the embedding remains a challenging task, because expert cognition does not have a concrete form, is difficult to measure, and is laborious to obtain. Also, in a real-world network, there are various types of expert cognition such as the comprehension of word meaning and the discernment of similar nodes. It is nontrivial to identify the types that could lead to a significant improvement in the embedding. In this paper, we study a novel problem of exploring expert cognition for attributed network embedding and propose a principled framework NEEC. We formulate the process of learning expert cognition as a task of asking experts a number of concise and general queries. Guided by the exemplar theory and prototype theory in cognitive science, the queries are systematically selected and can be generalized to various real-world networks. The returned answers from the experts contain their valuable cognition. We model them as new edges and add them directly to the attributed network, upon which different embedding methods can be applied towards a more informative embedding representation. Experiments on real-world datasets verify the effectiveness and efficiency of NEEC.

【Keywords】: attributed networks; human cognition; human-in-the-loop; network embedding

38. Co-PACRR: A Context-Aware Neural IR Model for Ad-hoc Retrieval.

【Paper Link】【Pages】:279-287

【Authors】: Kai Hui ; Andrew Yates ; Klaus Berberich ; Gerard de Melo

【Abstract】: Neural IR models, such as DRMM and PACRR, have achieved strong results by successfully capturing relevance matching signals. We argue that the context of these matching signals is also important. Intuitively, when extracting, modeling, and combining matching signals, one would like to consider the surrounding text (local context) as well as other signals from the same document that can contribute to the overall relevance score. In this work, we highlight three potential shortcomings caused by not considering context information and propose three neural ingredients to address them: a disambiguation component, cascade k-max pooling, and a shuffling combination layer. Incorporating these components into the PACRR model yields Co-PACRR, a novel context-aware neural IR model. Extensive comparisons with established models on TREC Web Track data confirm that the proposed model can achieve superior search results. In addition, an ablation analysis is conducted to gain insights into the impact of and interactions between different components. We release our code to enable future comparisons.

【Keywords】:

39. Recommendation in Heterogeneous Information Networks Based on Generalized Random Walk Model and Bayesian Personalized Ranking.

【Paper Link】【Pages】:288-296

【Authors】: Zhengshen Jiang ; Hongzhi Liu ; Bin Fu ; Zhonghai Wu ; Tao Zhang

【Abstract】: Recommendation based on heterogeneous information network (HIN) is attracting more and more attention due to its ability to emulate collaborative filtering, content-based filtering, context-aware recommendation and combinations of any of these recommendation semantics. Random walk based methods are usually used to mine the paths, weigh the paths, and compute the closeness or relevance between two nodes in a HIN. A key for the success of these methods is how to properly set the weights of links in a HIN. In existing methods, the weights of links are mostly set heuristically. In this paper, we propose a Bayesian Personalized Ranking (BPR) based machine learning method, called HeteLearn, to learn the weights of links in a HIN. In order to model user preferences for personalized recommendation, we also propose a generalized random walk with restart model on HINs. We evaluate the proposed method in a personalized recommendation task and a tag recommendation task. Experimental results show that our method performs significantly better than both the traditional collaborative filtering and the state-of-the-art HIN-based recommendation methods.

【Keywords】: bayesian personalized ranking; heterogeneous information network; random walk; recommender systems
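
To make the role of link weights concrete, a plain random walk with restart over a weighted graph can be sketched as below. This is the standard formulation with hand-set weights, offered only as background; it is neither the generalized model nor the learned weights (HeteLearn) described in the abstract. The graph, node names, and restart probability are assumptions, and every link target is assumed to appear as a key of the graph.

```python
def random_walk_with_restart(weights, seed, restart=0.15, iters=100):
    """Power iteration for random walk with restart from `seed`.
    `weights[u][v]` is the non-negative weight of the link u -> v."""
    nodes = list(weights)
    # Normalise outgoing weights into transition probabilities.
    trans = {}
    for u in nodes:
        total = sum(weights[u].values())
        trans[u] = {v: w / total for v, w in weights[u].items()} if total > 0 else {}

    scores = {u: (1.0 if u == seed else 0.0) for u in nodes}
    for _ in range(iters):
        nxt = {u: 0.0 for u in nodes}
        for u in nodes:
            if trans[u]:
                for v, p in trans[u].items():
                    nxt[v] += (1 - restart) * scores[u] * p
            else:
                # Dangling node: send its mass back to the seed.
                nxt[seed] += (1 - restart) * scores[u]
        nxt[seed] += restart  # restart mass always returns to the seed
        scores = nxt
    return scores

# Tiny user-item graph with hypothetical link weights.
graph = {
    "user": {"item_a": 3.0, "item_b": 1.0},
    "item_a": {"user": 1.0},
    "item_b": {"user": 1.0},
}
scores = random_walk_with_restart(graph, seed="user")
```

Changing the weight on the user-to-item links changes the ranking directly, which is why the abstract treats setting those weights as the key problem.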

40. Fast and Scalable Distributed Loopy Belief Propagation on Real-World Graphs.

【Paper Link】【Pages】:297-305

【Authors】: Saehan Jo ; Jaemin Yoo ; U. Kang

【Abstract】: Given graphs with millions or billions of vertices and edges, how can we efficiently make inferences based on partial knowledge? Loopy Belief Propagation (LBP) is a graph inference algorithm widely used in various applications including social network analysis, malware detection, recommendation, and image restoration. The algorithm calculates approximate marginal probabilities of vertices in a graph within a linear running time proportional to the number of edges. However, when it comes to real-world graphs with millions or billions of vertices and edges, this cost overwhelms the computing power of a single machine. Moreover, this kind of large-scale graphs does not fit into the memory of a single machine. Although several distributed LBP methods have been proposed, previous works do not consider the properties of real-world graphs, especially the effect of power-law degree distribution on LBP. Therefore, our work focuses on developing a fast and scalable LBP for such large real-world graphs on distributed environment. In this paper, we propose DLBP, a Distributed Loopy Belief Propagation algorithm which efficiently computes LBP in a distributed manner across multiple machines. By setting the correct convergence criterion and carefully scheduling the computations, DLBP provides up to 10.7x speed up compared to standard distributed LBP. We show that DLBP demonstrates near-linear scalability with respect to the number of machines as well as the number of edges.

【Keywords】: distributed graph processing; loopy belief propagation; real-world graphs
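
As an illustration of the single-machine LBP that the abstract above starts from, here is a minimal sum-product sketch. It assumes binary vertex states, a single symmetric edge potential (`coupling`, a hypothetical parameter), and a fixed number of iterations; it does not reproduce DLBP's scheduling or convergence criterion.

```python
def loopy_bp(edges, priors, coupling=0.9, iters=20):
    """Approximate marginals on an undirected graph with binary states.

    edges: iterable of (u, v) pairs; priors: node -> [p(state 0), p(state 1)].
    coupling: assumed probability mass the edge potential gives to equal states.
    """
    nbrs = {v: set() for v in priors}
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)

    psi = [[coupling, 1.0 - coupling], [1.0 - coupling, coupling]]
    # One message per directed edge, initialised to uniform.
    msg = {(u, v): [0.5, 0.5] for u in nbrs for v in nbrs[u]}

    for _ in range(iters):
        new = {}
        for (u, v) in msg:
            out = [0.0, 0.0]
            for x_u in (0, 1):
                # Prior of u times all incoming messages except the one from v.
                prod = priors[u][x_u]
                for w in nbrs[u]:
                    if w != v:
                        prod *= msg[(w, u)][x_u]
                for x_v in (0, 1):
                    out[x_v] += prod * psi[x_u][x_v]
            z = out[0] + out[1]
            new[(u, v)] = [out[0] / z, out[1] / z]
        msg = new

    beliefs = {}
    for v in nbrs:
        b = list(priors[v])
        for w in nbrs[v]:
            b[0] *= msg[(w, v)][0]
            b[1] *= msg[(w, v)][1]
        z = b[0] + b[1]
        beliefs[v] = [b[0] / z, b[1] / z]
    return beliefs


# Triangle (a loopy graph): only vertex "a" carries partial knowledge.
beliefs = loopy_bp(
    edges=[("a", "b"), ("b", "c"), ("a", "c")],
    priors={"a": [0.9, 0.1], "b": [0.5, 0.5], "c": [0.5, 0.5]},
)
```

Each iteration visits every directed edge once, which is where the per-iteration cost tied to the number of edges comes from; a production implementation would also avoid recomputing the product of incoming messages for every outgoing edge.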

41. Combating Crowdsourced Review Manipulators: A Neighborhood-Based Approach.

【Paper Link】【Pages】:306-314

【Authors】: Parisa Kaghazgaran ; James Caverlee ; Anna Cinzia Squicciarini

【Abstract】: We propose a system called TwoFace to uncover crowdsourced review manipulators who target online review systems. A unique feature of TwoFace is its three-phase framework: (i) in the first phase, we intelligently sample actual evidence of manipulation (e.g., review manipulators) by exploiting low moderation crowdsourcing platforms that reveal evidence of strategic manipulation; (ii) we then propagate the suspiciousness of these seed users to identify similar users through a random walk over a "suspiciousness" graph; and (iii) finally, we uncover (hidden) distant users who serve structurally similar roles by mapping users into a low-dimensional embedding space that captures community structure. Altogether, the TwoFace system recovers 83% to 93% of all manipulators in a sample from Amazon of 38,590 reviewers, even when the system is seeded with only a few samples from malicious crowdsourcing sites.

【Keywords】: embedding network; fake reviews; malicious crowdsourcing; online review; review manipulation; suspiciousness propagation

42. Topic Chronicle Forest for Topic Discovery and Tracking.

【Paper Link】【Pages】:315-323

【Authors】: Noriaki Kawamae

【Abstract】: To ease comprehension of given time-stamped corpora, we extend topic models to handle both the specificity and temporality of topics; this is a significant advance over previous models which fail to provide both views simultaneously. Our proposed model consists of the Topic Chronicle Forest (TCF) and Thematic Dirichlet Processes (TDP). TCF is a set of Topic Chronicle Trees, where each tree is a hierarchy of topics that becomes more specialized toward the leaves. Only one tree is defined in each time interval, a region, and is used for TDP to generate a document. The advantage of our approach lies in providing more compact topic organization, while preserving both the semantic of a given corpus and the thematic of each document. Experiments show that TCF is a useful extension for longitudinal topic discovery and tracking, and helps us to organize and digest data sets.

【Keywords】: hierarchical bayesian nonparametrics; hierarchical dirichlet processes; time varying topic models; trend analysis

43. Leveraging the Crowd to Detect and Reduce the Spread of Fake News and Misinformation.

【Paper Link】【Pages】:324-332

【Authors】: Jooyeon Kim ; Behzad Tabibian ; Alice Oh ; Bernhard Schölkopf ; Manuel Gomez-Rodriguez

【Abstract】: Online social networking sites are experimenting with the following crowd-powered procedure to reduce the spread of fake news and misinformation: whenever a user is exposed to a story through her feed, she can flag the story as misinformation and, if the story receives enough flags, it is sent to a trusted third party for fact checking. If this party identifies the story as misinformation, it is marked as disputed. However, given the uncertain number of exposures, the high cost of fact checking, and the trade-off between flags and exposures, the above mentioned procedure requires careful reasoning and smart algorithms which, to the best of our knowledge, do not exist to date. In this paper, we first introduce a flexible representation of the above procedure using the framework of marked temporal point processes. Then, we develop a scalable online algorithm, CURB, to select which stories to send for fact checking and when to do so to efficiently reduce the spread of misinformation with provable guarantees. In doing so, we need to solve a novel stochastic optimal control problem for stochastic differential equations with jumps, which is of independent interest. Experiments on two real-world datasets gathered from Twitter and Weibo show that our algorithm may be able to effectively reduce the spread of fake news and misinformation.

【Keywords】: crowdsourcing; fact-checking; fake news; misinformation; social networking sites; stochastic differential equation; stochastic optimal control; temporal point processes

44. REV2: Fraudulent User Prediction in Rating Platforms.

【Paper Link】【Pages】:333-341

【Authors】: Srijan Kumar ; Bryan Hooi ; Disha Makhija ; Mohit Kumar ; Christos Faloutsos ; V. S. Subrahmanian

【Abstract】: Rating platforms enable large-scale collection of user opinion about items (e.g., products or other users). However, untrustworthy users give fraudulent ratings for excessive monetary gains. In this paper, we present REV2, a system to identify such fraudulent users. We propose three interdependent intrinsic quality metrics: fairness of a user, reliability of a rating and goodness of a product. The fairness and reliability quantify the trustworthiness of a user and rating, respectively, and goodness quantifies the quality of a product. Intuitively, a user is fair if it provides reliable scores that are close to the goodness of products. We propose six axioms to establish the interdependency between the scores, and then, formulate a mutually recursive definition that satisfies these axioms. We extend the formulation to address the cold start problem and incorporate behavior properties. We develop the REV2 algorithm to calculate these intrinsic quality scores for all users, ratings, and products. We show that this algorithm is guaranteed to converge and has linear time complexity. By conducting extensive experiments on five rating datasets, we show that REV2 outperforms nine existing algorithms in detecting fair and unfair users. We reported the 150 most unfair users in the Flipkart network to their review fraud investigators, and 127 users were identified as being fraudulent (84.6% accuracy). The REV2 algorithm is being deployed at Flipkart.

【Keywords】:
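
The mutually recursive idea in the abstract above (fairness, goodness and reliability defined in terms of one another) can be sketched as a fixed-point iteration. The update rules below are an assumed, simplified form chosen for illustration, with scores in [-1, 1], equal weighting of the two reliability terms, and at most one rating per user-product pair; they omit the cold-start and behavior extensions and should not be read as the exact REV2 formulation.

```python
def fairness_goodness_reliability(ratings, iters=50):
    """ratings: list of (user, product, score) with score in [-1, 1]."""
    users = {u for u, _, _ in ratings}
    products = {p for _, p, _ in ratings}
    fairness = {u: 1.0 for u in users}
    goodness = {p: 1.0 for p in products}
    reliability = {(u, p): 1.0 for u, p, _ in ratings}

    for _ in range(iters):
        # Goodness of a product: reliability-weighted mean of its scores.
        for p in products:
            vals = [reliability[(u, q)] * s for u, q, s in ratings if q == p]
            goodness[p] = sum(vals) / len(vals)
        # Reliability of a rating: fair author and agreement with goodness.
        for u, p, s in ratings:
            agreement = 1.0 - abs(s - goodness[p]) / 2.0
            reliability[(u, p)] = 0.5 * (fairness[u] + agreement)
        # Fairness of a user: mean reliability of the ratings they gave.
        for u in users:
            vals = [reliability[(v, p)] for v, p, _ in ratings if v == u]
            fairness[u] = sum(vals) / len(vals)
    return fairness, goodness, reliability


# Hypothetical data: three users agree, one rates against the consensus.
ratings = [
    ("h1", "p1", 1.0), ("h2", "p1", 1.0), ("h3", "p1", 1.0), ("x", "p1", -1.0),
    ("h1", "p2", -1.0), ("h2", "p2", -1.0), ("h3", "p2", -1.0), ("x", "p2", 1.0),
]
fairness, goodness, reliability = fairness_goodness_reliability(ratings)
```

Users whose ratings sit far from the reliability-weighted consensus end up with lower fairness, which in turn shrinks their influence on goodness in the next round.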

45. Web Search of Fashion Items with Multimodal Querying.

【Paper Link】【Pages】:342-350

【Authors】: Katrien Laenen ; Susana Zoghbi ; Marie-Francine Moens

【Abstract】: In this paper, we introduce a novel multimodal fashion search paradigm where e-commerce data is searched with a multimodal query composed of both an image and text. In this setting, the query image shows a fashion product that the user likes and the query text allows to change certain product attributes to fit the product to the user's desire. Multimodal search gives users the means to clearly express what they are looking for. This is in contrast to current e-commerce search mechanisms, which are cumbersome and often fail to grasp the customer's needs. Multimodal search requires intermodal representations of visual and textual fashion attributes which can be mixed and matched to form the user's desired product, and which have a mechanism to indicate when a visual and textual fashion attribute represent the same concept. With a neural network, we induce a common, multimodal space for visual and textual fashion attributes where their inner product measures their semantic similarity. We build a multimodal retrieval model which operates on the obtained intermodal representations and which ranks images based on their relevance to a multimodal query. We demonstrate that our model is able to retrieve images that both exhibit the necessary query image attributes and satisfy the query texts. Moreover, we show that our model substantially outperforms two state-of-the-art retrieval models adapted to multimodal fashion search.

【Keywords】: intermodal representation; multimodal embedding space; multimodal retrieval model; multimodal search

46. Joint Non-negative Matrix Factorization for Learning Ideological Leaning on Twitter.

【Paper Link】【Pages】:351-359

【Authors】: Preethi Lahoti ; Kiran Garimella ; Aristides Gionis

【Abstract】: People are shifting from traditional news sources to online news at an incredibly fast rate. However, the technology behind online news consumption promotes content that confirms the users' existing point of view. This phenomenon has led to polarization of opinions and intolerance towards opposing views. Thus, a key problem is to model information filter bubbles on social media and design methods to eliminate them. In this paper, we use a machine-learning approach to learn a liberal-conservative ideology space on Twitter, and show how we can use the learned latent space to tackle the filter bubble problem. We model the problem of learning the liberal-conservative ideology space of social media users and media sources as a constrained non-negative matrix-factorization problem. Our model incorporates the social-network structure and content-consumption information in a joint factorization problem with shared latent factors. We validate our model and solution on a real-world Twitter dataset consisting of controversial topics, and show that we are able to separate users by ideology with over 90% purity. When applied to media sources, our approach estimates ideology scores that are highly correlated (Pearson correlation 0.9) with ground-truth ideology scores. Finally, we demonstrate the utility of our model in real-world scenarios, by illustrating how the learned ideology latent space can be used to develop exploratory and interactive interfaces that can help users in diffusing their information filter bubble.

【Keywords】: combining link and content; graph regularization; ideology; information filter bubble; latent space learning; manifold learning; matrix factorization; polarization; social networks; twitter

47. Bayesian Optimization for Optimizing Retrieval Systems.

【Paper Link】【Pages】:360-368

【Authors】: Dan Li ; Evangelos Kanoulas

【Abstract】: The effectiveness of information retrieval systems heavily depends on a large number of hyperparameters that need to be tuned. Hyperparameters range from the choice of different system components, e.g., stopword lists, stemming methods, or retrieval models, to model parameters, such as k1 and b in BM25, or the number of query expansion terms. Grid and random search, the dominant methods to search for the optimal system configuration, lack a search strategy that can guide them in the hyperparameter space. This makes them inefficient and ineffective. In this paper, we propose to use Bayesian Optimization to jointly search and optimize over the hyperparameter space. Bayesian Optimization, a sequential decision making method, suggests the next most promising configuration to be tested on the basis of the retrieval effectiveness of configurations that have been examined so far. To demonstrate the efficiency and effectiveness of Bayesian Optimization we conduct experiments on TREC collections, and show that Bayesian Optimization outperforms manual tuning, grid search and random search, both in terms of retrieval effectiveness of the configuration found, and in terms of efficiency in finding this configuration.

【Keywords】: bayesian optimization; covariance function; hyperparameter optimisation; retrieval system
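
To show where the hyperparameters k1 and b named in the abstract above enter a retrieval model, here is a minimal BM25 scoring sketch. It assumes pre-tokenised documents and one common IDF variant, log(1 + (N - n + 0.5) / (n + 0.5)); it is not the experimental setup of the paper and contains no optimizer.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenised document in `docs` against the tokenised `query`.

    k1 controls term-frequency saturation; b controls length normalisation
    (b = 0 disables it, b = 1 applies it fully).
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1.0 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1.0 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / norm
        scores.append(score)
    return scores


docs = [
    ["web", "search", "engine"],
    ["graph", "mining"],
    ["web", "web", "search", "ranking", "models", "for", "web"],
]
query = ["web", "search"]
default_scores = bm25_scores(query, docs)
no_length_norm = bm25_scores(query, docs, b=0.0)
```

Any tuner, whether grid search, random search or Bayesian optimization, would repeatedly call a function like this with different `(k1, b)` pairs and compare the resulting retrieval effectiveness on held-out queries.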

48. Streaming Link Prediction on Dynamic Attributed Networks.

【Paper Link】【Pages】:369-377

【Authors】: Jundong Li ; Kewei Cheng ; Liang Wu ; Huan Liu

【Abstract】: Link prediction targets to predict the future node interactions mainly based on the current network snapshot. It is a key step in understanding the formation and evolution of the underlying networks; and has practical implications in many real-world applications, ranging from friendship recommendation, click through prediction to targeted advertising. Most existing efforts are devoted to plain networks and assume the availability of network structure in memory before link prediction takes place. However, this assumption is untenable as many real-world networks are affiliated with rich node attributes, and often, the network structure and node attributes are both dynamically evolving at an unprecedented rate. Even though recent studies show that node attributes have an added value to network structure for accurate link prediction, it still remains a daunting task to support link prediction in an online fashion on such dynamic attributed networks. As changes in the dynamic attributed networks are often transient and can be endless, link prediction algorithms need to be efficient by making only one pass of the data with limited memory overhead. To tackle these challenges, we study a novel problem of streaming link prediction on dynamic attributed networks and present a novel framework - SLIDE. Methodologically, SLIDE maintains and updates a low-rank sketching matrix to summarize all observed data, and we further leverage the sketching matrix to infer missing links on the fly. The whole procedure is theoretically guaranteed, and empirical experiments on real-world dynamic attributed networks validate the effectiveness and efficiency of the proposed framework.

【Keywords】: data stream; dynamic attributed networks; link prediction; matrix sketching

49. Inferring Dockless Shared Bike Distribution in New Cities.

【Paper Link】【Pages】:378-386

【Authors】: Zhaoyang Liu ; Yanyan Shen ; Yanmin Zhu

【Abstract】: Recently, dockless shared bike services have achieved great success and reinvented bike sharing business in China. When expanding bike sharing business into a new city, most start-ups always wish to find out how to cover the whole city with a suitable bike distribution. In this paper, we study the problem of inferring bike distribution in new cities, which is challenging. As no dockless bikes are deployed in the new city, we propose to learn insights on bike distribution from cities populated with dockless bikes. We exploit multi-source data to identify important features that affect bike distributions and develop a novel inference model combining Factor Analysis and Convolutional Neural Network techniques. The extensive experiments on real-life datasets show that the proposed solution provides significantly more accurate inference results compared with competitive prediction methods.

【Keywords】: distribution inference; dockless shared bikes; geoconv neural network; urban computing

50. Multi-Dimensional Network Embedding with Hierarchical Structure.

【Paper Link】【Pages】:387-395

【Authors】: Yao Ma ; Zhaochun Ren ; Ziheng Jiang ; Jiliang Tang ; Dawei Yin

【Abstract】: Information networks are ubiquitous in many applications. A popular way to facilitate the information in a network is to embed the network structure into low-dimension spaces where each node is represented as a vector. The learned representations have been proven to advance various network analysis tasks such as link prediction and node classification. The majority of existing embedding algorithms are designed for the networks with one type of nodes and one dimension of relations among nodes. However, many networks in the real-world complex systems have multiple types of nodes and multiple dimensions of relations. For example, an e-commerce network can have users and items, and items can be viewed or purchased by users, corresponding to two dimensions of relations. In addition, some types of nodes can present hierarchical structure. For example, authors in publication networks are associated to affiliations; and items in e-commerce networks belong to categories. Most of existing methods cannot be naturally applicable to these networks. In this paper, we aim to learn representations for networks with multiple dimensions and hierarchical structure. In particular, we provide an approach to capture independent information from each dimension and dependent information across dimensions and propose a framework MINES, which performs Multi-dImension Network Embedding with hierarchical Structure. Experimental results on a network from a real-world e-commerce website demonstrate the effectiveness of the proposed framework.

【Keywords】: hierarchical structure; multi-dimensional networks; network embedding

51. Query Driven Algorithm Selection in Early Stage Retrieval.

【Paper Link】【Pages】:396-404

【Authors】: Joel Mackenzie ; J. Shane Culpepper ; Roi Blanco ; Matt Crane ; Charles L. A. Clarke ; Jimmy Lin

【Abstract】: Large scale retrieval systems often employ cascaded ranking architectures, in which an initial set of candidate documents are iteratively refined and re-ranked by increasingly sophisticated and expensive ranking models. In this paper, we propose a unified framework for predicting a range of performance-sensitive parameters based on minimizing end-to-end effectiveness loss. The framework does not require relevance judgments for training, is amenable to predicting a wide range of parameters, allows for fine tuned efficiency-effectiveness trade-offs, and can be easily deployed in large scale search systems with minimal overhead. As a proof of concept, we show that the framework can accurately predict a number of performance parameters on a query-by-query basis, allowing efficient and effective retrieval, while simultaneously minimizing the tail latency of an early-stage candidate generation system. On the 50 million document ClueWeb09B collection, and across 25,000 queries, our hybrid system can achieve superior early-stage efficiency to fixed parameter systems without loss of effectiveness, and allows more finely-grained efficiency-effectiveness trade-offs across the multiple stages of the retrieval system.

【Keywords】: experimentation; measurement; multi-stage retrieval; performance; query prediction

52. Index Compression Using Byte-Aligned ANS Coding and Two-Dimensional Contexts.

【Paper Link】【Pages】:405-413

【Authors】: Alistair Moffat ; Matthias Petri

【Abstract】: We examine approaches used for block-based inverted index compression, such as the OptPFOR mechanism, in which fixed-length blocks of postings data are compressed independently of each other. Building on previous work in which asymmetric numeral systems (ANS) entropy coding is used to represent each block, we explore a number of enhancements: (i) the use of two-dimensional conditioning contexts, with two aggregate parameters used in each block to categorize the distribution of symbol values that underlies the ANS approach, rather than just one; (ii) the use of a byte-friendly strategic mapping from symbols to ANS codeword buckets; and (iii) the use of a context merging process to combine similar probability distributions. Collectively, these improvements yield superior compression for index data, outperforming the reference point set by the Interp mechanism, and hence representing a significant step forward. We describe experiments using the 426 GiB gov2 collection and a new large collection of publicly-available news articles to demonstrate that claim, and provide query evaluation throughput rates compared to other block-based mechanisms.

【Keywords】: asymmetric numeral systems; entropy coder; index compression; inverted index; postings list

53. Fusing Diversity in Recommendations in Heterogeneous Information Networks.

【Paper Link】【Pages】:414-422

【Authors】: Sharad Nandanwar ; Aayush Moroney ; M. Narasimha Murty

【Abstract】: In the past, hybrid recommender systems have shown the power of exploiting relationships amongst objects which directly or indirectly affect the recommendation task. However, the effect of all relations is not equal, and choosing their right balance for a recommendation problem at hand is non-trivial. We model these interactions using a Heterogeneous Information Network, and propose a systematic framework for learning their influence weights for a given recommendation task. Further, we address the issue of redundant results, which is very much prevalent in recommender systems. To alleviate redundancy in recommendations we use Vertex Reinforced Random Walk (a non-Markovian random walk) over a heterogeneous graph. It works by boosting the transitions to the influential nodes, while simultaneously shrinking the weights of others. This helps in discouraging recommendation of multiple influential nodes which lie in close proximity of each other, thus ensuring diversity. Finally, we demonstrate the effectiveness of our approach by experimenting on real world datasets. We find that, with the weights of relations learned using the proposed non-Markovian random walk based framework, the results consistently improve over the baselines.

【Keywords】: diversity in recommendation; heterogeneous information network; multivariate random walk; vertex reinforced random walk

54. Neural Personalized Ranking for Image Recommendation.

【Paper Link】【Pages】:423-431

【Authors】: Wei Niu ; James Caverlee ; Haokai Lu

【Abstract】: We propose a new model toward improving the quality of image recommendations in social sharing communities like Pinterest, Flickr, and Instagram. Concretely, we propose Neural Personalized Ranking (NPR) -- a personalized pairwise ranking model over implicit feedback datasets -- that is inspired by Bayesian Personalized Ranking (BPR) and recent advances in neural networks. We further build an enhanced model by augmenting the basic NPR model with multiple contextual preference clues including user tags, geographic features, and visual factors. In our experiments over the Flickr YFCC100M dataset, we demonstrate the proposed NPR model is more effective than multiple baselines. Moreover, the contextual enhanced NPR model significantly outperforms the base model by 16.6% and a contextual enhanced BPR model by 4.5% in precision and recall.

【Keywords】: contextual; geo-social; image recommendation; implicit feedback; neural personalized ranking; user preference

55. Learning to Discover Domain-Specific Web Content.

【Paper Link】【Pages】:432-440

【Authors】: Kien Pham ; Aécio S. R. Santos ; Juliana Freire

【Abstract】: The ability to discover all content relevant to an information domain has many applications, from helping in the understanding of humanitarian crises to countering human and arms trafficking. In such applications, time is of essence: it is crucial to both maximize coverage and identify new content as soon as it becomes available, so that appropriate actions can be taken. In this paper, we propose new methods for efficient domain-specific re-crawling that maximize the yield for new content. By learning patterns of pages that have a high yield, our methods select a small set of pages that can be re-crawled frequently, increasing the coverage and freshness while conserving resources. Unlike previous approaches to this problem, our methods combine different factors to optimize the re-crawling strategy, do not require full snapshots for the learning step, and dynamically adapt the strategy as the crawl progresses. In an empirical evaluation, we have simulated the framework over 600 partial crawl snapshots in three different domains. The results show that our approach can achieve 150% higher coverage compared to existing, state-of-the-art techniques. In addition, it is also able to capture 80% of new relevant content within less than 4 hours of publication.

【Keywords】: content discovery; focused crawling; web monitoring

56. Extreme Multi-label Learning with Label Features for Warm-start Tagging, Ranking & Recommendation.

【Paper Link】【Pages】:441-449

【Authors】: Yashoteja Prabhu ; Anil Kag ; Shilpa Gopinath ; Kunal Dahiya ; Shrutendra Harsola ; Rahul Agrawal ; Manik Varma

【Abstract】: The objective in extreme multi-label learning is to build classifiers that can annotate a data point with the subset of relevant labels from an extremely large label set. Extreme classification has, thus far, only been studied in the context of predicting labels for novel test points. This paper formulates the extreme classification problem when predictions need to be made on training points with partially revealed labels. This allows the reformulation of warm-start tagging, ranking and recommendation problems as extreme multi-label learning with each item to be ranked/recommended being mapped onto a separate label. The SwiftXML algorithm is developed to tackle such warm-start applications by leveraging label features. SwiftXML improves upon the state-of-the-art tree based extreme classifiers by partitioning tree nodes using two hyperplanes learnt jointly in the label and data point feature spaces. Optimization is carried out via an alternating minimization algorithm allowing SwiftXML to efficiently scale to large problems. Experiments on multiple benchmark tasks, including tagging on Wikipedia and item-to-item recommendation on Amazon, reveal that SwiftXML's predictions can be up to 14% more accurate as compared to leading extreme classifiers. SwiftXML also demonstrates the benefits of reformulating warm-start recommendation problems as extreme multi-label learning tasks by scaling beyond classical recommender systems and achieving prediction accuracy gains of up to 37%. Furthermore, in a live deployment for sponsored search on Bing, it was observed that SwiftXML could show significantly better ads as compared to a large ensemble of state-of-the-art techniques currently in production and could increase the relative click-through-rate by 10% while simultaneously reducing the bounce rate by 30%.

【Keywords】: extreme multi-label learning; large scale recommender systems with user and item features; sponsored search

57. DSANLS: Accelerating Distributed Nonnegative Matrix Factorization via Sketching.

【Paper Link】【Pages】:450-458

【Authors】: Yuqiu Qian ; Conghui Tan ; Nikos Mamoulis ; David W. Cheung

【Abstract】: Nonnegative matrix factorization (NMF) has been successfully applied in different fields, such as text mining, image processing, and video analysis. NMF is the problem of determining two nonnegative low rank matrices U and V, for a given input matrix M, such that M ≈ UV^T. There is an increasing interest in parallel and distributed NMF algorithms, due to the high cost of centralized NMF on large matrices. In this paper, we propose a distributed sketched alternating nonnegative least squares (DSANLS) framework for NMF, which utilizes a matrix sketching technique to reduce the size of nonnegative least squares subproblems in each iteration for U and V. We design and analyze two different random matrix generation techniques and two subproblem solvers. Our theoretical analysis shows that DSANLS converges to the stationary point of the original NMF problem and it greatly reduces the computational cost in each subproblem as well as the communication cost within the cluster. DSANLS is implemented using MPI for communication, and tested on both dense and sparse real datasets. The results demonstrate the efficiency and scalability of our framework, compared to the state-of-the-art distributed NMF MPI implementation.

【Keywords】:
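
The factorization problem and the effect of sketching described in the abstract above can be written out as follows. This is a hedged sketch: the dimensions (M of size m × n, U of size m × k, V of size n × k) and the random matrix S of size n × d with d much smaller than n are notation assumed here, and the exact sketched subproblems used by DSANLS may differ.

```latex
\begin{align*}
  % NMF objective
  &\min_{U \ge 0,\; V \ge 0} \; \lVert M - U V^{\top} \rVert_F^2 \\
  % Alternating nonnegative least squares: fix one factor, solve for the other
  &U \leftarrow \arg\min_{U \ge 0} \lVert M - U V^{\top} \rVert_F^2, \qquad
   V \leftarrow \arg\min_{V \ge 0} \lVert M^{\top} - V U^{\top} \rVert_F^2 \\
  % Sketched U-subproblem (assumed form): n columns are reduced to d
  &U \leftarrow \arg\min_{U \ge 0} \lVert M S - U \,(V^{\top} S) \rVert_F^2,
   \qquad S \in \mathbb{R}^{n \times d},\; d \ll n
\end{align*}
```

Multiplying by S shrinks the least squares problem from n columns to d, which is the source of the reduced per-iteration computation and communication the abstract refers to.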

58. Network Embedding as Matrix Factorization: Unifying DeepWalk, LINE, PTE, and node2vec.

【Paper Link】【Pages】:459-467

【Authors】: Jiezhong Qiu ; Yuxiao Dong ; Hao Ma ; Jian Li ; Kuansan Wang ; Jie Tang

【Abstract】: Since the invention of word2vec, the skip-gram model has significantly advanced the research of network embedding, such as the recent emergence of the DeepWalk, LINE, PTE, and node2vec approaches. In this work, we show that all of the aforementioned models with negative sampling can be unified into the matrix factorization framework with closed forms. Our analysis and proofs reveal that: (1) DeepWalk empirically produces a low-rank transformation of a network's normalized Laplacian matrix; (2) LINE, in theory, is a special case of DeepWalk when the size of vertices' context is set to one; (3) As an extension of LINE, PTE can be viewed as the joint factorization of multiple networks' Laplacians; (4) node2vec is factorizing a matrix related to the stationary distribution and transition probability tensor of a 2nd-order random walk. We further provide the theoretical connections between skip-gram based network embedding algorithms and the theory of graph Laplacian. Finally, we present the NetMF method as well as its approximation algorithm for computing network embedding. Our method offers significant improvements over DeepWalk and LINE for conventional network mining tasks. This work lays the theoretical foundation for skip-gram based network embedding methods, leading to a better understanding of latent network representation learning.

【Keywords】: graph spectral; matrix factorization; network embedding; representation learning; social network
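
Point (2) of the abstract above is easiest to see from the closed forms themselves. The expressions below are the forms commonly quoted for this result; the notation is assumed here (A is the adjacency matrix, D the diagonal degree matrix, vol(G) the sum of all degrees, T the context window size, b the number of negative samples, and the logarithm is applied element-wise), and they should be checked against the paper before being relied on.

```latex
\begin{align*}
  \text{DeepWalk:}\quad
    & \log\!\left( \frac{\operatorname{vol}(G)}{b\,T}
      \left( \sum_{r=1}^{T} \left( D^{-1} A \right)^{r} \right) D^{-1} \right) \\
  \text{LINE:}\quad
    & \log\!\left( \frac{\operatorname{vol}(G)}{b}\; D^{-1} A\, D^{-1} \right)
\end{align*}
```

Setting T = 1 in the DeepWalk expression leaves only the r = 1 term and yields the LINE expression.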

59. Curriculum Learning for Heterogeneous Star Network Embedding via Deep Reinforcement Learning.

【Paper Link】【Pages】:468-476

【Authors】: Meng Qu ; Jian Tang ; Jiawei Han

【Abstract】: Learning node representations for networks has attracted much attention recently due to its effectiveness in a variety of applications. This paper focuses on learning node representations for heterogeneous star networks, which have a center node type linked with multiple attribute node types through different types of edges. In heterogeneous star networks, we observe that the training order of different types of edges affects the learning performance significantly. Therefore we study learning curricula for node representation learning in heterogeneous star networks, i.e., learning an optimal sequence of edges of different types for the node representation learning process. We formulate the problem as a Markov decision process, with the action as selecting a specific type of edges for learning or terminating the training process, and the state as the sequence of edge types selected so far. The reward is calculated as the performance on external tasks with node representations as features, and the goal is to take a series of actions to maximize the cumulative rewards. We propose an approach based on deep reinforcement learning for this problem. Our approach leverages LSTM models to encode states and further estimate the expected cumulative reward of each state-action pair, which essentially measures the long-term performance of different actions at each state. Experimental results on real-world heterogeneous star networks demonstrate the effectiveness and efficiency of our approach over competitive baseline approaches.

【Keywords】:

60. Leveraging Implicit Contribution Amounts to Facilitate Microfinancing Requests.

【Paper Link】【Pages】:477-485

【Authors】: Suhas Ranganath ; Ghazaleh Beigi ; Huan Liu

【Abstract】: The emergence of online microfinancing platforms provides new opportunities for people to seek financial assistance from a large number of potential contributors. However, these platforms deal with a huge number of requests, making it hard for the requesters to get assistance for their financial needs. Designing algorithms to identify potential contributors for a given request will assist in satisfying financial needs of requesters and improve the effectiveness of microfinancing platforms. Existing work correlates requests with contributor interests and profiles to design feature based approaches for recommending projects to prospective contributors. However, contributing money to financial requests has a cost on contributors which can affect their inclination to contribute in the future. Literature in economic behavior has investigated the manner in which memory of past contribution amounts affects user inclination to contribute to a given request. To systematically investigate whether these characteristics of economic behavior would help to facilitate requests in online microfinancing platforms, we present a novel framework to identify contributors for a given request from their past financial information. Individual contribution amounts are not publicly available, so we draw from financial modeling literature to model the implicit contribution amounts made to past requests. We evaluate the framework on two microfinancing platforms to demonstrate its effectiveness in identifying contributors.

【Keywords】: brownian motion; crowdfunding; lending; microfinance

61. FACH: Fast Algorithm for Detecting Cohesive Hierarchies of Communities in Large Networks.

【Paper Link】【Pages】:486-494

【Authors】: Mojtaba Rezvani ; Qing Wang ; Weifa Liang

【Abstract】: Vertices in a real-world social network can be grouped into densely connected communities that are sparsely connected to other groups. Moreover, these communities can be partitioned into successively more cohesive communities. Despite an ever-growing pile of research on hierarchical community detection, existing methods suffer from either inefficiency or inappropriate modeling. Yet, some cut-based approaches have been shown to be effective in finding communities without hierarchies. In this paper, we study the hierarchical community detection problem in large networks and show that it is NP-hard. We then propose an efficient algorithm based on edge-cuts to identify the hierarchy of communities. Since communities at lower levels of the hierarchy are denser than the higher levels, we leverage a fast network sparsification technique to enhance the running time of the algorithm. We further propose a randomized approximation algorithm for information centrality of networks. We finally evaluate the performance of the proposed algorithms by conducting extensive experiments using real datasets. Our experimental results show that the proposed algorithms are promising and outperform the state-of-the-art algorithms by several orders of magnitude.

【Keywords】: hierarchical community detection; large-scale networks

62. Measuring the Latency of Depression Detection in Social Media.

【Paper Link】【Pages】:495-503

【Authors】: Farig Sadeque ; Dongfang Xu ; Steven Bethard

【Abstract】: Detecting depression is a key public health challenge, as almost 12% of all disabilities can be attributed to depression. Computational models for depression detection must prove not only that they can detect depression, but that they can do it early enough for an intervention to be plausible. However, current evaluations of depression detection are poor at measuring model latency. We identify several issues with the currently popular ERDE metric, and propose a latency-weighted F1 metric that addresses these concerns. We then apply this evaluation to several models from the recent eRisk 2017 shared task on depression detection, and show how our proposed measure can better capture system differences.

【Keywords】: depression; latency; neural networks; social media

63. Peeling Bipartite Networks for Dense Subgraph Discovery.

【Paper Link】【Pages】:504-512

【Authors】: Ahmet Erdem Sariyüce ; Ali Pinar

【Abstract】: Finding dense bipartite subgraphs and detecting the relations among them is an important problem for affiliation networks that arise in a range of domains, such as social network analysis, word-document clustering, the science of science, internet advertising, and bioinformatics. However, most dense subgraph discovery algorithms are designed for classic, unipartite graphs. Subsequently, studies on affiliation networks are conducted on the co-occurrence graphs (e.g., co-author and co-purchase) that project the bipartite structure to a unipartite structure by connecting two entities if they share an affiliation. Despite their convenience, co-occurrence networks come at a cost of loss of information and an explosion in graph sizes, which limit the quality and the efficiency of solutions. We study the dense subgraph discovery problem on bipartite graphs. We define a framework of bipartite subgraphs based on the butterfly motif (2,2-biclique) to model the dense regions in a hierarchical structure. We introduce efficient peeling algorithms to find the dense subgraphs and build relations among them. We can identify denser structures compared to the state-of-the-art algorithms on co-occurrence graphs in real-world data. Our analyses on an author-paper network and a user-product network yield interesting subgraphs and hierarchical relations such as the groups of collaborators in the same institution and spammers that give fake ratings.

【Keywords】: bipartite networks; dense subgraph discovery; k-core computation; peeling algorithms
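
The butterfly motif named in the abstract above (a 2,2-biclique: two vertices on one side that share two neighbors on the other) can be counted directly from shared neighborhoods. The following is a minimal sketch of that counting step only, not the paper's peeling algorithm; the function name `count_butterflies` and the edge-list input format are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations
from math import comb


def count_butterflies(edges):
    """Count butterflies (2,2-bicliques) in a bipartite graph.

    `edges` is an iterable of (left_vertex, right_vertex) pairs.
    Hypothetical helper for illustration; quadratic in the number of
    left vertices, so only suitable for small graphs.
    """
    # Right-side neighbors of every left vertex (duplicate edges ignored).
    right_of = defaultdict(set)
    for left, right in set(edges):
        right_of[left].add(right)

    total = 0
    # Every pair of left vertices with s shared neighbors closes C(s, 2) butterflies.
    for u1, u2 in combinations(right_of, 2):
        shared = len(right_of[u1] & right_of[u2])
        total += comb(shared, 2)
    return total
```

For example, a complete bipartite graph with two vertices on each side contains exactly one butterfly, while a simple path contains none.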

64. Short-Term Satisfaction and Long-Term Coverage: Understanding How Users Tolerate Algorithmic Exploration.

【Paper Link】【Pages】:513-521

【Authors】: Tobias Schnabel ; Paul N. Bennett ; Susan T. Dumais ; Thorsten Joachims

【Abstract】: Any learning algorithm for recommendation faces a fundamental trade-off between exploiting partial knowledge of a user's interests to maximize satisfaction in the short term and discovering additional user interests to maximize satisfaction in the long term. To enable discovery, a machine learning algorithm typically elicits feedback on items it is uncertain about, which is termed algorithmic exploration in machine learning. This exploration comes with a cost to the user, since the items an algorithm chooses for exploration frequently turn out to not match the user's interests. In this paper, we study how users tolerate such exploration and how presentation strategies can mitigate the exploration cost. To this end, we conduct a behavioral study with over 600 people, where we vary how algorithmic exploration is mixed into the set of recommendations. We find that users respond non-linearly to the amount of exploration, where some exploration mixed into the set of recommendations has little effect on short-term satisfaction and behavior. For long-term satisfaction, the overall goal is to learn via exploration about the items presented. We therefore also analyze the quantity and quality of implicit feedback signals such as clicks and hovers, and how they vary with different amounts of mix-in exploration. Our findings provide insights into how to design presentation strategies for algorithmic exploration in interactive recommender systems, mitigating the short-term costs of algorithmic exploration while aiming to elicit informative feedback data for learning.

【Keywords】: algorithmic exploration; human-in-the-loop; interactive systems; recommender systems; user experience

65. CrossFire: Cross Media Joint Friend and Item Recommendations.

【Paper Link】【Pages】:522-530

【Authors】: Kai Shu ; Suhang Wang ; Jiliang Tang ; Yilin Wang ; Huan Liu

【Abstract】: Friend and item recommendation on a social media site is an important task, which not only brings conveniences to users but also benefits platform providers. However, recommendation for newly launched social media sites is challenging because they often lack user historical data and encounter data sparsity and cold-start problems. Thus, it is important to exploit auxiliary information to help improve recommendation performance on these sites. Existing approaches try to utilize the knowledge transferred from other mature sites, which often require overlapped users or similar items to ensure an effective knowledge transfer. However, these assumptions may not hold in practice because 1) the overlapped user set is often unavailable and costly to identify due to the heterogeneous user profile, content and network data, and 2) different schemes to show item attributes across sites make the attribute values inconsistent, incomplete, and noisy. Thus, how to transfer knowledge when no direct bridge is given between two social media sites remains a challenge. In addition, another source of auxiliary information we can exploit is the mutual benefit between social relationships and rating preferences within the platform. User-user relationships are widely used as side information to improve item recommendation, whereas work on exploiting user-item interactions for friend recommendation is rather limited. To tackle these challenges, we propose a Cross media joint Friend and Item Recommendation framework (CrossFire), which can capture both 1) cross-platform knowledge transfer, and 2) within-platform correlations among user-user relations and user-item interactions. Empirical results on real-world datasets demonstrate the effectiveness of the proposed framework.

【Keywords】: cross media recommendation; data mining; joint learning

66. Modeling Time to Open of Emails with a Latent State for User Engagement Level.

【Paper Link】【Pages】:531-539

【Authors】: Moumita Sinha ; Vishwa Vinay ; Harvineet Singh

【Abstract】: Email messages have been an important mode of communication, not only for work, but also for social interactions and marketing. When messages have time sensitive information, it becomes relevant for the sender to know what is the expected time within which the email will be read by the recipient. In this paper we use a survival analysis framework to predict the time to open an email once it has been received. We use the Cox Proportional Hazards (CoxPH) model that offers a way to combine various features that might affect the event of opening an email. As an extension, we also apply a mixture model (MM) approach to CoxPH that distinguishes between recipients, based on a latent state of how prone to opening the messages each individual is. We compare our approach with standard classification and regression models. While the classification model provides predictions on the likelihood of an email being opened, the regression model provides prediction of the real-valued time to open. The use of survival analysis based methods allows us to jointly model both the open event as well as the time-to-open. We experimented on a large real-world dataset of marketing emails sent in a 3-month time duration. The mixture model achieves the best accuracy on our data where a high proportion of email messages go unopened.

【Keywords】: cox-proportional hazards model; email interaction data; enterprise email marketing; survival analysis; time-to-event prediction

67. Shortcutting Label Propagation for Distributed Connected Components.

【Paper Link】【Pages】:540-546

【Authors】: Stergios Stergiou ; Dipen Rughwani ; Kostas Tsioutsiouliklis

【Abstract】: Connected Components is a fundamental graph mining problem that has been studied for the PRAM, MapReduce and BSP models. We present a simple CC algorithm for BSP that does not mutate the graph, converges in O(log n) supersteps and scales to graphs of trillions of edges.

【Keywords】: connected components; distributed systems; graph algorithms; label propagation; lpsc
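
As a rough illustration of the idea in the abstract above, the sketch below simulates synchronous supersteps of minimum-label propagation on a single machine and adds a pointer-jumping shortcut after each round. It is an assumption-laden toy, not the authors' algorithm: the function name `connected_components`, the integer vertex ids `0..n-1`, and the specific shortcut rule are all hypothetical, and no claim about its superstep bound is made here.

```python
def connected_components(num_vertices, edges):
    """Label each vertex with the smallest vertex id in its component.

    Single-process simulation of BSP-style supersteps; the input graph
    is only read, never modified. Returns (labels, supersteps).
    """
    neighbors = [[] for _ in range(num_vertices)]
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    label = list(range(num_vertices))  # every vertex starts as its own label
    supersteps = 0
    while True:
        supersteps += 1
        # Propagation: take the minimum label among a vertex and its neighbors.
        proposed = [
            min([label[v]] + [label[u] for u in neighbors[v]])
            for v in range(num_vertices)
        ]
        # Shortcut (pointer jumping): adopt the label of the vertex we point to.
        jumped = [proposed[proposed[v]] for v in range(num_vertices)]
        if jumped == label:
            return label, supersteps
        label = jumped
```

Because a label is always the id of a vertex in the same component and never exceeds the vertex's own id, the loop stops only when every component agrees on its smallest vertex id.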

68. User Intent, Behaviour, and Perceived Satisfaction in Product Search.

【Paper Link】【Pages】:547-555

【Authors】: Ning Su ; Jiyin He ; Yiqun Liu ; Min Zhang ; Shaoping Ma

【Abstract】: As online shopping becomes increasingly popular, users perform more product search to purchase items. Previous studies have investigated people's online shopping behaviours and ways to predict online purchases. However, from a user perspective, there is still a lack of in-depth understanding of why users search, how they interact with, and perceive the product search results. In this paper, we conduct both a user study and a log analysis to address the following three questions: (1) what are the intents of users underlying their search activities? (2) do users behave differently under different search intents? and (3) how does user perceived satisfaction relate to their search behaviour as well as search intents, and can we predict product search satisfaction with interaction signals? Based on an online survey and search logs collected from a major commercial product search engine, we show that user intents in product search fall into three categories: Target Finding (TF), Decision Making (DM) and Exploration (EP). Through a log analysis and a user study, we observe different user interaction patterns as well as perceived satisfaction under these three intents. Using a series of user interaction features, we demonstrate that we can effectively predict user satisfaction, especially for TF and DM intents.

【Keywords】: product search; search satisfaction; user intent

69. Logician: A Unified End-to-End Neural Approach for Open-Domain Information Extraction.

【Paper Link】【Pages】:556-564

【Authors】: Mingming Sun ; Xu Li ; Xin Wang ; Miao Fan ; Yue Feng ; Ping Li

【Abstract】: In this paper, we consider the problem of open information extraction (OIE) for extracting entity and relation level intermediate structures from sentences in open-domain. We focus on four types of valuable intermediate structures (Relation, Attribute, Description, and Concept), and propose a unified knowledge expression form, SAOKE, to express them. We publicly release a data set which contains 48,248 sentences and the corresponding facts in the SAOKE format labeled by crowdsourcing. To our knowledge, this is the largest publicly available human labeled data set for open information extraction tasks. Using this labeled SAOKE data set, we train an end-to-end neural model using the sequence-to-sequence paradigm, called Logician, to transform sentences into facts. For each sentence, unlike existing algorithms, which generally focus on extracting each single fact without considering other possible facts, Logician performs a global optimization over all possible involved facts, in which facts not only compete with each other to attract the attention of words, but also cooperate to share words. An experimental study on various types of open domain relation extraction tasks reveals the consistent superiority of Logician to other state-of-the-art algorithms. The experiments verify the reasonableness of SAOKE format, the valuableness of SAOKE data set, the effectiveness of the proposed Logician model, and the feasibility of the methodology to apply end-to-end learning paradigm on supervised data sets for the challenging tasks of open information extraction.

【Keywords】: deep learning; end-to-end learning; knowledge expression; open information extraction; sequence-to-sequence learning

70. Personalized Top-N Sequential Recommendation via Convolutional Sequence Embedding.

【Paper Link】【Pages】:565-573

【Authors】: Jiaxi Tang ; Ke Wang

【Abstract】: Top-N sequential recommendation models each user as a sequence of items interacted with in the past and aims to predict top-N ranked items that a user will likely interact with in the "near future". The order of interaction implies that sequential patterns play an important role where more recent items in a sequence have a larger impact on the next item. In this paper, we propose a Convolutional Sequence Embedding Recommendation Model "Caser" as a solution to address this requirement. The idea is to embed a sequence of recent items into an "image" in the time and latent spaces and learn sequential patterns as local features of the image using convolutional filters. This approach provides a unified and flexible network structure for capturing both general preferences and sequential patterns. The experiments on public data sets demonstrated that Caser consistently outperforms state-of-the-art sequential recommendation methods on a variety of common evaluation metrics.

【Keywords】: convolutional neural networks; recommender system; sequential prediction

71. sSketch: A Scalable Sketching Technique for PCA in the Cloud.

【Paper Link】【Pages】:574-582

【Authors】: Md. Mehrab Tanjim ; Muhammad Abdullah Adnan

【Abstract】: Multidimensional data appear frequently in many web-related applications, e.g., product ratings, the bag-of-words representation of web pages, etc. Principal Component Analysis (PCA) has been widely used for discovering patterns in relationships among entities in multidimensional data. However, existing algorithms for PCA have limited scalability since they explicitly materialize intermediate data, whose size rapidly grows as the dimension increases. To avoid scalability issues, we propose sSketch, a scalable sketching technique for PCA that employs several optimization ideas, such as mean propagation, efficient sparse matrix operations, and effective job consolidation to minimize intermediate data. Using sSketch, we also provide two other scalable methods for deriving singular value and 2-norm of reconstruction error, both of which are used for data analysis purposes. We provide our implementation on the popular Spark framework for distributed platforms. We compare our method against state-of-the-art library functions available for distributed settings, namely MLlib-PCA and Mahout-PCA with real big datasets. Our experiments show that our method outperforms both of them by a wide margin. To encourage reproducibility, the source code of sSketch is made publicly available at https://github.com/DataMiningResearch/sSketch.

【Keywords】: big data; cloud computing; data mining; distributed algorithms; machine learning; web-scale mining

72. Hyperbolic Representation Learning for Fast and Efficient Neural Question Answering.

【Paper Link】【Pages】:583-591

【Authors】: Yi Tay ; Luu Anh Tuan ; Siu Cheung Hui

【Abstract】: The dominant neural architectures in question answer retrieval are based on recurrent or convolutional encoders configured with complex word matching layers. Given that recent architectural innovations are mostly new word interaction layers or attention-based matching mechanisms, it seems to be a well-established fact that these components are mandatory for good performance. Unfortunately, the memory and computation cost incurred by these complex mechanisms are undesirable for practical applications. As such, this paper tackles the question of whether it is possible to achieve competitive performance with simple neural architectures. We propose a simple but novel deep learning architecture for fast and efficient question-answer ranking and retrieval. More specifically, our proposed model, HyperQA, is a parameter efficient neural network that outperforms other parameter intensive models such as Attentive Pooling BiLSTMs and Multi-Perspective CNNs on multiple QA benchmarks. The novelty behind HyperQA is a pairwise ranking objective that models the relationship between question and answer embeddings in Hyperbolic space instead of Euclidean space. This empowers our model with a self-organizing ability and enables automatic discovery of latent hierarchies while learning embeddings of questions and answers. Our model requires no feature engineering, no similarity matrix matching, no complicated attention mechanisms nor over-parameterized layers and yet outperforms and remains competitive to many models that have these functionalities on multiple benchmarks.

【Keywords】: deep learning; learning to rank; question answering
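
To make the notion of scoring question-answer pairs by hyperbolic rather than Euclidean distance concrete, here is a minimal sketch. It assumes the Poincaré ball model of hyperbolic space, which is one common choice and is not specified in the abstract above; the function names, the plain hinge form of the pairwise loss, and the `margin` parameter are illustrative assumptions, not the paper's exact objective.

```python
import math


def poincare_distance(x, y):
    """Geodesic distance between two points inside the unit Poincare ball."""
    sq_norm_x = sum(a * a for a in x)
    sq_norm_y = sum(b * b for b in y)
    if sq_norm_x >= 1.0 or sq_norm_y >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_x) * (1.0 - sq_norm_y)))


def pairwise_hinge_loss(question, good_answer, bad_answer, margin=1.0):
    """Hypothetical pairwise ranking loss: the correct answer should be
    closer to the question than the incorrect one by at least `margin`."""
    d_pos = poincare_distance(question, good_answer)
    d_neg = poincare_distance(question, bad_answer)
    return max(0.0, margin + d_pos - d_neg)
```

Distances grow rapidly as points approach the boundary of the ball, which is the property usually credited with letting such embeddings arrange items hierarchically.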

73. SHINE: Signed Heterogeneous Information Network Embedding for Sentiment Link Prediction.

【Paper Link】【Pages】:592-600

【Authors】: Hongwei Wang ; Fuzheng Zhang ; Min Hou ; Xing Xie ; Minyi Guo ; Qi Liu

【Abstract】: In online social networks people often express attitudes towards others, which forms massive sentiment links among users. Predicting the sign of sentiment links is a fundamental task in many areas such as personal advertising and public opinion analysis. Previous works mainly focus on textual sentiment classification, however, text information can only disclose the "tip of the iceberg" about users' true opinions, of which the most are unobserved but implied by other sources of information such as social relation and users' profile. To address this problem, in this paper we investigate how to predict possibly existing sentiment links in the presence of heterogeneous information. First, due to the lack of explicit sentiment links in mainstream social networks, we establish a labeled heterogeneous sentiment dataset which consists of users' sentiment relation, social relation and profile knowledge by entity-level sentiment extraction method. Then we propose a novel and flexible end-to-end Signed Heterogeneous Information Network Embedding (SHINE) framework to extract users' latent representations from heterogeneous networks and predict the sign of unobserved sentiment links. SHINE utilizes multiple deep autoencoders to map each user into a low-dimension feature space while preserving the network structure. We demonstrate the superiority of SHINE over state-of-the-art baselines on link prediction and node recommendation in two real-world datasets. The experimental results also prove the efficacy of SHINE in cold start scenario.

【Keywords】: online social networks; sentiment link prediction; signed heterogeneous network embedding

74. A Unified Processing Paradigm for Interactive Location-based Web Search.

【Paper Link】【Pages】:601-609

【Authors】: Sheng Wang ; Zhifeng Bao ; Shixun Huang ; Rui Zhang

【Abstract】: This paper studies the location-based web search and aims to build a unified processing paradigm for two purposes: (1) efficiently support each of the various types of location-based queries (kNN query, top-k spatial-textual query, etc.) on two major forms of geo-tagged data, i.e., spatial point data such as geo-tagged web documents, and spatial trajectory data such as a sequence of geo-tagged travel blogs by a user; (2) support interactive search to provide quick response for a query session, within which a user usually keeps refining her query by either issuing different query types or specifying different constraints (e.g., adding a keyword and/or location, changing the choice of k, etc.) until she finds the desired results. To achieve this goal, we first propose a general Top-k query called Monotone Aggregate Spatial Keyword query-MASK, which is able to cover most types of location-based web search. Next, we develop a unified indexing (called Textual-Grid-Point Inverted Index) and query processing paradigm (called ETAIL Algorithm) to answer a single MASK query efficiently. Furthermore, we extend ETAIL to provide interactive search for multiple queries within one query session, by exploiting the commonality of textual and/or spatial dimension among queries. Last, extensive experiments on four real datasets verify the robustness and efficiency of our approach.

【Keywords】: inverted index; spatial keyword search; threshold algorithm

75. Position Bias Estimation for Unbiased Learning to Rank in Personal Search.

【Paper Link】【Pages】:610-618

【Authors】: Xuanhui Wang ; Nadav Golbandi ; Michael Bendersky ; Donald Metzler ; Marc Najork

【Abstract】: A well-known challenge in learning from click data is its inherent bias and most notably position bias. Traditional click models aim to extract the ‹query, document› relevance and the estimated bias is usually discarded after relevance is extracted. In contrast, the most recent work on unbiased learning-to-rank can effectively leverage the bias and thus focuses on estimating bias rather than relevance [20, 31]. Existing approaches use search result randomization over a small percentage of production traffic to estimate the position bias. This is not desired because result randomization can negatively impact users' search experience. In this paper, we compare different schemes for result randomization (i.e., RandTopN and RandPair) and show their negative effect in personal search. Then we study how to infer such bias from regular click data without relying on randomization. We propose a regression-based Expectation-Maximization (EM) algorithm that is based on a position bias click model and that can handle highly sparse clicks in personal search. We evaluate our EM algorithm and the extracted bias in the learning-to-rank setting. Our results show that it is promising to extract position bias from regular clicks without result randomization. The extracted bias can improve the learning-to-rank algorithms significantly. In addition, we compare the pointwise and pairwise learning-to-rank models. Our results show that pairwise models are more effective in leveraging the estimated bias.

【Keywords】: expectation-maximization; inverse propensity weighting; position bias estimation
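
The inverse propensity weighting listed in the keywords above rests on a position-bias click model in which a click requires both that the result was examined, which depends on its rank, and that it was relevant. Under that assumption, dividing each click by the examination probability of its position corrects for the bias. The sketch below shows only this reweighting step, given propensities that have already been estimated; it does not implement the paper's regression-based EM, and the function name and input formats are hypothetical.

```python
from collections import defaultdict


def ipw_click_weights(click_log, propensity):
    """Sum inverse-propensity-weighted clicks per document.

    click_log:  iterable of (doc_id, position) pairs, one per observed click.
    propensity: dict mapping position -> estimated probability in (0, 1]
                that a result at that position is examined (assumed given).
    """
    weighted = defaultdict(float)
    for doc_id, position in click_log:
        theta = propensity[position]
        if not 0.0 < theta <= 1.0:
            raise ValueError("propensity must be in (0, 1]")
        # A click at a rarely examined position counts for more.
        weighted[doc_id] += 1.0 / theta
    return dict(weighted)
```

With a propensity of 1.0 at position 1 and 0.5 at position 2, one click at position 1 contributes a weight of 1.0 while each click at position 2 contributes 2.0.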

76. A Path-constrained Framework for Discriminating Substitutable and Complementary Products in E-commerce.

【Paper Link】【Pages】:619-627

【Authors】: Zihan Wang ; Ziheng Jiang ; Zhaochun Ren ; Jiliang Tang ; Dawei Yin

【Abstract】: In personalized recommendation, candidate generation plays an infrastructural role by retrieving candidates out of billions of items. During this process, substitutes and complements constitute two main classes of retrieved candidates: substitutable products are interchangeable, whereas complementary products might be purchased together by users. Discriminating substitutable and complementary products is playing an increasingly important role in e-commerce portals by affecting the performance of candidate generation, e.g., when a user has browsed a t-shirt, it is reasonable to retrieve similar t-shirts, i.e., substitutes; whereas if the user has already purchased one, it would be better to retrieve trousers, hats or shoes, as complements of t-shirts. In this paper, we propose a path-constrained framework (PMSC) for discriminating substitutes and complements. Specifically, for each product, we first learn its embedding representations in a general semantic space. Thereafter, we project the embedding vectors into two separate spaces via a novel mapping function. In the end, we incorporate each embedding with path-constraints to further boost the discriminative ability of the model. Extensive experiments conducted on two e-commerce datasets show the effectiveness of our proposed method.

【Keywords】:

77. Customer Purchase Behavior Prediction from Payment Datasets.

【Paper Link】【Pages】:628-636

【Authors】: Yu Ting Wen ; Pei-Wen Yeh ; Tzu-Hao Tsai ; Wen-Chih Peng ; Hong-Han Shuai

【Abstract】: With the advances in the development of mobile payments, a huge amount of payment data are collected by banks. User payment data offer a good dataset to depict customer behavior patterns. A comprehensive understanding of customers' purchase behavior is crucial to developing good marketing strategies, which may trigger much greater purchase amounts. For example, by exploring customer behavior patterns, given a target store, a set of potential customers can be identified. In other words, personalized campaigns at the right time and in the right place can be treated as the last stage of consumption. Here we propose a probabilistic graphical model, named STPC-PGM, that exploits the payment data to discover customer purchase behavior in the spatial, temporal, payment amount and product category aspects. As a result, the mobility behavior of an individual user could be predicted with a probabilistic graphical model that accounts for all aspects of each customer's relationship with the payment platform. To achieve real time advertising, we then develop an online framework that efficiently computes the prediction results. Our experiment results show that STPC-PGM is effective in discovering customers' profiling features, and outperforms the state-of-the-art methods in purchase behavior prediction. In addition, the prediction results are being deployed in the marketing of real-world credit card users, and have presented a significant growth in the advertising conversion rate.

【Keywords】: customer behavior prediction; financial technology; real time advertising

78. Tracing Fake-News Footprints: Characterizing Social Media Messages by How They Propagate.

【Paper Link】【Pages】:637-645

【Authors】: Liang Wu ; Huan Liu

【Abstract】: When a message, such as a piece of news, spreads in social networks, how can we classify it into categories of interests, such as genuine or fake news? Classification of social media content is a fundamental task for social media mining, and most existing methods regard it as a text categorization problem and mainly focus on using content features, such as words and hashtags. However, for many emerging applications like fake news and rumor detection, it is very challenging, if not impossible, to identify useful features from content. For example, intentional spreaders of fake news may manipulate the content to make it look like real news. To address this problem, this paper concentrates on modeling the propagation of messages in a social network. Specifically, we propose a novel approach, TraceMiner, to (1) infer embeddings of social media users with social network structures; and (2) utilize an LSTM-RNN to represent and classify propagation pathways of a message. Since content information is sparse and noisy on social media, adopting TraceMiner makes it possible to provide a high degree of classification accuracy even in the absence of content information. Experimental results on real-world datasets show the superiority over state-of-the-art approaches on the task of fake news detection and news categorization.

【Keywords】: classification; fake news detection; graph mining; misinformation; social media mining; social network analysis

79. Indirect Supervision for Relation Extraction using Question-Answer Pairs.

【Paper Link】【Pages】:646-654

【Authors】: Zeqiu Wu ; Xiang Ren ; Frank F. Xu ; Ji Li ; Jiawei Han

【Abstract】: Automatic relation extraction (RE) for types of interest is of great importance for interpreting massive text corpora in an efficient manner. For example, we want to identify the relationship "president_of" between entities "Donald Trump" and "United States" in a sentence expressing such a relation. Traditional RE models have heavily relied on human-annotated corpus for training, which can be costly in generating labeled data and become obstacles when dealing with more relation types. Thus, more RE systems have shifted to be built upon training data automatically acquired by linking to knowledge bases (distant supervision). However, due to the incompleteness of knowledge bases and the context-agnostic labeling, the training data collected via distant supervision (DS) can be very noisy. In recent years, as increasing attention has been brought to tackling question-answering (QA) tasks, user feedback or datasets of such tasks become more accessible. In this paper, we propose a novel framework, ReQuest, to leverage question-answer pairs as an indirect source of supervision for relation extraction, and study how to use such supervision to reduce noise induced from DS. Our model jointly embeds relation mentions, types, QA entity mention pairs and text features in two low-dimensional spaces (RE and QA), where objects with same relation types or semantically similar question-answer pairs have similar representations. Shared features connect these two spaces, carrying clearer semantic knowledge from both sources. ReQuest then uses these learned embeddings to estimate the types of test relation mentions. We formulate a global objective function and adopt a novel margin-based QA loss to reduce noise in DS by exploiting semantic evidence from the QA dataset. Our experimental results achieve an average of 11% improvement in F1 score on two public RE datasets combined with TREC QA dataset.
Codes and datasets can be downloaded at https://github.com/ellenmellon/ReQuest.

【Keywords】: distant supervision; indirect supervision; relation extraction

80. Why People Search for Images using Web Search Engines.

【Paper Link】【Pages】:655-663

【Authors】: Xiaohui Xie ; Yiqun Liu ; Maarten de Rijke ; Jiyin He ; Min Zhang ; Shaoping Ma

【Abstract】: What are the intents or goals behind human interactions with image search engines? Knowing why people search for images is of major concern to Web image search engines because user satisfaction may vary as intent varies. Previous analyses of image search behavior have mostly been query-based, focusing on what images people search for, rather than intent-based, that is, why people search for images. To date, there is no thorough investigation of how different image search intents affect users' search behavior. In this paper, we address the following questions: (1) Why do people search for images in text-based Web image search systems? (2) How does image search behavior change with user intent? (3) Can we predict user intent effectively from interactions during the early stages of a search session? To this end, we conduct both a lab-based user study and a commercial search log analysis. We show that user intents in image search can be grouped into three classes: Explore/Learn, Entertain, and Locate/Acquire. Our lab-based user study reveals different user behavior patterns under these three intents, such as first click time, query reformulation, dwell time and mouse movement on the result page. Based on user interaction features during the early stages of an image search session, that is, before mouse scroll, we develop an intent classifier that is able to achieve promising results for classifying intents into our three intent classes. Given that all features can be obtained online and unobtrusively, the predicted intents can provide guidance for choosing ranking methods immediately after scrolling.

【Keywords】: image search; user behavior; user intent

81. OpenRec: A Modular Framework for Extensible and Adaptable Recommendation Algorithms.

【Paper Link】【Pages】:664-672

【Authors】: Longqi Yang ; Eugene Bagdasaryan ; Joshua Gruenstein ; Cheng-Kang Hsieh ; Deborah Estrin

【Abstract】: With the increasing demand for deeper understanding of users' preferences, recommender systems have gone beyond simple user-item filtering and are increasingly sophisticated, comprised of multiple components for analyzing and fusing diverse information. Unfortunately, existing frameworks do not adequately support extensibility and adaptability and consequently pose significant challenges to rapid, iterative, and systematic experimentation. In this work, we propose OpenRec, an open and modular Python framework that supports extensible and adaptable research in recommender systems. Each recommender is modeled as a computational graph that consists of a structured ensemble of reusable modules connected through a set of well-defined interfaces. We present the architecture of OpenRec and demonstrate that OpenRec provides adaptability, modularity and reusability while maintaining training efficiency and recommendation accuracy. Our case study illustrates how OpenRec can support an efficient design process to prototype and benchmark alternative approaches with interchangeable modules and enable development and evaluation of new algorithms.

【Keywords】: adaptable; extensible; framework; modular; recommendation

82. Dynamic Word Embeddings for Evolving Semantic Discovery.

【Paper Link】【Pages】:673-681

【Authors】: Zijun Yao ; Yifan Sun ; Weicong Ding ; Nikhil Rao ; Hui Xiong

【Abstract】: Word evolution refers to the changing meanings and associations of words throughout time, as a byproduct of human language evolution. By studying word evolution, we can infer social trends and language constructs over different periods of human history. However, traditional techniques such as word representation learning do not adequately capture the evolving language structure and vocabulary. In this paper, we develop a dynamic statistical model to learn time-aware word vector representation. We propose a model that simultaneously learns time-aware embeddings and solves the resulting alignment problem. This model is trained on a crawled NYTimes dataset. Additionally, we develop multiple intuitive evaluation strategies of temporal word embeddings. Our qualitative and quantitative tests indicate that our method not only reliably captures this evolution over time, but also consistently outperforms state-of-the-art temporal embedding approaches on both semantic accuracy and alignment quality.

【Keywords】: dynamic word embeddings; word semantic analysis

83. Modelling Domain Relationships for Transfer Learning on Retrieval-based Question Answering Systems in E-commerce.

【Paper Link】【Pages】:682-690

【Authors】: Jianfei Yu ; Minghui Qiu ; Jing Jiang ; Jun Huang ; Shuangyong Song ; Wei Chu ; Haiqing Chen

【Abstract】: Building automatic question-answering (QA) systems is currently a hot topic in many industries. A key solution for these QA systems is to retrieve from a QA knowledge base the question most similar to a given question, which can be reformulated as a paraphrase identification (PI) or a natural language inference (NLI) problem. However, most existing models for PI and NLI have at least two problems: they rely on a large amount of labeled data, which is not always available in real scenarios, and they may not be efficient for industrial applications. In this paper, we study transfer learning for the PI and NLI problems, aiming to propose a general framework that can effectively and efficiently adapt the shared knowledge learned from a resource-rich source domain to a resource-poor target domain. Specifically, since most existing transfer learning methods only focus on learning a shared feature space across domains while ignoring the relationship between the source and target domains, we propose to simultaneously learn shared representations and domain relationships in a unified framework. Furthermore, we propose an efficient and effective hybrid model by combining a sentence encoding-based method and a sentence interaction-based method as our base model. Extensive experiments on both paraphrase identification and natural language inference demonstrate that our base model is efficient and has promising performance compared to the competing models, and that our transfer learning method can help to significantly boost the performance. Further analysis shows that the inter-domain and intra-domain relationships captured by our model are insightful. Last but not least, we deploy our transfer learning model for PI into our online chatbot system, which brings significant improvements over our existing system. Finally, we launch our new system on the chatbot platform Eva in our E-commerce site AliExpress.

【Keywords】: adversarial training; domain relationships learning; retrieval-based question answering; transfer learning

【Paper Link】【Pages】:691-699

【Authors】: Qian Yu ; Wai Lam

【Abstract】: In E-commerce sites, there are platforms for users to pose product-related questions, and experienced customers may provide answers voluntarily. Among the questions asked by users, a large proportion are yes-no questions, reflecting that users wish to know whether or not the product can satisfy a certain criterion or meet a certain expectation. Neither Question Answering (QA) approaches nor Community Question Answering methods are suitable for answer prediction for new questions in this setting. The reasons are that questions are product-associated and many of them concern user experiences and subjective opinions. In addition to existing question-answer pairs, user-written reviews can provide useful clues for answer prediction. In this paper, we propose a new framework that can tackle the task of review-aware answer prediction for product-related questions. The aspect analytics model in this framework learns latent aspects as well as aspect-specific embeddings of reviews via a 3-order Autoencoder. One advantage of this learned model is that it can generate aspect-specific representations for new questions. The predictive answer model in our framework, learned jointly from existing questions, answers, and reviews, is able to predict the answers for new yes-no questions taking aspects into consideration. Besides, our framework can provide supportive reviews grouped by relevant aspects, serving as information for explainable answers. Experiment results on 15 different product categories from a large-scale benchmark E-commerce QA dataset demonstrate the effectiveness of our framework.

【Keywords】: 3-order autoencoder; aspect detection; question answering

85. Neural Ranking Models with Multiple Document Fields.

【Paper Link】【Pages】:700-708

【Authors】: Hamed Zamani ; Bhaskar Mitra ; Xia Song ; Nick Craswell ; Saurabh Tiwary

【Abstract】: Deep neural networks have recently shown promise in the ad-hoc retrieval task. However, such models have often been based on one field of the document, for example considering document title only or document body only. Since in practice documents typically have multiple fields, and given that non-neural ranking models such as BM25F have been developed to take advantage of document structure, this paper investigates how neural models can deal with multiple document fields. We introduce a model that can consume short text fields such as document title and long text fields such as document body. It can also handle multi-instance fields with variable number of instances, for example where each document has zero or more instances of incoming anchor text. Since fields vary in coverage and quality, we introduce a masking method to handle missing field instances, as well as a field-level dropout method to avoid relying too much on any one field. As in the studies of non-neural field weighting, we find it is better for the ranker to score the whole document jointly, rather than generate a per-field score and aggregate. We find that different document fields may match different aspects of the query and therefore benefit from comparing with separate representations of the query text. The combination of techniques introduced here leads to a neural ranker that can take advantage of full document structure, including multiple instance and missing instance data, of variable length. The techniques significantly enhance the performance of the ranker, and outperform a learning to rank baseline with hand-crafted features.

【Keywords】: deep neural networks; document representation; neural ranking models; representation learning; structured documents

86. Discrete Deep Learning for Fast Content-Aware Recommendation.

【Paper Link】【Pages】:717-726

【Authors】: Yan Zhang ; Hongzhi Yin ; Zi Huang ; Xingzhong Du ; Guowu Yang ; Defu Lian

【Abstract】: The cold-start problem and recommendation efficiency have been regarded as two crucial challenges in recommender systems. In this paper, we propose a hashing-based deep learning framework called Discrete Deep Learning (DDL) to map users and items to Hamming space, where a user's preference for an item can be efficiently calculated by Hamming distance, and this computation scheme significantly improves the efficiency of online recommendation. Besides, DDL unifies the user-item interaction information and the item content information to overcome the issues of data sparsity and cold-start. To be more specific, to integrate content information into our DDL framework, a deep learning model, Deep Belief Network (DBN), is applied to extract effective item representation from the item content information. Besides, the framework imposes balance and irrelevant constraints on binary codes to derive compact but informative binary codes. Due to the discrete constraints in DDL, we propose an efficient alternating optimization method consisting of iteratively solving a series of mixed-integer programming subproblems. Extensive experiments have been conducted to evaluate the performance of our DDL framework on two different Amazon datasets, and the experimental results demonstrate the superiority of DDL over the state-of-the-art methods regarding online recommendation efficiency and cold-start recommendation accuracy.

【Keywords】: cold-start; deep learning; hash code; recommender system
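
The Hamming-space scoring described in the abstract above can be illustrated with a minimal Python sketch. It is not the DDL implementation: the 8-bit codes, the item names, and the helpers `hamming_distance` and `rank_items` are hypothetical, and real codes would come from the trained model. It only shows why ranking reduces to an XOR followed by a bit count.

```python
def hamming_distance(a, b):
    """Number of bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")


def rank_items(user_code, item_codes):
    """Return item ids ordered from smallest to largest Hamming distance."""
    return sorted(item_codes, key=lambda item: hamming_distance(user_code, item_codes[item]))


# Hypothetical 8-bit codes; a smaller distance is read as a stronger preference.
user = 0b10110010
items = {
    "item_a": 0b10110011,
    "item_b": 0b01001101,
    "item_c": 0b10100000,
}
ranking = rank_items(user, items)
```

Because each comparison is a bitwise operation on a short code, scoring a large catalogue in this way is cheap compared with real-valued inner products.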

87. Micro Behaviors: A New Perspective in E-commerce Recommender Systems.

【Paper Link】【Pages】:727-735

【Authors】: Meizi Zhou ; Zhuoye Ding ; Jiliang Tang ; Dawei Yin

【Abstract】: The explosive popularity of e-commerce sites has reshaped users' shopping habits and an increasing number of users prefer to spend more time shopping online. This evolution allows e-commerce sites to observe rich data about users. The majority of traditional recommender systems have focused on the macro interactions between users and items, i.e., the purchase history of a customer. However, within each macro interaction between a user and an item, the user actually performs a sequence of micro behaviors, which indicate how the user locates the item, what activities the user conducts on the item (e.g., reading the comments, carting, and ordering) and how long the user stays with the item. Such micro behaviors offer fine-grained and deep understandings about users and provide tremendous opportunities to advance recommender systems in e-commerce. However, exploiting micro behaviors for recommendations is rather limited, which motivates us to investigate e-commerce recommendations from a micro-behavior perspective in this paper. Particularly, we uncover the effects of micro behaviors on recommendations and propose an interpretable Recommendation framework RIB, which models inherently the sequence of mIcro Behaviors and their effects. Experimental results on datasets from a real e-commerce site demonstrate the effectiveness of the proposed framework and the importance of micro behaviors for recommendations.

【Keywords】: attention mechanism; e-commerce; micro behaviors; recommendation; rnn

88. Predicting Multi-step Citywide Passenger Demands Using Attention-based Neural Networks.

【Paper Link】【Pages】:736-744

【Authors】: Xian Zhou ; Yanyan Shen ; Yanmin Zhu ; Linpeng Huang

【Abstract】: Predicting passenger pickup/dropoff demands based on historical mobility trips has been of great importance towards better vehicle distribution for the emerging mobility-on-demand (MOD) services. Prior works focused on predicting next-step passenger demands at selected locations or hotspots. However, we argue that multi-step citywide passenger demands encapsulate both time-varying demand trends and global statuses, and hence are more beneficial to avoiding demand-service mismatching and developing effective vehicle distribution/scheduling strategies. In this paper, we propose an end-to-end deep neural network solution to the prediction task. We employ the encoder-decoder framework based on convolutional and ConvLSTM units to identify complex features that capture spatiotemporal influences and pickup-dropoff interactions on citywide passenger demands. A novel attention model is incorporated to emphasize the effects of latent citywide mobility regularities. We evaluate our proposed method using real-world mobility trips (taxis and bikes) and the experimental results show that our method achieves higher prediction accuracy than the adaptations of the state-of-the-art approaches.

【Keywords】: attention model; convolutional neural network; demands prediction; lstm

Doctoral Presentations 7

89. Event Mining over Distributed Text Streams.

【Paper Link】【Pages】:745-746

【Authors】: John Calvo Martinez

【Abstract】: This research presents a new set of techniques to deal with event mining from different text sources, a complex set of NLP tasks which aim to extract events of interest and their components including authors, targets, locations, and event categories. Our focus is on distributed text streams, such as tweets from different news agencies, in order to accurately retrieve events and their components by combining such sources in different ways using text stream mining. Therefore, this research project aims to fill the gap between batch event mining, text stream mining and distributed data mining, which have been used separately to address related learning tasks. We propose a multi-task and multi-stream mining approach to combine information from multiple text streams to accurately extract and categorise events under the assumptions of stream mining. Our approach also incorporates ontology matching to boost accuracy under imbalanced distributions. In addition, we plan to address two relatively unexplored event mining tasks: event coreference and event synthesis. Preliminary results show the appropriateness of our proposal, which gives an increase of around 20% on macro prequential metrics for the event classification task.

【Keywords】: event mining; stream mining; text mining

90. Connectivity in Complex Networks: Measures, Inference and Optimization.

【Paper Link】【Pages】:747-748

【Authors】: Chen Chen

【Abstract】: Networks are ubiquitous in many high impact domains. Among the various aspects of network studies, connectivity is the one that plays an important role in many applications (e.g., information dissemination, robustness analysis, community detection, etc.). The diversified applications have spurred numerous connectivity measures. Accordingly, ad-hoc connectivity optimization methods are designed for each measure, making it hard to model and control the connectivity of the network in a unified framework. On the other hand, it is often impossible to maintain an accurate structure of the network due to network dynamics and noise in real applications, which would affect the accuracy of connectivity measures and the effectiveness of corresponding connectivity optimization methods. In this work, we aim to address the challenges on network connectivity by (1) unifying a wide range of classic network connectivity measures into one uniform model; (2) proposing effective approaches to infer connectivity measures and network structures from dynamic and incomplete input data; and (3) providing a general framework to optimize the connectivity measures in the network.

【Keywords】: graph mining; network connectivity

91. Automatic Ranking of Information Retrieval Systems.

【Paper Link】【Pages】:749-750

【Authors】: Maram Hasanain

【Abstract】: Typical information retrieval system evaluation requires expensive manually-collected relevance judgments of documents, which are used to rank retrieval systems. Due to the high cost associated with collecting relevance judgments and the ever-growing scale of data to be searched in practice, ranking of retrieval systems using manual judgments is becoming less feasible. Methods to automatically rank systems in the absence of judgments have been proposed to tackle this challenge. However, current techniques are still far from reaching the ranking achieved using manual judgments. I propose to advance research on automatic system ranking using supervised and unsupervised techniques.

【Keywords】: evaluation; learning-to-rank; pseudo judgments; test collections

92. Mining Twitter for Fine-Grained Political Opinion Polarity Classification, Ideology Detection and Sarcasm Detection.

【Paper Link】【Pages】:751-752

【Authors】: Sandeepa Kannangara

【Abstract】: In this paper, we propose three models for socio-political opinion polarity classification of microblog posts. Firstly, a novel probabilistic model, Joint-Entity-Sentiment-Topic (JEST) model, which captures opinions as a combination of the target entity, sentiment and topic, will be proposed. Secondly, a model for ideology detection called JEST-Ideology will be proposed to identify an individual's orientation towards topics/issues and target entities by extending the proposed opinion polarity classification framework. Finally, we propose a novel method to accurately detect sarcastic opinions by utilizing detected fine-grained opinion and ideology.

【Keywords】: fine-grained opinion mining; ideology detection; sarcasm detection

93. Beyond Who and What: Data Driven Approaches for User Characterization.

【Paper Link】【Pages】:753-754

【Authors】: Aastha Nigam

【Abstract】: Social media and technology have drastically transformed the social and information networks around us. They have impacted how we communicate with others, search for information, and even how we express our personal opinions. Further, in this era of big data, not only are online services collecting a vast variety of user data, but we, as users, are also readily divulging significant amounts of information. Together, massive datasets obtained from diverse sources such as organizations and user-generated content give us the opportunity to explore and understand the complex behavior of both individuals and communities. This proposal aims at designing generalizable and scalable data-driven frameworks to gain a deeper understanding of the users, explain their actions and preferences, and infer personal traits. The proposed models will enable us to move beyond asking the conventional questions of who and what, and reveal answers about how and why. Given the varying digital persona of users motivated by their personal preferences and social attributes, we characterize users in two distinct domains: online health and peace studies. The models are designed to solve various real-world challenges to maximize their broader impact.

【Keywords】: computation social science; online healthcare; peace studies; user behavior; user modeling

94. Engagement and Incentives in Online Community: Observational Data, Prediction Models, and Field Experiments.

【Paper Link】【Pages】:755-756

【Authors】: Jiezhong Qiu

【Abstract】: This proposal aims to study user engagement patterns and how different incentive mechanisms influence user behavior in online communities. Work in this proposal investigates the diverse behavior patterns that different individuals follow in various online communities, and how incentive design can help increase user engagement. First, our work on MOOCs leads to the discovery of behavioral heterogeneity in students' course selection as well as their learning patterns. Second, our work on social messaging groups characterizes the formation and evolution patterns of chat groups regarding their lifecycles, structural dynamics, and underlying diffusion processes. Finally, we design and deploy a large-scale online experiment to explore how social ties, as a type of incentive, can help call back dropout users in a social game community. Ultimately, studying engagement and incentives offers us an opportunity to understand the fundamental principles that drive our online behaviors and activities - from individuals, to groups, to communities - and, in this way, to help design and build better online communities and organizations.

【Keywords】: incentives; moocs; online engagement; online experiment; social game; social network

95. Exploiting Human Mobility Patterns for Point-of-Interest Recommendation.

【Paper Link】【Pages】:757-758

【Authors】: Zijun Yao

【Abstract】: Point-of-interest (POI) recommendation, which provides personalized recommendation of places to mobile users, is an important task in location-based social networks (LBSNs). Unlike traditional interest-oriented merchandise recommendation, POI recommendation is more complex due to the timing effects: we need to examine whether the POI fits a user's availability. While there are some prior studies which consider temporal effects by solely using check-in timestamps for modeling, they suffer from check-in data sparsity. In recent years, advances in positioning technology have produced a variety of urban data related to human mobility. There is a potential to exploit human mobility patterns from heterogeneous information sources for improving POI recommendation. To this end, we propose a novel method which incorporates the degree of temporal matching between users and POIs into personalized POI recommendations. Specifically, we profile the temporal popularity of POIs, learn the latent regularity to characterize users, and conduct comprehensive experiments with real-world data. Evaluation results demonstrate the effectiveness of the proposed method.

【Keywords】: human mobility patterns; point-of-interest recommendation

Demonstrations 4

96. Percolator: Scalable Pattern Discovery in Dynamic Graphs.

【Paper Link】【Pages】:759-762

【Authors】: Sutanay Choudhury ; Sumit Purohit ; Peng Lin ; Yinghui Wu ; Lawrence B. Holder ; Khushbu Agarwal

【Abstract】: We demonstrate Percolator, a distributed system for graph pattern discovery in dynamic graphs. In contrast to conventional mining systems, Percolator advocates efficient pattern mining schemes that (1) support pattern detection with keywords; (2) integrate incremental and parallel pattern mining; and (3) support analytical queries such as trend analysis. The core idea of Percolator is to dynamically decide and verify a small fraction of patterns and their instances that must be inspected in response to buffered updates in dynamic graphs, with a total mining cost independent of graph size. We demonstrate (a) the feasibility of incremental pattern mining by walking through each component of Percolator, (b) the efficiency and scalability of Percolator over the sheer size of real-world dynamic graphs, and (c) how the user-friendly GUI of Percolator interacts with users to support keyword-based queries that detect, browse and inspect trending patterns. We demonstrate how Percolator effectively supports event and trend analysis in social media streams and research publication, respectively.

【Keywords】: data stream; graph mining; parallel system

97. Conversational Semantic Search: Looking Beyond Web Search, Q&A and Dialog Systems.

【Paper Link】【Pages】:763-766

【Authors】: Paul A. Crook ; Alex Marin ; Vipul Agarwal ; Samantha Anderson ; Ohyoung Jang ; Aliasgar Lanewala ; Karthik Tangirala ; Imed Zitouni

【Abstract】: User expectations of web search are changing. They are expecting search engines to answer questions, to be more conversational, and to offer means to complete tasks on their behalf. At the same time, to increase the breadth of tasks that personal digital assistants (PDAs), such as Microsoft's Cortana or Amazon's Alexa, are capable of, PDAs need to better utilize information about the world, a significant amount of which is available in the knowledge bases and answers built for search engines. It thus seems likely that the underlying systems that power web search and PDAs will converge. This demonstration presents a system that merges elements of traditional multi-turn dialog systems with web based question answering. This demo focuses on the automatic composition of semantic functional units, Botlets, to generate responses to user's natural language (NL) queries. We show that such a system can be trained to combine information from search engine answers with PDA tasks to enable new user experiences.

【Keywords】: conversational agents; dialog systems; personal digital assistants; question answering; semantic search

98. Supporting Large-scale Geographical Visualization in a Multi-granularity Way.

【Paper Link】【Pages】:767-770

【Authors】: Mingzhao Li ; Zhifeng Bao ; Farhana Murtaza Choudhury ; Timos Sellis

【Abstract】: Urban data (e.g., real estate data, crime data) often have multiple attributes which are highly geography-related. As the scale of data increases, directly visualizing millions of individual data points on top of a map would overwhelm users' perceptual and cognitive capacity and lead to high latency when users interact with the data. In this demo, we present ConvexCubes, a system that supports interactive visualization of large-scale multidimensional urban data in a multi-granularity way. Compared to state-of-the-art visualization-driven data structures, it exploits real-world geographic semantics (e.g., country, state, city) rather than using grid-based aggregation. Instead of calculating everything on demand, ConvexCubes utilizes existing visualization results to efficiently support different kinds of user interactions, such as zooming & panning, filtering and granularity control. Our system can be accessed at http://115.146.89.158/ConvexCubes/.

【Keywords】: data structure; geographical visualization; interactive visualization; visual analytics

99. Collabot: Personalized Group Chat Summarization.

【Paper Link】【Pages】:771-774

【Authors】: Naama Tepper ; Anat Hashavit ; Maya Barnea ; Inbal Ronen ; Lior Leiba

【Abstract】: In recent years, enterprise group chat collaboration tools, such as Slack, IBM's Watson Workspace and Microsoft Teams, have seen unprecedented growth. With all the potential benefits of these tools - productivity increase and improved group communication - come significant challenges. Specifically, the 'always on' feature makes it hard for users to cope with the load of conversational content and to get up to speed after logging off for a while. In this demo, we present Collabot - a chat assistant service that implicitly learns users' interests and social ties within a chat group and provides a personalized digest of missed content. Collabot assists users in coping with chat information overload by helping them understand the main topics discussed, collaborators, links and resources. This demo has two main contributions. First, we present a novel personalized group chat summarization algorithm; second, the demonstration depicts a working implementation applied on different chat groups from different domains within IBM. A video describing the demo can be found at https://www.youtube.com/watch?v=6cVsstiJ9vk.

【Keywords】: group chat; personalization; social networks; summarization

Tutorials 8

【Paper Link】【Pages】:775-776

【Authors】: Çigdem Aslay ; Laks V. S. Lakshmanan ; Wei Lu ; Xiaokui Xiao

【Abstract】: Starting with the earliest studies showing that the spread of new trends, information, and innovations is closely related to the social influence exerted on people by their social networks, the research on social influence theory took off, providing remarkable evidence on social influence induced viral phenomena. Fueled by the extreme popularity of online social networks and social media, computational social influence has emerged as a subfield of data mining whose goal is to analyze and optimize social influence using computational frameworks such as algorithm design and theoretical modeling. One of the fundamental problems in this field is the problem of influence maximization, primarily motivated by the application of viral marketing. The objective is to identify a small set of users in a social network who, when convinced to adopt a product, shall influence others in the network in a manner that leads to a large number of adoptions. In this tutorial, we extensively survey the research on social influence propagation and maximization, with a focus on the recent algorithmic and theoretical advances. To this end, we provide detailed reviews of the latest research effort devoted to (i) improving the efficiency and scalability of the influence maximization algorithms; (ii) context-aware modeling of the influence maximization problem to better capture real-world marketing scenarios; (iii) modeling and learning of real-world social influence; (iv) bridging the gap between social advertising and viral marketing.

【Keywords】: influence maximization; social advertising; social influence; social networks
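
The influence maximization objective described above can be made concrete with a small Python sketch. The abstract does not prescribe an algorithm; the sketch assumes the independent cascade diffusion model with a single uniform activation probability and uses the classic greedy selection with Monte Carlo spread estimates. The toy graph and the names `simulate_cascade`, `expected_spread` and `greedy_seeds` are hypothetical.

```python
import random


def simulate_cascade(graph, seeds, prob, rng):
    """Run one independent-cascade simulation and return the number of activated nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for node in frontier:
            for neighbor in graph.get(node, []):
                # Each newly active node gets one chance to activate each inactive neighbor.
                if neighbor not in active and rng.random() < prob:
                    active.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return len(active)


def expected_spread(graph, seeds, prob, runs, rng):
    """Monte Carlo estimate of the expected number of activated nodes."""
    return sum(simulate_cascade(graph, seeds, prob, rng) for _ in range(runs)) / runs


def greedy_seeds(graph, k, prob=0.3, runs=500, seed=0):
    """Pick k seeds, each time adding the node with the largest estimated spread."""
    rng = random.Random(seed)
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    chosen = []
    for _ in range(k):
        candidates = sorted(nodes - set(chosen))
        best = max(candidates, key=lambda n: expected_spread(graph, chosen + [n], prob, runs, rng))
        chosen.append(best)
    return chosen


# Hypothetical directed graph: node -> list of nodes it can influence.
graph = {"hub": ["a", "b", "c", "d"], "a": ["e"], "x": ["y"]}
seeds = greedy_seeds(graph, k=2)
```

Re-estimating the spread of every candidate in every round is what makes this baseline expensive, which is the efficiency and scalability concern listed under item (i) of the abstract.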

101. Differential Privacy for Information Retrieval.

【Paper Link】【Pages】:777-778

【Authors】: Grace Hui Yang ; Sicong Zhang

【Abstract】: The concern for privacy is real for any research that uses user data. Information Retrieval (IR) is not an exception. Many IR algorithms and applications require the use of users' personal information, contextual information and other sensitive and private information. The extensive use of personalization in IR has become a double-edged sword. Sometimes, the concern becomes so overwhelming that IR research has to stop to avoid privacy leaks. The good news is that increasing attention has recently been paid to the joint field of privacy and IR -- privacy-preserving IR. As part of the effort, this tutorial offers an introduction to differential privacy (DP), one of the most advanced techniques in privacy research, and provides the necessary theoretical knowledge for applying privacy techniques in IR. Differential privacy is a technique that provides strong privacy guarantees for data protection. Theoretically, it aims to maximize the data utility in statistical datasets while minimizing the risk of exposing individual data entries to any adversary. Differential privacy has been successfully applied to a wide range of applications in database (DB) and data mining (DM). The research in privacy-preserving IR is relatively new; however, research has shown that DP is also effective in supporting multiple IR tasks. This tutorial aims to lay a theoretical foundation of DP and explain how it can be applied to IR. It highlights the differences between IR tasks and DB and DM tasks and how DP connects to IR. We hope the attendees of this tutorial will have a good understanding of DP and other necessary knowledge to work on the newly minted joint research field of privacy and IR.

【Keywords】: differential privacy; privacy-preserving information retrieval
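
As a concrete illustration of the utility/privacy trade-off mentioned above, the following Python sketch applies the Laplace mechanism, a standard differential privacy technique, to a counting query. The abstract does not say which mechanisms the tutorial covers, so this choice is an assumption, and the data and the names `laplace_noise` and `private_count` are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(values, predicate, epsilon, rng):
    """Return a noisy count of the values that satisfy predicate.

    Adding or removing one record changes a count by at most 1, so the
    sensitivity is 1 and the noise scale is sensitivity / epsilon.
    """
    true_count = sum(1 for v in values if predicate(v))
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical data: number of terms in each logged query.
rng = random.Random(0)
query_lengths = [3, 7, 2, 9, 4, 8, 1]
noisy = private_count(query_lengths, lambda n: n > 5, epsilon=0.5, rng=rng)
```

A smaller `epsilon` means a larger noise scale and therefore stronger privacy but a less useful count; averaged over many draws the noisy answer still centers on the true count.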

102. Neural Networks for Information Retrieval.

【Paper Link】【Pages】:779-780

【Authors】: Tom Kenter ; Alexey Borisov ; Christophe Van Gysel ; Mostafa Dehghani ; Maarten de Rijke ; Bhaskar Mitra

【Abstract】: Machine learning plays a role in many aspects of modern IR systems, and deep learning is applied in all of them. The fast pace of modern-day research has given rise to many approaches to many IR problems. The amount of information available can be overwhelming both for junior students and for experienced researchers looking for new research topics and directions. The aim of this full-day tutorial is to give a clear overview of current tried-and-trusted neural methods in IR and how they benefit IR.

【Keywords】:

103. Tutorial on Metrics of User Engagement: Applications to News, Search and E-Commerce.

【Paper Link】【Pages】:781-782

【Authors】: Mounia Lalmas ; Liangjie Hong

【Abstract】: User engagement plays a central role in companies operating online services, such as search engines, news portals, e-commerce sites, and social networks. A main challenge is to leverage collected knowledge about the daily online behavior of millions of users to understand what engages them in the short term and, more importantly, in the long term. The most common way that engagement is measured is through various online metrics, acting as proxy measures of user engagement. This tutorial will review these metrics, their advantages and drawbacks, and their appropriateness to various types of online services. As case studies, we will focus on three types of services: news, search and e-commerce. We will also briefly discuss how to develop better machine learning models to optimize online metrics, and design experiments to test these models.

【Keywords】: metrics; optimization; user engagement

104. Network Science of Teams: Characterization, Prediction, and Optimization.

【Paper Link】【Pages】:783-784

【Authors】: Liangyue Li ; Hanghang Tong

【Abstract】: Teams are increasingly indispensable to achievements in any organization. Despite organizations' substantial dependency on teams, fundamental knowledge about the conduct of team-enabled operations is lacking, especially at the social, cognitive and information level in relation to team performance and network dynamics. Generally speaking, team performance can be viewed as the composite of its users, the tasks that the team performs and the networks that the team is embedded in or operates on. The goal of this tutorial is to (1) provide a comprehensive review of the recent advances in optimizing teams' performance in the context of networks; and (2) identify the open challenges and future trends. We believe this is an emerging and high-impact topic in computational social science, which will attract both researchers and practitioners in the data mining as well as social science research communities. Our emphasis will be on (1) recently emerging techniques for addressing the team performance optimization problem; and (2) the open challenges and future trends, with a careful balance between theories, algorithms and applications.

【Keywords】: network science; teamwork

【Paper Link】【Pages】:785-786

【Authors】: Alexandra Olteanu ; Emre Kiciman ; Carlos Castillo

【Abstract】: Online social data like user-generated content, expressed or implicit relations among people, and behavioral traces are at the core of many popular web applications and platforms, driving the research agenda of researchers in both academia and industry. The promises of social data are many, including the understanding of "what the world thinks" about a social issue, brand, product, celebrity, or other entity, as well as enabling better decision-making in a variety of fields including public policy, healthcare, and economics. However, many academics and practitioners are increasingly warning against the naive usage of social data. They highlight that there are biases and inaccuracies occurring at the source of the data, but also introduced during the data processing pipeline; there are methodological limitations and pitfalls, as well as ethical boundaries and unexpected outcomes that are often overlooked. Such oversights can lead to wrong or inappropriate results that can be consequential.

【Keywords】: behavioral data; data biases; ethics; evaluation; social media

106. Athlytics: Winning in Sports with Data.

【Paper Link】【Pages】:787-788

【Authors】: Konstantinos Pelechrinis ; Evangelos E. Papalexakis

【Abstract】: Data and analytics have been part of the sports industry from as early as the 1870s, when the first boxscore in baseball was recorded. However, it is only recently that advanced data mining and machine learning techniques have been utilized for facilitating the operations of sports franchises. While part of the reason is related to the ability to collect more fine-grained data, an equally important factor for this turn to analytics is the huge success and competitive advantage that early adopters of investment in analytics enjoyed (popularized by the best-seller "Moneyball", which described the success that the Oakland Athletics had with analytics). Draft selection, game-day decision making and player evaluation are just a few of the applications where sports analytics play a crucial role today. Apart from the sports clubs, other stakeholders in the industry (e.g., the leagues' offices, media, etc.) invest in analytics. The leagues increasingly rely on data in order to decide on potential rule changes. For instance, the most recent rule change in the NFL, i.e., the kickoff touchback, was a result of thorough data analysis of concussion instances. In this tutorial we will review the literature in data mining and machine learning techniques for sports analytics. We will introduce the audience to the design and methodologies behind advanced metrics such as the adjusted plus/minus for evaluating basketball players, spatial metrics for evaluating the ability of a player to spread the defense in basketball, and the Player Efficiency Rating (PER). We will also go into depth on advanced data mining methods, and in particular tensor mining, that can analyze heterogeneous data similar to the data available in today's sports world.

【Keywords】:

107. Mining Knowledge Graphs From Text.

【Paper Link】【Pages】:789-790

【Authors】: Jay Pujara ; Sameer Singh

【Abstract】: Knowledge graphs have become an increasingly crucial component in machine intelligence systems, powering ubiquitous digital assistants and inspiring several large-scale academic projects across the globe. Our tutorial explains why knowledge graphs are important, how knowledge graphs are constructed, and where new research opportunities exist for improving the state of the art. In this tutorial, we cover the many sophisticated approaches that complete and correct knowledge graphs. We organize this exploration into two main classes of models. The first includes probabilistic logical frameworks that use graphical models, random walks, or statistical rule mining to construct knowledge graphs. The second class of models includes latent space models such as matrix and tensor factorization and neural networks. We conclude the tutorial with a critical comparison of techniques and results. We will offer practical advice for novices to identify common empirical challenges and concrete data sets for initial experimentation. Finally, we will highlight promising areas of current and future work.

【Keywords】: automated knowledge base construction; information extraction; knowledge bases; knowledge graphs; natural language processing; neural embedding techniques; probabilistic models; statistical relational learning

Workshops 8

【Paper Link】【Pages】:791-792

【Authors】: Yu-Ru Lin ; Carlos Castillo ; Jie Yin

【Abstract】: During large-scale emergencies such as natural and man-made disasters, a massive amount of information is posted by the public in social media. Collecting, aggregating, and presenting this information to stakeholders can be extremely challenging, particularly if an understanding of the "big picture" is sought. This international workshop, the fifth in the series, is a key venue for researchers and practitioners to discuss research challenges and technical issues around the usage of social media in disaster management. Workshop's website: https://sites.google.com/site/swdm2018/

【Keywords】: disaster response; emergency management; social media

109. First Workshop on Knowledge Base Construction, Mining and Reasoning.

【Paper Link】【Pages】:793-794

【Authors】: Xiang Ren ; Craig A. Knoblock ; William Wang ; Yu Su

【Abstract】:

【Keywords】: information extraction; knowledge base; knowledge graph; knowledge reasoning

110. HeteroNAM: International Workshop on Heterogeneous Networks Analysis and Mining.

【Paper Link】【Pages】:795-796

【Authors】: Shobeir Fakhraei ; Yanen Li ; Yizhou Sun ; Tim Weninger

【Abstract】: The first International Workshop on Heterogeneous Networks Analysis and Mining is held in Los Angeles, California, USA on February 9th, 2018 and is co-located with the 11th ACM International Conference on Web Search and Data Mining. The goal of this workshop is to bring together computing researchers and practitioners to address challenges in the mining and analysis of real-world heterogeneous networks. This workshop has an exciting program that spans a number of subareas including: graph mining, learning from structured data, statistical relational learning, and network science in general. The program includes six invited speakers, lively discussion on emerging topics, and presentations of several original research papers.

【Keywords】: aligned networks; attributed networks; complex networks; graph mining; heterogeneous information networks; multi-relational data; multidimensional networks; multigraphs; multilayer networks; multimodal networks; signed networks

111. LearnIR: WSDM 2018 Workshop on Learning from User Interactions.

【Paper Link】【Pages】:797-798

【Authors】: Rishabh Mehrotra ; Ahmed Hassan Awadallah ; Emine Yilmaz

【Abstract】: While users interact with online services (e.g. search engines, recommender systems, conversational agents), they leave behind fine-grained traces of interaction patterns. The ability to understand user behavior, record and interpret user interaction signals, gauge user satisfaction and incorporate user feedback gives online systems a vast treasure trove of insights for improvement and experimentation. More generally, the ability to learn from user interactions promises pathways for solving a number of problems and improving user engagement and satisfaction. Understanding and learning from user interactions involves a number of different aspects, from understanding user intent and tasks to developing user models and personalization services. A user's understanding of their need and the overall task develops as they interact with the system. Supporting the various stages of the task involves many aspects of the system, e.g. interface features, presentation of information, retrieving and ranking. Often, online systems are not specifically designed to support users in successfully accomplishing the tasks which motivated them to interact with the system in the first place. Beyond understanding user needs, learning from user interactions involves developing the right metrics and experimentation systems, understanding user interaction processes and their usage context, and designing interfaces capable of helping users. Learning from user interactions becomes more important as new ways of user interaction surface. There is a gradual shift towards searching and presenting information in a conversational form. Chatbots, personal assistants in our phones and eyes-free devices are increasingly being used for different purposes, including information retrieval and exploration. With improved speech recognition and information retrieval systems, more and more users are relying on such digital assistants to fulfill their information needs and complete their tasks. Such systems rely heavily on quickly learning from past interactions and incorporating implicit feedback signals into their models for rapid development.

【Keywords】:

112. MIS2: Misinformation and Misbehavior Mining on the Web.

【Paper Link】【Pages】:799-800

【Authors】: Srijan Kumar ; Meng Jiang ; Taeho Jung ; Roger Jie Luo ; Jure Leskovec

【Abstract】: The Misinformation and Misbehavior Mining on the Web (MIS2) workshop is held in Los Angeles, California, USA on February 9, 2018, and is co-located with the 11th ACM International Conference on Web Search and Data Mining (WSDM 2018). The web is a dynamic ecosystem that enables malicious users to create and spread deceptive information to a wide audience in a matter of minutes. These malicious actors work on a wide variety of platforms, such as social media, e-commerce, and more. The main objective of MIS2 is to discuss new and upcoming research on modeling, discovery, detection, and mitigation methods of misbehavior and misinformation on the web. MIS2 is an interdisciplinary venue for leading researchers and practitioners from the areas of data mining, social network analysis, cybersecurity, communications, human-computer interaction, and natural language processing. The topics addressed in MIS2 are extremely timely, and the research presented by refereed papers and invited keynote speakers will give participants a full dose of emerging research.

【Keywords】:

113. Workshop on Two-sided Marketplace Optimization: Search, Pricing, Matching & Growth.

【Paper Link】【Pages】:801-802

【Authors】: Mihajlo Grbovic ; Thanasis Noulas

【Abstract】: The 1st International Workshop on Two-sided Marketplace Optimization: Search, Pricing, Matching & Growth (TSMO) will be held in Los Angeles, California, USA on February 9th, 2018, co-located with the 11th ACM International Conference on Web Search and Data Mining (WSDM). The main objective of the workshop is to address the challenges of two-sided marketplace optimization in web-scale settings. The workshop brings together interdisciplinary researchers in information retrieval, recommender systems, personalization, and related areas, to share, exchange, learn, and develop preliminary results, new concepts, ideas, principles, and methodologies on applying data mining technologies to marketplace optimization. We have constructed an exciting program of papers and invited talks that will help us better understand the future of two-sided marketplaces.

【Keywords】: personalization; search ranking; smart pricing; user modeling

114. GTA3 2018: Workshop on Graph Techniques for Adversarial Activity Analytics.

【Paper Link】【Pages】:803

【Authors】: Jiejun Xu ; Hanghang Tong ; Tsai-Ching Lu ; Jingrui He ; Nadya Bliss

【Abstract】: Networks are natural analytic tools in modeling adversarial activities (e.g., human trafficking, illicit drug production, terrorist financial transactions) using different intelligence data sources. However, such activities are often covert and embedded across multiple domains and contexts. They are generally not detectable and recognizable from the perspective of an isolated network, and only become apparent when multiple networks are analyzed in a joint manner. Thus, one of the main research topics in modeling adversarial activities is to develop effective techniques to align and fuse information from different networks into a unified representation for global analysis. Based on the combined network representation, an equally important research topic is detecting and matching indicating patterns to recognize the underlying adversarial activities in the integrated network. The focus of this workshop is to gather researchers from all relevant fields to share their experience and opinions on graph mining techniques in the era of big data, with emphasis on two fundamental problems, "connecting the dots" and "finding a needle in a haystack", in the context of graph-based adversarial activity analytics.

【Keywords】: adversarial activity; graph mining; network alignment; subgraph matching

115. IFUP: Workshop on Multi-dimensional Information Fusion for User Modeling and Personalization.

【Paper Link】【Pages】:804-805

【Authors】: Feida Zhu ; Yongfeng Zhang ; Neil Yorke-Smith ; Guibing Guo ; Xu Chen

【Abstract】: Recommendation systems have become an important component in many real applications, ranging from e-commerce and music apps to video-sharing sites and online book stores. The key to a successful recommendation system lies in accurate user/item profiling. With the advent of Web 2.0, a large amount of multimodal information has been accumulated, which provides us with the opportunity to profile users in a more comprehensive manner. However, directly integrating multimodal information into a recommendation system is not a trivial task, because the information may be either homogeneous or heterogeneous, which requires more advanced methods for both fusion and alignment. This workshop aims to provide a platform for discussing the challenges and corresponding innovative approaches in fusing multi-dimensional information for user modeling and recommender systems. We hope more advanced technologies can be proposed or inspired, and we also hope that the direction of integrating different types of information can attract much more attention in both academia and industry.

【Keywords】: information fusion; multi-dimensional; user modeling

Keynote Talks 6

1. A Call to Arms: Embrace Assistive AI Systems!

2. On the Power of Massive Text Data.

3. Theoretical Impediments to Machine Learning With Seven Sparks from the Causal Revolution.

4. Conversations, Machine Learning and Privacy: LinkedIn's Path Towards Transforming Interaction with Its Members.

5. From Search to Research: Direct Answers, Perspectives and Dialog.

6. Scalable Algorithms in the Age of Big Data and Network Sciences: Characterization, Primitives, and Techniques.

WSDM Cup 2018 1

7. WSDM Cup 2018: Music Recommendation and Churn Prediction.

Technical Presentations 81

8. Performance Analysis of a Privacy Constrained kNN Recommendation Using Data Sketches.

9. Can you Trust the Trend?: Discovering Simpson's Paradoxes in Social Data.

10. Deep Neural Architecture for Multi-Modal Retrieval based on Joint Embedding Space for Text and Images.

11. A Discrete Choice Model for Subset Selection.

12. Latent Cross: Making Use of Context in Recurrent Recommender Systems.

13. Consistent Transformation of Ratio Metrics for Efficient Online Controlled Experiments.

14. Neural Graph Learning: Training Neural Networks Using Graphs.

15. Sketch 'Em All: Fast Approximate Similarity Search for Dynamic Data Streams.

16. Fast Coreset-based Diversity Maximization under Matroid Constraints.

17. Putting Data in the Driver's Seat: Optimizing Earnings for On-Demand Ride-Hailing.

18. Improving Negative Sampling for Word Representation using Self-embedded Features.

19. Sequential Recommendation with User Memory Networks.

20. VISIR: Visual and Semantic Image Label Refinement.

21. Convolutional Neural Networks for Soft-Matching N-Grams in Ad-hoc Search.

22. Demographics and Dynamics of Mechanical Turk Workers.

23. Joint Generative-Discriminative Aggregation Model for Multi-Option Crowd Labels.

24. Predicting Audio Advertisement Quality.

25. Cognitive Biases in Crowdsourcing.

26. User Profiling through Deep Multimodal Fusion.

27. Orienteering Algorithms for Generating Travel Itineraries.

28. Unsubscription: A Simple Way to Ease Overload in Email.

29. Offline A/B Testing for Recommender Systems.

30. Care to Share?: Learning to Rank Personal Photos for Public Sharing.

31. Identifying Informational vs. Conversational Questions on Community Question Answering Archives.

32. Robust Transfer Learning for Cross-domain Collaborative Filtering Using Multiple Rating Patterns Approximation.

33. Ballpark Crowdsourcing: The Wisdom of Rough Group Comparisons.

34. Collaborative Filtering via Additive Ordinal Regression.

35. Who Will Share My Image?: Predicting the Content Diffusion Path in Online Social Networks.

36. Listening to Chaotic Whispers: A Deep Learning Framework for News-oriented Stock Trend Prediction.

37. Exploring Expert Cognition for Attributed Network Embedding.

38. Co-PACRR: A Context-Aware Neural IR Model for Ad-hoc Retrieval.

39. Recommendation in Heterogeneous Information Networks Based on Generalized Random Walk Model and Bayesian Personalized Ranking.

40. Fast and Scalable Distributed Loopy Belief Propagation on Real-World Graphs.

41. Combating Crowdsourced Review Manipulators: A Neighborhood-Based Approach.

42. Topic Chronicle Forest for Topic Discovery and Tracking.

43. Leveraging the Crowd to Detect and Reduce the Spread of Fake News and Misinformation.

44. REV2: Fraudulent User Prediction in Rating Platforms.

45. Web Search of Fashion Items with Multimodal Querying.

46. Joint Non-negative Matrix Factorization for Learning Ideological Leaning on Twitter.

47. Bayesian Optimization for Optimizing Retrieval Systems.

48. Streaming Link Prediction on Dynamic Attributed Networks.

49. Inferring Dockless Shared Bike Distribution in New Cities.

50. Multi-Dimensional Network Embedding with Hierarchical Structure.

51. Query Driven Algorithm Selection in Early Stage Retrieval.

52. Index Compression Using Byte-Aligned ANS Coding and Two-Dimensional Contexts.

53. Fusing Diversity in Recommendations in Heterogeneous Information Networks.

54. Neural Personalized Ranking for Image Recommendation.

55. Learning to Discover Domain-Specific Web Content.

56. Extreme Multi-label Learning with Label Features for Warm-start Tagging, Ranking & Recommendation.

57. DSANLS: Accelerating Distributed Nonnegative Matrix Factorization via Sketching.

58. Network Embedding as Matrix Factorization: Unifying DeepWalk, LINE, PTE, and node2vec.

59. Curriculum Learning for Heterogeneous Star Network Embedding via Deep Reinforcement Learning.

60. Leveraging Implicit Contribution Amounts to Facilitate Microfinancing Requests.

61. FACH: Fast Algorithm for Detecting Cohesive Hierarchies of Communities in Large Networks.

62. Measuring the Latency of Depression Detection in Social Media.

63. Peeling Bipartite Networks for Dense Subgraph Discovery.

64. Short-Term Satisfaction and Long-Term Coverage: Understanding How Users Tolerate Algorithmic Exploration.

65. CrossFire: Cross Media Joint Friend and Item Recommendations.

66. Modeling Time to Open of Emails with a Latent State for User Engagement Level.

67. Shortcutting Label Propagation for Distributed Connected Components.

68. User Intent, Behaviour, and Perceived Satisfaction in Product Search.

69. Logician: A Unified End-to-End Neural Approach for Open-Domain Information Extraction.

70. Personalized Top-N Sequential Recommendation via Convolutional Sequence Embedding.

71. sSketch: A Scalable Sketching Technique for PCA in the Cloud.

72. Hyperbolic Representation Learning for Fast and Efficient Neural Question Answering.

73. SHINE: Signed Heterogeneous Information Network Embedding for Sentiment Link Prediction.

74. A Unified Processing Paradigm for Interactive Location-based Web Search.

75. Position Bias Estimation for Unbiased Learning to Rank in Personal Search.