30. AAAI 2016: Phoenix, Arizona, USA

Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, February 12-17, 2016, Phoenix, Arizona, USA. AAAI Press 【DBLP Link】

Paper Num: 691 || Session Num: 36

Demonstration Papers 29
Doctoral Consortium 17
EAAI Symposium Full Paper 10
EAAI Symposium Model AI Assignments 1
EAAI Symposium Poster Paper 6
Innovative Applications Challenge Problem Papers 1
Innovative Applications Deployed Papers 3
Innovative Applications Emerging Application Papers 11
Senior Member Blue Sky Papers 4
Senior Member Summary Talks 4
Special Track on Cognitive Systems 11
Special Track on Computational Sustainability 22
Special Track on Integrated AI Capabilities 3
Student Abstracts 48
Technical Papers 11
Technical Papers: AI and the Web 33
Technical Papers: Cognitive Modeling and Cognitive Systems 2
Technical Papers: Computational Sustainability and AI 2
Technical Papers: Game Playing and Interactive Entertainment 2
Technical Papers: Game Theory and Economic Paradigms 42
Technical Papers: Heuristic Search and Optimization 22
Technical Papers: Human-Computation and Crowd Sourcing 1
Technical Papers: Humans and AI 4
Technical Papers: Knowledge Representation and Reasoning 36
Technical Papers: Machine Learning Applications 46
Technical Papers: Machine Learning Methods 137
Technical Papers: Multiagent Systems 18
Technical Papers: NLP and Knowledge Representation 16
Technical Papers: NLP and Machine Learning 28
Technical Papers: NLP and Text Mining 32
Technical Papers: Planning and Scheduling 14
Technical Papers: Reasoning under Uncertainty 15
Technical Papers: Robotics 5
Technical Papers: Search and Constraint Satisfaction 11
Technical Papers: Vision 35
What's Hot Papers 9

Technical Papers 11

1. Inferring Multi-Dimensional Ideal Points for US Supreme Court Justices.

【Paper Link】【Pages】:4-12

【Authors】: Mohammad Raihanul Islam ; K. S. M. Tozammel Hossain ; Siddharth Krishnan ; Naren Ramakrishnan

【Abstract】: In Supreme Court parlance and the political science literature, an ideal point positions a justice in a continuous space and can be interpreted as a quantification of the justice's policy preferences. We present an automated approach to infer such ideal points for justices of the US Supreme Court. This approach combines topic modeling over case opinions with the voting (and endorsing) behavior of justices. Furthermore, given a topic of interest, say the Fourth Amendment, the topic model can be optionally seeded with supervised information to steer the inference of ideal points. Application of this methodology over five years of cases provides interesting perspectives into the leaning of justices on crucial issues, coalitions underlying specific topics, and the role of swing justices in deciding the outcomes of cases.

【Keywords】: Opinion Mining; Ideal Point Analysis; Supreme Court; Topic Modeling

2. Little Is Much: Bridging Cross-Platform Behaviors through Overlapped Crowds.

【Paper Link】【Pages】:13-19

【Authors】: Meng Jiang ; Peng Cui ; Nicholas Jing Yuan ; Xing Xie ; Shiqiang Yang

【Abstract】: People often use multiple platforms to fulfill their different information needs. With the ultimate goal of serving people intelligently, a fundamental way is to get comprehensive understanding about user needs. How to organically integrate and bridge cross-platform information in a human-centric way is important. Existing transfer learning assumes either fully-overlapped or non-overlapped among the users. However, the real case is the users of different platforms are partially overlapped. The number of overlapped users is often small and the explicitly known overlapped users is even less due to the lacking of unified ID for a user across different platforms. In this paper, we propose a novel semi-supervised transfer learning method to address the problem of cross-platform behavior prediction, called XPTrans. To alleviate the sparsity issue, it fully exploits the small number of overlapped crowds to optimally bridge a user's behaviors in different platforms. Extensive experiments across two real social networks show that XPTrans significantly outperforms the state-of-the-art. We demonstrate that by fully exploiting 26% overlapped users, XPTrans can predict the behaviors of non-overlapped users with the same accuracy as overlapped users, which means the small overlapped crowds can successfully bridge the information across different platforms.

【Keywords】: Cross-platform; Behavior Prediction; Transfer Learning

3. Scientific Ranking over Heterogeneous Academic Hypernetwork.

【Paper Link】【Pages】:20-26

【Authors】: Ronghua Liang ; Xiaorui Jiang

【Abstract】: Ranking is an important way of retrieving authoritative papers from a large scientific literature database. The current state of the art exploits the flat structure of the heterogeneous academic network to achieve a better ranking of scientific articles, but ignores the multinomial nature of the multidimensional relationships between different types of academic entities. This paper proposes a novel mutual ranking algorithm based on the multinomial heterogeneous academic hypernetwork, which serves as a generalized model of a scientific literature database. The proposed algorithm is shown to be effective through extensive evaluation against well-known IR metrics on a well-established benchmarking environment based on the ACL Anthology Network.

【Keywords】:

4. MUST-CNN: A Multilayer Shift-and-Stitch Deep Convolutional Architecture for Sequence-Based Protein Structure Prediction.

【Paper Link】【Pages】:27-34

【Authors】: Zeming Lin ; Jack Lanchantin ; Yanjun Qi

【Abstract】: Predicting protein properties such as solvent accessibility and secondary structure from its primary amino acid sequence is an important task in bioinformatics. Recently, a few deep learning models have surpassed the traditional window based multilayer perceptron. Taking inspiration from the image classification domain we propose a deep convolutional neural network architecture, MUST-CNN, to predict protein properties. This architecture uses a novel multilayer shift-and-stitch (MUST) technique to generate fully dense per-position predictions on protein sequences. Our model is significantly simpler than the state-of-the-art, yet achieves better results. By combining MUST and the efficient convolution operation, we can consider far more parameters while retaining very fast prediction speeds. We beat the state-of-the-art performance on two large protein property prediction datasets.

【Keywords】: convolutional neural network; neural network; protein; sequence

【Paper Link】【Pages】:35-41

【Authors】: Eric Lofgren ; Anil Vullikanti

【Abstract】: Hospitals are typically optimized to operate near capacity, and there are serious concerns that our healthcare system is not prepared for the next pandemic. Stockpiles of different supplies, e.g., personal protective equipment (PPE) and medical equipment, need to be maintained in order to be able to respond to any future pandemics. Large outbreaks occur with a low probability, and such stockpiles require big investments. Further, hospitals often have mutual sharing agreements, which makes the problem of stockpiling decisions a natural game-theoretical problem. In this paper, we formalize hospital stockpiling as a game-theoretical problem. We use the notion of pairwise Nash stability as a solution concept for this problem, and characterize its structure. We show that stable strategies can lead to high unsatisfied demands in some scenarios, and stockpiles might not be maintained at all nodes. We also show that stable strategies and the social optimum can be computed efficiently.

【Keywords】:

6. Predicting ICU Mortality Risk by Grouping Temporal Trends from a Multivariate Panel of Physiologic Measurements.

【Paper Link】【Pages】:42-50

【Authors】: Yuan Luo ; Yu Xin ; Rohit Joshi ; Leo Celi ; Peter Szolovits

【Abstract】: ICU mortality risk prediction may help clinicians take effective interventions to improve patient outcome. Existing machine learning approaches often face challenges in integrating a comprehensive panel of physiologic variables and presenting to clinicians interpretable models. We aim to improve both accuracy and interpretability of prediction models by introducing Subgraph Augmented Non-negative Matrix Factorization (SANMF) on ICU physiologic time series. SANMF converts time series into a graph representation and applies frequent subgraph mining to automatically extract temporal trends. We then apply non-negative matrix factorization to group trends in a way that approximates patient pathophysiologic states. Trend groups are then used as features in training a logistic regression model for mortality risk prediction, and are also ranked according to their contribution to mortality risk. We evaluated SANMF against four empirical models on the task of predicting mortality or survival 30 days after discharge from ICU using the observed physiologic measurements between 12 and 24 hours after admission. SANMF outperforms all comparison models, and in particular, demonstrates an improvement in AUC (0.848 vs. 0.827, p<0.002) compared to a state-of-the-art machine learning method that uses manual feature engineering. Feature analysis was performed to illuminate insights and benefits of subgraph groups in mortality risk prediction.

【Keywords】: non-negative matrix factorization; frequent subgraph mining; physiologic time series; predictive modeling; Intensive Care Unit

7. Learning to Generate Posters of Scientific Papers.

【Paper Link】【Pages】:51-57

【Authors】: Yuting Qiang ; Yanwei Fu ; Yanwen Guo ; Zhi-Hua Zhou ; Leonid Sigal

【Abstract】: Researchers often summarize their work in the form of posters. Posters provide a coherent and efficient way to convey core ideas from scientific papers. Generating a good scientific poster, however, is a complex and time consuming cognitive task, since such posters need to be readable, informative, and visually aesthetic. In this paper, for the first time, we study the challenging problem of learning to generate posters from scientific papers. To this end, a data-driven framework, that utilizes graphical models, is proposed. Specifically, given content to display, the key elements of a good poster, including panel layout and attributes of each panel, are learned and inferred from data. Then, given inferred layout and attributes, composition of graphical elements within each panel is synthesized. To learn and validate our model, we collect and make public a Poster-Paper dataset, which consists of scientific papers and corresponding posters with exhaustively labelled panels and attributes. Qualitative and quantitative results indicate the effectiveness of our approach.

【Keywords】:

8. Face Behind Makeup.

【Paper Link】【Pages】:58-64

【Authors】: Shuyang Wang ; Yun Fu

【Abstract】: In this work, we propose a novel automatic makeup detector and remover framework. For makeup detector, a locality-constrained low-rank dictionary learning algorithm is used to determine and locate the usage of cosmetics. For the challenging task of makeup removal, a locality-constrained coupled dictionary learning (LC-CDL) framework is proposed to synthesize non-makeup face, so that the makeup could be erased according to the style. Moreover, we build a stepwise makeup dataset (SMU) which to the best of our knowledge is the first dataset with procedures of makeup. This novel technology itself carries many practical applications, e.g. products recommendation for consumers; user-specified makeup tutorial; security applications on makeup face verification. Finally, our system is evaluated on three existing (VMU, MIW, YMU) and one own-collected makeup datasets. Experimental results have demonstrated the effectiveness of DL-based method on makeup detection. The proposed LC-CDL shows very promising performance on makeup removal regarding on the structure similarity. In addition, the comparison of face verification accuracy with presence or absence of makeup is presented, which illustrates an application of our automatic makeup remover system in the context of face verification with facial makeup.

【Keywords】:

【Paper Link】【Pages】:65-71

【Authors】: Yang Yang ; Jia Jia ; Boya Wu ; Jie Tang

【Abstract】: Psychological theories suggest that emotion represents the state of mind and instinctive responses of one’s cognitive system (Cannon 1927). Emotions are a complex state of feeling that results in physical and psychological changes that influence our behavior. In this paper, we study an interesting problem of emotion contagion in social networks. In particular, by employing an image social network (Flickr) as the basis of our study, we try to unveil how users’ emotional statuses influence each other and how users’ positions in the social network affect their influential strength on emotion. We develop a probabilistic framework to formalize the problem into a role-aware contagion model. The model is able to predict users’ emotional statuses based on their historical emotional statuses and social structures. Experiments on a large Flickr dataset show that the proposed model significantly outperforms (+31% in terms of F1-score) several alternative methods in predicting users’ emotional status. We also discover several intriguing phenomena. For example, the probability that a user feels happy is roughly linear to the number of friends who are also happy; but taking a closer look, the happiness probability is superlinear to the number of happy friends who act as opinion leaders (Page et al. 1999) in the network and sublinear in the number of happy friends who span structural holes (Burt 2001). This offers a new opportunity to understand the underlying mechanism of emotional contagion in online social networks.

【Keywords】: emotion contagion, social role, social network

10. Survival Prediction by an Integrated Learning Criterion on Intermittently Varying Healthcare Data.

【Paper Link】【Pages】:72-78

【Authors】: Jianfei Zhang ; Lifei Chen ; Alain Vanasse ; Josiane Courteau ; Shengrui Wang

【Abstract】: Survival prediction is crucial to healthcare research, but is confined primarily to specific types of data involving only the present measurements. This paper considers the more general class of healthcare data found in practice, which includes a wealth of intermittently varying historical measurements in addition to the present measurements. Making survival predictions on such data bristles with challenges to the existing prediction models. For this reason, we propose a new semi-proportional hazards model using locally time-varying coefficients, and a novel complete-data model learning criterion for coefficient optimization. Experiments on the healthcare data demonstrate the effectiveness and generalizability of our model and its promise in practical applications.

【Keywords】:

11. On the Minimum Differentially Resolving Set Problem for Diffusion Source Inference in Networks.

【Paper Link】【Pages】:79-86

【Authors】: Chuan Zhou ; Weixue Lu ; Peng Zhang ; Jia Wu ; Yue Hu ; Li Guo

【Abstract】: In this paper we theoretically study the minimum Differentially Resolving Set (DRS) problem derived from the classical sensor placement optimization problem in network source locating. A DRS of a graph $G = (V, E)$ is defined as a subset $S \subseteq V$ where any two elements in $V$ can be distinguished by their different differential characteristic sets defined on $S$. The minimum DRS problem aims to find a DRS $S$ in the graph $G$ with minimum total weight $\sum_{v \in S} w(v)$. In this paper we establish a group of Integer Linear Programming (ILP) models as the solution. By the weighted set cover theory, we propose an approximation algorithm with the $\Theta(\ln n)$ approximability for the minimum DRS problem on general graphs, where $n$ is the graph size.

【Keywords】: Differentially Resolving Set, Diffusion Source Inference, Networks
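
The abstract above ties its approximation algorithm to weighted set cover theory. As a rough illustration only, the following Python sketch shows the textbook greedy heuristic for weighted set cover, which is the standard route to a logarithmic approximation factor. It is not the authors' algorithm: the function name, the toy data, and the suggested reading (elements are node pairs to be distinguished, and each candidate node "covers" the pairs it can tell apart) are assumptions made for this example.

```python
def greedy_weighted_set_cover(universe, subsets, weights):
    """Textbook greedy heuristic for weighted set cover (illustrative sketch).

    universe: iterable of elements that must all be covered.
    subsets:  dict mapping a name to the set of elements it covers.
    weights:  dict mapping the same names to positive costs.
    Returns the chosen names in the order they were picked.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best, best_ratio = None, float("inf")
        for name, elems in subsets.items():
            gain = len(elems & uncovered)
            if gain == 0:
                continue  # this subset adds nothing new
            ratio = weights[name] / gain  # cost per newly covered element
            if ratio < best_ratio:
                best, best_ratio = name, ratio
        if best is None:
            raise ValueError("universe cannot be covered by the given subsets")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen


# Hypothetical toy instance (not from the paper).
toy_subsets = {"a": {1, 2, 3}, "b": {2, 4}, "c": {3, 4, 5}, "d": {4, 5}}
toy_weights = {"a": 3, "b": 1, "c": 3, "d": 1}
cover = greedy_weighted_set_cover({1, 2, 3, 4, 5}, toy_subsets, toy_weights)
print(cover, sum(toy_weights[name] for name in cover))
```

At each step the heuristic picks the subset with the lowest cost per newly covered element, and ties are broken by dictionary insertion order.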

Technical Papers: AI and the Web 33

12. From Tweets to Wellness: Wellness Event Detection from Twitter Streams.

【Paper Link】【Pages】:87-93

【Authors】: Mohammad Akbari ; Xia Hu ; Liqiang Nie ; Tat-Seng Chua

【Abstract】: Social media platforms have become the most popular means for users to share what is happening around them. The abundance and growing usage of social media has resulted in a large repository of users' social posts, which provides a stethoscope for inferring individuals' lifestyle and wellness. As users' social accounts implicitly reflect their habits, preferences, and feelings, it is feasible for us to monitor and understand the wellness of users by harvesting social media data towards a healthier lifestyle. As a first step towards accomplishing this goal, we propose to automatically extract wellness events from users' published social contents. Existing approaches for event extraction are not applicable to personal wellness events due to its domain nature characterized by plenty of noise and variety in data, insufficient samples, and inter-relation among events. To tackle these problems, we propose an optimization learning framework that utilizes the content information of microblogging messages as well as the relations between event categories. By imposing a sparse constraint on the learning model, we also tackle the problems arising from noise and variation in microblogging texts. Experimental results on a real-world dataset from Twitter have demonstrated the superior performance of our framework.

【Keywords】: Event Detection; Wellness; Healthcare; Twitter; Lifestyle; Multi-task learning; Personal Wellness Events

13. "8 Amazing Secrets for Getting More Clicks": Detecting Clickbaits in News Streams Using Article Informality.

【Paper Link】【Pages】:94-100

【Authors】: Prakhar Biyani ; Kostas Tsioutsiouliklis ; John Blackmer

【Abstract】: Clickbaits are articles with misleading titles, exaggerating the content on the landing page. Their goal is to entice users to click on the title in order to monetize the landing page. The content on the landing page is usually of low quality. Their presence in user homepage stream of news aggregator sites (e.g., Yahoo news, Google news) may adversely impact user experience. Hence, it is important to identify and demote or block them on homepages. In this paper, we present a machine-learning model to detect clickbaits. We use a variety of features and show that the degree of informality of a webpage (as measured by different metrics) is a strong indicator of it being a clickbait. We conduct extensive experiments to evaluate our approach and analyze properties of clickbait and non-clickbait articles. Our model achieves high performance (74.9% F-1 score) in predicting clickbaits.

【Keywords】: clickbait; classification; news site; news aggregator; homepage

【Paper Link】【Pages】:101-107

【Authors】: Bor-Chun Chen ; Yan-Ying Chen ; Francine Chen ; Dhiraj Joshi

【Abstract】: Image localization is important for marketing and recommendation of local business; however, the level of granularity is still a critical issue. Given a consumer photo and its rough GPS information, we are interested in extracting the fine-grained location information, i.e. business venues, of the image. To this end, we propose a novel framework for business venue recognition. The framework mainly contains three parts. First, business-aware visual concept discovery: we mine a set of concepts that are useful for business venue recognition based on three guidelines including business awareness, visually detectable, and discriminative power. We define concepts that satisfy all of these three criteria as business-aware visual concept. Second, business-aware concept detection by convolutional neural networks (BA-CNN): we propose a new network configuration that can incorporate semantic signals mined from business reviews for extracting semantic concept features from a query image. Third, multimodal business venue recognition: we extend visually detected concepts to multimodal feature representations that allow a test image to be associated with business reviews and images from social media for business venue recognition. The experiments results show the visual concepts detected by BA-CNN can achieve up to 22.5% relative improvement for business venue recognition compared to the state-of-the-art convolutional neural network features. Experiments also show that by leveraging multimodal information from social media we can further boost the performance, especially when the database images belonging to each business venue are scarce.

【Keywords】: Image localization; business-aware; concepts; convolutional neural networks

15. Capturing Semantic Correlation for Item Recommendation in Tagging Systems.

【Paper Link】【Pages】:108-114

【Authors】: Chaochao Chen ; Xiaolin Zheng ; Yan Wang ; Fuxing Hong ; Deren Chen

【Abstract】: The popularity of tagging systems provides a great opportunity to improve the performance of item recommendation. Although existing approaches use topic modeling to mine the semantic information of items by grouping the tags labelled for items, they overlook an important property that tags link users and items as a bridge. Thus these methods cannot deal with the data sparsity without commonly rated items (DS-WO-CRI) problem, limiting their recommendation performance. Towards solving this challenging problem, we propose a novel tag and rating based collaborative filtering (CF) model for item recommendation, which first uses topic modeling to mine the semantic information of tags for each user and for each item respectively, and then incorporates the semantic information into matrix factorization to factorize rating information and to capture the bridging feature of tags and ratings between users and items. As a result, our model captures the semantic correlation between users and items, and is able to greatly improve recommendation performance, especially in DS-WO-CRI situations. Experiments conducted on two popular real-world datasets demonstrate that our proposed model significantly outperforms the conventional CF approach, the state-of-the-art social relation based CF approach, and the state-of-the-art topic modeling based CF approaches in terms of both precision and recall, and it is an effective approach to the DS-WO-CRI problem.

【Keywords】: recommender system; matrix factorization; topic model; semantic correlation; tag system

16. Identifying Sentiment Words Using an Optimization Model with L1 Regularization.

【Paper Link】【Pages】:115-121

【Authors】: Zhi-Hong Deng ; Hongliang Yu ; Yunlun Yang

【Abstract】: Sentiment word identification is a fundamental work in numerous applications of sentiment analysis and opinion mining, such as review mining, opinion holder finding, and twitter classification. In this paper, we propose an optimization model with L1 regularization, called ISOMER, for identifying the sentiment words from the corpus. Our model can employ both seed words and documents with sentiment labels, different from most existing researches adopting seed words only. The L1 penalty in the objective function yields a sparse solution since most candidate words have no sentiment. The experiments on the real datasets show that ISOMER outperforms the classic approaches, and that the lexicon learned by ISOMER can be effectively adapted to document-level sentiment analysis.

【Keywords】: sentiment words; sentiment analysis; optimization model; opinion mining; performance
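
The abstract above notes that an L1 penalty yields a sparse solution. A minimal Python sketch of why is given below: soft-thresholding, the proximal operator of the L1 norm, sets every coefficient whose magnitude is at most the threshold to exactly zero and shrinks the rest. This is a generic illustration, not the ISOMER objective or solver, neither of which is specified in the abstract. The function name and the sample scores are hypothetical.

```python
def soft_threshold(values, lam):
    """Apply the L1 proximal operator (soft-thresholding) to each value.

    Values with |v| <= lam become exactly 0.0; larger ones shrink toward
    zero by lam. This is the mechanism by which L1 penalties give sparsity.
    """
    out = []
    for v in values:
        if v > lam:
            out.append(v - lam)
        elif v < -lam:
            out.append(v + lam)
        else:
            out.append(0.0)
    return out


# Hypothetical raw sentiment scores for four candidate words.
raw_scores = [2.0, -0.25, 0.5, -1.5]
sparse_scores = soft_threshold(raw_scores, lam=0.5)
print(sparse_scores)
```

In a sentiment lexicon setting, the words left with a zero score would be read as carrying no sentiment, which matches the abstract's remark that most candidate words are neutral.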

【Paper Link】【Pages】:122-128

【Authors】: Hanyin Fang ; Fei Wu ; Zhou Zhao ; Xinyu Duan ; Yueting Zhuang ; Martin Ester

【Abstract】: Community-based question answering (cQA) sites have accumulated a vast number of questions and corresponding crowdsourced answers over time. How to efficiently share the underlying information and knowledge from reliable (usually highly-reputable) answerers has become an increasingly popular research topic. A major challenge in cQA tasks is the accurate matching of high-quality answers w.r.t. given questions. Many traditional approaches recommend answers based merely on the content similarity between questions and answers, and therefore suffer from the sparsity bottleneck of cQA data. In this paper, we propose a novel framework which encodes not only the contents of question-answer (Q-A) but also the social interaction cues in the community to boost the cQA tasks. More specifically, our framework collaboratively utilizes the rich interaction among questions, answers and answerers to learn the relative quality rank of different answers w.r.t. the same question. Moreover, the information in heterogeneous social networks is comprehensively employed to enhance the quality of question-answering (QA) matching by our deep random walk learning framework. Extensive experiments on a large-scale dataset from a real world cQA site show that leveraging the heterogeneous social information indeed achieves better performance than other state-of-the-art cQA methods.

【Keywords】:

【Paper Link】【Pages】:129-136

【Authors】: Hancheng Ge ; James Caverlee

【Abstract】: In this paper, we explore the potential of geo-social media to construct location-based interest profiles to uncover the hidden relationships among disparate locations. Through an investigation of millions of geo-tagged Tweets, we construct a per-city interest model based on fourteen high-level categories (e.g., technology, art, sports). These interest models support the discovery of related locations that are connected based on these categorical perspectives (e.g., college towns or vacation spots) but perhaps not on the individual tweet level. We then connect these city-based interest models to underlying demographic data. By building multivariate multiple linear regression (MMLR) and neural network (NN) models we show how a location's interest profile may be estimated based purely on its demographics features.

【Keywords】:

19. Inferring a Personalized Next Point-of-Interest Recommendation Model with Latent Behavior Patterns.

【Paper Link】【Pages】:137-143

【Authors】: Jing He ; Xin Li ; Lejian Liao ; Dandan Song ; William K. Cheung

【Abstract】: In this paper, we address the problem of personalized next Point-of-interest (POI) recommendation, which has become an important and very challenging task in location-based social networks (LBSNs) but has not yet been well studied. With the conjecture that, under different contextual scenarios, humans exhibit distinct mobility patterns, we attempt here to jointly model the next POI recommendation under the influence of user's latent behavior pattern. We propose to adopt a third-rank tensor to model the successive check-in behaviors. By incorporating softmax function to fuse the personalized Markov chain with latent pattern, we furnish a Bayesian Personalized Ranking (BPR) approach and derive the optimization criterion accordingly. Expectation Maximization (EM) is then used to estimate the model parameters. Extensive experiments on two large-scale LBSN datasets demonstrate the significant improvements of our model over several state-of-the-art methods.

【Keywords】: Location-based Social Networks, Point-of-Interest Recommendation, Latent Pattern, Tensor

20. VBPR: Visual Bayesian Personalized Ranking from Implicit Feedback.

【Paper Link】【Pages】:144-150

【Authors】: Ruining He ; Julian McAuley

【Abstract】: Modern recommender systems model people and items by discovering or 'teasing apart' the underlying dimensions that encode the properties of items and users' preferences toward them. Critically, such dimensions are uncovered based on user feedback, often in implicit form (such as purchase histories, browsing logs, etc.); in addition, some recommender systems make use of side information, such as product attributes, temporal information, or review text. However, one important feature that is typically ignored by existing personalized recommendation and ranking methods is the visual appearance of the items being considered. In this paper we propose a scalable factorization model to incorporate visual signals into predictors of people's opinions, which we apply to a selection of large, real-world datasets. We make use of visual features extracted from product images using (pre-trained) deep networks, on top of which we learn an additional layer that uncovers the visual dimensions that best explain the variation in people's feedback. This not only leads to significantly more accurate personalized ranking methods, but also helps to alleviate cold start issues, and qualitatively to analyze the visual dimensions that influence people's opinions.

【Keywords】:

21. Improved Neural Machine Translation with SMT Features.

【Paper Link】【Pages】:151-157

【Authors】: Wei He ; Zhongjun He ; Hua Wu ; Haifeng Wang

【Abstract】: Neural machine translation (NMT) conducts end-to-end translation with a source language encoder and a target language decoder, making promising translation performance. However, as a newly emerged approach, the method has some limitations. An NMT system usually has to apply a vocabulary of certain size to avoid the time-consuming training and decoding, thus it causes a serious out-of-vocabulary problem. Furthermore, the decoder lacks a mechanism to guarantee all the source words to be translated and usually favors short translations, resulting in fluent but inadequate translations. In order to solve the above problems, we incorporate statistical machine translation (SMT) features, such as a translation model and an n-gram language model, with the NMT model under the log-linear framework. Our experiments show that the proposed method significantly improves the translation quality of the state-of-the-art NMT system on Chinese-to-English translation tasks. Our method produces a gain of up to 2.33 BLEU score on NIST open test sets.

【Keywords】: Neural Machine Translation; Statistical Machine Translation; Recurrent Neural Network

22. A Scalable Framework to Choose Sellers in E-Marketplaces Using POMDPs.

【Paper Link】【Pages】:158-164

【Authors】: Athirai Aravazhi Irissappane ; Frans A. Oliehoek ; Jie Zhang

【Abstract】: In multiagent e-marketplaces, buying agents need to select good sellers by querying other buyers (called advisors). Partially Observable Markov Decision Processes (POMDPs) have shown to be an effective framework for optimally selecting sellers by selectively querying advisors. However, current solution methods do not scale to hundreds or even tens of agents operating in the e-market. In this paper, we propose the Mixture of POMDP Experts (MOPE) technique, which exploits the inherent structure of trust-based domains, such as the seller selection problem in e-markets, by aggregating the solutions of smaller sub-POMDPs. We propose a number of variants of the MOPE approach that we analyze theoretically and empirically. Experiments show that MOPE can scale up to a hundred agents thereby leveraging the presence of more advisors to significantly improve buyer satisfaction.

【Keywords】: Trust; E-Marketplace; POMDP

【Paper Link】【Pages】:165-171

【Authors】: Yongpo Jia ; Xuemeng Song ; Jingbo Zhou ; Li Liu ; Liqiang Nie ; David S. Rosenblum

【Abstract】: Social networks contain a wealth of useful information. In this paper, we study a challenging task for integrating users' information from multiple heterogeneous social networks to gain a comprehensive understanding of users' interests and behaviors. Although much effort has been dedicated to study this problem, most existing approaches adopt linear or shallow models to fuse information from multiple sources. Such approaches cannot properly capture the complex nature of and relationships among different social networks. Adopting deep learning approaches to learning a joint representation can better capture the complexity, but this neglects measuring the level of confidence in each source and the consistency among different sources. In this paper, we present a framework for multiple social network learning, whose core is a novel model that fuses social networks using deep learning with source confidence and consistency regularization. To evaluate the model, we apply it to predict individuals' tendency to volunteerism. With extensive experimental evaluations, we demonstrate the effectiveness of our model, which outperforms several state-of-the-art approaches in terms of precision, recall and F1-score.

【Keywords】:

24. Detect Overlapping Communities via Ranking Node Popularities.

【Paper Link】【Pages】:172-178

【Authors】: Di Jin ; Hongcui Wang ; Jianwu Dang ; Dongxiao He ; Weixiong Zhang

【Abstract】: Detection of overlapping communities has drawn much attention lately as they are essential properties of real complex networks. Despite its influence and popularity, the well studied and widely adopted stochastic model has not been made effective for finding overlapping communities. Here we extend the stochastic model method to detection of overlapping communities with the virtue of autonomous determination of the number of communities. Our approach hinges upon the idea of ranking node popularities within communities and using a Bayesian method to shrink communities to optimize an objective function based on the stochastic generative model. We evaluated the novel approach, showing its superior performance over five state-of-the-art methods, on large real networks and synthetic networks with ground-truths of overlapping communities.

【Keywords】:

25. Top-N Recommender System via Matrix Completion.

【Paper Link】【Pages】:179-185

【Authors】: Zhao Kang ; Chong Peng ; Qiang Cheng

【Abstract】: Top-N recommender systems have been investigated widely both in industry and academia. However, the recommendation quality is far from satisfactory. In this paper, we propose a simple yet promising algorithm. We fill the user-item matrix based on a low-rank assumption and simultaneously keep the original information. To do that, a nonconvex rank relaxation rather than the nuclear norm is adopted to provide a better rank approximation and an efficient optimization strategy is designed. A comprehensive set of experiments on real datasets demonstrates that our method pushes the accuracy of Top-N recommendation to a new level.

【Keywords】: Top-N recommender system; matrix completion; nonconvex rank relaxation; log-determinant; nuclear norm
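
To illustrate why a nonconvex relaxation can approximate rank more closely than the nuclear norm, the sketch below compares three quantities on a hypothetical list of singular values: the rank, the nuclear norm (sum of singular values), and a log-based surrogate (sum of log(1 + sigma)). The exact surrogate used in the paper is not stated in the abstract, so the log(1 + sigma) form, the function name `rank_surrogates`, and the example spectrum are all assumptions made for illustration.

```python
import math


def rank_surrogates(singular_values, tol=1e-9):
    """Return (rank, nuclear_norm, log_surrogate) for a list of singular values.

    log_surrogate uses sum(log(1 + s)), an assumed log-based relaxation;
    it is not taken from the paper itself.
    """
    rank = sum(1 for s in singular_values if s > tol)
    nuclear_norm = sum(singular_values)
    log_surrogate = sum(math.log1p(s) for s in singular_values)
    return rank, nuclear_norm, log_surrogate


# Hypothetical spectrum of a rank-2 matrix with two large singular values.
rank, nuclear_norm, log_surrogate = rank_surrogates([10.0, 5.0, 0.0, 0.0])
```

In this toy spectrum the nuclear norm grows linearly with the large singular values, while the log surrogate grows only logarithmically and so stays much closer to the true rank.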

26. Robust Text Classification in the Presence of Confounding Bias.

【Paper Link】【Pages】:186-193

【Authors】: Virgile Landeiro Dos Reis ; Aron Culotta

【Abstract】: As text classifiers become increasingly used in real-time applications, it is critical to consider not only their accuracy but also their robustness to changes in the data distribution. In this paper, we consider the case where there is a confounding variable Z that influences both the text features X and the class variable Y. For example, a classifier trained to predict the health status of a user based on their online communications may be confounded by socioeconomic variables. When the influence of Z changes from training to testing data, we find that classifier accuracy can degrade rapidly. Our approach, based on Pearl's back-door adjustment, estimates the underlying effect of a text variable on the class variable while controlling for the confounding variable. Although our goal is prediction, not causal inference, we find that such adjustments are essential to building text classifiers that are robust to confounding variables. On three diverse text classification tasks, we find that covariate adjustment results in higher accuracy than competing baselines over a range of confounding relationships (e.g., in one setting, accuracy improves from 60% to 81%).

【Keywords】: social media; text classification; causal inference
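
As a rough illustration of the back-door adjustment mentioned in the abstract, the sketch below estimates the adjusted quantity sum over z of P(y | x, z) P(z) from discrete (x, z, y) records by simple counting. This is a minimal sketch under stated assumptions, not the authors' classifier: the function name `backdoor_adjust`, the toy records, and the choice to skip strata with no support for x are all hypothetical.

```python
from collections import Counter


def backdoor_adjust(records, x, y):
    """Estimate sum_z P(y | x, z) * P(z) from a list of (x, z, y) tuples."""
    n = len(records)
    z_counts = Counter(z for _, z, _ in records)
    xz_counts = Counter((xi, z) for xi, z, _ in records)
    xzy_counts = Counter(records)
    total = 0.0
    for z, z_count in z_counts.items():
        support = xz_counts[(x, z)]
        if support == 0:
            continue  # assumption: strata where x never occurs are skipped
        total += (xzy_counts[(x, z, y)] / support) * (z_count / n)
    return total


# Hypothetical toy data: x = feature present, z = confounder, y = class label.
records = [
    (1, 0, 1), (1, 0, 0), (0, 0, 0), (0, 0, 0),
    (1, 1, 1), (1, 1, 1), (1, 1, 0), (0, 1, 1),
]
adjusted = backdoor_adjust(records, x=1, y=1)
# Unadjusted conditional P(y=1 | x=1), for comparison.
naive = sum(1 for xi, _, yi in records if xi == 1 and yi == 1) / sum(
    1 for xi, _, _ in records if xi == 1
)
```

The adjusted estimate weights each stratum of Z by its overall frequency rather than by how often it co-occurs with X, which is what makes it less sensitive to a shift in the X–Z relationship.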

27. Predicting the Next Location: A Recurrent Model with Spatial and Temporal Contexts.

【Paper Link】【Pages】:194-200

【Authors】: Qiang Liu ; Shu Wu ; Liang Wang ; Tieniu Tan

【Abstract】: Spatial and temporal contextual information plays a key role in analyzing user behaviors, and is helpful for predicting where a user will go next. With the growing ability to collect information, more and more temporal and spatial contextual information is collected in systems, and the location prediction problem becomes crucial and feasible. Several methods have been proposed to address this problem, but they all have limitations. Factorizing Personalized Markov Chain (FPMC) is constructed based on a strong independence assumption among different factors, which limits its performance. Tensor Factorization (TF) faces the cold start problem in predicting future actions. The Recurrent Neural Network (RNN) model shows promising performance compared with FPMC and TF, but all these methods have problems modeling continuous time intervals and geographical distances. In this paper, we extend RNN and propose a novel method called Spatial Temporal Recurrent Neural Networks (ST-RNN). ST-RNN can model local temporal and spatial contexts in each layer with time-specific transition matrices for different time intervals and distance-specific transition matrices for different geographical distances. Experimental results show that the proposed ST-RNN model yields significant improvements over the competitive compared methods on two typical datasets, i.e., Global Terrorism Database (GTD) and Gowalla dataset.

【Keywords】:

28. Fortune Teller: Predicting Your Career Path.

【Paper Link】【Pages】:201-207

【Authors】: Ye Liu ; Luming Zhang ; Liqiang Nie ; Yan Yan ; David S. Rosenblum

【Abstract】: People go to fortune tellers in hopes of learning things about their future. A future career path is one of the topics most frequently discussed. But rather than rely on "black arts" to make predictions, in this work we scientifically and systematically study the feasibility of career path prediction from social network data. In particular, we seamlessly fuse information from multiple social networks to comprehensively describe a user and characterize progressive properties of his or her career path. This is accomplished via a multi-source learning framework with fused lasso penalty, which jointly regularizes the source and career-stage relatedness. Extensive experiments on real-world data confirm the accuracy of our model.

【Keywords】: Career Path Modeling, multi task learning, multiple social network learning

【Paper Link】【Pages】:208-214

【Authors】: Suhas Ranganath ; Fred Morstatter ; Xia Hu ; Jiliang Tang ; Suhang Wang ; Huan Liu

【Abstract】: Social media has emerged as a popular platform for people to express their viewpoints on political protests like the Arab Spring. Millions of people use social media to communicate and mobilize their viewpoints on protests. Hence, it is a valuable tool for organizing social movements. However, the mechanisms by which protest affects the population are not known, making it difficult to estimate the number of protestors. In this paper, we are inspired by sociological theories of protest participation and propose a framework to predict from the user's past status messages and interactions whether the next post of the user will be a declaration of protest. Drawing concepts from these theories, we model the interplay between the user's status messages and messages interacting with him over time and predict whether the next post of the user will be a declaration of protest. We evaluate the framework using data from the social media platform Twitter on protests during the recent Nigerian elections and demonstrate that it can effectively predict whether the next post of a user is a declaration of protest.

【Keywords】: Protests, Brownian Motion, Political Participation

30. Context-Sensitive Twitter Sentiment Classification Using Neural Network.

【Paper Link】【Pages】:215-221

【Authors】: Yafeng Ren ; Yue Zhang ; Meishan Zhang ; Donghong Ji

【Abstract】: Sentiment classification on Twitter has attracted increasing research in recent years. Most existing work focuses on feature engineering according to the tweet content itself. In this paper, we propose a context-based neural network model for Twitter sentiment analysis, incorporating contextualized features from relevant Tweets into the model in the form of word embedding vectors. Experiments on both balanced and unbalanced datasets show that our proposed models outperform the current state-of-the-art.

【Keywords】: Twitter sentiment classification; neural network; contextual information

31. ClaimEval: Integrated and Flexible Framework for Claim Evaluation Using Credibility of Sources.

【Paper Link】【Pages】:222-228

【Authors】: Mehdi Samadi ; Partha Pratim Talukdar ; Manuela M. Veloso ; Manuel Blum

【Abstract】: The World Wide Web (WWW) has become a rapidly growing platform consisting of numerous sources which provide supporting or contradictory information about claims (e.g., "Chicken meat is healthy"). In order to decide whether a claim is true or false, one needs to analyze the content of different sources of information on the Web, measure the credibility of information sources, and aggregate all this information. This is a tedious process and Web search engines address only part of the overall problem, viz., producing only a list of relevant sources. In this paper, we present ClaimEval, a novel and integrated approach which, given a set of claims to validate, extracts a set of pro and con arguments from the Web information sources, and jointly estimates credibility of sources and correctness of claims. ClaimEval uses Probabilistic Soft Logic (PSL), resulting in a flexible and principled framework which makes it easy to state and incorporate different forms of prior knowledge. Through extensive experiments on real-world datasets, we demonstrate ClaimEval’s capability in determining validity of a set of claims, resulting in improved accuracy compared to state-of-the-art baselines.

【Keywords】: Trust; Claim Evaluation; Information Extraction; Artificial Intelligence; Web Mining; Credibility Assessment

32. On the Effectiveness of Linear Models for One-Class Collaborative Filtering.

【Paper Link】【Pages】:229-235

【Authors】: Suvash Sedhain ; Aditya Krishna Menon ; Scott Sanner ; Darius Braziunas

【Abstract】: In many personalised recommendation problems, there are examples of items users prefer or like, but no examples of items they dislike. A state-of-the-art method for such implicit feedback, or one-class collaborative filtering (OC-CF), problems is SLIM, which makes recommendations based on a learned item-item similarity matrix. While SLIM has been shown to perform well on implicit feedback tasks, we argue that it is hindered by two limitations: first, it does not produce user-personalised predictions, which hampers recommendation performance; second, it involves solving a constrained optimisation problem, which impedes fast training. In this paper, we propose LRec, a variant of SLIM that overcomes these limitations without sacrificing any of SLIM's strengths. At its core, LRec employs linear logistic regression; despite this simplicity, LRec consistently and significantly outperforms all existing methods on a range of datasets. Our results thus illustrate that the OC-CF problem can be effectively tackled via linear classification models.

【Keywords】: One Class Collaborative Filtering, Recommender Systems, PU Learning

33. Supervised Hashing via Uncorrelated Component Analysis.

【Paper Link】【Pages】:236-242

【Authors】: Sungryull Sohn ; Hyunwoo Kim ; Junmo Kim

【Abstract】: The Approximate Nearest Neighbor (ANN) search problem is important in applications such as information retrieval. Several hashing-based search methods that provide effective solutions to the ANN search problem have been proposed. However, most of these focus on similarity preservation and coding error minimization, and pay little attention to optimizing the precision-recall curve or receiver operating characteristic curve. In this paper, we propose a novel projection-based hashing method that attempts to maximize the precision and recall. We first introduce an uncorrelated component analysis (UCA) by examining the precision and recall, and then propose a UCA-based hashing method. The proposed method is evaluated with a variety of datasets. The results show that UCA-based hashing outperforms state-of-the-art methods, and has computationally efficient training and encoding processes.

【Keywords】: Hashing; Information Retrieval

34. Commonsense in Parts: Mining Part-Whole Relations from the Web and Image Tags.

【Paper Link】【Pages】:243-250

【Authors】: Niket Tandon ; Charles Hariman ; Jacopo Urbani ; Anna Rohrbach ; Marcus Rohrbach ; Gerhard Weikum

【Abstract】: Commonsense knowledge about part-whole relations (e.g., screen partOf notebook) is important for interpreting user input in web search and question answering, or for object detection in images. Prior work on knowledge base construction has compiled part-whole assertions, but with substantial limitations: i) semantically different kinds of part-whole relations are conflated into a single generic relation, ii) the arguments of a part-whole assertion are merely words with ambiguous meaning, iii) the assertions lack additional attributes like visibility (e.g., a nose is visible but a kidney is not) and cardinality information (e.g., a bird has two legs while a spider eight), iv) limited coverage of only tens of thousands of assertions. This paper presents a new method for automatically acquiring part-whole commonsense from Web contents and image tags at an unprecedented scale, yielding many millions of assertions, while specifically addressing the four shortcomings of prior work. Our method combines pattern-based information extraction methods with logical reasoning. We carefully distinguish different relations: physicalPartOf, memberOf, substanceOf. We consistently map the arguments of all assertions onto WordNet senses, eliminating the ambiguity of word-level assertions. We identify whether the parts can be visually perceived, and infer cardinalities for the assertions. The resulting commonsense knowledge base has very high quality and high coverage, with an accuracy of 89% determined by extensive sampling, and is publicly available.

【Keywords】: part whole knowledge; commonsense knowledge; knowledge bases

【Paper Link】【Pages】:251-257

【Authors】: Jiliang Tang ; Suhang Wang ; Xia Hu ; Dawei Yin ; Yingzhou Bi ; Yi Chang ; Huan Liu

【Abstract】: The pervasive presence of social media greatly enriches online users' social activities, resulting in abundant social relations. Social relations provide an independent source for recommendation, bringing about new opportunities for recommender systems. Exploiting social relations to improve recommendation performance has attracted a great deal of attention in recent years. Most existing social recommender systems treat social relations homogeneously and make use of direct connections (or strong dependency connections). However, connections in online social networks are intrinsically heterogeneous and are a composite of various relations. While connected users in online social networks form groups, and users in a group share similar interests, weak dependency connections are established among these users when they are not directly connected. In this paper, we investigate how to exploit the heterogeneity of social relations and weak dependency connections for recommendation. In particular, we employ social dimensions to simultaneously capture heterogeneity of social relations and weak dependency connections, provide principled ways to model social dimensions, and propose a recommendation framework, SoDimRec, which incorporates heterogeneity of social relations and weak dependency connections based on social dimensions. Experimental results on real-world data sets demonstrate the effectiveness of the proposed framework. We conduct further experiments to understand the important role of social dimensions in the proposed framework.

【Keywords】: recommendation; social recommendation; weak ties; heterogeneous social relations

36. Column-Oriented Datalog Materialization for Large Knowledge Graphs.

【Paper Link】【Pages】:258-264

【Authors】: Jacopo Urbani ; Ceriel J. H. Jacobs ; Markus Krötzsch

【Abstract】: The evaluation of Datalog rules over large Knowledge Graphs (KGs) is essential for many applications. In this paper, we present a new method of materializing Datalog inferences, which combines a column-based memory layout with novel optimization methods that avoid redundant inferences at runtime. The pro-active caching of certain subqueries further increases efficiency. Our empirical evaluation shows that this approach can often match or even surpass the performance of state-of-the-art systems, especially under restricted resources.

【Keywords】:

37. Semantic Community Identification in Large Attribute Networks.

【Paper Link】【Pages】:265-271

【Authors】: Xiao Wang ; Di Jin ; Xiaochun Cao ; Liang Yang ; Weixiong Zhang

【Abstract】: Identification of modular or community structures of a network is a key to understanding the semantics and functions of the network. While many network community detection methods have been developed, which primarily explore network topologies, they provide little semantic information of the communities discovered. Although structures and semantics are closely related, little effort has been made to discover and analyze these two essential network properties together. By integrating network topology and semantic information on nodes, e.g., node attributes, we study the problems of detection of communities and inference of their semantics simultaneously. We propose a novel nonnegative matrix factorization (NMF) model with two sets of parameters, the community membership matrix and community attribute matrix, and present efficient updating rules to evaluate the parameters with a convergence guarantee. The use of node attributes improves upon community detection and provides a semantic interpretation to the resultant network communities. Extensive experimental results on synthetic and real-world networks not only show the superior performance of the new method over the state-of-the-art approaches, but also demonstrate its ability to semantically annotate the communities.

【Keywords】:

【Paper Link】【Pages】:272-278

【Authors】: Bo Wu ; Tao Mei ; Wen-Huang Cheng ; Yongdong Zhang

【Abstract】: Time information plays a crucial role in social media popularity. Existing research on popularity prediction, though effective, ignores temporal information that is highly related to user-item associations, and thus often achieves limited success. It is essential to consider all of these factors (user, item, and time), which together capture the dynamic nature of photo popularity. In this paper, we present a novel approach that factorizes popularity into user-item context and time-sensitive context to explore the mechanism of dynamic popularity. The user-item context provides a holistic view of popularity, while the time-sensitive context captures the temporal dynamics of popularity. Accordingly, we develop two kinds of time-sensitive features: user activeness variability and photo prevalence variability. To predict photo popularity, we propose a novel framework named Multi-scale Temporal Decomposition (MTD), which decomposes the popularity matrix in latent spaces based on contextual associations. Specifically, the proposed MTD models time-sensitive context on different time scales, which is beneficial for automatically learning temporal patterns. In experiments conducted on a real-world dataset with 1.29M photos from Flickr, our proposed MTD achieves a prediction accuracy of 79.8% and outperforms the best three state-of-the-art methods with a relative improvement of 9.6% on average.

【Keywords】: Popularity; Prediction; Social Media; Temporal Dynamics; Multi-scale Decomposition

【Paper Link】【Pages】:279-286

【Authors】: Le Wu ; Yong Ge ; Qi Liu ; Enhong Chen ; Bai Long ; Zhenya Huang

【Abstract】: Researchers have long agreed that the evolution of a Social Networking Service (SNS) platform is driven by the interplay between users' preferences (reflected in user-item consumption behavior) and the social network structure (reflected in user-user interaction behavior), with both kinds of user behavior changing over time. However, traditional approaches either modeled these two kinds of behaviors in an isolated way or relied on a static assumption of an SNS. Thus, it is still unclear how users' historical preferences and the dynamic social network structure affect the evolution of SNSs. Furthermore, can jointly modeling users' temporal behaviors in SNSs benefit both behavior prediction tasks? In this paper, we leverage the underlying social theories (i.e., social influence and the homophily effect) to investigate the interplay and evolution of SNSs. We propose a probabilistic approach to fuse these social theories for jointly modeling users' temporal behaviors in SNSs. Thus our proposed model has both explanatory ability and predictive power. Experimental results on two real-world datasets demonstrate the effectiveness of our proposed model.

【Keywords】: Social Networking, User Preference, Evolution

40. Cross-Lingual Taxonomy Alignment with Bilingual Biterm Topic Model.

【Paper Link】【Pages】:287-293

【Authors】: Tianxing Wu ; Guilin Qi ; Haofen Wang ; Kang Xu ; Xuan Cui

【Abstract】: As more and more multilingual knowledge becomes available on the Web, knowledge sharing across languages has become an important task to benefit many applications. One of the most crucial kinds of knowledge on the Web is taxonomy, which is used to organize and classify the Web data. To facilitate knowledge sharing across languages, we need to deal with the problem of cross-lingual taxonomy alignment, which discovers the most relevant category in the target taxonomy of one language for each category in the source taxonomy of another language. Current approaches for aligning cross-lingual taxonomies strongly rely on domain-specific information and the features based on string similarities. In this paper, we present a new approach to deal with the problem of cross-lingual taxonomy alignment without using any domain-specific information. We first identify the candidate matched categories in the target taxonomy for each category in the source taxonomy using the cross-lingual string similarity. We then propose a novel bilingual topic model, called Bilingual Biterm Topic Model (BiBTM), to perform exact matching. BiBTM is trained by the textual contexts extracted from the Web. We conduct experiments on two kinds of real world datasets. The experimental results show that our approach significantly outperforms the designed state-of-the-art comparison methods.

【Keywords】: Cross-lingual Taxonomy Alignment; Bilingual Biterm Topic Model; Vector Similarities

【Paper Link】【Pages】:294-300

【Authors】: Liang Xie ; Jialie Shen ; Lei Zhu

【Abstract】: Cross-modal hashing (CMH) is an efficient technique for the fast retrieval of web image data, and it has gained a lot of attention recently. However, traditional CMH methods usually apply batch learning for generating hash functions and codes. They are inefficient for the retrieval of web images, which usually arrive in a streaming fashion. Online learning can be exploited for CMH, but existing online hashing methods still cannot solve two essential problems: efficient updating of hash codes and analysis of cross-modal correlation. In this paper, we propose Online Cross-modal Hashing (OCMH), which can effectively address the above two problems by learning the shared latent codes (SLC). In OCMH, hash codes can be represented by the permanent SLC and a dynamic transfer matrix. Therefore, inefficient updating of hash codes is transformed into the efficient updating of the SLC and transfer matrix, and the time complexity is independent of the database size. Moreover, the SLC is shared by all the modalities, and thus it can encode the latent cross-modal correlation, which further improves the overall cross-modal correlation between heterogeneous data. Experimental results on two real-world multi-modal web image datasets, MIR Flickr and NUS-WIDE, demonstrate the effectiveness and efficiency of OCMH for online cross-modal web image retrieval.

【Keywords】:

42. Understanding Emerging Spatial Entities.

【Paper Link】【Pages】:301-307

【Authors】: Jinyoung Yeo ; Jin-Woo Park ; Seung-won Hwang

【Abstract】: In Foursquare or Google+ Local, emerging spatial entities, such as new businesses or venues, are reported to grow by 1% every day. As information on such spatial entities is initially limited (e.g., only a name), we need to quickly harvest related information from social media such as Flickr photos. In particular, achieving high recall in photo population is essential for emerging spatial entities, which suffer from data sparseness (e.g., 71% of TripAdvisor restaurants in Seattle do not have any photo, as of Sep 03, 2015). Our goal is thus to address this limitation by identifying effective linking techniques for emerging spatial entities and photos. Compared with state-of-the-art baselines, our proposed approach improves recall and F1 score by up to 24% and 18%, respectively. To show the effectiveness and robustness of our approach, we have conducted extensive experiments in three different cities, Seattle, Washington D.C., and Taipei, of varying characteristics such as geographical density and language.

【Keywords】: entity linking; photo harvesting; emerging entity

43. Building a Large Scale Dataset for Image Emotion Recognition: The Fine Print and The Benchmark.

【Paper Link】【Pages】:308-314

【Authors】: Quanzeng You ; Jiebo Luo ; Hailin Jin ; Jianchao Yang

【Abstract】: Psychological research results have confirmed that people can have different emotional reactions to different visual stimuli. Several papers have been published on the problem of visual emotion analysis. In particular, attempts have been made to analyze and predict people's emotional reaction towards images. To this end, different kinds of hand-tuned features are proposed. The results reported on several carefully selected and labeled small image data sets have confirmed the promise of such features. While the recent successes of many computer vision related tasks are due to the adoption of Convolutional Neural Networks (CNNs), visual emotion analysis has not achieved the same level of success. This may be primarily due to the unavailability of confidently labeled and relatively large image data sets for visual emotion analysis. In this work, we introduce a new data set, which started from 3+ million weakly labeled images of different emotions and ended up 30 times as large as the current largest publicly available visual emotion data set. We hope that this data set encourages further research on visual emotion analysis. We also perform extensive benchmarking analyses on this large data set using the state of the art methods including CNNs.

【Keywords】: Image Emotion; Dataset; Benchmark; Deep Learning

44. STELLAR: Spatial-Temporal Latent Ranking for Successive Point-of-Interest Recommendation.

【Paper Link】【Pages】:315-322

【Authors】: Shenglin Zhao ; Tong Zhao ; Haiqin Yang ; Michael R. Lyu ; Irwin King

【Abstract】: Successive point-of-interest (POI) recommendation in location-based social networks (LBSNs) has become a significant task since it helps users to navigate a number of candidate POIs and provides the best POI recommendations based on users’ most recent check-in knowledge. However, all existing methods for successive POI recommendation focus only on modeling the correlation between POIs based on users’ check-in sequences, and ignore the important fact that successive POI recommendation is a time-subtle recommendation task. In fact, even with the same previous check-in information, users would prefer different successive POIs at different times. To capture the impact of time on successive POI recommendation, in this paper, we propose a spatial-temporal latent ranking (STELLAR) method to explicitly model the interactions among user, POI, and time. In particular, the proposed STELLAR model is built upon a ranking-based pairwise tensor factorization framework with a fine-grained modeling of user-POI, POI-time, and POI-POI interactions for successive POI recommendation. Moreover, we propose a new interval-aware weight utility function to differentiate successive check-ins’ correlations, which breaks the time interval constraint in prior work. Evaluations on two real-world datasets demonstrate that the STELLAR model outperforms state-of-the-art successive POI recommendation models by about 20% in Precision@5 and Recall@5.

【Keywords】: latent ranking; point-of-interest recommendation

Technical Papers: Cognitive Modeling and Cognitive Systems 2

45. Learning the Preferences of Ignorant, Inconsistent Agents.

【Paper Link】【Pages】:323-329

【Authors】: Owain Evans ; Andreas Stuhlmüller ; Noah D. Goodman

【Abstract】: An important use of machine learning is to learn what people value. What posts or photos should a user be shown? Which jobs or activities would a person find rewarding? In each case, observations of people's past choices can inform our inferences about their likes and preferences. If we assume that choices are approximately optimal according to some utility function, we can treat preference inference as Bayesian inverse planning. That is, given a prior on utility functions and some observed choices, we invert an optimal decision-making process to infer a posterior distribution on utility functions. However, people often deviate from approximate optimality. They have false beliefs, their planning is sub-optimal, and their choices may be temporally inconsistent due to hyperbolic discounting and other biases. We demonstrate how to incorporate these deviations into algorithms for preference inference by constructing generative models of planning for agents who are subject to false beliefs and time inconsistency. We explore the inferences these models make about preferences, beliefs, and biases. We present a behavioral experiment in which human subjects perform preference inference given the same observations of choices as our model. Results show that human subjects (like our model) explain choices in terms of systematic deviations from optimal behavior and suggest that they take such deviations into account when inferring preferences.

【Keywords】: Bayesian learning, cognitive biases, preference inference
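
The temporal inconsistency that the abstract attributes to hyperbolic discounting can be shown with a small worked example. The sketch below uses the standard hyperbolic form, value = utility / (1 + k * delay); the discount rate `k`, the two options, and the helper names are hypothetical choices for illustration and are not taken from the paper's generative models.

```python
def hyperbolic_value(utility, delay, k=1.0):
    """Discounted value under hyperbolic discounting: utility / (1 + k * delay)."""
    return utility / (1.0 + k * delay)


def preferred(options, wait, k=1.0):
    """Return the name of the option with the highest discounted value.

    options: list of (name, utility, delay); wait is added to every delay,
    modelling an agent who evaluates the same choice from further away.
    """
    best = max(options, key=lambda o: hyperbolic_value(o[1], o[2] + wait, k))
    return best[0]


# Hypothetical choice: a smaller reward now versus a larger reward 3 steps later.
options = [("small_soon", 10.0, 0), ("large_late", 15.0, 3)]
choice_from_afar = preferred(options, wait=10)  # both rewards are still distant
choice_up_close = preferred(options, wait=0)    # the small reward is immediate
```

Viewed from a distance the agent prefers the larger, later reward, but once the smaller reward becomes immediate the preference flips. An observer who assumes consistent planning would misread such a reversal as a change in underlying preferences.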

46. Egocentric Video Search via Physical Interactions.

【Paper Link】【Pages】:330-337

【Authors】: Taiki Miyanishi ; Junichiro Hirayama ; Quan Kong ; Takuya Maekawa ; Hiroki Moriya ; Takayuki Suyama

【Abstract】: Retrieving past egocentric videos about personal daily life is important to support and augment human memory. Most previous retrieval approaches have ignored the crucial feature of human-physical world interactions, which is greatly related to our memory and experience of daily activities. In this paper, we propose a gesture-based egocentric video retrieval framework, which retrieves past visual experience using body gestures as non-verbal queries. We use a probabilistic framework based on a canonical correlation analysis that models physical interactions through a latent space and uses them for egocentric video retrieval and re-ranking search results. By incorporating physical interactions into the retrieval models, we address the problems resulting from the variability of human motions. We evaluate our proposed method on motion and egocentric video datasets about daily activities in household settings and demonstrate that our egocentric video retrieval framework robustly improves retrieval performance when retrieving past videos from personal and even other persons' video archives.

【Keywords】: Memory Augmentation; Lifelog; Egocentric Video Search

Technical Papers: Computational Sustainability and AI 2

47. Learning Deep Representation from Big and Heterogeneous Data for Traffic Accident Inference.

【Paper Link】【Pages】:338-344

【Authors】: Quanjun Chen ; Xuan Song ; Harutoshi Yamada ; Ryosuke Shibasaki

【Abstract】: With the rapid development of urbanization and public transportation systems, the number of traffic accidents has significantly increased globally over the past decades and has become a major problem for human society. Given these possible and unexpected traffic accidents, understanding what causes traffic accidents and raising early alarms for possible ones will play a critical role in planning effective traffic management. However, due to the lack of supporting sensing data, research on updating traffic accident risk in real time is very limited. Therefore, in this paper, we collect big and heterogeneous data (7 months of traffic accident data and 1.6 million users' GPS records) to understand how human mobility affects traffic accident risk. By mining these data, we develop a deep model based on a stacked denoising autoencoder to learn hierarchical feature representations of human mobility. These features are then used for efficient prediction of traffic accident risk level. Once the model has been trained, it can simulate the corresponding traffic accident risk map given real-time input of human mobility. The experimental results demonstrate the efficiency of our model and suggest that traffic accident risk can be significantly more predictable through human mobility.

【Keywords】: Traffic Accident; Human Mobility; Urban Computing

48. Autonomous Electricity Trading Using Time-of-Use Tariffs in a Competitive Market.

【Paper Link】【Pages】:345-352

【Authors】: Daniel Urieli ; Peter Stone

【Abstract】: This paper studies the impact of Time-Of-Use (TOU) tariffs in a competitive electricity marketplace. Specifically, it focuses on the question of how an autonomous broker agent should optimize TOU tariffs in a competitive retail market, and what the impact of such tariffs is on the economy. We formalize the problem of TOU tariff optimization and propose an algorithm for approximating its solution. We extensively experiment with our algorithm in a large-scale, detailed electricity retail markets simulation of the Power Trading Agent Competition (Power TAC) and: 1) find that our algorithm results in 15% peak-demand reduction, 2) find that its peak-flattening results in greater profit and/or profit-share for the broker and allows it to win against the 1st and 2nd place brokers from the Power TAC 2014 finals, and 3) analyze several economic implications of using TOU tariffs in competitive retail markets.

【Keywords】: Autonomous agents; Autonomous Electricity Trading; Smart Grid; Power Trading Agents; Time of Use Electricity Pricing

Technical Papers: Game Playing and Interactive Entertainment 2

49. Reuse of Neural Modules for General Video Game Playing.

【Paper Link】【Pages】:353-359

【Authors】: Alexander Braylan ; Mark Hollenbeck ; Elliot Meyerson ; Risto Miikkulainen

【Abstract】: A general approach to knowledge transfer is introduced in which an agent controlled by a neural network adapts how it reuses existing networks as it learns in a new domain. Networks trained for a new domain can improve their performance by routing activation selectively through previously learned neural structure, regardless of how or for what it was learned. A neuroevolution implementation of this approach is presented with application to high-dimensional sequential decision-making domains. This approach is more general than previous approaches to neural transfer for reinforcement learning. It is domain-agnostic and requires no prior assumptions about the nature of task relatedness or mappings. The method is analyzed in a stochastic version of the Arcade Learning Environment, demonstrating that it improves performance in some of the more complex Atari 2600 games, and that the success of transfer can be predicted based on a high-level characterization of game dynamics.

【Keywords】: Game Playing and Interactive Entertainment; General Game Playing; Neural Networks; Evolutionary Computation; Transfer Learning; Neural Reuse; Reinforcement Learning; General Video Game Playing; Neuroevolution; Knowledge Transfer

50. Poker-CNN: A Pattern Learning Strategy for Making Draws and Bets in Poker Games Using Convolutional Networks.

【Paper Link】【Pages】:360-368

【Authors】: Nikolai Yakovenko ; Liangliang Cao ; Colin Raffel ; James Fan

【Abstract】: Poker is a family of card games that includes many variations. We hypothesize that most poker games can be solved as a pattern matching problem, and propose creating a strong poker playing system based on a unified poker representation. Our poker player learns through iterative self-play, and improves its understanding of the game by training on the results of its previous actions without sophisticated domain knowledge. We evaluate our system on three poker games: single player video poker, two-player Limit Texas Hold’em, and finally two-player 2-7 triple draw poker. We show that our model can quickly learn patterns in these very different poker games while it improves from zero knowledge to a competitive player against human experts. The contributions of this paper include: (1) a novel representation for poker games, extendable to different poker variations, (2) a Convolutional Neural Network (CNN) based learning model that can effectively learn the patterns in three different games, and (3) a self-trained system that significantly beats the heuristic-based program on which it is trained, and our system is competitive against human expert players.

【Keywords】: deep learning; artificial intelligence; neural network; convolutional network; game theory; poker

Technical Papers: Game Theory and Economic Paradigms 42

51. Computing Possible and Necessary Equilibrium Actions (and Bipartisan Set Winners).

【Paper Link】【Pages】:369-375

【Authors】: Markus Brill ; Rupert Freeman ; Vincent Conitzer

【Abstract】: In many multiagent environments, a designer has some, but limited, control over the game being played. In this paper, we formalize this by considering incompletely specified games, in which some entries of the payoff matrices can be chosen from a specified set. We show that it is NP-hard for the designer to make these choices optimally, even in zero-sum games. In fact, it is already intractable to decide whether a given action is (potentially or necessarily) played in equilibrium. We also consider incompletely specified symmetric games in which all completions are required to be symmetric. Here, hardness holds even in weak tournament games (symmetric zero-sum games whose entries are all -1, 0, or 1) and in tournament games (symmetric zero-sum games whose non-diagonal entries are all -1 or 1). The latter result settles the complexity of the possible and necessary winner problems for a social-choice-theoretic solution concept known as the bipartisan set. We finally give a mixed-integer linear programming formulation for weak tournament games and evaluate it experimentally.

【Keywords】: incomplete games, essential set, bipartisan set

52. From Duels to Battlefields: Computing Equilibria of Blotto and Other Games.

【Paper Link】【Pages】:376-382

【Authors】: AmirMahdi Ahmadinejad ; Sina Dehghani ; MohammadTaghi Hajiaghayi ; Brendan Lucier ; Hamid Mahini ; Saeed Seddighin

【Abstract】: We study the problem of computing Nash equilibria of zero-sum games. Many natural zero-sum games have exponentially many strategies, but highly structured payoffs. For example, in the well-studied Colonel Blotto game (introduced by Borel in 1921), players must divide a pool of troops among a set of battlefields with the goal of winning (i.e., having more troops in) a majority. The Colonel Blotto game is commonly used for analyzing a wide range of applications from the U.S. presidential election, to innovative technology competitions, to advertisement, to sports. However, because of the size of the strategy space, standard methods for computing equilibria of zero-sum games fail to be computationally feasible. Indeed, despite its importance, only a few solutions for special variants of the problem are known. In this paper we show how to compute equilibria of Colonel Blotto games. Moreover, our approach takes the form of a general reduction: to find a Nash equilibrium of a zero-sum game, it suffices to design a separation oracle for the strategy polytope of any bilinear game that is payoff-equivalent. We then apply this technique to obtain the first polytime algorithms for a variety of games. In addition to Colonel Blotto, we also show how to compute equilibria in an infinite-strategy variant called the General Lotto game; this involves showing how to prune the strategy space to a finite subset before applying our reduction. We also consider the class of dueling games, first introduced by Immorlica et al. (2011). We show that our approach provably extends the class of dueling games for which equilibria can be computed: we introduce a new dueling game, the matching duel, on which prior methods fail to be computationally feasible but upon which our reduction can be applied.

【Keywords】: Blotto; Dueling Games; Nash Equilibrium

53. Maximizing Revenue with Limited Correlation: The Cost of Ex-Post Incentive Compatibility.

【Paper Link】【Pages】:383-389

【Authors】: Michael Albert ; Vincent Conitzer ; Giuseppe Lopomo

【Abstract】: In a landmark paper in the mechanism design literature, Cremer and McLean (1985) (CM for short) show that when a bidder’s valuation is correlated with an external signal, a monopolistic seller is able to extract the full social surplus as revenue. In the original paper and subsequent literature, the focus has been on ex-post incentive compatible (or IC) mechanisms, where truth telling is an ex-post Nash equilibrium. In this paper, we explore the implications of Bayesian versus ex-post IC in a correlated valuation setting. We generalize the full extraction result to settings that do not satisfy the assumptions of CM. In particular, we give necessary and sufficient conditions for full extraction that strictly relax the original conditions given in CM. These more general conditions characterize the situations under which requiring ex-post IC leads to a decrease in expected revenue relative to Bayesian IC. We also demonstrate that the expected revenue from the optimal ex-post IC mechanism guarantees at most a (|Θ| + 1)/4 approximation to that of a Bayesian IC mechanism, where |Θ| is the number of bidder types. Finally, using techniques from automated mechanism design, we show that, for randomly generated distributions, the average expected revenue achieved by Bayesian IC mechanisms is significantly larger than that for ex-post IC mechanisms.

【Keywords】: Mechanism Design; Interdependent Values; Limited Correlation; Automated Mechanism Design; Prior Dependent Mechanism Design

【Paper Link】【Pages】:390-396

【Authors】: Elliot Anshelevich ; Shreyas Sekar

【Abstract】: We study the Maximum Weighted Matching problem in a partial information setting where the agents' utilities for being matched to other agents are hidden and the mechanism only has access to ordinal preference information. Our model is motivated by the fact that in many settings, agents cannot express the numerical values of their utility for different outcomes, but are still able to rank the outcomes in their order of preference. Specifically, we study problems where the ground truth exists in the form of a weighted graph, and look to design algorithms that approximate the true optimum matching using only the preference orderings for each agent (induced by the hidden weights) as input. If no restrictions are placed on the weights, then one cannot hope to do better than the simple greedy algorithm, which yields a half optimal matching. Perhaps surprisingly, we show that by imposing a little structure on the weights, we can improve upon the trivial algorithm significantly: we design a 1.6-approximation algorithm for instances where the hidden weights obey the metric inequality. Our algorithm is obtained using a simple but powerful framework that allows us to combine greedy and random techniques in unconventional ways. These results are the first non-trivial ordinal approximation algorithms for such problems, and indicate that we can design robust matchings even when we are agnostic to the precise agent utilities.

【Keywords】: Ordinal Information, Matching, Approximation Algorithm
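The half-optimality of the greedy algorithm mentioned in the abstract above can be checked on a toy instance. The following is a minimal Python sketch, not the authors' ordinal algorithm: `greedy_matching` and `optimal_matching` are hypothetical helper names, the four-agent path graph and its weights are made up for illustration, and the greedy routine reads the weights directly instead of working from preference orderings.

```python
from itertools import combinations


def greedy_matching(weights):
    """Repeatedly take the heaviest edge whose endpoints are both still free."""
    matched, total = set(), 0
    for (u, v), w in sorted(weights.items(), key=lambda kv: -kv[1]):
        if u not in matched and v not in matched:
            matched.update((u, v))
            total += w
    return total


def optimal_matching(weights):
    """Brute-force the maximum-weight matching (only viable for tiny graphs)."""
    edges = list(weights)
    best = 0
    for r in range(1, len(edges) + 1):
        for subset in combinations(edges, r):
            nodes = [x for edge in subset for x in edge]
            if len(nodes) == len(set(nodes)):  # no agent is matched twice
                best = max(best, sum(weights[edge] for edge in subset))
    return best


# Hypothetical instance: a path a-b-c-d whose middle edge is the heaviest.
WEIGHTS = {("a", "b"): 4, ("b", "c"): 5, ("c", "d"): 4}
GREEDY = greedy_matching(WEIGHTS)
OPTIMAL = optimal_matching(WEIGHTS)
```

On this instance greedy commits to the middle edge and blocks both outer edges, so it falls short of the optimum while staying within the factor of two.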

55. Strategyproof Peer Selection: Mechanisms, Analyses, and Experiments.

【Paper Link】【Pages】:397-403

【Authors】: Haris Aziz ; Omer Lev ; Nicholas Mattei ; Jeffrey S. Rosenschein ; Toby Walsh

【Abstract】: We study an important crowdsourcing setting where agents evaluate one another and, based on these evaluations, a subset of agents are selected. This setting is ubiquitous when peer review is used for distributing awards in a team, allocating funding to scientists, and selecting publications for conferences. The fundamental challenge when applying crowdsourcing in these settings is that agents may misreport their reviews of others to increase their chances of being selected. We propose a new strategyproof (impartial) mechanism called Dollar Partition that satisfies desirable axiomatic properties. We then show, using a detailed experiment with parameter values derived from target real world domains, that our mechanism performs better on average, and in the worst case, than other strategyproof mechanisms in the literature.

【Keywords】: Computational Social Choice, Peer Selection, Mechanism Design

56. A Security Game Combining Patrolling and Alarm-Triggered Responses Under Spatial and Detection Uncertainties.

【Paper Link】【Pages】:404-410

【Authors】: Nicola Basilico ; Giuseppe De Nittis ; Nicola Gatti

【Abstract】: Motivated by a number of security applications, border patrolling among them, we study what is, to the best of our knowledge, the first Security Game model in which patrolling strategies need to be combined with responses to signals raised by an alarm system, which is spatially uncertain (i.e., it is uncertain about the exact location where the attack is ongoing) and is affected by false negatives (i.e., the missed detection rate of an attack may be positive). Ours is an infinite-horizon patrolling scenario on a graph, where a single patroller moves. We study the properties of the game model in terms of computational issues and the form of the optimal strategies, and we provide an approach to solve it. Finally, we provide an experimental analysis of our techniques.

【Keywords】: algorithmic game theory; security games

57. Learning Market Parameters Using Aggregate Demand Queries.

【Paper Link】【Pages】:411-417

【Authors】: Xiaohui Bei ; Wei Chen ; Jugal Garg ; Martin Hoefer ; Xiaoming Sun

【Abstract】: We study efficient algorithms for a natural learning problem in markets. There is one seller with m divisible goods and n buyers with unknown individual utility functions and budgets of money. The seller can repeatedly announce prices and observe aggregate demand bundles requested by the buyers. The goal of the seller is to learn the utility functions and budgets of the buyers. Our scenario falls into the classic domain of "revealed preference" analysis. Problems with revealed preference have recently started to attract increased interest in computer science due to their fundamental nature in understanding customer behavior in electronic markets. The goal of revealed preference analysis is to observe rational agent behavior, to explain it using a suitable model for the utility functions, and to predict future agent behavior. Our results are the first polynomial-time algorithms to learn utility and budget parameters via revealed preference queries in classic Fisher markets with multiple buyers. Our analysis concentrates on linear, CES, and Leontief markets, which are the most prominent classes studied in the literature. Some of our results extend to general Arrow-Debreu exchange markets.

【Keywords】: Fisher market; exchange market; revealed preference; query complexity

58. An Algorithmic Framework for Strategic Fair Division.

【Paper Link】【Pages】:418-424

【Authors】: Simina Brânzei ; Ioannis Caragiannis ; David Kurokawa ; Ariel D. Procaccia

【Abstract】: We study the paradigmatic fair division problem of fairly allocating a divisible good among agents with heterogeneous preferences, commonly known as cake cutting. Classic cake cutting protocols are susceptible to manipulation. Do their strategic outcomes still guarantee fairness? To address this question we adopt a novel algorithmic approach, proposing a concrete computational model and reasoning about the game-theoretic properties of algorithms that operate in this model. Specifically, we show that each protocol in the class of generalized cut and choose (GCC) protocols, which includes the most important discrete cake cutting protocols, is guaranteed to have approximate subgame perfect Nash equilibria, or even exact equilibria if the protocol's tie-breaking rule is flexible. We further observe that the (approximate) equilibria of proportional protocols, which guarantee each of the n agents a 1/n-fraction of the cake, must be (approximately) proportional, thereby answering the above question in the positive (at least for one common notion of fairness).

【Keywords】: fair division; equilibrium analysis; game theory; cake cutting; algorithms
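For readers unfamiliar with the setting, the classic two-agent cut-and-choose protocol, the simplest member of the family the abstract above generalizes, can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the paper's GCC model: both agents are assumed truthful, valuations are piecewise-constant densities over equal-width intervals of [0, 1] that integrate to 1, and the helper names `value`, `half_point`, and `cut_and_choose` are hypothetical.

```python
from fractions import Fraction


def value(density, a, b):
    """Value of the interval [a, b] under a piecewise-constant density on [0, 1]."""
    width = Fraction(1, len(density))
    total = Fraction(0)
    for i, d in enumerate(density):
        lo, hi = i * width, (i + 1) * width
        overlap = max(Fraction(0), min(b, hi) - max(a, lo))
        total += d * overlap
    return total


def half_point(density):
    """Leftmost cut x such that the cutter values [0, x] at exactly 1/2."""
    width = Fraction(1, len(density))
    acc = Fraction(0)
    for i, d in enumerate(density):
        mass = d * width
        if d > 0 and acc + mass >= Fraction(1, 2):
            return i * width + (Fraction(1, 2) - acc) / d
        acc += mass
    raise ValueError("density must integrate to 1")


def cut_and_choose(cutter, chooser):
    """The cutter halves the cake by its own valuation; the chooser picks first."""
    x = half_point(cutter)
    left, right = (Fraction(0), x), (x, Fraction(1))
    chooser_piece = left if value(chooser, *left) >= value(chooser, *right) else right
    cutter_piece = right if chooser_piece is left else left
    return x, value(cutter, *cutter_piece), value(chooser, *chooser_piece)


# Hypothetical valuations: the cutter only values the left half of the cake,
# the chooser only values the right half.
CUT, CUTTER_VALUE, CHOOSER_VALUE = cut_and_choose([2, 0], [0, 2])
```

With truthful play each agent receives at least half of the cake by its own valuation, which is the proportionality guarantee for n = 2.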

59. One Size Does Not Fit All: A Game-Theoretic Approach for Dynamically and Effectively Screening for Threats.

【Paper Link】【Pages】:425-431

【Authors】: Matthew Brown ; Arunesh Sinha ; Aaron Schlenker ; Milind Tambe

【Abstract】: An effective way of preventing attacks in secure areas is to screen for threats (people, objects) before entry, e.g., screening of airport passengers. However, screening every entity at the same level may be both ineffective and undesirable. The challenge then is to find a dynamic approach for randomized screening, allowing for more effective use of limited screening resources, leading to improved security. We address this challenge with the following contributions: (1) a threat screening game (TSG) model for general screening domains; (2) an NP-hardness proof for computing the optimal strategy of TSGs; (3) a scheme for decomposing TSGs into subgames to improve scalability; (4) a novel algorithm that exploits a compact game representation to efficiently solve TSGs, providing the optimal solution under certain conditions; and (5) an empirical comparison of our proposed algorithm against the current state-of-the-art optimal approach for large-scale game-theoretic resource allocation problems.

【Keywords】: game theory; security; optimization; allocation

60. Strategy-Based Warm Starting for Regret Minimization in Games.

【Paper Link】【Pages】:432-438

【Authors】: Noam Brown ; Tuomas Sandholm

【Abstract】: Counterfactual Regret Minimization (CFR) is a popular iterative algorithm for approximating Nash equilibria in imperfect-information multi-step two-player zero-sum games. We introduce the first general, principled method for warm starting CFR. Our approach requires only a strategy for each player, and accomplishes the warm start at the cost of a single traversal of the game tree. The method provably warm starts CFR to as many iterations as it would have taken to reach a strategy profile of the same quality as the input strategies, and does not alter the convergence bounds of the algorithms. Unlike prior approaches to warm starting, ours can be applied in all cases. Our method is agnostic to the origins of the input strategies. For example, they can be based on human domain knowledge, the observed strategy of a strong agent, the solution of a coarser abstraction, or the output of some algorithm that converges rapidly at first but slowly as it gets closer to an equilibrium. Experiments demonstrate that one can improve overall convergence in a game by first running CFR on a smaller, coarser abstraction of the game and then using the strategy in the abstract game to warm start CFR in the full game.

【Keywords】: reinforcement learning; game theory; no-regret learning; equilibrium finding; cfr; warm start; poker
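As background for the abstract above, the regret-matching update that CFR applies at each information set can be illustrated on a one-shot game. The sketch below is an assumption-laden illustration in Python, not the paper's warm-starting method: it runs regret matching in self-play on rock-paper-scissors using expected utilities, starts both players from an arbitrary non-zero regret vector (a made-up stand-in for a biased starting point), and all function names are hypothetical.

```python
# Row player's payoff in rock-paper-scissors; the column player gets the negation.
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]
N = 3


def strategy_from_regrets(regrets):
    """Regret matching: play actions in proportion to positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total <= 0:
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def self_play(iterations, initial_regrets):
    """Run both players with regret matching and return their average strategies."""
    regrets = [list(initial_regrets), list(initial_regrets)]
    sums = [[0.0] * N, [0.0] * N]
    for _ in range(iterations):
        row = strategy_from_regrets(regrets[0])
        col = strategy_from_regrets(regrets[1])
        # Expected utility of each pure action against the opponent's current mix.
        row_utils = [sum(PAYOFF[a][b] * col[b] for b in range(N)) for a in range(N)]
        col_utils = [sum(-PAYOFF[a][b] * row[a] for a in range(N)) for b in range(N)]
        row_value = sum(row[a] * row_utils[a] for a in range(N))
        col_value = sum(col[b] * col_utils[b] for b in range(N))
        for a in range(N):
            regrets[0][a] += row_utils[a] - row_value
            regrets[1][a] += col_utils[a] - col_value
            sums[0][a] += row[a]
            sums[1][a] += col[a]
    return [[s / iterations for s in player] for player in sums]


def exploitability(avg_row, avg_col):
    """Best-response gains against each average strategy (the game value is 0)."""
    vs_col = max(sum(PAYOFF[a][b] * avg_col[b] for b in range(N)) for a in range(N))
    vs_row = max(sum(-PAYOFF[a][b] * avg_row[a] for a in range(N)) for b in range(N))
    return vs_row, vs_col


AVG_ROW, AVG_COL = self_play(20000, [1.0, 0.0, 0.0])
EXPLOIT = exploitability(AVG_ROW, AVG_COL)
```

The average strategies, not the per-iteration ones, are what approach equilibrium; `EXPLOIT` measures how much a best responder could still gain against each average.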

61. Using Correlated Strategies for Computing Stackelberg Equilibria in Extensive-Form Games.

【Paper Link】【Pages】:439-445

【Authors】: Jiri Cermak ; Branislav Bosanský ; Karel Durkota ; Viliam Lisý ; Christopher Kiekintveld

【Abstract】: Strong Stackelberg Equilibrium (SSE) is a fundamental solution concept in game theory in which one player commits to a strategy, while the other player observes this commitment and plays a best response. We present a new algorithm for computing SSE for two-player extensive-form general-sum games with imperfect information (EFGs) where computing SSE is an NP-hard problem. Our algorithm is based on a correlated version of SSE, known as Stackelberg Extensive-Form Correlated Equilibrium (SEFCE). Our contribution is therefore twofold: (1) we give the first linear program for computing SEFCE in EFGs without chance, (2) we repeatedly solve and modify this linear program in a systematic search until we arrive at an SSE. Our new algorithm outperforms the best previous algorithms by several orders of magnitude.

【Keywords】: Strong Stackelberg Equilibrium; Stackelberg Equilibrium; Correlated Equilibrium; Extensive-Form Games; Stackelberg Extensive-Form Correlated Equilibrium; Game Theory

62. Assignment and Pricing in Roommate Market.

【Paper Link】【Pages】:446-452

【Authors】: Pak Hay Chan ; Xin Huang ; Zhengyang Liu ; Chihao Zhang ; Shengyu Zhang

【Abstract】: We introduce a roommate market model, in which 2n people need to be assigned to n rooms, with two people in each room. Each person has a valuation for each room, as well as a valuation for each of the other people as a roommate. Each room has a rent shared by the two people living in the room, and we need to decide who lives together in which room and how much each should pay. Various solution concepts on stability and envy-freeness are proposed, with their existence studied and the computational complexity of the corresponding search problems analyzed. In particular, we show that maximizing the social welfare is NP-hard, and we give a polynomial time algorithm that achieves at least 2/3 of the maximum social welfare. Finally, we demonstrate a pricing scheme that can achieve envy-freeness for each room.

【Keywords】: Stable roommate; Resource allocation; Assignment game

63. Incentives for Strategic Behavior in Fisher Market Games.

【Paper Link】【Pages】:453-459

【Authors】: Ning Chen ; Xiaotie Deng ; Bo Tang ; Hongyang Zhang

【Abstract】: In a Fisher market game, a market equilibrium is computed in terms of the utility functions and money endowments that agents report. As a consequence, an individual buyer may misreport his private information to obtain a utility gain. We investigate the extent to which an agent's utility can be increased by unilateral strategic plays and prove that the percentage of this improvement is at most 2 for markets with weak gross substitute utilities. Equivalently, we show that truthful reporting is a 0.5-approximate Nash equilibrium in this game. To identify sufficient conditions for truthful reporting being close to a Nash equilibrium, we conduct a parameterized study of strategic behaviors and further show that the ratio of utility gain decreases linearly as a buyer's initial endowment increases or his maximum share of an item decreases. Finally, we consider collusive behavior of a coalition and prove that the utility gain is bounded by 1/(1 - maximum share of the collusion). Our findings justify the truthful reporting assumption in Fisher markets through a quantitative study of participants' incentives, and imply that under a large market assumption, the utility gain of a buyer from manipulation diminishes to 0.

【Keywords】: Market; Equilibrium

64. Rules for Choosing Societal Tradeoffs.

【Paper Link】【Pages】:460-467

【Authors】: Vincent Conitzer ; Rupert Freeman ; Markus Brill ; Yuqian Li

【Abstract】: We study the societal tradeoffs problem, where a set of voters each submit their ideal tradeoff value between each pair of activities (e.g., "using a gallon of gasoline is as bad as creating 2 bags of landfill trash"), and these are then aggregated into the societal tradeoff vector using a rule. We introduce the family of distance-based rules and show that these can be justified as maximum likelihood estimators of the truth. Within this family, we single out the logarithmic distance-based rule as especially appealing based on a social-choice-theoretic axiomatization. We give an efficient algorithm for executing this rule as well as an approximate hill climbing algorithm, and evaluate these experimentally.

【Keywords】: societal tradeoffs, axiomatizations, linear programming

65. Judgment Aggregation under Issue Dependencies.

【Paper Link】【Pages】:468-474

【Authors】: Marco Costantini ; Carla Groenland ; Ulle Endriss

【Abstract】: We introduce a new family of judgment aggregation rules, called the binomial rules, designed to account for hidden dependencies between some of the issues being judged. To place them within the landscape of judgment aggregation rules, we analyse both their axiomatic properties and their computational complexity, and we show that they contain both the well-known distance-based rule and the basic rule returning the most frequent overall judgment as special cases. To evaluate the performance of our rules empirically, we apply them to a dataset of crowdsourced judgments regarding the quality of hotels extracted from the travel website TripAdvisor. In our experiments we distinguish between the full dataset and a subset of highly polarised judgments, and we develop a new notion of polarisation for profiles of judgments for this purpose, which may also be of independent interest.

【Keywords】: judgment aggregation; computational social choice

66. Price of Pareto Optimality in Hedonic Games.

【Paper Link】【Pages】:475-481

【Authors】: Edith Elkind ; Angelo Fanelli ; Michele Flammini

【Abstract】: Price of Anarchy measures the welfare loss caused by selfish behavior: it is defined as the ratio of the social welfare in a socially optimal outcome and in a worst Nash equilibrium. A similar measure can be derived for other classes of stable outcomes. In this paper, we argue that Pareto optimality can be seen as a notion of stability, and introduce the concept of Price of Pareto Optimality: this is an analogue of the Price of Anarchy, where the maximum is computed over the class of Pareto optimal outcomes, i.e., outcomes that do not permit a deviation by the grand coalition that makes all players weakly better off and some players strictly better off. As a case study, we focus on hedonic games, and provide lower and upper bounds of the Price of Pareto Optimality in three classes of hedonic games: additively separable hedonic games, fractional hedonic games, and modified fractional hedonic games; for fractional hedonic games on trees our bounds are tight.

【Keywords】: pareto optimality; hedonic games
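The two ratios contrasted in the abstract above can be computed by enumeration on a small example. The Python sketch below is only an illustration under simplifying assumptions: it uses a made-up 2x2 coordination game rather than a hedonic game, restricts attention to pure outcomes and pure Nash equilibria, and the names `is_pure_nash` and `is_pareto_optimal` are hypothetical.

```python
from itertools import product

# PAYOFFS[(row_action, col_action)] = (row_utility, col_utility).
# Hypothetical coordination game: both players prefer to coordinate on "A".
PAYOFFS = {
    ("A", "A"): (2, 2), ("A", "B"): (0, 0),
    ("B", "A"): (0, 0), ("B", "B"): (1, 1),
}
ACTIONS = ("A", "B")
PROFILES = list(product(ACTIONS, repeat=2))


def welfare(profile):
    """Social welfare: the sum of both players' utilities."""
    return sum(PAYOFFS[profile])


def is_pure_nash(profile):
    """No player can gain by unilaterally switching actions."""
    r, c = profile
    row_ok = all(PAYOFFS[(r, c)][0] >= PAYOFFS[(alt, c)][0] for alt in ACTIONS)
    col_ok = all(PAYOFFS[(r, c)][1] >= PAYOFFS[(r, alt)][1] for alt in ACTIONS)
    return row_ok and col_ok


def is_pareto_optimal(profile):
    """No other outcome is weakly better for everyone and strictly better for someone."""
    mine = PAYOFFS[profile]
    for other in PROFILES:
        theirs = PAYOFFS[other]
        weakly = all(t >= m for t, m in zip(theirs, mine))
        strictly = any(t > m for t, m in zip(theirs, mine))
        if weakly and strictly:
            return False
    return True


OPTIMUM = max(welfare(p) for p in PROFILES)
EQUILIBRIA = [p for p in PROFILES if is_pure_nash(p)]
PARETO = [p for p in PROFILES if is_pareto_optimal(p)]
PRICE_OF_ANARCHY = OPTIMUM / min(welfare(p) for p in EQUILIBRIA)
PRICE_OF_PARETO_OPTIMALITY = OPTIMUM / min(welfare(p) for p in PARETO)
```

In this toy game the inefficient equilibrium ("B", "B") drives the Price of Anarchy up, while it is excluded from the Pareto optimal outcomes, so the second ratio is smaller.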

67. Multiwinner Analogues of the Plurality Rule: Axiomatic and Algorithmic Perspectives.

【Paper Link】【Pages】:482-488

【Authors】: Piotr Faliszewski ; Piotr Skowron ; Arkadii M. Slinko ; Nimrod Talmon

【Abstract】: We characterize the class of committee scoring rules that satisfy the fixed-majority criterion. In some sense, the committee scoring rules in this class are multiwinner analogues of the single-winner Plurality rule, which is uniquely characterized as the only single-winner scoring rule that satisfies the simple majority criterion. We find that, for most of the rules in our new class, the complexity of winner determination is high (i.e., the problem of computing the winners is NP-hard), but we also show some examples of polynomial-time winner determination procedures, exact and approximate.

【Keywords】: multiwinner voting; committee scoring rules; top-k-counting rules; fixed-majority criterion; axioms; algorithms

68. Ad Auctions and Cascade Model: GSP Inefficiency and Algorithms.

【Paper Link】【Pages】:489-495

【Authors】: Gabriele Farina ; Nicola Gatti

【Abstract】: The design of the best economic mechanism for Sponsored Search Auctions (SSAs) is a central task in computational mechanism design/game theory. Two open questions concern (i) the adoption of user models more accurate than the currently used one and (ii) the choice between Generalized Second Price auction (GSP) and Vickrey–Clarke–Groves mechanism (VCG). In this paper, we provide some contributions to answer these questions. We study Price of Anarchy (PoA) and Price of Stability (PoS) over social welfare and auctioneer’s revenue of GSP w.r.t. the VCG when the users follow the famous cascade model. Furthermore, we provide exact, randomized, and approximate algorithms, showing that in real-world settings (Yahoo! Webscope A3 dataset, 10 available slots) optimal allocations can be found in less than 1s with up to 1,000 ads, and can be approximated in less than 20ms even with more than 1,000 ads with an average accuracy greater than 99%.

【Keywords】: Sponsored search auctions; Winner determination problem; Mechanism design

69. Variations on the Hotelling-Downs Model.

【Paper Link】【Pages】:496-501

【Authors】: Michal Feldman ; Amos Fiat ; Svetlana Obraztsova

【Abstract】: In this paper we expand the standard Hotelling-Downs model of spatial competition to a setting where clients do not necessarily choose their closest candidate (retail product or political). Specifically, we consider a setting where clients may disavow all candidates if there is no candidate that is sufficiently close to the client preferences. Moreover, if there are multiple candidates that are sufficiently close, the client may choose amongst them at random. We show the existence of Nash Equilibria for some such models, and study the price of anarchy and stability in such scenarios.

【Keywords】:

70. A Geometric Method to Construct Minimal Peer Prediction Mechanisms.

【Paper Link】【Pages】:502-508

【Authors】: Rafael M. Frongillo ; Jens Witkowski

【Abstract】: Minimal peer prediction mechanisms truthfully elicit private information (e.g., opinions or experiences) from rational agents without the requirement that ground truth is eventually revealed. In this paper, we use a geometric perspective to prove that minimal peer prediction mechanisms are equivalent to power diagrams, a type of weighted Voronoi diagram. Using this characterization and results from computational geometry, we show that many of the mechanisms in the literature are unique up to affine transformations, and introduce a general method to construct new truthful mechanisms.

【Keywords】: Mechanism Design; Peer Prediction; Incentive Schemes; Computational Geometry; Power Diagram; Robustness

71. Sequence-Form and Evolutionary Dynamics: Realization Equivalence to Agent Form and Logit Dynamics.

【Paper Link】【Pages】:509-515

【Authors】: Nicola Gatti ; Marcello Restelli

【Abstract】: Evolutionary game theory provides the principal tools to model the dynamics of multi-agent learning algorithms. While there is a long-standing literature on evolutionary game theory in strategic-form games, in the case of extensive-form games few results are known and the exponential size of the representations currently adopted makes the evolutionary analysis of such games unaffordable. In this paper, we focus on dynamics for the sequence form of extensive-form games, providing three dynamics: one realization equivalent to the normal-form logit dynamic, one realization equivalent to the agent-form replicator dynamic, and one realization equivalent to the agent-form logit dynamic. All the considered dynamics require polynomial time and space, providing an exponential compression w.r.t. the dynamics currently known and providing thus tools that can be effectively employed in practice. Moreover, we use our tools to compare the agent-form and normal-form dynamics and to provide new "hybrid" dynamics.

【Keywords】: Evolutionary game theory, extensive-form games

72. Who Can Win a Single-Elimination Tournament?

【Paper Link】【Pages】:516-522

【Authors】: Michael P. Kim ; Warut Suksompong ; Virginia Vassilevska Williams

【Abstract】: A single-elimination (SE) tournament is a popular way to select a winner in both sports competitions and in elections. A natural and well-studied question is the tournament fixing problem (TFP): given the set of all pairwise match outcomes, can a tournament organizer rig an SE tournament by adjusting the initial seeding so that their favorite player wins? We prove new sufficient conditions on the pairwise match outcome information and the favorite player, under which there is guaranteed to be a seeding where the player wins the tournament. Our results greatly generalize previous results. We also investigate the relationship between the set of players that can win an SE tournament under some seeding (so-called SE winners) and other traditional tournament solutions. In addition, we generalize and strengthen prior work on probabilistic models for generating tournaments. For instance, we show that every player in an n player tournament generated by the Condorcet Random Model will be an SE winner even when the noise is as small as possible, p = Θ(ln n/n); prior work only had such results for p ≥ Ω(√(ln n/n)). We also establish new results for significantly more general generative models.

【Keywords】: knockout tournament; tournament fixing; game theory; social choice
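The notion of an SE winner in the abstract above can be illustrated with a minimal brute-force sketch. It is not the authors' method: it simply tries every seeding, which is only feasible for very small tournaments. The names `se_winner`, `se_winners`, and the matrix `beats` are hypothetical, and the number of players is assumed to be a power of two.

```python
from itertools import permutations

def se_winner(seeding, beats):
    """Play one single-elimination bracket and return the winner.

    seeding: sequence of player indices; adjacent pairs meet in round one.
    beats[i][j] is True when player i beats player j (assumed input format).
    """
    players = list(seeding)
    while len(players) > 1:
        # Pair up neighbours and keep the winner of each match.
        players = [a if beats[a][b] else b
                   for a, b in zip(players[0::2], players[1::2])]
    return players[0]

def se_winners(beats):
    """Return every player who wins under at least one seeding (brute force)."""
    n = len(beats)
    return {se_winner(p, beats) for p in permutations(range(n))}

# Toy instance: players 0, 1, 2 form a cycle (0 > 1 > 2 > 0); player 3 loses to all.
beats = [
    [False, True,  False, True],
    [False, False, True,  True],
    [True,  False, False, True],
    [False, False, False, False],
]
print(sorted(se_winners(beats)))
```

In this toy instance each player in the cycle can be made the winner by a suitable seeding, while player 3 cannot win under any seeding.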

73. When Can the Maximin Share Guarantee Be Guaranteed?

【Paper Link】【Pages】:523-529

【Authors】: David Kurokawa ; Ariel D. Procaccia ; Junxing Wang

【Abstract】: The fairness notion of maximin share (MMS) guarantee underlies a deployed algorithm for allocating indivisible goods under additive valuations. Our goal is to understand when we can expect to be able to give each player his MMS guarantee. Previous work has shown that such an MMS allocation may not exist, but the counterexample requires a number of goods that is exponential in the number of players; we give a new construction that uses only a linear number of goods. On the positive side, we formalize the intuition that these counterexamples are very delicate by designing an algorithm that provably finds an MMS allocation with high probability when valuations are drawn at random.

【Keywords】: fair division; maximin
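As a small illustration of the maximin share mentioned in the abstract above: a player's MMS value is the best worst-bundle value the player can secure by partitioning the goods into as many bundles as there are players. The sketch below computes it by exhaustive enumeration under additive, non-negative valuations. It is a hypothetical helper for tiny instances only, not the algorithm from the paper.

```python
from itertools import product

def maximin_share(values, n_players):
    """Brute-force MMS value for one player with additive valuations.

    values: the player's (non-negative) value for each indivisible good.
    n_players: number of bundles the goods are split into.
    """
    best = 0
    # Each assignment maps good k to bundle assignment[k].
    for assignment in product(range(n_players), repeat=len(values)):
        bundles = [0] * n_players
        for value, bundle in zip(values, assignment):
            bundles[bundle] += value
        # The player is assumed to receive the worst bundle of the partition.
        best = max(best, min(bundles))
    return best

print(maximin_share([5, 4, 3, 2, 1], 2))
print(maximin_share([5, 4, 3, 2, 1], 3))
```

The enumeration visits n_players ** len(values) assignments, so it is meant only to make the definition concrete.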

74. Multi-Attribute Proportional Representation.

【Paper Link】【Pages】:530-536

【Authors】: Jérôme Lang ; Piotr Krzysztof Skowron

【Abstract】: We consider the following problem in which a given number of items has to be chosen from a predefined set. Each item is described by a vector of attributes and for each attribute there is a desired distribution that the selected set should fit. We look for a set that fits as much as possible the desired distributions on all attributes. Examples of applications include choosing members of a representative committee, where candidates are described by attributes such as sex, age and profession, and where we look for a committee that for each attribute offers a certain representation, i.e., a single committee that contains a certain number of young and old people, certain number of men and women, certain number of people with different professions, etc. With a single attribute the problem boils down to the apportionment problem for party-list proportional representation systems (in such case the value of the single attribute is the political affiliation of a candidate). We study some properties of the associated subset selection rules, and address their computation.

【Keywords】: proportional representation, apportionment, approximation

75. Multi-Defender Strategic Filtering Against Spear-Phishing Attacks.

【Paper Link】【Pages】:537-543

【Authors】: Aron Laszka ; Jian Lou ; Yevgeniy Vorobeychik

【Abstract】: Spear-phishing attacks pose a serious threat to sensitive computer systems, since they sidestep technical security mechanisms by exploiting the carelessness of authorized users. A common way to mitigate such attacks is to use e-mail filters which block e-mails with a maliciousness score above a chosen threshold. Optimal choice of such a threshold involves a tradeoff between the risk from delivered malicious emails and the cost of blocking benign traffic. A further complicating factor is the strategic nature of an attacker, who may selectively target users offering the best value in terms of likelihood of success and resulting access privileges. Previous work on strategic threshold-selection considered a single organization choosing thresholds for all users. In reality, many organizations are potential targets of such attacks, and their incentives need not be well aligned. We therefore consider the problem of strategic threshold-selection by a collection of independent self-interested users. We characterize both Stackelberg multi-defender equilibria, corresponding to short-term strategic dynamics, as well as Nash equilibria of the simultaneous game between all users and the attacker, modeling long-term dynamics, and exhibit a polynomial-time algorithm for computing short-term (Stackelberg) equilibria. We find that while Stackelberg multi-defender equilibrium need not exist, Nash equilibrium always exists, and remarkably, both equilibria are unique and socially optimal.

【Keywords】: spear-phishing; game theory; e-mail filtering; spam filtering; Stackelberg equilibrium; Nash equilibrium

76. Counterfactual Regret Minimization in Sequential Security Games.

【Paper Link】【Pages】:544-550

【Authors】: Viliam Lisý ; Trevor Davis ; Michael H. Bowling

【Abstract】: Many real world security problems can be modelled as finite zero-sum games with structured sequential strategies and limited interactions between the players. An abstract class of games unifying these models are the normal-form games with sequential strategies (NFGSS). We show that all games from this class can be modelled as well-formed imperfect-recall extensive-form games and consequently can be solved by counterfactual regret minimization. We propose an adaptation of the CFR+ algorithm for NFGSS and compare its performance to the standard methods based on linear programming and incremental game generation. We validate our approach on two security-inspired domains. We show that with a negligible loss in precision, CFR+ can compute a Nash equilibrium with five times less computation than its competitors.

【Keywords】: counterfactual regret minimization; sequential game; imperfect information; extensive form game

77. Optimizing Trading Assignments in Water Right Markets.

【Paper Link】【Pages】:551-557

【Authors】: Yicheng Liu ; Pingzhong Tang ; Tingting Xu ; Hang Zheng

【Abstract】: Over the past two decades, water markets have been successfully fielded in countries such as Australia, the United States, Chile, China, etc. Water users, mainly irrigators, have benefited immensely from water markets. However, the current water market design also faces certain serious barriers. It has been pointed out that transaction costs, which exist in most markets, induce great welfare loss. For example, for water markets in western China discussed in this paper, the influence of transaction costs is significant. Another important barrier is the locality of trades due to geographical constraints. Based on the water market at Xiying Irrigation, one of the most successful water markets in western China, we model the water market as a graph with minimum transaction thresholds on edges. Our goal is to maximize the transaction volume or welfare. We prove that the existence of transaction costs results in no polynomial time approximation scheme (PTAS) to maximize social welfare (MAX SNP-hard). The complexities on special graphs are also presented. From a practical point of view, however, optimal social welfare can be obtained via a well-designed mixed integer linear program and can be approximated near optimally at a large scale via a heuristic algorithm. Both algorithms are tested on data sets generated from real historical trading data. Our study also suggests the importance of reducing transaction costs, for example, institutional costs in water market design. Our work opens a potentially important avenue of market design within the agenda of computational sustainability.

【Keywords】:

78. On the Complexity of mCP-nets.

【Paper Link】【Pages】:558-564

【Authors】: Thomas Lukasiewicz ; Enrico Malizia

【Abstract】: mCP-nets are an expressive and intuitive formalism based on CP-nets to reason about preferences of groups of agents. The dominance semantics of mCP-nets is based on the concept of voting, and different voting schemes give rise to different dominance semantics for the group. Unlike CP-nets, which received an extensive complexity analysis, mCP-nets, as reported multiple times in the literature, lack a precise study of the voting tasks' complexity. Prior to this work, only a complexity analysis of brute-force algorithms for these tasks was available, and this analysis only gave EXPTIME upper bounds for most of those problems. In this paper, we start to fill this gap by carrying out a precise computational complexity analysis of voting tasks on acyclic binary polynomially connected mCP-nets whose constituents are standard CP-nets. Interestingly, all these problems actually belong to various levels of the polynomial hierarchy, and some of them even belong to PTIME or LOGSPACE. Furthermore, for most of these problems, we provide completeness results, which show tight lower bounds for problems that (to date) did not have any explicit non-obvious lower bound.

【Keywords】: User preferences; Group preferences; CP-nets; mCP-nets; Voting; Pareto voting; Majority voting; Condorcet winner; Computational complexity; Polynomial hierarchy

79. Reinstating Combinatorial Protections for Manipulation and Bribery in Single-Peaked and Nearly Single-Peaked Electorates.

【Paper Link】【Pages】:565-571

【Authors】: Vijay Menon ; Kate Larson

【Abstract】: Understanding when and how computational complexity can be used to protect elections against different manipulative actions has been a highly active research area over the past two decades. A recent body of work, however, has shown that many of the NP-hardness shields, previously obtained, vanish when the electorate has single-peaked or nearly single-peaked preferences. In light of these results, we investigate whether it is possible to reimpose NP-hardness shields for such electorates by allowing the voters to specify partial preferences instead of insisting they cast complete ballots. In particular, we show that in single-peaked and nearly single-peaked electorates, if voters are allowed to submit top-truncated ballots, then the complexity of manipulation and bribery for many voting rules increases from being in P to being NP-complete.

【Keywords】:

80. Refining Subgames in Large Imperfect Information Games.

【Paper Link】【Pages】:572-578

【Authors】: Matej Moravcik ; Martin Schmid ; Karel Ha ; Milan Hladík ; Stephen J. Gaukrodger

【Abstract】: The leading approach to solving large imperfect information games is to pre-calculate an approximate solution using a simplified abstraction of the full game; that solution is then used to play the original, full-scale game. The abstraction step is necessitated by the size of the game tree. However, as the original game progresses, the remaining portion of the tree (the subgame) becomes smaller. An appealing idea is to use the simplified abstraction to play the early parts of the game and then, once the subgame becomes tractable, to calculate a solution using a finer-grained abstraction in real time, creating a combined final strategy. While this approach is straightforward for perfect information games, it is a much more complex problem for imperfect information games. If the subgame is solved locally, the opponent can alter his play prior to this subgame to exploit our combined strategy. To prevent this, we introduce the notion of subgame margin, a simple value with appealing properties. If any best response reaches the subgame, the improvement of exploitability of the combined strategy is (at least) proportional to the subgame margin. This motivates subgame refinements resulting in large positive margins. Unfortunately, current techniques either neglect subgame margin (potentially leading to a large negative subgame margin and drastically more exploitable strategies), or guarantee only non-negative subgame margin (possibly producing the original, unrefined strategy, even if much stronger strategies are possible). Our technique remedies this problem by maximizing the subgame margin and is guaranteed to find the optimal solution. We evaluate our technique using one of the top participants of the AAAI-14 Computer Poker Competition, the leading playground for agents in the imperfect information setting.

【Keywords】: game theory; subgame; extensive form game; nash equilibrium; abstraction; imperfect information; poker

81. Complexity of Hedonic Games with Dichotomous Preferences.

【Paper Link】【Pages】:579-585

【Authors】: Dominik Peters

【Abstract】: Hedonic games provide a model of coalition formation in which a set of agents is partitioned into coalitions and the agents have preferences over which set they belong to. Recently, Aziz et al. (2014) have initiated the study of hedonic games with dichotomous preferences, where each agent either approves or disapproves of a given coalition. In this work, we study the computational complexity of questions related to finding optimal and stable partitions in dichotomous hedonic games under various ways of restricting and representing the collection of approved coalitions. Encouragingly, many of these problems turn out to be polynomial-time solvable. In particular, we show that an individually stable outcome always exists and can be found in polynomial time. We also provide efficient algorithms for cases in which agents approve only few coalitions, in which they only approve intervals, and in which they only approve sets of size 2 (the roommates case). These algorithms are complemented by NP-hardness results, especially for representations that are very expressive, such as in the case when agents' goals are given by propositional formulas.

【Keywords】: hedonic games; dichotomous preferences; stable matching; preference representation

82. Graphical Hedonic Games of Bounded Treewidth.

【Paper Link】【Pages】:586-593

【Authors】: Dominik Peters

【Abstract】: Hedonic games are a well-studied model of coalition formation, in which selfish agents are partitioned into disjoint sets and agents care about the make-up of the coalition they end up in. The computational problems of finding stable, optimal, or fair outcomes tend to be computationally intractable in even severely restricted instances of hedonic games. We introduce the notion of a graphical hedonic game and show that, in contrast, on classes of graphical hedonic games whose underlying graphs are of bounded treewidth and degree, such problems become easy. In particular, problems that can be specified through quantification over agents, coalitions, and (connected) partitions can be decided in linear time. The proof is by reduction to monadic second order logic. We also provide faster algorithms in special cases, and show that the extra condition of the degree bound cannot be dropped. Finally, we note that the problem of allocating indivisible goods can be modelled as a hedonic game, so that our results imply tractability of finding fair and efficient allocations on appropriately restricted instances.

【Keywords】: hedonic games; bounded treewidth; structural tractability; preference restrictions

83. Preferences Single-Peaked on Nice Trees.

【Paper Link】【Pages】:594-600

【Authors】: Dominik Peters ; Edith Elkind

【Abstract】: Preference profiles that are single-peaked on trees enjoy desirable properties: they admit a Condorcet winner (Demange 1982), and there are hard voting problems that become tractable on this domain (Yu et al., 2013). Trick (1989) proposed a polynomial-time algorithm that finds some tree with respect to which a given preference profile is single-peaked. However, some voting problems are only known to be easy for profiles that are single-peaked on "nice" trees, and Trick's algorithm provides no guarantees on the properties of the tree that it outputs. To overcome this issue, we build on the work of Trick and Yu et al. to develop a structural approach that enables us to compactly represent all trees with respect to which a given profile is single-peaked. We show how to use this representation to efficiently find the "best" tree for a given profile, according to a number of criteria; for other criteria, we obtain NP-hardness results. In particular, we show that it is NP-hard to decide whether an input profile is single-peaked with respect to a given tree. To demonstrate the applicability of our framework, we use it to identify a new class of profiles that admit an efficient algorithm for a popular variant of the Chamberlin-Courant rule.

【Keywords】:

84. Fast Optimal Clearing of Capped-Chain Barter Exchanges.

【Paper Link】【Pages】:601-607

【Authors】: Benjamin Plaut ; John P. Dickerson ; Tuomas Sandholm

【Abstract】: Kidney exchange is a type of barter market where patients exchange willing but incompatible donors. These exchanges are conducted via cycles (where each incompatible patient-donor pair in the cycle both gives and receives a kidney) and chains, which are started by an altruist donor who does not need a kidney in return. Finding the best combination of cycles and chains is hard. The leading algorithms for this optimization problem use either branch and price (a combination of branch and bound and column generation) or constraint generation. We show a correctness error in the leading prior branch-and-price-based approach [Glorie et al. 2014]. We develop a provably correct fix to it, which also necessarily changes the algorithm's complexity, as well as other improvements to the search algorithm. Next, we compare our solver to the leading constraint-generation-based solver and to the best prior correct branch-and-price-based solver. We focus on the setting where chains have a length cap. A cap is desirable in practice since if even one edge in the chain fails, the rest of the chain fails: the cap precludes very long chains that are extremely unlikely to execute and instead causes the solution to have more parallel chains and cycles that are more likely to succeed. We work with the UNOS nationwide kidney exchange, which uses a chain cap. Algorithms from our group autonomously make the transplant plans for that exchange. On that real data and demographically-accurate generated data, our new solver scales significantly better than the prior leading approaches.

【Keywords】: Kidney exchange; barter exchange; integer programming; branch and price; constraint generation; mechanism design; market clearing

85. Optimal Aggregation of Uncertain Preferences.

【Paper Link】【Pages】:608-614

【Authors】: Ariel D. Procaccia ; Nisarg Shah

【Abstract】: A paradigmatic problem in social choice theory deals with the aggregation of subjective preferences of individuals (represented as rankings of alternatives) into a social ranking. We are interested in settings where individuals are uncertain about their own preferences, and represent their uncertainty as distributions over rankings. Under the classic objective of minimizing the (expected) sum of Kendall tau distances between the input rankings and the output ranking, we establish that preference elicitation is surprisingly straightforward and near-optimal solutions can be obtained in polynomial time. We show, both in theory and using real data, that ignoring uncertainty altogether can lead to suboptimal outcomes.

【Keywords】: Uncertainty; Voting; Preference elicitation
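The objective named in the abstract above, the expected sum of Kendall tau distances, can be made concrete with a short sketch. The representation of an uncertain voter as a list of `(probability, ranking)` pairs is an assumption for illustration, and `best_ranking` is a brute-force search over all permutations, not the polynomial-time approach the paper reports.

```python
from itertools import combinations, permutations

def kendall_tau(r1, r2):
    """Number of pairs of alternatives that r1 and r2 order differently."""
    pos = {alt: i for i, alt in enumerate(r2)}
    # combinations(r1, 2) yields (a, b) with a ranked above b in r1.
    return sum(1 for a, b in combinations(r1, 2) if pos[a] > pos[b])

def expected_cost(output, voters):
    """Expected sum of Kendall tau distances from the voters to `output`.

    voters: list of distributions, each a list of (probability, ranking).
    """
    return sum(p * kendall_tau(ranking, output)
               for dist in voters for p, ranking in dist)

def best_ranking(voters):
    """Exhaustively find a ranking minimising the expected cost (tiny inputs only)."""
    alternatives = voters[0][0][1]
    return min(permutations(alternatives),
               key=lambda r: expected_cost(r, voters))

# Voter 1 is unsure between two rankings; voter 2 is certain.
voters = [
    [(0.5, ("a", "b", "c")), (0.5, ("b", "a", "c"))],
    [(1.0, ("a", "c", "b"))],
]
print(expected_cost(("a", "b", "c"), voters))
```

Because the cost is linear in the probabilities, it depends only on each voter's pairwise marginals, which is why the example can be checked by hand.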

86. False-Name-Proof Locations of Two Facilities: Economic and Algorithmic Approaches.

【Paper Link】【Pages】:615-621

【Authors】: Akihisa Sonoda ; Taiki Todo ; Makoto Yokoo

【Abstract】: This paper considers a mechanism design problem for locating two identical facilities on an interval, in which an agent can pretend to be multiple agents. A mechanism selects a pair of locations on the interval according to the declared single-peaked preferences of agents. An agent's utility is determined by the location of the better one (typically the closer to her ideal point). This model can represent various application domains. For example, assume a company is going to release two models of its product line and performs a questionnaire survey in an online forum to determine their detailed specs. Typically, a customer will buy only one model, but she can answer multiple times by logging onto the forum under several email accounts. We first characterize possible outcomes of mechanisms that satisfy false-name-proofness, as well as some mild conditions. By extending the result, we completely characterize the class of false-name-proof mechanisms when locating two facilities on a circle. We then clarify the approximation ratios of the false-name-proof mechanisms on a line metric for the social and maximum costs.

【Keywords】: facility location problem; mechanism design; false-name-proofness; strategy-proofness; approximation analysis

87. Closeness Centrality for Networks with Overlapping Community Structure.

【Paper Link】【Pages】:622-629

【Authors】: Mateusz Krzysztof Tarkowski ; Piotr L. Szczepanski ; Talal Rahwan ; Tomasz P. Michalak ; Michael Wooldridge

【Abstract】: Certain real-life networks have a community structure in which communities overlap. For example, a typical bus network includes bus stops (nodes), which belong to one or more bus lines (communities) that often overlap. Clearly, it is important to take this information into account when measuring the centrality of a bus stop - how important it is to the functioning of the network. For example, if a certain stop becomes inaccessible, the impact will depend in part on the bus lines that visit it. However, existing centrality measures do not take such information into account. Our aim is to bridge this gap. We begin by developing a new game-theoretic solution concept, which we call the Configuration semivalue, in order to have greater flexibility in modelling the community structure compared to previous solution concepts from cooperative game theory. We then use the new concept as a building block to construct the first extension of Closeness centrality to networks with community structure (overlapping or otherwise). Despite the computational complexity inherited from the Configuration semivalue, we show that the corresponding extension of Closeness centrality can be computed in polynomial time. We empirically evaluate this measure and our algorithm that computes it by analysing the Warsaw public transportation network.

【Keywords】: Network Centrality; Game Theory

88. Computing Rational Decisions In Extensive Games With Limited Foresight.

【Paper Link】【Pages】:630-636

【Authors】: Paolo Turrini

【Abstract】: We introduce a class of extensive form games where players might not be able to foresee the possible consequences of their decisions and form a model of their opponents which they exploit to achieve a more profitable outcome. We improve upon existing models of games with limited foresight, endowing players with the ability of higher order reasoning and proposing a novel solution concept to address intuitions coming from real game play. We analyse the resulting equilibria, devising an effective procedure to compute them.

【Keywords】: Limited Foresight; Epistemic Game Theory; Equilibrium Analysis; Algorithms

89. Computing Optimal Monitoring Strategy for Detecting Terrorist Plots.

【Paper Link】【Pages】:637-643

【Authors】: Zhen Wang ; Yue Yin ; Bo An

【Abstract】: In recent years, terrorist organizations (e.g., ISIS or al-Qaeda) are increasingly directing terrorists to launch coordinated attacks in their home countries. One example is the Paris shootings on January 7, 2015. By monitoring potential terrorists, security agencies are able to detect and stop terrorist plots at their planning stage. Although security agencies may have knowledge about potential terrorists (e.g., who they are, how they interact), they usually have limited resources and cannot monitor all terrorists. Moreover, a terrorist planner may strategically choose to arouse terrorists considering the security agency's monitoring strategy. This paper makes five key contributions toward the challenging problem of computing optimal monitoring strategies: 1) A new Stackelberg game model for terrorist plot detection; 2) A modified double oracle framework for computing the optimal strategy effectively; 3) Complexity results for both defender and attacker oracle problems; 4) Novel mixed-integer linear programming (MILP) formulations for best response problems of both players; and 5) Effective approximation algorithms for generating suboptimal responses for both players. Experimental evaluation shows that our approach can obtain a robust enough solution outperforming widely-used centrality based heuristics significantly and scale up to realistic-sized problems.

【Keywords】:

90. Quantitative Extensions of the Condorcet Jury Theorem with Strategic Agents.

【Paper Link】【Pages】:644-650

【Authors】: Lirong Xia

【Abstract】: The Condorcet Jury Theorem justifies the wisdom of crowds and lays the foundations of the ideology of the democratic regime. However, the Jury Theorem and most of its extensions focus on two alternatives and none of them quantitatively evaluate the effect of agents’ strategic behavior on the mechanism’s truth-revealing power. We initiate a research agenda of quantitatively extending the Jury Theorem with strategic agents by characterizing the price of anarchy (PoA) and the price of stability (PoS) of the common interest Bayesian voting games for three classes of mechanisms: plurality, MAPs, and the mechanisms that satisfy anonymity, neutrality, and strategy-proofness (w.r.t. a set of natural probability models). We show that while plurality and MAPs have better best-case truth-revealing power (lower PoS), the third class of mechanisms are more robust against agents’ strategic behavior (lower PoA).

【Keywords】:

91. Lift-Based Bidding in Ad Selection.

【Paper Link】【Pages】:651-657

【Authors】: Jian Xu ; Xuhui Shao ; Jianjie Ma ; Kuang-chih Lee ; Hang Qi ; Quan Lu

【Abstract】: Real-time bidding has become one of the largest online advertising markets in the world. Today the bid price per ad impression is typically decided by the expected value of how it can lead to a desired action event to the advertiser. However, this industry standard approach to decide the bid price does not consider the actual effect of the ad shown to the user, which should be measured based on the performance lift among users who have been or have not been exposed to a certain treatment of ads. In this paper, we propose a new bidding strategy and prove that if the bid price is decided based on the performance lift rather than absolute performance value, advertisers can actually gain more action events. We describe the modeling methodology to predict the performance lift and demonstrate the actual performance gain through blind A/B test with real ad campaigns. We also show that to move the demand-side platforms to bid based on performance lift, they should be rewarded based on the relative performance lift they contribute.

【Keywords】: Lift-Based Bidding; Real-Time Bidding; Demand-Side Platform; Attribution; Online Advertising

92. Optimizing Personalized Email Filtering Thresholds to Mitigate Sequential Spear Phishing Attacks.

【Paper Link】【Pages】:658-665

【Authors】: Mengchen Zhao ; Bo An ; Christopher Kiekintveld

【Abstract】: Highly targeted spear phishing attacks are increasingly common, and have been implicated in many major security breaches. Email filtering systems are the first line of defense against such attacks. These filters are typically configured with uniform thresholds for deciding whether or not to allow a message to be delivered to a user. However, users have very significant differences in both their susceptibility to phishing attacks as well as their access to critical information and credentials that can cause damage. Recent work has considered setting personalized thresholds for individual users based on a Stackelberg game model. We consider two important extensions of the previous model. First, in our model user values can be substitutable, modeling cases where multiple users provide access to the same information or credential. Second, we consider attackers who make sequential attack plans based on the outcome of previous attacks. Our analysis starts from scenarios where there is only one credential and then extends to more general scenarios with multiple credentials. For single-credential scenarios, we demonstrate that the optimal defense strategy can be found by solving a binary combinatorial optimization problem called PEDS. For multiple-credential scenarios, we formulate it as a bilevel optimization problem for finding the optimal defense strategy and then reduce it to a single level optimization problem called PEMS using complementary slackness conditions. Experimental results show that both PEDS and PEMS lead to significantly higher defender utilities than two existing benchmarks in different parameter settings. Also, both PEDS and PEMS are more robust than the existing benchmarks considering uncertainties.

【Keywords】: Spear Phishing; Stackelberg Game; Sequential Attack

Technical Papers: Heuristic Search and Optimization 22

93. Unsupervised Feature Selection by Heuristic Search with Provable Bounds on Suboptimality.

【Paper Link】【Pages】:666-672

【Authors】: Hiromasa Arai ; Crystal Maung ; Ke Xu ; Haim Schweitzer

【Abstract】: Identifying a small number of features that can represent the data is a known problem that comes up in areas such as machine learning, knowledge representation, data mining, and numerical linear algebra. Computing an optimal solution is believed to be NP-hard, and there is extensive work on approximation algorithms. Classic approaches exploit the algebraic structure of the underlying matrix, while more recent approaches use randomization. An entirely different approach that uses the A* heuristic search algorithm to find an optimal solution was recently proposed. Not surprisingly it is limited to effectively selecting only a small number of features. We propose a similar approach related to the Weighted A* algorithm. This gives algorithms that are not guaranteed to find an optimal solution but run much faster than the A* approach, enabling effective selection of many features from large datasets. We demonstrate experimentally that these new algorithms are more accurate than the current state-of-the-art while still being practical. Furthermore, they come with an adjustable guarantee on how different their error may be from the smallest possible (optimal) error. Their accuracy can always be increased at the expense of a longer running time.

【Keywords】:

94. Tiebreaking Strategies for A* Search: How to Explore the Final Frontier.

【Paper Link】【Pages】:673-679

【Authors】: Masataro Asai ; Alex S. Fukunaga

【Abstract】: Despite recent improvements in search techniques for cost-optimal classical planning, the exponential growth of the size of the search frontier in A* is unavoidable. We investigate tiebreaking strategies for A*, experimentally analyzing the performance of standard tiebreaking strategies that break ties according to the heuristic value of the nodes. We find that tiebreaking has a significant impact on search algorithm performance when there are zero-cost operators that induce large plateau regions in the search space. We develop a new framework for tiebreaking based on a depth metric which measures distance from the entrance to the plateau, and propose a new, randomized strategy which significantly outperforms standard strategies on domains with zero-cost actions.

【Keywords】: Heuristic Search; Tiebreaking; Classical Planning

95. CAPReS: Context Aware Persona Based Recommendation for Shoppers.

【Paper Link】【Pages】:680-686

【Authors】: Joydeep Banerjee ; Gurulingesh Raravi ; Manoj Gupta ; Sindhu K. Ernala ; Shruti Kunde ; Koustuv Dasgupta

【Abstract】: Nowadays, brick-and-mortar stores are finding it extremely difficult to retain their customers due to the ever-increasing competition from the online stores. One of the key reasons for this is the lack of personalized shopping experience offered by the brick-and-mortar stores. This work considers the problem of persona based shopping recommendation for such stores to maximize the value for money of the shoppers. For this problem, it proposes a non-polynomial time-complexity optimal dynamic program and a polynomial time-complexity non-optimal heuristic, for making top-k recommendations by taking into account shopper persona and her time and budget constraints. In our empirical evaluations with a mix of real-world data and simulated data, the performance of the heuristic in terms of the persona based recommendations (quantified by similarity scores and items recommended) closely matched (differed by only 8% each with) that of the dynamic program, and at the same time the heuristic ran at least twice as fast as the dynamic program.

【Keywords】: Shopping recommendation; Shopper persona; Greedy heuristic; Dynamic programming; Graph optimization

96. Nested Monte Carlo Search for Two-Player Games.

【Paper Link】【Pages】:687-693

【Authors】: Tristan Cazenave ; Abdallah Saffidine ; Michael John Schofield ; Michael Thielscher

【Abstract】: The use of the Monte Carlo playouts as an evaluation function has proved to be a viable, general technique for searching intractable game spaces. This facilitates the use of statistical techniques like Monte Carlo Tree Search (MCTS), but is also known to require significant processing overhead. We seek to improve the quality of information extracted from the Monte Carlo playout in three ways. Firstly, by nesting the evaluation function inside another evaluation function; secondly, by measuring and utilising the depth of the playout; and thirdly, by incorporating pruning strategies that eliminate unnecessary searches and avoid traps. Our experimental data, obtained on a variety of two-player games from past General Game Playing (GGP) competitions and others, demonstrate the usefulness of these techniques in a Nested Player when pitted against a standard, optimised UCT player.

【Keywords】:

97. Look-Ahead with Mini-Bucket Heuristics for MPE.

【Paper Link】【Pages】:694-701

【Authors】: Rina Dechter ; Kalev Kask ; William Lam ; Javier Larrosa

【Abstract】: The paper investigates the potential of look-ahead in the context of AND/OR search in graphical models using the Mini-Bucket heuristic for combinatorial optimization tasks (e.g., MAP/MPE or weighted CSPs). We present and analyze the complexity of computing the residual (a.k.a. Bellman update) of the Mini-Bucket heuristic and show how this can be used to identify which parts of the search space are more likely to benefit from look-ahead and how to bound its overhead. We also rephrase the look-ahead computation as a graphical model, to facilitate structure exploiting inference schemes. We demonstrate empirically that augmenting Mini-Bucket heuristics by look-ahead is a cost-effective way of increasing the power of Branch-And-Bound search.

【Keywords】:

98. Solving the Station Repacking Problem.

【Paper Link】【Pages】:702-709

【Authors】: Alexandre Fréchette ; Neil Newman ; Kevin Leyton-Brown

【Abstract】: We investigate the problem of repacking stations in the FCC's upcoming, multi-billion-dollar "incentive auction". Early efforts to solve this problem considered mixed-integer programming formulations, which we show are unable to reliably solve realistic, national-scale problem instances. We describe the result of a multi-year investigation of alternatives: a solver, SATFC, that has been adopted by the FCC for use in the incentive auction. SATFC is based on a SAT encoding paired with a wide range of techniques: constraint graph decomposition; novel caching mechanisms that allow for reuse of partial solutions from related, solved problems; algorithm configuration; algorithm portfolios; and the marriage of local-search and complete solver strategies. We show that our approach solves virtually all of a set of problems derived from auction simulations within the short time budget required in practice.

【Keywords】: spectrum auction; incentive auction; algorithm configuration; algorithm portfolios

99. The Complexity Landscape of Decompositional Parameters for ILP.

【Paper Link】【Pages】:710-716

【Authors】: Robert Ganian ; Sebastian Ordyniak

【Abstract】: Integer Linear Programming (ILP) can be seen as the archetypical problem for NP-complete optimization problems, and a wide range of problems in artificial intelligence are solved in practice via a translation to ILP. Despite its huge range of applications, only few tractable fragments of ILP are known, probably the most prominent of which is based on the notion of total unimodularity. Using entirely different techniques, we identify new tractable fragments of ILP by studying structural parameterizations of the constraint matrix within the framework of parameterized complexity. In particular, we show that ILP is fixed-parameter tractable when parameterized by the treedepth of the constraint matrix and the maximum absolute value of any coefficient occurring in the ILP instance. Together with matching hardness results for the more general parameter treewidth, we draw a detailed complexity landscape of ILP w.r.t. decompositional parameters defined on the constraint matrix.

【Keywords】: integer linear programming; treedepth and treewidth; parameterized complexity; complexity landscape

100. Abstract Zobrist Hashing: An Efficient Work Distribution Method for Parallel Best-First Search.

【Paper Link】【Pages】:717-723

【Authors】: Yuu Jinnai ; Alex Fukunaga

【Abstract】: Hash Distributed A* (HDA*) is an efficient parallel best-first algorithm that asynchronously distributes work among the processes using a global hash function. Although Zobrist hashing, the standard hash function used by HDA*, achieves good load balance for many domains, it incurs significant communication overhead since it requires many node transfers among threads. We propose Abstract Zobrist hashing, a new work distribution method for parallel search which reduces node transfers and mitigates communication overhead by using feature projection functions. We evaluate Abstract Zobrist hashing for multicore HDA*, and show that it significantly outperforms previous work distribution methods.

【Keywords】: Heuristic Search; Parallel Search; Work Distribution
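As a rough illustration of the idea in the abstract above, the sketch below contrasts plain Zobrist hashing (XOR of one random bitstring per feature value) with a variant that first maps each feature value through a projection function, so that states differing only within one abstract bucket land on the same worker. All names (`make_zobrist_table`, `project`, `owner`) and the example projection are assumptions made for illustration; the abstract does not specify the paper's actual feature projection functions.

```python
import random


def make_zobrist_table(num_features, num_values, bits=64, seed=0):
    """One random bitstring per (feature index, value) pair."""
    rng = random.Random(seed)
    return [[rng.getrandbits(bits) for _ in range(num_values)]
            for _ in range(num_features)]


def zobrist_hash(state, table):
    """Plain Zobrist hash: XOR the bitstrings of all feature values."""
    h = 0
    for i, v in enumerate(state):
        h ^= table[i][v]
    return h


def abstract_zobrist_hash(state, table, project):
    """Zobrist hash over projected (abstract) feature values.

    'project(i, v)' is a hypothetical feature projection: it maps the
    value v of feature i to an abstract value used for the table lookup.
    """
    h = 0
    for i, v in enumerate(state):
        h ^= table[i][project(i, v)]
    return h


def owner(hash_value, num_workers):
    """Worker that owns a state, as in hash-based work distribution."""
    return hash_value % num_workers
```

For example, with `project = lambda i, v: v // 2`, the states `(0, 2, 4)` and `(1, 2, 4)` receive the same abstract hash and therefore the same owner, so generating one from the other would need no node transfer.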

101. Learning to Branch in Mixed Integer Programming.

【Paper Link】【Pages】:724-731

【Authors】: Elias Boutros Khalil ; Pierre Le Bodic ; Le Song ; George L. Nemhauser ; Bistra N. Dilkina

【Abstract】: The design of strategies for branching in Mixed Integer Programming (MIP) is guided by cycles of parameter tuning and offline experimentation on an extremely heterogeneous testbed, using the average performance. Once devised, these strategies (and their parameter settings) are essentially input-agnostic. To address these issues, we propose a machine learning (ML) framework for variable branching in MIP. Our method observes the decisions made by Strong Branching (SB), a time-consuming strategy that produces small search trees, collecting features that characterize the candidate branching variables at each node of the tree. Based on the collected data, we learn an easy-to-evaluate surrogate function that mimics the SB strategy, by means of solving a learning-to-rank problem, common in ML. The learned ranking function is then used for branching. The learning is instance-specific, and is performed on-the-fly while executing a branch-and-bound search to solve the MIP instance. Experiments on benchmark instances indicate that our method produces significantly smaller search trees than existing heuristics, and is competitive with a state-of-the-art commercial solver.

【Keywords】: Discrete Optimization; Search; Branch and Bound; Branching; Mixed Integer Programming; Heuristic Search; Learning to Rank

102. Local Search for Hard SAT Formulas: The Strength of the Polynomial Law.

【Paper Link】【Pages】:732-738

【Authors】: Sixue Liu ; Periklis A. Papakonstantinou

【Abstract】: Random k-CNF formulas at the anticipated k-SAT phase-transition point are prototypical hard k-SAT instances. We develop a stochastic local search algorithm and study it both theoretically and through a large-scale experimental study. The algorithm comes as a result of a systematic study that contrasts rates at which a certain measure concentration phenomenon occurs. This study yields a new stochastic rule for local search. A strong point of our contribution is the conceptual simplicity of our algorithm. More importantly, the empirical results overwhelmingly indicate that our algorithm outperforms the state-of-the-art. This includes a number of winners and medalist solvers from the recent SAT Competitions.

【Keywords】: SAT-solver; local search; theory; experiments

103. Fast Proximal Linearized Alternating Direction Method of Multiplier with Parallel Splitting.

【Paper Link】【Pages】:739-745

【Authors】: Canyi Lu ; Huan Li ; Zhouchen Lin ; Shuicheng Yan

【Abstract】: The Augmented Lagrangian Method (ALM) and Alternating Direction Method of Multiplier (ADMM) have been powerful optimization methods for general convex programming subject to linear constraints. We consider the convex problem whose objective consists of a smooth part and a nonsmooth but simple part. We propose the Fast Proximal Augmented Lagrangian Method (Fast PALM) which achieves the convergence rate O(1/K^2), compared with O(1/K) by the traditional PALM. In order to further reduce the per-iteration complexity and handle the multi-block problem, we propose the Fast Proximal ADMM with Parallel Splitting (Fast PL-ADMM-PS) method. It also partially improves the rate related to the smooth part of the objective function. Experimental results on both synthesized and real world data demonstrate that our fast methods significantly improve the previous PALM and ADMM.

【Keywords】: fast alternating direction method of multiplier

104. Combining Bounding Boxes and JPS to Prune Grid Pathfinding.

【Paper Link】【Pages】:746-752

【Authors】: Steve Rabin ; Nathan R. Sturtevant

【Abstract】: Pathfinding is a common task across many domains and platforms, whether in games, robotics, or road maps. Given the breadth of domains, there are also a wide variety of representations used for pathfinding, and there are many techniques which have been shown to improve performance. In the last few years, the state-of-the-art in grid-based pathfinding has been significantly improved with domain-specific techniques such as Jump Point Search (JPS), Subgoal Graphs, and Compressed Path Databases. In this paper we look at a specific implementation of the general idea of Geometric Containers, showing that, while it is effective on grid maps, when combined with JPS+ it provides state-of-the-art performance.

【Keywords】: heuristic search; jump point search; grids; pathfinding

105. Fast ADMM Algorithm for Distributed Optimization with Adaptive Penalty.

【Paper Link】【Pages】:753-759

【Authors】: Changkyu Song ; Sejong Yoon ; Vladimir Pavlovic

【Abstract】: We propose new methods to speed up convergence of the Alternating Direction Method of Multipliers (ADMM), a common optimization tool in the context of large scale and distributed learning. The proposed method accelerates the speed of convergence by automatically deciding the constraint penalty needed for parameter consensus in each iteration. In addition, we also propose an extension of the method that adaptively determines the maximum number of iterations to update the penalty. We show that this approach effectively leads to an adaptive, dynamic network topology underlying the distributed optimization. The utility of the new penalty update schemes is demonstrated on both synthetic and real data, including an instance of the probabilistic matrix factorization task known as the structure from motion problem.

【Keywords】:

106. Towards Clause-Learning State Space Search: Learning to Recognize Dead-Ends.

【Paper Link】【Pages】:760-768

【Authors】: Marcel Steinmetz ; Jörg Hoffmann

【Abstract】: We introduce a state space search method that identifies dead-end states, analyzes the reasons for failure, and learns to avoid similar mistakes in the future. Our work is placed in classical planning. The key technique is critical-path heuristics h^C, relative to a set C of conjunctions. These recognize a dead-end state s, returning h^C(s) = ∞, if s has no solution even when allowing to break up conjunctive subgoals into the elements of C. Our key idea is to learn C during search. Starting from a simple initial C, we augment search to identify unrecognized dead-ends s, where h^C(s) < ∞. We design methods analyzing the situation at such s, adding new conjunctions into C to obtain h^C(s) = ∞, thus learning to recognize s as well as similar dead-ends search may encounter in the future. We furthermore learn clauses φ where s' not satisfying φ implies h^C(s') = ∞, to avoid the prohibitive overhead of computing h^C on every search state. Arranging these techniques in a depth-first search, we obtain an algorithm approaching the elegance of clause learning in SAT, learning to refute search subtrees. Our experiments show that this can be quite powerful. On problems where dead-ends abound, the learning reliably reduces the search space by several orders of magnitude.

【Keywords】: state space search; conflict analysis and learning; planning

107. Implementing Troubleshooting with Batch Repair.

【Paper Link】【Pages】:769-775

【Authors】: Roni Stern ; Meir Kalech ; Hilla Shinitzky

【Abstract】: Recent work has raised the challenge of efficient automated troubleshooting in domains where repairing a set of components in a single repair action is cheaper than repairing each of them separately. This corresponds to cases where there is a non-negligible overhead to initiating a repair action and to testing the system after a repair action. In this work we propose several algorithms for choosing which batch of components to repair, so as to minimize the overall repair costs. Experimentally, we show the benefit of these algorithms over repairing components one at a time.

【Keywords】: Automated troubleshooting; Automated diagnosis

108. A Combinatorial Search Perspective on Diverse Solution Generation.

【Paper Link】【Pages】:776-783

【Authors】: Satya Gautam Vadlamudi ; Subbarao Kambhampati

【Abstract】: Finding diverse solutions has become important in many combinatorial search domains, including Automated Planning, Path Planning and Constraint Programming. Much of the work in these directions has, however, focussed on coming up with appropriate diversity metrics and compiling those metrics into the solvers/planners. Most approaches use linear-time greedy algorithms for exploring the state space of solution combinations for generating a diverse set of solutions, limiting not only their completeness but also their effectiveness within a time bound. In this paper, we take a combinatorial search perspective on generating diverse solutions. We present a generic bi-level optimization framework for finding cost-sensitive diverse solutions. We propose complete methods under this framework, which guarantee finding a set of cost-sensitive diverse solutions satisficing the given criteria whenever there exists such a set. We identify various aspects that affect the performance of these exhaustive algorithms and propose techniques to improve them. Experimental results show the efficacy of the proposed framework compared to an existing greedy approach.

【Keywords】: Diverse Solutions; Planning; Constraint Programming

109. On the Completeness of Best-First Search Variants That Use Random Exploration.

【Paper Link】【Pages】:784-790

【Authors】: Richard Anthony Valenzano ; Fan Xie

【Abstract】: While suboptimal best-first search algorithms like Greedy Best-First Search are frequently used when building automated planning systems, their greedy nature can make them susceptible to being easily misled by flawed heuristics. This weakness has motivated the development of best-first search variants like epsilon-greedy node selection, type-based exploration, and diverse best-first search, which all use random exploration to mitigate the impact of heuristic error. In this paper, we provide a theoretical justification for this increased robustness by formally analyzing how these algorithms behave on infinite graphs. In particular, we show that when using these approaches on any infinite graph, the probability of not finding a solution can be made arbitrarily small given enough time. This result is shown to hold for a class of algorithms that includes the three mentioned above, regardless of how misleading the heuristic is.

【Keywords】: greedy best-first search; random exploration; heuristic error; probabilistic completeness; infinite graph; graph-search problems; epsilon-greedy node selection; type-based exploration; diverse best-first search
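Of the three exploration schemes named in the abstract above, epsilon-greedy node selection is the simplest to sketch: with probability epsilon the next node is drawn uniformly at random from the open list, and otherwise the node with the lowest heuristic value is taken. The function below is a minimal sketch under that reading; the name `select_node`, the list-based open list, and the `(h, node)` entry layout are assumptions made for illustration, not details taken from the paper.

```python
import random


def select_node(open_list, epsilon, rng=random):
    """Remove and return one (h, node) entry from a non-empty open list.

    With probability epsilon the entry is chosen uniformly at random
    (exploration); otherwise the entry with the smallest h is chosen
    (greedy best-first behaviour). A linear scan keeps the sketch short;
    a real planner would use a more efficient data structure.
    """
    if rng.random() < epsilon:
        idx = rng.randrange(len(open_list))
    else:
        idx = min(range(len(open_list)), key=lambda i: open_list[i][0])
    return open_list.pop(idx)
```

Because every node on the open list has a nonzero chance of being selected at each step whenever epsilon > 0, a misleading heuristic cannot starve a node forever, which is the intuition behind the probabilistic completeness result described in the abstract.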

【Paper Link】【Pages】:791-797

【Authors】: Biao Wang ; Ge Chen ; Luoyi Fu ; Li Song ; Xinbing Wang ; Xue Liu

【Abstract】: Rumor blocking is a serious problem in large-scale social networks. Malicious rumors could cause chaos in society and hence need to be blocked as soon as possible after being detected. In this paper, we propose a model of dynamic rumor influence minimization with user experience (DRIMUX). Our goal is to minimize the influence of the rumor (i.e., the number of users that have accepted and sent the rumor) by blocking a certain subset of nodes. A dynamic Ising propagation model considering both the global popularity and individual attraction of the rumor is presented based on a realistic scenario. In addition, different from existing problems of influence minimization, we take into account the constraint of user experience utility. Specifically, each node is assigned a tolerance time threshold. If the blocking time of each user exceeds that threshold, the utility of the network will decrease. Under this constraint, we then formulate the problem as a network inference problem with survival theory, and propose solutions based on the maximum likelihood principle. Experiments are implemented based on large-scale real world networks and validate the effectiveness of our method.

【Keywords】: Social network; Rumor blocking; User experience

111. Linearized Alternating Direction Method with Penalization for Nonconvex and Nonsmooth Optimization.

【Paper Link】【Pages】:798-804

【Authors】: Yiyang Wang ; Risheng Liu ; Xiaoliang Song ; Zhixun Su

【Abstract】: Being one of the most effective methods, the Alternating Direction Method (ADM) has been extensively studied in numerical analysis for solving linearly constrained convex programs. However, there are few studies focusing on the convergence property of ADM under a nonconvex framework, though it has already performed well when applied to various nonconvex tasks. In this paper, a linearized algorithm with penalization is proposed on the basis of ADM for solving nonconvex and nonsmooth optimization. We start from analyzing the convergence property for the classical constrained problem with two variables and then establish a similar result for the multi-block case. To demonstrate the effectiveness of our proposed algorithm, experiments with synthetic and real-world data have been conducted on specific applications in signal and image processing.

【Keywords】:

112. Two Efficient Local Search Algorithms for Maximum Weight Clique Problem.

【Paper Link】【Pages】:805-811

【Authors】: Yiyuan Wang ; Shaowei Cai ; Minghao Yin

【Abstract】: The Maximum Weight Clique problem (MWCP) is an important generalization of the Maximum Clique problem with wide applications. This paper introduces two heuristics and develops two local search algorithms for MWCP. Firstly, we propose a heuristic called strong configuration checking (SCC), which is a new variant of a recent powerful strategy called configuration checking (CC) for reducing cycling in local search. Based on the SCC strategy, we develop a local search algorithm named LSCC. Moreover, to improve the performance on massive graphs, we apply a low-complexity heuristic called Best from Multiple Selection (BMS) to select the swapping vertex pair quickly and effectively. The BMS heuristic is used to improve LSCC, resulting in the LSCC+BMS algorithm. Experiments show that the proposed algorithms outperform the state-of-the-art local search algorithm MN/TS and its improved version MN/TS+BMS on the standard benchmarks, namely DIMACS and BHOSLIB, as well as on a wide range of real world massive graphs.

【Keywords】: local search; strong configuration checking; MWCP; Best from Multiple Selection; massive graph
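The abstract above does not spell out how Best from Multiple Selection works. One common reading, assumed here, is that instead of scanning every candidate, the heuristic draws k candidates at random (with replacement) and keeps the best-scoring one, trading some selection quality for a cost that no longer depends on the size of the candidate set. The function name, the `score` callback, and the parameter `k` are illustrative assumptions, not the authors' code.

```python
import random


def best_from_multiple_selection(candidates, score, k, rng=random):
    """Return the best of k random draws (with replacement) from candidates.

    Hypothetical sketch: 'candidates' is a non-empty sequence (for example,
    vertex pairs eligible for a swap), 'score' maps a candidate to a number
    where higher is better, and k >= 1 is the number of samples.
    """
    best = rng.choice(candidates)
    best_score = score(best)
    for _ in range(k - 1):
        cand = rng.choice(candidates)
        cand_score = score(cand)
        if cand_score > best_score:
            best, best_score = cand, cand_score
    return best
```

Each call evaluates `score` exactly k times, so on a massive graph the per-step cost stays fixed even when the number of possible swaps is very large.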

113. Relaxed Majorization-Minimization for Non-Smooth and Non-Convex Optimization.

【Paper Link】【Pages】:812-818

【Authors】: Chen Xu ; Zhouchen Lin ; Zhenyu Zhao ; Hongbin Zha

【Abstract】: We propose a new majorization-minimization (MM) method for non-smooth and non-convex programs, which is general enough to include the existing MM methods. Besides the local majorization condition, we only require that the difference between the directional derivatives of the objective function and its surrogate function vanishes when the number of iterations approaches infinity, which is a very weak condition. So our method can use a surrogate function that directly approximates the non-smooth objective function. In comparison, all the existing MM methods construct the surrogate function by approximating the smooth component of the objective function. We apply our relaxed MM methods to the robust matrix factorization (RMF) problem with different regularizations, where our locally majorant algorithm shows advantages over the state-of-the-art approaches for RMF. This is the first algorithm for RMF ensuring, without extra assumptions, that any limit point of the iterates is a stationary point.

【Keywords】: Majorization-Minimization; Non-smooth and Non-convex Optimization; Robust Matrix Factorization

114. Submodular Optimization with Routing Constraints.

【Paper Link】【Pages】:819-826

【Authors】: Haifeng Zhang ; Yevgeniy Vorobeychik

【Abstract】: Submodular optimization, particularly under cardinality or cost constraints, has received considerable attention, stemming from its breadth of application, ranging from sensor placement to targeted marketing. However, the constraints faced in many real domains are more complex. We investigate an important and very general class of problems of maximizing a submodular function subject to general cost constraints, especially focusing on costs coming from route planning. Canonical problems that motivate our framework include mobile robotic sensing, and door-to-door marketing. We propose a generalized cost-benefit (GCB) greedy algorithm for our problem, and prove bi-criterion approximation guarantees under significantly weaker assumptions than those in related literature. Experimental evaluation on realistic mobile sensing and door-to-door marketing problems, as well as using simulated networks, shows that our algorithm achieves significantly higher utility than state-of-the-art alternatives, and has either lower or competitive running time.

【Keywords】: Submodular Optimization; Traveling Salesman Problem; Cost-benefit Greedy Algorithm; Approximation Guarantees

Technical Papers: Human-Computation and Crowd Sourcing 1

115. Behavioral Experiments in Email Filter Evasion.

【Paper Link】【Pages】:827-834

【Authors】: Liyiming Ke ; Bo Li ; Yevgeniy Vorobeychik

【Abstract】: Despite decades of effort to combat spam, unwanted and even malicious emails, such as phish which aim to deceive recipients into disclosing sensitive information, still routinely find their way into one's mailbox. To be sure, email filters manage to stop a large fraction of spam emails from ever reaching users, but spammers and phishers have mastered the art of filter evasion, or manipulating the content of email messages to avoid being filtered. We present a unique behavioral experiment designed to study email filter evasion. Our experiment is framed in somewhat broader terms: given the widespread use of machine learning methods for distinguishing spam and non-spam, we investigate how human subjects manipulate a spam template to evade a classification-based filter. We find that adding a small amount of noise to a filter significantly reduces the ability of subjects to evade it, observing that noise does not merely have a short-term impact, but also degrades evasion performance in the longer term. Moreover, we find that greater coverage of an email template by the classifier (filter) features significantly increases the difficulty of evading it. This observation suggests that aggressive feature reduction, a common practice in applied machine learning, can actually facilitate evasion. In addition to the descriptive analysis of behavior, we develop a synthetic model of human evasion behavior which closely matches observed behavior and effectively replicates experimental findings in simulation.

【Keywords】: Adversarial classification; game theory; evasion classification

Technical Papers: Humans and AI 4

116. An Oral Exam for Measuring a Dialog System's Capabilities.

【Paper Link】【Pages】:835-841

【Authors】: David Cohen ; Ian Lane

【Abstract】: This paper suggests a model and methodology for measuring the breadth and flexibility of a dialog system's capabilities. The approach relies on having human evaluators administer a targeted oral exam to a system and provide their subjective views of that system's performance on each test problem. We present results from one instantiation of this test being performed on two publicly-accessible dialog systems and a human, and show that the suggested metrics do provide useful insights into the relative strengths and weaknesses of these systems. Results suggest that this approach can be performed with reasonable reliability and with reasonable amounts of effort. We hope that authors will augment their reporting with this approach to improve clarity and make more direct progress toward broadly-capable dialog systems.

【Keywords】: dialog systems; evaluation

117. Intelligent Advice Provisioning for Repeated Interaction.

【Paper Link】【Pages】:842-849

【Authors】: Priel Levy ; David Sarne

【Abstract】: This paper studies two suboptimal advice provisioning methods ("advisors") as an alternative to providing optimal advice in repeated advising settings. Providing users with suboptimal advice has been reported to be highly advantageous whenever the optimal advice is non-intuitive, hence might not be accepted by the user. Alas, prior methods that rely on suboptimal advice generation were designed primarily for a single-shot advice provisioning setting, hence their performance in repeated settings is questionable. Our methods, on the other hand, are tailored to the repeated interaction case. Comprehensive evaluation of the proposed methods, involving hundreds of human participants, reveals that both methods meet their primary design goal (either an increased user profit or an increased user satisfaction from the advisor), while performing at least as well on the alternative goal, compared to having people perform with: (a) no advisor at all; (b) an advisor providing the theoretic-optimal advice; and (c) an effective suboptimal-advice-based advisor designed for the non-repeated variant of our experimental framework.

【Keywords】: Human-agent interaction; Advice provisioning; Decision-making

118. A Deep Choice Model.

【Paper Link】【Pages】:850-856

【Authors】: Makoto Otsuka ; Takayuki Osogami

【Abstract】: Human choice is complex in two ways. First, human choice often shows complex dependency on available alternatives. Second, human choice is often made after examining complex items such as images. The recently proposed choice model based on the restricted Boltzmann machine (RBM choice model) has been proved to represent three typical phenomena of human choice, which addresses the first complexity. We extend the RBM choice model to a deep choice model (DCM) to deal with the features of items, which are ignored in the RBM choice model. We then use deep learning to extract latent features from images and plug those latent features as input to the DCM. Our experiments show that the DCM adequately learns the choice that involves both of the two complexities in human choice.

【Keywords】:

119. Personalized Alert Agent for Optimal User Performance.

【Paper Link】【Pages】:857-864

【Authors】: Avraham Shvartzon ; Amos Azaria ; Sarit Kraus ; Claudia V. Goldman ; Joachim Meyer ; Omer Tsimhoni

【Abstract】: Preventive maintenance is essential for the smooth operation of any equipment. Still, people occasionally do not maintain their equipment adequately. Maintenance alert systems attempt to remind people to perform maintenance. However, most of these systems do not provide alerts at the optimal timing, nor do they take into account the time required for maintenance or compute the optimal timing for a specific user. We model the problem of maintenance performance, assuming maintenance is time consuming. We solve for the optimal policy for the user, i.e., the optimal timing for a user to perform maintenance. This optimal strategy depends on the value of the user's time, and thus it may vary from user to user and may change over time. Based on the solved optimal strategy, we present a personalized maintenance agent, which, depending on the value of the user's time, provides alerts to the user when she should perform maintenance. In an experiment using a spaceship computer game, we show that receiving alerts from the personalized alert agent significantly improves user performance.

【Keywords】: Maintenance Optimization; Alert System; Personalization

Technical Papers: Knowledge Representation and Reasoning 36

120. Minimizing User Involvement for Learning Human Mobility Patterns from Location Traces.

【Paper Link】【Pages】:865-871

【Authors】: Basma Alharbi ; Abdulhakim Ali Qahtan ; Xiangliang Zhang

【Abstract】: Utilizing trajectories for modeling human mobility often involves extracting descriptive features for each individual, a procedure heavily based on experts' knowledge. In this work, our objective is to minimize human involvement and exploit the power of community in learning 'features' for individuals from their location traces. We propose a probabilistic graphical model that learns the distribution of latent concepts, named motifs, from anonymized sequences of user locations. To handle variation in user activity level, our model learns motif distributions from sequence-level location co-occurrence of all users. To handle the big variation in location popularity, our model uses an asymmetric prior, conditioned on per-sequence features. We evaluate the new representation in a link prediction task and compare our results to those of baseline approaches.

【Keywords】: Mobility Pattern Modeling; Location Traces; Location Based Social Network; Call Detailed Records

121. Generating CP-Nets Uniformly at Random.

【Paper Link】【Pages】:872-878

【Authors】: Thomas E. Allen ; Judy Goldsmith ; Hayden Elizabeth Justice ; Nicholas Mattei ; Kayla Raines

【Abstract】: Conditional preference networks (CP-nets) are a commonly studied compact formalism for modeling preferences. To study the properties of CP-nets or the performance of CP-net algorithms on average, one needs to generate CP-nets in an equiprobable manner. We discuss common problems with naive generation, including sampling bias, which invalidates the base assumptions of many statistical tests and can undermine the results of an experimental study. We provide a novel algorithm for provably generating acyclic CP-nets uniformly at random. Our method is computationally efficient and allows for multi-valued domains and arbitrary bounds on the indegree in the dependency graph.

【Keywords】: preferences; CP-nets; uniform generation; directed acyclic graphs; dags

122. Boolean Functions with Ordered Domains in Answer Set Programming.

【Paper Link】【Pages】:879-885

【Authors】: Mario Alviano ; Wolfgang Faber ; Hannes Strass

【Abstract】: Boolean functions in Answer Set Programming have proven a useful modelling tool. They are usually specified by means of aggregates or external atoms. A crucial step in computing answer sets for logic programs containing Boolean functions is verifying whether partial interpretations satisfy a Boolean function for all possible values of its undefined atoms. In this paper, we develop a new methodology for showing when such checks can be done in deterministic polynomial time. This provides a unifying view on all currently known polynomial-time decidability results, and furthermore identifies promising new classes that go well beyond the state of the art. Our main technique consists of using an ordering on the atoms to significantly reduce the necessary number of model checks. For many standard aggregates, we show how this ordering can be automatically obtained.

【Keywords】: Boolean functions; aggregates; bipolar

【Paper Link】【Pages】:886-892

【Authors】: Francesco Belardinelli ; Wiebe van der Hoek

【Abstract】: This paper is aimed as a contribution to the use of formal modal languages in Artificial Intelligence. We introduce a multi-modal version of Second-order Propositional Modal Logic (SOPML), an extension of modal logic with propositional quantification, and illustrate its usefulness as a specification language for knowledge representation as well as temporal and spatial reasoning. Then, we define novel notions of (bi)simulation and prove that these preserve the interpretation of SOPML formulas. Finally, we apply these results to assess the expressive power of SOPML.

【Keywords】: Second-Order Propositional Modal Logic, Bisimulation, Temporal and Spatial Reasoning

124. A First-Order Logic of Probability and Only Knowing in Unbounded Domains.

【Paper Link】【Pages】:893-899

【Authors】: Vaishak Belle ; Gerhard Lakemeyer ; Hector J. Levesque

【Abstract】: Only knowing captures the intuitive notion that the beliefs of an agent are precisely those that follow from its knowledge base. It has previously been shown to be useful in characterizing knowledge-based reasoners, especially in a quantified setting. While this allows us to reason about incomplete knowledge in the sense of not knowing whether a formula is true or not, there are many applications where one would like to reason about the degree of belief in a formula. In this work, we propose a new general first-order account of probability and only knowing that admits knowledge bases with incomplete and probabilistic specifications. Beliefs and non-beliefs are then shown to emerge as a direct logical consequence of the sentences of the knowledge base at a corresponding level of specificity.

【Keywords】: reasoning about knowledge and belief; unbounded domains; probability and logic

125. Explaining Inconsistency-Tolerant Query Answering over Description Logic Knowledge Bases.

【Paper Link】【Pages】:900-906

【Authors】: Meghyn Bienvenu ; Camille Bourgaux ; François Goasdoué

【Abstract】: Several inconsistency-tolerant semantics have been introduced for querying inconsistent description logic knowledge bases. This paper addresses the problem of explaining why a tuple is a (non-)answer to a query under such semantics. We define explanations for positive and negative answers under the brave, AR and IAR semantics. We then study the computational properties of explanations in the lightweight description logic DL-Lite_R. For each type of explanation, we analyze the data complexity of recognizing (preferred) explanations and deciding if a given assertion is relevant or necessary. We establish tight connections between intractable explanation problems and variants of propositional satisfiability (SAT), enabling us to generate explanations by exploiting solvers for Boolean satisfaction and optimization problems. Finally, we empirically study the efficiency of our explanation framework using the well-established LUBM benchmark.

【Keywords】: inconsistency-tolerant query answering; computational complexity; DL-Lite

126. Automated Verification and Tightening of Failure Propagation Models.

【Paper Link】【Pages】:907-913

【Authors】: Benjamin Bittner ; Marco Bozzano ; Alessandro Cimatti ; Gianni Zampedri

【Abstract】: Timed Failure Propagation Graphs (TFPGs) are used in the design of safety-critical systems as a way of modeling failure propagation, and to evaluate and implement diagnostic systems. TFPGs are a very rich formalism: they allow modeling Boolean combinations of faults and events, also dependent on the operational modes of the system and quantitative delays between them. TFPGs are often produced manually, from a given dynamic system of greater complexity, as abstract representations of the system behavior under specific faulty conditions. In this paper we tackle two key difficulties in this process: first, how to make sure that no important behavior of the system is overlooked in the TFPG, and that no spurious, non-existent behavior is introduced; second, how to devise the correct values for the delays between events. We propose a model checking approach to automatically validate the completeness and tightness of a TFPG for a given infinite-state dynamic system, and a procedure for the automated synthesis of the delay parameters. The proposed approach is evaluated on a number of synthetic and industrial benchmarks.

【Keywords】: timed failure propagation graphs; validation; parameter synthesis; symbolic model checking

127. A Comparative Study of Ranking-Based Semantics for Abstract Argumentation.

【Paper Link】【Pages】:914-920

【Authors】: Elise Bonzon ; Jérôme Delobelle ; Sébastien Konieczny ; Nicolas Maudet

【Abstract】: Argumentation is a process of evaluating and comparing a set of arguments. A way to compare them consists in using a ranking-based semantics which rank-order arguments from the most to the least acceptable ones. Recently, a number of such semantics have been proposed independently, often associated with some desirable properties. However, there is no comparative study which takes a broader perspective. This is what we propose in this work. We provide a general comparison of all these semantics with respect to the proposed properties. That allows to underline the differences of behavior between the existing semantics.

【Keywords】: Abstract Argumentation; Ranking-based semantics

128. Beyond OWL 2 QL in OBDA: Rewritings and Approximations.

【Paper Link】【Pages】:921-928

【Authors】: Elena Botoeva ; Diego Calvanese ; Valerio Santarelli ; Domenico Fabio Savo ; Alessandro Solimando ; Guohui Xiao

【Abstract】: Ontology-based data access (OBDA) is a novel paradigm facilitating access to relational data, realized by linking data sources to an ontology by means of declarative mappings. DL-Lite_R, which is the logic underpinning the W3C ontology language OWL 2 QL and the current language of choice for OBDA, has been designed with the goal of delegating query answering to the underlying database engine, and thus is restricted in expressive power. E.g., it does not allow one to express disjunctive information, and any form of recursion on the data. The aim of this paper is to overcome these limitations of DL-Lite_R, and extend OBDA to more expressive ontology languages, while still leveraging the underlying relational technology for query answering. We achieve this by relying on two well-known mechanisms, namely conservative rewriting and approximation, but significantly extend their practical impact by bringing into the picture the mapping, an essential component of OBDA. Specifically, we develop techniques to rewrite OBDA specifications with an expressive ontology to "equivalent" ones with a DL-Lite_R ontology, if possible, and to approximate them otherwise. We do so by exploiting the high expressive power of the mapping layer to capture part of the domain semantics of rich ontology languages. We have implemented our techniques in the prototype system OntoProx, making use of the state-of-the-art OBDA system Ontop and the query answering system Clipper, and we have shown their feasibility and effectiveness with experiments on synthetic and real-world data.

【Keywords】: ontology-based data access; query answering; rewriting; approximation

129. SDDs Are Exponentially More Succinct than OBDDs.

【Paper Link】【Pages】:929-935

【Authors】: Simone Bova

【Abstract】: Introduced by Darwiche (2011), sentential decision diagrams (SDDs) are essentially as tractable as ordered binary decision diagrams (OBDDs), but tend to be more succinct in practice. This makes SDDs a prominent representation language, with many applications in artificial intelligence and knowledge compilation. We prove that SDDs are more succinct than OBDDs also in theory, by constructing a family of boolean functions where each member has polynomial SDD size but exponential OBDD size. This exponential separation improves a quasipolynomial separation recently established by Razgon (2014), and settles an open problem in knowledge compilation (Darwiche, 2011).

【Keywords】: Ordered Binary Decision Diagrams; Sentential Decision Diagrams; Exponential Separation

130. On the Containment of SPARQL Queries under Entailment Regimes.

【Paper Link】【Pages】:936-942

【Authors】: Melisachew Wudage Chekol

【Abstract】: Most description logics (DL) query languages allow instance retrieval from an ABox. However, SPARQL is a schema query language allowing access to the TBox (in addition to the ABox). Moreover, its entailment regimes enable to take into account knowledge inferred from knowledge bases in the query answering process. This provides a new perspective for the containment problem. In this paper, we study the containment of SPARQL queries over OWL EL axioms under entailment. OWL EL is the language used by many large scale ontologies and is based on EL++. The main contribution is a novel approach to rewriting queries using SPARQL property paths and the μ-calculus in order to reduce containment test under entailment into validity check in the μ-calculus.

【Keywords】:

131. Logical Foundations of Privacy-Preserving Publishing of Linked Data.

【Paper Link】【Pages】:943-949

【Authors】: Bernardo Cuenca Grau ; Egor V. Kostylev

【Abstract】: The widespread adoption of Linked Data has been driven by the increasing demand for information exchange between organisations, as well as by data publishing regulations in domains such as health care and governance. In this setting, sensitive information is at risk of disclosure since published data can be linked with arbitrary external data sources. In this paper we lay the foundations of privacy-preserving data publishing (PPDP) in the context of Linked Data. We consider anonymisations of RDF graphs (and, more generally, relational datasets with labelled nulls) and define notions of safe and optimal anonymisations. Safety ensures that the anonymised data can be published with provable protection guarantees against linking attacks, whereas optimality ensures that it preserves as much information from the original data as possible, while satisfying the safety requirement. We establish the complexity of the underpinning decision problems both under open-world semantics inherent to RDF and a closed-world semantics, where we assume that an attacker has complete knowledge over some part of the original data.

【Keywords】: Linked Data; RDF Data; privacy; Semantic Web; Logic; Complexity of Reasoning

132. Verifying ConGolog Programs on Bounded Situation Calculus Theories.

【Paper Link】【Pages】:950-956

【Authors】: Giuseppe De Giacomo ; Yves Lespérance ; Fabio Patrizi ; Sebastian Sardiña

【Abstract】: We address verification of high-level programs over situation calculus action theories that have an infinite object domain, but bounded fluent extensions in each situation. We show that verification of mu-calculus temporal properties against ConGolog programs over such bounded theories is decidable in general. To do this, we reformulate the transition semantics of ConGolog to keep the bindings of “pick variables” into a separate variable environment whose size is naturally bounded by the number of variables. We also show that for situation-determined ConGolog programs, we can compile away the program into the action theory itself without loss of generality. This can also be done for arbitrary programs, but only to check certain properties, such as if a situation is the result of a program execution, not for mu-calculus verification.

【Keywords】: Situation Calculus; Congolog Programs; Verification; Bounded Action Theories

133. Qualitative Spatio-Temporal Stream Reasoning with Unobservable Intertemporal Spatial Relations Using Landmarks.

【Paper Link】【Pages】:957-963

【Authors】: Daniel de Leng ; Fredrik Heintz

【Abstract】: Qualitative spatio-temporal reasoning is an active research area in Artificial Intelligence. In many situations there is a need to reason about intertemporal qualitative spatial relations, i.e. qualitative relations between spatial regions at different time-points. However, these relations can never be explicitly observed since they are between regions at different time-points. In applications where the qualitative spatial relations are partly acquired by for example a robotic system it is therefore necessary to infer these relations. This problem has, to the best of our knowledge, not been explicitly studied before. The contribution presented in this paper is two-fold. First, we present a spatio-temporal logic MSTL, which allows for spatio-temporal stream reasoning. Second, we define the concept of a landmark as a region that does not change between time-points and use these landmarks to infer qualitative spatio-temporal relations between non-landmark regions at different time-points. The qualitative spatial reasoning is done in RCC-8, but the approach is general and can be applied to any similar qualitative spatial formalism.

【Keywords】: Spatio-Temporal Reasoning; Knowledge Representation; Intelligent Agents; Stream Reasoning

134. Using Decomposition-Parameters for QBF: Mind the Prefix!

【Paper Link】【Pages】:964-970

【Authors】: Eduard Eiben ; Robert Ganian ; Sebastian Ordyniak

【Abstract】: Similar to the satisfiability (SAT) problem, which can be seen to be the archetypical problem for NP, the quantified Boolean formula problem (QBF) is the archetypical problem for PSPACE. Recently, Atserias and Oliva (2014) showed that, unlike for SAT, many of the well-known decompositional parameters (such as treewidth and pathwidth) do not allow efficient algorithms for QBF. The main reason for this seems to be the lack of awareness of these parameters towards the dependencies between variables of a QBF formula. In this paper we extend the ordinary pathwidth to the QBF-setting by introducing prefix pathwidth, which takes into account the dependencies between variables in a QBF, and show that it leads to an efficient algorithm for QBF. We hope that our approach will help to initiate the study of novel tailor-made decompositional parameters for QBF and thereby help to lift the success of these decompositional parameters from SAT to QBF.

【Keywords】: quantified boolean formula; dependency schemes; satisfiability; treewidth and pathwidth; parameterized complexity
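
Why the quantifier prefix matters can be seen with a tiny brute-force evaluator. This is only an illustrative sketch, not the prefix-pathwidth algorithm from the paper. It takes exponential time, and the names `eval_qbf`, `prefix` and `matrix` are assumptions introduced here. The same matrix (x equivalent to y) is evaluated under two prefixes that differ only in quantifier order.

```python
def eval_qbf(prefix, matrix, assignment=None):
    """Evaluate a closed QBF by exhaustive expansion of the prefix.

    prefix: list of (quantifier, variable) pairs, outermost first,
            with quantifier in {"forall", "exists"}.
    matrix: function mapping a dict {variable: bool} to bool.
    """
    assignment = dict(assignment or {})
    if not prefix:
        return matrix(assignment)
    (quantifier, variable), rest = prefix[0], prefix[1:]
    if quantifier not in ("forall", "exists"):
        raise ValueError(f"unknown quantifier: {quantifier}")
    # Lazily evaluate both truth values of the outermost variable.
    branches = (
        eval_qbf(rest, matrix, {**assignment, variable: value})
        for value in (False, True)
    )
    return all(branches) if quantifier == "forall" else any(branches)


def same_value(a):
    # Matrix: x <-> y
    return a["x"] == a["y"]


# y may depend on x, so y can always copy x.
forall_exists = eval_qbf([("forall", "x"), ("exists", "y")], same_value)
# y is chosen before x, so no single y matches both values of x.
exists_forall = eval_qbf([("exists", "y"), ("forall", "x")], same_value)
print(forall_exists, exists_forall)
```

The two formulas share the same matrix, and hence the same primal graph, yet they differ in truth value; this is the kind of variable dependency that a prefix-aware parameter has to account for.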

135. The Complexity of LTL on Finite Traces: Hard and Easy Fragments.

【Paper Link】【Pages】:971-977

【Authors】: Valeria Fionda ; Gianluigi Greco

【Abstract】: This paper focuses on LTL on finite traces (LTLf) for which satisfiability is known to be PSPACE-complete. However, little is known about the computational properties of fragments of LTLf. In this paper we fill this gap and make the following contributions. First, we identify several LTLf fragments for which the complexity of satisfiability drops to NP-complete or even P, by considering restrictions on the temporal operators and Boolean connectives being allowed. Second, we study a semantic variant of LTLf, which is of interest in the domain of business processes, where models have the property that precisely one propositional variable evaluates true at each time instant. Third, we introduce a reasoner for LTLf and compare its performance with the state of the art.

【Keywords】:
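
As a rough illustration of the finite-trace semantics the abstract refers to, the following sketch evaluates a formula on a finite trace. It is not the reasoner introduced in the paper, and it checks a formula against one given trace rather than deciding satisfiability. The tuple encoding of formulas and the names `holds` and `is_single_variable_trace` are assumptions made for this example; "next" is read as strong next, which is false at the last position.

```python
def holds(formula, trace, i=0):
    """Check whether `formula` holds at position i of a non-empty finite trace.

    trace: list of sets of proposition names true at each instant.
    formula: nested tuples, e.g. ("until", ("atom", "a"), ("atom", "b")).
    """
    op = formula[0]
    if op == "atom":
        return formula[1] in trace[i]
    if op == "not":
        return not holds(formula[1], trace, i)
    if op == "and":
        return holds(formula[1], trace, i) and holds(formula[2], trace, i)
    if op == "or":
        return holds(formula[1], trace, i) or holds(formula[2], trace, i)
    if op == "next":  # strong next: requires a successor position
        return i + 1 < len(trace) and holds(formula[1], trace, i + 1)
    if op == "eventually":
        return any(holds(formula[1], trace, j) for j in range(i, len(trace)))
    if op == "always":
        return all(holds(formula[1], trace, j) for j in range(i, len(trace)))
    if op == "until":
        return any(
            holds(formula[2], trace, j)
            and all(holds(formula[1], trace, k) for k in range(i, j))
            for j in range(i, len(trace))
        )
    raise ValueError(f"unknown operator: {op}")


def is_single_variable_trace(trace):
    """Semantic variant: exactly one proposition is true at each instant."""
    return all(len(state) == 1 for state in trace)


trace = [{"a"}, {"a"}, {"b"}]
a, b = ("atom", "a"), ("atom", "b")
print(holds(("until", a, b), trace))         # a holds until b becomes true
print(holds(("always", ("next", a)), trace)) # fails: last position has no successor
print(is_single_variable_trace(trace))
```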

136. SAT-to-SAT: Declarative Extension of SAT Solvers with New Propagators.

【Paper Link】【Pages】:978-984

【Authors】: Tomi Janhunen ; Shahab Tasharrofi ; Eugenia Ternovska

【Abstract】: Special-purpose propagators speed up solving logic programs by inferring facts that are hard to deduce otherwise. However, implementing special-purpose propagators is a non-trivial task and requires expert knowledge of solvers. This paper proposes a novel approach in logic programming that allows (1) logical specification of both the problem itself and its propagators and (2) automatic incorporation of such propagators into the solving process. We call our proposed language P[R] and our solver SAT-to-SAT because it facilitates communication between several SAT solvers. Using our proposal, non-specialists can specify new reasoning methods (propagators) in a declarative fashion and obtain a solver that benefits from both state-of-the-art techniques implemented in SAT solvers as well as problem-specific reasoning methods that depend on the problem's structure. We implement our proposal and show that it outperforms the existing approach that only allows modeling a problem but does not allow modeling the reasoning methods for that problem.

【Keywords】: Knowledge Representation and Reasoning; Logic Programming; Special-purpose Propagators; SAT Solvers

137. Knowledge Graph Completion with Adaptive Sparse Transfer Matrix.

【Paper Link】【Pages】:985-991

【Authors】: Guoliang Ji ; Kang Liu ; Shizhu He ; Jun Zhao

【Abstract】: We model knowledge graphs for their completion by encoding each entity and relation into a numerical space. All previous work including Trans(E, H, R, and D) ignore the heterogeneity (some relations link many entity pairs and others do not) and the imbalance (the number of head entities and that of tail entities in a relation could be different) of knowledge graphs. In this paper, we propose a novel approach TranSparse to deal with the two issues. In TranSparse, transfer matrices are replaced by adaptive sparse matrices, whose sparse degrees are determined by the number of entities (or entity pairs) linked by relations. In experiments, we design structured and unstructured sparse patterns for transfer matrices and analyze their advantages and disadvantages. We evaluate our approach on triplet classification and link prediction tasks. Experimental results show that TranSparse outperforms Trans(E, H, R, and D) significantly, and achieves state-of-the-art performance.

【Keywords】:

138. Locally Adaptive Translation for Knowledge Graph Embedding.

【Paper Link】【Pages】:992-998

【Authors】: Yantao Jia ; Yuanzhuo Wang ; Hailun Lin ; Xiaolong Jin ; Xueqi Cheng

【Abstract】: Knowledge graph embedding aims to represent entities and relations in a large-scale knowledge graph as elements in a continuous vector space. Existing methods, e.g., TransE and TransH, learn embedding representation by defining a global margin-based loss function over the data. However, the optimal loss function is determined during experiments whose parameters are examined among a closed set of candidates. Moreover, embeddings over two knowledge graphs with different entities and relations share the same set of candidate loss functions, ignoring the locality of both graphs. This leads to the limited performance of embedding related applications. In this paper, we propose a locally adaptive translation method for knowledge graph embedding, called TransA, to find the optimal loss function by adaptively determining its margin over different knowledge graphs. Experiments on two benchmark data sets demonstrate the superiority of the proposed method, as compared to the state-of-the-art ones.

【Keywords】: locally adaptive translation; knowledge graph embedding; optimal margin
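
The role of the margin in a translation-based, margin-based loss can be sketched as follows. This is a generic TransE-style formulation written for illustration, not TransA's adaptive rule: here the margin is a hand-picked constant, and the function names and toy vectors are assumptions introduced for this example.

```python
import math


def translation_distance(head, relation, tail):
    """Euclidean norm of head + relation - tail (small for plausible triples)."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def margin_loss(positive, negative, margin):
    """Hinge loss: the positive triple should beat the negative one by `margin`."""
    return max(
        0.0,
        margin + translation_distance(*positive) - translation_distance(*negative),
    )


head, relation = (0.0, 0.0), (1.0, 0.0)
true_tail, corrupted_tail = (1.0, 0.0), (1.0, 3.0)
positive = (head, relation, true_tail)        # distance 0
negative = (head, relation, corrupted_tail)   # distance 3

# With a small margin the pair is already separated; a larger margin keeps training it.
print(margin_loss(positive, negative, margin=1.0))
print(margin_loss(positive, negative, margin=4.0))
```

The same pair of triples contributes zero loss under one margin and a positive loss under another, which is why a single globally fixed margin may suit one knowledge graph better than another.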

139. Learning Abductive Reasoning Using Random Examples.

【Paper Link】【Pages】:999-1007

【Authors】: Brendan Juba

【Abstract】: We consider a new formulation of abduction in which degrees of "plausibility" of explanations, along with the rules of the domain, are learned from concrete examples (settings of attributes). Our version of abduction thus falls in the "learning to reason" framework of Khardon and Roth. Such approaches enable us to capture a natural notion of "plausibility" in a domain while avoiding the extremely difficult problem of specifying an explicit representation of what is "plausible." We specifically consider the question of which syntactic classes of formulas have efficient algorithms for abduction. We find that the class of k-DNF explanations can be found in polynomial time for any fixed k; but, we also find evidence that even weak versions of our abduction task are intractable for the usual class of conjunctions. This evidence is provided by a connection to the usual, inductive PAC-learning model proposed by Valiant. We also consider an exception-tolerant variant of abduction. We observe that it is possible for polynomial-time algorithms to tolerate a few adversarially chosen exceptions, again for the class of k-DNF explanations. All of the algorithms we study are particularly simple, and indeed are variants of a rule proposed by Mill.

【Keywords】:

140. A Model for Learning Description Logic Ontologies Based on Exact Learning.

【Paper Link】【Pages】:1008-1015

【Authors】: Boris Konev ; Ana Ozaki ; Frank Wolter

【Abstract】: We investigate the problem of learning description logic (DL) ontologies in Angluin et al.’s framework of exact learning via queries posed to an oracle. We consider membership queries of the form “is a tuple a of individuals a certain answer to a data retrieval query q in a given ABox and the unknown target ontology?” and completeness queries of the form “does a hypothesis ontology entail the unknown target ontology?” Given a DL L and a data retrieval query language Q, we study polynomial learnability of ontologies in L using data retrieval queries in Q and provide an almost complete classification for DLs that are fragments of EL with role inclusions and of DL-Lite and for data retrieval queries that range from atomic queries and EL/ELI-instance queries to conjunctive queries. Some results are proved by non-trivial reductions to learning from subsumption examples.

【Keywords】: Description Logic, Exact Learning, Complexity

141. Agenda Separability in Judgment Aggregation.

【Paper Link】【Pages】:1016-1022

【Authors】: Jérôme Lang ; Marija Slavkovik ; Srdjan Vesic

【Abstract】: One of the better studied properties for operators in judgment aggregation is independence, which essentially dictates that the collective judgment on one issue should not depend on the individual judgments given on some other issue(s) in the same agenda. Independence, although considered a desirable property, is too strong, because together with mild additional conditions it implies dictatorship. We propose here a weakening of independence, named agenda separability: a judgment aggregation rule satisfies it if, whenever the agenda is composed of several independent sub-agendas, the resulting collective judgment sets can be computed separately for each sub-agenda and then put together. We show that this property is discriminant, in the sense that among judgment aggregation rules so far studied in the literature, some satisfy it and some do not. We briefly discuss the implications of agenda separability on the computation of judgment aggregation rules.

【Keywords】:

142. Basic Probabilistic Ontological Data Exchange with Existential Rules.

【Paper Link】【Pages】:1023-1029

【Authors】: Thomas Lukasiewicz ; Maria Vanina Martinez ; Livia Predoiu ; Gerardo I. Simari

【Abstract】: We study the complexity of exchanging probabilistic data between ontology-based probabilistic databases. We consider the Datalog+/- family of languages as ontology and ontology mapping languages, and we assume different compact encodings of the probabilities of the probabilistic source databases via Boolean events. We provide an extensive complexity analysis of the problem of deciding the existence of a probabilistic (universal) solution for a given probabilistic source database relative to a (probabilistic) data exchange problem for the different languages considered.

【Keywords】:

143. Resistance to Corruption of Strategic Argumentation.

【Paper Link】【Pages】:1030-1036

【Authors】: Michael J. Maher

【Abstract】: Strategic argumentation provides a simple model of disputation. We investigate it in the context of Dung's abstract argumentation. We show that strategic argumentation under the grounded semantics is resistant to corruption (specifically, collusion and espionage) in a sense similar to Bartholdi et al.'s notion of a voting scheme resistant to manipulation. Under the stable semantics, strategic argumentation is resistant to espionage, but its resistance to collusion varies according to the aims of the disputants. These results are extended to a variety of concrete languages for argumentation.

【Keywords】: strategic argumentation; abstract argumentation; defeasible reasoning; resistance to manipulation

144. Causal Explanation Under Indeterminism: A Sampling Approach.

【Paper Link】【Pages】:1037-1043

【Authors】: Christopher A. Merck ; Samantha Kleinberg

【Abstract】: One of the key uses of causes is to explain why things happen. Explanations of specific events, like an individual's heart attack on Monday afternoon or a particular car accident, help assign responsibility and inform our future decisions. Computational methods for causal inference make use of the vast amounts of data collected by individuals to better understand their behavior and improve their health. However, most methods for explanation of specific events have provided theoretical approaches with limited applicability. In contrast we make two main contributions: an algorithm for explanation that calculates the strength of token causes, and an evaluation based on simulated data that enables objective comparison against prior methods and ground truth. We show that the approach finds the correct relationships in classic test cases (causal chains, common cause, and backup causation) and in a realistic scenario (explaining hyperglycemic episodes in a simulation of type 1 diabetes).

【Keywords】: automated explanation; token causality; time-series data; stochastic processes

145. 'Knowing Whether' in Proper Epistemic Knowledge Bases.

【Paper Link】【Pages】:1044-1050

【Authors】: Tim Miller ; Paolo Felli ; Christian J. Muise ; Adrian R. Pearce ; Liz Sonenberg

【Abstract】: Proper epistemic knowledge bases (PEKBs) are syntactic knowledge bases that use multi-agent epistemic logic to represent nested multi-agent knowledge and belief. PEKBs have certain syntactic restrictions that lead to desirable computational properties; primarily, a PEKB is a conjunction of modal literals, and therefore contains no disjunction. Sound entailment can be checked in polynomial time, and is complete for a large set of arbitrary formulae in logics K_n and KD_n. In this paper, we extend PEKBs to deal with a restricted form of disjunction: 'knowing whether.' An agent i knows whether Q iff agent i knows Q or knows not Q; that is, []Q or []not(Q). In our experience, the ability to represent that an agent knows whether something holds is useful in many multi-agent domains. We represent knowing whether with a modal operator, and present sound polynomial-time entailment algorithms on PEKBs with the knowing whether operator in K_n and KD_n, but which are complete for a smaller class of queries than standard PEKBs.

【Keywords】: epistemic reasoning; multi-agent systems; nested belief
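
The definition "agent i knows whether Q iff agent i knows Q or knows not Q" can be made concrete on a small Kripke model. Note that this is a semantic illustration only: PEKBs are syntactic knowledge bases, and the sketch below is not the paper's entailment algorithm. The model, the agent names and the functions `knows` and `knows_whether` are all assumptions made for this example.

```python
def knows(access, agent, world, prop):
    """[]prop for `agent` at `world`: prop holds in every world the agent considers possible."""
    return all(prop(w) for w in access[agent][world])


def knows_whether(access, agent, world, prop):
    """Knowing whether: []prop or []not(prop)."""
    return knows(access, agent, world, prop) or knows(
        access, agent, world, lambda w: not prop(w)
    )


# Two worlds that differ only in the truth of Q.
q_is_true = {"w1": True, "w2": False}


def q(world):
    return q_is_true[world]


access = {
    # alice can tell the two worlds apart
    "alice": {"w1": {"w1"}, "w2": {"w2"}},
    # bob cannot distinguish them
    "bob": {"w1": {"w1", "w2"}, "w2": {"w1", "w2"}},
}

print(knows_whether(access, "alice", "w1", q))
print(knows_whether(access, "alice", "w2", q))
print(knows_whether(access, "bob", "w1", q))
```

In this toy model alice knows whether Q in both worlds (she knows Q in one and knows not Q in the other), while bob knows neither Q nor not Q.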

146. Ontology-Mediated Queries for NOSQL Databases.

【Paper Link】【Pages】:1051-1057

【Authors】: Marie-Laure Mugnier ; Marie-Christine Rousset ; Federico Ulliana

【Abstract】: Ontology-Based Data Access has been studied so far for relational structures and deployed on top of relational databases. This paradigm enables a uniform access to heterogeneous data sources, also coping with incomplete information. Whether OBDA is suitable also for non-relational structures, like those shared by increasingly popular NOSQL languages, is still an open question. In this paper, we study the problem of answering ontology-mediated queries on top of key-value stores. We formalize the data model and core queries of these systems, and introduce a rule language to express lightweight ontologies on top of data. We study the decidability and data complexity of query answering in this setting.

【Keywords】:

147. Zero-Suppressed Sentential Decision Diagrams.

【Paper Link】【Pages】:1058-1066

【Authors】: Masaaki Nishino ; Norihito Yasuda ; Shin-ichi Minato ; Masaaki Nagata

【Abstract】: The Sentential Decision Diagram (SDD) is a prominent knowledge representation language that subsumes the Ordered Binary Decision Diagram (OBDD) as a strict subset. Like OBDDs, SDDs have canonical forms and support bottom-up operations for combining SDDs, but they are more succinct than OBDDs. In this paper we introduce an SDD variant, called the Zero-suppressed Sentential Decision Diagram (ZSDD). The key idea of ZSDD is to employ new trimming rules for obtaining a canonical form. As a result, ZSDD subsumes the Zero-suppressed Binary Decision Diagram (ZDD) as a strict subset. ZDDs are known for their effectiveness on representing sparse Boolean functions. Likewise, ZSDDs can be more succinct than SDDs when representing sparse Boolean functions. We propose several polytime bottom-up operations over ZSDDs, and a technique for reducing ZSDD size, while maintaining applicability to important queries. We also specify two distinct upper bounds on ZSDD sizes; one is derived from the treewidth of a CNF and the other from the size of a family of sets. Experiments show that ZSDDs are smaller than SDDs or ZDDs for a standard benchmark dataset.

【Keywords】: Knowledge Compilation, Binary Decision Diagrams, Propositional Knowledge Base, Sentential Decision Diagrams

148. Scalable Training of Markov Logic Networks Using Approximate Counting.

【Paper Link】【Pages】:1067-1073

【Authors】: Somdeb Sarkhel ; Deepak Venugopal ; Tuan Anh Pham ; Parag Singla ; Vibhav Gogate

【Abstract】: In this paper, we propose principled weight learning algorithms for Markov logic networks that can easily scale to much larger datasets and application domains than existing algorithms. The main idea in our approach is to use approximate counting techniques to substantially reduce the complexity of the most computation intensive sub-step in weight learning: computing the number of groundings of a first-order formula that evaluate to true given a truth assignment to all the random variables. We derive theoretical bounds on the performance of our new algorithms and demonstrate experimentally that they are orders of magnitude faster and achieve the same accuracy or better than existing approaches.

【Keywords】: Markov Logic Network; Scalable Learning, Weight Learning

149. Metaphysics of Planning Domain Descriptions.

【Paper Link】【Pages】:1074-1080

【Authors】: Siddharth Srivastava ; Stuart J. Russell ; Alessandro Pinto

【Abstract】: STRIPS-like languages (SLLs) have fostered immense advances in automated planning. In practice, SLLs are used to express highly abstract versions of real-world planning problems, leading to more concise models and faster solution times. Unfortunately, as we show in the paper, simple ways of abstracting solvable real-world problems may lead to SLL models that are unsolvable, SLL models whose solutions are incorrect with respect to the real-world problem, or models that are inexpressible in SLLs. There is some evidence that such limitations have restricted the applicability of AI planning technology in the real world, as is apparent in the case of task and motion planning in robotics. We show that the situation can be ameliorated by a combination of increased expressive power — for example, allowing angelic nondeterminism in action effects — and new kinds of algorithmic approaches designed to produce correct solutions from initially incorrect or non-Markovian abstract models.

【Keywords】: Automated Planning, Abstraction, Planning Domain Descriptions

150. Expressive Recommender Systems through Normalized Nonnegative Models.

【Paper Link】【Pages】:1081-1087

【Authors】: Cyril J. Stark

【Abstract】: We introduce normalized nonnegative models (NNM) for explorative data analysis. NNMs are partial convexifications of models from probability theory. We demonstrate their value at the example of item recommendation. We show that NNM-based recommender systems satisfy three criteria that all recommender systems should ideally satisfy: high predictive power, computational tractability, and expressive representations of users and items. Expressive user and item representations are important in practice to succinctly summarize the pool of customers and the pool of items. In NNMs, user representations are expressive because each user's preference can be regarded as normalized mixture of preferences of stereotypical users. The interpretability of item and user representations allow us to arrange properties of items (e.g., genres of movies or topics of documents) or users (e.g., personality traits) hierarchically.

【Keywords】: ML: Recommender Systems; ML: Classification; ML: Data Mining and Knowledge Discovery; ML: Big Data / Scalability; MLA: Applications of Supervised Learning; KRR: Knowledge Acquisition; KRR: Knowledge Representation (General/Other); CS: Structural Learning

151. Complexity Results and Algorithms for Extension Enforcement in Abstract Argumentation.

【Paper Link】【Pages】:1088-1094

【Authors】: Johannes Peter Wallner ; Andreas Niskanen ; Matti Järvisalo

【Abstract】: Understanding the dynamics of argumentation frameworks (AFs) is important in the study of argumentation in AI. In this work, we focus on the so-called extension enforcement problem in abstract argumentation. We provide a nearly complete computational complexity map of fixed-argument extension enforcement under various major AF semantics, with results ranging from polynomial-time algorithms to completeness for the second level of the polynomial hierarchy. Complementing the complexity results, we propose algorithms for NP-hard extension enforcement based on constrained optimization. Going beyond NP, we propose novel counterexample-guided abstraction refinement procedures for the second-level complete problems and present empirical results on a prototype system constituting the first approach to extension enforcement in its generality.

【Keywords】: abstract argumentation; computational complexity; dynamics of argumentation; extension enforcement; encodings; algorithms

152. Query Answering with Inconsistent Existential Rules under Stable Model Semantics.

【Paper Link】【Pages】:1095-1101

【Authors】: Hai Wan ; Heng Zhang ; Peng Xiao ; Haoran Huang ; Yan Zhang

【Abstract】: Classical inconsistency-tolerant query answering relies on selecting maximal components of an ABox/database which are consistent with the ontology. However, some rules in ontologies might be unreliable if they are extracted from ontology learning or written by unskillful knowledge engineers. In this paper we present a framework for handling inconsistent existential rules under stable model semantics, which is defined by a notion called rule repairs to select maximal components of the existential rules. Surprisingly, for R-acyclic existential rules with R-stratified or guarded existential rules with stratified negations, both the data complexity and combined complexity of query answering under the rule repair semantics remain the same as those under the conventional query answering semantics. This leads us to propose several approaches to handle the rule repair semantics by calling answer set programming solvers. An experimental evaluation shows that these approaches scale well for query answering under rule repairs on realistic cases.

【Keywords】: Stable Model Semantics; Inconsistent Existential Rules; Query Answering

153. Affinity Preserving Quantization for Hashing: A Vector Quantization Approach to Learn Compact Binary Codes.

【Paper Link】【Pages】:1102-1108

【Authors】: Zhe Wang ; Ling-Yu Duan ; Tiejun Huang ; Wen Gao

【Abstract】: Hashing techniques are powerful for approximate nearest neighbour (ANN) search. Existing quantization methods in hashing are all focused on scalar quantization (SQ) which is inferior in utilizing the inherent data distribution. In this paper, we propose a novel vector quantization (VQ) method named affinity preserving quantization (APQ) to improve the quantization quality of projection values, which has significantly boosted the performance of state-of-the-art hashing techniques. In particular, our method incorporates the neighbourhood structure in the pre- and post-projection data space into vector quantization. APQ minimizes the quantization errors of projection values as well as the loss of affinity property of original space. An effective algorithm has been proposed to solve the joint optimization problem in APQ, and the extension to larger binary codes has been resolved by applying product quantization to APQ. Extensive experiments have shown that APQ consistently outperforms the state-of-the-art quantization methods, and has significantly improved the performance of various hashing techniques.

【Keywords】:

154. Decidable Verification of Golog Programs over Non-Local Effect Actions.

【Paper Link】【Pages】:1109-1115

【Authors】: Benjamin Zarrieß ; Jens Claßen

【Abstract】: The Golog action programming language is a powerful means to express high-level behaviours in terms of programs over actions defined in a Situation Calculus theory. In particular for physical systems, verifying that the program satisfies certain desired temporal properties is often crucial, but undecidable in general, the latter being due to the language's high expressiveness in terms of first-order quantification, range of action effects, and program constructs. So far, approaches to achieve decidability involved restrictions where action effects either had to be context-free (i.e. not depend on the current state), local (i.e. only affect objects mentioned in the action's parameters), or at least bounded (i.e. only affect a finite number of objects). In this paper, we introduce two new, more general classes of action theories that allow for context-sensitive, non-local, unbounded effects, i.e. actions that may affect an unbounded number of possibly unnamed objects in a state-dependent fashion. We contribute to the further exploration of the boundary between decidability and undecidability for Golog, showing that for our new classes of action theories in the two-variable fragment of first-order logic, verification of CTL* properties of programs over ground actions is decidable.

【Keywords】: Golog Verification; Situation Calculus

155. Mapping Action Language BC to Logic Programs: A Characterization by Postulates.

【Paper Link】【Pages】:1116-1123

【Authors】: Haodi Zhang ; Fangzhen Lin

【Abstract】: We have earlier shown that the standard mappings from action languages B and C to logic programs under answer set semantics can be captured by sets of properties on transition systems. In this paper, we consider action language BC and show that a standard mapping from BC action descriptions to logic programs can be similarly captured when the action rules in the descriptions do not have consistency conditions.

【Keywords】: Causal action theories; Action languages; Logic programming

Technical Papers: Machine Learning Applications 46

156. On the Performance of GoogLeNet and AlexNet Applied to Sketches.

【Paper Link】【Pages】:1124-1128

【Authors】: Pedro Ballester ; Ricardo Matsumura Araujo

【Abstract】: This work provides a study on how Convolutional Neural Networks, trained to identify objects primarily in photos, perform when applied to more abstract representations of the same objects. Our main goal is to better understand the generalization abilities of these networks and their learned inner representations. We show that both GoogLeNet and AlexNet networks are largely unable to recognize abstract sketches that are easily recognizable by humans. Moreover, we show that the measured efficacy varies considerably across different classes and we discuss possible reasons for this.

【Keywords】: Image Classification; Deep Neural Network; Sketch Classification

157. Bayesian Inference of Recursive Sequences of Group Activities from Tracks.

【Paper Link】【Pages】:1129-1137

【Authors】: Ernesto Brau ; Colin Reimer Dawson ; Alfredo Carrillo ; David Sidi ; Clayton T. Morrison

【Abstract】: We present a probabilistic generative model for inferring a description of coordinated, recursively structured group activities at multiple levels of temporal granularity based on observations of individuals’ trajectories. The model accommodates: (1) hierarchically structured groups, (2) activities that are temporally and compositionally recursive, (3) component roles assigning different subactivity dynamics to subgroups of participants, and (4) a nonparametric Gaussian Process model of trajectories. We present an MCMC sampling framework for performing joint inference over recursive activity descriptions and assignment of trajectories to groups, integrating out continuous parameters. We demonstrate the model’s expressive power in several simulated and complex real-world scenarios from the VIRAT and UCLA Aerial Event video data sets.

【Keywords】: MCMC; Bayesian inference; group activity recognition; Bayesian tree models

158. Towards Domain Adaptive Vehicle Detection in Satellite Image by Supervised Super-Resolution Transfer.

【Paper Link】【Pages】:1138-1144

【Authors】: Liujuan Cao ; Rongrong Ji ; Cheng Wang ; Jonathan Li

【Abstract】: Vehicle detection in satellite images has attracted extensive research attention with various emerging applications. However, detector performance is significantly degraded by the low resolution of satellite images, as well as by the limited training data. In this paper, a robust domain-adaptive vehicle detection framework is proposed to bypass both problems. Our innovation is to transfer the detector learning to the high-resolution aerial image domain, where rich supervision exists and robust detectors can be trained. To this end, we first propose a super-resolution algorithm using coupled dictionary learning to "augment" the satellite image region being tested into the aerial domain. Notably, linear detection loss is embedded into the dictionary learning, which enforces the augmented region to be sensitive to the subsequent detector training. Second, to cope with the domain changes, we propose an instance-wise detection using Exemplar Support Vector Machines (E-SVMs), which handles well the intra-class and imaging variations such as scales, rotations, and occlusions. With comprehensive experiments on large-scale satellite image collections, we demonstrate that the proposed framework can significantly boost the detection accuracy over several state-of-the-art methods.

【Keywords】: Vehicle Detection; Super-Resolution; Satellite Image

159. Deep Neural Networks for Learning Graph Representations.

【Paper Link】【Pages】:1145-1152

【Authors】: Shaosheng Cao ; Wei Lu ; Qiongkai Xu

【Abstract】: In this paper, we propose a novel model for learning graph representations, which generates a low-dimensional vector representation for each vertex by capturing the graph structural information. Different from other previous research efforts, we adopt a random surfing model to capture graph structural information directly, instead of using the sampling-based method for generating linear sequences proposed by Perozzi et al. (2014). The advantages of our approach will be illustrated from both theoretical and empirical perspectives. We also give a new perspective for the matrix factorization method proposed by Levy and Goldberg (2014), in which the pointwise mutual information (PMI) matrix is considered as an analytical solution to the objective function of the skip-gram model with negative sampling proposed by Mikolov et al. (2013). Unlike their approach, which uses the SVD to find the low-dimensional projections from the PMI matrix, our model introduces a stacked denoising autoencoder to extract complex features and model non-linearities. To demonstrate the effectiveness of our model, we conduct experiments on clustering and visualization tasks, employing the learned vertex representations as features. Empirical results on datasets of varying sizes show that our model outperforms other state-of-the-art models in such tasks.

【Keywords】:

160. Discriminative Nonparametric Latent Feature Relational Models with Data Augmentation.

【Paper Link】【Pages】:1153-1159

【Authors】: Bei Chen ; Ning Chen ; Jun Zhu ; Jiaming Song ; Bo Zhang

【Abstract】: We present a discriminative nonparametric latent feature relational model (LFRM) for link prediction to automatically infer the dimensionality of latent features. Under the generic RegBayes (regularized Bayesian inference) framework, we handily incorporate the prediction loss with probabilistic inference of a Bayesian model; set distinct regularization parameters for different types of links to handle the imbalance issue in real networks; and unify the analysis of both the smooth logistic log-loss and the piecewise linear hinge loss. For the nonconjugate posterior inference, we present a simple Gibbs sampler via data augmentation, without making restricting assumptions as done in variational methods. We further develop an approximate sampler using stochastic gradient Langevin dynamics to handle large networks with hundreds of thousands of entities and millions of links, orders of magnitude larger than what existing LFRM models can process. Extensive studies on various real networks show promising performance.

【Keywords】: Link Prediction; Bayesian Nonparametrics; Latent Feature Model; Data Augmentation

161. Mitosis Detection in Breast Cancer Histology Images via Deep Cascaded Networks.

【Paper Link】【Pages】:1160-1166

【Authors】: Hao Chen ; Qi Dou ; Xi Wang ; Jing Qin ; Pheng Ann Heng

【Abstract】: The number of mitoses per tissue area gives an important indication of the aggressiveness of invasive breast carcinoma. However, automatic mitosis detection in histology images remains a challenging problem. Traditional methods either employ hand-crafted features to discriminate mitoses from other cells or construct a pixel-wise classifier to label every pixel in a sliding-window way. While the former suffers from the large shape variation of mitoses and the existence of many mimics with similar appearance, the slow speed of the latter prohibits its use in clinical practice. In order to overcome these shortcomings, we propose a fast and accurate method to detect mitosis by designing a novel deep cascaded convolutional neural network, which is composed of two components. First, by leveraging the fully convolutional neural network, we propose a coarse retrieval model to identify and locate the candidates of mitosis while preserving a high sensitivity. Based on these candidates, a fine discrimination model utilizing knowledge transferred from cross-domain is developed to further single out mitoses from hard mimics. Our approach outperformed other methods by a large margin in the 2014 ICPR MITOS-ATYPIA challenge in terms of detection accuracy. When compared with the state-of-the-art methods on the 2012 ICPR MITOSIS data (a smaller and less challenging dataset), our method achieved comparable or better results with a roughly 60 times faster speed.

【Keywords】: biomedical imaging; deep learning; mitosis detection; cascaded network

162. Deep Contextual Networks for Neuronal Structure Segmentation.

【Paper Link】【Pages】:1167-1173

【Authors】: Hao Chen ; Xiaojuan Qi ; Jie-Zhi Cheng ; Pheng Ann Heng

【Abstract】: The goal of connectomics is to manifest the interconnections of neural system with the Electron Microscopy (EM) images. However, the formidable size of EM image data renders human annotation impractical, as it may take decades to fulfill the whole job. An alternative way to reconstruct the connectome can be attained with the computerized scheme that can automatically segment the neuronal structures. The segmentation of EM images is very challenging as the depicted structures can be very diverse. To address this difficult problem, a deep contextual network is proposed here by leveraging multi-level contextual information from the deep hierarchical structure to achieve better segmentation performance. To further improve the robustness against the vanishing gradients and strengthen the capability of the back-propagation of gradient flow, auxiliary classifiers are incorporated in the architecture of our deep neural network. It will be shown that our method can effectively parse the semantic meaning from the images with the underlying neural network and accurately delineate the structural boundaries with the reference of low-level contextual cues. Experimental results on the benchmark dataset of 2012 ISBI segmentation challenge of neuronal structures suggest that the proposed method can outperform the state-of-the-art methods by a large margin with respect to different evaluation measurements. Our method can potentially facilitate the automatic connectome analysis from EM images with less human intervention effort.

【Keywords】: biomedical imaging; deep learning; neuronal structure segmentation; context

163. Seeing the Unseen Network: Inferring Hidden Social Ties from Respondent-Driven Sampling.

【Paper Link】【Pages】:1174-1180

【Authors】: Lin Chen ; Forrest W. Crawford ; Amin Karbasi

【Abstract】: Learning about the social structure of hidden and hard-to-reach populations — such as drug users and sex workers — is a major goal of epidemiological and public health research on risk behaviors and disease prevention. Respondent-driven sampling (RDS) is a peer-referral process widely used by many health organizations, where research subjects recruit other subjects from their social network. In such surveys, researchers observe who recruited whom, along with the time of recruitment and the total number of acquaintances (network degree) of respondents. However, due to privacy concerns, the identities of acquaintances are not disclosed. In this work, we show how to reconstruct the underlying network structure through which the subjects are recruited. We formulate the dynamics of RDS as a continuous-time diffusion process over the underlying graph and derive the likelihood of the recruitment time series under an arbitrary inter-recruitment time distribution. We develop an efficient stochastic optimization algorithm called RENDER (REspoNdent-Driven nEtwork Reconstruction) that finds the network that best explains the collected data. We support our analytical results through an exhaustive set of experiments on both synthetic and real data.

【Keywords】: respondent-driven sampling; network reconstruction; machine learning

164. Robust Multi-View Subspace Learning through Dual Low-Rank Decompositions.

【Paper Link】【Pages】:1181-1187

【Authors】: Zhengming Ding ; Yun Fu

【Abstract】: Multi-view data is highly common nowadays, since various view-points and different sensors tend to facilitate better data representation. However, data from different views show a large divergence. Specifically, one sample lies in two kinds of structures, one is class structure and the other is view structure, which are intertwined with one another in the original feature space. To address this, we develop a Robust Multi-view Subspace Learning algorithm (RMSL) through dual low-rank decompositions, which desires to seek a low-dimensional view-invariant subspace for multi-view data. Through dual low-rank decompositions, RMSL aims to disassemble two intertwined structures from each other in the low-dimensional subspace. Furthermore, we develop two novel graph regularizers to guide dual low-rank decompositions in a supervised fashion. In this way, the semantic gap across different views would be mitigated so that RMSL can preserve more within-class information and reduce the influence of view variance to seek a more robust low-dimensional subspace. Extensive experiments on two multi-view benchmarks, e.g., face and object images, have witnessed the superiority of our proposed algorithm, by comparing it with the state-of-the-art algorithms.

【Keywords】: multi-view; low-rank

165. Graph-without-cut: An Ideal Graph Learning for Image Segmentation.

【Paper Link】【Pages】:1188-1194

【Authors】: Lianli Gao ; Jingkuan Song ; Feiping Nie ; Fuhao Zou ; Nicu Sebe ; Heng Tao Shen

【Abstract】: Graph-based image segmentation organizes the image elements into graphs and partitions an image based on the graph. It has been widely used and many promising results are obtained. Since the segmentation performance highly depends on the graph, most existing methods focus on obtaining a precise similarity graph or on designing efficient cutting/merging strategies. However, these two components are often conducted in two separate steps, and thus the obtained graph similarity may not be the optimal one for segmentation and this may lead to suboptimal results. In this paper, we propose a novel framework, Graph-Without-Cut (GWC), for learning the similarity graph and image segmentations simultaneously. GWC learns the similarity graph by assigning adaptive and optimal neighbors to each vertex based on the spatial and visual information. Meanwhile, a new rank constraint is imposed on the Laplacian matrix of the similarity graph, such that the number of connected components in the resulting similarity graph is exactly equal to the region number. Extensive empirical results on three public data sets (i.e., BSDS300, BSDS500 and MSRC) show that our unsupervised GWC achieves state-of-the-art performance compared with supervised and unsupervised image segmentation approaches.

【Keywords】: image segmentation; graph learning

166. MOOCs Meet Measurement Theory: A Topic-Modelling Approach.

【Paper Link】【Pages】:1195-1201

【Authors】: Jiazhen He ; Benjamin I. P. Rubinstein ; James Bailey ; Rui Zhang ; Sandra Milligan ; Jeffrey Chan

【Abstract】: This paper adapts topic models to the psychometric testing of MOOC students based on their online forum postings. Measurement theory from education and psychology provides statistical models for quantifying a person's attainment of intangible attributes such as attitudes, abilities or intelligence. Such models infer latent skill levels by relating them to individuals' observed responses on a series of items such as quiz questions. The set of items can be used to measure a latent skill if individuals' responses on them conform to a Guttman scale. Such well-scaled items differentiate between individuals and inferred levels span the entire range from most basic to the advanced. In practice, education researchers manually devise items (quiz questions) while optimising well-scaled conformance. Due to the costly nature and expert requirements of this process, psychometric testing has found limited use in everyday teaching. We aim to develop usable measurement models for highly-instrumented MOOC delivery platforms, by using participation in automatically-extracted online forum topics as items. The challenge is to formalise the Guttman scale educational constraint and incorporate it into topic models. To favour topics that automatically conform to a Guttman scale, we introduce a novel regularisation into non-negative matrix factorisation-based topic modelling. We demonstrate the suitability of our approach with both quantitative experiments on three Coursera MOOCs, and with a qualitative survey of topic interpretability on two MOOCs by domain expert interviews.

【Keywords】: MOOCs; topic modelling; measurement theory

167. Creating Images by Learning Image Semantics Using Vector Space Models.

【Paper Link】【Pages】:1202-1208

【Authors】: Derrall Heath ; Dan Ventura

【Abstract】: When dealing with images and semantics, most computational systems attempt to automatically extract meaning from images. Here we attempt to go the other direction and autonomously create images that communicate concepts. We present an enhanced semantic model that is used to generate novel images that convey meaning. We employ a vector space model and a large corpus to learn vector representations of words and then train the semantic model to predict word vectors that could describe a given image. Once trained, the model autonomously guides the process of rendering images that convey particular concepts. A significant contribution is that, because of the semantic associations encoded in these word vectors, we can also render images that convey concepts on which the model was not explicitly trained. We evaluate the semantic model with an image clustering technique and demonstrate that the model is successful in creating images that communicate semantic relationships.

【Keywords】: Image Generation; Vector Space Models; Semantic Models

168. Efficient Learning of Timeseries Shapelets.

【Paper Link】【Pages】:1209-1215

【Authors】: Lu Hou ; James T. Kwok ; Jacek M. Zurada

【Abstract】: In timeseries classification, shapelets are subsequences of timeseries with high discriminative power. Existing methods perform a combinatorial search for shapelet discovery. Even with speedup heuristics such as pruning, clustering, and dimensionality reduction, the search remains computationally expensive. In this paper, we take an entirely different approach and reformulate the shapelet discovery task as a numerical optimization problem. In particular, the shapelet positions are learned by combining the generalized eigenvector method and fused lasso regularizer to encourage a sparse and blocky solution. Extensive experimental results show that the proposed method is orders of magnitude faster than the state-of-the-art shapelet-based methods, while achieving comparable or even better classification accuracy.

【Keywords】: timeseries; shapelets; fused lasso; generalized eigenvector method.

169. Learning to Appreciate the Aesthetic Effects of Clothing.

【Paper Link】【Pages】:1216-1222

【Authors】: Jia Jia ; Jie Huang ; Guangyao Shen ; Tao He ; Zhiyuan Liu ; Huan-Bo Luan ; Chao Yan

【Abstract】: How do people describe clothing? Words like "formal" or "casual" are usually used. However, recent works often focus on recognizing or extracting visual features (e.g., sleeve length, color distribution and clothing pattern) from clothing images accurately. How can we bridge the gap between the visual features and the aesthetic words? In this paper, we formulate this task as a novel three-level framework: visual features (VF) - image-scale space (ISS) - aesthetic words space (AWS). Leveraging the art-field image-scale space as an intermediate layer, we first propose a Stacked Denoising Autoencoder Guided by Correlative Labels (SDAE-GCL) to map the visual features to the image-scale space; and then, according to the semantic distances computed by WordNet::Similarity, we map the most often used aesthetic words in online clothing shops to the image-scale space too. Employing upper body menswear images downloaded from several global online clothing shops as experimental data, the results indicate that the proposed three-level framework can help to capture the subtle relationship between visual features and aesthetic words better compared to several baselines. To demonstrate that our three-level framework and its implementation methods are universally applicable, we finally present some interesting analyses on the fashion trend of menswear in the last 10 years.

【Keywords】: Clothing; Aesthetic Effects; The Image Scale Space

170. Consensus Style Centralizing Auto-Encoder for Weak Style Classification.

【Paper Link】【Pages】:1223-1229

【Authors】: Shuhui Jiang ; Ming Shao ; Chengcheng Jia ; Yun Fu

【Abstract】: Style classification (e.g., architectural, music, fashion) attracts increasing attention in both research and industrial fields. Most existing works focus on composing low-level visual features for style representation. However, little effort has been devoted to automatically learning mid-level or high-level style features by reorganizing low-level descriptors. Moreover, styles are usually spread out and not easy to differentiate from one another. In this paper, we call these less representative images weak style images. To address these issues, we propose a consensus style centralizing auto-encoder (CSCAE) to extract robust style features to facilitate weak style classification. CSCAE is the ensemble of several style centralizing auto-encoders (SCAEs) with consensus constraint. Each SCAE centralizes each feature of certain category in a progressive way. We apply our method in fashion style classification and manga style classification as two example applications. In addition, we collect a new dataset, Online Shopping, for fashion style classification evaluation, which will be publicly available for vision based fashion style research. Experiments demonstrate the effectiveness of SCAE and CSCAE on both public and newly collected datasets when compared with the most recent state-of-the-art works.

【Keywords】: auto-encoder; deep learning; style

171. Column Sampling Based Discrete Supervised Hashing.

【Paper Link】【Pages】:1230-1236

【Authors】: Wang-Cheng Kang ; Wu-Jun Li ; Zhi-Hua Zhou

【Abstract】: By leveraging semantic (label) information, supervised hashing has demonstrated better accuracy than unsupervised hashing in many real applications. Because the hashing-code learning problem is essentially a discrete optimization problem which is hard to solve, most existing supervised hashing methods try to solve a relaxed continuous optimization problem by dropping the discrete constraints. However, these methods typically suffer from poor performance due to the errors caused by the relaxation. Some other methods try to directly solve the discrete optimization problem. However, they are typically time-consuming and unscalable. In this paper, we propose a novel method, called column sampling based discrete supervised hashing (COSDISH), to directly learn the discrete hashing code from semantic information. COSDISH is an iterative method, in each iteration of which several columns are sampled from the semantic similarity matrix and then the hashing code is decomposed into two parts which can be alternately optimized in a discrete way. Theoretical analysis shows that the learning (optimization) algorithm of COSDISH has a constant-approximation bound in each step of the alternating optimization procedure. Empirical results on datasets with semantic labels illustrate that COSDISH can outperform the state-of-the-art methods in real applications like image retrieval.

【Keywords】: learning to hash; supervised learning; nearest neighbor search

172. A Framework for Outlier Description Using Constraint Programming.

【Paper Link】【Pages】:1237-1243

【Authors】: Chia-Tung Kuo ; Ian Davidson

【Abstract】: Outlier detection has been studied extensively and employed in diverse applications in the past decades. In this paper we formulate a related yet understudied problem which we call outlier description. This problem often arises in practice when we have a small number of data instances that have been identified as outliers and we wish to explain why they are outliers. We propose a framework based on constraint programming to find an optimal subset of features that most differentiates the outliers and normal instances. We further demonstrate that the framework offers great flexibility in incorporating diverse scenarios arising in practice such as multiple explanations and human-in-the-loop extensions. We empirically evaluate our proposed framework on real datasets, including medical imaging and text corpus, and demonstrate how the results are useful and interpretable in these domains.

【Keywords】:

173. Random Mixed Field Model for Mixed-Attribute Data Restoration.

【Paper Link】【Pages】:1244-1250

【Authors】: Qiang Li ; Wei Bian ; Richard Yi Da Xu ; Jane You ; Dacheng Tao

【Abstract】: Noisy and incomplete data restoration is a critical preprocessing step in developing effective learning algorithms, which aims to reduce the effect of noise and missing values in data. By utilizing attribute correlations and/or instance similarities, various techniques have been developed for data denoising and imputation tasks. However, existing data restoration methods are either specifically designed for a particular task, or incapable of dealing with mixed-attribute data. In this paper, we develop a new probabilistic model to provide a general and principled method for restoring mixed-attribute data. The main contributions of this study are twofold: a) a unified generative model, utilizing a generic random mixed field (RMF) prior, is designed to exploit mixed-attribute correlations; and b) a structured mean-field variational approach is proposed to solve the challenging inference problem of simultaneous denoising and imputation. We evaluate our method by classification experiments on both synthetic data and real benchmark datasets. Experiments demonstrate that, compared with other data restoration methods, our approach can effectively improve the classification accuracy of noisy and incomplete data.

【Keywords】:

174. Learning with Marginalized Corrupted Features and Labels Together.

【Paper Link】【Pages】:1251-1257

【Authors】: Yingming Li ; Ming Yang ; Zenglin Xu ; Zhongfei (Mark) Zhang

【Abstract】: Tagging has become increasingly important in many real-world applications, notably web applications such as web blogs and resource sharing systems. Despite this importance, tagging methods often face difficult challenges such as limited training samples and incomplete labels, which usually lead to degraded performance on tag prediction. To improve the generalization performance, in this paper, we propose Regularized Marginalized Cross-View learning (RMCV), which jointly models attribute noise and label noise. In more detail, the proposed model constructs infinite training examples with attribute noise from known exponential-family distributions and exploits label noise via a marginalized denoising autoencoder. Therefore, the model benefits from its robustness and alleviates the problem of tag sparsity. While RMCV is a general method for learning tagging, in the evaluations we focus on the specific application of multi-label text tagging. Extensive evaluations on three benchmark data sets demonstrate that RMCV achieves superior performance in comparison with state-of-the-art methods.

【Keywords】:

175. Towards Optimal Binary Code Learning via Ordinal Embedding.

【Paper Link】【Pages】:1258-1265

【Authors】: Hong Liu ; Rongrong Ji ; Yongjian Wu ; Wei Liu

【Abstract】: Binary code learning, a.k.a. hashing, has recently become popular due to its high efficiency in large-scale similarity search and recognition. It typically maps high-dimensional data points to binary codes, where data similarity can be efficiently computed via rapid Hamming distance. Most existing unsupervised hashing schemes pursue binary codes by reducing the quantization error from an original real-valued data space to a resulting Hamming space. On the other hand, most existing supervised hashing schemes constrain binary code learning to correlate with pairwise similarity labels. However, few methods consider ordinal relations in the binary code learning process, which serve as a very significant cue to learn the optimal binary codes for similarity search. In this paper, we propose a novel hashing scheme, dubbed Ordinal Embedding Hashing (OEH), which embeds given ordinal relations among data points to learn ranking-preserving binary codes. The core idea is to construct a directed unweighted graph to capture the ordinal relations, and then train the hash functions using this ordinal graph to preserve the permutation relations in the Hamming space. To learn such hash functions effectively, we further relax the discrete constraints and design a stochastic gradient descent algorithm to obtain the optimal solution. Experimental results on two large-scale benchmark datasets demonstrate that the proposed OEH method can achieve superior performance over state-of-the-art approaches. Finally, the evaluation on a query-by-humming dataset demonstrates that OEH also performs well for music retrieval using a user's humming or singing.

【Keywords】: Binary Code Learning; Hashing; Ordinal Embedding
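
The abstract above relies on Hamming distance between binary codes for similarity search. The following is a minimal sketch of that retrieval step only, assuming codes are stored as Python integers; the function names are hypothetical and the learning of the hash functions (the actual contribution of OEH) is not reproduced here.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query: int, database: list[int]) -> list[int]:
    """Return database indices sorted by Hamming distance to the query.

    Python's sort is stable, so ties keep their original database order.
    """
    return sorted(range(len(database)),
                  key=lambda i: hamming_distance(query, database[i]))


def triplet_preserved(query: int, closer: int, farther: int) -> bool:
    """Check one ordinal relation: 'closer' should rank before 'farther'."""
    return hamming_distance(query, closer) < hamming_distance(query, farther)
```

A ranking-preserving code would make `triplet_preserved` hold for as many of the given ordinal relations as possible.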

176. Recognizing Complex Activities by a Probabilistic Interval-Based Model.

【Paper Link】【Pages】:1266-1272

【Authors】: Li Liu ; Li Cheng ; Ye Liu ; Yongpo Jia ; David S. Rosenblum

【Abstract】: A key challenge in complex activity recognition is the fact that a complex activity can often be performed in several different ways, with each consisting of its own configuration of atomic actions and their temporal dependencies. This leads us to define an atomic activity-based probabilistic framework that employs Allen's interval relations to represent local temporal dependencies. The framework introduces a latent variable from the Chinese Restaurant Process to explicitly characterize these unique internal configurations of a particular complex activity as a variable number of tables. It can be analytically shown that the resulting interval network satisfies the transitivity property, and as a result, all local temporal dependencies can be retained and are globally consistent. Empirical evaluations on benchmark datasets suggest our approach significantly outperforms the state-of-the-art methods.

【Keywords】: Complex activity recognition; Probabilistic graphical model; Allen’s interval relation; Interval network; Temporal dependency; Chinese restaurant process
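
Allen's interval relations, which the abstract above uses to encode temporal dependencies between atomic actions, can be illustrated with a small classifier over two intervals. This is a sketch under the assumption that each interval is a `(start, end)` pair with `start < end`; the function name and the relation labels are illustrative choices, and the probabilistic model itself is not shown.

```python
def allen_relation(a: tuple[float, float], b: tuple[float, float]) -> str:
    """Name the Allen relation that holds from interval a to interval b.

    Both intervals are assumed to be proper, i.e. start < end.
    """
    (s1, e1), (s2, e2) = a, b
    if e1 < s2:
        return "before"
    if e2 < s1:
        return "after"
    if e1 == s2:
        return "meets"
    if e2 == s1:
        return "met-by"
    if s1 == s2 and e1 == e2:
        return "equals"
    if s1 == s2:
        return "starts" if e1 < e2 else "started-by"
    if e1 == e2:
        return "finishes" if s1 > s2 else "finished-by"
    if s2 < s1 and e1 < e2:
        return "during"
    if s1 < s2 and e2 < e1:
        return "contains"
    # Only partial overlaps with distinct endpoints remain at this point.
    return "overlaps" if s1 < s2 else "overlapped-by"
```

For example, an action spanning `(0, 4)` and another spanning `(2, 6)` are related by "overlaps".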

177. Learning Adaptive Forecasting Models from Irregularly Sampled Multivariate Clinical Data.

【Paper Link】【Pages】:1273-1279

【Authors】: Zitao Liu ; Milos Hauskrecht

【Abstract】: Building accurate predictive models of clinical multivariate time series is crucial for understanding the patient condition, the dynamics of a disease, and clinical decision making. A challenging aspect of this process is that the model should be flexible and adaptive enough to reflect patient-specific temporal behaviors well, even when the available patient-specific data are sparse and span a short time. To address this problem we propose and develop an adaptive two-stage forecasting approach for modeling multivariate, irregularly sampled clinical time series of varying lengths. The proposed model (1) learns the population trend from a collection of time series for past patients; (2) captures individual-specific short-term multivariate variability; and (3) adapts by automatically adjusting its predictions based on new observations. The proposed forecasting model is evaluated on a real-world clinical time series dataset. The results demonstrate that our approach is superior on the prediction tasks for multivariate, irregularly sampled clinical time series, and it outperforms both the population-based and patient-specific time series prediction models in terms of prediction accuracy.

【Keywords】:

178. Deep Learning for Algorithm Portfolios.

【Paper Link】【Pages】:1280-1286

【Authors】: Andrea Loreggia ; Yuri Malitsky ; Horst Samulowitz ; Vijay A. Saraswat

【Abstract】: It is well established that in many scenarios there is no single solver that will provide optimal performance across a wide range of problem instances. Taking advantage of this observation, research into algorithm selection is designed to help identify the best approach for each problem at hand. This segregation is usually based on carefully constructed features, designed to quickly present the overall structure of the instance as a constant size numeric vector. Based on these features, a plethora of machine learning techniques can be utilized to predict the appropriate solver to execute, leading to significant improvements over relying solely on any one solver. However, being manually constructed, the creation of good features is an arduous task requiring a great deal of knowledge of the problem domain of interest. To alleviate this costly yet crucial step, this paper presents an automated methodology for producing an informative set of features utilizing a deep neural network. We show that the presented approach completely automates the algorithm selection pipeline and is able to achieve significantly better performance than a single best solver across multiple problem domains.

【Keywords】: deep learning; convolutional neural network; algorithm portfolios; knowledge representation

179. Convolutional Neural Networks over Tree Structures for Programming Language Processing.

【Paper Link】【Pages】:1287-1293

【Authors】: Lili Mou ; Ge Li ; Lu Zhang ; Tao Wang ; Zhi Jin

【Abstract】: Programming language processing (similar to natural language processing) is a hot research topic in the field of software engineering; it has also aroused growing interest in the artificial intelligence community. However, different from a natural language sentence, a program contains rich, explicit, and complicated structural information. Hence, traditional NLP models may be inappropriate for programs. In this paper, we propose a novel tree-based convolutional neural network (TBCNN) for programming language processing, in which a convolution kernel is designed over programs' abstract syntax trees to capture structural information. TBCNN is a generic architecture for programming language processing; our experiments show its effectiveness in two different program analysis tasks: classifying programs according to functionality, and detecting code snippets of certain patterns. TBCNN outperforms baseline methods, including several neural models for NLP.

【Keywords】: deep learning; neural network; program analysis
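
To make the notion of structural information in an abstract syntax tree concrete, the sketch below extracts parent-and-children windows from a program's AST, the kind of local tree neighbourhood a tree-based convolution kernel could slide over. It uses Python's standard `ast` module on Python source purely as an illustration; the window definition and function name are assumptions, and this is not the TBCNN kernel from the paper.

```python
import ast


def parent_child_windows(source: str) -> list[tuple[str, list[str]]]:
    """List (parent node type, child node types) for every internal AST node."""
    tree = ast.parse(source)
    windows = []
    for node in ast.walk(tree):
        children = [type(child).__name__ for child in ast.iter_child_nodes(node)]
        if children:  # leaves have no window of their own
            windows.append((type(node).__name__, children))
    return windows
```

Each window could then be mapped to node-type embeddings and combined by a learned kernel, which is where a model such as TBCNN would take over.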

180. Learning Tractable Probabilistic Models for Fault Localization.

【Paper Link】【Pages】:1294-1301

【Authors】: Aniruddh Nath ; Pedro M. Domingos

【Abstract】: In recent years, several probabilistic techniques have been applied to various debugging problems. However, most existing probabilistic debugging systems use relatively simple statistical models, and fail to generalize across multiple programs. In this work, we propose Tractable Fault Localization Models (TFLMs) that can be learned from data, and probabilistically infer the location of the bug. While most previous statistical debugging methods generalize over many executions of a single program, TFLMs are trained on a corpus of previously seen buggy programs, and learn to identify recurring patterns of bugs. Widely-used fault localization techniques such as TARANTULA evaluate the suspiciousness of each line in isolation; in contrast, a TFLM defines a joint probability distribution over buggy indicator variables for each line. Joint distributions with rich dependency structure are often computationally intractable; TFLMs avoid this by exploiting recent developments in tractable probabilistic models (specifically, Relational SPNs). Further, TFLMs can incorporate additional sources of information, including coverage-based features such as TARANTULA. We evaluate the fault localization performance of TFLMs that include TARANTULA scores as features in the probabilistic model. Our study shows that the learned TFLMs isolate bugs more effectively than previous statistical methods or using TARANTULA directly.

【Keywords】: Statistical relational learning; Tractable models; Automated debugging
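
The abstract above contrasts TFLMs with TARANTULA, which scores each line in isolation from test coverage. A minimal sketch of the commonly cited TARANTULA suspiciousness formula follows, assuming coverage is given as per-line counts of passing and failing tests; the function and parameter names are hypothetical, and returning 0.0 for a line that no test covers is a convention chosen here.

```python
def tarantula_suspiciousness(failed_cov: int, passed_cov: int,
                             total_failed: int, total_passed: int) -> float:
    """Suspiciousness of one line from its coverage by failing and passing tests.

    score = fail_ratio / (pass_ratio + fail_ratio), where each ratio is the
    fraction of failing (or passing) tests that execute the line.
    """
    fail_ratio = failed_cov / total_failed if total_failed else 0.0
    pass_ratio = passed_cov / total_passed if total_passed else 0.0
    denominator = fail_ratio + pass_ratio
    if denominator == 0.0:
        return 0.0  # assumed convention for lines no test executes
    return fail_ratio / denominator
```

Because each line is scored independently, the score cannot express dependencies between lines, which is the gap the joint model in the abstract is meant to address.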

181. Unsupervised Feature Selection with Structured Graph Optimization.

【Paper Link】【Pages】:1302-1308

【Authors】: Feiping Nie ; Wei Zhu ; Xuelong Li

【Abstract】: Since large amounts of unlabelled, high-dimensional data need to be processed, unsupervised feature selection has become an important and challenging problem in machine learning. Conventional embedded unsupervised methods always need to construct the similarity matrix, which makes the selected features highly dependent on the learned structure. However, real-world data always contain many noisy samples and features, so the similarity matrix obtained from the original data cannot be fully relied upon. We propose an unsupervised feature selection approach that performs feature selection and local structure learning simultaneously, so that the similarity matrix can be determined adaptively. Moreover, we constrain the similarity matrix so that it contains more accurate information about the data structure; thus the proposed approach can select more valuable features. An efficient and simple algorithm is derived to optimize the problem. Experiments on various benchmark data sets, including handwritten digit data, face image data and biomedical data, validate the effectiveness of the proposed approach.

【Keywords】: Unsupervised Feature Selection; Embedded Method; Spectral Analysis

182. Differential Privacy Preservation for Deep Auto-Encoders: an Application of Human Behavior Prediction.

【Paper Link】【Pages】:1309-1316

【Authors】: NhatHai Phan ; Yue Wang ; Xintao Wu ; Dejing Dou

【Abstract】: In recent years, deep learning has spread beyond both academia and industry with many exciting real-world applications. The development of deep learning has presented obvious privacy issues. However, there has been a lack of scientific study about privacy preservation in deep learning. In this paper, we concentrate on the auto-encoder, a fundamental component in deep learning, and propose the deep private auto-encoder (dPA). Our main idea is to enforce ε-differential privacy by perturbing the objective functions of the traditional deep auto-encoder, rather than its results. We apply the dPA to human behavior prediction in a health social network. Theoretical analysis and thorough experimental evaluations show that the dPA is highly effective and efficient, and it significantly outperforms existing solutions.

【Keywords】: differential privacy; deep learning; health social network; human behavior prediction

183. Privacy-CNH: A Framework to Detect Photo Privacy with Convolutional Neural Network using Hierarchical Features.

【Paper Link】【Pages】:1317-1323

【Authors】: Lam Tran ; Deguang Kong ; Hongxia Jin ; Ji Liu

【Abstract】: Photo privacy is a very important problem in the digital age where photos are commonly shared on social networking sites and mobile devices. The main challenge in photo privacy detection is how to generate discriminant features to accurately detect privacy-at-risk photos. Existing photo privacy detection works, which rely on low-level vision features, are non-informative to the users regarding what privacy information is leaked from their photos. In this paper, we propose a new framework called Privacy-CNH that utilizes hierarchical features, which include both object and convolutional features, in a deep learning model to detect privacy-at-risk photos. The generation of object features enables our model to better inform the users about the reason why a photo has privacy risk. The combination of convolutional and object features provides a richer model to understand photo privacy from different aspects, thus improving photo privacy detection accuracy. Experimental results demonstrate that the proposed model outperforms the state-of-the-art work and the standard convolutional neural network (CNN) with low-level features on photo privacy detection tasks.

【Keywords】: Image Classification; Privacy; Deep Learning

184. Drosophila Gene Expression Pattern Annotations via Multi-Instance Biological Relevance Learning.

【Paper Link】【Pages】:1324-1330

【Authors】: Hua Wang ; Cheng Deng ; Hao Zhang ; Xinbo Gao ; Heng Huang

【Abstract】: Recent developments in biology have produced a large number of gene expression patterns, many of which have been annotated textually with anatomical and developmental terms. These terms spatially correspond to local regions of the images, which are attached collectively to groups of images. Because one does not know which term is assigned to which region of which image in the group, the developmental stage classification and anatomical term annotation turn out to be a multi-instance learning (MIL) problem, which considers input as bags of instances, with labels assigned to the bags. Most existing MIL methods routinely use the Bag-to-Bag (B2B) distances, which, however, are often computationally expensive and may not truly reflect the similarities between the anatomical and developmental terms. In this paper, we approach the MIL problem from a new perspective using the Class-to-Bag (C2B) distances, which directly assess the relations between annotation terms and image panels. Taking into account the two challenging properties of multi-instance gene expression data, high heterogeneity and weak label association, we compute the C2B distance by introducing class-specific distance metrics and locally adaptive significance coefficients. We apply our new approach to automatic gene expression pattern classification and annotation on the Drosophila melanogaster species. Extensive experiments have demonstrated the effectiveness of our new method.

【Keywords】: Multi-Instance Learning; Drosophila Gene Expression Pattern Annotations

185. Recommending Groups to Users Using User-Group Engagement and Time-Dependent Matrix Factorization.

【Paper Link】【Pages】:1331-1337

【Authors】: Xin Wang ; Roger Donaldson ; Christopher Nell ; Peter Gorniak ; Martin Ester ; Jiajun Bu

【Abstract】: Social networks often provide group features to help users with similar interests associate and consume content together. Recommending groups to users poses challenges due to their complex relationship: user-group affinity is typically measured implicitly and varies with time; similarly, group characteristics change as users join and leave. To tackle these challenges, we adapt existing matrix factorization techniques to learn user-group affinity based on two different implicit engagement metrics: (i) which group-provided content users consume; and (ii) which content users provide to groups. To capture the temporally extended nature of group engagement we implement a time-varying factorization. We test the assertion that latent preferences for groups and users are sparse in investigating elastic-net regularization. Experiments using data from DeviantArt indicate that the time-varying implicit engagement-based model provides the best top-K group recommendations, illustrating the benefit of the added model complexity.

【Keywords】: User Behavioural Modelling; Personalization; Recommendation; Temporal Dynamics

186. Instilling Social to Physical: Co-Regularized Heterogeneous Transfer Learning.

【Paper Link】【Pages】:1338-1344

【Authors】: Ying Wei ; Yin Zhu ; Cane Wing-ki Leung ; Yangqiu Song ; Qiang Yang

【Abstract】: Ubiquitous computing tasks, such as human activity recognition (HAR), are enabling a wide spectrum of applications, ranging from healthcare to environment monitoring. The success of a ubiquitous computing task relies on sufficient physical sensor data with ground-truth labels, which are always scarce due to the expensive annotating process. Meanwhile, social media platforms provide a lot of social or semantic context information. People frequently share what they are doing and where they are in the messages they post. This rich set of socially shared activities motivates us to transfer knowledge from social media to address the sparsity issue of labelled physical sensor data. In order to transfer the knowledge of social and semantic context, we propose a Co-Regularized Heterogeneous Transfer Learning (CoHTL) model, which builds a common semantic space derived from two heterogeneous domains. Our proposed method outperforms state-of-the-art methods on two ubiquitous computing tasks, namely human activity recognition and region function discovery.

【Keywords】: Heterogeneous Transfer Learning; Ubiquitous Computing; Social Media Mining

187. Exploiting an Oracle That Reports AUC Scores in Machine Learning Contests.

【Paper Link】【Pages】:1345-1351

【Authors】: Jacob Whitehill

【Abstract】: In machine learning contests such as the ImageNet Large Scale Visual Recognition Challenge and the KDD Cup, contestants can submit candidate solutions and receive from an oracle (typically the organizers of the competition) the accuracy of their guesses compared to the ground-truth labels. One of the most commonly used accuracy metrics for binary classification tasks is the Area Under the Receiver Operating Characteristics Curve (AUC). In this paper we provide proofs-of-concept of how knowledge of the AUC of a set of guesses can be used, in two different kinds of attacks, to improve the accuracy of those guesses. On the other hand, we also demonstrate the intractability of one kind of AUC exploit by proving that the number of possible binary labelings of n examples for which a candidate solution obtains an AUC score of c grows exponentially in n, for every c in (0,1).

【Keywords】: area under the ROC curve; data-mining contests
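
The AUC that the oracle reports can be computed as the fraction of (positive, negative) pairs that the guesses rank correctly, with ties counted as one half. The sketch below implements that definition and, for tiny inputs only, enumerates by brute force the labelings consistent with a reported AUC. The function names are hypothetical, and this illustrates the quantity discussed in the abstract above rather than either attack from the paper.

```python
from fractions import Fraction
from itertools import product


def auc(scores: list[float], labels: tuple[int, ...]) -> Fraction:
    """Exact AUC: share of positive/negative pairs ranked correctly (ties = 1/2)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = Fraction(0)
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1
            elif p == n:
                wins += Fraction(1, 2)
    return wins / (len(positives) * len(negatives))


def labelings_with_auc(scores: list[float], target: Fraction) -> list[tuple[int, ...]]:
    """Brute-force all binary labelings whose AUC equals the target (small n only)."""
    matches = []
    for labels in product((0, 1), repeat=len(scores)):
        if 0 < sum(labels) < len(scores) and auc(scores, labels) == target:
            matches.append(labels)
    return matches
```

The enumeration visits all 2^n labelings, so it is only practical for a handful of examples; it is meant to show why the number of labelings consistent with a single AUC value matters to an attacker.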

188. Efficient Nonparametric Subgraph Detection Using Tree Shaped Priors.

【Paper Link】【Pages】:1352-1358

【Authors】: Nannan Wu ; Feng Chen ; Jianxin Li ; Baojian Zhou ; Naren Ramakrishnan

【Abstract】: Non-parametric graph scan (NPGS) statistics are used to detect anomalous connected subgraphs on graphs, and have a wide variety of applications, such as disease outbreak detection, road traffic congestion detection, and event detection in social media. In contrast to traditional parametric scan statistics (e.g., the Kulldorff statistic), NPGS statistics are free of distributional assumptions and can be applied to heterogeneous graph data. In this paper, we make a number of contributions to the computational study of NPGS statistics. First, we present a novel reformulation of the problem as a sequence of Budget Prize-Collecting Steiner Tree (B-PCST) sub-problems. Second, we show that this reformulated problem is NP-hard for a large class of nonparametric statistic functions. Third, we further develop efficient exact and approximate algorithms for a special category of graphs in which the anomalous subgraphs can be reformulated in a fixed tree topology. Finally, using extensive experiments we demonstrate the performance of our proposed algorithms in two real-world application domains (water pollution detection in water sensor networks and spatial event detection in social media networks) and contrast against state-of-the-art connected subgraph detection methods.

【Keywords】: Scan Statistics; Connected Subgraph Detection; Anomalous Pattern Detection; Event Detection

189. Factorization Ranking Model for Move Prediction in the Game of Go.

【Paper Link】【Pages】:1359-1365

【Authors】: Chenjun Xiao ; Martin Müller

【Abstract】: In this paper, we investigate the move prediction problem in the game of Go by proposing a new ranking model named the Factorization Bradley Terry (FBT) model. This new model considers the move prediction problem as group competitions while also taking the interaction between features into account. An FBT model is able to provide a probability distribution that expresses a preference over moves. Therefore it can be easily compiled into an evaluation function and applied in a modern Go program. We propose a Stochastic Gradient Descent (SGD) algorithm to train an FBT model using expert game records, and provide two methods for fast computation of the gradient in order to speed up the training process. Experimental results show that our FBT model outperforms the state-of-the-art move prediction system of Latent Factor Ranking (LFR).

【Keywords】: Factorization ranking model; FBT; Move prediction in Computer Go
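
As background for the abstract above, a plain generalized Bradley-Terry model treats each candidate move as a team of features and assigns it a probability proportional to the product of its feature strengths. The sketch below shows only that baseline; the pairwise feature-interaction term that distinguishes the FBT model is not reproduced, and the feature names and strength values are made up for illustration.

```python
import math


def move_probabilities(move_features: list[list[str]],
                       gamma: dict[str, float]) -> list[float]:
    """Probability of each candidate move under a generalized Bradley-Terry model.

    A move's strength is the product of the strengths (gamma) of its features;
    probabilities are the strengths normalized over all candidate moves.
    """
    strengths = [math.prod(gamma[f] for f in features)
                 for features in move_features]
    total = sum(strengths)
    return [s / total for s in strengths]
```

With hypothetical strengths `{"capture": 3.0, "edge": 0.5}`, a capturing move would receive six times the probability of a plain edge move.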

190. Joint Multi-View Representation Learning and Image Tagging.

【Paper Link】【Pages】:1366-1372

【Authors】: Zhe Xue ; Guorong Li ; Qingming Huang

【Abstract】: Automatic image annotation is an important problem in several machine learning applications such as image search. Since there exists a semantic gap between low-level image features and high-level semantics, the descriptive ability of the image representation can largely affect annotation results. In fact, image representation learning and image tagging are two closely related tasks. A proper image representation can achieve better image annotation results, and image tags can be treated as guidance to learn a more effective image representation. In this paper, we present an optimal predictive subspace learning method which jointly conducts multi-view representation learning and image tagging. The two tasks can promote each other and the annotation performance can be further improved. To make the subspace more compact and discriminative, both visual structure and semantic information are exploited during learning. Moreover, we introduce powerful predictors (SVM) for image tagging to achieve better annotation performance. Experiments on standard image annotation datasets demonstrate the advantages of our method over existing image annotation methods.

【Keywords】: Image Tagging; Image Representation; Multi-View Learning

191. Learning Deep Convolutional Neural Networks for X-Ray Protein Crystallization Image Analysis.

【Paper Link】【Pages】:1373-1379

【Authors】: Margot Lisa-Jing Yann ; Yichuan Tang

【Abstract】: Obtaining a protein's 3D structure is crucial to the understanding of its functions and interactions with other proteins. It is critical to accelerate the protein crystallization process with improved accuracy for understanding cancer and designing drugs. Systematic high-throughput approaches in protein crystallization have been widely applied, generating a large number of protein crystallization-trial images. Therefore, an efficient and effective automatic analysis for these images is a top priority. In this paper, we present a novel system, CrystalNet, for automatically labeling outcomes of protein crystallization-trial images. CrystalNet is a deep convolutional neural network that automatically extracts features from X-ray protein crystallization images for classification. We show that (1) CrystalNet can provide real-time labels for crystallization images effectively, requiring approximately 2 seconds to provide labels for all 1536 images of the crystallization microassay on each plate; (2) compared with the state-of-the-art classification systems in crystallization image analysis, our technique demonstrates an improvement of 8% in accuracy and achieves 90.8% accuracy in classification. As a part of the high-throughput pipeline which generates millions of images a year, CrystalNet can lead to a substantial reduction of labor-intensive screening.

【Keywords】: Deep Learning; Neural Networks; Protein Crystallography

192. Linear Submodular Bandits with a Knapsack Constraint.

【Paper Link】【Pages】:1380-1386

【Authors】: Baosheng Yu ; Meng Fang ; Dacheng Tao

【Abstract】: Linear submodular bandits have been proven to be effective in solving the diversification and feature-based exploration problems in retrieval systems. Concurrently, many web-based applications, such as news article recommendation and online ad placement, can be modeled as budget-limited problems. However, the diversification problem under a budget constraint has not been considered. In this paper, we first introduce the budget constraint to linear submodular bandits as a new problem called the linear submodular bandits with a knapsack constraint. We then define an alpha-approximation unit-cost regret, considering that submodular function maximization is NP-hard. To solve this problem, we propose two greedy algorithms based on a modified UCB rule. We then show that these two algorithms have different regret bounds and computational costs. We also conduct a number of experiments, and the experimental results confirm our theoretical analyses.

【Keywords】:
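
The greedy principle behind budgeted submodular maximization, mentioned in the abstract above, can be sketched for a known coverage utility: repeatedly add the affordable item with the largest marginal gain per unit cost. This is an illustration under strong assumptions (the utility is known rather than learned, and no UCB exploration term is used), so it is not either of the paper's bandit algorithms; all names and the toy data are hypothetical.

```python
def coverage(selected: list[str], topics: dict[str, set[int]]) -> int:
    """Submodular utility: number of distinct topics covered by the selection."""
    covered: set[int] = set()
    for item in selected:
        covered |= topics[item]
    return len(covered)


def greedy_knapsack(topics: dict[str, set[int]],
                    costs: dict[str, float], budget: float) -> list[str]:
    """Cost-benefit greedy: pick the best marginal gain per unit cost that fits."""
    chosen: list[str] = []
    spent = 0.0
    remaining = set(topics)
    while True:
        base = coverage(chosen, topics)
        best, best_ratio = None, 0.0
        for item in sorted(remaining):  # sorted for deterministic tie-breaking
            if spent + costs[item] > budget:
                continue  # item no longer fits in the remaining budget
            gain = coverage(chosen + [item], topics) - base
            ratio = gain / costs[item]
            if ratio > best_ratio:
                best, best_ratio = item, ratio
        if best is None:  # nothing affordable adds value
            return chosen
        chosen.append(best)
        spent += costs[best]
        remaining.remove(best)
```

With `topics = {"a": {1, 2, 3}, "b": {3, 4}, "c": {5}}`, costs of 3, 1 and 1, and a budget of 4, the greedy choice covers three topics while the pair `a` and `b` would cover four, which is why guarantees for such rules are stated as approximation ratios.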

193. Submodular Asymmetric Feature Selection in Cascade Object Detection.

【Paper Link】【Pages】:1387-1393

【Authors】: Baosheng Yu ; Meng Fang ; Dacheng Tao ; Jie Yin

【Abstract】: A cascade classifier has turned out to be effective in sliding-window based real-time object detection. In a cascade classifier, node learning is the key process, which includes feature selection and classifier design. Previous algorithms fail to effectively tackle the asymmetry and intersection problems existing in cascade classification, thereby limiting the performance of object detection. In this paper, we improve the current feature selection algorithm by addressing both the asymmetry and intersection problems. We formulate asymmetric feature selection as a submodular function maximization problem. We then propose a new algorithm, SAFS, with a formal performance guarantee to solve this problem. We use face detection as a case study and perform experiments on two real-world face detection datasets. The experimental results demonstrate that our algorithm SAFS outperforms the state-of-the-art feature selection algorithms in cascade object detection, such as FFS and LACBoost.

【Keywords】:

194. Semisupervised Autoencoder for Sentiment Analysis.

【Paper Link】【Pages】:1394-1400

【Authors】: Shuangfei Zhai ; Zhongfei (Mark) Zhang

【Abstract】: In this paper, we investigate the usage of autoencoders in modeling textual data. Traditional autoencoders suffer from at least two problems: scalability with the high dimensionality of vocabulary size and dealing with task-irrelevant words. We address these problems by introducing supervision via the loss function of autoencoders. In particular, we first train a linear classifier on the labeled data, then define a loss for the autoencoder with the weights learned from the linear classifier. To reduce the bias brought by one single classifier, we define a posterior probability distribution on the weights of the classifier, and derive the marginalized loss of the autoencoder with Laplace approximation. We show that our choice of loss function can be rationalized from the perspective of Bregman Divergence, which justifies the soundness of our model. We evaluate the effectiveness of our model on six sentiment analysis datasets, and show that our model significantly outperforms all the competing methods with respect to classification accuracy. We also show that our model is able to take advantage of unlabeled data and get improved performance. We further show that our model successfully learns highly discriminative feature maps, which explains its superior performance.

【Keywords】: autoencoder; semisupervised learning; sentiment analysis

195. Simultaneous Feature and Sample Reduction for Image-Set Classification.

【Paper Link】【Pages】:1401-1407

【Authors】: Man Zhang ; Ran He ; Dong Cao ; Zhenan Sun ; Tieniu Tan

【Abstract】: Image-set classification is the assignment of a label to a given image set. In real-life scenarios such as surveillance videos, each image set often contains much redundancy in terms of features and samples. This paper introduces a joint learning method for image-set classification that simultaneously learns compact binary codes and removes redundant samples. The joint objective function of our model mainly includes two parts. The first part seeks a hashing function to generate binary codes that have larger inter-class and smaller intra-class distances. The second one reduces redundant samples with discrete constraints in a low-rank way. A kernel method based on anchor points is further used to reduce sample variations. The proposed discrete objective function is simplified to a series of sub-problems that admit an analytical solution, resulting in a high-quality discrete solution with a low computational cost. Experiments on three commonly used image-set datasets show that the proposed method for the tasks of face recognition from image sets is efficient and effective.

【Keywords】:

196. Collective Noise Contrastive Estimation for Policy Transfer Learning.

【Paper Link】【Pages】:1408-1414

【Authors】: Weinan Zhang ; Ulrich Paquet ; Katja Hofmann

【Abstract】: We address the problem of learning behaviour policies to optimise online metrics from heterogeneous usage data. While online metrics, e.g., click-through rate, can be optimised effectively using exploration data, such data is costly to collect in practice, as it temporarily degrades the user experience. Leveraging related data sources to improve online performance would be extremely valuable, but is not possible using current approaches. We formulate this task as a policy transfer learning problem, and propose a first solution, called collective noise contrastive estimation (collective NCE). NCE is an efficient solution to approximating the gradient of a log-softmax objective. Our approach jointly optimises embeddings of heterogeneous data to transfer knowledge from the source domain to the target domain. We demonstrate the effectiveness of our approach by learning an effective policy for an online radio station jointly from user-generated playlists, and usage data collected in an exploration bucket.

【Keywords】: Transfer Learning; Policy Learning; Noise Contrastive Estimation; Recommender Systems

197. Learning a Hybrid Architecture for Sequence Regression and Annotation.

【Paper Link】【Pages】:1415-1421

【Authors】: Yizhe Zhang ; Ricardo Henao ; Lawrence Carin ; Jianling Zhong ; Alexander J. Hartemink

【Abstract】: When learning a hidden Markov model (HMM), sequential observations can often be complemented by real-valued summary response variables generated from the path of hidden states. Such settings arise in numerous domains, including many applications in biology, like motif discovery and genome annotation. In this paper, we present a flexible framework for jointly modeling both latent sequence features and the functional mapping that relates the summary response variables to the hidden state sequence. The algorithm is compatible with a rich set of mapping functions. Results show that the availability of additional continuous response variables can simultaneously improve the annotation of the sequential observations and yield good prediction performance in both synthetic data and real-world datasets.

【Keywords】:

198. Pose-Dependent Low-Rank Embedding for Head Pose Estimation.

【Paper Link】【Pages】:1422-1428

【Authors】: Handong Zhao ; Zhengming Ding ; Yun Fu

【Abstract】: Embedding models have demonstrated their effectiveness for head pose estimation in recent work. However, most previous methods focus only on the manifold relationship among poses and overlook the underlying global structure among subjects and poses. To build a robust and effective head pose estimator, we propose a novel Pose-dependent Low-Rank Embedding (PLRE) method, which is designed to exploit a discriminative subspace that keeps within-pose samples close and between-pose samples far apart. Specifically, low-rank embedding is employed under the multi-task framework, where each subject can be naturally considered as one task. Then, two novel terms are incorporated to align multiple tasks in pursuit of a better pose-dependent embedding. One is the cross-task alignment term, which constrains each low-rank coefficient to share a similar structure. The other is the pose-dependent graph regularizer, which is developed to capture the manifold structure of the same pose across different subjects. Experiments on the CMU-PIE, MIT-CBCL, and extended YaleB databases with different levels of random noise are conducted, and six baselines based on embedding models are compared. The consistently superior results demonstrate the effectiveness of our proposed method.

【Keywords】:

199. Cold-Start Heterogeneous-Device Wireless Localization.

【Paper Link】【Pages】:1429-1435

【Authors】: Vincent W. Zheng ; Hong Cao ; Shenghua Gao ; Aditi Adhikari ; Miao Lin ; Kevin Chen-Chuan Chang

【Abstract】: In this paper, we study a cold-start heterogeneous-device localization problem. This problem is challenging, because it results in an extreme inductive transfer learning setting, where there is only source domain data but no target domain data. This problem is also underexplored. As there is no target domain data for calibration, we aim to learn a robust feature representation only from the source domain. There is little previous work on such a robust feature learning task; besides, the existing robust feature representation proposals are both heuristic and inexpressive. As our contribution, we for the first time provide a principled and expressive robust feature representation to solve the challenging cold-start heterogeneous-device localization problem. We evaluate our model on two public real-world data sets, and show that it significantly outperforms the best baseline by 23.1%–91.3% across four pairs of heterogeneous devices.

【Keywords】:

【Paper Link】【Pages】:1436-1443

【Authors】: Yangxin Zhong ; Shixia Liu ; Xiting Wang ; Jiannan Xiao ; Yangqiu Song

【Abstract】: In many applications, ideas that are described by a set of words often flow between different groups. To facilitate users in analyzing the flow, we present a method to model the flow behaviors that aims at identifying the lead-lag relationships between word clusters of different user groups. In particular, an improved Bayesian conditional cointegration based on dynamic time warping is employed to learn links between words in different groups. A tensor-based technique is developed to cluster these linked words into different clusters (ideas) and track the flow of ideas. The main feature of the tensor representation is that we introduce two additional dimensions to represent both time and lead-lag relationships. Experiments on both synthetic and real datasets show that our method is more effective than methods based on traditional clustering techniques and achieves better accuracy. A case study was conducted to demonstrate the usefulness of our method in helping users understand the flow of ideas between different user groups on social media.

【Keywords】: Idea flow; Information diffusion; Text mining; Temporal data; Social media

201. Fast Hybrid Algorithm for Big Matrix Recovery.

【Paper Link】【Pages】:1444-1451

【Authors】: Tengfei Zhou ; Hui Qian ; Zebang Shen ; Congfu Xu

【Abstract】: The large-scale Nuclear Norm penalized Least Squares (NNLS) problem is frequently encountered in the estimation of low-rank structures. In this paper we accelerate the solution procedure by combining non-smooth convex optimization with a smooth Riemannian method. Our methods comprise two phases. In the first phase, we use the Alternating Direction Method of Multipliers (ADMM) both to identify the fixed-rank manifold where an optimum resides and to provide an initializer for the subsequent refinement. In the second phase, two superlinearly convergent Riemannian methods, Riemannian NewTon (NT) and Riemannian Conjugate Gradient descent (CG), are adopted to improve the approximation over a fixed-rank manifold. We prove that our Hybrid method of ADMM and NT (HADMNT) converges to an optimum of NNLS at least quadratically. Experiments on large-scale collaborative filtering datasets demonstrate very competitive performance of these fast hybrid methods compared to the state of the art.

【Keywords】:

Technical Papers: Machine Learning Methods 137

202. Data Poisoning Attacks against Autoregressive Models.

【Paper Link】【Pages】:1452-1458

【Authors】: Scott Alfeld ; Xiaojin Zhu ; Paul Barford

【Abstract】: Forecasting models play a key role in money-making ventures in many different markets. Such models are often trained on data from various sources, some of which may be untrustworthy. An actor in a given market may be incentivised to drive predictions in a certain direction to their own benefit. Prior analyses of intelligent adversaries in a machine-learning context have focused on regression and classification. In this paper we address the non-iid setting of time series forecasting. We consider a forecaster, Bob, using a fixed, known model and a recursive forecasting method. An adversary, Alice, aims to pull Bob's forecasts toward her desired target series, and may exercise limited influence on the initial values fed into Bob's model. We consider the class of linear autoregressive models, and a flexible framework of encoding Alice's desires and constraints. We describe a method of calculating Alice's optimal attack that is computationally tractable, and empirically demonstrate its effectiveness compared to random and greedy baselines on synthetic and real-world time series data. We conclude by discussing defensive strategies in the face of Alice-like adversaries.

【Keywords】: Adversarial Learning; Time Series Forecasting; Data Poisoning Attacks

203. Approximate K-Means++ in Sublinear Time.

【Paper Link】【Pages】:1459-1467

【Authors】: Olivier Bachem ; Mario Lucic ; S. Hamed Hassani ; Andreas Krause

【Abstract】: The quality of K-Means clustering is extremely sensitive to proper initialization. The classic remedy is to apply k-means++ to obtain an initial set of centers that is provably competitive with the optimal solution. Unfortunately, k-means++ requires k full passes over the data which limits its applicability to massive datasets. We address this problem by proposing a simple and efficient seeding algorithm for K-Means clustering. The main idea is to replace the exact $D^2$-sampling step in k-means++ with a substantially faster approximation based on Markov Chain Monte Carlo sampling. We prove that, under natural assumptions on the data, the proposed algorithm retains the full theoretical guarantees of k-means++ while its computational complexity is only sublinear in the number of data points. For such datasets, one can thus obtain a provably good clustering in sublinear time. Extensive experiments confirm that the proposed method is competitive with k-means++ on a variety of real-world, large-scale datasets while offering a reduction in runtime of several orders of magnitude.

【Keywords】: Clustering; K-Means; Large-scale machine learning; Markov Chain Monte Carlo; approximate sampling
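
The idea of replacing the exact sampling step of k-means++ with a Markov chain can be sketched as below. This is a minimal illustration, assuming a Metropolis-Hastings chain with a uniform proposal whose stationary distribution is proportional to the squared distance to the nearest chosen center. The function names, the `chain_length` parameter, and the handling of zero distances are assumptions made here, not details confirmed by the abstract.

```python
import random


def sq_dist_to_centers(point, centers):
    """Squared Euclidean distance from point to its nearest center."""
    return min(
        sum((p - c) ** 2 for p, c in zip(point, center)) for center in centers
    )


def mcmc_seeding(points, k, chain_length, rng):
    """Pick k seed centers, approximating squared-distance sampling."""
    centers = [rng.choice(points)]  # first center: uniform at random
    for _ in range(k - 1):
        current = rng.choice(points)  # initial state of the chain
        current_d = sq_dist_to_centers(current, centers)
        for _ in range(chain_length - 1):
            candidate = rng.choice(points)  # uniform proposal
            candidate_d = sq_dist_to_centers(candidate, centers)
            # Accept with probability min(1, candidate_d / current_d);
            # always move away from a point that coincides with a center.
            if current_d == 0 or candidate_d / current_d > rng.random():
                current, current_d = candidate, candidate_d
        centers.append(current)
    return centers
```

Each new center costs `chain_length` distance evaluations against the current centers instead of a full pass over the data, which is where the saving over exact seeding comes from.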

204. Incremental Stochastic Factorization for Online Reinforcement Learning.

【Paper Link】【Pages】:1468-1475

【Authors】: André da Motta Salles Barreto ; Rafael L. Beirigo ; Joelle Pineau ; Doina Precup

【Abstract】: A construct that has been receiving attention recently in reinforcement learning is stochastic factorization (SF), a particular case of non-negative matrix factorization (NMF) in which the matrices involved are stochastic. The idea is to use SF to approximate the transition matrices of a Markov decision process (MDP). This is useful for two reasons. First, learning the factors of the SF instead of the transition matrices can reduce significantly the number of parameters to be estimated. Second, it has been shown that SF can be used to reduce the number of operations needed to compute an MDP's value function. Recently, an algorithm called expectation-maximization SF (EMSF) has been proposed to compute a SF directly from transitions sampled from an MDP. In this paper we take a closer look at EMSF. First, by exploiting the assumptions underlying the algorithm, we show that it is possible to reduce it to simple multiplicative update rules similar to the ones that helped popularize NMF. Second, we analyze the optimization process underlying EMSF and find that it minimizes a modified version of the Kullback-Leibler divergence that is particularly well-suited for learning a SF from data sampled from an arbitrary distribution. Third, we build on this improved understanding of EMSF to draw an interesting connection with NMF and probabilistic latent semantic analysis. We also exploit the simplified update rules to introduce a new version of EMSF that generalizes and significantly improves its precursor. This new algorithm provides a practical mechanism to control the trade-off between memory usage and computing time, essentially freeing the space complexity of EMSF from its dependency on the number of sample transitions. The algorithm can also compute its approximation incrementally, which makes it possible to use it concomitantly with the collection of data. This feature makes the new version of EMSF particularly suitable for online reinforcement learning. Empirical results support the utility of the proposed algorithm.

【Keywords】: Reinforcement Learning; Markov Decision Processes; Stochastic Factorization

205. Increasing the Action Gap: New Operators for Reinforcement Learning.

【Paper Link】【Pages】:1476-1483

【Authors】: Marc G. Bellemare ; Georg Ostrovski ; Arthur Guez ; Philip S. Thomas ; Rémi Munos

【Abstract】: This paper introduces new optimality-preserving operators on Q-functions. We first describe an operator for tabular representations, the consistent Bellman operator, which incorporates a notion of local policy consistency. We show that this local consistency leads to an increase in the action gap at each state; increasing this gap, we argue, mitigates the undesirable effects of approximation and estimation errors on the induced greedy policies. This operator can also be applied to discretized continuous space and time problems, and we provide empirical results evidencing superior performance in this context. Extending the idea of a locally consistent operator, we then derive sufficient conditions for an operator to preserve optimality, leading to a family of operators which includes our consistent Bellman operator. As corollaries we provide a proof of optimality for Baird's advantage learning algorithm and derive other gap-increasing operators with interesting properties. We conclude with an empirical study on 60 Atari 2600 games illustrating the strong potential of these new operators.

【Keywords】: Reinforcement learning; Bellman operator; dynamic programming

206. Decoding Hidden Markov Models Faster Than Viterbi Via Online Matrix-Vector (max, +)-Multiplication.

【Paper Link】【Pages】:1484-1490

【Authors】: Massimo Cairo ; Gabriele Farina ; Romeo Rizzi

【Abstract】: In this paper, we present a novel algorithm for the maximum a posteriori decoding (MAPD) of time-homogeneous Hidden Markov Models (HMM), improving the worst-case running time of the classical Viterbi algorithm by a logarithmic factor. In our approach, we interpret the Viterbi algorithm as a repeated computation of matrix-vector (max, +)-multiplications. On time-homogeneous HMMs, this computation is online: a matrix, known in advance, has to be multiplied with several vectors revealed one at a time. Our main contribution is an algorithm solving this version of matrix-vector (max, +)-multiplication in subquadratic time, by performing a polynomial preprocessing of the matrix. Employing this fast multiplication algorithm, we solve the MAPD problem in $O(mn^2/\log n)$ time for any time-homogeneous HMM of size $n$ and observation sequence of length $m$, with an extra polynomial preprocessing cost negligible for $m > n$. To the best of our knowledge, this is the first algorithm for the MAPD problem requiring subquadratic time per observation, under the assumption (usually verified in practice) that the transition probability matrix does not change with time.

【Keywords】: Viterbi algorithm; Hidden Markov Models
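
The reading of Viterbi as repeated (max, +) matrix-vector products can be made concrete with a short sketch. The code below is the classical quadratic-per-observation baseline in log space, not the subquadratic algorithm the paper contributes. The function names and the list-of-lists layout (`log_trans[i][j]` for the transition from state i to state j, `log_emit[s][o]` for state s emitting symbol o) are assumptions made for illustration.

```python
def maxplus_matvec(matrix, vector):
    """(max, +) product: result[j] = max over i of vector[i] + matrix[i][j].

    Also returns the maximising i for each j, used as a backpointer.
    """
    n = len(vector)
    values, argmaxes = [], []
    for j in range(len(matrix[0])):
        best_i = max(range(n), key=lambda i: vector[i] + matrix[i][j])
        values.append(vector[best_i] + matrix[best_i][j])
        argmaxes.append(best_i)
    return values, argmaxes


def viterbi(log_init, log_trans, log_emit, observations):
    """Most likely state path and its log-probability."""
    n = len(log_init)
    scores = [log_init[s] + log_emit[s][observations[0]] for s in range(n)]
    backpointers = []
    for obs in observations[1:]:
        # The same transition matrix is reused at every step.
        values, argmaxes = maxplus_matvec(log_trans, scores)
        scores = [values[s] + log_emit[s][obs] for s in range(n)]
        backpointers.append(argmaxes)
    state = max(range(n), key=lambda s: scores[s])
    path = [state]
    for argmaxes in reversed(backpointers):
        state = argmaxes[state]
        path.append(state)
    path.reverse()
    return path, max(scores)
```

Because `log_trans` is fixed while the score vectors arrive one at a time, `maxplus_matvec` is exactly the online product that the abstract proposes to speed up through preprocessing of the matrix.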

207. Maximum Margin Dirichlet Process Mixtures for Clustering.

【Paper Link】【Pages】:1491-1497

【Authors】: Gang Chen ; Haiying Zhang ; Caiming Xiong

【Abstract】: The Dirichlet process mixtures (DPM) can automatically infer the model complexity from data. Hence it has attracted significant attention recently, and is widely used for model selection and clustering. As a generative model, it generally requires prior base distribution to learn component parameters by maximizing posterior probability. In contrast, discriminative classifiers model the conditional probability directly, and have yielded better results than generative classifiers. In this paper, we propose a maximum margin Dirichlet process mixture for clustering, which is different from the traditional DPM for parameter modeling. Our model takes a discriminative clustering approach, by maximizing a conditional likelihood to estimate parameters. In particular, we take an EM-like algorithm by leveraging Gibbs sampling algorithm for inference, which in turn can be perfectly embedded in the online maximum margin learning procedure to update model parameters. We test our model and show comparative results over the traditional DPM and other nonparametric clustering approaches.

【Keywords】: Nonparametric clustering; maximum margin learning; online learning

208. Progressive EM for Latent Tree Models and Hierarchical Topic Detection.

【Paper Link】【Pages】:1498-1504

【Authors】: Peixian Chen ; Nevin L. Zhang ; Leonard K. M. Poon ; Zhourong Chen

【Abstract】: Hierarchical latent tree analysis (HLTA) was recently proposed as a new method for topic detection. It differs fundamentally from the LDA-based methods in terms of topic definition, topic-document relationship, and learning method. It has been shown to discover significantly more coherent topics and better topic hierarchies. However, HLTA relies on the Expectation-Maximization (EM) algorithm for parameter estimation and hence is not efficient enough to deal with large datasets. In this paper, we propose a method to drastically speed up HLTA using a technique inspired by the advances in the method of moments. Empirical experiments show that our method greatly improves the efficiency of HLTA. It is as efficient as the state-of-the-art LDA-based method for hierarchical topic detection and finds substantially better topics and topic hierarchies.

【Keywords】:

209. Knowledge Transfer with Interactive Learning of Semantic Relationships.

【Paper Link】【Pages】:1505-1511

【Authors】: Jonghyun Choi ; Sung Ju Hwang ; Leonid Sigal ; Larry S. Davis

【Abstract】: We propose a novel learning framework for object categorization with interactive semantic feedback. In this framework, a discriminative categorization model improves through human-guided iterative semantic feedback. Specifically, the model identifies the most helpful relational semantic queries to discriminatively refine the model. The user feedback on whether the relationship is semantically valid or not is incorporated back into the model, in the form of regularization, and the process iterates. We validate the proposed model in a few-shot multi-class classification scenario, where we measure classification performance on a set of ‘target’ classes, with few training instances, by leveraging and transferring knowledge from ‘anchor’ classes that contain a larger set of labeled instances.

【Keywords】: active learning; interactive learning; transfer learning; knowledge transfer; human-in-the-loop classification

210. Robustness of Bayesian Pool-Based Active Learning Against Prior Misspecification.

【Paper Link】【Pages】:1512-1518

【Authors】: Nguyen Viet Cuong ; Nan Ye ; Wee Sun Lee

【Abstract】: We study the robustness of active learning (AL) algorithms against prior misspecification: whether an algorithm achieves similar performance using a perturbed prior as compared to using the true prior. In both the average and worst cases of the maximum coverage setting, we prove that all alpha-approximate algorithms are robust (i.e., near alpha-approximate) if the utility is Lipschitz continuous in the prior. We further show that robustness may not be achieved if the utility is non-Lipschitz. This suggests we should use a Lipschitz utility for AL if robustness is required. For the minimum cost setting, we can also obtain a robustness result for approximate AL algorithms. Our results imply that many commonly used AL algorithms are robust against perturbed priors. We then propose the use of a mixture prior to alleviate the problem of prior misspecification. We analyze the robustness of the uniform mixture prior and show experimentally that it performs reasonably well in practice.

【Keywords】: Active Learning; Robustness; Prior Misspecification; Pool-based; Bayesian

211. Learning Step Size Controllers for Robust Neural Network Training.

【Paper Link】【Pages】:1519-1525

【Authors】: Christian Daniel ; Jonathan Taylor ; Sebastian Nowozin

【Abstract】: This paper investigates algorithms to automatically adapt the learning rate of neural networks (NNs). Starting with stochastic gradient descent, a large variety of learning methods has been proposed for the NN setting. However, these methods are usually sensitive to the initial learning rate which has to be chosen by the experimenter. We investigate several features and show how an adaptive controller can adjust the learning rate without prior knowledge of the learning problem at hand.

【Keywords】: Neural Networks; Deep Learning; Reinforcement Learning

212. Reconstructing Hidden Permutations Using the Average-Precision (AP) Correlation Statistic.

【Paper Link】【Pages】:1526-1532

【Authors】: Lorenzo De Stefani ; Alessandro Epasto ; Eli Upfal ; Fabio Vandin

【Abstract】: We study the problem of learning probabilistic models for permutations, where the order between highly ranked items in the observed permutations is more reliable (i.e., consistent in different rankings) than the order between lower ranked items, a typical phenomenon observed in many applications such as web search results and product ranking. We introduce and study a variant of the Mallows model where the distribution is a function of the widely used Average-Precision (AP) Correlation statistic, instead of the standard Kendall’s tau distance. We present a generative model for constructing samples from this distribution and prove useful properties of that distribution. Using these properties we develop an efficient algorithm that provably computes an asymptotically unbiased estimate of the center permutation, and a faster algorithm that learns with high probability the hidden central permutation for a wide range of the parameters of the model. We complement our theoretical analysis with extensive experiments showing that unsupervised methods based on our model can precisely identify ground-truth clusters of rankings in real-world data. In particular, when compared to the Kendall’s tau based methods, our methods are less affected by noise in low-rank items.

【Keywords】: Mallows Model; Permutations; Rankings; Algorithms; Classification; Clustering; Preferences

213. Generalised Brown Clustering and Roll-Up Feature Generation.

【Paper Link】【Pages】:1533-1539

【Authors】: Leon Derczynski ; Sean Chester

【Abstract】: Brown clustering is an established technique, used in hundreds of computational linguistics papers each year, to group word types that have similar distributional information. It is unsupervised and can be used to create powerful word representations for machine learning. Despite its improbable success relative to more complex methods, few have investigated whether Brown clustering has really been applied optimally. In this paper, we present a subtle but profound generalisation of Brown clustering to improve the overall quality by decoupling the number of output classes from the computational active set size. Moreover, the generalisation permits a novel approach to feature selection from Brown clusters: We show that the standard approach of shearing the Brown clustering output tree at arbitrary bitlengths is lossy and that features should be chosen instead by rolling up Generalised Brown hierarchies. The generalisation and corresponding feature generation is more principled, challenging the way Brown clustering is currently understood and applied.

【Keywords】: hierarchical clustering; unsupervised learning; word representations; natural language processing

214. Random Composite Forests.

【Paper Link】【Pages】:1540-1546

【Authors】: Giulia DeSalvo ; Mehryar Mohri

【Abstract】: We introduce a broad family of decision trees, Composite Trees, whose leaf classifiers are selected out of a hypothesis set composed of p subfamilies with different complexities. We prove new data-dependent learning guarantees for this family in the multi-class setting. These learning bounds provide a quantitative guidance for the choice of the hypotheses at each leaf. Remarkably, they depend on the Rademacher complexities of the sub-families of the predictors and the fraction of sample points correctly classified at each leaf. We further introduce random composite trees and derive learning guarantees for random composite trees which also apply to Random Forests. Using our theoretical analysis, we devise a new algorithm, RANDOMCOMPOSITEFORESTS (RCF), that is based on forming an ensemble of random composite trees. We report the results of experiments demonstrating that RCF yields significant performance improvements over both Random Forests and a variant of RCF in several tasks.

【Keywords】:

215. The Ostomachion Process.

【Paper Link】【Pages】:1547-1553

【Authors】: Xuhui Fan ; Bin Li ; Yi Wang ; Yang Wang ; Fang Chen

【Abstract】: Stochastic partition processes for exchangeable graphs produce axis-aligned blocks on a product space. In relational modeling, the resulting blocks uncover the underlying interactions between two sets of entities of the relational data. Although some flexible axis-aligned partition processes, such as the Mondrian process, have been able to capture complex interacting patterns in a hierarchical fashion, they still fall short of capturing dependence between dimensions. To overcome this limitation, we propose the Ostomachion process (OP), which relaxes the cutting direction by allowing for oblique cuts. The partitions generated by an OP are convex polygons that can capture inter-dimensional dependence. The OP also exhibits interesting properties: 1) Along the time line the cutting times can be characterized by a homogeneous Poisson process, and 2) on the partition space the areas of the resulting components comply with a Dirichlet distribution. We can thus control the expected number of cuts and the expected areas of components through hyper-parameters. We adapt the reversible-jump MCMC algorithm for inferring OP partition structures. The experimental results on relational modeling and decision tree classification have validated the merit of the OP.

【Keywords】: Stochastic partition processes; Bayesian nonparametric; relational modeling; decision tree

216. Indexable Probabilistic Matrix Factorization for Maximum Inner Product Search.

【Paper Link】【Pages】:1554-1560

【Authors】: Marco Fraccaro ; Ulrich Paquet ; Ole Winther

【Abstract】: The Maximum Inner Product Search (MIPS) problem, prevalent in matrix factorization-based recommender systems, scales linearly with the number of objects to score. Recent work has shown that clever post-processing steps can turn the MIPS problem into a nearest neighbour one, allowing sublinear retrieval time either through Locality Sensitive Hashing or various tree structures that partition the Euclidean space. This work shows that instead of employing post-processing steps, substantially faster retrieval times can be achieved for the same accuracy when inference is not decoupled from the indexing process. By framing matrix factorization to be natively indexable, so that any solution is immediately sublinearly searchable, we use the machinery of Machine Learning to best learn such a solution. We introduce Indexable Probabilistic Matrix Factorization (IPMF) to shift the traditional post-processing complexity into the training phase of the model. Its inference procedure is based on Geodesic Monte Carlo, and adds minimal additional computational cost to standard Monte Carlo methods for matrix factorization. By coupling inference and indexing in this way, we achieve more than a 50% improvement in retrieval time against two state of the art methods, for a given level of accuracy in the recommendations of two large-scale recommender systems.

【Keywords】: Fast Retrieval; Maximum Inner Product Search; Recommender Systems; Machine Learning System Architectures

217. Fast Lasso Algorithm via Selective Coordinate Descent.

【Paper Link】【Pages】:1561-1567

【Authors】: Yasuhiro Fujiwara ; Yasutoshi Ida ; Hiroaki Shiokawa ; Sotetsu Iwamura

【Abstract】: For the AI community, the lasso proposed by Tibshirani is an important regression approach in finding explanatory predictors in high dimensional data. The coordinate descent algorithm is a standard approach to solve the lasso which iteratively updates weights of predictors in a round-robin style until convergence. However, it has high computation cost. This paper proposes Sling, a fast approach to the lasso. It achieves high efficiency by skipping unnecessary updates for the predictors whose weight is zero in the iterations. Sling can obtain high prediction accuracy with fewer predictors than the standard approach. Experiments show that Sling can enhance the efficiency and the effectiveness of the lasso.

【Keywords】: Lasso; Efficient
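
The round-robin coordinate descent that the abstract names as the standard lasso solver can be sketched as follows. This is the plain baseline with soft-thresholding, not Sling itself, and it assumes the objective (1/2n)·||y − Xw||² + lam·||w||₁. The function names, the `n_sweeps` parameter, and the fixed number of sweeps in place of a convergence test are assumptions made for illustration.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold, clamping at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=100):
    """Cyclic coordinate descent for (1/2n)||y - Xw||^2 + lam * ||w||_1."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):  # round-robin over predictors
            if col_sq[j] == 0.0:
                continue  # an all-zero column cannot change the fit
            # Correlation of column j with the residual that excludes w[j].
            rho = sum(
                X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)
            ) / n
            new_w = soft_threshold(rho, lam) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w
```

Every sweep still computes `rho` for predictors whose weight stays at zero, which is the kind of redundant work the abstract says Sling avoids.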

218. Group and Graph Joint Sparsity for Linked Data Classification.

【Paper Link】【Pages】:1568-1574

【Authors】: Longwen Gao ; Shuigeng Zhou

【Abstract】: Various sparse regularizers have been applied to machine learning problems, among which structured sparsity has been proposed for a better adaption to structured data. In this paper, motivated by effectively classifying linked data (e.g. Web pages, tweets, articles with references, and biological network data) where a group structure exists over the whole dataset and links exist between specific samples, we propose a joint sparse representation model that combines group sparsity and graph sparsity, to select a small number of connected components from the graph of linked samples, meanwhile promoting the sparsity of edges that link samples from different groups in each connected component. Consequently, linked samples are selected from a few sparsely-connected groups. Both theoretical analysis and experimental results on four benchmark datasets show that the joint sparsity model outperforms traditional group sparsity model and graph sparsity model, as well as the latest group-graph sparsity model.

【Keywords】: sparse representation; group sparsity; graph sparsity; linked data; classification

219. Risk Minimization in the Presence of Label Noise.

【Paper Link】【Pages】:1575-1581

【Authors】: Wei Gao ; Lu Wang ; Yu-Feng Li ; Zhi-Hua Zhou

【Abstract】: Matrix concentration inequalities have attracted much attention in diverse applications such as linear algebra, statistical estimation, combinatorial optimization, etc. In this paper, we present new Bernstein concentration inequalities depending only on the first moments of random matrices, whereas previous Bernstein inequalities are heavily relevant to the first and second moments. Based on those results, we analyze the empirical risk minimization in the presence of label noise. We find that many popular losses used in risk minimization can be decomposed into two parts, where the first part won't be affected and only the second part will be affected by noisy labels. We show that the influence of noisy labels on the second part can be reduced by our proposed LICS (Labeled Instance Centroid Smoothing) approach. The effectiveness of the LICS algorithm is justified both theoretically and empirically.

【Keywords】:

220. Decentralized Approximate Bayesian Inference for Distributed Sensor Network.

【Paper Link】【Pages】:1582-1588

【Authors】: Behnam Gholami ; Sejong Yoon ; Vladimir Pavlovic

【Abstract】: Bayesian models provide a framework for probabilistic modelling of complex datasets. Many such models are computationally demanding, especially in the presence of large datasets. In sensor network applications, statistical (Bayesian) parameter estimation usually relies on decentralized algorithms, in which both data and computation are distributed across the nodes of the network. In this paper we propose a framework for decentralized Bayesian learning using Bregman Alternating Direction Method of Multipliers (B-ADMM). We demonstrate the utility of our framework, with Mean Field Variational Bayes (MFVB) as the primitive for distributed affine structure from motion (SfM).

【Keywords】: Distributed Learning; Variational Inference; ADMM; Bregman Divergence

221. Assumed Density Filtering Methods for Learning Bayesian Neural Networks.

【Paper Link】【Pages】:1589-1595

【Authors】: Soumya Ghosh ; Francesco Maria Delle Fave ; Jonathan S. Yedidia

【Abstract】: Buoyed by the success of deep multilayer neural networks, there is renewed interest in scalable learning of Bayesian neural networks. Here, we study algorithms that utilize recent advances in Bayesian inference to efficiently learn distributions over network weights. In particular, we focus on recently proposed assumed density filtering based methods for learning Bayesian neural networks -- Expectation and Probabilistic backpropagation. Apart from scaling to large datasets, these techniques seamlessly deal with non-differentiable activation functions and provide parameter (learning rate, momentum) free learning. In this paper, we first rigorously compare the two algorithms and in the process develop several extensions, including a version of EBP for continuous regression problems and a PBP variant for binary classification. Next, we extend both algorithms to deal with multiclass classification and count regression problems. On a variety of diverse real world benchmarks, we find our extensions to be effective, achieving results competitive with the state-of-the-art.

【Keywords】:

222. Extending the Modelling Capacity of Gaussian Conditional Random Fields while Learning Faster.

【Paper Link】【Pages】:1596-1602

【Authors】: Jesse Glass ; Mohamed F. Ghalwash ; Milan Vukicevic ; Zoran Obradovic

【Abstract】: Gaussian Conditional Random Fields (GCRF) are a type of structured regression model that incorporates multiple predictors and multiple graphs. This is achieved by defining quadratic term feature functions in Gaussian canonical form which makes the conditional log-likelihood function convex and hence allows finding the optimal parameters by learning from data. In this work, the parameter space for the GCRF model is extended to facilitate joint modelling of positive and negative influences. This is achieved by restricting the model to a single graph and formulating linear bounds on convexity with respect to the model's parameters. In addition, our formulation for the model using one network allows calculating gradients much faster than alternative implementations. Lastly, we extend the model one step further and incorporate a bias term into our link weight. This bias is solved as part of the convex optimization. Benefits of the proposed model in terms of improved accuracy and speed are characterized on several synthetic graphs with 2 million links as well as on a hospital admissions prediction task represented as a human disease-symptom similarity network corresponding to more than 35 million hospitalization records in California over 9 years.

【Keywords】: Structured Regression, Non-Linear Regression, Convex Ensemble Methods

223. Uncertainty Propagation in Long-Term Structured Regression on Evolving Networks.

【Paper Link】【Pages】:1603-1609

【Authors】: Djordje Gligorijevic ; Jelena Stojanovic ; Zoran Obradovic

【Abstract】: In long-term forecasting it is important to estimate the confidence of predictions, as they are often affected by errors that are accumulated over the prediction horizon. To address this problem, an effective novel iterative method is developed for Gaussian structured learning models in this study for propagating uncertainty in temporal graphs by modeling noisy inputs. The proposed method is applied for three long-term (up to 8 years ahead) structured regression problems on real-world evolving networks from the health and climate domains. The obtained empirical results and use case analysis provide evidence that the new approach allows better uncertainty propagation as compared to published alternatives.

【Keywords】: uncertainty propagation; conditional random fields; gaussian conditional random fields; climate network; disease network

224. Teaching-to-Learn and Learning-to-Teach for Multi-label Propagation.

【Paper Link】【Pages】:1610-1616

【Authors】: Chen Gong ; Dacheng Tao ; Jie Yang ; Wei Liu

【Abstract】: Multi-label propagation aims to transmit the multi-label information from labeled examples to unlabeled examples based on a weighted graph. Existing methods ignore the specific propagation difficulty of different unlabeled examples and conduct the propagation in an imperfect sequence, leading to the error-prone classification of some difficult examples with uncertain labels. To address this problem, this paper associates each possible label with a "teacher", and proposes a "Multi-Label Teaching-to-Learn and Learning-to-Teach" (ML-TLLT) algorithm, so that the entire propagation process is guided by the teachers and manipulated from simple examples to more difficult ones. In the teaching-to-learn step, the teachers select the simplest examples for the current propagation by investigating both the definitiveness of each possible label of the unlabeled examples, and the dependencies between labels revealed by the labeled examples. In the learning-to-teach step, the teachers reversely learn from the learner’s feedback to properly select the simplest examples for the next propagation. Thorough empirical studies show that due to the optimized propagation sequence designed by the teachers, ML-TLLT yields generally better performance than seven state-of-the-art methods on the typical multi-label benchmark datasets.

【Keywords】:

225. Discriminative Analysis Dictionary Learning.

【Paper Link】【Pages】:1617-1623

【Authors】: Jun Guo ; Yanqing Guo ; Xiangwei Kong ; Man Zhang ; Ran He

【Abstract】: Dictionary learning (DL) has been successfully applied to various pattern classification tasks in recent years. However, analysis dictionary learning (ADL), as a major branch of DL, has not yet been fully exploited in classification due to its poor discriminability. This paper presents a novel DL method, namely Discriminative Analysis Dictionary Learning (DADL), to improve the classification performance of ADL. First, a code consistent term is integrated into the basic analysis model to improve discriminability. Second, a triplet constraint-based local topology preserving loss function is introduced to capture the discriminative geometrical structures embedded in data. Third, correntropy induced metric is employed as a robust measure to better control outliers for classification. Then, half-quadratic minimization and alternate search strategy are used to speed up the optimization process so that there exist closed-form solutions in each alternating minimization stage. Experiments on several commonly used databases show that our proposed method not only significantly improves the discriminative ability of ADL, but also outperforms state-of-the-art synthesis DL methods.

【Keywords】: Dictionary Learning; correntropy; triplet constraints

226. Active Learning with Cross-Class Knowledge Transfer.

【Paper Link】【Pages】:1624-1630

【Authors】: Yuchen Guo ; Guiguang Ding ; Yuqi Wang ; Xiaoming Jin

【Abstract】: When there are insufficient labeled samples for training a supervised model, we can adopt active learning to select the most informative samples for human labeling, or transfer learning to transfer knowledge from a related labeled data source. Combining transfer learning with active learning has attracted much research interest in recent years. Most existing works follow the setting where the class labels in the source domain are the same as the ones in the target domain. In this paper, we focus on a more challenging cross-class setting where the class labels are totally different in the two domains but related to each other in an intermediary attribute space, which has barely been investigated before. We propose a novel and effective method that utilizes the attribute representation as the seed parameters to generate the classification models for classes. We also propose a joint learning framework that takes into account the knowledge from the related classes in the source domain, and the information in the target domain. Besides, it is simple to perform uncertainty sampling, a fundamental technique for active learning, based on the framework. We conduct experiments on three benchmark datasets and the results demonstrate the efficacy of the proposed method.

【Keywords】: Active Learning; Transfer Learning; Zero-shot Learning
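
The abstract above refers to uncertainty sampling as a fundamental active learning technique. A minimal, generic sketch of entropy-based uncertainty sampling follows; it is not the authors' cross-class framework, and the function names and the example probabilities are hypothetical and chosen only for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def most_uncertain(pool_probs):
    """Return the index of the pool sample whose predicted class
    distribution has the highest entropy (the most uncertain one)."""
    return max(range(len(pool_probs)), key=lambda i: entropy(pool_probs[i]))


# Hypothetical predicted class probabilities for three unlabeled samples.
pool = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.70, 0.20, 0.10],
]
query_index = most_uncertain(pool)  # sample to send for human labeling
```

Entropy is only one possible uncertainty measure; margin-based or least-confidence scores could be swapped in without changing the selection loop.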

227. Generalized Emphatic Temporal Difference Learning: Bias-Variance Analysis.

【Paper Link】【Pages】:1631-1637

【Authors】: Assaf Hallak ; Aviv Tamar ; Rémi Munos ; Shie Mannor

【Abstract】: We consider the off-policy evaluation problem in Markov decision processes with function approximation. We propose a generalization of the recently introduced emphatic temporal differences (ETD) algorithm, which encompasses the original ETD(λ), as well as several other off-policy evaluation algorithms as special cases. We call this framework ETD(λ, β), where our introduced parameter β controls the decay rate of an importance-sampling term. We study conditions under which the projected fixed-point equation underlying ETD(λ, β) involves a contraction operator, allowing us to present the first asymptotic error bounds (bias) for ETD(λ, β). Our results show that the original ETD algorithm always involves a contraction operator, and its bias is bounded. Moreover, by controlling β, our proposed generalization allows trading-off bias for variance reduction, thereby achieving a lower total error.

【Keywords】: Off-policy Evaluation; Emphatic Temporal Differences

228. Multi-Stage Multi-Task Learning with Reduced Rank.

【Paper Link】【Pages】:1638-1644

【Authors】: Lei Han ; Yu Zhang

【Abstract】: Multi-task learning (MTL) seeks to improve the generalization performance by sharing information among multiple tasks. Many existing MTL approaches aim to learn the low-rank structure on the weight matrix, which stores the model parameters of all tasks, to achieve task sharing, and as a consequence the trace norm regularization is widely used in the MTL literature. A major limitation of these approaches based on trace norm regularization is that all the singular values of the weight matrix are penalized simultaneously, leading to impaired estimation on recovering the larger singular values in the weight matrix. To address the issue, we propose a Reduced rAnk MUlti-Stage multi-tAsk learning (RAMUSA) method based on the recently proposed capped norms. Different from existing trace-norm-based MTL approaches which minimize the sum of all the singular values, the RAMUSA method uses a capped trace norm regularizer to minimize only the singular values smaller than some threshold. Due to the non-convexity of the capped trace norm, we develop a simple but well guaranteed multi-stage algorithm to learn the weight matrix iteratively. We theoretically prove that the estimation error at each stage in the proposed algorithm shrinks and finally achieves a lower upper-bound as the number of stages becomes large enough. Empirical studies on synthetic and real-world datasets demonstrate the effectiveness of the RAMUSA method in comparison with the state-of-the-art methods.

【Keywords】:
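
The difference between the trace norm and a capped trace norm described in the abstract above can be illustrated on a list of singular values. This is a sketch under the assumption that the capped regularizer takes the form "sum of min(singular value, threshold)"; the singular values are assumed to be precomputed, and the names `capped_trace_norm` and `tau` are hypothetical.

```python
def trace_norm(singular_values):
    """Trace (nuclear) norm: the sum of all singular values."""
    return sum(singular_values)


def capped_trace_norm(singular_values, tau):
    """Capped trace norm: each singular value contributes at most tau.

    Singular values above tau add only the constant tau, so shrinking
    them does not reduce the penalty; only the values below tau are
    effectively penalized.
    """
    return sum(min(s, tau) for s in singular_values)


# Hypothetical singular values of a weight matrix (assumed precomputed).
sigma = [5.0, 3.0, 0.4, 0.1]
full_penalty = trace_norm(sigma)
capped_penalty = capped_trace_norm(sigma, tau=1.0)
```

With a threshold of 1.0, the two large singular values each contribute the constant 1.0 to the capped penalty, whereas the plain trace norm keeps penalizing them in full.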

229. Reduction Techniques for Graph-Based Convex Clustering.

【Paper Link】【Pages】:1645-1651

【Authors】: Lei Han ; Yu Zhang

【Abstract】: The Graph-based Convex Clustering (GCC) method has gained increasing attention recently. The GCC method adopts a fused regularizer to learn the cluster centers and obtains a geometric clusterpath by varying the regularization parameter. One major limitation is that solving the GCC model is computationally expensive. In this paper, we develop efficient graph reduction techniques for the GCC model to eliminate edges, each of which corresponds to two data points from the same cluster, without solving the optimization problem in the GCC method, leading to improved computational efficiency. Specifically, two reduction techniques are proposed according to tree-based and cyclic-graph-based convex clustering methods separately. The proposed reduction techniques are appealing since they only need to scan the data once with negligibly additional cost and they are independent of solvers for the GCC method, making them capable of improving the efficiency of any existing solver. Experiments on both synthetic and real-world datasets show that our methods can largely improve the efficiency of the GCC model.

【Keywords】:

230. SAND: Semi-Supervised Adaptive Novel Class Detection and Classification over Data Stream.

【Paper Link】【Pages】:1652-1658

【Authors】: Ahsanul Haque ; Latifur Khan ; Michael Baron

【Abstract】: Most approaches to classifying data streams either divide the stream into fixed-size chunks or use gradual forgetting. Due to the evolving nature of data streams, finding a proper size or choosing a forgetting rate without prior knowledge about the time-scale of change is not a trivial task. These approaches hence suffer from a trade-off between performance and sensitivity. Existing dynamic sliding window based approaches address this problem by tracking changes in classifier error rate, but are supervised in nature. We propose an efficient semi-supervised framework in this paper which uses change detection on classifier confidence to detect concept drifts, and to determine chunk boundaries dynamically. It also addresses the concept evolution problem by detecting outliers having strong cohesion among themselves. Experimental results on benchmark and synthetic data sets show the effectiveness of the proposed approach.

【Keywords】: Dynamic Chunk Size; Classifier Confidence; Limited Labeled Data; Concept Drift; Concept Evolution

231. Flattening the Density Gradient for Eliminating Spatial Centrality to Reduce Hubness.

【Paper Link】【Pages】:1659-1665

【Authors】: Kazuo Hara ; Ikumi Suzuki ; Kei Kobayashi ; Kenji Fukumizu ; Milos Radovanovic

【Abstract】: Spatial centrality, whereby samples closer to the center of a dataset tend to be closer to all other samples, is regarded as one source of hubness. Hubness is well known to degrade k-nearest-neighbor (k-NN) classification. Spatial centrality can be removed by centering, i.e., shifting the origin to the global center of the dataset, in cases where inner product similarity is used. However, when Euclidean distance is used, centering has no effect on spatial centrality because the distance between the samples is the same before and after centering. As described in this paper, we propose a solution for the hubness problem when Euclidean distance is considered. We provide a theoretical explanation to demonstrate how the solution eliminates spatial centrality and reduces hubness. We then present some discussion of the reason the proposed solution works, from a viewpoint of density gradient, which is regarded as the origin of spatial centrality and hubness. We demonstrate that the solution corresponds to flattening the density gradient. Using real-world datasets, we demonstrate that the proposed method improves k-NN classification performance and outperforms an existing hub-reduction method.

【Keywords】: Hubness; Density gradient; Spatial centrality; k nearest neighbor method
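
The premise stated in the abstract above, that centering changes inner product similarities but leaves Euclidean distances untouched, can be checked numerically. The sketch below only illustrates that premise, not the proposed hub-reduction method; the helper names and the sample points are hypothetical.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def center(points):
    """Shift the origin to the global center of the dataset."""
    c = centroid(points)
    return [[x - m for x, m in zip(p, c)] for p in points]


def dot(a, b):
    """Inner product similarity."""
    return sum(x * y for x, y in zip(a, b))


def euclidean(a, b):
    """Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical 2-D samples.
points = [[2.0, 1.0], [4.0, 3.0], [6.0, 2.0]]
centered = center(points)

dist_before = euclidean(points[0], points[1])
dist_after = euclidean(centered[0], centered[1])  # unchanged by centering
sim_before = dot(points[0], points[1])
sim_after = dot(centered[0], centered[1])         # changed by centering
```

Because centering subtracts the same vector from every sample, pairwise differences, and hence Euclidean distances, are preserved, while inner products are not.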

232. Discriminative Vanishing Component Analysis.

【Paper Link】【Pages】:1666-1672

【Authors】: Chenping Hou ; Feiping Nie ; Dacheng Tao

【Abstract】: Vanishing Component Analysis (VCA) is a recently proposed prominent work in machine learning. It narrows the gap between tools and computational algebra: the vanishing ideal and its applications to classification problems. In this paper, we analyze VCA in the kernel view, which is another important research direction in machine learning. Under a very weak assumption, we provide a different point of view on VCA and make the kernel trick on VCA possible. We demonstrate that the projection matrix derived by VCA is located in the same space as that of Kernel Principal Component Analysis (KPCA) with a polynomial kernel. The two groups of projections can express each other by linear transformation. Furthermore, we prove that KPCA and VCA have identical discriminative power, provided that the ratio trace criterion is employed as the measurement. We also show that the kernel formulated by the inner products of VCA's projections can be expressed linearly by KPCA's kernel. Based on the analysis above, we propose a novel Discriminative Vanishing Component Analysis (DVCA) approach. Experimental results are provided for demonstration.

【Keywords】:

233. Common and Discriminative Subspace Kernel-Based Multiblock Tensor Partial Least Squares Regression.

【Paper Link】【Pages】:1673-1679

【Authors】: Ming Hou ; Qibin Zhao ; Brahim Chaib-draa ; Andrzej Cichocki

【Abstract】: In this work, we introduce a new generalized nonlinear tensor regression framework called kernel-based multiblock tensor partial least squares (KMTPLS) for predicting a set of dependent tensor blocks from a set of independent tensor blocks through the extraction of a small number of common and discriminative latent components. By considering both common and discriminative features, KMTPLS effectively fuses the information from multiple tensorial data sources and unifies the single and multiblock tensor regression scenarios into one general model. Moreover, in contrast to multilinear model, KMTPLS successfully addresses the nonlinear dependencies between multiple response and predictor tensor blocks by combining kernel machines with joint Tucker decomposition, resulting in a significant performance gain in terms of predictability. An efficient learning algorithm for KMTPLS based on sequentially extracting common and discriminative latent vectors is also presented. Finally, to show the effectiveness and advantages of our approach, we test it on the real-life regression task in computer vision, i.e., reconstruction of human pose from multiview video sequences.

【Keywords】: tensor regression models; kernel methods; multiblock regression models; multiblock partial least squares; human pose estimation; motion trajectory

234. Multi-Label Manifold Learning.

【Paper Link】【Pages】:1680-1686

【Authors】: Peng Hou ; Xin Geng ; Min-Ling Zhang

【Abstract】: This paper gives an attempt to explore the manifold in the label space for multi-label learning. Traditional label space is logical, where no manifold exists. In order to study the label manifold, the label space should be extended to a Euclidean space. However, the label manifold is not explicitly available from the training examples. Fortunately, according to the smoothness assumption that the points close to each other are more likely to share a label, the local topological structure can be shared between the feature manifold and the label manifold. Based on this, we propose a novel method called ML2, i.e., Multi-Label Manifold Learning, to reconstruct and exploit the label manifold. To our best knowledge, it is one of the first attempts to explore the manifold in the label space in multi-label learning. Extensive experiments show that the performance of multi-label learning can be improved significantly with the label manifold.

【Keywords】: Multi-Label Learning; Label Manifold

235. Optimal Discrete Matrix Completion.

【Paper Link】【Pages】:1687-1693

【Authors】: Zhouyuan Huo ; Ji Liu ; Heng Huang

【Abstract】: In recent years, matrix completion methods have been successfully applied to solve recommender system applications. Most of them focus on the matrix completion problem in the real number domain, and produce continuous prediction values. However, these methods are not appropriate on occasions where the entries of the matrix are discrete values, such as movie ratings prediction, social network relation and interaction prediction, because their continuous outputs are not probabilities and are uninterpretable. In this case, an additional step to process the continuous results with either heuristic threshold parameters or complicated mapping is necessary, but it is inefficient and may diverge from the optimal solution. There are a few matrix completion methods working on the discrete number domain; however, they are not applicable to sparse and large-scale data sets. In this paper, we propose a novel optimal discrete matrix completion model, which is able to learn optimal thresholds automatically and also guarantees an exact low-rank structure of the target matrix. We use a stochastic gradient descent algorithm with the momentum method to optimize the new objective function and speed up optimization. In the experiments, it is shown that our method can predict discrete values with high accuracy, very close to or even better than the values obtained by carefully tuned thresholds on Movielens and YouTube data sets. Meanwhile, our model is able to handle online data and is easy to parallelize.

【Keywords】: Matrix Completion; Recommender System; Big Data

236. Conservativeness of Untied Auto-Encoders.

【Paper Link】【Pages】:1694-1700

【Authors】: Daniel Jiwoong Im ; Mohamed Ishmael Diwan Belghazi ; Roland Memisevic

【Abstract】: We discuss necessary and sufficient conditions for an auto-encoder to define a conservative vector field, in which case it is associated with an energy function akin to the unnormalized log-probability of the data. We show that the conditions for conservativeness are more general than for encoder and decoder weights to be the same ("tied weights"), and that they also depend on the form of the hidden unit activation functions. Moreover, we show that contractive training criteria, such as denoising, enforce these conditions locally. Based on these observations, we show how we can use auto-encoders to extract the conservative component of a vector field.

【Keywords】: Auto-encoders; Neural Networks;

237. Infinite Plaid Models for Infinite Bi-Clustering.

【Paper Link】【Pages】:1701-1708

【Authors】: Katsuhiko Ishiguro ; Issei Sato ; Masahiro Nakano ; Akisato Kimura ; Naonori Ueda

【Abstract】: We propose a probabilistic model for non-exhaustive and overlapping (NEO) bi-clustering. Our goal is to extract a few sub-matrices from the given data matrix, where entries of a sub-matrix are characterized by a specific distribution or parameters. Existing NEO bi-clustering methods typically require the number of sub-matrices to be extracted, which is essentially difficult to fix a priori. In this paper, we extend the plaid model, known as one of the best NEO bi-clustering algorithms, to allow infinite bi-clustering, i.e., NEO bi-clustering without specifying the number of sub-matrices. Our model can represent infinite sub-matrices formally. We develop an MCMC inference without the finite truncation, which potentially addresses all possible numbers of sub-matrices. Experiments quantitatively and qualitatively verify the usefulness of the proposed model. The results reveal that our model can offer more precise and in-depth analysis of sub-matrices.

【Keywords】: clustering; bi-clustering; NEO bi-clustering; infinite bi-clustering; Bayesian Nonparametrics

238. Improving Predictive State Representations via Gradient Descent.

【Paper Link】【Pages】:1709-1715

【Authors】: Nan Jiang ; Alex Kulesza ; Satinder P. Singh

【Abstract】: Predictive state representations (PSRs) model dynamical systems using appropriately chosen predictions about future observations as a representation of the current state. In contrast to the hidden states posited by HMMs or RNNs, PSR states are directly observable in the training data; this gives rise to a moment-matching spectral algorithm for learning PSRs that is computationally efficient and statistically consistent when the model complexity matches that of the true system generating the data. In practice, however, model mismatch is inevitable and while spectral learning remains appealingly fast and simple it may fail to find optimal models. To address this problem, we investigate the use of gradient methods for improving spectrally-learned PSRs. We show that only a small amount of additional gradient optimization can lead to significant performance gains, and moreover that initializing gradient methods with the spectral learning solution yields better models in significantly less time than starting from scratch.

【Keywords】: spectral learning; predictive state representation; gradient descent

239. A Probabilistic Approach to Knowledge Translation.

【Paper Link】【Pages】:1716-1722

【Authors】: Shangpu Jiang ; Daniel Lowd ; Dejing Dou

【Abstract】: In this paper, we focus on a novel knowledge reuse scenario where the knowledge in the source schema needs to be translated to a semantically heterogeneous target schema. We refer to this task as “knowledge translation” (KT). Unlike data translation and transfer learning, KT does not require any data from the source or target schema. We adopt a probabilistic approach to KT by representing the knowledge in the source schema, the mapping between the source and target schemas, and the resulting knowledge in the target schema all as probability distributions, specially using Markov random fields and Markov logic networks. Given the source knowledge and mappings, we use standard learning and inference algorithms for probabilistic graphical models to find an explicit probability distribution in the target schema that minimizes the Kullback-Leibler divergence from the implicit distribution. This gives us a compact probabilistic model that represents knowledge from the source schema as well as possible, respecting the uncertainty in both the source knowledge and the mapping. In experiments on both propositional and relational domains, we find that the knowledge obtained by KT is comparable to other approaches that require data, demonstrating that knowledge can be reused without data.

【Keywords】: Statistical Relational Learning; Knowledge Translation; Transfer Learning

240. The l2,1-Norm Stacked Robust Autoencoders for Domain Adaptation.

【Paper Link】【Pages】:1723-1729

【Authors】: Wenhao Jiang ; Hongchang Gao ; Fu-Lai Chung ; Heng Huang

【Abstract】: Recently, deep learning methods that employ stacked denoising autoencoders (SDAs) have been successfully applied in domain adaptation. Remarkable performance in multi-domain sentiment analysis datasets has been reported, making deep learning a promising approach to domain adaptation problems. SDAs are distinguished by learning robust data representations for recovering the original features that have been artificially corrupted with noise. The idea has been further exploited to marginalize out the random corruptions by a state-of-the-art method called mSDA. In this paper, a deep learning method for domain adaptation called l2,1-norm stacked robust autoencoders (l2,1-SRA) is proposed to learn useful representations for domain adaptation tasks. Each layer of l2,1-SRA contains two steps: a robust linear reconstruction step which is based on l2,1 robust regression and a non-linear squashing transformation step. The experimental results demonstrate that the proposed method is very effective in multiple cross domain classification datasets which include Amazon review dataset, spam dataset from ECML/PKDD discovery challenge 2006 and 20 newsgroups dataset.

【Keywords】: Autoencoder; Deep Learning; Robust Model

241. Wishart Mechanism for Differentially Private Principal Components Analysis.

【Paper Link】【Pages】:1730-1736

【Authors】: Wuxuan Jiang ; Cong Xie ; Zhihua Zhang

【Abstract】: We propose a new input perturbation mechanism for publishing a covariance matrix to achieve (epsilon,0)-differential privacy. Our mechanism uses a Wishart distribution to generate matrix noise. In particular, we apply this mechanism to principal component analysis (PCA). Our mechanism is able to keep the positive semi-definiteness of the published covariance matrix. Thus, our approach gives rise to a general publishing framework for input perturbation of a symmetric positive semidefinite matrix. Moreover, compared with the classic Laplace mechanism, our method has better utility guarantee. To the best of our knowledge, the Wishart mechanism is the best input perturbation approach for (epsilon,0)-differentially private PCA. We also compare our work with previous exponential mechanism algorithms in the literature and provide near optimal bound while having more flexibility and less computational intractability.

【Keywords】: Differential Privacy; Principal Component Analysis
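
The abstract above notes that Wishart noise keeps the published covariance matrix positive semi-definite. The sketch below illustrates only that structural property: it draws an isotropic Wishart sample as a sum of Gaussian outer products and adds it to a covariance matrix. The degrees of freedom and scale are placeholder values, not the calibration derived in the paper, so this code on its own provides no privacy guarantee; all function names are hypothetical.

```python
import random


def sample_wishart_isotropic(dim, dof, scale, rng):
    """Draw W = sum_{k=1..dof} z_k z_k^T with z_k ~ N(0, scale * I).

    For integer dof this is a Wishart sample with `dof` degrees of
    freedom and scale matrix scale * I; it is positive semi-definite
    by construction.
    """
    w = [[0.0] * dim for _ in range(dim)]
    std = scale ** 0.5
    for _ in range(dof):
        z = [rng.gauss(0.0, std) for _ in range(dim)]
        for i in range(dim):
            for j in range(dim):
                w[i][j] += z[i] * z[j]
    return w


def perturb_covariance(cov, dof, scale, rng):
    """Add Wishart noise to a covariance matrix (input perturbation)."""
    dim = len(cov)
    noise = sample_wishart_isotropic(dim, dof, scale, rng)
    return [[cov[i][j] + noise[i][j] for j in range(dim)] for i in range(dim)]


def quadratic_form(matrix, x):
    """Compute x^T M x, which is non-negative for every x when M is PSD."""
    n = len(x)
    return sum(x[i] * matrix[i][j] * x[j] for i in range(n) for j in range(n))


rng = random.Random(0)
cov = [[2.0, 0.5], [0.5, 1.0]]  # hypothetical sample covariance
# dof and scale are illustrative assumptions, NOT privacy-calibrated values.
noisy = perturb_covariance(cov, dof=3, scale=0.1, rng=rng)
```

Since both the original covariance and the noise are positive semi-definite, their sum is as well, which is the property that symmetric additive noise drawn entry-wise would not generally preserve.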

242. Deep Learning with S-Shaped Rectified Linear Activation Units.

【Paper Link】【Pages】:1737-1743

【Authors】: Xiaojie Jin ; Chunyan Xu ; Jiashi Feng ; Yunchao Wei ; Junjun Xiong ; Shuicheng Yan

【Abstract】: Rectified linear activation units are important components for state-of-the-art deep convolutional networks. In this paper, we propose a novel S-shaped rectified linear activation unit (SReLU) to learn both convex and non-convex functions, imitating the multiple function forms given by the two fundamental laws, namely the Weber-Fechner law and the Stevens law, in psychophysics and neural sciences. Specifically, SReLU consists of three piecewise linear functions, which are formulated by four learnable parameters. The SReLU is learned jointly with the training of the whole deep network through back propagation. During the training phase, to initialize SReLU in different layers, we propose a “freezing” method to degenerate SReLU into a predefined leaky rectified linear unit in the initial several training epochs and then adaptively learn the good initial values. SReLU can be universally used in the existing deep networks with negligible additional parameters and computation cost. Experiments with two popular CNN architectures, Network in Network and GoogLeNet on scale-various benchmarks including CIFAR10, CIFAR100, MNIST and ImageNet demonstrate that SReLU achieves remarkable improvement compared to other activation functions.

【Keywords】:
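
The abstract above describes SReLU as three piecewise linear functions governed by four learnable parameters. One plausible reading, sketched here as an assumption rather than the authors' exact definition, uses two thresholds and two slopes with the identity in between; the parameter names are hypothetical and the function operates on a single scalar for clarity.

```python
def srelu(x, t_left, a_left, t_right, a_right):
    """S-shaped rectified linear unit on a scalar input (illustrative).

    Assumed parameterization (requires t_left <= t_right):
      x >= t_right          -> t_right + a_right * (x - t_right)
      x <= t_left           -> t_left  + a_left  * (x - t_left)
      t_left < x < t_right  -> x  (identity)
    """
    if x >= t_right:
        return t_right + a_right * (x - t_right)
    if x <= t_left:
        return t_left + a_left * (x - t_left)
    return x


# Hypothetical parameter values for one unit.
middle = srelu(0.5, t_left=0.0, a_left=0.1, t_right=1.0, a_right=2.0)
upper = srelu(2.0, t_left=0.0, a_left=0.1, t_right=1.0, a_right=2.0)
lower = srelu(-1.0, t_left=0.0, a_left=0.1, t_right=1.0, a_right=2.0)

# With the right threshold pushed to infinity, the unit reduces to a
# leaky ReLU with negative slope a_left, in the spirit of the "freezing"
# initialization mentioned in the abstract.
leaky = srelu(-2.0, t_left=0.0, a_left=0.2, t_right=float("inf"), a_right=1.0)
```

Under this reading the function is continuous at both thresholds, and the four parameters would be updated by backpropagation together with the network weights.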

243. Delay-Tolerant Online Convex Optimization: Unified Analysis and Adaptive-Gradient Algorithms.

【Paper Link】【Pages】:1744-1750

【Authors】: Pooria Joulani ; András György ; Csaba Szepesvári

【Abstract】: We present a unified, black-box-style method for developing and analyzing online convex optimization (OCO) algorithms for full-information online learning in delayed-feedback environments. Our new, simplified analysis enables us to substantially improve upon previous work and to solve a number of open problems from the literature. Specifically, we develop and analyze asynchronous AdaGrad-style algorithms from the Follow-the-Regularized-Leader (FTRL) and Mirror-Descent family that, unlike previous works, can handle projections and adapt both to the gradients and the delays, without relying on either strong convexity or smoothness of the objective function, or data sparsity. Our unified framework builds on a natural reduction from delayed-feedback to standard (non-delayed) online learning. This reduction, together with recent unification results for OCO algorithms, allows us to analyze the regret of generic FTRL and Mirror-Descent algorithms in the delayed-feedback setting in a unified manner using standard proof techniques. In addition, the reduction is exact and can be used to obtain both upper and lower bounds on the regret in the delayed-feedback setting.

【Keywords】: Machine Learning; Online Learning; Delayed Feedback; AdaGrad; Online Convex Optimization; FTRL; Mirror Descent;

244. Shakeout: A New Regularized Deep Neural Network Training Scheme.

【Paper Link】【Pages】:1751-1757

【Authors】: Guoliang Kang ; Jun Li ; Dacheng Tao

【Abstract】: Recent years have witnessed the success of deep neural networks in dealing with plenty of practical problems. The invention of effective training techniques largely contributes to this success. The so-called "Dropout" training scheme is one of the most powerful tools to reduce over-fitting. From the statistical point of view, Dropout works by implicitly imposing an L2 regularizer on the weights. In this paper, we present a new training scheme: Shakeout. Instead of randomly discarding units as Dropout does at the training stage, our method randomly chooses to enhance or reverse the contributions of each unit to the next layer. We show that our scheme leads to a combination of L1 regularization and L2 regularization imposed on the weights, which has been proved effective by the Elastic Net models in practice. We have empirically evaluated the Shakeout scheme and demonstrated that sparse network weights are obtained via Shakeout training. Our classification experiments on real-life image datasets MNIST and CIFAR-10 show that Shakeout deals with over-fitting effectively.

【Keywords】:

245. Bounded Optimal Exploration in MDP.

【Paper Link】【Pages】:1758-1764

【Authors】: Kenji Kawaguchi

【Abstract】: Within the framework of probably approximately correct Markov decision processes (PAC-MDP), much theoretical work has focused on methods to attain near optimality after a relatively long period of learning and exploration. However, practical concerns require the attainment of satisfactory behavior within a short period of time. In this paper, we relax the PAC-MDP conditions to reconcile theoretically driven exploration methods and practical needs. We propose simple algorithms for discrete and continuous state spaces, and illustrate the benefits of our proposed relaxation via theoretical analyses and numerical examples. Our algorithms also maintain anytime error bounds and average loss bounds. Our approach accommodates both Bayesian and non-Bayesian methods.

【Keywords】: Learning; Exploration; Markov Decision Process

246. Uncorrelated Group LASSO.

【Paper Link】【Pages】:1765-1771

【Authors】: Deguang Kong ; Ji Liu ; Bo Liu ; Xuan Bao

【Abstract】: The l2,1-norm is an effective regularization to enforce a simple group sparsity for feature learning. To capture some subtle structures among feature groups, we propose a new regularization called exclusive group l2,1-norm. It enforces the sparsity at the intra-group level by using l2,1-norm, while encouraging the selected features to distribute in different groups by using l2 norm at the inter-group level. The proposed exclusive group l2,1-norm is capable of eliminating the feature correlations in the context of feature selection, if highly correlated features are collected in the same groups. To solve the generic exclusive group l2,1-norm regularized problems, we propose an efficient iterative re-weighting algorithm and provide a rigorous convergence analysis. Experiment results on real world datasets demonstrate the effectiveness of the proposed new regularization and algorithm.

【Keywords】: exclusive; lasso; group; feature selection
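
For readers unfamiliar with the baseline regularizer named in the first sentence of the abstract above, the standard group-sparsity l2,1-norm can be sketched as the sum of the Euclidean norms of the feature groups. This shows only that standard norm, not the proposed exclusive group regularizer; the function name and the example weights are hypothetical.

```python
import math


def l21_norm(groups):
    """Standard l2,1-norm: the sum over groups of each group's l2 norm.

    Summing (an l1-style operation) over group norms drives entire
    groups to zero, which is what yields group sparsity.
    """
    return sum(math.sqrt(sum(w * w for w in group)) for group in groups)


# Hypothetical weights split into three feature groups.
weights = [[3.0, 4.0], [0.0, 0.0], [1.0, 0.0]]
penalty = l21_norm(weights)
```

A group whose weights are all zero contributes nothing to the penalty, while a non-zero group contributes its full Euclidean length regardless of how that length is spread across its features.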

247. Learning Future Classifiers without Additional Data.

【Paper Link】【Pages】:1772-1778

【Authors】: Atsutoshi Kumagai ; Tomoharu Iwata

【Abstract】: We propose probabilistic models for predicting future classifiers given labeled data with timestamps collected until the current time. In some applications, the decision boundary changes over time. For example, in spam mail classification, spammers continuously create new spam mails to overcome spam filters, and therefore, the decision boundary that classifies spam or non-spam can vary. Existing methods require additional labeled and/or unlabeled data to learn a time-evolving decision boundary. However, collecting these data can be expensive or impossible. By incorporating time-series models to capture the dynamics of a decision boundary, the proposed model can predict future classifiers without additional data. We developed two learning algorithms for the proposed model on the basis of variational Bayesian inference. The effectiveness of the proposed method is demonstrated with experiments using synthetic and real-world data sets.

【Keywords】: classification;time-series;concept drift

248. Compressed Conditional Mean Embeddings for Model-Based Reinforcement Learning.

【Paper Link】【Pages】:1779-1787

【Authors】: Guy Lever ; John Shawe-Taylor ; Ronnie Stafford ; Csaba Szepesvári

【Abstract】: We present a model-based approach to solving Markov decision processes (MDPs) in which the system dynamics are learned using conditional mean embeddings (CMEs). This class of methods comes with strong performance guarantees, and enables planning to be performed in an induced finite (pseudo-)MDP, which approximates the MDP, but can be solved exactly using dynamic programming. Existing methods have two drawbacks: firstly, the size of the induced finite (pseudo-)MDP scales quadratically with the amount of data used to learn the model, costing much memory and time when planning with the learned model; secondly, learning the CME itself using powerful kernel least-squares is costly, a second computational bottleneck. We present an algorithm which maintains a rich kernelized CME model class, but solves both problems: firstly we demonstrate that the loss function for the CME model suggests a principled approach to compressing the induced (pseudo-)MDP, leading to faster planning, while maintaining guarantees; secondly we propose to learn the CME model itself using fast sparse-greedy kernel regression well-suited to the RL context. We demonstrate superior performance to existing methods in this class of model-based approaches on a range of MDPs.

【Keywords】: Reinforcement Learning; Kernel Methods

249. Preconditioned Stochastic Gradient Langevin Dynamics for Deep Neural Networks.

【Paper Link】【Pages】:1788-1794

【Authors】: Chunyuan Li ; Changyou Chen ; David E. Carlson ; Lawrence Carin

【Abstract】: Effective training of deep neural networks suffers from two main issues. The first is that the parameter space of these models exhibits pathological curvature. Recent methods address this problem by using adaptive preconditioning for Stochastic Gradient Descent (SGD). These methods improve convergence by adapting to the local geometry of parameter space. A second issue is overfitting, which is typically addressed by early stopping. However, recent work has demonstrated that Bayesian model averaging mitigates this problem. The posterior can be sampled by using Stochastic Gradient Langevin Dynamics (SGLD). However, the rapidly changing curvature renders default SGLD methods inefficient. Here, we propose combining adaptive preconditioners with SGLD. In support of this idea, we give theoretical properties on asymptotic convergence and predictive risk. We also provide empirical results for Logistic Regression, Feedforward Neural Nets, and Convolutional Neural Nets, demonstrating that our preconditioned SGLD method gives state-of-the-art performance on these models.

【Keywords】: precondition; stochastic gradient MCMC; stochastic gradient Langevin dynamics; deep neural networks
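
The SGLD update mentioned in the abstract above can be illustrated with a minimal sketch. This is plain SGLD, not the preconditioned variant the paper proposes, and the toy model (inferring the mean of a unit-variance Gaussian), the function name `sgld_gaussian_mean`, and all hyperparameter values are assumptions chosen only for illustration.

```python
import numpy as np

def sgld_gaussian_mean(data, n_steps=5000, step=1e-4, batch=50,
                       prior_var=100.0, seed=0):
    """Plain SGLD for the mean of a unit-variance Gaussian (toy example)."""
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    theta = 0.0
    samples = np.empty(n_steps)
    for t in range(n_steps):
        # Draw a minibatch (with replacement) and rescale its gradient to the full data set.
        idx = rng.integers(0, n, size=batch)
        grad_prior = -theta / prior_var
        grad_lik = (n / batch) * np.sum(data[idx] - theta)
        # Half a gradient step on the log posterior plus Gaussian noise of variance `step`.
        theta += 0.5 * step * (grad_prior + grad_lik)
        theta += np.sqrt(step) * rng.standard_normal()
        samples[t] = theta
    return samples

rng = np.random.default_rng(1)
data = 2.0 + rng.standard_normal(1000)   # synthetic observations, true mean 2.0
samples = sgld_gaussian_mean(data)
posterior_mean_estimate = samples[1000:].mean()  # discard early steps as burn-in
```

A preconditioned variant would rescale both the gradient step and the injected noise per parameter; that part is deliberately left out here because the abstract does not specify it.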

250. High-Order Stochastic Gradient Thermostats for Bayesian Learning of Deep Models.

【Paper Link】【Pages】:1795-1801

【Authors】: Chunyuan Li ; Changyou Chen ; Kai Fan ; Lawrence Carin

【Abstract】: Learning in deep models using Bayesian methods has generated significant attention recently. This is largely because of the feasibility of modern Bayesian methods to yield scalable learning and inference, while maintaining a measure of uncertainty in the model parameters. Stochastic gradient MCMC algorithms (SG-MCMC) are a family of diffusion-based sampling methods for large-scale Bayesian learning. In SG-MCMC, multivariate stochastic gradient thermostats (mSGNHT) augment each parameter of interest with a momentum and a thermostat variable to maintain stationary distributions as target posterior distributions. As the number of variables in a continuous-time diffusion increases, its numerical approximation error becomes a practical bottleneck, so better use of a numerical integrator is desirable. To this end, we propose the use of an efficient symmetric splitting integrator in mSGNHT, instead of the traditional Euler integrator. We demonstrate that the proposed scheme is more accurate, more robust, and converges faster. These properties are demonstrated to be desirable in Bayesian deep learning. Extensive experiments on two canonical models and their deep extensions demonstrate that the proposed scheme improves general Bayesian posterior sampling, particularly for deep models.

【Keywords】: SG-MCMC;Stochastic Gradient Thermostats; Numerical Integrator; Bayesian Learning; Deep Models

251. Multi-Objective Self-Paced Learning.

【Paper Link】【Pages】:1802-1808

【Authors】: Hao Li ; Maoguo Gong ; Deyu Meng ; Qiguang Miao

【Abstract】: Current self-paced learning (SPL) regimes adopt a greedy strategy to obtain the solution with a gradually increasing pace parameter, while it is difficult to determine where to optimally terminate this increasing process. Besides, most SPL implementations are very sensitive to initialization and lack a theoretical result clarifying where SPL converges as the pace parameter increases. In this paper, we propose a novel multi-objective self-paced learning (MOSPL) method to address these issues. Specifically, we decompose the objective function into two terms, the loss and the self-paced regularizer, and treat the problem as a compromise between these two objectives. This naturally reformulates the SPL problem as a standard multi-objective problem. A multi-objective evolutionary algorithm is used to optimize the two objectives simultaneously to facilitate the rational selection of a proper pace parameter. The proposed technique is capable of improving a set of solutions with respect to a range of pace parameters by finely compromising between these solutions, and making them perform robustly even under bad initialization. A good solution can then be naturally obtained from these solutions by making use of some off-the-shelf tools in multi-objective optimization. Experimental results on matrix factorization and action recognition demonstrate the superiority of the proposed method against the existing issues in current SPL research.

【Keywords】: Self-paced Learning; Multi-objective Optimization

252. Scalable Sequential Spectral Clustering.

【Paper Link】【Pages】:1809-1815

【Authors】: Yeqing Li ; Junzhou Huang ; Wei Liu

【Abstract】: In the past decades, Spectral Clustering (SC) has become one of the most effective clustering approaches. Although it has been widely used, one significant drawback of SC is its expensive computation cost. Many efforts have been devoted to accelerating SC algorithms and promising results have been achieved. However, most of the existing algorithms rely on the assumption that data can be stored in the computer memory. When data cannot fit in the memory, these algorithms will suffer severe performance degradations. In order to overcome this issue, we propose a novel sequential SC algorithm for tackling large-scale clustering with limited computational resources, e.g., memory. We begin by investigating an effective way of approximating the graph affinity matrix by leveraging a bipartite graph. Then we choose a smart graph construction and optimization strategy to avoid random access to data. These efforts lead to an efficient SC algorithm whose memory usage is independent of the number of input data points. Extensive experiments carried out on large datasets demonstrate that the proposed sequential SC algorithm is up to a thousand times faster than state-of-the-art methods.

【Keywords】: spectral clustering, large scale, sequential k-means

253. Towards Safe Semi-Supervised Learning for Multivariate Performance Measures.

【Paper Link】【Pages】:1816-1822

【Authors】: Yu-Feng Li ; James T. Kwok ; Zhi-Hua Zhou

【Abstract】: Semi-supervised learning (SSL) is an important research problem in machine learning. While it is usually expected that the use of unlabeled data can improve performance, in many cases SSL is outperformed by supervised learning using only labeled data. As a result, the construction of a performance-safe SSL method has become a key issue of SSL study. To alleviate this problem, and because practical tasks call for various performance measures, we propose in this paper the UMVP (safe semi-sUpervised learning for MultiVariate Performance measure) method. The proposed method integrates multiple semi-supervised learners, and maximizes the worst-case performance gain to derive the final prediction. The overall problem is formulated as a maximin optimization. In order to solve the resultant difficult maximin optimization, this paper shows that when the performance measure is the Top-k Precision, Fβ score or AUC, a minimax convex relaxation of the maximin optimization can be solved efficiently. Experimental results show that the proposed method can effectively improve the safeness of SSL under multiple multivariate performance measures.

【Keywords】: safe semi-supervised learning

254. Accelerating Random Kaczmarz Algorithm Based on Clustering Information.

【Paper Link】【Pages】:1823-1829

【Authors】: Yujun Li ; Kaichun Mo ; Haishan Ye

【Abstract】: The Kaczmarz algorithm is an efficient iterative algorithm for solving overdetermined consistent systems of linear equations. During each updating step, Kaczmarz chooses a hyperplane based on an individual equation and projects the current estimate of the exact solution onto that space to get a new estimate. Many variants of the Kaczmarz algorithm have been proposed that differ in how they choose better hyperplanes. Using the property of randomly sampled data in high-dimensional space, we propose an accelerated algorithm based on clustering information to improve block Kaczmarz and Kaczmarz via the Johnson-Lindenstrauss lemma. Additionally, we theoretically demonstrate convergence improvement on the block Kaczmarz algorithm.

【Keywords】: Kaczmarz algorithm; consistent linear equation system; matrix block
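
The projection step described in the abstract above can be sketched as follows. This is a basic randomized Kaczmarz iteration that samples rows with probability proportional to their squared norm; it is not the clustering-accelerated algorithm the paper proposes, and the function name `randomized_kaczmarz` and the synthetic system are assumptions made for illustration.

```python
import numpy as np

def randomized_kaczmarz(A, b, n_iters=5000, seed=0):
    """Solve a consistent system A x = b by repeated projection onto single equations."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    row_norms_sq = np.einsum("ij,ij->i", A, A)
    probs = row_norms_sq / row_norms_sq.sum()   # sample rows by squared norm
    x = np.zeros(n)
    for _ in range(n_iters):
        i = rng.choice(m, p=probs)
        a_i = A[i]
        # Orthogonal projection of x onto the hyperplane {z : a_i . z = b_i}.
        x += (b[i] - a_i @ x) / row_norms_sq[i] * a_i
    return x

# Synthetic overdetermined, consistent system (assumed data, for illustration only).
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 10))
x_true = rng.standard_normal(10)
b = A @ x_true
x_hat = randomized_kaczmarz(A, b)
```

Block variants project onto the solution set of several equations at once; the choice of which rows to group is where clustering information could enter.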

255. Fast and Accurate Refined Nyström-Based Kernel SVM.

【Paper Link】【Pages】:1830-1836

【Authors】: Zhe Li ; Tianbao Yang ; Lijun Zhang ; Rong Jin

【Abstract】: In this paper, we focus on improving the performance of the Nyström-based kernel SVM. Although the Nyström approximation has been studied extensively and its application to kernel classification has been exhibited in several studies, there still exists a potentially large gap between the performance of a classifier learned with the Nyström approximation and that of one learned with the original kernel. In this work, we make novel contributions to bridge the gap without increasing the training costs too much by proposing a refined Nyström-based kernel classifier. We adopt a two-step approach: in the first step we learn a sufficiently good dual solution, and in the second step we use the obtained dual solution to construct a new set of bases for the Nyström approximation to re-train a refined classifier. Our approach towards learning a good dual solution is based on a sparse-regularized dual formulation with the Nyström approximation, which can be solved with the same time complexity as solving the standard formulation. We justify our approach by establishing a theoretical guarantee on the error of the learned dual solution in the first step with respect to the optimal dual solution under appropriate conditions. The experimental results demonstrate that (i) the dual solution obtained by our approach in the first step is closer to the optimal solution and yields improved prediction performance; and (ii) the second step, using the obtained dual solution to re-train the model, further improves the performance.

【Keywords】: Kernel SVM;Nystrom Method;Classification
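
For readers unfamiliar with the Nyström approximation that the abstract above builds on, here is a minimal sketch of the standard construction K ≈ C W⁺ Cᵀ from a subset of landmark columns. It does not implement the paper's refined two-step classifier; the function name `nystrom_approximation`, the linear kernel, and the random landmark choice are assumptions made for illustration.

```python
import numpy as np

def nystrom_approximation(K_nm, K_mm, rcond=1e-10):
    """Standard Nystrom approximation: K ~ K_nm @ pinv(K_mm) @ K_nm.T."""
    return K_nm @ np.linalg.pinv(K_mm, rcond=rcond) @ K_nm.T

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))
landmarks = rng.choice(100, size=20, replace=False)  # randomly chosen landmark points

# A linear kernel has rank at most 5 here, so 20 landmarks recover it essentially exactly.
K = X @ X.T
K_nm = K[:, landmarks]                    # kernel between all points and landmarks
K_mm = K[np.ix_(landmarks, landmarks)]    # kernel among landmarks
K_approx = nystrom_approximation(K_nm, K_mm)
max_error = np.abs(K - K_approx).max()
```

With a full-rank kernel such as an RBF kernel the approximation is no longer exact, and its quality depends on which landmarks are chosen, which is the gap the abstract refers to.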

256. How Important Is Weight Symmetry in Backpropagation?

【Paper Link】【Pages】:1837-1844

【Authors】: Qianli Liao ; Joel Z. Leibo ; Tomaso A. Poggio

【Abstract】: Gradient backpropagation (BP) requires symmetric feedforward and feedback connections: the same weights must be used for forward and backward passes. This "weight transport problem" (Grossberg 1987) is thought to be one of the main reasons to doubt BP's biological plausibility. Using 15 different classification datasets, we systematically investigate to what extent BP really depends on weight symmetry. In a study that turned out to be surprisingly similar in spirit to Lillicrap et al.'s demonstration (Lillicrap et al. 2014) but orthogonal in its results, our experiments indicate that: (1) the magnitudes of feedback weights do not matter to performance; (2) the signs of feedback weights do matter: the more concordant signs between feedforward and their corresponding feedback connections, the better; (3) with feedback weights having random magnitudes and 100% concordant signs, we were able to achieve the same or even better performance than SGD; (4) some normalizations/stabilizations are indispensable for such asymmetric BP to work, namely Batch Normalization (BN) (Ioffe and Szegedy 2015) and/or a "Batch Manhattan" (BM) update rule.

【Keywords】: neural networks,deep learning,backpropagation,biological plausibility,batch normalization,batch Manhattan,weight transport,weight symmetry,neuroscience

257. Re-Active Learning: Active Learning with Relabeling.

【Paper Link】【Pages】:1845-1852

【Authors】: Christopher H. Lin ; Mausam ; Daniel S. Weld

【Abstract】: Active learning seeks to train the best classifier at the lowest annotation cost by intelligently picking the best examples to label. Traditional algorithms assume there is a single annotator and disregard the possibility of requesting additional independent annotations for a previously labeled example. However, relabeling examples is important, because all annotators make mistakes — especially crowdsourced workers, who have become a common source of training data. This paper seeks to understand the difference in marginal value between decreasing the noise of the training set via relabeling and increasing the size and diversity of the (noisier) training set by labeling new examples. We use the term re-active learning to denote this generalization of active learning. We show how traditional active learning methods perform poorly at re-active learning, present new algorithms designed for this important problem, formally characterize their behavior, and empirically show that our methods effectively make this tradeoff.

【Keywords】: Active Learning, Crowdsourcing, Human Computation

258. Interaction Point Processes via Infinite Branching Model.

【Paper Link】【Pages】:1853-1859

【Authors】: Peng Lin ; Bang Zhang ; Ting Guo ; Yang Wang ; Fang Chen

【Abstract】: Many natural and social phenomena can be modeled by interaction point processes (IPPs) (Diggle et al. 1994), stochastic point processes considering the interaction between points. In this paper, we propose the infinite branching model (IBM), a Bayesian statistical model that can generalize and extend some popular IPPs, e.g., Hawkes process (Hawkes 1971; Hawkes and Oakes 1974). It treats IPP as a mixture of basis point processes with the aid of a distance dependent prior over branching structure that describes the relationship between points. The IBM can estimate point event intensity, interaction mechanism and branching structure simultaneously. A generic Metropolis-within-Gibbs sampling method is also developed for model parameter inference. The experiments on synthetic and real-world data demonstrate the superiority of the IBM.

【Keywords】: Interaction point process; Bayesian nonparametric approach; Branching structure

259. Gaussian Process Planning with Lipschitz Continuous Reward Functions: Towards Unifying Bayesian Optimization, Active Learning, and Beyond.

【Paper Link】【Pages】:1860-1866

【Authors】: Chun Kai Ling ; Kian Hsiang Low ; Patrick Jaillet

【Abstract】: This paper presents a novel nonmyopic adaptive Gaussian process planning (GPP) framework endowed with a general class of Lipschitz continuous reward functions that can unify some active learning/sensing and Bayesian optimization criteria and offer practitioners some flexibility to specify their desired choices for defining new tasks/problems. In particular, it utilizes a principled Bayesian sequential decision problem framework for jointly and naturally optimizing the exploration-exploitation trade-off. In general, the resulting induced GPP policy cannot be derived exactly due to an uncountable set of candidate observations. A key contribution of our work here thus lies in exploiting the Lipschitz continuity of the reward functions to solve for a nonmyopic adaptive epsilon-optimal GPP (epsilon-GPP) policy. To plan in real time, we further propose an asymptotically optimal, branch-and-bound anytime variant of epsilon-GPP with performance guarantee. We empirically demonstrate the effectiveness of our epsilon-GPP policy and its anytime variant in Bayesian optimization and an energy harvesting task.

【Keywords】: Non-myopic planning, Gaussian process, Bayesian optimization, active learning

260. Online ARIMA Algorithms for Time Series Prediction.

【Paper Link】【Pages】:1867-1873

【Authors】: Chenghao Liu ; Steven C. H. Hoi ; Peilin Zhao ; Jianling Sun

【Abstract】: Autoregressive integrated moving average (ARIMA) is one of the most popular linear models for time series forecasting due to its nice statistical properties and great flexibility. However, its parameters are estimated in a batch manner and its noise terms are often assumed to be strictly bounded, which restricts its applications and makes it inefficient for handling large-scale real data. In this paper, we propose online learning algorithms for estimating ARIMA models under relaxed assumptions on the noise terms, which are suitable for a wider range of applications and enjoy high computational efficiency. The idea of our ARIMA method is to reformulate the ARIMA model into a task of full information online optimization (without random noise terms). As a consequence, we can estimate the parameters online in an efficient and scalable way. Furthermore, we analyze regret bounds of the proposed algorithms, which guarantee that our online ARIMA model is provably as good as the best ARIMA model in hindsight. Finally, our encouraging experimental results further validate the effectiveness and robustness of our method.

【Keywords】:

261. Consensus Guided Unsupervised Feature Selection.

【Paper Link】【Pages】:1874-1880

【Authors】: Hongfu Liu ; Ming Shao ; Yun Fu

【Abstract】: Feature selection has been widely recognized as one of the key problems in the data mining and machine learning community, especially for high-dimensional data with redundant information, partial noise and outliers. Recently, unsupervised feature selection has attracted substantial research attention since data acquisition is rather cheap today but labeling work is still expensive and time consuming. This is specifically useful for effective feature selection in clustering tasks. Recent works using sparse projection with pre-learned pseudo labels achieve appealing results; however, they generate pseudo labels with all features, so that noisy and ineffective features degrade the cluster structure and further harm the performance of feature selection; besides, these methods suffer from the complex composition of multiple constraints and computational inefficiency, e.g., eigen-decomposition. In contrast, in this work we introduce consensus clustering for pseudo labeling, which avoids expensive eigen-decomposition and provides better clustering accuracy with high robustness. In addition, complex constraints such as non-negativity are removed due to the crisp indicators of consensus clustering. Specifically, we propose an efficient formulation for our unsupervised feature selection by using the utility function and provide theoretical analysis on optimization rules and model convergence. Extensive experiments on several popular data sets demonstrate that our methods are superior to the most recent state-of-the-art works in terms of NMI.

【Keywords】: Feature Selection; Consensus Clustering

262. Sparse Perceptron Decision Tree for Millions of Dimensions.

【Paper Link】【Pages】:1881-1887

【Authors】: Weiwei Liu ; Ivor W. Tsang

【Abstract】: Due to their nonlinear but highly interpretable representations, decision tree (DT) models have attracted a lot of attention from researchers. However, DT models usually suffer from the curse of dimensionality and achieve degenerated performance when there are many noisy features. To address these issues, this paper first presents a novel data-dependent generalization error bound for the perceptron decision tree (PDT), which provides the theoretical justification to learn a sparse linear hyperplane in each decision node and to prune the tree. Following our analysis, we introduce the notion of a sparse perceptron decision node (SPDN) with a budget constraint on the weight coefficients, and propose a sparse perceptron decision tree (SPDT) algorithm to achieve nonlinear prediction performance. To avoid generating an unstable and complicated decision tree and to improve the generalization of the SPDT, we present a pruning strategy by learning classifiers to minimize cross-validation errors on each SPDN. Extensive empirical studies verify that our SPDT is more resilient to noisy features and effectively generates a small, yet accurate decision tree. Compared with state-of-the-art DT methods and SVM, our SPDT achieves better generalization performance on ultrahigh dimensional problems with more than 1 million features.

【Keywords】:

263. Multiple Kernel k-Means Clustering with Matrix-Induced Regularization.

【Paper Link】【Pages】:1888-1894

【Authors】: Xinwang Liu ; Yong Dou ; Jianping Yin ; Lei Wang ; En Zhu

【Abstract】: Multiple kernel k-means (MKKM) clustering aims to optimally combine a group of pre-specified kernels to improve clustering performance. However, we observe that existing MKKM algorithms do not sufficiently consider the correlation among these kernels. This could result in selecting mutually redundant kernels and affect the diversity of information sources utilized for clustering, which finally hurts the clustering performance. To address this issue, this paper proposes an MKKM clustering with a novel, effective matrix-induced regularization to reduce such redundancy and enhance the diversity of the selected kernels. We theoretically justify this matrix-induced regularization by revealing its connection with the commonly used kernel alignment criterion. Furthermore, this justification shows that maximizing the kernel alignment for clustering can be viewed as a special case of our approach and indicates the extendability of the proposed matrix-induced regularization for designing better clustering algorithms. As experimentally demonstrated on five challenging MKL benchmark data sets, our algorithm significantly improves existing MKKM and consistently outperforms the state-of-the-art ones in the literature, verifying the effectiveness and advantages of incorporating the proposed matrix-induced regularization.

【Keywords】: Multiple kernel learning; Clustering; Matrix-induced regularization

264. Finding One's Best Crowd: Online Learning By Exploiting Source Similarity.

【Paper Link】【Pages】:1895-1901

【Authors】: Yang Liu ; Mingyan Liu

【Abstract】: We consider an online learning problem (classification or prediction) involving disparate sources of sequentially arriving data, whereby a user over time learns the best set of data sources to use in constructing the classifier by exploiting their similarity. We first show that, when (1) the similarity information among data sources is known, and (2) data from different sources can be acquired without cost, then a judicious selection of data from different sources can effectively enlarge the training sample size compared to using a single data source, thereby improving the rate and performance of learning; this is achieved by bounding the classification error of the resulting classifier. We then relax assumption (1) and characterize the loss in learning performance when the similarity information must also be acquired through repeated sampling. We further relax both (1) and (2) and present a cost-efficient algorithm that identifies a best crowd from a potentially large set of data sources in terms of both classifier performance and data acquisition cost. This problem has various applications, including online prediction systems with time series data of various forms, such as financial markets, advertisement and network measurement.

【Keywords】: online learning;disparate data source;similarity;prediction;crowdsourcing

265. Learning FRAME Models Using CNN Filters.

【Paper Link】【Pages】:1902-1910

【Authors】: Yang Lu ; Song-Chun Zhu ; Ying Nian Wu

【Abstract】: The convolutional neural network (ConvNet or CNN) has proven to be very successful in many tasks such as those in computer vision. In this conceptual paper, we study the generative perspective of the discriminative CNN. In particular, we propose to learn the generative FRAME (Filters, Random field, And Maximum Entropy) model using the highly expressive filters pre-learned by the CNN at the convolutional layers. We show that the learning algorithm can generate realistic and rich object and texture patterns in natural scenes. We explain that each learned model corresponds to a new CNN unit at a layer above the layer of filters employed by the model. We further show that it is possible to learn a new layer of CNN units using a generative CNN model, which is a product of experts model, and the learning algorithm admits an EM interpretation with binary latent variables.

【Keywords】: CNN; Generative Learning; Random Fields; Maximum Entropy

266. Sparse Latent Space Policy Search.

【Paper Link】【Pages】:1911-1918

【Authors】: Kevin Sebastian Luck ; Joni Pajarinen ; Erik Berger ; Ville Kyrki ; Heni Ben Amor

【Abstract】: Computational agents often need to learn policies that involve many control variables, e.g., a robot needs to control several joints simultaneously. Learning a policy with a high number of parameters, however, usually requires a large number of training samples. We introduce a reinforcement learning method for sample-efficient policy search that exploits correlations between control variables. Such correlations are particularly frequent in motor skill learning tasks. The introduced method uses Variational Inference to estimate policy parameters, while at the same time uncovering a low-dimensional latent space of controls. Prior knowledge about the task and the structure of the learning agent can be provided by specifying groups of potentially correlated parameters. This information is then used to impose sparsity constraints on the mapping between the high-dimensional space of controls and a lower-dimensional latent space. In experiments with a simulated bi-manual manipulator, the new approach effectively identifies synergies between joints, performs efficient low-dimensional policy search, and outperforms state-of-the-art policy search methods.

【Keywords】: Policy Search; Reinforcement Learning; Robotics; Dimensionality Reduction

267. Expected Tensor Decomposition with Stochastic Gradient Descent.

【Paper Link】【Pages】:1919-1925

【Authors】: Takanori Maehara ; Kohei Hayashi ; Ken-ichi Kawarabayashi

【Abstract】: In this study, we investigate expected CP decomposition, a special case of CP decomposition in which a tensor to be decomposed is given as the sum or average of tensor samples X(t) for t = 1, ..., T. To determine this decomposition, we develop stochastic-gradient-descent-type algorithms with four appealing features: efficient memory use, ability to work in an online setting, robustness of parameter tuning, and simplicity. Our theoretical analysis shows that the solutions do not diverge to infinity for any initial value or step size. Experimental results confirm that our algorithms significantly outperform all existing methods in terms of accuracy. We also show that they can successfully decompose a large tensor containing billion-scale nonzero elements.

【Keywords】: tensor; CP-decomposition; stochastic gradient descent

268. Offline Evaluation of Online Reinforcement Learning Algorithms.

【Paper Link】【Pages】:1926-1933

【Authors】: Travis Mandel ; Yun-En Liu ; Emma Brunskill ; Zoran Popovic

【Abstract】: In many real-world reinforcement learning problems, we have access to an existing dataset and would like to use it to evaluate various learning approaches. Typically, one would prefer not to deploy a fixed policy, but rather an algorithm that learns to improve its behavior as it gains more experience. Therefore, we seek to evaluate how a proposed algorithm learns in our environment, meaning we need to evaluate how an algorithm would have gathered experience if it were run online. In this work, we develop three new evaluation approaches which guarantee that, given some history, algorithms are fed samples from the distribution that they would have encountered if they were run online. Additionally, we are the first to propose an approach that is provably unbiased given finite data, eliminating bias due to the length of the evaluation. Finally, we compare the sample-efficiency of these approaches on multiple datasets, including one from a real-world deployment of an educational game.

【Keywords】: Offline Evaluation; Nonstationary Policy Evaluation; Unbiased estimator; Replayer; Exploration and Exploitation

269. Reinforcement Learning with Parameterized Actions.

【Paper Link】【Pages】:1934-1940

【Authors】: Warwick Masson ; Pravesh Ranchod ; George Konidaris

【Abstract】: We introduce a model-free algorithm for learning in Markov decision processes with parameterized actions—discrete actions with continuous parameters. At each step the agent must select both which action to use and which parameters to use with that action. We introduce the Q-PAMDP algorithm for learning in these domains, show that it converges to a local optimum, and compare it to direct policy search in the goal-scoring and Platform domains.

【Keywords】: Reinforcement Learning; Parameterized Actions; Parameterized Policies

270. Fixed-Rank Supervised Metric Learning on Riemannian Manifold.

【Paper Link】【Pages】:1941-1947

【Authors】: Yadong Mu

【Abstract】: Metric learning has become a critical tool in many machine learning tasks. This paper focuses on learning an optimal Mahalanobis distance matrix (parameterized by a positive semi-definite matrix W) in the setting of supervised learning. Recently, particular research attention has been attracted by low-rank metric learning, which requires that matrix W is dominated by a few large singular values. In the era of high feature dimensions, low-rank metric learning effectively reduces the storage and computation overheads. However, existing low-rank metric learning algorithms usually adopt sophisticated regularization (such as LogDet divergence) for encouraging matrix low-rankness, which unfortunately incurs iterative computations of matrix SVD. In this paper, we tackle low-rank metric learning by enforcing a fixed-rank constraint on the matrix W. We harness the Riemannian manifold geometry of the collection of fixed-rank matrices and devise a novel second-order Riemannian retraction operator. The proposed operator is efficient and ensures that W always resides on the manifold. Comprehensive numerical experiments conducted on benchmarks clearly suggest that the proposed algorithm is substantially superior to or on par with the state of the art in terms of k-NN classification accuracy. Moreover, the proposed manifold retraction operator can also be naturally applied in generic rank-constrained machine learning algorithms.

【Keywords】:

271. All-in Text: Learning Document, Label, and Word Representations Jointly.

【Paper Link】【Pages】:1948-1954

【Authors】: Jinseok Nam ; Eneldo Loza Mencía ; Johannes Fürnkranz

【Abstract】: Conventional multi-label classification algorithms treat the target labels of the classification task as mere symbols that are devoid of inherent semantics. However, in many cases textual descriptions of these labels are available or can be easily constructed from public document sources such as Wikipedia. In this paper, we investigate an approach for embedding documents and labels into a joint space while sharing word representations between documents and labels. For finding such embeddings, we rely on the text of documents as well as descriptions for the labels. The use of such label descriptions not only lets us expect an increased performance on conventional multi-label text classification tasks, but can also be used to make predictions for labels that have not been seen during the training phase. The potential of our method is demonstrated on the multi-label classification task of assigning keywords from the Medical Subject Headings (MeSH) to publications in biomedical research, both in a conventional and in a zero-shot learning setting.

【Keywords】:

272. Holographic Embeddings of Knowledge Graphs.

【Paper Link】【Pages】:1955-1961

【Authors】: Maximilian Nickel ; Lorenzo Rosasco ; Tomaso A. Poggio

【Abstract】: Learning embeddings of entities and relations is an efficient and versatile method to perform machine learning on relational data such as knowledge graphs. In this work, we propose holographic embeddings (HolE) to learn compositional vector space representations of entire knowledge graphs. The proposed method is related to holographic models of associative memory in that it employs circular correlation to create compositional representations. By using correlation as the compositional operator, HolE can capture rich interactions but simultaneously remains efficient to compute, easy to train, and scalable to very large datasets. Experimentally, we show that holographic embeddings are able to outperform state-of-the-art methods for link prediction on knowledge graphs and relational learning benchmark datasets.

【Keywords】: Knowledge Graph; Compositional Embeddings; Holographic Embeddings
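
The circular correlation mentioned in the abstract can be sketched in a few lines. The scoring form used below (a sigmoid of the relation vector dotted with the correlated entity pair) is the commonly cited HolE formulation and is not stated in the abstract itself, so treat it, along with the function names and toy vectors, as an assumption of this example.

```python
import math


def circular_correlation(a, b):
    """Return c with c[k] = sum_i a[i] * b[(k + i) mod d]."""
    d = len(a)
    return [sum(a[i] * b[(k + i) % d] for i in range(d)) for k in range(d)]


def hole_score(e_s, r_p, e_o):
    """Sigmoid of r_p dotted with the circular correlation of e_s and e_o."""
    composed = circular_correlation(e_s, e_o)
    eta = sum(r * c for r, c in zip(r_p, composed))
    return 1.0 / (1.0 + math.exp(-eta))


# Correlation is not commutative, so subject and object roles are distinguished.
print(circular_correlation([1, 0, 0], [0, 1, 0]))  # [0, 1, 0]
print(circular_correlation([0, 1, 0], [1, 0, 0]))  # [0, 0, 1]
```

This direct version costs O(d^2) per pair; an FFT-based evaluation would bring it to O(d log d), and the composed vector has the same dimension d as the entity embeddings.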

273. New l1-Norm Relaxations and Optimizations for Graph Clustering.

【Paper Link】【Pages】:1962-1968

【Authors】: Feiping Nie ; Hua Wang ; Cheng Deng ; Xinbo Gao ; Xuelong Li ; Heng Huang

【Abstract】: In recent data mining research, graph clustering methods, such as normalized cut and ratio cut, have been well studied and applied to solve many unsupervised learning applications. The original graph clustering methods are NP-hard problems. Traditional approaches used spectral relaxation to solve the graph clustering problems. The main disadvantage of these approaches is that the obtained spectral solutions could severely deviate from the true solution. To solve this problem, in this paper, we propose a new relaxation mechanism for graph clustering methods. Instead of minimizing the squared distances of clustering results, we use the l1-norm distance. More importantly, considering the normalized consistency, we also use the l1-norm for the normalized terms in the new graph clustering relaxations. Due to the sparse result from the l1-norm minimization, the solutions of our new relaxed graph clustering methods get discrete values with many zeros, which are close to the ideal solutions. Our new objectives are difficult to optimize, because the minimization problem involves the ratio of nonsmooth terms. The existing sparse learning optimization algorithms cannot be applied to solve this problem. In this paper, we propose a new optimization algorithm to solve this difficult non-smooth ratio minimization problem. Extensive experiments were performed on three two-way clustering and eight multi-way clustering benchmark data sets. All empirical results show that our new relaxation methods consistently enhance the normalized cut and ratio cut clustering results.

【Keywords】: Graph Clustering; l1-norm Normalized Cut

274. The Constrained Laplacian Rank Algorithm for Graph-Based Clustering.

【Paper Link】【Pages】:1969-1976

【Authors】: Feiping Nie ; Xiaoqian Wang ; Michael I. Jordan ; Heng Huang

【Abstract】: Graph-based clustering methods perform clustering on a fixed input data graph. If this initial construction is of low quality then the resulting clustering may also be of low quality. Moreover, existing graph-based clustering methods require post-processing on the data graph to extract the clustering indicators. We address both of these drawbacks by allowing the data graph itself to be adjusted as part of the clustering procedure. In particular, our Constrained Laplacian Rank (CLR) method learns a graph with exactly k connected components (where k is the number of clusters). We develop two versions of this method, based upon the L1-norm and the L2-norm, which yield two new graph-based clustering objectives. We derive optimization algorithms to solve these objectives. Experimental results on synthetic datasets and real-world benchmark datasets exhibit the effectiveness of this new graph-based clustering method.

【Keywords】: Clustering; Graph Construction; Constrained Laplacian Rank
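
The target property in the abstract, a graph with exactly k connected components, can be checked directly. A standard fact from spectral graph theory is that the number of connected components equals the multiplicity of the zero eigenvalue of the graph Laplacian, which is presumably what the rank constraint in the method's name refers to. The sketch below only counts components of a given weight matrix by graph traversal; it does not learn the graph, and the function name and toy matrix are illustrative assumptions.

```python
def count_components(weights):
    """Count connected components of an undirected graph given as a
    symmetric n x n weight matrix (a nonzero entry means an edge)."""
    n = len(weights)
    seen = [False] * n
    components = 0
    for start in range(n):
        if seen[start]:
            continue
        # Every unseen node starts a new component; flood-fill from it.
        components += 1
        seen[start] = True
        stack = [start]
        while stack:
            u = stack.pop()
            for v in range(n):
                if weights[u][v] != 0 and not seen[v]:
                    seen[v] = True
                    stack.append(v)
    return components


# Hypothetical learned graph: two blocks, hence two clusters.
W = [
    [0.0, 0.9, 0.0, 0.0],
    [0.9, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.7],
    [0.0, 0.0, 0.7, 0.0],
]
print(count_components(W))  # 2
```

When the learned graph already has exactly k components, the cluster assignment can be read off the components themselves, which is why no separate post-processing step is needed.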

275. Efficient PAC-Optimal Exploration in Concurrent, Continuous State MDPs with Delayed Updates.

【Paper Link】【Pages】:1977-1985

【Authors】: Jason Pazis ; Ronald Parr

【Abstract】: We present a new, efficient PAC optimal exploration algorithm that is able to explore in multiple, continuous or discrete state MDPs simultaneously. Our algorithm does not assume that value function updates can be completed instantaneously, and maintains PAC guarantees in realtime environments. Not only do we extend the applicability of PAC optimal exploration algorithms to new, realistic settings, but even when instant value function updates are possible, our bounds present a significant improvement over previous single MDP exploration bounds, and a drastic improvement over previous concurrent PAC bounds. We also present TCE, a new, fine grained metric for the cost of exploration.

【Keywords】: MDP; exploration; PAC; optimal; concurrent; efficient; delayed

276. Viral Clustering: A Robust Method to Extract Structures in Heterogeneous Datasets.

【Paper Link】【Pages】:1986-1992

【Authors】: Vahan Petrosyan ; Alexandre Proutière

【Abstract】: Cluster validation constitutes one of the most challenging problems in unsupervised cluster analysis. For example, identifying the true number of clusters present in a dataset has been investigated for decades, and is still puzzling researchers today. The difficulty stems from the high variety of the dataset characteristics. Some datasets exhibit a strong structure with a few well-separated and normally distributed clusters, but most often real-world datasets contain possibly many overlapping non-Gaussian clusters with heterogeneous variances and shapes. This calls for the design of robust clustering algorithms that could adapt to the structure of the data and in particular accurately guess the true number of clusters. There have recently been interesting attempts to design such algorithms, e.g. based on involved non-parametric statistical inference techniques. In this paper, we develop Viral Clustering (VC), a simple algorithm that jointly estimates the number of clusters and outputs clusters. The VC algorithm relies on two antagonist and interacting components. The first component tends to regroup neighbouring samples together, while the second component tends to spread samples in various clusters. This spreading component is performed using an analogy with the way viruses spread over networks. We present extensive numerical experiments illustrating the robustness of the VC algorithm, and its superiority compared to existing algorithms.

【Keywords】: Clustering; number of clusters; cluster validation; k-means

277. Inverse Reinforcement Learning through Policy Gradient Minimization.

【Paper Link】【Pages】:1993-1999

【Authors】: Matteo Pirotta ; Marcello Restelli

【Abstract】: Inverse Reinforcement Learning (IRL) deals with the problem of recovering the reward function optimized by an expert given a set of demonstrations of the expert's policy. Most IRL algorithms need to repeatedly compute the optimal policy for different reward functions. This paper proposes a new IRL approach that recovers the reward function without the need to solve any "direct" RL problem. The idea is to find the reward function that minimizes the gradient of a parameterized representation of the expert's policy. In particular, when the reward function can be represented as a linear combination of some basis functions, we will show that the aforementioned optimization problem can be efficiently solved. We present an empirical evaluation of the proposed approach on a multidimensional version of the Linear-Quadratic Regulator (LQR) both in the case where the parameters of the expert's policy are known and in the (more realistic) case where the parameters of the expert's policy need to be inferred from the expert's demonstrations. Finally, the algorithm is compared against the state-of-the-art on the mountain car domain, where the expert's policy is unknown.

【Keywords】: Reinforcement Learning; Inverse Reinforcement Learning

278. Scaling Simultaneous Optimistic Optimization for High-Dimensional Non-Convex Functions with Low Effective Dimensions.

【Paper Link】【Pages】:2000-2006

【Authors】: Hong Qian ; Yang Yu

【Abstract】: Simultaneous optimistic optimization (SOO) is a recently proposed global optimization method with a strong theoretical foundation. Previous studies have shown that SOO performs well in low-dimensional optimization problems; however, its performance is unsatisfactory when the dimensionality is high. This paper adapts random embedding to scale SOO, resulting in the RESOO algorithm. We prove that the simple regret of RESOO depends only on the effective dimension of the problem, while that of SOO depends on the dimension of the solution space. Empirically, on some high-dimensional non-convex testing functions as well as hyper-parameter tuning tasks for multi-class support vector machines, RESOO shows significantly improved performance over SOO.

【Keywords】: random embedding; stochastic optimistic optimization; high-dimensional optimization

279. Selecting Near-Optimal Learners via Incremental Data Allocation.

【Paper Link】【Pages】:2007-2015

【Authors】: Ashish Sabharwal ; Horst Samulowitz ; Gerald Tesauro

【Abstract】: We study a novel machine learning (ML) problem setting of sequentially allocating small subsets of training data amongst a large set of classifiers. The goal is to select a classifier that will give near-optimal accuracy when trained on all data, while also minimizing the cost of misallocated samples. This is motivated by large modern datasets and ML toolkits with many combinations of learning algorithms and hyper-parameters. Inspired by the principle of "optimism under uncertainty," we propose an innovative strategy, Data Allocation using Upper Bounds (DAUB), which robustly achieves these objectives across a variety of real-world datasets. We further develop substantial theoretical support for DAUB in an idealized setting where the expected accuracy of a classifier trained on $n$ samples can be known exactly. Under these conditions we establish a rigorous sub-linear bound on the regret of the approach (in terms of misallocated data), as well as a rigorous bound on suboptimality of the selected classifier. Our accuracy estimates using real-world datasets only entail mild violations of the theoretical scenario, suggesting that the practical behavior of DAUB is likely to approach the idealized behavior.

【Keywords】: classifier selection; bandit algorithms; big data

280. Scalable Algorithms for Tractable Schatten Quasi-Norm Minimization.

【Paper Link】【Pages】:2016-2022

【Authors】: Fanhua Shang ; Yuanyuan Liu ; James Cheng

【Abstract】: The Schatten-p quasi-norm (0 < p < 1) is usually used to replace the standard nuclear norm in order to approximate the rank function more accurately. However, existing Schatten-p quasi-norm minimization algorithms involve singular value decomposition (SVD) or eigenvalue decomposition (EVD) in each iteration, and thus may become very slow and impractical for large-scale problems. In this paper, we first define two tractable Schatten quasi-norms, i.e., Frobenius/nuclear hybrid and bi-nuclear quasi-norms, and then prove that they are in essence the Schatten-2/3 and 1/2 quasi-norms, respectively, which lead to the design of very efficient algorithms that only need to update two much smaller factor matrices. We also design two efficient proximal alternating linearized minimization algorithms for solving representative matrix completion problems. Finally, we provide the global convergence and performance guarantees for our algorithms, which have better convergence properties than existing algorithms. Experimental results on synthetic and real-world data show that our algorithms are more accurate than the state-of-the-art methods, and are orders of magnitude faster.

【Keywords】:
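
For reference, the Schatten-p quasi-norm of a matrix is the p-th root of the sum of its singular values raised to the power p. The sketch below evaluates that definition from a list of singular values that are assumed to be already available; obtaining them requires the SVD that the abstract's algorithms are designed to avoid, so this illustrates the quantity, not the proposed method. The function name and inputs are hypothetical.

```python
def schatten_p(singular_values, p):
    """Return (sum_i sigma_i^p)^(1/p); a quasi-norm when 0 < p < 1."""
    if p <= 0:
        raise ValueError("p must be positive")
    return sum(s ** p for s in singular_values) ** (1.0 / p)


# Hypothetical spectrum of a rank-2 matrix.
sigma = [4.0, 9.0]
print(schatten_p(sigma, 1.0))  # nuclear norm: 13.0
print(schatten_p(sigma, 0.5))  # Schatten-1/2 quasi-norm: 25.0
```

Setting p = 1 recovers the nuclear norm, while smaller p weights small singular values relatively more heavily, which is why these quasi-norms are used as closer surrogates of the rank function.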

281. Spectral Bisection Tree Guided Deep Adaptive Exemplar Autoencoder for Unsupervised Domain Adaptation.

【Paper Link】【Pages】:2023-2029

【Authors】: Ming Shao ; Zhengming Ding ; Handong Zhao ; Yun Fu

【Abstract】: Learning with limited labeled data is always a challenge in AI problems, and one promising way is transferring well-established source domain knowledge to the target domain, i.e., domain adaptation. In this paper, we extend deep representation learning to the domain adaptation scenario, and propose a novel deep model called "Deep Adaptive Exemplar AutoEncoder (DAE$^2$)". Different from conventional denoising autoencoders using corrupted inputs, we assign semantics to the input-output pairs of the autoencoders, which allows us to gradually extract discriminant features layer by layer. To this end, first, we build a spectral bisection tree to generate source-target data compositions as the training pairs fed to autoencoders. Second, a low-rank coding regularizer is imposed to ensure the transferability of the learned hidden layer. Finally, a supervised layer is added on top to transform learned representations into discriminant features. The problem above can be solved iteratively in an EM fashion of learning. Extensive experiments on domain adaptation tasks including object, handwritten digits, and text data classifications demonstrate the effectiveness of the proposed method.

【Keywords】: Unsupervised domain adaptation; Spectral bisection tree; Marginalized autoencoder; Low-rank coding; Deep learning

282. Metric Learning for Ordinal Data.

【Paper Link】【Pages】:2030-2036

【Authors】: Yuan Shi ; Wenzhe Li ; Fei Sha

【Abstract】: A large amount of ordinal-valued data exist in many domains, including medical and health science, social science, economics, political science, etc. Unlike image and speech datasets of real-valued data, learning with ordinal variables (i.e., features) presents unique challenges. In particular, the nominal differences between those feature values, which are just ranks, do not necessarily correspond to the real distances between the corresponding categories. Given their wide existence, it is imperative to develop machine learning algorithms that specifically address the need to model and infer with such data. In this paper, we present a novel metric learning algorithm that takes into consideration the nature of ordinal data. Our approach treats ordinal values as latent variables in intervals. Our algorithm then learns what those intervals are as well as distance metrics to measure distances between latent variables in those intervals. We derive the corresponding optimization algorithm and demonstrate how that can be solved effectively. Experimental results show that the proposed approach significantly improves baselines that do not explicitly model ordinal features.

【Keywords】: distance metric learning; ordinal data; latent variables

283. Noisy Submodular Maximization via Adaptive Sampling with Applications to Crowdsourced Image Collection Summarization.

【Paper Link】【Pages】:2037-2043

【Authors】: Adish Singla ; Sebastian Tschiatschek ; Andreas Krause

【Abstract】: We address the problem of maximizing an unknown submodular function that can only be accessed via noisy evaluations. Our work is motivated by the task of summarizing content, e.g., image collections, by leveraging users' feedback in the form of clicks or ratings. For summarization tasks with the goal of maximizing coverage and diversity, submodular set functions are a natural choice. When the underlying submodular function is unknown, users' feedback can provide noisy evaluations of the function that we seek to maximize. We provide a generic algorithm, ExpGreedy, for maximizing an unknown submodular function under cardinality constraints. This algorithm makes use of a novel exploration module, TopX, that proposes good elements based on adaptively sampling noisy function evaluations. TopX is able to accommodate different kinds of observation models such as value queries and pairwise comparisons. We provide PAC-style guarantees on the quality and sampling cost of the solution obtained by ExpGreedy. We demonstrate the effectiveness of our approach in an interactive, crowdsourced image collection summarization application.

【Keywords】: submodular maximization; noisy oracles; preference elicitation; image collection summarization; crowdsourcing

284. Bayesian Matrix Completion via Adaptive Relaxed Spectral Regularization.

【Paper Link】【Pages】:2044-2050

【Authors】: Yang Song ; Jun Zhu

【Abstract】: Bayesian matrix completion has been studied based on a low-rank matrix factorization formulation with promising results. However, little work has been done on Bayesian matrix completion based on the more direct spectral regularization formulation. We fill this gap by presenting a novel Bayesian matrix completion method based on spectral regularization. In order to circumvent the difficulties of dealing with the orthonormality constraints of singular vectors, we derive a new equivalent form with relaxed constraints, which then leads us to design an adaptive version of spectral regularization feasible for Bayesian inference. Our Bayesian method requires no parameter tuning and can infer the number of latent factors automatically. Experiments on synthetic and real datasets demonstrate encouraging results on rank recovery and collaborative filtering, with notably good results for very sparse matrices.

【Keywords】: matrix completion; Bayesian methods; nuclear norm; spectral regularization; Stiefel manifold

285. Marginalized Continuous Time Bayesian Networks for Network Reconstruction from Incomplete Observations.

【Paper Link】【Pages】:2051-2057

【Authors】: Lukas Studer ; Loïc Paulevé ; Christoph Zechner ; Matthias Reumann ; María Rodríguez Martínez ; Heinz Koeppl

【Abstract】: Continuous Time Bayesian Networks (CTBNs) provide a powerful means to model complex network dynamics. However, their inference is computationally demanding, especially if one considers incomplete and noisy time-series data. The latter gives rise to a joint state and parameter estimation problem, which can only be solved numerically. Yet, finding the exact parameterization of the CTBN has often only secondary importance in practical scenarios. We therefore focus on the structure learning problem and present a way to analytically marginalize the Markov chain underlying the CTBN model with respect to its parameters. Since the resulting stochastic process is parameter-free, its inference reduces to an optimal filtering problem. We solve the latter using an efficient parallel implementation of a sequential Monte Carlo scheme. Our framework enables CTBN inference to be applied to incomplete noisy time-series data frequently found in molecular biology and other disciplines.

【Keywords】:

286. Return of Frustratingly Easy Domain Adaptation.

【Paper Link】【Pages】:2058-2065

【Authors】: Baochen Sun ; Jiashi Feng ; Kate Saenko

【Abstract】: Unlike human learning, machine learning often fails to handle changes between training (source) and test (target) input distributions. Such domain shifts, common in practical scenarios, severely damage the performance of conventional machine learning methods. Supervised domain adaptation methods have been proposed for the case when the target data have labels, including some that perform very well despite being "frustratingly easy" to implement. However, in practice, the target domain is often unlabeled, requiring unsupervised adaptation. We propose a simple, effective, and efficient method for unsupervised domain adaptation called CORrelation ALignment (CORAL). CORAL minimizes domain shift by aligning the second-order statistics of source and target distributions, without requiring any target labels. Even though it is extraordinarily simple (it can be implemented in four lines of Matlab code), CORAL performs remarkably well in extensive evaluations on standard benchmark datasets.

【Keywords】: Domain Adaptation; Unsupervised Learning; Deep Learning

287. On the Depth of Deep Neural Networks: A Theoretical View.

【Paper Link】【Pages】:2066-2072

【Authors】: Shizhao Sun ; Wei Chen ; Liwei Wang ; Xiaoguang Liu ; Tie-Yan Liu

【Abstract】: People believe that depth plays an important role in the success of deep neural networks (DNN). However, this belief lacks solid theoretical justification as far as we know. We investigate the role of depth from the perspective of the margin bound. In the margin bound, the expected error is upper bounded by the empirical margin error plus a Rademacher Average (RA) based capacity term. First, we derive an upper bound for the RA of DNN, and show that it increases with increasing depth. This indicates a negative impact of depth on test performance. Second, we show that deeper networks tend to have larger representation power (measured by Betti-number-based complexity) than shallower networks in the multi-class setting, and thus can lead to smaller empirical margin error. This implies a positive impact of depth. The combination of these two results shows that for DNN with a restricted number of hidden units, increasing depth is not always good since there is a tradeoff between positive and negative impacts. These results inspire us to seek alternative ways to achieve the positive impact of depth, e.g., imposing margin-based penalty terms on the cross entropy loss so as to reduce empirical margin error without increasing depth. Our experiments show that in this way, we achieve significantly better test performance.

【Keywords】:

288. Linear-Time Learning on Distributions with Approximate Kernel Embeddings.

【Paper Link】【Pages】:2073-2079

【Authors】: Dougal J. Sutherland ; Junier B. Oliva ; Barnabás Póczos ; Jeff G. Schneider

【Abstract】: Many interesting machine learning problems are best posed by considering instances that are distributions, or sample sets drawn from distributions. Most previous work devoted to machine learning tasks with distributional inputs has done so through pairwise kernel evaluations between pdfs (or sample sets). While such an approach is fine for smaller datasets, the computation of an N × N Gram matrix is prohibitive in large datasets. Recent scalable estimators that work over pdfs have done so only with kernels that use Euclidean metrics, like the L2 distance. However, there are a myriad of other useful metrics available, such as total variation, Hellinger distance, and the Jensen-Shannon divergence. This work develops the first random features for pdfs whose dot product approximates kernels using these non-Euclidean metrics. These random features allow estimators to scale to large datasets by working in a primal space, without computing large Gram matrices. We provide an analysis of the approximation error in using our proposed random features, and show empirically the quality of our approximation both in estimating a Gram matrix and in solving learning tasks in real-world and synthetic data.

【Keywords】: nonparametric estimation; approximate kernel embedding; learning on distributions

289. Learning Sparse Confidence-Weighted Classifier on Very High Dimensional Data.

【Paper Link】【Pages】:2080-2086

【Authors】: Mingkui Tan ; Yan Yan ; Li Wang ; Anton van den Hengel ; Ivor W. Tsang ; Qinfeng (Javen) Shi

【Abstract】: Confidence-weighted (CW) learning is a successful online learning paradigm which maintains a Gaussian distribution over classifier weights and adopts a covariance matrix to represent the uncertainties of the weight vectors. However, there are two deficiencies in existing full CW learning paradigms: the sensitivity to irrelevant features, and the poor scalability to high-dimensional data due to the maintenance of the covariance structure. In this paper, we begin by presenting an online-batch CW learning scheme, and then present a novel paradigm to learn sparse CW classifiers. The proposed paradigm essentially identifies feature groups and naturally builds a block diagonal covariance structure, making it very suitable for CW learning over very high-dimensional data. Extensive experimental results demonstrate the superior performance of the proposed methods over state-of-the-art counterparts on classification and feature selection tasks.

【Keywords】: Online learning; Confidence-weighted learning; High Dimensional Data; block diagonal covariance

290. Algorithms for Differentially Private Multi-Armed Bandits.

【Paper Link】【Pages】:2087-2093

【Authors】: Aristide C. Y. Tossou ; Christos Dimitrakakis

【Abstract】: We present differentially private algorithms for the stochastic Multi-Armed Bandit (MAB) problem. This is a problem for applications such as adaptive clinical trials, experiment design, and user-targeted advertising where private information is connected to individual rewards. Our major contribution is to show that there exist (ε,δ) differentially private variants of Upper Confidence Bound algorithms which have optimal regret, O(ε^{-1} + log T). This is a significant improvement over previous results, which only achieve poly-log regret O(ε^{-2} log^3 T), because of our use of a novel interval based mechanism. We also substantially improve the bounds of the previous family of algorithms which use a continual release mechanism. Experiments clearly validate our theoretical bounds.

【Keywords】: differential privacy; stochastic bandit; upper confidence bound; UCB

291. Deep Reinforcement Learning with Double Q-Learning.

【Paper Link】【Pages】:2094-2100

【Authors】: Hado van Hasselt ; Arthur Guez ; David Silver

【Abstract】: The popular Q-learning algorithm is known to overestimate action values under certain conditions. It was not previously known whether, in practice, such overestimations are common, whether they harm performance, and whether they can generally be prevented. In this paper, we answer all these questions affirmatively. In particular, we first show that the recent DQN algorithm, which combines Q-learning with a deep neural network, suffers from substantial overestimations in some games in the Atari 2600 domain. We then show that the idea behind the Double Q-learning algorithm, which was introduced in a tabular setting, can be generalized to work with large-scale function approximation. We propose a specific adaptation to the DQN algorithm and show that the resulting algorithm not only reduces the observed overestimations, as hypothesized, but that this also leads to much better performance on several games.

【Keywords】:
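
The difference between the standard Q-learning target and the Double Q-learning variant adapted to DQN can be shown in a few lines. The sketch assumes two sets of action-value estimates for the next state, one from an online network (used to select the action) and one from a target network (used to evaluate it); the function names and numbers are illustrative, and terminal-state handling is omitted.

```python
def q_learning_target(reward, gamma, q_target_next):
    """Standard target: the same estimates both select and evaluate the
    next action, which is the source of the overestimation bias."""
    return reward + gamma * max(q_target_next)


def double_q_target(reward, gamma, q_online_next, q_target_next):
    """Double Q-learning target: the online estimates select the action,
    the target estimates evaluate it."""
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return reward + gamma * q_target_next[best_action]


# Hypothetical next-state action values for three actions.
q_online_next = [1.0, 2.0, 0.5]
q_target_next = [0.8, 1.0, 3.0]
print(q_learning_target(1.0, 0.5, q_target_next))               # 2.5
print(double_q_target(1.0, 0.5, q_online_next, q_target_next))  # 1.5
```

In this toy case the target estimate for the third action is noisily high; the standard target propagates it, whereas the decoupled target does not because the online estimates prefer a different action.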

292. Online Instrumental Variable Regression with Applications to Online Linear System Identification.

【Paper Link】【Pages】:2101-2107

【Authors】: Arun Venkatraman ; Wen Sun ; Martial Hebert ; J. Andrew Bagnell ; Byron Boots

【Abstract】: Instrumental variable regression (IVR) is a statistical technique utilized to recover unbiased estimators when there are errors in the independent variables. Estimator bias in learned time series models can yield poor performance in applications such as long-term prediction and filtering where the recursive use of the model results in the accumulation of propagated error. However, prior work addressed the IVR objective in the batch setting, where it is necessary to store the entire dataset in memory, an infeasible requirement in large dataset scenarios. In this work, we develop Online Instrumental Variable Regression (OIVR), an algorithm that is capable of updating the learned estimator with streaming data. We show that the online adaptation of IVR enjoys a no-regret performance guarantee with respect to the original batch setting by taking advantage of any no-regret online learning algorithm inside OIVR for the underlying update steps. We experimentally demonstrate the efficacy of our algorithm in combination with popular no-regret online algorithms for the task of learning predictive dynamical system models and on a prototypical econometrics instrumental variable regression problem.

【Keywords】: Online learning; System Identification; Regret analysis; Instrumental Variable Regression

293. The Hidden Convexity of Spectral Clustering.

【Paper Link】【Pages】:2108-2114

【Authors】: James R. Voss ; Mikhail Belkin ; Luis Rademacher

【Abstract】: In recent years, spectral clustering has become a standard method for data analysis used in a broad range of applications. In this paper we propose a new class of algorithms for multiway spectral clustering based on optimization of a certain "contrast function" over the unit sphere. These algorithms, partly inspired by certain Independent Component Analysis techniques, are simple, easy to implement and efficient. Geometrically, the proposed algorithms can be interpreted as hidden basis recovery by means of function optimization. We give a complete characterization of the contrast functions admissible for provable basis recovery. We show how these conditions can be interpreted as a "hidden convexity" of our optimization problem on the sphere; interestingly, we use efficient convex maximization rather than the more common convex minimization. We also show encouraging experimental results on real and simulated data.

【Keywords】: spectral clustering; convex maximization; basis recovery

294. Multitask Generalized Eigenvalue Program.

【Paper Link】【Pages】:2115-2121

【Authors】: Boyu Wang ; Joelle Pineau ; Borja Balle

【Abstract】: We present a novel multitask learning framework called multitask generalized eigenvalue program (MTGEP), which jointly solves multiple related generalized eigenvalue problems (GEPs). This framework is quite general and can be applied to many eigenvalue problems in machine learning and pattern recognition, ranging from supervised learning to unsupervised learning, such as principal component analysis (PCA), Fisher discriminant analysis (FDA), common spatial pattern (CSP), and so on. The core assumption of our approach is that the leading eigenvectors of related GEPs lie in some subspace that can be approximated by a sparse linear combination of basis vectors. As a result, these GEPs can be jointly solved by a sparse coding approach. Empirical evaluation with both synthetic and benchmark real world datasets validates the efficacy and efficiency of the proposed techniques, especially for grouped multitask GEPs.

【Keywords】: multitask learning; generalized eigenvalue program

295. Product Grassmann Manifold Representation and Its LRR Models.

【Paper Link】【Pages】:2122-2129

【Authors】: Boyue Wang ; Yongli Hu ; Junbin Gao ; Yanfeng Sun ; Baocai Yin

【Abstract】: It is a challenging problem to cluster multi- and high-dimensional data with complex intrinsic properties and non-linear manifold structure. The recently proposed subspace clustering method, Low Rank Representation (LRR), shows attractive performance on data clustering, but it generally deals with data in Euclidean spaces. In this paper, we intend to cluster complex high-dimensional data with multiple varying factors. We propose a novel representation, namely Product Grassmann Manifold (PGM), to represent these data. Additionally, we discuss the geometry metric of the manifold and expand the conventional LRR model in Euclidean space onto PGM and thus construct a new LRR model. Several clustering experimental results show that the proposed method obtains superior accuracy compared with the clustering methods on manifolds or conventional Euclidean spaces.

【Keywords】: Low Rank Representation; Subspace Clustering; Grassmann Manifold; Kernelized Method

296. Text Classification with Heterogeneous Information Network Kernels.

【Paper Link】【Pages】:2130-2136

【Authors】: Chenguang Wang ; Yangqiu Song ; Haoran Li ; Ming Zhang ; Jiawei Han

【Abstract】: Text classification is an important problem with many applications. Traditional approaches represent text as a bag-of-words and build classifiers based on this representation. Rather than words, entity phrases, the relations between the entities, as well as the types of the entities and relations carry much more information to represent the texts. This paper presents a novel text as network classification framework, which introduces 1) a structured and typed heterogeneous information networks (HINs) representation of texts, and 2) a meta-path based approach to link texts. We show that with the new representation and links of texts, the structured and typed information of entities and relations can be incorporated into kernels. Particularly, we develop both simple linear kernel and indefinite kernel based on meta-paths in the HIN representation of texts, where we call them HIN-kernels. Using Freebase, a well-known world knowledge base, to construct HIN for texts, our experiments on two benchmark datasets show that the indefinite HIN kernel based on weighted meta-paths outperforms the state-of-the-art methods and other HIN-kernels.

【Keywords】: Text classification; Document modeling; Heterogeneous information networks

297. Semi-Supervised Dictionary Learning via Structural Sparse Preserving.

【Paper Link】【Pages】:2137-2144

【Authors】: Di Wang ; Xiaoqin Zhang ; Mingyu Fan ; Xiuzi Ye

【Abstract】: While recent techniques for discriminative dictionary learning have attained promising results on the classification tasks, their performance is highly dependent on the number of labeled samples available for training. However, labeling samples is expensive and time consuming due to the significant human effort involved. In this paper, we present a novel semi-supervised dictionary learning method which utilizes the structural sparse relationships between the labeled and unlabeled samples. Specifically, by connecting the sparse reconstruction coefficients on both the original samples and dictionary, the unlabeled samples can be automatically grouped to the different labeled samples, and the grouped samples share a small number of atoms in the dictionary via mixed l2p-norm regularization. This makes the learned dictionary more representative and discriminative since the shared atoms are learned by using the labeled and unlabeled samples potentially from the same class. Minimizing the derived objective function is a challenging task because it is non-convex and highly non-smooth. We propose an efficient optimization algorithm to solve the problem based on the block coordinate descent method. Moreover, we have a rigorous proof of the convergence of the algorithm. Extensive experiments are presented to show the superior performance of our method in classification applications.

【Keywords】:

298. Relational Knowledge Transfer for Zero-Shot Learning.

【Paper Link】【Pages】:2145-2151

【Authors】: Donghui Wang ; Yanan Li ; Yuetan Lin ; Yueting Zhuang

【Abstract】: General zero-shot learning (ZSL) approaches exploit transfer learning via semantic knowledge space. In this paper, we reveal a novel relational knowledge transfer (RKT) mechanism for ZSL, which is simple, generic and effective. RKT resolves the inherent semantic shift problem existing in ZSL through restoring the missing manifold structure of unseen categories via optimizing semantic mapping. It extracts the relational knowledge from data manifold structure in semantic knowledge space based on sparse coding theory. The extracted knowledge is then transferred backwards to generate virtual data for unseen categories in the feature space. On the one hand, the generalizing ability of the semantic mapping function can be enhanced with the added data. On the other hand, the mapping function for unseen categories can be learned directly from only these generated data, achieving inspiring performance. Incorporated with RKT, even simple baseline methods can achieve good results. Extensive experiments on three challenging datasets show prominent performance obtained by RKT, and we obtain 82.43% accuracy on the Animals with Attributes dataset.

【Keywords】:

299. Optimizing Multivariate Performance Measures from Multi-View Data.

【Paper Link】【Pages】:2152-2158

【Authors】: Jim Jing-Yan Wang ; Ivor Wai-Hung Tsang ; Xin Gao

【Abstract】: To date, many machine learning applications have multiple views of features, and different applications require specific multivariate performance measures, such as the F-score for retrieval. However, existing multivariate performance measure optimization methods are limited to single-view data, while traditional multi-view learning methods cannot optimize multivariate performance measures directly. To fill this gap, in this paper, we propose the problem of optimizing multivariate performance measures from multi-view data, and an effective method to solve it. We propose to learn linear discriminant functions for different views, and combine them to construct an overall multivariate mapping function for multi-view data. To learn the parameters of the linear discriminant functions of different views to optimize a given multivariate performance measure, we formulate an optimization problem. In this problem, we propose to minimize the complexity of the linear discriminant function of each view, promote the consistency of the responses of different views over the same data points, and minimize the upper boundary of the corresponding loss of a given multivariate performance measure. To optimize this problem, we develop an iterative cutting-plane algorithm. Experiments on four benchmark data sets show that it not only outperforms traditional single-view based multivariate performance optimization methods, but also achieves better results than ordinary multi-view learning methods.

【Keywords】: Multivariate Performance Measures; Multi-View Learning; Cutting-Plane Algorithm; Multi-View Consistency
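The abstract above names the F-score as a typical multivariate performance measure. The following is a minimal sketch of why such a measure is "multivariate": it is computed from counts over the whole prediction vector and cannot be written as a sum of per-example losses. The function name `f_score` and the toy labels are illustrative assumptions, not taken from the paper.

```python
def f_score(y_true, y_pred, beta=1.0):
    """F-beta score for binary labels in {0, 1}.

    The score depends on the joint counts (tp, fp, fn) over the whole
    data set, so it does not decompose into per-example terms.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = (1 + beta ** 2) * tp + beta ** 2 * fn + fp
    return 0.0 if denom == 0 else (1 + beta ** 2) * tp / denom


# Toy example (hypothetical data): 3 positives, 5 negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
print(f_score(y_true, y_pred))
```

Here tp = 2, fp = 1 and fn = 1, so the F1 value is 4/6. Accuracy on the same predictions is 6/8, which illustrates how the two measures can rank classifiers differently on imbalanced data.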

300. An Efficient Time Series Subsequence Pattern Mining and Prediction Framework with an Application to Respiratory Motion Prediction.

【Paper Link】【Pages】:2159-2165

【Authors】: Shouyi Wang ; KinMing Kam ; Cao Xiao ; Stephen R. Bowen ; Wanpracha Art Chaovalitwongse

【Abstract】: Traditional time series analysis methods are limited on some complex real-world time series data. Respiratory motion prediction is one such challenging problem. The memory-based nearest neighbor approaches have shown potential in predicting complex nonlinear time series compared to many traditional parametric prediction models. However, the massive time series subsequences representation, the similarity distance measures, the number of nearest neighbors, and the ensemble functions create challenges as well as limit the performance of nearest neighbor approaches in complex time series prediction. To address these problems, we propose a flexible time series pattern representation and selection framework, called the orthogonal-polynomial-based variant-nearest-neighbor (OPVNN) approach. For the respiratory motion prediction problem, the proposed approach achieved the highest and most robust prediction performance compared to the state-of-the-art time series prediction methods. With a solid mathematical and theoretical foundation in orthogonal polynomials, the proposed time series representation, subsequence pattern mining and prediction framework has great potential to benefit those industry and medical applications that need to handle highly nonlinear and complex time series data streams, such as quasi-periodic ones.

【Keywords】: Time Series, Subsequence Pattern Mining, Memory-Based Prediction

【Paper Link】【Pages】:2166-2172

【Authors】: Xin Wang ; Ming-Ching Chang ; Yiming Ying ; Siwei Lyu

【Abstract】: Many learning problems in real world applications involve rich datasets comprising multiple information modalities. In this work, we study co-regularized PLSA (coPLSA) as an efficient solution to probabilistic topic analysis of multi-modal data. In coPLSA, similarities between topic compositions of a data entity across different data modalities are measured with divergences between discrete probabilities, which are incorporated as a co-regularizer to augment individual PLSA models over each data modality. We derive efficient iterative learning algorithms for coPLSA with symmetric KL, L2 and L1 divergences as co-regularizers; in each case the essential optimization problem affords simple numerical solutions that entail only matrix arithmetic operations and numerical solution of 1D nonlinear equations. We evaluate the performance of the coPLSA algorithms on text/image cross-modal retrieval tasks, on which they show competitive performance with state-of-the-art methods.

【Keywords】: PLSA, Topic Model, Multi-Modal Learning
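The abstract above mentions symmetric KL, L2 and L1 divergences between discrete probability vectors. As a rough illustration of what these three quantities measure, the sketch below computes them for two hypothetical topic distributions of the same entity in two modalities. The function names, the `eps` smoothing constant, and the choice of the squared Euclidean form for the L2 case are assumptions made here for illustration, not the paper's definitions.

```python
import math


def symmetric_kl(p, q, eps=1e-12):
    """KL(p||q) + KL(q||p), which simplifies to sum_i (p_i - q_i) * log(p_i / q_i).

    eps is a small smoothing constant (an assumption) that avoids log(0).
    """
    return sum((pi - qi) * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def l2_divergence(p, q):
    """Squared Euclidean distance between the two probability vectors."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


def l1_divergence(p, q):
    """Sum of absolute differences between the two probability vectors."""
    return sum(abs(pi - qi) for pi, qi in zip(p, q))


# Hypothetical topic mixtures of one data entity in two modalities.
text_topics = [0.7, 0.2, 0.1]
image_topics = [0.1, 0.2, 0.7]
for name, fn in [("symKL", symmetric_kl), ("L2", l2_divergence), ("L1", l1_divergence)]:
    print(name, fn(text_topics, image_topics))
```

All three values are zero when the two topic mixtures agree and grow as they diverge, which is the property a co-regularizer needs.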

302. Toward a Better Understanding of Deep Neural Network Based Acoustic Modelling: An Empirical Investigation.

【Paper Link】【Pages】:2173-2179

【Authors】: Xingfu Wang ; Lin Wang ; Jing Chen ; Litao Wu

【Abstract】: Recently, deep neural networks (DNNs) have outperformed traditional acoustic models on a variety of speech recognition benchmarks. However, due to system differences across research groups, although a tremendous breadth and depth of related work has been established, it is still not easy to assess the performance improvements of a particular architectural variant from examining the literature when building DNN acoustic models. Our work aims to uncover which variations among baseline systems are most relevant for automatic speech recognition (ASR) performance via a series of systematic tests on the limits of the major architectural choices. By holding all the other components fixed, we are able to explore the design and training decisions without being confounded by the other influencing factors. Our experiment results suggest that a relatively simple DNN architecture and optimization technique produces strong results. These findings, along with previous work, not only help build a better understanding towards why DNN acoustic models perform well or how they might be improved, but also help establish a set of best practices for new speech corpora and language understanding task variants.

【Keywords】: Deep Neural Network, speech recognition, acoustic model, cross entropy, stochastic gradient descent, Nesterov's Accelerated Gradient, Dropout, early stopping, rectified linear unit, classical momentum

303. Noise-Adaptive Margin-Based Active Learning and Lower Bounds under Tsybakov Noise Condition.

【Paper Link】【Pages】:2180-2186

【Authors】: Yining Wang ; Aarti Singh

【Abstract】: We present a simple noise-robust margin-based active learning algorithm to find homogeneous (passing the origin) linear separators and analyze its error convergence when labels are corrupted by noise. We show that when the imposed noise satisfies the Tsybakov low noise condition (Mammen, Tsybakov, and others 1999; Tsybakov 2004) the algorithm is able to adapt to unknown level of noise and achieves optimal statistical rate up to polylogarithmic factors. We also derive lower bounds for margin based active learning algorithms under Tsybakov noise conditions (TNC) for the membership query synthesis scenario (Angluin 1988). Our result implies lower bounds for the stream based selective sampling scenario (Cohn 1990) under TNC for some fairly simple data distributions. Quite surprisingly, we show that the sample complexity cannot be improved even if the underlying data distribution is as simple as the uniform distribution on the unit ball. Our proof involves the construction of a well-separated hypothesis set on the d-dimensional unit ball along with carefully designed label distributions for the Tsybakov noise condition. Our analysis might provide insights for other forms of lower bounds as well.

【Keywords】:

304. Learning by Transferring from Unsupervised Universal Sources.

【Paper Link】【Pages】:2187-2193

【Authors】: Yu-Xiong Wang ; Martial Hebert

【Abstract】: Category classifiers trained from a large corpus of annotated data are widely accepted as the sources for (hypothesis) transfer learning. Sources generated in this way are tied to a particular set of categories, limiting their transferability across a wide spectrum of target categories. In this paper, we address this largely-overlooked yet fundamental source problem by both introducing a systematic scheme for generating universal source hypotheses and proposing a principled, scalable approach to automatically tuning the transfer process. Our approach is based on the insights that expressive source hypotheses could be generated without any supervision and that a sparse combination of such hypotheses facilitates recognition of novel categories from few samples. We demonstrate improvements over the state-of-the-art on object and scene classification in the small sample size regime.

【Keywords】: Transfer Learning, Domain Adaptation, Object Recognition, Scene Classification

305. Learning Deep ℓ0 Encoders.

【Paper Link】【Pages】:2194-2200

【Authors】: Zhangyang Wang ; Qing Ling ; Thomas S. Huang

【Abstract】: Despite its nonconvex nature, ℓ0 sparse approximation is desirable in many theoretical and application cases. We study the ℓ0 sparse approximation problem with the tool of deep learning, by proposing Deep ℓ0 Encoders. Two typical forms, the ℓ0 regularized problem and the M-sparse problem, are investigated. Based on solid iterative algorithms, we model them as feed-forward neural networks, through introducing novel neurons and pooling functions. Enforcing such structural priors acts as an effective network regularization. The deep encoders also enjoy faster inference, larger learning capacity, and better scalability compared to conventional sparse coding solutions. Furthermore, under task-driven losses, the models can be conveniently optimized from end to end. Numerical results demonstrate the impressive performances of the proposed encoders.

【Keywords】: deep learning; sparse approximation
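The abstract above distinguishes the ℓ0 regularized problem from the M-sparse problem. As a rough sketch of the elementwise operators that classical iterative solvers for these two forms rely on (not the paper's neurons or pooling functions), the code below shows hard thresholding and top-M selection. The objective form 0.5*(z - u)^2 + lam*[z != 0], the function names and the sample vector are assumptions made for illustration.

```python
import math


def hard_threshold(u, lam):
    """Elementwise minimizer of 0.5 * (z - u)**2 + lam * [z != 0].

    Keeping u_i costs lam, zeroing it costs 0.5 * u_i**2, so an entry
    survives only if |u_i| > sqrt(2 * lam).
    """
    t = math.sqrt(2.0 * lam)
    return [ui if abs(ui) > t else 0.0 for ui in u]


def keep_top_m(u, m):
    """Projection onto M-sparse vectors: keep the m largest-magnitude entries."""
    order = sorted(range(len(u)), key=lambda i: abs(u[i]), reverse=True)
    keep = set(order[:m])
    return [ui if i in keep else 0.0 for i, ui in enumerate(u)]


u = [3.0, -0.5, 1.2, -2.0]           # hypothetical pre-activation values
print(hard_threshold(u, lam=1.0))    # threshold is sqrt(2), about 1.414
print(keep_top_m(u, m=3))
```

The first operator decides sparsity through a penalty weight, the second through a fixed budget of nonzeros, which mirrors the two problem forms named in the abstract.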

306. Adaptive Normalized Risk-Averting Training for Deep Neural Networks.

【Paper Link】【Pages】:2201-2207

【Authors】: Zhiguang Wang ; Tim Oates ; James Lo

【Abstract】: This paper proposes a set of new error criteria and a learning approach, called Adaptive Normalized Risk-Averting Training (ANRAT) to attack the non-convex optimization problem in training deep neural networks without pretraining. Theoretically, we demonstrate its effectiveness based on the expansion of the convexity region. By analyzing the gradient on the convexity index $\lambda$, we explain the reason why our learning method using gradient descent works. In practice, we show how this training method is successfully applied for improved training of deep neural networks to solve visual recognition tasks on the MNIST and CIFAR-10 datasets. Using simple experimental settings without pretraining and other tricks, we obtain results comparable or superior to those reported in recent literature on the same tasks using standard ConvNets + MSE/cross entropy. Performance on deep/shallow multilayer perceptron and Denoised Auto-encoder is also explored. ANRAT can be combined with other quasi-Newton training methods, innovative network variants, regularization techniques and other common tricks in DNNs. Other than unsupervised pretraining, it provides a new perspective to address the non-convex optimization strategy in training DNNs.

【Keywords】: risk averting error; convex optimization; deep neural networks

307. Nonlinear Feature Extraction with Max-Margin Data Shifting.

【Paper Link】【Pages】:2208-2214

【Authors】: Jianqiao Wangni ; Ning Chen

【Abstract】: Feature extraction is an important task in machine learning. In this paper, we present a simple and efficient method, named max-margin data shifting (MMDS), to process the data before feature extraction. By relying on a large-margin classifier, MMDS is helpful to enhance the discriminative ability of subsequent feature extractors. The kernel trick can be applied to extract nonlinear features from input data. We further analyze in detail the example of principal component analysis (PCA). The empirical results on multiple linear and nonlinear models demonstrate that MMDS can efficiently improve the performance of unsupervised extractors.

【Keywords】: Large Margin Learning; Principal Component Analysis; Kernel Methods

308. Unsupervised Feature Selection on Networks: A Generative View.

【Paper Link】【Pages】:2215-2221

【Authors】: Xiaokai Wei ; Bokai Cao ; Philip S. Yu

【Abstract】: In the past decade, social and information networks have become prevalent, and research on the network data has attracted much attention. Besides the link structure, network data are often equipped with the content information (i.e., node attributes) that is usually noisy and characterized by high dimensionality. As the curse of dimensionality could hamper the performance of many machine learning tasks on networks (e.g., community detection and link prediction), feature selection can be a useful technique for alleviating such issues. In this paper, we investigate the problem of unsupervised feature selection on networks. Most existing feature selection methods fail to incorporate the linkage information, and the state-of-the-art approaches usually rely on pseudo labels generated from clustering. Such cluster labels may be far from accurate and can mislead the feature selection process. To address these issues, we propose a generative point of view for unsupervised feature selection on networks that can seamlessly exploit the linkage and content information in a more effective manner. We assume that the link structures and node content are generated from a succinct set of high-quality features, and we find these features through maximizing the likelihood of the generation process. Experimental results on three real-world datasets show that our approach can select more discriminative features than state-of-the-art methods.

【Keywords】: Machine Learning; Feature Selection; Social Network; Community Detection

309. Model-Free Preference-Based Reinforcement Learning.

【Paper Link】【Pages】:2222-2228

【Authors】: Christian Wirth ; Johannes Fürnkranz ; Gerhard Neumann

【Abstract】: Specifying a numeric reward function for reinforcement learning typically requires a lot of hand-tuning from a human expert. In contrast, preference-based reinforcement learning (PBRL) utilizes only pairwise comparisons between trajectories as a feedback signal, which are often more intuitive to specify. Currently available approaches to PBRL for control problems with continuous state/action spaces require a known or estimated model, which is often not available and hard to learn. In this paper, we integrate preference-based estimation of the reward function into a model-free reinforcement learning (RL) algorithm, resulting in a model-free PBRL algorithm. Our new algorithm is based on Relative Entropy Policy Search (REPS), enabling us to utilize stochastic policies and to directly control the greediness of the policy update. REPS decreases exploration of the policy slowly by limiting the relative entropy of the policy update, which ensures that the algorithm is provided with a versatile set of trajectories, and consequently with informative preferences. The preference-based estimation is computed using a sample-based Bayesian method, which can also estimate the uncertainty of the utility. We also compare to a linearly solvable approximation, based on inverse RL. We show that both approaches compare favourably with the current state-of-the-art. The overall result is an algorithm that can learn non-parametric continuous action policies from a small number of preferences.

【Keywords】: Reinforcement Learning; Preferences; Model-Free; Relative Entropy; Bayesian

310. Constrained Submodular Minimization for Missing Labels and Class Imbalance in Multi-label Learning.

【Paper Link】【Pages】:2229-2236

【Authors】: Baoyuan Wu ; Siwei Lyu ; Bernard Ghanem

【Abstract】: In multi-label learning, there are two main challenges: missing labels and class imbalance (CIB). The former assumes that only a partial set of labels is provided for each training instance while other labels are missing. CIB is observed from two perspectives: first, the number of negative labels of each instance is much larger than the number of its positive labels; second, the rates of positive instances (i.e., the number of positive instances divided by the total number of instances) of different classes are significantly different. Both missing labels and CIB lead to significant performance degradation. In this work, we propose a new method to handle these two challenges simultaneously. We formulate the problem as a constrained submodular minimization that is composed of a submodular objective function that encourages label consistency and smoothness, as well as class cardinality bound constraints to handle class imbalance. We further present a convex approximation based on the Lovasz extension of submodular functions, leading to a linear program, which can be efficiently solved by the alternating direction method of multipliers (ADMM). Experimental results on several benchmark datasets demonstrate the improved performance of our method over several state-of-the-art methods.

【Keywords】: multi-label; missing labels; class-imbalance; constrained submodular minimization
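The abstract above relies on the Lovasz extension to turn a submodular set function into a convex one. The sketch below evaluates the standard Lovasz extension by sorting the coordinates in decreasing order and accumulating marginal gains. The graph-cut set function on a three-node path and all names are hypothetical choices for illustration, not the objective used in the paper.

```python
def lovasz_extension(F, x):
    """Evaluate the Lovasz extension of a set function F (with F(empty) = 0) at x.

    Sort indices by decreasing x, grow the set one index at a time, and
    weight each marginal gain F(S_k) - F(S_{k-1}) by the coordinate value.
    """
    order = sorted(range(len(x)), key=lambda i: x[i], reverse=True)
    value, chosen, prev = 0.0, set(), 0.0
    for i in order:
        chosen.add(i)
        cur = F(chosen)
        value += x[i] * (cur - prev)
        prev = cur
    return value


# Hypothetical submodular function: cut value on the path graph 0 - 1 - 2.
EDGES = [(0, 1), (1, 2)]


def cut(S):
    """Number of edges with exactly one endpoint in S."""
    return float(sum(1 for a, b in EDGES if (a in S) != (b in S)))


x = [0.2, 0.9, 0.5]
print(lovasz_extension(cut, x))  # equals |x0 - x1| + |x1 - x2| for this cut function
```

For a cut function the extension reduces to the sum of absolute differences along edges, and on 0/1 indicator vectors it agrees with the original set function, which is what makes it a natural convex surrogate.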

311. Representing Sets of Instances for Visual Recognition.

【Paper Link】【Pages】:2237-2243

【Authors】: Jianxin Wu ; Bin-Bin Gao ; Guoqing Liu

【Abstract】: In computer vision, a complex entity such as an image or video is often represented as a set of instance vectors, which are extracted from different parts of that entity. Thus, it is essential to design a representation to encode information in a set of instances robustly. Existing methods such as FV and VLAD are designed based on a generative perspective, and their performances fluctuate when different types of instance vectors are used (i.e., they are not robust). The proposed D3 method effectively compares two sets as two distributions, and proposes a directional total variation distance (DTVD) to measure their dissimilarity. Furthermore, a robust classifier-based method is proposed to estimate DTVD robustly, and to efficiently represent these sets. D3 is evaluated in action and image recognition tasks. It achieves excellent robustness, accuracy and speed.

【Keywords】:

312. Robust Semi-Supervised Learning through Label Aggregation.

【Paper Link】【Pages】:2244-2250

【Authors】: Yan Yan ; Zhongwen Xu ; Ivor W. Tsang ; Guodong Long ; Yi Yang

【Abstract】: Semi-supervised learning is proposed to exploit both labeled and unlabeled data. However, as the scale of data in real world applications increases significantly, conventional semi-supervised algorithms usually lead to massive computational cost and cannot be applied to large scale datasets. In addition, label noise is usually present in the practical applications due to human annotation, which very likely results in remarkable degeneration of performance in semi-supervised methods. To address these two challenges, in this paper, we propose an efficient RObust Semi-Supervised Ensemble Learning (ROSSEL) method, which generates pseudo-labels for unlabeled data using a set of weak annotators, and combines them to approximate the ground-truth labels to assist semi-supervised learning. We formulate the weighted combination process as a multiple label kernel learning (MLKL) problem which can be solved efficiently. Compared with other semi-supervised learning algorithms, the proposed method has linear time complexity. Extensive experiments on five benchmark datasets demonstrate the superior effectiveness, efficiency and robustness of the proposed algorithm.

【Keywords】:

313. Analysis-Synthesis Dictionary Learning for Universality-Particularity Representation Based Classification.

【Paper Link】【Pages】:2251-2257

【Authors】: Meng Yang ; Weiyang Liu ; Weixin Luo ; Linlin Shen

【Abstract】: Dictionary learning has played an important role in the success of sparse representation. Although synthesis dictionary learning for sparse representation has been well studied for universality representation (i.e., the dictionary is universal to all classes) and particularity representation (i.e., the dictionary is class-particular), jointly learning an analysis dictionary and a synthesis dictionary is still in its infant stage. Universality-particularity representation can well match the intrinsic characteristics of data (i.e., different classes share commonality and distinctness), while analysis-synthesis dictionary can give a more complete view of data representation (i.e., analysis dictionary is a dual-viewpoint of synthesis dictionary). In this paper, we proposed a novel model of analysis-synthesis dictionary learning for universality-particularity (ASDL-UP) representation based classification. The discrimination of universality and particularity representation is jointly exploited by simultaneously learning a pair of analysis dictionary and synthesis dictionary. More specifically, we impose a label preserving term to analysis coding coefficients for universality representation. Fisher-like regularizations for analysis coding coefficients and the subsequent synthesis representation are introduced to particularity representation. Compared with other state-of-the-art dictionary learning methods, ASDL-UP has shown better or competitive performance in various classification tasks.

【Keywords】: Analysis-Synthesis Dictionary Learning; Universality-Particularity Representation; Classification

314. Efficient Average Reward Reinforcement Learning Using Constant Shifting Values.

【Paper Link】【Pages】:2258-2264

【Authors】: Shangdong Yang ; Yang Gao ; Bo An ; Hao Wang ; Xingguo Chen

【Abstract】: There are two classes of average reward reinforcement learning (RL) algorithms: model-based ones that explicitly maintain MDP models and model-free ones that do not learn such models. Though model-free algorithms are known to be more efficient, they often cannot converge to optimal policies due to the perturbation of parameters. In this paper, a novel model-free algorithm is proposed, which makes use of constant shifting values (CSVs) estimated from prior knowledge. To encourage exploration during the learning process, the algorithm constantly subtracts the CSV from the rewards. A terminating condition is proposed to handle the unboundedness of Q-values caused by such subtraction. The convergence of the proposed algorithm is proved under very mild assumptions. Furthermore, linear function approximation is investigated to generalize our method to handle large-scale tasks. Extensive experiments on representative MDPs and the popular game Tetris show that the proposed algorithms significantly outperform the state-of-the-art ones.

【Keywords】: Reinforcement Learning; Average Reward; Constant Shifting Value

315. Learning Continuous-Time Bayesian Networks in Relational Domains: A Non-Parametric Approach.

【Paper Link】【Pages】:2265-2271

【Authors】: Shuo Yang ; Tushar Khot ; Kristian Kersting ; Sriraam Natarajan

【Abstract】: Many real world applications in medicine, biology, communication networks, web mining, and economics, among others, involve modeling and learning structured stochastic processes that evolve over continuous time. Existing approaches, however, have focused on propositional domains only. Without extensive feature engineering, it is difficult, if not impossible, to apply them within relational domains where we may have a varying number of objects and relations among them. We therefore develop the first relational representation called Relational Continuous-Time Bayesian Networks (RCTBNs) that can address this challenge. It features a nonparametric learning method that allows for efficiently learning the complex dependencies and their strengths simultaneously from sequence data. Our experimental results demonstrate that RCTBNs can learn as effectively as state-of-the-art approaches for propositional tasks while modeling relational tasks faithfully.

【Keywords】: relational continuous time Bayesian networks; structured sequence data; sequential event prediction; relational functional gradient boosting

316. Instance Specific Metric Subspace Learning: A Bayesian Approach.

【Paper Link】【Pages】:2272-2278

【Authors】: Han-Jia Ye ; De-Chuan Zhan ; Yuan Jiang

【Abstract】: Instead of using a uniform metric, instance specific distance learning methods assign multiple metrics for different localities, which take data heterogeneity into consideration. Therefore, they may improve the performance of distance based classifiers, e.g., kNN. Existing methods obtain multiple metrics of test data by either transductively assigning metrics for unlabeled instances or designing distance functions manually, both of which have limited generalization ability. In this paper, we propose the isMets (Instance Specific METric Subspace) framework, which can automatically span the whole metric space in a generative manner and is able to inductively learn a specific metric subspace for each instance via inferring the expectation over the metric bases in a Bayesian manner. The whole framework can be solved with Variational Bayes (VB). Experiments on synthetic data show that the learned results are readily interpretable. Moreover, comprehensive results on real world datasets validate the effectiveness and robustness of isMets.

【Keywords】: Instance Specific Metric Learning; Bayesian Approach

317. Scalable Completion of Nonnegative Matrix with Separable Structure.

【Paper Link】【Pages】:2279-2285

【Authors】: Xiyu Yu ; Wei Bian ; Dacheng Tao

【Abstract】: Matrix completion is to recover missing/unobserved values of a data matrix from very limited observations. Owing to its wide range of potential applications, it has received growing interest in fields from machine learning and data mining to collaborative filtering and computer vision. To ensure the successful recovery of missing values, most existing matrix completion algorithms utilise the low-rank assumption, i.e., the fully observed data matrix has a low rank, or equivalently the columns of the matrix can be linearly represented by a small number of basis vectors. Although such low-rank assumption applies generally in practice, real-world data can possess much richer structural information. In this paper, we present a new model for matrix completion, motivated by the separability assumption of nonnegative matrices from the recent literature of matrix factorisations: there exists a set of columns of the matrix such that the remaining columns can be represented by their convex combinations. Given the separability property, which holds reasonably for many applications, our model provides a more accurate matrix completion than the low-rank based algorithms. Further, we derive a scalable algorithm to solve our matrix completion model, which utilises a randomised method to select the basis columns under the separability assumption and a coordinate gradient based method to automatically deal with the structural constraints in optimisation. Compared to the state-of-the-art algorithms, the proposed matrix completion model achieves competitive results on both synthetic and real datasets.

【Keywords】: Matrix completion; Nonnegative matrix separability assumption; Scalable algorithm; Coordinate gradient based method

318. Derivative-Free Optimization via Classification.

【Paper Link】【Pages】:2286-2292

【Authors】: Yang Yu ; Hong Qian ; Yi-Qi Hu

【Abstract】: Many randomized heuristic derivative-free optimization methods share a framework that iteratively learns a model for promising search areas and samples solutions from the model. This paper studies a particular setting of such framework, where the model is implemented by a classification model discriminating good solutions from bad ones. This setting allows a general theoretical characterization, where critical factors to the optimization are discovered. We also prove that optimization problems with Local Lipschitz continuity can be solved in polynomial time by proper configurations of this framework. Following the critical factors, we propose the randomized coordinate shrinking classification algorithm to learn the model, forming the RACOS algorithm, for optimization in continuous and discrete domains. Experiments on the testing functions as well as on the machine learning tasks including spectral clustering and classification with Ramp loss demonstrate the effectiveness of RACOS.

【Keywords】: randomized optimization; non-convex optimization; time complexity; evolutionary algorithm

319. On Order-Constrained Transitive Distance Clustering.

【Paper Link】【Pages】:2293-2299

【Authors】: Zhiding Yu ; Weiyang Liu ; Wenbo Liu ; Yingzhen Yang ; Ming Li ; B. V. K. Vijaya Kumar

【Abstract】: We consider the problem of approximating order-constrained transitive distance (OCTD) and its clustering applications. Given any pairwise data, transitive distance (TD) is defined as the smallest possible "gap" on the set of paths connecting them. While such metric definition renders significant capability of addressing elongated clusters, it is sometimes also an over-simplified representation which loses necessary regularization on cluster structure and overfits to short links easily. As a result, conventional TD often suffers from degraded performance given clusters with "thick" structures. Our key intuition is that the maximum (path) order, which is the maximum number of nodes on a path, controls the level of flexibility. Reducing this order benefits the clustering performance by finding a trade-off between flexibility and regularization on cluster structure. Unlike TD, finding OCTD becomes an intractable problem even though the number of connecting paths is reduced. We therefore propose a fast approximation framework, using random samplings to generate multiple diversified TD matrices and a pooling to output the final approximated OCTD matrix. Comprehensive experiments on toy, image and speech datasets show the excellent performance of OCTD, surpassing TD with significant gains and giving state-of-the-art performance on several datasets.

【Keywords】: Transitive Distance; Clustering; Order-constrained;
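
As a rough illustration of the plain (unconstrained) transitive distance described in the abstract above, the sketch below reads the "gap" of a path as its largest edge weight and computes, for every pair of points, the smallest such gap over all connecting paths. It uses a Floyd-Warshall-style update; the function name `transitive_distance` and the toy data are assumptions for illustration, and the order-constrained variant (OCTD) and its sampling-based approximation are not covered.

```python
def transitive_distance(dist):
    """All-pairs minimax path distance for a symmetric distance matrix.

    td[i][j] is the smallest achievable "largest hop" over all paths
    from i to j (assumed reading of "gap" in the abstract).
    """
    n = len(dist)
    td = [row[:] for row in dist]  # copy so the input is left untouched
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # Largest hop on the path i -> k -> j
                via_k = max(td[i][k], td[k][j])
                if via_k < td[i][j]:
                    td[i][j] = via_k
    return td


# Toy data (hypothetical): four points on a line; the last one is far away.
points = [0.0, 1.0, 2.0, 10.0]
dist = [[abs(a - b) for b in points] for a in points]
td = transitive_distance(dist)
print(td[0][2])  # 1.0: the chain 0 -> 1 -> 2 only needs hops of length 1
print(td[0][3])  # 8.0: every path to the far point must cross the gap 2 -> 10
```

The elongated chain 0, 1, 2 collapses to a small distance, which is the behaviour that makes this kind of distance attractive for elongated clusters and, as the abstract notes, prone to overfitting short links.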

320. A Proximal Alternating Direction Method for Semi-Definite Rank Minimization.

【Paper Link】【Pages】:2300-2308

【Authors】: Ganzhao Yuan ; Bernard Ghanem

【Abstract】: Semi-definite rank minimization problems model a wide range of applications in both signal processing and machine learning fields. This class of problem is NP-hard in general. In this paper, we propose a proximal Alternating Direction Method (ADM) for the well-known semi-definite rank regularized minimization problem. Specifically, we first reformulate this NP-hard problem as an equivalent biconvex MPEC (Mathematical Program with Equilibrium Constraints), and then solve it using proximal ADM, which involves solving a sequence of structured convex semi-definite subproblems to find a desirable solution to the original rank regularized optimization problem. Moreover, based on the Kurdyka-Lojasiewicz inequality, we prove that the proposed method always converges to a KKT stationary point under mild conditions. We apply the proposed method to the widely studied and popular sensor network localization problem. Our extensive experiments demonstrate that the proposed algorithm outperforms state-of-the-art low-rank semi-definite minimization algorithms in terms of solution quality.

【Keywords】:

321. Learning Expected Hitting Time Distance.

【Paper Link】【Pages】:2309-2314

【Authors】: De-Chuan Zhan ; Peng Hu ; Zui Chu ; Zhi-Hua Zhou

【Abstract】: Most distance metric learning (DML) approaches focus on learning a Mahalanobis metric for measuring distances between examples. However, for particular feature representations, e.g., histogram features like BOW and SPM, a Mahalanobis metric cannot model the correlations between these features well. In this work, we define a non-Mahalanobis distance for histogram features, via the Expected Hitting Time (EHT) of a Markov chain, which implicitly considers the high-order feature relationships between different histogram features. Since the EHT-based distance is parameterized by the transition probabilities of the Markov chain, we propose a novel type of distance learning approach (LED, Learning Expected hitting time Distance) to learn appropriate transition probabilities for the EHT-based distance. We validate the effectiveness of LED on a series of real-world datasets. Moreover, experiments show that the learned transition probabilities are readily interpretable.

【Keywords】: Expected Hitting Time; Metric Learning
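
For readers unfamiliar with the quantity the abstract above builds on, the sketch below computes expected hitting times for a fixed Markov chain, using the standard recurrence h(target) = 0 and h(i) = 1 + Σ_k P[i][k] h(k) for the other states. It only evaluates hitting times for a given transition matrix; how the paper turns them into a distance and learns the transition probabilities is not shown. The function name, the iteration scheme, and the toy chain are assumptions.

```python
def expected_hitting_times(P, target, sweeps=10_000, tol=1e-12):
    """Expected number of steps to first reach `target` from each state.

    P is a row-stochastic transition matrix (list of lists). The recurrence
    is solved by fixed-point iteration, which converges when `target` is
    reachable from every state.
    """
    n = len(P)
    h = [0.0] * n
    for _ in range(sweeps):
        new_h = [
            0.0 if i == target
            else 1.0 + sum(P[i][k] * h[k] for k in range(n))
            for i in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new_h, h)) < tol:
            return new_h
        h = new_h
    return h


# Hypothetical 3-state chain; state 2 is the target.
P = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]
h = expected_hitting_times(P, target=2)
print(h)  # approximately [8.0, 6.0, 0.0]
```

Hitting times are generally asymmetric (the time from i to j differs from the time from j to i), so any distance built on them needs to handle that explicitly.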

322. Stochastic Optimization for Kernel PCA.

【Paper Link】【Pages】:2315-2322

【Authors】: Lijun Zhang ; Tianbao Yang ; Jinfeng Yi ; Rong Jin ; Zhi-Hua Zhou

【Abstract】: Kernel Principal Component Analysis (PCA) is a popular extension of PCA which is able to find nonlinear patterns from data. However, the application of kernel PCA to large-scale problems remains a big challenge, due to its quadratic space complexity and cubic time complexity in the number of examples. To address this limitation, we utilize techniques from stochastic optimization to solve kernel PCA with linear space and time complexities per iteration. Specifically, we formulate it as a stochastic composite optimization problem, where a nuclear norm regularizer is introduced to promote low-rankness, and then develop a simple algorithm based on stochastic proximal gradient descent. During the optimization process, the proposed algorithm always maintains a low-rank factorization of iterates that can be conveniently held in memory. Compared to previous iterative approaches, a remarkable property of our algorithm is that it is equipped with an explicit rate of convergence. Theoretical analysis shows that the solution of our algorithm converges to the optimal one at an O(1/T) rate, where T is the number of iterations.

【Keywords】: Stochastic Optimization; Kernel PCA; Stochastic Proximal Gradient Descent; Low-rank

323. Asynchronous Distributed Semi-Stochastic Gradient Optimization.

【Paper Link】【Pages】:2323-2329

【Authors】: Ruiliang Zhang ; Shuai Zheng ; James T. Kwok

【Abstract】: With the recent proliferation of large-scale learning problems, there has been a lot of interest in distributed machine learning algorithms, particularly those that are based on stochastic gradient descent (SGD) and its variants. However, existing algorithms either suffer from slow convergence due to the inherent variance of stochastic gradients, or have a fast linear convergence rate but at the expense of poorer solution quality. In this paper, we combine their merits by proposing a fast distributed asynchronous SGD-based algorithm with variance reduction. A constant learning rate can be used, and it is also guaranteed to converge linearly to the optimal solution. Experiments on the Google Cloud Computing Platform demonstrate that the proposed algorithm outperforms state-of-the-art distributed asynchronous algorithms in terms of both wall clock time and solution quality.

【Keywords】: Asynchronous; Distributed Computing; Stochastic Gradient Descent; SGD; Optimization; Large Scale Learning; Big Data

324. An Alternating Proximal Splitting Method with Global Convergence for Nonconvex Structured Sparsity Optimization.

【Paper Link】【Pages】:2330-2336

【Authors】: Shubao Zhang ; Hui Qian ; Xiaojin Gong

【Abstract】: In many learning tasks with structural properties, structured sparse modeling usually leads to better interpretability and higher generalization performance. While great efforts have focused on the convex regularization, recent studies show that nonconvex regularizers can outperform their convex counterparts in many situations. However, the resulting nonconvex optimization problems are still challenging, especially for the structured sparsity-inducing regularizers. In this paper, we propose a splitting method for solving nonconvex structured sparsity optimization problems. The proposed method alternates between a gradient step and an easily solvable proximal step, and thus enjoys low per-iteration computational complexity. We prove that the whole sequence generated by the proposed method converges to a critical point with at least sublinear convergence rate, relying on the Kurdyka-Łojasiewicz inequality. Experiments on both simulated and real-world data sets demonstrate the efficiency and efficacy of the proposed method.

【Keywords】:

325. Accelerated Sparse Linear Regression via Random Projection.

【Paper Link】【Pages】:2337-2343

【Authors】: Weizhong Zhang ; Lijun Zhang ; Rong Jin ; Deng Cai ; Xiaofei He

【Abstract】: In this paper, we present an accelerated numerical method based on random projection for sparse linear regression. Previous studies have shown that under appropriate conditions, gradient-based methods enjoy a geometric convergence rate when applied to this problem. However, the time complexity of evaluating the gradient is as large as $\mathcal{O}(nd)$, where $n$ is the number of data points and $d$ is the dimensionality, making those methods inefficient for large-scale and high-dimensional datasets. To address this limitation, we first utilize random projection to find a rank-$k$ approximator for the data matrix, and reduce the cost of gradient evaluation to $\mathcal{O}(nk+dk)$, a significant improvement when $k$ is much smaller than $d$ and $n$. Then, we solve the sparse linear regression problem via a proximal gradient method with a homotopy strategy to generate sparse intermediate solutions. Theoretical analysis shows that our method also achieves a global geometric convergence rate, and moreover the sparsity of all the intermediate solutions is well-bounded over the iterations. Finally, we conduct experiments to demonstrate the efficiency of the proposed method.

【Keywords】: Lasso; Convex Optimization; Machine Learning
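
The proximal gradient step mentioned in the abstract above can be sketched with the plain (unaccelerated) ISTA iteration for the Lasso objective 0.5·‖Xw − y‖² + λ‖w‖₁. This is a baseline illustration only: it evaluates the full gradient at O(nd) cost per iteration and includes neither the random projection nor the homotopy strategy of the paper. The names `soft_threshold` and `ista` and the toy data are assumptions.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(X, y, lam, step, iters=200):
    """Plain proximal gradient descent for 0.5*||Xw - y||^2 + lam*||w||_1."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(iters):
        # Full gradient X^T (Xw - y): this is the O(nd) step per iteration.
        residual = [sum(X[i][j] * w[j] for j in range(d)) - y[i] for i in range(n)]
        grad = [sum(X[i][j] * residual[i] for i in range(n)) for j in range(d)]
        # Gradient step on the smooth part, then the proximal (shrinkage) step.
        w = [soft_threshold(w[j] - step * grad[j], step * lam) for j in range(d)]
    return w


# Toy problem (hypothetical): with an identity design the Lasso solution
# is simply the soft-thresholded response.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [3.0, 0.5]
w = ista(X, y, lam=1.0, step=1.0)
print(w)  # [2.0, 0.0]
```

The second coefficient is set exactly to zero by the shrinkage step, which is how the proximal step produces sparse iterates.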

326. Large-Scale Graph-Based Semi-Supervised Learning via Tree Laplacian Solver.

【Paper Link】【Pages】:2344-2350

【Authors】: Yan-Ming Zhang ; Xu-Yao Zhang ; Xiao-Tong Yuan ; Cheng-Lin Liu

【Abstract】: Graph-based Semi-Supervised learning is one of the most popular and successful semi-supervised learning methods. Typically, it predicts the labels of unlabeled data by minimizing a quadratic objective induced by the graph, which is unfortunately a procedure of polynomial complexity in the sample size $n$. In this paper, we address this scalability issue by proposing a method that approximately solves the quadratic objective in nearly linear time. The method consists of two steps: it first approximates a graph by a minimum spanning tree, and then solves the tree-induced quadratic objective function in O(n) time which is the main contribution of this work. Extensive experiments show the significant scalability improvement over existing scalable semi-supervised learning methods.

【Keywords】: semi-supervised learning; graph-based learning methods

327. Near-Optimal Active Learning of Multi-Output Gaussian Processes.

【Paper Link】【Pages】:2351-2357

【Authors】: Yehong Zhang ; Trong Nghia Hoang ; Kian Hsiang Low ; Mohan S. Kankanhalli

【Abstract】: This paper addresses the problem of active learning of a multi-output Gaussian process (MOGP) model representing multiple types of coexisting correlated environmental phenomena. In contrast to existing works, our active learning problem involves selecting not just the most informative sampling locations to be observed but also the types of measurements at each selected location for minimizing the predictive uncertainty (i.e., posterior joint entropy) of a target phenomenon of interest given a sampling budget. Unfortunately, such an entropy criterion scales poorly in the numbers of candidate sampling locations and selected observations when optimized. To resolve this issue, we first exploit a structure common to sparse MOGP models for deriving a novel active learning criterion. Then, we exploit a relaxed form of submodularity property of our new criterion for devising a polynomial-time approximation algorithm that guarantees a constant-factor approximation of that achieved by the optimal set of selected observations. Empirical evaluation on real-world datasets shows that our proposed approach outperforms existing algorithms for active learning of MOGP and single-output GP models.

【Keywords】: Active learning, Gaussian process, multi-output Gaussian process

328. Multi-Domain Active Learning for Recommendation.

【Paper Link】【Pages】:2358-2364

【Authors】: Zihan Zhang ; Xiaoming Jin ; Lianghao Li ; Guiguang Ding ; Qiang Yang

【Abstract】: Recently, active learning has been applied to recommendation to deal with data sparsity on a single domain. In this paper, we propose an active learning strategy for recommendation to alleviate the data sparsity in a multi-domain scenario. Specifically, our proposed active learning strategy simultaneously considers both specific and independent knowledge over all domains. We use the expected entropy to measure the generalization error of the domain-specific knowledge and propose a variance-based strategy to measure the generalization error of the domain-independent knowledge. The proposed active learning strategy uses a unified function to effectively combine these two measurements. We compare our strategy with five state-of-the-art baselines on five different multi-domain recommendation tasks, which are constituted by three real-world data sets. The experimental results show that our strategy performs significantly better than all the baselines and reduces human labeling efforts by at least 5.6%, 8.3%, 11.8%, 12.5% and 15.4% on the five tasks, respectively.

【Keywords】: active learning, recommendation, multi-domain

329. On the Differential Privacy of Bayesian Inference.

【Paper Link】【Pages】:2365-2371

【Authors】: Zuhe Zhang ; Benjamin I. P. Rubinstein ; Christos Dimitrakakis

【Abstract】: We study how to communicate findings of Bayesian inference to third parties, while preserving the strong guarantee of differential privacy. Our main contributions are four different algorithms for private Bayesian inference on probabilistic graphical models. These include two mechanisms for adding noise to the Bayesian updates, either directly to the posterior parameters, or to their Fourier transform so as to preserve update consistency. We also utilise a recently introduced posterior sampling mechanism, for which we prove bounds for the specific but general case of discrete Bayesian networks; and we introduce a maximum-a-posteriori private mechanism. Our analysis includes utility and privacy bounds, with a novel focus on the influence of graph structure on privacy. Worked examples and experiments with Bayesian naive Bayes and Bayesian linear regression illustrate the application of our mechanisms.

【Keywords】:
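
To make the idea of "adding noise directly to the posterior parameters" in the abstract above concrete, here is a minimal sketch using the textbook Laplace mechanism on a Beta-Bernoulli model. It is not claimed to be any of the paper's four mechanisms: the model, the sensitivity argument (replacing one record moves one unit between the two counts, so the L1 sensitivity of the count pair is 2), the clamping step, and all names are assumptions made for illustration.

```python
import random


def private_beta_posterior(data, epsilon, prior=(1.0, 1.0), seed=None):
    """Release Beta posterior parameters for 0/1 data with Laplace noise.

    Assumption: neighbouring datasets differ by replacing one record, so the
    pair (heads, tails) has L1 sensitivity 2 and the noise scale is 2/epsilon.
    """
    rng = random.Random(seed)
    heads = sum(data)
    tails = len(data) - heads
    scale = 2.0 / epsilon

    def laplace():
        # The difference of two i.i.d. exponentials with mean `scale`
        # is Laplace-distributed with location 0 and scale `scale`.
        return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

    # Clamping is post-processing, so it does not weaken the privacy guarantee.
    noisy_heads = max(0.0, heads + laplace())
    noisy_tails = max(0.0, tails + laplace())
    return prior[0] + noisy_heads, prior[1] + noisy_tails


observations = [1, 1, 0, 1]  # hypothetical data
alpha, beta = private_beta_posterior(observations, epsilon=0.5, seed=0)
print(alpha, beta)  # noisy posterior parameters; exact values depend on the seed
```

Smaller values of `epsilon` give stronger privacy and noisier parameters; the non-private posterior for this toy data would be Beta(4, 2).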

330. A Scalable and Extensible Framework for Superposition-Structured Models.

【Paper Link】【Pages】:2372-2378

【Authors】: Shenjian Zhao ; Cong Xie ; Zhihua Zhang

【Abstract】: In many learning tasks, structural models usually lead to better interpretability and higher generalization performance. In recent years, however, the simple structural models such as lasso are frequently proved to be insufficient. Accordingly, there has been a lot of work on "superposition-structured" models where multiple structural constraints are imposed. To efficiently solve these "superposition-structured" statistical models, we develop a framework based on a proximal Newton-type method. Employing the smoothed conic dual approach with the LBFGS updating formula, we propose a scalable and extensible proximal quasi-Newton (SEP-QN) framework. Empirical analysis on various datasets shows that our framework is potentially powerful, and achieves super-linear convergence rate for optimizing some popular "superposition-structured" statistical models such as the fused sparse group lasso.

【Keywords】: superposition-structured;proximal;LBFGS

331. Fast Asynchronous Parallel Stochastic Gradient Descent: A Lock-Free Approach with Convergence Guarantee.

【Paper Link】【Pages】:2379-2385

【Authors】: Shen-Yi Zhao ; Wu-Jun Li

【Abstract】: Stochastic gradient descent (SGD) and its variants have become more and more popular in machine learning due to their efficiency and effectiveness. To handle large-scale problems, researchers have recently proposed several parallel SGD methods for multicore systems. However, existing parallel SGD methods cannot achieve satisfactory performance in real applications. In this paper, we propose a fast asynchronous parallel SGD method, called AsySVRG, by designing an asynchronous strategy to parallelize the recently proposed SGD variant called stochastic variance reduced gradient (SVRG). AsySVRG adopts a lock-free strategy which is more efficient than other strategies with locks. Furthermore, we theoretically prove that AsySVRG is convergent with a linear convergence rate. Both theoretical and empirical results show that AsySVRG can outperform existing state-of-the-art parallel SGD methods like Hogwild! in terms of convergence rate and computation cost.

【Keywords】: stochastic learning; parallel; lock-free; sgd
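
The variance-reduced update that the abstract above parallelizes can be sketched in its serial form. The code below is single-threaded SVRG on a one-dimensional least-squares problem; it does not implement the asynchronous, lock-free scheme of AsySVRG. The function name, step size, epoch counts, and data are assumptions chosen for illustration.

```python
import random


def svrg(xs, ys, step=0.05, epochs=50, inner=None, seed=0):
    """Serial SVRG for min_w (1/2n) * sum_i (w*x_i - y_i)^2 with scalar w."""
    rng = random.Random(seed)
    n = len(xs)
    inner = inner or 2 * n  # number of inner updates per snapshot
    w = 0.0
    for _ in range(epochs):
        # Snapshot the iterate and compute the full gradient once per epoch.
        w_snap = w
        full_grad = sum((w_snap * x - y) * x for x, y in zip(xs, ys)) / n
        for _ in range(inner):
            i = rng.randrange(n)
            g_i = (w * xs[i] - ys[i]) * xs[i]          # stochastic gradient at w
            g_snap = (w_snap * xs[i] - ys[i]) * xs[i]  # same sample at the snapshot
            # Variance-reduced direction: unbiased, and its variance shrinks
            # as w and w_snap approach the optimum.
            w -= step * (g_i - g_snap + full_grad)
    return w


# Hypothetical noiseless data generated by y = 2x, so the optimum is w = 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w_hat = svrg(xs, ys)
print(w_hat)  # close to 2.0
```

Because the correction term removes most of the sampling noise, a constant step size can be used, which is what makes linear convergence possible for this family of methods.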

332. DinTucker: Scaling Up Gaussian Process Models on Large Multidimensional Arrays.

【Paper Link】【Pages】:2386-2392

【Authors】: Shandian Zhe ; Yuan Qi ; Youngja Park ; Zenglin Xu ; Ian Molloy ; Suresh Chari

【Abstract】: Tensor decomposition methods are effective tools for modelling multidimensional array data (i.e., tensors). Among them, nonparametric Bayesian models, such as Infinite Tucker Decomposition (InfTucker), are more powerful than multilinear factorization approaches, including Tucker and PARAFAC, and usually achieve better predictive performance. However, they have difficulty handling massive data due to a prohibitively high training cost. To address this limitation, we propose Distributed infinite Tucker (DinTucker), a new hierarchical Bayesian model that enables local learning of InfTucker on subarrays and global information integration from local results. We further develop a distributed stochastic gradient descent algorithm, coupled with variational inference for model estimation. In addition, the connection between DinTucker and InfTucker is revealed in terms of model evidence. Experiments demonstrate that DinTucker maintains the predictive accuracy of InfTucker and is scalable on massive data: On multidimensional arrays with billions of elements from two real-world applications, DinTucker achieves significantly higher prediction accuracy with less training time, compared with the state-of-the-art large-scale tensor decomposition method, GigaTensor.

【Keywords】: Large Scale Tensor Decomposition; Multidimensional array analysis; Distributed Tensor Decomposition; Map-Reduce; Gaussian Process; Nonlinear Tensor Decomposition; Local Gaussian Process

333. Fast Nonsmooth Regularized Risk Minimization with Continuation.

【Paper Link】【Pages】:2393-2399

【Authors】: Shuai Zheng ; Ruiliang Zhang ; James T. Kwok

【Abstract】: In regularized risk minimization, the associated optimization problem becomes particularly difficult when both the loss and regularizer are nonsmooth. Existing approaches either have slow or unclear convergence properties, are restricted to limited problem subclasses, or require careful setting of a smoothing parameter. In this paper, we propose a continuation algorithm that is applicable to a large class of nonsmooth regularized risk minimization problems, can be flexibly used with a number of existing solvers for the underlying smoothed subproblem, and with convergence results on the whole algorithm rather than just one of its subproblems. In particular, when accelerated solvers are used, the proposed algorithm achieves the fastest known rates of $O(1/T^2)$ on strongly convex problems, and $O(1/T)$ on general convex problems. Experiments on nonsmooth classification and regression tasks demonstrate that the proposed algorithm outperforms the state-of-the-art.

【Keywords】: nonsmooth; regularized risk minimization; continuation;

334. Transfer Learning for Cross-Language Text Categorization through Active Correspondences Construction.

【Paper Link】【Pages】:2400-2406

【Authors】: Joey Tianyi Zhou ; Sinno Jialin Pan ; Ivor W. Tsang ; Shen-Shyang Ho

【Abstract】: Most existing heterogeneous transfer learning (HTL) methods for cross-language text classification rely on sufficient cross-domain instance correspondences to learn a mapping across heterogeneous feature spaces, and assume that such correspondences are given in advance. However, in practice, correspondences between domains are usually unknown. In this case, extensive manual effort is required to establish accurate correspondences across multilingual documents based on their content and meta-information. In this paper, we present a general framework to integrate active learning to construct correspondences between heterogeneous domains for HTL, namely HTL through active correspondences construction (HTLA). Based on this framework, we develop a new HTL method. On top of the new HTL method, we further propose a strategy to actively construct correspondences between domains. Extensive experiments are conducted on various multilingual text classification tasks to verify the effectiveness of HTLA.

【Keywords】: Heterogeneous Transfer Learning, Active Learning, Cross-Language Text Categorization

335. Veto-Consensus Multiple Kernel Learning.

【Paper Link】【Pages】:2407-2414

【Authors】: Yuxun Zhou ; Ninghang Hu ; Costas J. Spanos

【Abstract】: We propose Veto-Consensus Multiple Kernel Learning (VCMKL), a novel way of combining multiple kernels such that one class of samples is described by the logical intersection (consensus) of base kernelized decision rules, whereas the other classes are described by the union (veto) of their complements. The proposed configuration is a natural fit for domain description and learning with hidden subgroups. We first provide a generalization risk bound in terms of the Rademacher complexity of the classifier, and then formulate a large margin multi-ν learning objective with a tunable training error bound. Seeing that the corresponding optimization is non-convex and existing methods severely suffer from local minima, we establish a new algorithm, namely Parametric Dual Descent Procedure (PDDP), that can approach the global optimum with guarantees. The bases of PDDP are two theorems that reveal the global convexity and local explicitness of the parameterized dual optimum, for which a series of new techniques for parametric programs have been developed. The proposed method is evaluated on an extensive set of experiments, and the results show significant improvement over the state-of-the-art approaches.

【Keywords】: Multiple Kernel Learning;Consensus Learning;Global Optimization

336. Deep Hashing Network for Efficient Similarity Retrieval.

【Paper Link】【Pages】:2415-2421

【Authors】: Han Zhu ; Mingsheng Long ; Jianmin Wang ; Yue Cao

【Abstract】: Due to the storage and retrieval efficiency, hashing has been widely deployed to approximate nearest neighbor search for large-scale multimedia retrieval. Supervised hashing, which improves the quality of hash coding by exploiting the semantic similarity on data pairs, has received increasing attention recently. For most existing supervised hashing methods for image retrieval, an image is first represented as a vector of hand-crafted or machine-learned features, followed by another separate quantization step that generates binary codes. However, suboptimal hash coding may be produced, because the quantization error is not statistically minimized and the feature representation is not optimally compatible with the binary coding. In this paper, we propose a novel Deep Hashing Network (DHN) architecture for supervised hashing, in which we jointly learn good image representation tailored to hash coding and formally control the quantization error. The DHN model constitutes four key components: (1) a sub-network with multiple convolution-pooling layers to capture image representations; (2) a fully-connected hashing layer to generate compact binary hash codes; (3) a pairwise cross-entropy loss layer for similarity-preserving learning; and (4) a pairwise quantization loss for controlling hashing quality. Extensive experiments on standard image retrieval datasets show the proposed DHN model yields substantial boosts over latest state-of-the-art hashing methods.

【Keywords】: Supervised Hashing; Deep Learning; Similarity Retrieval

337. Coupled Dictionary Learning for Unsupervised Feature Selection.

【Paper Link】【Pages】:2422-2428

【Authors】: Pengfei Zhu ; Qinghua Hu ; Changqing Zhang ; Wangmeng Zuo

【Abstract】: Unsupervised feature selection (UFS) aims to reduce the time complexity and storage burden, as well as improve the generalization performance. Most existing methods convert UFS to a supervised learning problem by generating labels with specific techniques (e.g., spectral analysis, matrix factorization and linear predictor). Instead, we propose a novel coupled analysis-synthesis dictionary learning method, which is free of generating labels. The representation coefficients are used to model the cluster structure and data distribution. Specifically, the synthesis dictionary is used to reconstruct samples, while the analysis dictionary analytically codes the samples and assigns probabilities to the samples. Afterwards, the analysis dictionary is used to select features that can well preserve the data distribution. The L2p-norm (0 < p < 1) regularization is imposed on the analysis dictionary to obtain a much sparser solution and is more effective in feature selection. We propose an iterative reweighted least squares algorithm to solve the L2p-norm optimization problem and prove that it converges to a fixed point. Experiments on benchmark datasets validate the effectiveness of the proposed method.

【Keywords】:

338. Stochastic Parallel Block Coordinate Descent for Large-Scale Saddle Point Problems.

【Paper Link】【Pages】:2429-2437

【Authors】: Zhanxing Zhu ; Amos J. Storkey

【Abstract】: We consider convex-concave saddle point problems with a separable structure and non-strongly convex functions. We propose an efficient stochastic block coordinate descent method using adaptive primal-dual updates, which enables flexible parallel optimization for large-scale problems. Our method combines the efficiency and flexibility of block coordinate descent methods with the simplicity of primal-dual methods, and exploits the structure of the separable convex-concave saddle point problem. It is capable of solving a wide range of machine learning applications, including robust principal component analysis, Lasso, and feature selection by group Lasso. Theoretically and empirically, we demonstrate significantly better performance than state-of-the-art methods in all these applications.

【Keywords】: saddle point; stochastic coordinate descent; large-scale optimization

Technical Papers: Multiagent Systems 18

339. Temporal Vaccination Games under Resource Constraints.

【Paper Link】【Pages】:2438-2444

【Authors】: Abhijin Adiga ; Anil Vullikanti

【Abstract】: The decision to take vaccinations and other protective interventions for avoiding an infection is a natural game-theoretic setting. Most of the work on vaccination games has focused on decisions at the start of an epidemic. However, a lot of people defer their vaccination decisions, in practice. For example, in the case of the seasonal flu, vaccination rates gradually increase, as the epidemic rate increases. This motivates the study of temporal vaccination games, in which vaccination decisions can be made more than once. An important issue in the context of temporal decisions is that of resource limitations, which may arise due to production and distribution constraints. While there has been some work on temporal vaccination games, resource constraints have not been considered. In this paper, we study temporal vaccination games for epidemics in the SI (susceptible-infectious) model, with resource constraints in the form of a repeated game in complex social networks, with budgets on the number of vaccines that can be taken at any time. We find that the resource constraints and the vaccination and infection costs have a significant impact on the structure of Nash equilibria (NE). In general, the budget constraints can cause NE to become very inefficient, and finding efficient NE as well as the social optimum are NP-hard problems. We develop algorithms for finding NE and approximating the social optimum. We evaluate our results using simulations on different kinds of networks.

【Keywords】: vaccination games; disease spread; social optimum; best response strategy

340. Detection of Plan Deviation in Multi-Agent Systems.

【Paper Link】【Pages】:2445-2451

【Authors】: Bikramjit Banerjee ; Steven Loscalzo ; Daniel Lucas Thompson

【Abstract】: Plan monitoring in a collaborative multi-agent system requires an agent to not only monitor the execution of its own plan, but also to detect possible deviations or failures in the plan execution of its teammates. In domains featuring partial observability and uncertainty in the agents’ sensing and actuation, especially where communication among agents is sparse (as a part of a cost-minimized plan), plan monitoring can be a significant challenge. We design an Expectation Maximization (EM) based algorithm for detection of plan deviation of teammates in such a multi-agent system. However, a direct implementation of this algorithm is intractable, so we also design an alternative approach grounded on the agents’ plans, for tractability. We establish its equivalence to the intractable version, and evaluate these techniques in some challenging tasks.

【Keywords】:

341. Complexity of Shift Bribery in Committee Elections.

【Paper Link】【Pages】:2452-2458

【Authors】: Robert Bredereck ; Piotr Faliszewski ; Rolf Niedermeier ; Nimrod Talmon

【Abstract】: We study the (parameterized) complexity of Shift Bribery for multiwinner voting rules. We focus on the SNTV, Bloc, k-Borda, and Chamberlin-Courant rules, as well as on approximate variants of the Chamberlin-Courant rule, since the original rule is NP-hard to compute. We show that Shift Bribery tends to be significantly harder in the multiwinner setting than in the single-winner one by showing settings where Shift Bribery is easy in the single-winner cases, but is hard (and hard to approximate) in the multiwinner ones. We show that the non-monotonicity of those rules which are based on approximation algorithms for the Chamberlin-Courant rule sometimes affects the complexity of Shift Bribery.

【Keywords】: multiwinner elections; shift bribery; algorithms; parameterized complexity; campaign management; Chamberlin-Courant
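
For orientation, the three score-based committee rules named in the abstract above can be sketched as follows, together with the basic shift operation that a shift bribery performs on a single vote. The sketch uses the standard definitions (SNTV: one point for a voter's top choice; Bloc: one point for each of a voter's top k; k-Borda: Borda scores) and picks the k highest-scoring candidates. Lexicographic tie-breaking, the function names, and the example profile are assumptions; the Chamberlin-Courant rule and the cost model of bribery are not covered.

```python
def committee(votes, k, rule):
    """Select a size-k committee from full rankings (best candidate first)."""
    candidates = sorted(votes[0])
    m = len(candidates)
    score = {c: 0 for c in candidates}
    for ranking in votes:
        for pos, c in enumerate(ranking):
            if rule == "sntv":
                score[c] += 1 if pos == 0 else 0
            elif rule == "bloc":
                score[c] += 1 if pos < k else 0
            elif rule == "k-borda":
                score[c] += m - 1 - pos
            else:
                raise ValueError(f"unknown rule: {rule}")
    # Assumed tie-breaking: higher score first, then alphabetical order.
    ranked = sorted(candidates, key=lambda c: (-score[c], c))
    return ranked[:k]


def shift_forward(ranking, candidate, steps):
    """Move `candidate` up by `steps` positions in one vote (a single shift)."""
    pos = ranking.index(candidate)
    new_pos = max(0, pos - steps)
    rest = ranking[:pos] + ranking[pos + 1:]
    return rest[:new_pos] + [candidate] + rest[new_pos:]


# Hypothetical profile with five voters and four candidates.
votes = [
    ["a", "b", "c", "d"],
    ["a", "b", "c", "d"],
    ["a", "c", "b", "d"],
    ["d", "b", "c", "a"],
    ["d", "c", "b", "a"],
]
print(committee(votes, 2, "sntv"))     # ['a', 'd']
print(committee(votes, 2, "bloc"))     # ['a', 'b']
print(committee(votes, 2, "k-borda"))  # ['a', 'b']
print(shift_forward(votes[3], "c", 1))  # ['d', 'c', 'b', 'a']
```

Even on this small profile the rules disagree on the committee, which hints at why the effect of a shift depends strongly on the rule in use.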

342. Global Model Checking on Pushdown Multi-Agent Systems.

【Paper Link】【Pages】:2459-2465

【Authors】: Taolue Chen ; Fu Song ; Zhilin Wu

【Abstract】: Pushdown multi-agent systems, modeled by pushdown game structures (PGSs), are an important paradigm of infinite-state multi-agent systems. Alternating-time temporal logics are well-known specification formalisms for multi-agent systems, where the selective path quantifier is introduced to reason about strategies of agents. In this paper, we investigate model checking algorithms for variants of alternating-time temporal logics over PGSs, initiated by Murano and Perelli at IJCAI'15. We first give a triply exponential-time model checking algorithm for ATL* over PGSs. The algorithm is based on the saturation method, and is the first global model checking algorithm with a matching lower bound. Next, we study the model checking problem for the alternating-time mu-calculus. We propose an exponential-time global model checking algorithm which extends similar algorithms for pushdown systems and modal mu-calculus. The algorithm admits a matching lower bound, which holds even for the alternation-free fragment and ATL.

【Keywords】:

343. Frugal Bribery in Voting.

【Paper Link】【Pages】:2466-2472

【Authors】: Palash Dey ; Neeldhara Misra ; Y. Narahari

【Abstract】: Bribery in elections is an important problem in computational social choice theory. We introduce and study two important special cases of the bribery problem, namely, FRUGAL-BRIBERY and FRUGAL-$BRIBERY where the briber is frugal in nature. By this, we mean that the briber is only able to influence voters who benefit from the suggestion of the briber. More formally, a voter is vulnerable if the outcome of the election improves according to her own preference when she accepts the suggestion of the briber. In the FRUGAL-BRIBERY problem, the goal is to make a certain candidate win the election by changing only the vulnerable votes. In the FRUGAL-$BRIBERY problem, the vulnerable votes have prices and the goal is to make a certain candidate win the election by changing only the vulnerable votes, subject to a budget constraint. We show that both the FRUGAL-BRIBERY and the FRUGAL-$BRIBERY problems are intractable for many commonly used voting rules for weighted as well as unweighted elections. These intractability results demonstrate that bribery is a hard computational problem, in the sense that several special cases of this problem continue to be computationally intractable. This strengthens the view that bribery, although a possible attack on an election in principle, may be infeasible in practice.

【Keywords】: social choice theory, algorithms, bribery, voting, frugal

344. Target Surveillance in Adversarial Environments Using POMDPs.

【Paper Link】【Pages】:2473-2479

【Authors】: Maxim Egorov ; Mykel J. Kochenderfer ; Jaak J. Uudmae

【Abstract】: This paper introduces an extension of the target surveillance problem in which the surveillance agent is exposed to an adversarial ballistic threat. The problem is formulated as a mixed observability Markov decision process (MOMDP), which is a factored variant of the partially observable Markov decision process, to account for state and dynamic uncertainties. The control policy resulting from solving the MOMDP aims to optimize the frequency of target observations and minimize exposure to the ballistic threat. The adversary’s behavior is modeled with a level-k policy, which is used to construct the state transition of the MOMDP. The approach is empirically evaluated against a MOMDP adversary and against a human opponent in a target surveillance computer game. The empirical results demonstrate that, on average, level 3 MOMDP policies outperform lower level reasoning policies as well as human players.

【Keywords】: POMDPs; Level-k Reasoning; Target Surveillance; Adversarial Modeling

345. Multi-Variable Agents Decomposition for DCOPs.

【Paper Link】【Pages】:2480-2486

【Authors】: Ferdinando Fioretto ; William Yeoh ; Enrico Pontelli

【Abstract】: The application of DCOP models to large problems faces two main limitations: (i) Modeling limitations, as each agent can handle only a single variable of the problem; and (ii) Resolution limitations, as current approaches do not exploit the local problem structure within each agent. This paper proposes a novel Multi-Variable Agent (MVA) DCOP decomposition technique, which: (i) Exploits the co-locality of each agent's variables, allowing us to adopt efficient centralized techniques within each agent; (ii) Enables the use of hierarchical parallel models and proposes the use of GPUs; and (iii) Reduces the amount of computation and communication required in several classes of DCOP algorithms.

【Keywords】: DCOP; CP; GPU; MVA

346. Implicit Coordination in Crowded Multi-Agent Navigation.

【Paper Link】【Pages】:2487-2493

【Authors】: Julio Erasmo Godoy ; Ioannis Karamouzas ; Stephen J. Guy ; Maria L. Gini

【Abstract】: In crowded multi-agent navigation environments, the motion of the agents is significantly constrained by the motion of the nearby agents. This makes planning paths very difficult and leads to inefficient global motion. To address this problem, we propose a new distributed approach to coordinate the motions of agents in crowded environments. With our approach, agents take into account the velocities and goals of their neighbors and optimize their motion accordingly and in real-time. We experimentally validate our coordination approach in a variety of scenarios and show that its performance scales to scenarios with hundreds of agents.

【Keywords】: Multi-Agent Navigation; Multi-Agent Coordination

347. Emergence of Social Punishment and Cooperation through Prior Commitments.

【Paper Link】【Pages】:2494-2500

【Authors】: The Anh Han

【Abstract】: Social punishment, whereby cooperators punish defectors, has been suggested as an important mechanism that promotes the emergence of cooperation or maintenance of social norms in the context of the one-shot (i.e. non-repeated) interaction. However, whenever antisocial punishment, whereby defectors punish cooperators, is available, this antisocial behavior outperforms social punishment, leading to the destruction of cooperation. In this paper, we use evolutionary game theory to show that this antisocial behavior can be efficiently restrained by relying on prior commitments, wherein agents can arrange, prior to an interaction, agreements regarding posterior compensation by those who dishonor the agreements. We show that, although the commitment mechanism by itself can guarantee a notable level of cooperation, a significantly higher level is achieved when both mechanisms, those of proposing prior commitments and of punishment, are available in co-presence. Interestingly, social punishment prevails and dominates in this system as it can take advantage of the commitment mechanism to cope with antisocial behaviors. That is, establishment of a commitment system helps to pave the way for the evolution of social punishment and abundant cooperation, even in the presence of antisocial punishment.

【Keywords】: Cooperation; Punishment; Commitment; Antisocial punishment; Evolutionary Game Theory

348. Efficient Computation of Emergent Equilibrium in Agent-Based Simulation.

【Paper Link】【Pages】:2501-2508

【Authors】: Zehong Hu ; Meng Sha ; Moath Jarrah ; Jie Zhang ; Hui Xi

【Abstract】: In agent-based simulation, emergent equilibrium describes the macroscopic steady states of agents' interactions. While the state of individual agents might be changing, the collective behavior pattern remains the same in macroscopic equilibrium states. Traditionally, these emergent equilibriums are calculated using Monte Carlo methods. However, these methods require thousands of repeated simulation runs, which are extremely time-consuming. In this paper, we propose a novel three-layer framework to efficiently compute emergent equilibriums. The framework consists of a macro-level pseudo-arclength equilibrium solver (PAES), a micro-level simulator (MLS) and a macro-micro bridge (MMB). It can adaptively explore parameter space and recursively compute equilibrium states using the predictor-corrector scheme. We apply the framework to the popular opinion dynamics and labour market models. The experimental results show that our framework outperformed Monte Carlo experiments in terms of computation efficiency while maintaining the accuracy.

【Keywords】: Emergent Behavior; Equilibrium Computation; Agent-Based Simulation

349. Strengthening Agents Strategic Ability with Communication.

【Paper Link】【Pages】:2509-2515

【Authors】: Xiaowei Huang ; Qingliang Chen ; Kaile Su

【Abstract】: The current frameworks of reasoning about agents' collective strategy are either too conservative or too liberal in terms of the sharing of local information between agents. In this paper, we argue that in many cases, a suitable amount of information is required to be communicated between agents to both enforce goals and keep privacy. Several communication operators are proposed to work with an epistemic strategy logic ATLK. The complexity of model checking resulting logics is studied, and surprisingly, we found that the additional expressiveness from the communication operators comes for free.

【Keywords】:

350. Model Checking Probabilistic Knowledge: A PSPACE Case.

【Paper Link】【Pages】:2516-2522

【Authors】: Xiaowei Huang ; Marta Kwiatkowska

【Abstract】: Model checking probabilistic knowledge of memoryful semantics is undecidable, even for a simple formula concerning the reachability of probabilistic knowledge of a single agent. This result suggests that the usual approach of tackling undecidable model checking problems, by finding syntactic restrictions over the logic language, may not suffice. In this paper, we propose to work with an additional restriction that agent's knowledge concerns a special class of atomic propositions. A PSPACE-complete case is identified with this additional restriction, for a logic language combining LTL with limit-sure knowledge of a single agent.

【Keywords】:

351. Learning for Decentralized Control of Multiagent Systems in Large, Partially-Observable Stochastic Environments.

【Paper Link】【Pages】:2523-2529

【Authors】: Miao Liu ; Christopher Amato ; Emily P. Anesta ; John Daniel Griffith ; Jonathan P. How

【Abstract】: Decentralized partially observable Markov decision processes (Dec-POMDPs) provide a general framework for multiagent sequential decision-making under uncertainty. Although Dec-POMDPs are typically intractable to solve for real-world problems, recent research on macro-actions (i.e., temporally-extended actions) has significantly increased the size of problems that can be solved. However, current methods assume the underlying Dec-POMDP model is known a priori or a full simulator is available during planning time. To accommodate more realistic scenarios, when such information is not available, this paper presents a policy-based reinforcement learning approach, which learns the agent policies based solely on trajectories generated by previous interaction with the environment (e.g., demonstrations). We show that our approach is able to generate valid macro-action controllers and develop an expectation-maximization (EM) algorithm (called Policy-based EM or PoEM), which has convergence guarantees for batch learning. Our experiments show PoEM is a scalable learning method that can learn optimal policies and improve upon hand-coded “expert” solutions.

【Keywords】: Dec-POMDPs, Multiagent Planning, Reinforcement Learning, Mealy Machine, Macro Action

352. Bayesian Learning of Other Agents' Finite Controllers for Interactive POMDPs.

【Paper Link】【Pages】:2530-2536

【Authors】: Alessandro Panella ; Piotr J. Gmytrasiewicz

【Abstract】: We consider an autonomous agent operating in a stochastic, partially-observable, multiagent environment, that explicitly models the other agents as probabilistic deterministic finite-state controllers (PDFCs) in order to predict their actions. We assume that such models are not given to the agent, but instead must be learned from (possibly imperfect) observations of the other agents' behavior. The agent maintains a belief over the other agents' models, that is updated via Bayesian inference. To represent this belief we place a flexible stick-breaking distribution over PDFCs, that allows the posterior to concentrate around controllers whose size is not bounded and scales with the complexity of the observed data. Since this Bayesian inference task is not analytically tractable, we devise a Markov chain Monte Carlo algorithm to approximate the posterior distribution. The agent then embeds the result of this inference into its own decision making process using the interactive POMDP framework. We show that our learning algorithm can learn agent models that are behaviorally accurate for problems of varying complexity, and that the agent's performance increases as a result.

【Keywords】: Multiagent Systems, Opponent Modeling, Bayesian Learning, Dirichlet Process

353. Exploiting Anonymity in Approximate Linear Programming: Scaling to Large Multiagent MDPs.

【Paper Link】【Pages】:2537-2543

【Authors】: Philipp Robbel ; Frans A. Oliehoek ; Mykel J. Kochenderfer

【Abstract】: Many solution methods for Markov Decision Processes (MDPs) exploit structure in the problem and are based on value function factorization. Especially multiagent settings, however, are known to suffer from an exponential increase in value component sizes as interactions become denser, restricting problem sizes and types that can be handled. We present an approach to mitigate this limitation for certain types of multiagent systems, exploiting a property that can be thought of as "anonymous influence" in the factored MDP. We show how representational benefits from anonymity translate into computational efficiencies, both for variable elimination in a factor graph and for the approximate linear programming solution to factored MDPs. Our methods scale to factored MDPs that were previously unsolvable, such as the control of a stochastic disease process over densely connected graphs with 50 nodes and 25 agents.

【Keywords】: Multiagent MDP; Factored MDP; Approximate Linear Programming; Variable Elimination; planning

354. ConTaCT: Deciding to Communicate during Time-Critical Collaborative Tasks in Unknown, Deterministic Domains.

【Paper Link】【Pages】:2544-2550

【Authors】: Vaibhav V. Unhelkar ; Julie A. Shah

【Abstract】: Communication between agents has the potential to improve team performance of collaborative tasks. However, communication is not free in most domains, requiring agents to reason about the costs and benefits of sharing information. In this work, we develop an online, decentralized communication policy, ConTaCT, that enables agents to decide whether or not to communicate during time-critical collaborative tasks in unknown, deterministic environments. Our approach is motivated by real-world applications, including the coordination of disaster response and search and rescue teams. These settings motivate a model structure that explicitly represents the world model as initially unknown but deterministic in nature, and that de-emphasizes uncertainty about action outcomes. Simulated experiments are conducted in which ConTaCT is compared to other multi-agent communication policies, and results indicate that ConTaCT achieves comparable task performance while substantially reducing communication overhead.

【Keywords】: Communication in multi-agent systems; Decentralized execution; Planning under uncertainty; Cooperation and collaboration

355. Is It Harmful When Advisors Only Pretend to Be Honest?

【Paper Link】【Pages】:2551-2557

【Authors】: Dongxia Wang ; Tim Muller ; Jie Zhang ; Yang Liu

【Abstract】: In trust systems, unfair rating attacks — where advisors provide ratings dishonestly — influence the accuracy of trust evaluation. A secure trust system should function properly under all possible unfair rating attacks, including dynamic attacks. In the literature, camouflage attacks are the most studied dynamic attacks. But an open question is whether more harmful dynamic attacks exist. We propose random processes to model and measure dynamic attacks. The harm of an attack is influenced by a user's ability to learn from the past. We consider three types of users: blind users, aware users, and general users. We found that, for all three types, camouflage attacks are far from the most harmful. We identified the most harmful attacks, under which we found the ratings may still be useful to users.

【Keywords】: Trust Systems; Unfair Rating; Information Theory; Camouflage Attack

356. Robust Execution of BDI Agent Programs by Exploiting Synergies Between Intentions.

【Paper Link】【Pages】:2558-2565

【Authors】: Yuan Yao ; Brian Logan ; John Thangarajah

【Abstract】: A key advantage of the reactive planning approach adopted by BDI-based agents is the ability to recover from plan execution failures, and almost all BDI agent programming languages and platforms provide some form of failure handling mechanism. In general, these consist of simply choosing an alternative plan for the failed subgoal (e.g., JACK, Jadex). In this paper, we propose an alternative approach to recovering from execution failures that relies on exploiting positive interactions between an agent's intentions. A positive interaction occurs when the execution of an action in one intention assists the execution of actions in other intentions (e.g., by (re)establishing their preconditions). We have implemented our approach in a scheduling algorithm for BDI agents which we call SP. The results of a preliminary empirical evaluation of SP suggest our approach out-performs existing failure handling mechanisms used by state-of-the-art BDI languages. Moreover, the computational overhead of SP is modest.

【Keywords】: BDI agents; Intention selection; Failure recovery

Technical Papers: NLP and Knowledge Representation 16

357. Short Text Representation for Detecting Churn in Microblogs.

【Paper Link】【Pages】:2566-2572

【Authors】: Hadi Amiri ; Hal Daumé III

【Abstract】: Churn happens when a customer leaves a brand or stops using its services. Brands reduce their churn rates by identifying and retaining potential churners through customer retention campaigns. In this paper, we consider the problem of classifying micro-posts as churny or non-churny with respect to a given brand. Motivated by the recent success of recurrent neural networks (RNNs) in word representation, we propose to utilize RNNs to learn micro-post and churn indicator representations. We show that such representations improve the performance of churn detection in microblogs and lead to more accurate ranking of churny contents. Furthermore, in this research we show that state-of-the-art sentiment analysis approaches fail to identify churny contents. Experiments on Twitter data about three telco brands show the utility of our approach for this task.

【Keywords】: short text representation; tweet representation; churn prediction; churn classification

358. Topic Concentration in Query Focused Summarization Datasets.

【Paper Link】【Pages】:2573-2579

【Authors】: Tal Baumel ; Raphael Cohen ; Michael Elhadad

【Abstract】: Query-Focused Summarization (QFS) summarizes a document cluster in response to a specific input query. QFS algorithms must combine query relevance assessment, central content identification, and redundancy avoidance. Frustratingly, state of the art algorithms designed for QFS do not significantly improve upon generic summarization methods, which ignore query relevance, when evaluated on traditional QFS datasets. We hypothesize this lack of success stems from the nature of the dataset. We define a task-based method to quantify topic concentration in datasets, i.e., the ratio of sentences within the dataset that are relevant to the query, and observe that the DUC 2005, 2006 and 2007 datasets suffer from very high topic concentration. We introduce TD-QFS, a new QFS dataset with controlled levels of topic concentration. We compare competitive baseline algorithms on TD-QFS and report strong improvement in ROUGE performance for algorithms that properly model query relevance as opposed to generic summarizers. We further present three new and simple QFS algorithms, RelSum, ThresholdSum, and TFIDF-KLSum that outperform state of the art QFS algorithms on the TD-QFS dataset by a large margin.

【Keywords】: Automatic summarization; datasets; evaluation; QFS; IR

359. Combining Retrieval, Statistics, and Inference to Answer Elementary Science Questions.

【Paper Link】【Pages】:2580-2586

【Authors】: Peter Clark ; Oren Etzioni ; Tushar Khot ; Ashish Sabharwal ; Oyvind Tafjord ; Peter D. Turney ; Daniel Khashabi

【Abstract】: What capabilities are required for an AI system to pass standard 4th Grade Science Tests? Previous work has examined the use of Markov Logic Networks (MLNs) to represent the requisite background knowledge and interpret test questions, but did not improve upon an information retrieval (IR) baseline. In this paper, we describe an alternative approach that operates at three levels of representation and reasoning: information retrieval, corpus statistics, and simple inference over a semi-automatically constructed knowledge base, to achieve substantially improved results. We evaluate the methods on six years of unseen, unedited exam questions from the NY Regents Science Exam (using only non-diagram, multiple choice questions), and show that our overall system’s score is 71.3%, an improvement of 23.8% (absolute) over the MLN-based method described in previous work. We conclude with a detailed analysis, illustrating the complementary strengths of each method in the ensemble. Our datasets are being released to enable further research.

【Keywords】: question answering; natural language processing; machine learning; ensemble methods

360. Verb Pattern: A Probabilistic Semantic Representation on Verbs.

【Paper Link】【Pages】:2587-2593

【Authors】: Wanyun Cui ; Xiyou Zhou ; Hangyu Lin ; Yanghua Xiao ; Haixun Wang ; Seung-won Hwang ; Wei Wang

【Abstract】: Verbs are important in semantic understanding of natural language. Traditional verb representations, such as FrameNet, PropBank, VerbNet, focus on verbs' roles. These roles are too coarse to represent verbs' semantics. In this paper, we introduce verb patterns to represent verbs' semantics, such that each pattern corresponds to a single semantic of the verb. First we analyze the principles for verb patterns: generality and specificity. Then we propose a nonparametric model based on description length. Experimental results prove the high effectiveness of verb patterns. We further apply verb patterns to context-aware conceptualization, to show that verb patterns are helpful in semantic-related tasks.

【Keywords】: Verb Pattern; Verb Semantics; Verb Representation; Conceptualization

361. ExTaSem! Extending, Taxonomizing and Semantifying Domain Terminologies.

【Paper Link】【Pages】:2594-2600

【Authors】: Luis Espinosa Anke ; Horacio Saggion ; Francesco Ronzano ; Roberto Navigli

【Abstract】: We introduce ExTaSem!, a novel approach for the automatic learning of lexical taxonomies from domain terminologies. First, we exploit a very large semantic network to collect thousands of in-domain textual definitions. Second, we extract (hyponym, hypernym) pairs from each definition with a CRF-based algorithm trained on manually-validated data. Finally, we introduce a graph induction procedure which constructs a full-fledged taxonomy where each edge is weighted according to its domain pertinence. ExTaSem! achieves state-of-the-art results in the following taxonomy evaluation experiments: (1) Hypernym discovery, (2) Reconstructing gold standard taxonomies, and (3) Taxonomy quality according to structural measures. We release weighted taxonomies for six domains for the use and scrutiny of the community.

【Keywords】: BabelNet; Taxonomy Learning; Semantics; Domain Pertinence; Hypernym

362. A Unified Bayesian Model of Scripts, Frames and Language.

【Paper Link】【Pages】:2601-2607

【Authors】: Francis Ferraro ; Benjamin Van Durme

【Abstract】: We present the first probabilistic model to capture all levels of the Minsky Frame structure, with the goal of corpus-based induction of scenario definitions. Our model unifies prior efforts in discourse-level modeling with that of Fillmore's related notion of frame, as captured in sentence-level, FrameNet semantic parses; as part of this, we resurrect the coupling among Minsky's frames, Schank's scripts and Fillmore's frames, as originally laid out by those authors. Empirically, our approach yields improved scenario representations, reflected quantitatively in lower surprisal and more coherent latent scenarios.

【Keywords】: semantic frames; natural language understanding; event semantics

363. Single or Multiple? Combining Word Representations Independently Learned from Text and WordNet.

【Paper Link】【Pages】:2608-2614

【Authors】: Josu Goikoetxea ; Eneko Agirre ; Aitor Soroa

【Abstract】: Text and Knowledge Bases are complementary sources of information. Given the success of distributed word representations learned from text, several techniques to infuse additional information from sources like WordNet into word representations have been proposed. In this paper, we follow an alternative route. We learn word representations from text and WordNet independently, and then explore simple and sophisticated methods to combine them. The combined representations are applied to an extensive set of datasets on word similarity and relatedness. Simple combination methods happen to perform better than more complex methods like CCA or retrofitting, showing that, in the case of WordNet, learning word representations separately is preferable to learning one single representation space or adding WordNet information directly. A key factor, which we illustrate with examples, is that the WordNet-based representations capture similarity relations encoded in WordNet better than retrofitting. In addition, we show that the average of the similarities from six word representations yields results beyond the state-of-the-art in several datasets, reinforcing the opportunities to explore further combination techniques.

【Keywords】: Knowledge bases, random walks, distributional semantics, similarity

364. Representing Verbs as Argument Concepts.

【Paper Link】【Pages】:2615-2621

【Authors】: Yu Gong ; Kaiqi Zhao ; Kenny Qili Zhu

【Abstract】: Verbs play an important role in the understanding of natural language text. This paper studies the problem of abstracting the subject and object arguments of a verb into a set of noun concepts, known as the “argument concepts”. This set of concepts, whose size is parameterized, represents the fine-grained semantics of a verb. For example, the object of “enjoy” can be abstracted into time, hobby and event, etc. We present a novel framework to automatically infer human readable and machine computable action concepts with high accuracy.

【Keywords】: Knowledge Representation; Conceptualization; Algorithm

365. A Generative Model of Words and Relationships from Multiple Sources.

【Paper Link】【Pages】:2622-2629

【Authors】: Stephanie L. Hyland ; Theofanis Karaletsos ; Gunnar Rätsch

【Abstract】: Neural language models are a powerful tool to embed words into semantic vector spaces. However, learning such models generally relies on the availability of abundant and diverse training examples. In highly specialised domains this requirement may not be met due to difficulties in obtaining a large corpus, or the limited range of expression in average use. Such domains may encode prior knowledge about entities in a knowledge base or ontology. We propose a generative model which integrates evidence from diverse data sources, enabling the sharing of semantic information. We achieve this by generalising the concept of co-occurrence from distributional semantics to include other relationships between entities or words, which we model as affine transformations on the embedding space. We demonstrate the effectiveness of this approach by outperforming recent models on a link prediction task and demonstrating its ability to profit from partially or fully unobserved data training labels. We further demonstrate the usefulness of learning from different data sources with overlapping vocabularies.

【Keywords】: word embeddings; generative model; natural language processing; relational data

366. Agreement on Target-Bidirectional LSTMs for Sequence-to-Sequence Learning.

【Paper Link】【Pages】:2630-2637

【Authors】: Lemao Liu ; Andrew M. Finch ; Masao Utiyama ; Eiichiro Sumita

【Abstract】: Recurrent neural networks, particularly the long short-term memory networks, are extremely appealing for sequence-to-sequence learning tasks. Despite their great success, they typically suffer from a fundamental shortcoming: they are prone to generate unbalanced targets with good prefixes but bad suffixes, and thus performance suffers when dealing with long sequences. We propose a simple yet effective approach to overcome this shortcoming. Our approach relies on the agreement between a pair of target-directional LSTMs, which generates more balanced targets. In addition, we develop two efficient approximate search methods for agreement that are empirically shown to be almost optimal in terms of sequence-level losses. Extensive experiments were performed on two standard sequence-to-sequence transduction tasks: machine transliteration and grapheme-to-phoneme transformation. The results show that the proposed approach achieves consistent and substantial improvements, compared to six state-of-the-art systems. In particular, our approach outperforms the best reported error rates by a margin (up to 9% relative gains) on the grapheme-to-phoneme task.

【Keywords】:

367. Fine-Grained Semantic Conceptualization of FrameNet.

【Paper Link】【Pages】:2638-2644

【Authors】: Jin-Woo Park ; Seung-won Hwang ; Haixun Wang

【Abstract】: Understanding verbs is essential for many natural language tasks. To this end, large-scale lexical resources such as FrameNet have been manually constructed to annotate the semantics of verbs (frames) and their arguments (frame elements or FEs) in example sentences. Our goal is to "semantically conceptualize" example sentences by connecting FEs to knowledge base (KB) concepts. For example, connecting Employer FE to company concept in the KB enables the understanding that any (unseen) company can also be FE examples. However, a naive adoption of existing KB conceptualization technique, focusing on scenarios of conceptualizing a few terms, cannot 1) scale to many FE instances (average of 29.7 instances for all FEs) and 2) leverage interdependence between instances and concepts. We thus propose a scalable k-truss clustering and a Markov Random Field (MRF) model leveraging interdependence between concept-instance, concept-concept, and instance-instance pairs. Our extensive analysis with real-life data validates that our approach improves not only the quality of the identified concepts for FrameNet, but also that of applications such as selectional preference.

【Keywords】: Knowledge base; conceptualization; selectional preference

368. Dependency Tree Representations of Predicate-Argument Structures.

【Paper Link】【Pages】:2645-2651

【Authors】: Likun Qiu ; Yue Zhang ; Meishan Zhang

【Abstract】: We present a novel annotation framework for representing predicate-argument structures, which uses dependency trees to encode the syntactic and semantic roles of a sentence simultaneously. The main contribution is a semantic role transmission model, which eliminates the structural gap between syntax and shallow semantics, making them compatible. A Chinese semantic treebank was built under the proposed framework, and the first release containing about 14K sentences is made freely available. The proposed framework enables semantic role labeling to be solved as a sequence labeling task, and experiments show that standard sequence labelers can give competitive performance on the new treebank compared with state-of-the-art graph structure models.

【Keywords】: semantic role labeling; semantic treebank; Chinese

369. Complementing Semantic Roles with Temporally Anchored Spatial Knowledge: Crowdsourced Annotations and Experiments.

【Paper Link】【Pages】:2652-2658

【Authors】: Alakananda Vempala ; Eduardo Blanco

【Abstract】: This paper presents a framework to infer spatial knowledge from semantic role representations. We infer whether entities are or are not located somewhere, and temporally anchor this spatial information. A large crowdsourcing effort on top of OntoNotes shows that these temporally-anchored spatial inferences are ubiquitous and intuitive to humans. Experimental results show that inferences can be performed automatically and semantic features bring significant improvement.

【Keywords】: spatial inference; temporally-anchored knowledge; semantic roles

370. Representation Learning of Knowledge Graphs with Entity Descriptions.

【Paper Link】【Pages】:2659-2665

【Authors】: Ruobing Xie ; Zhiyuan Liu ; Jia Jia ; Huanbo Luan ; Maosong Sun

【Abstract】: Representation learning (RL) of knowledge graphs aims to project both entities and relations into a continuous low-dimensional space. Most methods concentrate on learning representations with knowledge triples indicating relations between entities. In fact, in most knowledge graphs there are usually concise descriptions for entities, which cannot be well utilized by existing methods. In this paper, we propose a novel RL method for knowledge graphs taking advantage of entity descriptions. More specifically, we explore two encoders, including continuous bag-of-words and deep convolutional neural models to encode semantics of entity descriptions. We further learn knowledge representations with both triples and descriptions. We evaluate our method on two tasks, including knowledge graph completion and entity classification. Experimental results on real-world datasets show that our method outperforms other baselines on the two tasks, especially under the zero-shot setting, which indicates that our method is capable of building representations for novel entities according to their descriptions. The source code of this paper can be obtained from https://github.com/xrb92/DKRL.

【Keywords】: knowledge graph; representation learning; entity; convolutional neural network; description

371. Hashtag-Based Sub-Event Discovery Using Mutually Generative LDA in Twitter.

【Paper Link】【Pages】:2666-2672

【Authors】: Chen Xing ; Yuan Wang ; Jie Liu ; Yalou Huang ; Wei-Ying Ma

【Abstract】: Sub-event discovery is an effective method for social event analysis in Twitter. It can discover sub-events from a large amount of noisy event-related information in Twitter and semantically represent them. The task is challenging because tweets are short, informal and noisy. To solve this problem, we consider leveraging event-related hashtags that contain many locations, dates and concise sub-event related descriptions to enhance sub-event discovery. To this end, we propose a hashtag-based mutually generative Latent Dirichlet Allocation model (MGe-LDA). In MGe-LDA, hashtags and topics of a tweet are mutually generated by each other. The mutually generative process models the relationship between hashtags and topics of tweets, and highlights the role of hashtags as a semantic representation of the corresponding tweets. Experimental results show that MGe-LDA can significantly outperform state-of-the-art methods for sub-event discovery.

【Keywords】:

372. PEAK: Pyramid Evaluation via Automated Knowledge Extraction.

【Paper Link】【Pages】:2673-2680

【Authors】: Qian Yang ; Rebecca J. Passonneau ; Gerard de Melo

【Abstract】: Evaluating the selection of content in a summary is important both for human-written summaries, which can be a useful pedagogical tool for reading and writing skills, and machine-generated summaries, which are increasingly being deployed in information management. The pyramid method assesses a summary by aggregating content units from the summaries of a wise crowd (a form of crowdsourcing). It has proven highly reliable but has largely depended on manual annotation. We propose PEAK, the first method to automatically assess summary content using the pyramid method that also generates the pyramid content models. PEAK relies on open information extraction and graph algorithms. The resulting scores correlate well with manually derived pyramid scores on both human and machine summaries, opening up the possibility of wide-spread use in numerous applications.

【Keywords】: automatic summarization; summarization evaluation; education

Technical Papers: NLP and Machine Learning 28

373. Instructable Intelligent Personal Agent.

【Paper Link】【Pages】:2681-2689

【Authors】: Amos Azaria ; Jayant Krishnamurthy ; Tom M. Mitchell

【Abstract】: Unlike traditional machine learning methods, humans often learn from natural language instruction. As users become increasingly accustomed to interacting with mobile devices using speech, their interest in instructing these devices in natural language is likely to grow. We introduce our Learning by Instruction Agent (LIA), an intelligent personal agent that users can teach to perform new action sequences to achieve new commands, using solely natural language interaction. LIA uses a CCG semantic parser to ground the semantics of each command in terms of primitive executable procedures defining sensors and effectors of the agent. Given a natural language command that LIA does not understand, it prompts the user to explain how to achieve the command through a sequence of steps, also specified in natural language. A novel lexicon induction algorithm enables LIA to generalize across taught commands, e.g., having been taught how to "forward an email to Alice," LIA can correctly interpret the command "forward this email to Bob." A user study involving email tasks demonstrates that users voluntarily teach LIA new commands, and that these taught commands significantly reduce task completion time. These results demonstrate the potential of natural language instruction as a significant, under-explored paradigm for machine learning.

【Keywords】: Learning by Instructions; Instructable Agent; Learning in Natural Language

374. Joint Word Representation Learning Using a Corpus and a Semantic Lexicon.

【Paper Link】【Pages】:2690-2696

【Authors】: Danushka Bollegala ; Mohammed Alsuhaibani ; Takanori Maehara ; Ken-ichi Kawarabayashi

【Abstract】: Methods for learning word representations using large text corpora have received much attention lately due to their impressive performance in numerous natural language processing (NLP) tasks such as semantic similarity measurement and word analogy detection. Despite their success, these data-driven word representation learning methods do not consider the rich semantic relational structure between words in a co-occurring context. On the other hand, already much manual effort has gone into the construction of semantic lexicons such as WordNet that represent the meanings of words by defining the various relationships that exist among the words in a language. We consider the question: can we improve the word representations learnt using a corpus by integrating the knowledge from semantic lexicons? For this purpose, we propose a joint word representation learning method that simultaneously predicts the co-occurrences of two words in a sentence subject to the relational constraints given by the semantic lexicon. We use relations that exist between words in the lexicon to regularize the word representations learnt from the corpus. Our proposed method statistically significantly outperforms previously proposed methods for incorporating semantic lexicons into word representations on several benchmark datasets for semantic similarity and word analogy.

【Keywords】: Deep Learning; Word Representations

375. Ask, and Shall You Receive? Understanding Desire Fulfillment in Natural Language Text.

【Paper Link】【Pages】:2697-2703

【Authors】: Snigdha Chaturvedi ; Dan Goldwasser ; Hal Daumé III

【Abstract】: The ability to comprehend wishes or desires and their fulfillment is important to Natural Language Understanding. This paper introduces the task of identifying if a desire expressed by a subject in a given short piece of text was fulfilled. We propose various unstructured and structured models that capture fulfillment cues such as the subject's emotional state and actions. Our experiments with two different datasets demonstrate the importance of understanding the narrative and discourse structure to address this task.

【Keywords】: Natural Language Understanding; Latent Variable Models; Structured Prediction; Desires and Wishes

376. Modeling Evolving Relationships Between Characters in Literary Novels.

【Paper Link】【Pages】:2704-2710

【Authors】: Snigdha Chaturvedi ; Shashank Srivastava ; Hal Daumé III ; Chris Dyer

【Abstract】: Studying characters plays a vital role in computationally representing and interpreting narratives. Unlike previous work, which has focused on inferring character roles, we focus on the problem of modeling their relationships. Rather than assuming a fixed relationship for a character pair, we hypothesize that relationships temporally evolve with the progress of the narrative, and formulate the problem of relationship modeling as a structured prediction problem. We propose a semi-supervised framework to learn relationship sequences from fully as well as partially labeled data. We present a Markovian model capable of accumulating historical beliefs about the relationship and status changes. We use a set of rich linguistic and semantically motivated features that incorporate world knowledge to investigate the textual content of narrative. We empirically demonstrate that such a framework outperforms competitive baselines.

【Keywords】: Relationship Modeling; Dynamic Relationships; Structured Prediction; Semi-supervised Methods; Latent Variable Models

377. Jointly Modeling Topics and Intents with Global Order Structure.

【Paper Link】【Pages】:2711-2717

【Authors】: Bei Chen ; Jun Zhu ; Nan Yang ; Tian Tian ; Ming Zhou ; Bo Zhang

【Abstract】: Modeling document structure is of great importance for discourse analysis and related applications. The goal of this research is to capture the document intent structure by modeling documents as a mixture of topic words and rhetorical words. While the topics are relatively unchanged through one document, the rhetorical functions of sentences usually change following certain orders in discourse. We propose GMM-LDA, a topic modeling based Bayesian unsupervised model, to analyze the document intent structure together with order information. Our model is flexible in that it can incorporate annotations and perform supervised learning. Additionally, entropic regularization can be introduced to model the significant divergence between topics and intents. We perform experiments in both unsupervised and supervised settings; the results show the superiority of our model over several state-of-the-art baselines.

【Keywords】: Topic Model; Generalized Mallows Model; Discourse Analysis

378. Extracting Biomolecular Interactions Using Semantic Parsing of Biomedical Text.

【Paper Link】【Pages】:2718-2726

【Authors】: Sahil Garg ; Aram Galstyan ; Ulf Hermjakob ; Daniel Marcu

【Abstract】: We advance the state of the art in biomolecular interaction extraction with three contributions: (i) We show that deep, Abstract Meaning Representations (AMR) significantly improve the accuracy of a biomolecular interaction extraction system when compared to a baseline that relies solely on surface- and syntax-based features; (ii) In contrast with previous approaches that infer relations on a sentence-by-sentence basis, we expand our framework to enable consistent predictions over sets of sentences (documents); (iii) We further modify and expand a graph kernel learning framework to enable concurrent exploitation of automatically induced AMR (semantic) and dependency structure (syntactic) representations. Our experiments show that our approach yields interaction extraction systems that are more robust in environments where there is a significant mismatch between training and test conditions.

【Keywords】: abstract meaning representation; kernels; edge embedding

379. What Happens Next? Event Prediction Using a Compositional Neural Network Model.

【Paper Link】【Pages】:2727-2733

【Authors】: Mark Granroth-Wilding ; Stephen Clark

【Abstract】: We address the problem of automatically acquiring knowledge of event sequences from text, with the aim of providing a predictive model for use in narrative generation systems. We present a neural network model that simultaneously learns embeddings for words describing events, a function to compose the embeddings into a representation of the event, and a coherence function to predict the strength of association between two events. We introduce a new development of the narrative cloze evaluation task, better suited to a setting where rich information about events is available. We compare models that learn vector-space representations of the events denoted by verbs in chains centering on a single protagonist. We find that recent work on learning vector-space embeddings to capture word meaning can be effectively applied to this task, including simple incorporation of a verb's arguments in the representation by vector addition. These representations provide a good initialization for learning the richer, compositional model of events with a neural network, vastly outperforming a number of baselines and competitive alternatives.

【Keywords】: event inference; neural networks; embeddings; narrative

380. A Representation Learning Framework for Multi-Source Transfer Parsing.

【Paper Link】【Pages】:2734-2740

【Authors】: Jiang Guo ; Wanxiang Che ; David Yarowsky ; Haifeng Wang ; Ting Liu

【Abstract】: Cross-lingual model transfer has been a promising approach for inducing dependency parsers for low-resource languages where annotated treebanks are not available. The major obstacles for the model transfer approach are two-fold: 1. Lexical features are not directly transferable across languages; 2. Target language-specific syntactic structures are difficult to recover. To address these two challenges, we present a novel representation learning framework for multi-source transfer parsing. Our framework allows multi-source transfer parsing using full lexical features straightforwardly. By evaluating on the Google universal dependency treebanks (v2.0), our best models yield an absolute improvement of 6.53% in averaged labeled attachment score, as compared with delexicalized multi-source transfer models. We also significantly outperform the state-of-the-art transfer system proposed most recently.

【Keywords】: Natural Language Processing; Representation Learning; Multilingual Learning; Dependency Parsing

381. Character-Aware Neural Language Models.

【Paper Link】【Pages】:2741-2749

【Authors】: Yoon Kim ; Yacine Jernite ; David Sontag ; Alexander M. Rush

【Abstract】: We describe a simple neural language model that relies only on character-level inputs. Predictions are still made at the word-level. Our model employs a convolutional neural network (CNN) and a highway network over characters, whose output is given to a long short-term memory (LSTM) recurrent neural network language model (RNN-LM). On the English Penn Treebank the model is on par with the existing state-of-the-art despite having 60% fewer parameters. On languages with rich morphology (Arabic, Czech, French, German, Spanish, Russian), the model outperforms word-level/morpheme-level LSTM baselines, again with fewer parameters. The results suggest that on many languages, character inputs are sufficient for language modeling. Analysis of word representations obtained from the character composition part of the model reveals that the model is able to encode, from characters only, both semantic and orthographic information.

【Keywords】:

382. Implicit Discourse Relation Classification via Multi-Task Neural Networks.

【Paper Link】【Pages】:2750-2756

【Authors】: Yang Liu ; Sujian Li ; Xiaodong Zhang ; Zhifang Sui

【Abstract】: Without discourse connectives, classifying implicit discourse relations is a challenging task and a bottleneck for building a practical discourse parser. Previous research usually makes use of one kind of discourse framework such as PDTB or RST to improve the classification performance on discourse relations. Actually, under different discourse annotation frameworks, there exist multiple corpora which have internal connections. To exploit the combination of different discourse corpora, we design related discourse classification tasks specific to a corpus, and propose a novel Convolutional Neural Network embedded multi-task learning system to synthesize these tasks by learning both unique and shared representations for each task. The experimental results on the PDTB implicit discourse relation classification task demonstrate that our model achieves significant gains over baseline systems.

【Keywords】: discourse parsing; multi-task learning; neural network

383. Convolution Kernels for Discriminative Learning from Streaming Text.

【Paper Link】【Pages】:2757-2763

【Authors】: Michal Lukasik ; Trevor Cohn

【Abstract】: Time series modeling is an important problem with many applications in different domains. Here we consider discriminative learning from time series, where we seek to predict an output response variable based on time series input. We develop a method based on convolution kernels to model discriminative learning over streams of text. Our method outperforms competitive baselines in three synthetic and two real datasets, rumour frequency modeling and popularity prediction tasks.

【Keywords】:

384. Numerical Relation Extraction with Minimal Supervision.

【Paper Link】【Pages】:2764-2771

【Authors】: Aman Madaan ; Ashish Mittal ; Mausam ; Ganesh Ramakrishnan ; Sunita Sarawagi

【Abstract】: We study a novel task of numerical relation extraction with the goal of extracting relations where one of the arguments is a number or a quantity (e.g., atomic_number(Aluminium, 13), inflation_rate(India, 10.9%)). This task presents peculiar challenges not found in standard IE, such as the difficulty of matching numbers in distant supervision and the importance of units. We design two extraction systems that require minimal human supervision per relation: (1) NumberRule, a rule based extractor, and (2) NumberTron, a probabilistic graphical model. We find that both systems dramatically outperform MultiR, a state-of-the-art non-numerical IE model, obtaining up to 25 points F-score improvement.

【Keywords】:

385. Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences.

【Paper Link】【Pages】:2772-2778

【Authors】: Hongyuan Mei ; Mohit Bansal ; Matthew R. Walter

【Abstract】: We propose a neural sequence-to-sequence model for direction following, a task that is essential to realizing effective autonomous agents. Our alignment-based encoder-decoder model with long short-term memory recurrent neural networks (LSTM-RNN) translates natural language instructions to action sequences based upon a representation of the observable world state. We introduce a multi-level aligner that empowers our model to focus on sentence "regions" salient to the current world state by using multiple abstractions of the input sentence. In contrast to existing methods, our model uses no specialized linguistic resources (e.g., parsers) or task-specific annotations (e.g., seed lexicons). It is therefore generalizable, yet still achieves the best results reported to-date on a benchmark single-sentence dataset and competitive results for the limited-training multi-sentence setting. We analyze our model through a series of ablations that elucidate the contributions of the primary components of our model.

【Keywords】: direction following; natural language processing; natural language semantics

386. Addressing a Question Answering Challenge by Combining Statistical Methods with Inductive Rule Learning and Reasoning.

【Paper Link】【Pages】:2779-2785

【Authors】: Arindam Mitra ; Chitta Baral

【Abstract】: A group of researchers from Facebook has recently proposed a set of 20 question-answering tasks (Facebook's bAbI dataset) as a challenge for the natural language understanding ability of an intelligent agent. These tasks are designed to measure various skills of an agent, such as: fact based question-answering, simple induction, the ability to find paths, co-reference resolution and many more. Their goal is to aid in the development of systems that can learn to solve such tasks and to allow a proper evaluation of such systems. They show existing systems cannot fully solve many of those toy tasks. In this work, we present a system that excels at all the tasks except one. The proposed model of the agent uses the Answer Set Programming (ASP) language as the primary knowledge representation and reasoning language along with the standard statistical Natural Language Processing (NLP) models. Given a training dataset containing a set of narrations, questions and their answers, the agent jointly uses a translation system, an Inductive Logic Programming algorithm and Statistical NLP methods to learn the knowledge needed to answer similar questions. Our results demonstrate that the introduction of a reasoning module significantly improves the performance of an intelligent agent.

【Keywords】: Question Answering; bAbI; Facebook Challenge; Natural Language Processing; Inductive Logic Programming

387. Siamese Recurrent Architectures for Learning Sentence Similarity.

【Paper Link】【Pages】:2786-2792

【Authors】: Jonas Mueller ; Aditya Thyagarajan

【Abstract】: We present a siamese adaptation of the Long Short-Term Memory (LSTM) network for labeled data comprised of pairs of variable-length sequences. Our model is applied to assess semantic similarity between sentences, where we exceed state of the art, outperforming carefully handcrafted features and recently proposed neural network systems of greater complexity. For these applications, we provide word-embedding vectors supplemented with synonymic information to the LSTMs, which use a fixed size vector to encode the underlying meaning expressed in a sentence (irrespective of the particular wording/syntax). By restricting subsequent operations to rely on a simple Manhattan metric, we compel the sentence representations learned by our model to form a highly structured space whose geometry reflects complex semantic relationships. Our results are the latest in a line of findings that showcase LSTMs as powerful language models capable of tasks requiring intricate understanding.

【Keywords】: neural network; semantic similarity; sentence representation; long short-term memory
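
Illustrative sketch (not taken from the paper): the abstract for entry 387 says the two sentence encodings are compared with a simple Manhattan metric. The snippet below shows one way such a comparison could look, assuming the two LSTM branches have already produced fixed-size vectors. The function name `manhattan_similarity` and the mapping of the distance to a score via `exp(-distance)` are assumptions made for this example, not details stated in the abstract.

```python
import math


def manhattan_similarity(h_left, h_right):
    """Map the L1 (Manhattan) distance between two vectors to a score in (0, 1].

    h_left and h_right stand in for the fixed-size sentence encodings
    produced by the two branches of a siamese network (hypothetical inputs).
    """
    if len(h_left) != len(h_right):
        raise ValueError("both encodings must have the same dimensionality")
    # Manhattan distance: sum of absolute coordinate differences.
    distance = sum(abs(a - b) for a, b in zip(h_left, h_right))
    # Assumed squashing: identical vectors score 1.0, distant ones approach 0.
    return math.exp(-distance)
```

With this choice, identical encodings receive a score of exactly 1.0, and the score decays smoothly as the encodings move apart.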

388. Text Matching as Image Recognition.

【Paper Link】【Pages】:2793-2799

【Authors】: Liang Pang ; Yanyan Lan ; Jiafeng Guo ; Jun Xu ; Shengxian Wan ; Xueqi Cheng

【Abstract】: Matching two texts is a fundamental problem in many natural language processing tasks. An effective way is to extract meaningful matching patterns from words, phrases, and sentences to produce the matching score. Inspired by the success of convolutional neural network in image recognition, where neurons can capture many complicated patterns based on the extracted elementary visual patterns such as oriented edges and corners, we propose to model text matching as the problem of image recognition. Firstly, a matching matrix whose entries represent the similarities between words is constructed and viewed as an image. Then a convolutional neural network is utilized to capture rich matching patterns in a layer-by-layer way. We show that by resembling the compositional hierarchies of patterns in image recognition, our model can successfully identify salient signals such as n-gram and n-term matchings. Experimental results demonstrate its superiority against the baselines.

【Keywords】:
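
Illustrative sketch (not taken from the paper): the abstract for entry 388 describes a matching matrix whose entries are word-to-word similarities and which is then treated as an image. The snippet below builds such a matrix under stated assumptions: cosine similarity is used as the word similarity, and `embeddings` is a hypothetical dictionary of toy word vectors. The convolutional network that would consume the matrix is omitted.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def matching_matrix(words_a, words_b, embeddings):
    """Return a len(words_a) x len(words_b) grid of word-word similarities.

    Row i, column j holds the similarity between the i-th word of the first
    text and the j-th word of the second text, so the grid can be viewed as
    a single-channel image.
    """
    return [
        [cosine(embeddings[wa], embeddings[wb]) for wb in words_b]
        for wa in words_a
    ]
```

For example, with toy vectors `{"cat": [1.0, 0.0], "dog": [0.0, 1.0]}`, matching `["cat", "dog"]` against `["cat"]` gives a 2 x 1 matrix whose first entry is 1.0 and whose second entry is 0.0.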

389. Learning Statistical Scripts with LSTM Recurrent Neural Networks.

【Paper Link】【Pages】:2800-2806

【Authors】: Karl Pichotta ; Raymond J. Mooney

【Abstract】: Scripts encode knowledge of prototypical sequences of events. We describe a Recurrent Neural Network model for statistical script learning using Long Short-Term Memory, an architecture which has been demonstrated to work well on a range of Artificial Intelligence tasks. We evaluate our system on two tasks, inferring held-out events from text and inferring novel events from text, substantially outperforming prior approaches on both tasks.

【Keywords】: Natural Language Processing; Deep Learning

390. Inferring Interpersonal Relations in Narrative Summaries.

【Paper Link】【Pages】:2807-2813

【Authors】: Shashank Srivastava ; Snigdha Chaturvedi ; Tom Mitchell

【Abstract】: Characterizing relationships between people is fundamental for the understanding of narratives. In this work, we address the problem of inferring the polarity of relationships between people in narrative summaries. We formulate the problem as a joint structured prediction for each narrative, and present a general model that combines evidence from linguistic and semantic features, as well as features based on the structure of the social community in the text. We additionally provide a clustering-based approach that can exploit regularities in narrative types, e.g., learning an affinity for love-triangles in romantic stories. On a dataset of movie summaries from Wikipedia, our structured models provide more than 30% error-reduction over a competitive baseline that considers pairs of characters in isolation.

【Keywords】: Structured Prediction; Relation classification; Computational Narrative Modeling; Structured Perceptron; Text mining

391. Evaluation of Semantic Dependency Labeling Across Domains.

【Paper Link】【Pages】:2814-2820

【Authors】: Svetlana Stoyanchev ; Amanda Stent ; Srinivas Bangalore

【Abstract】: One of the key concerns in computational semantics is to construct a domain independent semantic representation which captures the richness of natural language, yet can be quickly customized to a specific domain for practical applications. We propose to use generic semantic frames defined in FrameNet, a domain-independent semantic resource, as an intermediate semantic representation for language understanding in dialog systems. In this paper we: (a) outline a novel method for FrameNet-style semantic dependency labeling that builds on a syntactic dependency parse; and (b) compare the accuracy of domain-adapted and generic approaches to semantic parsing for dialog tasks, using a frame-annotated corpus of human-computer dialogs in an airline reservation domain.

【Keywords】: dialog; semantic parsing; semantic dependency

392. Inside Out: Two Jointly Predictive Models for Word Representations and Phrase Representations.

【Paper Link】【Pages】:2821-2827

【Authors】: Fei Sun ; Jiafeng Guo ; Yanyan Lan ; Jun Xu ; Xueqi Cheng

【Abstract】: The distributional hypothesis lies at the root of most existing word representation models by inferring word meaning from its external contexts. However, distributional models cannot handle rare and morphologically complex words very well and fail to identify some fine-grained linguistic regularities because they ignore the word forms. In contrast, morphology points out that words are built from some basic units, i.e., morphemes. Therefore, the meaning and function of such rare words can be inferred from the words sharing the same morphemes, and many syntactic relations can be directly identified based on the word forms. However, the limitation of morphology is that it cannot infer the relationship between two words that do not share any morphemes. Considering the advantages and limitations of both approaches, we propose two novel models to build better word representations by modeling both external contexts and internal morphemes in a jointly predictive way, called BEING and SEING. These two models can also be extended to learn phrase representations according to the distributed morphology theory. We evaluate the proposed models on similarity tasks and analogy tasks. The results demonstrate that the proposed models can outperform state-of-the-art models significantly on both word and phrase representation learning.

【Keywords】: Word Representation; Distributional Hypothesis; Morphology; Phrase Representation

393. Non-Linear Similarity Learning for Compositionality.

【Paper Link】【Pages】:2828-2834

【Authors】: Masashi Tsubaki ; Kevin Duh ; Masashi Shimbo ; Yuji Matsumoto

【Abstract】: Many NLP applications rely on the existence of similarity measures over text data. Although word vector space models provide good similarity measures between words, phrasal and sentential similarities derived from composition of individual words remain a difficult problem. In this paper, we propose a new method of non-linear similarity learning for semantic compositionality. In this method, word representations are learned through the similarity learning of sentences in a high-dimensional space with kernel functions. On the task of predicting the semantic similarity of two sentences (SemEval 2014, Task 1), our method outperforms linear baselines, feature engineering approaches, and recursive neural networks, and achieves competitive results with long short-term memory models.

【Keywords】:

394. A Deep Architecture for Semantic Matching with Multiple Positional Sentence Representations.

【Paper Link】【Pages】:2835-2841

【Authors】: Shengxian Wan ; Yanyan Lan ; Jiafeng Guo ; Jun Xu ; Liang Pang ; Xueqi Cheng

【Abstract】: Matching natural language sentences is central for many applications such as information retrieval and question answering. Existing deep models rely on a single sentence representation or multiple granularity representations for matching. However, such methods cannot well capture the contextualized local information in the matching process. To tackle this problem, we present a new deep architecture to match two sentences with multiple positional sentence representations. Specifically, each positional sentence representation is a sentence representation at this position, generated by a bidirectional long short term memory (Bi-LSTM). The matching score is finally produced by aggregating interactions between these different positional sentence representations, through k-Max pooling and a multi-layer perceptron. Our model has several advantages: (1) By using Bi-LSTM, rich context of the whole sentence is leveraged to capture the contextualized local information in each positional sentence representation; (2) By matching with multiple positional sentence representations, it is flexible to aggregate different important contextualized local information in a sentence to support the matching; (3) Experiments on different tasks such as question answering and sentence completion demonstrate the superiority of our model.

【Keywords】:

395. Morphological Segmentation with Window LSTM Neural Networks.

【Paper Link】【Pages】:2842-2848

【Authors】: Linlin Wang ; Zhu Cao ; Yu Xia ; Gerard de Melo

【Abstract】: Morphological segmentation, which aims to break words into meaning-bearing morphemes, is an important task in natural language processing. Most previous work relies heavily on linguistic preprocessing. In this paper, we instead propose novel neural network architectures that learn the structure of input sequences directly from raw input words and are subsequently able to predict morphological boundaries. Our architectures rely on Long Short Term Memory (LSTM) units to accomplish this, but exploit windows of characters to capture more contextual information. Experiments on multiple languages confirm the effectiveness of our models on this task.

【Keywords】: morphology; segmentation; LSTMs; recurrent neural network

396. Minimally-Constrained Multilingual Embeddings via Artificial Code-Switching.

【Paper Link】【Pages】:2849-2855

【Authors】: Michael Wick ; Pallika Kanani ; Adam Pocock

【Abstract】: We present a method that consumes a large corpus of multilingual text and produces a single, unified word embedding in which the word vectors generalize across languages. In contrast to current approaches that require language identification, our method is agnostic about the languages with which the documents in the corpus are expressed, and does not rely on parallel corpora to constrain the spaces. Instead we utilize a small set of human provided word translations, which are often freely and readily available. We can encode such word translations as hard constraints in the model's objective functions; however, we find that we can more naturally constrain the space by allowing words in one language to borrow distributional statistics from context words in another language. We achieve this via a process we term artificial code-switching. As the name suggests, we induce code-switching so that words across multiple languages appear in contexts together. Not only do embedding models trained on code-switched data learn common cross-lingual structure, the common structure allows an NLP model trained in a source language to generalize to multiple target languages (achieving up to 80% of the accuracy of models trained with target-language data).

【Keywords】: NLP; word embeddings; multilingual; sentiment analysis; artificial code switching

397. Syntactic Skeleton-Based Translation.

【Paper Link】【Pages】:2856-2862

【Authors】: Tong Xiao ; Jingbo Zhu ; Chunliang Zhang ; Tongran Liu

【Abstract】: In this paper we propose an approach to modeling the syntactically-motivated skeletal structure of a source sentence for machine translation. This model allows for application of high-level syntactic transfer rules and low-level non-syntactic rules. It thus involves fully syntactic, non-syntactic, and partially syntactic derivations via a single grammar and decoding paradigm. On large-scale Chinese-English and English-Chinese translation tasks, we obtain an average improvement of +0.9 BLEU across the newswire and web genres.

【Keywords】: Statistical Machine Translation; Syntax-Based Model

398. A Morphology-Aware Network for Morphological Disambiguation.

【Paper Link】【Pages】:2863-2869

【Authors】: Eray Yildiz ; Caglar Tirkaz ; H. Bahadir Sahin ; Mustafa Tolga Eren ; Omer Ozan Sonmez

【Abstract】: Agglutinative languages such as Turkish, Finnish and Hungarian require morphological disambiguation before further processing due to the complex morphology of words. A morphological disambiguator is used to select the correct morphological analysis of a word. Morphological disambiguation is important because it generally is one of the first steps of natural language processing and its performance affects subsequent analyses. In this paper, we propose a system that uses deep learning techniques for morphological disambiguation. Many of the state-of-the-art results in computer vision, speech recognition and natural language processing have been obtained through deep learning models. However, applying deep learning techniques to morphologically rich languages is not well studied. In this work, while we focus on Turkish morphological disambiguation we also present results for French and German in order to show that the proposed architecture achieves high accuracy with no language-specific feature engineering or additional resource. In the experiments, we achieve 84.12, 88.35 and 93.78 morphological disambiguation accuracy among the ambiguous words for Turkish, German and French respectively.

【Keywords】: morphological disambiguation; word embeddings; convolutional neural network; POS tagging; morphology tagging; lemmatization

399. Building Earth Mover's Distance on Bilingual Word Embeddings for Machine Translation.

【Paper Link】【Pages】:2870-2876

【Authors】: Meng Zhang ; Yang Liu ; Huan-Bo Luan ; Maosong Sun ; Tatsuya Izuha ; Jie Hao

【Abstract】: Following their monolingual counterparts, bilingual word embeddings are also on the rise. As a major application task, word translation has been relying on the nearest neighbor to connect embeddings cross-lingually. However, the nearest neighbor strategy suffers from its inherently local nature and fails to cope with variations in realistic bilingual word embeddings. Furthermore, it lacks a mechanism to deal with many-to-many mappings that often show up across languages. We introduce Earth Mover's Distance to this task by providing a natural formulation that translates words in a holistic fashion, addressing the limitations of the nearest neighbor. We further extend the formulation to a new task of identifying parallel sentences, which is useful for statistical machine translation systems, thereby expanding the application realm of bilingual word embeddings. We show encouraging performance on both tasks.

【Keywords】: Earth Mover's Distance; word translation; bilingual corpus filtering; machine translation

400. Semi-Supervised Multinomial Naive Bayes for Text Classification by Leveraging Word-Level Statistical Constraint.

【Paper Link】【Pages】:2877-2884

【Authors】: Li Zhao ; Minlie Huang ; Ziyu Yao ; Rongwei Su ; Yingying Jiang ; Xiaoyan Zhu

【Abstract】: Multinomial Naive Bayes with Expectation Maximization (MNB-EM) is a standard semi-supervised learning method to augment Multinomial Naive Bayes (MNB) for text classification. Despite its success, MNB-EM is not stable, and may succeed or fail to improve MNB. We believe that this is because MNB-EM lacks the ability to preserve the class distribution on words. In this paper, we propose a novel method to augment MNB-EM by leveraging the word-level statistical constraint to preserve the class distribution on words. The word-level statistical constraints are further converted to constraints on document posteriors generated by MNB-EM. Experiments demonstrate that our method can consistently improve MNB-EM, and outperforms state-of-art baselines remarkably.

【Keywords】: text classification; semi-supervised learning

Technical Papers: NLP and Text Mining 32

401. Labeling the Semantic Roles of Commas.

【Paper Link】【Pages】:2885-2891

【Authors】: Naveen Arivazhagan ; Christos Christodoulopoulos ; Dan Roth

【Abstract】: Commas and the surrounding sentence structure often express relations that are essential to understanding the meaning of the sentence. This paper proposes a set of relations commas participate in, expanding on previous work in this area, and develops a new dataset annotated with this set of labels. We identify features that are important to achieve a good performance on comma labeling and then develop a machine learning method that achieves high accuracy on identifying comma relations, improving over previous work. Finally, we discuss a variety of possible uses, both as syntactic and discourse-oriented features and constraints for downstream tasks.

【Keywords】: punctuation; semantics

【Paper Link】【Pages】:2892-2898

【Authors】: Adrian Benton ; Michael J. Paul ; Braden Hancock ; Mark Dredze

【Abstract】: This paper considers survey prediction from social media. We use topic models to correlate social media messages with survey outcomes and to provide an interpretable representation of the data. Rather than rely on fully unsupervised topic models, we use existing aggregated survey data to inform the inferred topics, a class of topic model supervision referred to as collective supervision. We introduce and explore a variety of topic model variants and provide an empirical analysis, with conclusions of the most effective models for this task.

【Keywords】: topic models; survey prediction; social media

403. Distant IE by Bootstrapping Using Lists and Document Structure.

【Paper Link】【Pages】:2899-2905

【Authors】: Lidong Bing ; Mingyang Ling ; Richard C. Wang ; William W. Cohen

【Abstract】: Distant labeling for information extraction (IE) suffers from noisy training data. We describe a way of reducing the noise associated with distant IE by identifying coupling constraints between potential instance labels. As one example of coupling, items in a list are likely to have the same label. A second example of coupling comes from analysis of document structure: in some corpora, sections can be identified such that items in the same section are likely to have the same label. Such sections do not exist in all corpora, but we show that augmenting a large corpus with coupling constraints from even a small, well-structured corpus can improve performance substantially, doubling F1 on one task.

【Keywords】: distant IE; label propagation; coordinate-term list; document structure; structured corpus

404. TGSum: Build Tweet Guided Multi-Document Summarization Dataset.

【Paper Link】【Pages】:2906-2912

【Authors】: Ziqiang Cao ; Chengyao Chen ; Wenjie Li ; Sujian Li ; Furu Wei ; Ming Zhou

【Abstract】: The development of summarization research has been significantly hampered by the costly acquisition of reference summaries. This paper proposes an effective way to automatically collect large scales of news-related multi-document summaries with reference to social media's reactions. We utilize two types of social labels in tweets, i.e., hashtags and hyper-links. Hashtags are used to cluster documents into different topic sets. Also, a tweet with a hyper-link often highlights certain key points of the corresponding document. We synthesize a linked document cluster to form a reference summary which can cover most key points. To this aim, we adopt the ROUGE metrics to measure the coverage ratio, and develop an Integer Linear Programming solution to discover the sentence set reaching the upper bound of ROUGE. Since we allow summary sentences to be selected from both documents and high-quality tweets, the generated reference summaries could be abstractive. Both informativeness and readability of the collected summaries are verified by manual judgment. In addition, we train a Support Vector Regression summarizer on DUC generic multi-document summarization benchmarks. With the collected data as extra training resource, the performance of the summarizer improves a lot on all the test sets. We release this dataset for further research.

【Keywords】: Tweet; Multi-document summarization; reference summary

405. Joint Inference over a Lightly Supervised Information Extraction Pipeline: Towards Event Coreference Resolution for Resource-Scarce Languages.

【Paper Link】【Pages】:2913-2920

【Authors】: Chen Chen ; Vincent Ng

【Abstract】: We address two key challenges in end-to-end event coreference resolution research: (1) the error propagation problem, where an event coreference resolver has to assume as input the noisy outputs produced by its upstream components in the standard information extraction (IE) pipeline; and (2) the data annotation bottleneck, where manually annotating data for all the components in the IE pipeline is prohibitively expensive. This is the case in the vast majority of the world's natural languages, where such annotated resources are not readily available. To address these problems, we propose to perform joint inference over a lightly supervised IE pipeline, where all the models are trained using either active learning or unsupervised learning. Using our approach, only 25% of the training sentences in the Chinese portion of the ACE 2005 corpus need to be annotated with entity and event mentions in order for our event coreference resolver to surpass its fully supervised counterpart in performance.

【Keywords】:

406. Discourse Relations Detection via a Mixed Generative-Discriminative Framework.

【Paper Link】【Pages】:2921-2927

【Authors】: Jifan Chen ; Qi Zhang ; Pengfei Liu ; Xuanjing Huang

【Abstract】: Word embeddings, which can better capture the fine-grained semantics of words, have proven to be useful for a variety of natural language processing tasks. However, because discourse structures describe the relationships between segments of discourse, word embeddings cannot be directly integrated to perform the task. In this paper, we introduce a mixed generative-discriminative framework, in which we use vector offsets between embeddings of words to represent the semantic relations between text segments and Fisher kernel framework to convert a variable number of vector offsets into a fixed length vector. In order to incorporate the weights of these offsets into the vector, we also propose the Weighted Fisher Vector. Experimental results on two different datasets show that the proposed method without using manually designed features can achieve better performance on recognizing the discourse level relations in most cases.

【Keywords】:

407. Age of Exposure: A Model of Word Learning.

【Paper Link】【Pages】:2928-2934

【Authors】: Mihai Dascalu ; Danielle S. McNamara ; Scott A. Crossley ; Stefan Trausan-Matu

【Abstract】: Textual complexity is widely used to assess the difficulty of reading materials and writing quality in student essays. At a lexical level, word complexity can represent a building block for creating a comprehensive model of lexical networks that adequately estimates learners' understanding. In order to best capture how lexical associations are created between related concepts, we propose automated indices of word complexity based on Age of Exposure (AoE). AoE indices computationally model the lexical learning process as a function of a learner's experience with language. This study describes a proof of concept based on a large-scale learning corpus (i.e., TASA). The results indicate that AoE indices yield strong associations with human ratings of age of acquisition, word frequency, entropy, and human lexical response latencies, providing evidence of convergent validity.

【Keywords】: word complexity; Latent Dirichlet Allocation; simulate word learning

408. Acquiring Knowledge of Affective Events from Blogs Using Label Propagation.

【Paper Link】【Pages】:2935-2942

【Authors】: Haibo Ding ; Ellen Riloff

【Abstract】: Many common events in our daily life affect us in positive and negative ways. For example, going on vacation is typically an enjoyable event, while being rushed to the hospital is an undesirable event. In narrative stories and personal conversations, recognizing that some events have a strong affective polarity is essential to understand the discourse and the emotional states of the affected people. However, current NLP systems mainly depend on sentiment analysis tools, which fail to recognize many events that are implicitly affective based on human knowledge about the event itself and cultural norms. Our goal is to automatically acquire knowledge of stereotypically positive and negative events from personal blogs. Our research creates an event context graph from a large collection of blog posts and uses a sentiment classifier and semi-supervised label propagation algorithm to discover affective events. We explore several graph configurations that propagate affective polarity across edges using local context, discourse proximity, and event-event co-occurrence. We then harvest highly affective events from the graph and evaluate the agreement of the polarities with human judgements.

【Keywords】: Natural Language Processing; Sentiment Analysis; Information Extraction

409. To Swap or Not to Swap? Exploiting Dependency Word Pairs for Reordering in Statistical Machine Translation.

【Paper Link】【Pages】:2943-2949

【Authors】: Christian Hadiwinoto ; Yang Liu ; Hwee Tou Ng

【Abstract】: Reordering poses a major challenge in machine translation (MT) between two languages with significant differences in word order. In this paper, we present a novel reordering approach utilizing sparse features based on dependency word pairs. Each instance of these features captures whether two words, which are related by a dependency link in the source sentence dependency parse tree, follow the same order or are swapped in the translation output. Experiments on Chinese-to-English translation show a statistically significant improvement of 1.21 BLEU points using our approach, compared to a state-of-the-art statistical MT system that incorporates prior reordering approaches.

【Keywords】: natural language processing; machine translation; reordering; dependency parse

410. Global Distant Supervision for Relation Extraction.

【Paper Link】【Pages】:2950-2956

【Authors】: Xianpei Han ; Le Sun

【Abstract】: Machine learning approaches to relation extraction are typically supervised and require expensive labeled data. To break the bottleneck of labeled data, a promising approach is to exploit easily obtained indirect supervision knowledge – which we usually refer to as distant supervision (DS). However, traditional DS methods mostly only exploit one specific kind of indirect supervision knowledge – the relations/facts in a given knowledge base, thus often suffer from the problem of lack of supervision. In this paper, we propose a global distant supervision model for relation extraction, which can: 1) compensate the lack of supervision with a wide variety of indirect supervision knowledge; and 2) reduce the uncertainty in DS by performing joint inference across relation instances. Experimental results show that, by exploiting the consistency between relation labels, the consistency between relations and arguments, and the consistency between neighbor instances using Markov logic, our method significantly outperforms traditional DS approaches.

【Keywords】: Relation Extraction; Distant Supervision; Markov Logic

411. Extracting Topical Phrases from Clinical Documents.

【Paper Link】【Pages】:2957-2963

【Authors】: Yulan He

【Abstract】: In clinical documents, medical terms are often expressed in multi-word phrases. Traditional topic modelling approaches relying on the "bag-of-words" assumption are not effective in extracting topic themes from clinical documents. This paper proposes to first extract medical phrases using an off-the-shelf tool for medical concept mention extraction, and then train a topic model which takes a hierarchy of Pitman-Yor processes as prior for modelling the generation of phrases of arbitrary length. Experimental results on patients' discharge summaries show that the proposed approach outperforms the state-of-the-art topical phrase extraction model on both perplexity and topic coherence measure and finds more interpretable topics.

【Keywords】: Topical phrase extraction, Latent Dirichlet Allocation, Hierarchical Pitman-Yor Process, clinical documents

【Paper Link】【Pages】:2964-2971

【Authors】: Ting Hua ; Yue Ning ; Feng Chen ; Chang-Tien Lu ; Naren Ramakrishnan

【Abstract】: The analysis of interactions between social media and traditional news streams is becoming increasingly relevant for a variety of applications, including: understanding the underlying factors that drive the evolution of data sources, tracking the triggers behind events, and discovering emerging trends. Researchers have explored such interactions by examining volume changes or information diffusions; however, most of them ignore the semantical and topical relationships between news and social media data. Our work is the first attempt to study how news influences social media, or inversely, based on topical knowledge. We propose a hierarchical Bayesian model that jointly models the news and social media topics and their interactions. We show that our proposed model can capture distinct topics for individual datasets as well as discover the topic influences among multiple datasets. By applying our model to large sets of news and tweets, we demonstrate its significant improvement over baseline methods and explore its power in the discovery of interesting patterns for real world cases.

【Keywords】: Twitter; topic model; news; data mining

【Paper Link】【Pages】:2972-2978

【Authors】: Zhiwei Jin ; Juan Cao ; Yongdong Zhang ; Jiebo Luo

【Abstract】: Fake news spreading in social media severely jeopardizes the veracity of online content. Fortunately, with the interactive and open features of microblogs, skeptical and opposing voices against fake news always arise along with it. The conflicting information, ignored by existing studies, is crucial for news verification. In this paper, we take advantage of this "wisdom of crowds" information to improve news verification by mining conflicting viewpoints in microblogs. First, we discover conflicting viewpoints in news tweets with a topic model method. Based on identified tweets' viewpoints, we then build a credibility propagation network of tweets linked with supporting or opposing relations. Finally, with iterative deduction, the credibility propagation on the network generates the final evaluation result for news. Experiments conducted on a real-world data set show that the news verification performance of our approach significantly outperforms those of the baseline approaches.

【Keywords】:

414. Argument Mining from Speech: Detecting Claims in Political Debates.

【Paper Link】【Pages】:2979-2985

【Authors】: Marco Lippi ; Paolo Torroni

【Abstract】: The automatic extraction of arguments from text, also known as argument mining, has recently become a hot topic in artificial intelligence. Current research has only focused on linguistic analysis. However, in many domains where communication may be also vocal or visual, paralinguistic features too may contribute to the transmission of the message that arguments intend to convey. For example, in political debates a crucial role is played by speech. The research question we address in this work is whether in such domains one can improve claim detection for argument mining, by employing features from text and speech in combination. To explore this hypothesis, we develop a machine learning classifier and train it on an original dataset based on the 2015 UK political elections debate.

【Keywords】: Argumentation mining, claim detection, argumentation, speech processing, political debate, natural language processing, paralinguistic feature analysis

415. Improving Opinion Aspect Extraction Using Semantic Similarity and Aspect Associations.

【Paper Link】【Pages】:2986-2992

【Authors】: Qian Liu ; Bing Liu ; Yuanlin Zhang ; Doo Soon Kim ; Zhiqiang Gao

【Abstract】: Aspect extraction is a key task of fine-grained opinion mining. Although it has been studied by many researchers, it remains highly challenging. This paper proposes a novel unsupervised approach to make a major improvement. The approach is based on the framework of lifelong learning and is implemented with two forms of recommendations that are based on semantic similarity and aspect associations respectively. Experimental results using eight review datasets show the effectiveness of the proposed approach.

【Keywords】: Aspect extraction; Opinion Mining; Aspect recommendation

416. A Probabilistic Soft Logic Based Approach to Exploiting Latent and Global Information in Event Classification.

【Paper Link】【Pages】:2993-2999

【Authors】: Shulin Liu ; Kang Liu ; Shizhu He ; Jun Zhao

【Abstract】: Global information such as event-event association, and latent local information such as fine-grained entity types, are crucial to event classification. However, existing methods typically focus on sophisticated local features such as part-of-speech tags, either fully or partially ignoring the aforementioned information. By contrast, this paper focuses on fully employing them for event classification. We notice that it is difficult for previous methods to encode some global information such as event-event association. To resolve this problem, we propose a feasible approach which encodes global information in the form of logic using a Probabilistic Soft Logic model. Experimental results show that our proposed approach advances state-of-the-art methods, and achieves the best F1 score to date on the ACE data set.

【Keywords】: Event Classification; Information Extraction; Event Extraction

417. Reading the Videos: Temporal Labeling for Crowdsourced Time-Sync Videos Based on Semantic Embedding.

【Paper Link】【Pages】:3000-3006

【Authors】: Guangyi Lv ; Tong Xu ; Enhong Chen ; Qi Liu ; Yi Zheng

【Abstract】: Recent years have witnessed the boom of online sharing media contents, which raise significant challenges in effective management and retrieval. Though a large amount of efforts have been made, precise retrieval on video shots with certain topics has been largely ignored. At the same time, due to the popularity of novel time-sync comments, or so-called "bullet-screen comments", video semantics could be now combined with timestamps to support further research on temporal video labeling. In this paper, we propose a novel video understanding framework to assign temporal labels on highlighted video shots. To be specific, due to the informal expression of bullet-screen comments, we first propose a temporal deep structured semantic model (T-DSSM) to represent comments into semantic vectors by taking advantage of their temporal correlation. Then, video highlights are recognized and labeled via semantic vectors in a supervised way. Extensive experiments on a real-world dataset prove that our framework could effectively label video highlights with a significant margin compared with baselines, which clearly validates the potential of our framework on video understanding, as well as bullet-screen comments interpretation.

【Keywords】: bullet-screen comment, temporal labeling, semantic embedding

418. Joint Word Segmentation, POS-Tagging and Syntactic Chunking.

【Paper Link】【Pages】:3007-3014

【Authors】: Chen Lyu ; Yue Zhang ; Donghong Ji

【Abstract】: Chinese chunking has traditionally been solved by assuming gold standard word segmentation. We find that the accuracies drop drastically when automatic segmentation is used. Inspired by the fact that chunking knowledge can potentially improve segmentation, we explore a joint model that performs segmentation, POS-tagging and chunking simultaneously. In addition, to address the sparsity of full chunk features, we employ a semi-supervised method to derive chunk cluster features from large-scale automatically-chunked data. Results show the effectiveness of the joint model with semi-supervised features.

【Keywords】: joint model; semi-supervised method; Chinese syntactic chunking

419. Microsummarization of Online Reviews: An Experimental Study.

【Paper Link】【Pages】:3015-3021

【Authors】: Rebecca Mason ; Benjamin Gaska ; Benjamin Van Durme ; Pallavi Choudhury ; Ted Hart ; Bill Dolan ; Kristina Toutanova ; Margaret Mitchell

【Abstract】: Mobile and location-based social media applications provide platforms for users to share brief opinions about products, venues, and services. These quickly typed opinions, or microreviews, are a valuable source of current sentiment on a wide variety of subjects. However, there is currently little research on how to mine this information to present it back to users in an easily consumable way. In this paper, we introduce the task of microsummarization, which combines sentiment analysis, summarization, and entity recognition in order to surface key content to users. We explore unsupervised and supervised methods for this task, and find we can reliably extract relevant entities and the sentiment targeted towards them using crowdsourced labels as supervision. In an end-to-end evaluation, we find our best-performing system is vastly preferred by judges over a traditional extractive summarization approach. This work motivates an entirely new approach to summarization, incorporating both sentiment analysis and item extraction for modernized, at-a-glance presentation of public opinion.

【Keywords】: summarization; entity recognition; sentiment analysis; opinion mining; recommendations

420. A Semi-Supervised Learning Approach to Why-Question Answering.

【Paper Link】【Pages】:3022-3029

【Authors】: Jong-Hoon Oh ; Kentaro Torisawa ; Chikara Hashimoto ; Ryu Iida ; Masahiro Tanaka ; Julien Kloetzer

【Abstract】: We propose a semi-supervised learning method for improving why-question answering (why-QA). The key of our method is to generate training data (question-answer pairs) from causal relations in texts such as "Tsunamis are generated because the ocean's water mass is displaced by an earthquake." A naive method for the generation would be to make a question-answer pair by simply converting the effect part of the causal relations into a why-question, like "Why are tsunamis generated?" from the above example, and using the source text of the causal relations as an answer. However, in our preliminary experiments, this naive method actually failed to improve the why-QA performance. The main reason was that the machine-generated questions were often incomprehensible like "Why does (it) happen?", and that the system suffered from overfitting to the results of our automatic causality recognizer. Hence, we developed a novel method that effectively filters out incomprehensible questions and retrieves from texts answers that are likely to be paraphrases of a given causal relation. Through a series of experiments, we showed that our approach significantly improved the precision of the top answer by 8% over the current state-of-the-art system for Japanese why-QA.

【Keywords】: Why-Question Answering; Question Answering; Semi-Supervised Learning; Causal Relation

421. Discovering User Attribute Stylistic Differences via Paraphrasing.

【Paper Link】【Pages】:3030-3037

【Authors】: Daniel Preotiuc-Pietro ; Wei Xu ; Lyle H. Ungar

【Abstract】: User attribute prediction from social media text has proven successful and useful for downstream tasks. In previous studies, differences in user trait language use have been limited primarily to the presence or absence of words that indicate topical preferences. In this study, we aim to find linguistic style distinctions across three different user attributes: gender, age and occupational class. By combining paraphrases with a simple yet effective method, we capture a wide set of stylistic differences that are exempt from topic bias. We show their predictive power in user profiling, conformity with human perception and psycholinguistic hypotheses, and potential use in generating natural language tailored to specific user traits.

【Keywords】: User traits; Paraphrases; User profiling; User attributes; Stylistic Differences; Natural Language Processing; Psycholinguistics; Text mining

422. Improving Twitter Sentiment Classification Using Topic-Enriched Multi-Prototype Word Embeddings.

【Paper Link】【Pages】:3038-3044

【Authors】: Yafeng Ren ; Yue Zhang ; Meishan Zhang ; Donghong Ji

【Abstract】: It has been shown that learning distributed word representations is highly useful for Twitter sentiment classification. Most existing models rely on a single distributed representation for each word. This is problematic for sentiment classification because words are often polysemous and each word can contain different sentiment polarities under different topics. We address this issue by learning topic-enriched multi-prototype word embeddings (TMWE). In particular, we develop two neural networks which 1) learn word embeddings that better capture tweet context by incorporating topic information, and 2) learn topic-enriched multiple prototype embeddings for each word. Experiments on Twitter sentiment benchmark datasets in SemEval 2013 show that TMWE outperforms the top system with hand-crafted features, and the current best neural network model.

【Keywords】: word embeddings; neural network; sentiment classification; topic information

423. Temporal Topic Analysis with Endogenous and Exogenous Processes.

【Paper Link】【Pages】:3045-3051

【Authors】: Baiyang Wang ; Diego Klabjan

【Abstract】: We consider the problem of modeling temporal textual data taking endogenous and exogenous processes into account. Such text documents arise in real world applications, including job advertisements and economic news articles, which are influenced by the fluctuations of the general economy. We propose a hierarchical Bayesian topic model which imposes a "group-correlated" hierarchical structure on the evolution of topics over time incorporating both processes, and show that this model can be estimated from Markov chain Monte Carlo sampling methods. We further demonstrate that this model captures the intrinsic relationships between the topic distribution and the time-dependent factors, and compare its performance with latent Dirichlet allocation (LDA) and two other related models. The model is applied to two collections of documents to illustrate its empirical performance: online job advertisements from DirectEmployers Association and journalists' postings on BusinessInsider.com.

【Keywords】: topic modeling; hierarchical models; temporal models

【Paper Link】【Pages】:3052-3058

【Authors】: Shuai Wang ; Zhiyuan Chen ; Bing Liu ; Sherry Emery

【Abstract】: In almost any application of social media analysis, the user is interested in studying a particular topic or research question. Collecting posts or messages relevant to the topic from a social media source is a necessary step. Due to the huge size of social media sources (e.g., Twitter and Facebook), one has to use some topic keywords to search for possibly relevant posts. However, gathering a good set of keywords is a very tedious and time-consuming task. It often involves a lengthy iterative process of searching and manual reading. In this paper, we propose a novel technique to help the user identify topical search keywords. Our experiments are carried out on identifying such keywords for five (5) real-life application topics to be used for searching relevant tweets from the Twitter API. The results show that the proposed method is highly effective.

【Keywords】: Search keywords; Topic Keyword Mining; Social Media Tracking

425. Personalized Microblog Sentiment Classification via Multi-Task Learning.

【Paper Link】【Pages】:3059-3065

【Authors】: Fangzhao Wu ; Yongfeng Huang

【Abstract】: Microblog sentiment classification is an interesting and important research topic with wide applications. Traditional microblog sentiment classification methods usually use a single model to classify the messages from different users and omit individuality. However, microblogging users frequently embed their personal character, opinion bias and language habits into their messages, and the same word may convey different sentiments in messages posted by different users. In this paper, we propose a personalized approach for microblog sentiment classification. In our approach, each user has a personalized sentiment classifier, which is decomposed into two components, a global one and a user-specific one. Our approach can capture the individual personality and at the same time leverage the common sentiment knowledge shared by all users. The personalized sentiment classifiers of massive users are trained in a collaborative way based on multi-task learning to handle the data sparseness problem. In addition, we incorporate users' social relations into our model to strengthen the learning of the personalized models. Moreover, we propose a distributed optimization algorithm to solve our model in parallel. Experiments on two real-world microblog sentiment datasets validate that our approach can improve microblog sentiment classification accuracy effectively and efficiently.

【Keywords】: Sentiment Classification; Microblog; Personalization

426. Improving Recommendation of Tail Tags for Questions in Community Question Answering.

【Paper Link】【Pages】:3066-3072

【Authors】: Yu Wu ; Wei Wu ; Zhoujun Li ; Ming Zhou

【Abstract】: We study tag recommendation for questions in community question answering (CQA). Tags, which represent the semantic summarization of questions, are useful for navigation and expert finding in CQA and can facilitate content consumption such as searching and mining in these web sites. The task is challenging, as both questions and tags are short and a large fraction of tags are tail tags which occur very infrequently. To solve these problems, we propose matching questions and tags not only by themselves, but also by similar questions and similar tags. The idea is then formalized as a model in which we calculate question-tag similarity using a linear combination of similarity with similar questions and tags weighted by tag importance. Question similarity, tag similarity, and tag importance are learned in a supervised random walk framework by fusing multiple features. Our model thus can not only accurately identify question-tag similarity for head tags, but also improve the accuracy of recommendation of tail tags. Experimental results show that the proposed method significantly outperforms state-of-the-art methods on tag recommendation for questions. Particularly, it improves tail tag recommendation accuracy by a large margin.

【Keywords】: supervised random walk; tag recommendation; tail tag

427. Exploring Multiple Feature Spaces for Novel Entity Discovery.

【Paper Link】【Pages】:3073-3079

【Authors】: Zhaohui Wu ; Yang Song ; C. Lee Giles

【Abstract】: Continuously discovering novel entities in news and Web data is important for Knowledge Base (KB) maintenance. One of the key challenges is to decide whether an entity mention refers to an in-KB or out-of-KB entity. We propose a principled approach that learns a novel entity classifier by modeling mention and entity representation into multiple feature spaces, including contextual, topical, lexical, neural embedding and query spaces. Different from most previous studies that address novel entity discovery as a submodule of entity linking systems, our model is a more general approach and can be applied as a pre-filtering step for novel entities in any entity linking system. Experiments on three real-world datasets show that our method significantly outperforms existing methods on identifying novel entities.

【Keywords】: Novel Entity Discovery; Entity Linking; Entity Modeling

428. Tweet Timeline Generation with Determinantal Point Processes.

【Paper Link】【Pages】:3080-3086

【Authors】: Jin-ge Yao ; Feifan Fan ; Wayne Xin Zhao ; Xiaojun Wan ; Edward Y. Chang ; Jianguo Xiao

【Abstract】: The task of tweet timeline generation (TTG) aims at selecting a small set of representative tweets to generate a meaningful timeline and providing enough coverage for a given topical query. This paper presents an approach based on determinantal point processes (DPPs) by jointly modeling the topical relevance of each selected tweet and overall selectional diversity. Aiming at better treatment for balancing relevance and diversity, we introduce two novel strategies, namely spectral rescaling and topical prior. Extensive experiments on the public TREC 2014 dataset demonstrate that our proposed DPP model along with the two strategies can achieve fairly competitive results against the state-of-the-art TTG systems.

【Keywords】:

429. Gated Neural Networks for Targeted Sentiment Analysis.

【Paper Link】【Pages】:3087-3093

【Authors】: Meishan Zhang ; Yue Zhang ; Duy-Tin Vo

【Abstract】: Targeted sentiment analysis classifies the sentiment polarity towards each target entity mention in given text documents. Seminal methods extract manual discrete features from automatic syntactic parse trees in order to capture semantic information of the enclosing sentence with respect to a target entity mention. Recently, it has been shown that competitive accuracies can be achieved without using syntactic parsers, which can be highly inaccurate on noisy text such as tweets. This is achieved by applying distributed word representations and rich neural pooling functions over a simple and intuitive segmentation of tweets according to target entity mentions. In this paper, we extend this idea by proposing a sentence-level neural model to address the limitation of pooling functions, which do not explicitly model tweet-level semantics. First, a bi-directional gated neural network is used to connect the words in a tweet so that pooling functions can be applied over the hidden layer instead of words for better representing the target and its contexts. Second, a three-way gated neural network structure is used to model the interaction between the target mention and its surrounding contexts. Experiments show that our proposed model gives significantly higher accuracies compared to the current best method for targeted sentiment analysis.

【Keywords】: Targeted sentiment analysis; neural network

430. A Joint Model for Question Answering over Multiple Knowledge Bases.

【Paper Link】【Pages】:3094-3100

【Authors】: Yuanzhe Zhang ; Shizhu He ; Kang Liu ; Jun Zhao

【Abstract】: As the number of knowledge bases (KBs) grows rapidly, the problem of question answering (QA) over multiple KBs has drawn more attention. The most significant distinction between multiple KB-QA and single KB-QA is that the former must consider the alignments between KBs. The pipeline strategy first constructs the alignments independently, and then uses the obtained alignments to construct queries. However, alignment construction is not a trivial task, and the noise it introduces would be passed on to query construction. By contrast, we notice that alignment construction and query construction are interactive steps, and jointly considering them would be beneficial. To this end, we present a novel joint model based on integer linear programming (ILP), uniting these two procedures into a uniform framework. The experimental results demonstrate that the proposed approach outperforms state-of-the-art systems, and is able to improve the performance of both alignment construction and query construction.

【Keywords】:

431. A Joint Model for Entity Set Expansion and Attribute Extraction from Web Search Queries.

【Paper Link】【Pages】:3101-3107

【Authors】: Zhenzhong Zhang ; Le Sun ; Xianpei Han

【Abstract】: Entity Set Expansion (ESE) and Attribute Extraction (AE) are usually treated as two separate tasks in Information Extraction (IE). However, the two tasks are tightly coupled, and each task can benefit significantly from the other by leveraging the inherent relationship between entities and attributes. That is, 1) an attribute is important if it is shared by many typical entities of a class; 2) an entity is typical if it owns many important attributes of a class. Based on this observation, we propose a joint model for ESE and AE, which models the inherent relationship between entities and attributes as a graph. Then a graph reinforcement algorithm is proposed to jointly mine entities and attributes of a specific class. Experimental results demonstrate the superiority of our method for discovering both new entities and new attributes.

【Keywords】: Entity Set Expansion; Attribute Extraction
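
The two coupling rules stated in the abstract above (an attribute is important if many typical entities share it; an entity is typical if it owns many important attributes) can be illustrated with a generic mutual-reinforcement iteration on a bipartite graph. This is only a minimal sketch in the spirit of HITS-style score propagation, not the paper's graph reinforcement algorithm; the function name `mutual_reinforcement` and the toy incidence matrix are assumptions made for illustration.

```python
import numpy as np

def mutual_reinforcement(M, iters=50):
    """Hypothetical helper: M[i, j] > 0 means entity i has attribute j.

    Returns (entity_scores, attribute_scores), each normalised to unit length.
    """
    M = np.asarray(M, dtype=float)
    ent = np.ones(M.shape[0])
    att = np.ones(M.shape[1])
    for _ in range(iters):
        # An attribute gains weight from the typical entities that own it.
        att = M.T @ ent
        att /= np.linalg.norm(att)
        # An entity gains weight from the important attributes it owns.
        ent = M @ att
        ent /= np.linalg.norm(ent)
    return ent, att

# Toy data (assumed): entities 0 and 1 share attributes 0 and 1,
# entity 2 has attribute 0 and the rare attribute 2.
M = [[1, 1, 0],
     [1, 1, 0],
     [1, 0, 1]]
ent, att = mutual_reinforcement(M)
```

In this toy graph the attribute shared by every entity ends up with the largest score, and the entity holding the rare attribute ranks below the other two.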

432. Aggregating Inter-Sentence Information to Enhance Relation Extraction.

【Paper Link】【Pages】:3108-3115

【Authors】: Hao Zheng ; Zhoujun Li ; Senzhang Wang ; Zhao Yan ; Jianshe Zhou

【Abstract】: Previous work for relation extraction from free text is mainly based on intra-sentence information. As relations might be mentioned across sentences, inter-sentence information can be leveraged to improve distantly supervised relation extraction. To effectively exploit inter-sentence information, we propose a ranking based approach, which first learns a scoring function based on a listwise learning-to-rank model and then uses it for multi-label relation extraction. Experimental results verify the effectiveness of our method for aggregating information across sentences. Additionally, to further improve the ranking of high-quality extractions, we propose an effective method to rank relations from different entity pairs. This method can be easily integrated into our overall relation extraction framework, and boosts the precision significantly.

【Keywords】: relation extraction; learning to rank; aggregating inter-sentence information

Technical Papers: Planning and Scheduling 14

433. Dynamic Controllability of Disjunctive Temporal Networks: Validation and Synthesis of Executable Strategies.

【Paper Link】【Pages】:3116-3122

【Authors】: Alessandro Cimatti ; Andrea Micheli ; Marco Roveri

【Abstract】: The Temporal Network with Uncertainty (TNU) modeling framework is used to represent temporal knowledge in presence of qualitative temporal uncertainty. Dynamic Controllability (DC) is the problem of deciding the existence of a strategy for scheduling the controllable time points of the network observing past happenings only. In this paper, we address the DC problem for a very general class of TNU, namely Disjunctive Temporal Network with Uncertainty. We make the following contributions. First, we define strategies in the form of an executable language; second, we propose the first decision procedure to check whether a given strategy is a solution for the DC problem; third we present an efficient algorithm for strategy synthesis based on techniques derived from Timed Games and Satisfiability Modulo Theory. The experimental evaluation shows that the approach is superior to the state-of-the-art.

【Keywords】: Dynamic Controllability; Disjunctive Temporal Networks with Uncertainty; Strategy Synthesis; Temporal Reasoning

434. Truncated Approximate Dynamic Programming with Task-Dependent Terminal Value.

【Paper Link】【Pages】:3123-3129

【Authors】: Amir Massoud Farahmand ; Daniel Nikolaev Nikovski ; Yuji Igarashi ; Hiroki Konaka

【Abstract】: We propose a new class of computationally fast algorithms to find a close-to-optimal policy for Markov Decision Processes (MDP) with large finite horizon T. The main idea is that instead of planning until the time horizon T, we plan only up to a truncated horizon H << T and use an estimate of the true optimal value function as the terminal value. Our approach of finding the terminal value function is to learn a mapping from an MDP to its value function by solving many similar MDPs during a training phase and fitting a regression estimator. We analyze the method by providing an error propagation theorem that shows the effect of various sources of errors on the quality of the solution. We also empirically validate this approach in a real-world application of designing an energy management system for Hybrid Electric Vehicles with promising results.

【Keywords】: Reinforcement Learning; Approximate Dynamic Programming; Multi-task Learning; Hybrid Electric Vehicles
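
The truncation idea in the abstract above can be sketched as backward induction over only H steps, starting from a supplied terminal value instead of zero. This is an illustrative sketch under stated assumptions: a small stationary, undiscounted, tabular MDP, with the terminal value passed in by the caller (the paper instead learns it by regression over similar MDPs). The function name and the toy arrays `P` and `R` are hypothetical.

```python
import numpy as np

def truncated_backward_induction(P, R, H, terminal_value):
    """Plan for H steps from a given terminal value (hypothetical helper).

    P: (A, S, S) transition probabilities, R: (A, S) rewards,
    terminal_value: (S,) estimate of the value-to-go after step H.
    Returns the value at the first step and a list of H greedy decision rules.
    """
    V = np.asarray(terminal_value, dtype=float)
    rules = []
    for _ in range(H):
        Q = R + P @ V                  # (A, S): one Bellman backup
        rules.append(Q.argmax(axis=0))
        V = Q.max(axis=0)
    rules.reverse()                    # rules[0] applies at the first step
    return V, rules

# Toy MDP (assumed values): 2 actions, 2 states.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.6, 0.4]]])
R = np.array([[1.0, 0.0],
              [0.5, 2.0]])
T, H = 10, 3

V_full, _ = truncated_backward_induction(P, R, T, np.zeros(2))
# With the exact value-to-go as terminal value, H steps reproduce the full plan.
V_tail, _ = truncated_backward_induction(P, R, T - H, np.zeros(2))
V_trunc, _ = truncated_backward_induction(P, R, H, V_tail)
```

When the terminal value is exact, the truncated plan matches the full-horizon one; with a learned estimate the gap would depend on the estimation error, which is what the abstract's error propagation theorem addresses.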

435. General Error Bounds in Heuristic Search Algorithms for Stochastic Shortest Path Problems.

【Paper Link】【Pages】:3130-3137

【Authors】: Eric A. Hansen ; Ibrahim Abdoulahi

【Abstract】: We consider recently-derived error bounds that can be used to bound the quality of solutions found by heuristic search algorithms for stochastic shortest path problems. In their original form, the bounds can only be used for problems with positive action costs. We show how to generalize the bounds so that they can be used in solving any stochastic shortest path problem, regardless of cost structure. In addition, we introduce a simple new heuristic search algorithm that performs as well as or better than previous algorithms for this class of problems, while being easier to implement and analyze.

【Keywords】:

436. Solving Risk-Sensitive POMDPs With and Without Cost Observations.

【Paper Link】【Pages】:3138-3144

【Authors】: Ping Hou ; William Yeoh ; Pradeep Varakantham

【Abstract】: Partially Observable Markov Decision Processes (POMDPs) are often used to model planning problems under uncertainty. The goal in Risk-Sensitive POMDPs (RS-POMDPs) is to find a policy that maximizes the probability that the cumulative cost is within some user-defined cost threshold. In this paper, unlike existing POMDP literature, we distinguish between the two cases of whether costs can or cannot be observed and show the empirical impact of cost observations. We also introduce a new search-based algorithm to solve RS-POMDPs and show that it is faster and more scalable than existing approaches in two synthetic domains and a taxi domain generated with real-world data.

【Keywords】: POMDP; Risk-Sensitive; Utility Theory

437. Randomised Procedures for Initialising and Switching Actions in Policy Iteration.

【Paper Link】【Pages】:3145-3151

【Authors】: Shivaram Kalyanakrishnan ; Neeldhara Misra ; Aditya Gopalan

【Abstract】: Policy Iteration (PI) (Howard 1960) is a classical method for computing an optimal policy for a finite Markov Decision Problem (MDP). The method is conceptually simple: starting from some initial policy, “policy improvement” is repeatedly performed to obtain progressively dominating policies, until eventually, an optimal policy is reached. Being remarkably efficient in practice, PI is often favoured over alternative approaches such as Value Iteration and Linear Programming. Unfortunately, even after several decades of study, theoretical bounds on the complexity of PI remain unsatisfactory. For an MDP with $n$ states and $k$ actions, Mansour and Singh (1999) bound the number of iterations taken by Howard’s PI, the canonical variant of the method, by $O(k^n / n)$. This bound merely improves upon the trivial bound of $k^n$ by a linear factor. However, a randomised variant of PI introduced by Mansour and Singh (1999) does yield an exponential improvement, with its expected number of iterations bounded by $O(((1 + 2/\log_2(k)) k/2)^n)$. With the objective of furnishing improved upper bounds for PI, we introduce two randomised procedures in this paper. Our first contribution is a routine to find a good initial policy for PI. After evaluating a number of randomly generated policies, this procedure applies a novel criterion to pick one to initialise PI. When PI is subsequently applied, we show that the expected number of policy evaluations (including both the initialisation and the improvement stages) remains bounded in expectation by $O(k^{n/2})$. The key construction employed in this routine is a total order on the set of policies. Our second contribution is a randomised action-switching rule for PI, which admits a bound of $O((2 + \ln(k - 1))^n)$ on the expected number of iterations. To the best of our knowledge, this is the tightest complexity bound known for PI when $k \geq 3$.

【Keywords】:
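
For readers unfamiliar with the baseline being analysed above, the following is a minimal sketch of Howard's Policy Iteration on a tabular MDP: evaluate the current policy exactly, then switch to a greedy action in every improvable state. It assumes a discounted MDP with discount `gamma` (an assumption made here so that evaluation is a simple linear solve) and does not implement the paper's randomised initialisation or switching rules. All names and the toy arrays are hypothetical.

```python
import numpy as np

def evaluate(P, R, policy, gamma):
    """Exact policy evaluation: solve (I - gamma * P_pi) V = R_pi."""
    S = R.shape[1]
    idx = np.arange(S)
    P_pi = P[policy, idx]              # (S, S): row s is P[policy[s], s, :]
    R_pi = R[policy, idx]              # (S,)
    return np.linalg.solve(np.eye(S) - gamma * P_pi, R_pi)

def howard_policy_iteration(P, R, gamma=0.9):
    """P: (A, S, S) transitions, R: (A, S) rewards. Returns (policy, V, iterations)."""
    S = R.shape[1]
    idx = np.arange(S)
    policy = np.zeros(S, dtype=int)    # arbitrary initial policy
    iterations = 0
    while True:
        V = evaluate(P, R, policy, gamma)
        Q = R + gamma * (P @ V)        # (A, S) action values under V
        # A state is improvable if some action strictly beats the current one.
        improvable = Q.max(axis=0) > Q[policy, idx] + 1e-10
        if not improvable.any():
            return policy, V, iterations
        # Howard's rule: switch every improvable state to a greedy action.
        policy = np.where(improvable, Q.argmax(axis=0), policy)
        iterations += 1

# Toy MDP (assumed values): k = 2 actions, n = 2 states.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.6, 0.4]]])
R = np.array([[1.0, 0.0],
              [0.5, 2.0]])
policy, V, iterations = howard_policy_iteration(P, R, gamma=0.9)
```

Because each switch strictly improves the policy, no policy is visited twice, which gives the trivial bound of $k^n$ iterations mentioned in the abstract.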

438. Goal Recognition Design with Non-Observable Actions.

【Paper Link】【Pages】:3152-3158

【Authors】: Sarah Keren ; Avigdor Gal ; Erez Karpas

【Abstract】: Goal recognition design involves the offline analysis of goal recognition models by formulating measures that assess the ability to perform goal recognition within a model and finding efficient ways to compute and optimize them. In this work we relax the full observability assumption of earlier work by offering a new generalized model for goal recognition design with non-observable actions. A model with partial observability is relevant to goal recognition applications such as assisted cognition and security, which suffer from reduced observability due to sensor malfunction or lack of sufficient budget. In particular we define a worst case distinctiveness (wcd) measure that represents the maximal number of steps an agent can take in a system before the observed portion of his trajectory reveals his objective. We present a method for calculating wcd based on a novel compilation to classical planning and propose a method to improve the design using sensor placement. Our empirical evaluation shows that the proposed solutions effectively compute and improve wcd.

【Keywords】: Goal Recognition Design; Goal Recognition; Intention Detection; Partial Observability; Compilation to classical planning

439. Computing Contingent Plans Using Online Replanning.

【Paper Link】【Pages】:3159-3165

【Authors】: Radimir Komarnitsky ; Guy Shani

【Abstract】: In contingent planning under partial observability with sensing actions, agents actively use sensing to discover meaningful facts about the world. For this class of problems the solution can be represented as a plan tree, branching on various possible observations. Recent successful approaches translate the partially observable contingent problem into a non-deterministic fully observable problem, and then use a planner for non-deterministic planning. While this approach has been successful in many domains, the translation may become very large, encumbering the task of the non-deterministic planner. In this paper we suggest a different approach - using an online contingent solver repeatedly to construct a plan tree. We execute the plan returned by the online solver until the next observation action, and then branch on the possible observed values, and replan for every branch independently. In many cases a plan tree can be exponential in the number of state variables, but still, the tree has a structure that allows us to compactly represent it using a directed graph. We suggest a mechanism for tailoring such a graph that reduces both the computational effort and the storage space. Furthermore, unlike recent state of the art offline planners, our approach is not bounded to a specific class of contingent problems, such as limited problem width, or simple contingent problems. We present a set of experiments, showing our approach to scale better than state of the art offline planners.

【Keywords】: contingent planning; offline; online; plan tree

440. Multi-Agent Path Finding with Payload Transfers and the Package-Exchange Robot-Routing Problem.

【Paper Link】【Pages】:3166-3173

【Authors】: Hang Ma ; Craig A. Tovey ; Guni Sharon ; T. K. Satish Kumar ; Sven Koenig

【Abstract】: We study transportation problems where robots have to deliver packages and can transfer the packages among each other. Specifically, we study the package-exchange robot-routing problem (PERR), where each robot carries one package, any two robots in adjacent locations can exchange their packages, and each package needs to be delivered to a given destination. We prove that exchange operations make all PERR instances solvable. Yet, we also show that PERR is NP-hard to approximate within any factor less than 4/3 for makespan minimization and is NP-hard to solve for flowtime minimization, even when there are only two types of packages. Our proof techniques also generate new insights into other transportation problems, for example, into the hardness of approximating optimal solutions to the standard multi-agent path-finding problem (MAPF). Finally, we present optimal and suboptimal PERR solvers that are inspired by MAPF solvers, namely a flow-based ILP formulation and an adaptation of conflict-based search. Our empirical results demonstrate that these solvers scale well and that PERR instances often have smaller makespans and flowtimes than the corresponding MAPF instances.

【Keywords】: path planning; multi-agent pathfinding; complexity of planning

441. Solving Transition-Independent Multi-Agent MDPs with Sparse Interactions.

【Paper Link】【Pages】:3174-3180

【Authors】: Joris Scharpff ; Diederik M. Roijers ; Frans A. Oliehoek ; Matthijs T. J. Spaan ; Mathijs Michiel de Weerdt

【Abstract】: In cooperative multi-agent sequential decision making under uncertainty, agents must coordinate to find an optimal joint policy that maximises joint value. Typical algorithms exploit additive structure in the value function, but in the fully-observable multi-agent MDP (MMDP) setting such structure is not present. We propose a new optimal solver for transition-independent MMDPs, in which agents can only affect their own state but their reward depends on joint transitions. We represent these dependencies compactly in conditional return graphs (CRGs). Using CRGs the value of a joint policy and the bounds on partially specified joint policies can be efficiently computed. We propose CoRe, a novel branch-and-bound policy search algorithm building on CRGs. CoRe typically requires less runtime than available alternatives and finds solutions to previously unsolvable problems.

【Keywords】: Markov Decision Process; Transition-independent Multi-agent MDPs; Reward interactions; Conditional Return Graphs

442. Solving Goal Recognition Design Using ASP.

【Paper Link】【Pages】:3181-3187

【Authors】: Tran Cao Son ; Orkunt Sabuncu ; Christian Schulz-Hanke ; Torsten Schaub ; William Yeoh

【Abstract】: Goal Recognition Design involves identifying the best ways to modify an underlying environment that agents operate in, typically by making a subset of feasible actions infeasible, so that agents are forced to reveal their goals as early as possible. Thus far, existing work has focused exclusively on imperative classical planning. In this paper, we address the same problem with a different paradigm, namely, declarative approaches based on Answer Set Programming (ASP). Our experimental results show that one of our ASP encodings is more scalable and is significantly faster by up to three orders of magnitude than the current state of the art.

【Keywords】: Planning, Goal Recognition, Goal Recognition Design

443. Efficient Macroscopic Urban Traffic Models for Reducing Congestion: A PDDL+ Planning Approach.

【Paper Link】【Pages】:3188-3194

【Authors】: Mauro Vallati ; Daniele Magazzeni ; Bart De Schutter ; Lukás Chrpa ; Thomas Leo McCluskey

【Abstract】: The global growth in urbanisation increases the demand for services including road transport infrastructure, presenting challenges in terms of mobility. In this scenario, optimising the exploitation of urban road networks is a pivotal challenge. Existing urban traffic control approaches, based on complex mathematical models, can effectively deal with planned-ahead events, but are not able to cope with unexpected situations --such as roads blocked due to car accidents or weather-related events-- because of their huge computational requirements. Therefore, such unexpected situations are mainly dealt with manually, or by exploiting pre-computed policies. Our goal is to show the feasibility of using mixed discrete-continuous planning to deal with unexpected circumstances in urban traffic control. We present a PDDL+ formulation of urban traffic control, where continuous processes are used to model flows of cars, and show how planning can be used to efficiently reduce congestion of specified roads by controlling traffic light green phases. We present simulation results on two networks (one of them considers Manchester city centre) that demonstrate the effectiveness of the approach, compared with fixed-time and reactive techniques.

【Keywords】:

444. A Proactive Sampling Approach to Project Scheduling under Uncertainty.

【Paper Link】【Pages】:3195-3201

【Authors】: Pradeep Varakantham ; Na Fu ; Hoong Chuin Lau

【Abstract】: Uncertainty in activity durations is a key characteristic of many real world scheduling problems in manufacturing, logistics and project management. RCPSP/max with durational uncertainty is a general model that can be used to represent durational uncertainty in a wide variety of scheduling problems where there exist resource constraints. However, computing schedules or execution strategies for RCPSP/max with durational uncertainty is NP-hard and hence we focus on providing approximation methods in this paper. We provide a principled approximation approach based on Sample Average Approximation (SAA) to compute proactive schedules for RCPSP/max with durational uncertainty. We further contribute an extension to SAA for improving scalability significantly without sacrificing on solution quality. Not only is our approach able to compute schedules at comparable runtimes as existing approaches, it also provides lower α-quantile makespan (also referred to as α-robust makespan) values than the best known approach on benchmark problems from the literature.

【Keywords】: Resource Constrained Project Scheduling, Sample Average Approximation, Optimization

445. A POMDP Formulation of Proactive Learning.

【Paper Link】【Pages】:3202-3208

【Authors】: Kyle Hollins Wray ; Shlomo Zilberstein

【Abstract】: We cast the Proactive Learning (PAL) problem—Active Learning (AL) with multiple reluctant, fallible, cost-varying oracles—as a Partially Observable Markov Decision Process (POMDP). The agent selects an oracle at each time step to label a data point, while it maintains a belief over the true underlying correctness of its current dataset’s labels. The goal is to minimize labeling costs while considering the value of obtaining correct labels, thus maximizing final resultant classifier accuracy. We prove three properties that show our particular formulation leads to a structured and bounded-size set of belief points, enabling strong performance of point-based methods to solve the POMDP. Our method is compared with the original three algorithms proposed by Donmez and Carbonell and a simple baseline. We demonstrate that our approach matches or improves upon the original approach within five different oracle scenarios, each on two datasets. Finally, our algorithm provides a general, well-defined mathematical foundation to build upon.

【Keywords】: Proactive Learning; POMDP; PBVI

446. Approximation Algorithms for Route Planning with Nonlinear Objectives.

【Paper Link】【Pages】:3209-3217

【Authors】: Ger Yang ; Evdokia Nikolova

【Abstract】: We consider optimal route planning when the objective function is a general nonlinear and non-monotonic function. Such an objective models user behavior more accurately, for example, when a user is risk-averse, or the utility function needs to capture a penalty for early arrival. It is known that as non-linearity arises, the problem can become NP-hard and little is known on computing optimal solutions when in addition there is no monotonicity guarantee. We show that an approximately optimal non-simple path can be efficiently computed under some natural constraints. In particular, we provide a fully polynomial approximation scheme under hop constraints. Our approximation algorithm can extend to run in pseudo-polynomial time under an additional linear constraint that sometimes is useful. As a by-product, we show that our algorithm can be applied to the problem of finding a path that is most likely to be on time for a given deadline.

【Keywords】: route planning, nonlinear objective, stochastic shortest path, approximation algorithm, cost-to-time ratio, non-simple path

Technical Papers: Reasoning under Uncertainty 15

447. Approximate Probabilistic Inference via Word-Level Counting.

【Paper Link】【Pages】:3218-3224

【Authors】: Supratik Chakraborty ; Kuldeep S. Meel ; Rakesh Mistry ; Moshe Y. Vardi

【Abstract】: Hashing-based model counting has emerged as a promising approach for large-scale probabilistic inference on graphical models. A key component of these techniques is the use of xor-based 2-universal hash functions that operate over Boolean domains. Many counting problems arising in probabilistic inference are, however, naturally encoded over finite discrete domains. Techniques based on bit-level (or Boolean) hash functions require these problems to be propositionalized, making it impossible to leverage the remarkable progress made in SMT (Satisfiability Modulo Theory) solvers that can reason directly over words (or bit-vectors). In this work, we present the first approximate model counter that uses word-level hashing functions, and can directly leverage the power of sophisticated SMT solvers. Empirical evaluation over an extensive suite of benchmarks demonstrates the promise of the approach.

【Keywords】: Universal Hashing; SMT; Model Counting

448. A Symbolic SAT-Based Algorithm for Almost-Sure Reachability with Small Strategies in POMDPs.

【Paper Link】【Pages】:3225-3232

【Authors】: Krishnendu Chatterjee ; Martin Chmelik ; Jessica Davies

【Abstract】: POMDPs are standard models for probabilistic planning problems, where an agent interacts with an uncertain environment. We study the problem of almost-sure reachability, where given a set of target states, the question is to decide whether there is a policy to ensure that the target set is reached with probability 1 (almost-surely). While in general the problem is EXPTIME-complete, in many practical cases policies with a small amount of memory suffice. Moreover, the existing solution to the problem is explicit, which first requires to construct explicitly an exponential reduction to a belief-support MDP. In this work, we first study the existence of observation-stationary strategies, which is NP-complete, and then small-memory strategies. We present a symbolic algorithm by an efficient encoding to SAT and using a SAT solver for the problem. We report experimental results demonstrating the scalability of our symbolic (SAT-based) approach.

【Keywords】: POMDPs; SAT; Uncertainty in AI; Planning under Uncertainty

449. Structured Features in Naive Bayes Classification.

【Paper Link】【Pages】:3233-3240

【Authors】: Arthur Choi ; Nazgol Tavabi ; Adnan Darwiche

【Abstract】: We propose the structured naive Bayes (SNB) classifier, which augments the ubiquitous naive Bayes classifier with structured features. SNB classifiers facilitate the use of complex features, such as combinatorial objects (e.g., graphs, paths and orders) in a general but systematic way. Underlying the SNB classifier is the recently proposed Probabilistic Sentential Decision Diagram (PSDD), which is a tractable representation of probability distributions over structured spaces. We illustrate the utility and generality of the SNB classifier via case studies. First, we show how we can distinguish players of simple games in terms of play style and skill level based purely on observing the games they play. Second, we show how we can detect anomalous paths taken on graphs based purely on observing the paths themselves.

【Keywords】: naive Bayes; classification; structured spaces; decision diagrams; Bayesian networks

450. On Parameter Tying by Quantization.

【Paper Link】【Pages】:3241-3247

【Authors】: Li Chou ; Somdeb Sarkhel ; Nicholas Ruozzi ; Vibhav Gogate

【Abstract】: The maximum likelihood estimator (MLE) is generally asymptotically consistent but is susceptible to over-fitting. To combat this problem, regularization methods which reduce the variance at the cost of (slightly) increasing the bias are often employed in practice. In this paper, we present an alternative variance reduction (regularization) technique that quantizes the MLE estimates as a post processing step, yielding a smoother model having several tied parameters. We provide and prove error bounds for our new technique and demonstrate experimentally that it often yields models having higher test-set log-likelihood than the ones learned using the MLE. We also propose a new importance sampling algorithm for fast approximate inference in models having several tied parameters. Our experiments show that our new inference algorithm is superior to existing approaches such as Gibbs sampling and MC-SAT on models having tied parameters, learned using our quantization-based approach.

【Keywords】: Quantization; Learning Graphical Models; Parameter Tying; Importance Sampling

451. Exact Sampling with Integer Linear Programs and Random Perturbations.

【Paper Link】【Pages】:3248-3254

【Authors】: Carolyn Kim ; Ashish Sabharwal ; Stefano Ermon

【Abstract】: We consider the problem of sampling from a discrete probability distribution specified by a graphical model. Exact samples can, in principle, be obtained by computing the mode of the original model perturbed with exponentially many i.i.d. random variables. We propose a novel algorithm that views this as a combinatorial optimization problem and searches for the extreme state using a standard integer linear programming (ILP) solver, appropriately extended to account for the random perturbation. Our technique, GumbelMIP, leverages linear programming (LP) relaxations to evaluate the quality of samples and prune large portions of the search space, and can thus scale to large tree-width models beyond the reach of current exact inference methods. Further, when the optimization problem is not solved to optimality, our method yields a novel approximate sampling technique. We empirically demonstrate that our approach parallelizes well, our exact sampler scales better than alternative approaches, and our approximate sampler yields better quality samples than a Gibbs sampler and a low-dimensional perturbation method.

【Keywords】:
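
The "mode of the perturbed model" idea in the abstract above rests on the Gumbel-max trick: adding independent Gumbel noise to every state's log-weight and taking the argmax yields an exact sample. The sketch below does this by brute-force enumeration, so it needs one noise variable per state (exponentially many in the number of variables); it does not reproduce GumbelMIP's ILP search or LP-based pruning. The function name and the toy log-weight are assumptions for illustration.

```python
import itertools
import numpy as np

def gumbel_max_sample(log_weight, n_vars, rng):
    """Draw one exact sample over {0,1}^n_vars by brute-force Gumbel-max.

    log_weight: function mapping a tuple of 0/1 values to an unnormalised
    log-probability (hypothetical interface).
    """
    states = list(itertools.product([0, 1], repeat=n_vars))
    scores = np.array([log_weight(x) for x in states])
    # One i.i.d. Gumbel(0, 1) perturbation per joint state.
    perturbed = scores + rng.gumbel(size=len(states))
    return states[int(perturbed.argmax())]

# Toy model (assumed): two binary variables that prefer to agree.
def agree(x):
    return 1.0 if x[0] == x[1] else 0.0

rng = np.random.default_rng(0)
sample = gumbel_max_sample(agree, 2, rng)
```

Repeating the call with fresh noise gives independent samples whose frequencies approach the normalised weights, with no burn-in of the kind a Gibbs sampler would need.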

452. From Exact to Anytime Solutions for Marginal MAP.

【Paper Link】【Pages】:3255-3262

【Authors】: Junkyu Lee ; Radu Marinescu ; Rina Dechter ; Alexander T. Ihler

【Abstract】: This paper explores the anytime performance of search-based algorithms for solving the Marginal MAP task over graphical models. The current state of the art for solving this challenging task is based on best-first search exploring the AND/OR graph with the guidance of heuristics based on mini-bucket and variational cost-shifting principles. Yet, those schemes are uncompromising in that they solve the problem exactly, or not at all, and often suffer from memory problems. In this work, we explore the well known principle of weighted search for converting best-first search solvers into anytime schemes. The weighted best-first search schemes report a solution early in the process by using inadmissible heuristics, and subsequently improve the solution. While it was demonstrated recently that weighted schemes can yield effective anytime behavior for pure MAP tasks, Marginal MAP is far more challenging (e.g., a conditional sum must be evaluated for every solution). Yet, in an extensive empirical analysis we show that weighted schemes are indeed highly effective for Marginal MAP yielding the most competitive schemes to date for this task.

【Keywords】: marginal MAP; AND/OR search; anytime algorithm

453. On Learning Causal Models from Relational Data.

【Paper Link】【Pages】:3263-3270

【Authors】: Sanghack Lee ; Vasant Honavar

【Abstract】: Many applications call for learning causal models from relational data. We investigate Relational Causal Models (RCM) under relational counterparts of adjacency-faithfulness and orientation-faithfulness, yielding a simple approach to identifying a subset of relational d-separation queries needed for determining the structure of an RCM using d-separation against an unrolled DAG representation of the RCM. We provide original theoretical analysis that offers the basis of a sound and efficient algorithm for learning the structure of an RCM from relational data. We describe RCD-Light, a sound and efficient constraint-based algorithm that is guaranteed to yield a correct partially-directed RCM structure with at least as many edges oriented as in that produced by RCD, the only other existing algorithm for learning RCM. We show that unlike RCD, which requires exponential time and space, RCD-Light requires only polynomial time and space to orient the dependencies of a sparse RCM.

【Keywords】:

454. Online Spatio-Temporal Matching in Stochastic and Dynamic Domains.

【Paper Link】【Pages】:3271-3277

【Authors】: Meghna Lowalekar ; Pradeep Varakantham ; Patrick Jaillet

【Abstract】: Spatio-temporal matching of services to customers online is a problem that arises on a large scale in many domains associated with shared transportation (e.g., taxis, ride sharing, super shuttles, etc.) and delivery services (e.g., food, equipment, clothing, home fuel, etc.). A key characteristic of these problems is that matching of services to customers in one round has a direct impact on the matching of services to customers in the next round. For instance, in the case of taxis, in the second round taxis can only pick up customers closer to the drop off point of the customer from the first round of matching. Traditionally, greedy myopic approaches have been adopted to address such large scale online matching problems. While they provide solutions in a scalable manner, due to their myopic nature the quality of matching obtained can be improved significantly (demonstrated in our experimental results). In this paper, we present a two stage stochastic optimization formulation to consider expected future demand. We then provide multiple enhancements to solve large scale problems more effectively and efficiently. Finally, we demonstrate the significant improvement provided by our techniques over myopic approaches on two real world taxi data sets.

【Keywords】: Online Matching, Stochastic Optimization

455. Scaling Relational Inference Using Proofs and Refutations.

【Paper Link】【Pages】:3278-3286

【Authors】: Ravi Mangal ; Xin Zhang ; Aditya Kamath ; Aditya V. Nori ; Mayur Naik

【Abstract】: Many inference problems are naturally formulated using hard and soft constraints over relational domains: the desired solution must satisfy the hard constraints, while optimizing the objectives expressed by the soft constraints. Existing techniques for solving such constraints rely on efficiently grounding a sufficient subset of constraints that is tractable to solve. We present an eager-lazy grounding algorithm that eagerly exploits proofs and lazily refutes counterexamples. We show that our algorithm achieves significant speedup over existing approaches without sacrificing soundness for real-world applications from information retrieval and program analysis.

【Keywords】: relational probabilistic models; markov logic networks; lazy grounding; relational inference

456. Closed-Form Gibbs Sampling for Graphical Models with Algebraic Constraints.

【Paper Link】【Pages】:3287-3293

【Authors】: Hadi Mohasel Afshar ; Scott Sanner ; Christfried Webers

【Abstract】: Probabilistic inference in many real-world problems requires graphical models with deterministic algebraic constraints between random variables (e.g., Newtonian mechanics, Pascal’s law, Ohm’s law) that are known to be problematic for many inference methods such as Monte Carlo sampling. Fortunately, when such constraints are invertible, the model can be collapsed and the constraints eliminated through the well-known Jacobian-based change of variables. As our first contribution in this work, we show that a much broader class of algebraic constraints can be collapsed by leveraging the properties of a Dirac delta model of deterministic constraints. Unfortunately, the collapsing process can lead to highly piecewise densities that pose challenges for existing probabilistic inference tools. Thus, our second contribution to address these challenges is to present a variation of Gibbs sampling that efficiently samples from these piecewise densities. The key insight to achieve this is to introduce a class of functions that (1) is sufficiently rich to approximate arbitrary models up to arbitrary precision, (2) is closed under dimension reduction (collapsing) for models with (non)linear algebraic constraints and (3) always permits one analytical integral sufficient to automatically derive closed-form conditionals for Gibbs sampling. Experiments demonstrate the proposed sampler converges at least an order of magnitude faster than existing Monte Carlo samplers.

【Keywords】: MCMC; sampling; Gibbs; closed-form; deterministic

457. Learning Bayesian Networks with Bounded Tree-width via Guided Search.

【Paper Link】【Pages】:3294-3300

【Authors】: Siqi Nie ; Cassio Polpo de Campos ; Qiang Ji

【Abstract】: Bounding the tree-width of a Bayesian network can reduce the chance of overfitting, and allows exact inference to be performed efficiently. Several existing algorithms tackle the problem of learning bounded tree-width Bayesian networks by learning from k-trees as super-structures, but they do not scale to large domains and/or large tree-width. We propose a guided search algorithm to find k-trees with maximum Informative scores, which is a measure of quality for the k-tree in yielding good Bayesian networks. The algorithm achieves close to optimal performance compared to exact solutions in small domains, and can discover better networks than existing approximate methods can in large domains. It also provides an optimal elimination order of variables that guarantees small complexity for later runs of exact inference. Comparisons with well-known approaches in terms of learning and inference accuracy illustrate its capabilities.

【Keywords】: Bayesian network; structure learning; bounded treewidth

458. Learning Ensembles of Cutset Networks.

【Paper Link】【Pages】:3301-3307

【Authors】: Tahrima Rahman ; Vibhav Gogate

【Abstract】: Cutset networks — OR (decision) trees that have Bayesian networks whose treewidth is bounded by one at each leaf — are a new class of tractable probabilistic models that admit fast, polynomial-time inference and learning algorithms. This is unlike other state-of-the-art tractable models such as thin junction trees, arithmetic circuits and sum-product networks in which inference is fast and efficient but learning can be notoriously slow. In this paper, we take advantage of this unique property to develop fast algorithms for learning ensembles of cutset networks. Specifically, we consider generalized additive mixtures of cutset networks and develop sequential boosting-based and parallel bagging-based approaches for learning them from data. We demonstrate, via a thorough experimental evaluation, that our new algorithms are superior to competing approaches in terms of test-set log-likelihood score and learning time.

【Keywords】: Ensemble methods; Tractable PGMs; Cutset Networks

459. RAO*: An Algorithm for Chance-Constrained POMDP's.

【Paper Link】【Pages】:3308-3314

【Authors】: Pedro Henrique de Rodrigues Quemel e Assis Santana ; Sylvie Thiébaux ; Brian Williams

【Abstract】: Autonomous agents operating in partially observable stochastic environments often face the problem of optimizing expected performance while bounding the risk of violating safety constraints. Such problems can be modeled as chance-constrained POMDP's (CC-POMDP's). Our first contribution is a systematic derivation of execution risk in POMDP domains, which improves upon how chance constraints are handled in the constrained POMDP literature. Second, we present RAO*, a heuristic forward search algorithm producing optimal, deterministic, finite-horizon policies for CC-POMDP's. In addition to the utility heuristic, RAO* leverages an admissible execution risk heuristic to quickly detect and prune overly-risky policy branches. Third, we demonstrate the usefulness of RAO* in two challenging domains of practical interest: power supply restoration and autonomous science agents.

【Keywords】: chance constraint; POMDP; heuristic search

460. Separators and Adjustment Sets in Markov Equivalent DAGs.

【Paper Link】【Pages】:3315-3321

【Authors】: Benito van der Zander ; Maciej Liskiewicz

【Abstract】: In practice the vast majority of causal effect estimations from observational data are computed using adjustment sets which avoid confounding by adjusting for appropriate covariates. Recently several graphical criteria for selecting adjustment sets have been proposed. They handle causal directed acyclic graphs (DAGs) as well as more general types of graphs that represent Markov equivalence classes of DAGs, including completed partially directed acyclic graphs (CPDAGs). Though expressed in graphical language, it is not obvious how the criteria can be used to obtain effective algorithms for finding adjustment sets. In this paper we provide a new criterion which leads to an efficient algorithmic framework to find, test and enumerate covariate adjustments for chain graphs - mixed graphs representing in a compact way a broad range of Markov equivalence classes of DAGs.

【Keywords】: causality; causal effect; Markov equivalence classes; directed acyclic graph; chain graph

461. Closing the Gap Between Short and Long XORs for Model Counting.

【Paper Link】【Pages】:3322-3329

【Authors】: Shengjia Zhao ; Sorathan Chaturapruek ; Ashish Sabharwal ; Stefano Ermon

【Abstract】: Many recent algorithms for approximate model counting are based on a reduction to combinatorial searches over random subsets of the space defined by parity or XOR constraints. Long parity constraints (involving many variables) provide strong theoretical guarantees but are computationally difficult. Short parity constraints are easier to solve but have weaker statistical properties. It is currently not known how long these parity constraints need to be. We close the gap by providing matching necessary and sufficient conditions on the required asymptotic length of the parity constraints. Further, we provide a new family of lower bounds and the first non-trivial upper bounds on the model count that are valid for arbitrarily short XORs. We empirically demonstrate the effectiveness of these bounds on model counting benchmarks and in a Satisfiability Modulo Theory (SMT) application motivated by the analysis of contingency tables in statistics.

【Keywords】: Model Counting, Randomized Hashing, SAT

Technical Papers: Robotics 5

462. Distance Minimization for Reward Learning from Scored Trajectories.

【Paper Link】【Pages】:3330-3336

【Authors】: Benjamin Burchfiel ; Carlo Tomasi ; Ronald Parr

【Abstract】: Many planning methods rely on the use of an immediate reward function as a portable and succinct representation of desired behavior. Rewards are often inferred from demonstrated behavior that is assumed to be near-optimal. We examine a framework, Distance Minimization IRL (DM-IRL), for learning reward functions from scores an expert assigns to possibly suboptimal demonstrations. By changing the expert’s role from a demonstrator to a judge, DM-IRL relaxes some of the assumptions present in IRL, enabling learning from the scoring of arbitrary demonstration trajectories with unknown transition functions. DM-IRL complements existing IRL approaches by addressing different assumptions about the expert. We show that DM-IRL is robust to expert scoring error and prove that finding a policy that produces maximally informative trajectories for an expert to score is strongly NP-hard. Experimentally, we demonstrate that the reward function DM-IRL learns from an MDP with an unknown transition model can transfer to an agent with known characteristics in a novel environment, and we achieve successful learning with limited available training data.

【Keywords】: Reinforcement Learning; Robotics; Learning from Demonstration; Inverse Reinforcement Learning; Inverse Optimal Control; IRL; IOC; RL

463. Efficient Spatio-Temporal Tactile Object Recognition with Randomized Tiling Convolutional Networks in a Hierarchical Fusion Strategy.

【Paper Link】【Pages】:3337-3345

【Authors】: Le-le Cao ; Ramamohanarao Kotagiri ; Fuchun Sun ; Hongbo Li ; Wen-bing Huang ; Zay Maung Maung Aye

【Abstract】: Robotic tactile recognition aims at identifying target objects or environments from tactile sensory readings. Advances in unsupervised feature learning and biological tactile sensing inspire us to propose the 3T-RTCN model, which performs spatio-temporal feature representation and fusion for tactile recognition. It decomposes tactile data into spatial and temporal threads, and incorporates the strength of randomized tiling convolutional networks. Experimental evaluations show that it outperforms some state-of-the-art methods by a large margin regarding recognition accuracy, robustness, and fault-tolerance; we also achieve an order-of-magnitude speedup over equivalent networks with pretraining and finetuning. Practical suggestions and hints for effectively handling the tactile data are summarized at the end.

【Keywords】: tactile object recognition; feature representation; feature fusion; decision fusion; tiled convolutional network; random weights; tactile flow; robustness; fault-tolerance

464. Continual Planning in Golog.

【Paper Link】【Pages】:3346-3353

【Authors】: Till Hofmann ; Tim Niemueller ; Jens Claßen ; Gerhard Lakemeyer

【Abstract】: To solve ever more complex and longer tasks, mobile robots need to generate more elaborate plans and must handle dynamic environments and incomplete knowledge. We address this challenge by integrating two seemingly different approaches (PDDL-based planning for efficient plan generation and Golog for highly expressive behavior specification) in a coherent framework that supports continual planning. The latter allows interleaving plan generation and execution through assertions, which are placeholder actions that are dynamically expanded into conditional sub-plans (using classical planners) once a replanning condition is satisfied. We formalize and implement continual planning in Golog, which was so far only supported in PDDL-based systems. This enables combining the execution of generated plans with regular Golog programs and execution monitoring. Experiments on autonomous mobile robots show that the approach supports expressive behavior specification combined with efficient sub-plan generation to handle dynamic environments and incomplete knowledge in a unified way.

【Keywords】: Golog; Situation Calculus; PDDL; Classical Planning; Continual Planning

465. Selectively Reactive Coordination for a Team of Robot Soccer Champions.

【Paper Link】【Pages】:3354-3360

【Authors】: Juan Pablo Mendoza ; Joydeep Biswas ; Philip Cooksey ; Richard Wang ; Steven D. Klee ; Danny Zhu ; Manuela M. Veloso

【Abstract】: CMDragons 2015 is the champion of the RoboCup Small Size League of autonomous robot soccer. The team won all of its six games, scoring a total of 48 goals and conceding 0. This unprecedented dominant performance is the result of various features, but we particularly credit our novel offense multi-robot coordination. This paper thus presents our Selectively Reactive Coordination (SRC) algorithm, consisting of two layers: A coordinated opponent-agnostic layer enables the team to create its own plans, setting the pace of the game in offense. An individual opponent-reactive action selection layer enables the robots to maintain reactivity to different opponents. We demonstrate the effectiveness of our coordination through results from RoboCup 2015, and through controlled experiments using a physics-based simulator and an automated referee.

【Keywords】: Multi-robot coordination; Robot Soccer; Planning and Execution

466. Deep Tracking: Seeing Beyond Seeing Using Recurrent Neural Networks.

【Paper Link】【Pages】:3361-3368

【Authors】: Peter Ondruska ; Ingmar Posner

【Abstract】: This paper presents, to the best of our knowledge, the first end-to-end object tracking approach which directly maps from raw sensor input to object tracks in sensor space without requiring any feature engineering or system identification in the form of plant or sensor models. Specifically, our system accepts a stream of raw sensor data at one end and, in real-time, produces an estimate of the entire environment state at the output including even occluded objects. We achieve this by framing the problem as a deep learning task and exploiting sequence models in the form of recurrent neural networks to learn a mapping from sensor measurements to object tracks. In particular, we propose a learning method based on a form of input dropout which allows learning in an unsupervised manner, only based on raw, occluded sensor data without access to ground-truth annotations. We demonstrate our approach using a synthetic dataset designed to mimic the task of tracking objects in 2D laser data, as commonly encountered in robotics applications, and show that it learns to track many dynamic objects despite occlusions and the presence of sensor noise.

【Keywords】: Tracking; Deep Machine Learning; Recurrent Neural Network; Computer Vision; Partial Observability

Technical Papers: Search and Constraint Satisfaction 11

467. Component Caching in Hybrid Domains with Piecewise Polynomial Densities.

【Paper Link】【Pages】:3369-3375

【Authors】: Vaishak Belle ; Guy Van den Broeck ; Andrea Passerini

【Abstract】: Counting the models of a propositional formula is an important problem: for example, it serves as the backbone of probabilistic inference by weighted model counting. A key algorithmic insight is component caching (CC), in which disjoint components of a formula, generated dynamically during a DPLL search, are cached so that they only have to be solved once. In recent years, driven by SMT technology and probabilistic inference in hybrid domains, there has been increasing interest in counting the models of linear arithmetic sentences. To date, however, solvers for these are block-clause implementations, which are nonviable on large problem instances. In this paper, as a first step in extending CC to hybrid domains, we show how propositional CC systems can be leveraged when limited to piecewise polynomial densities. Our experiments demonstrate a large gap in performance when compared to existing approaches based on a variety of block-clause strategies.

【Keywords】: model counting, probabilistic inference, hybrid graphical models

468. The Meta-Problem for Conservative Mal'tsev Constraints.

【Paper Link】【Pages】:3376-3382

【Authors】: Clément Carbonnel

【Abstract】: In the algebraic approach to CSP (Constraint Satisfaction Problem), the complexity of constraint languages is studied using closure operations called polymorphisms. Many of these operations are known to induce tractability of any language they preserve. We focus on the meta-problem: given a language G, decide if G has a polymorphism with nice properties. We design an algorithm that decides in polynomial-time if a constraint language has a conservative Mal'tsev polymorphism, and outputs one if one exists. As a corollary we obtain that the class of conservative Mal'tsev constraints is uniformly tractable, and we conjecture that this result remains true in the non-conservative case.

【Keywords】:

469. Steiner Tree Problems with Side Constraints Using Constraint Programming.

【Paper Link】【Pages】:3383-3389

【Authors】: Diego de Uña ; Graeme Gange ; Peter Schachte ; Peter J. Stuckey

【Abstract】: The Steiner Tree Problem is a well-known NP-complete problem that is well studied and for which fast algorithms are already available. Nonetheless, in the real world the Steiner Tree Problem is almost always accompanied by side constraints, which means these approaches cannot be applied. For many problems with side constraints, only approximation algorithms are known. We introduce here a propagator for the tree constraint with explanations, as well as lower bounding techniques and a novel constraint programming approach for the Steiner Tree Problem and two of its variants. We find our propagators with explanations are highly advantageous when it comes to solving variants of this problem.

【Keywords】: Steiner tree; constraint programming; propagation

470. Alternative Filtering for the Weighted Circuit Constraint: Comparing Lower Bounds for the TSP and Solving TSPTW.

【Paper Link】【Pages】:3390-3396

【Authors】: Sylvain Ducomman ; Hadrien Cambazard ; Bernard Penz

【Abstract】: Many problems, and in particular routing problems, require finding one or many circuits in a weighted graph. The weights often express the distance or the travel time between vertices. We propose in this paper various filtering algorithms for the weighted circuit constraint which maintain a circuit in a weighted graph. The filtering algorithms are typical cost based filtering algorithms relying on relaxations of the Traveling Salesman Problem. We investigate three bounds and show that they are incomparable. In particular we design a filtering algorithm based on a lower bound introduced in 1981 by Christofides et al. This bound can provide stronger filtering than the classical Held and Karp’s approach when additional information, such as the possible positions of the clients in the tour, is available. This is particularly suited for problems with side constraints such as time windows.

【Keywords】: Weighted circuit; TSP; TSPTW

471. Using the Shapley Value to Analyze Algorithm Portfolios.

【Paper Link】【Pages】:3397-3403

【Authors】: Alexandre Fréchette ; Lars Kotthoff ; Tomasz P. Michalak ; Talal Rahwan ; Holger H. Hoos ; Kevin Leyton-Brown

【Abstract】: Algorithms for NP-complete problems often have different strengths and weaknesses, and thus algorithm portfolios often outperform individual algorithms. It is surprisingly difficult to quantify a component algorithm's contribution to such a portfolio. Reporting a component's standalone performance wrongly rewards near-clones while penalizing algorithms that have small but distinct areas of strength. Measuring a component's marginal contribution to an existing portfolio is better, but penalizes sets of strongly correlated algorithms, thereby obscuring situations in which it is essential to have at least one algorithm from such a set. This paper argues for analyzing component algorithm contributions via a measure drawn from coalitional game theory (the Shapley value) and yields insight into a research community's progress over time. We conclude with an application of the analysis we advocate to SAT competitions, yielding novel insights into the behaviour of algorithm portfolios, their components, and the state of SAT solving technology.

【Keywords】: algorithm portfolios; Shapley value; contribution analysis; SAT competition
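
To make the contrast between standalone performance, marginal contribution and the Shapley value in the abstract above concrete, here is a minimal sketch that computes exact Shapley values for a three-solver portfolio by averaging marginal contributions over all orderings. The solver names, the `SOLVED` table and the choice of "number of instances solved by at least one member" as the portfolio value are illustrative assumptions, not data or definitions taken from the paper.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orderings) for p, t in totals.items()}


# Hypothetical data: which benchmark instances each solver can solve.
SOLVED = {
    "A": {1, 2, 3, 4},
    "A_clone": {1, 2, 3, 4},  # near-clone of A
    "B": {5},                 # small but distinct area of strength
}


def portfolio_value(coalition):
    # Assumed value of a portfolio: instances solved by at least one member.
    solved = set()
    for solver in coalition:
        solved |= SOLVED[solver]
    return len(solved)


phi = shapley_values(list(SOLVED), portfolio_value)
```

In this toy setting, standalone scores would be 4, 4 and 1, and marginal contributions to the full portfolio would be 0, 0 and 1. The Shapley values split the shared credit between the two clones while still giving `B` full credit for its unique instance, and they sum to the value of the full portfolio. Enumerating all orderings is only feasible for a handful of players.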

472. On the Extraction of One Maximal Information Subset That Does Not Conflict with Multiple Contexts.

【Paper Link】【Pages】:3404-3410

【Authors】: Éric Grégoire ; Yacine Izza ; Jean-Marie Lagniez

【Abstract】: The efficient extraction of one maximal information subset that does not conflict with multiple contexts or additional information sources is a key basic issue in many A.I. domains, especially when these contexts or sources can be mutually conflicting. In this paper, this question is addressed from a computational point of view in clausal Boolean logic. A new approach is introduced that experimentally outperforms the currently most efficient technique.

【Keywords】: MSS; Co-MSS; MCS; SAT; satisfiability reasoning; reasoning under contexts

473. Bidirectional Search That Is Guaranteed to Meet in the Middle.

【Paper Link】【Pages】:3411-3417

【Authors】: Robert C. Holte ; Ariel Felner ; Guni Sharon ; Nathan R. Sturtevant

【Abstract】: We present MM, the first bidirectional heuristic search algorithm whose forward and backward searches are guaranteed to "meet in the middle", i.e. never expand a node beyond the solution midpoint. We also present a novel framework for comparing MM, A*, and brute-force search, and identify conditions favoring each algorithm. Finally, we present experimental results that support our theoretical analysis.

【Keywords】: Artificial Intelligence; Heuristic Search

474. Breaking More Composition Symmetries Using Search Heuristics.

【Paper Link】【Pages】:3418-3425

【Authors】: Jimmy H. M. Lee ; Zichen Zhu

【Abstract】: The pruning power of partial symmetry breaking depends on the given subset of symmetries to break as well as the interactions among symmetry breaking constraints. In the context of Partial Symmetry Breaking During Search (ParSBDS), the search order determines the set of symmetry breaking constraints to add and thus also makes an impact on node and solution pruning. In this paper, we give the first formal characterization of the pruning behavior of ParSBDS and its improved variants. Introducing the notion of Dominance-Completeness (DC-ness), we show that ParSBDS and variants eliminate the symmetry group of the given subset of symmetries if the resultant search tree is DC, and give an example scenario. Unfortunately, building a DC tree is not always possible. We propose two search heuristics with the aim of having more nodes dominated and thus also pruned during search. Extensive experimentation demonstrates how the proposed heuristics and their combination can drastically reduce the solution set size, search space and runtime when compared against the state-of-the-art static and dynamic symmetry breaking methods.

【Keywords】: Dynamic symmetry breaking; Partial symmetry breaking; Search heuristics

475. Increasing Nogoods in Restart-Based Search.

【Paper Link】【Pages】:3426-3433

【Authors】: Jimmy H. M. Lee ; Christian Schulte ; Zichen Zhu

【Abstract】: Restarts are an important technique to make search more robust. This paper is concerned with how to maintain and propagate nogoods recorded from restarts efficiently. It builds on reduced nld-nogoods introduced for restarts and increasing nogoods introduced for symmetry breaking. The paper shows that reduced nld-nogoods extracted from a single restart are in fact increasing, which can thus benefit from the efficient propagation algorithm of the incNGs global constraint. We present a lighter weight filtering algorithm for incNGs in the context of restart-based search using dynamic event sets (dynamic subscriptions). We show formally that the lightweight version enforces GAC on each nogood while reducing the number of subscribed decisions. The paper also introduces an efficient approximation to nogood minimization such that all shortened reduced nld-nogoods from the same restart are also increasing and can be propagated with the new filtering algorithm. Experimental results confirm that our lightweight filtering algorithm and approximated nogood minimization successfully trade a slight loss in pruning for considerably better efficiency, and hence compare favorably against existing state-of-the-art techniques.

【Keywords】: Nogoods; Restart-based Search; Filtering Algorithm

476. Exponential Recency Weighted Average Branching Heuristic for SAT Solvers.

【Paper Link】【Pages】:3434-3440

【Authors】: Jia Hui Liang ; Vijay Ganesh ; Pascal Poupart ; Krzysztof Czarnecki

【Abstract】: Modern conflict-driven clause-learning SAT solvers routinely solve large real-world instances with millions of clauses and variables in them. Their success crucially depends on effective branching heuristics. In this paper, we propose a new branching heuristic inspired by the exponential recency weighted average algorithm used to solve the bandit problem. The branching heuristic, which we call CHB, learns online which variables to branch on by leveraging the feedback received from conflict analysis. We evaluated CHB on 1200 instances from the SAT Competition 2013 and 2014, and showed that CHB solves significantly more instances than VSIDS, currently the most effective branching heuristic in widespread use. More precisely, we implemented CHB as part of the MiniSat and Glucose solvers, and performed an apples-to-apples comparison with their VSIDS-based variants. CHB-based MiniSat (resp. CHB-based Glucose) solved approximately 16.1% (resp. 5.6%) more instances than their VSIDS-based variants. Additionally, CHB-based solvers are much more efficient at constructing first preimage attacks on step-reduced SHA-1 and MD5 cryptographic hash functions than their VSIDS-based counterparts. To the best of our knowledge, CHB is the first branching heuristic to solve significantly more instances than VSIDS on a large, diverse benchmark of real-world instances.

【Keywords】: Branching Heuristic; Bandit Problem
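
The exponential recency weighted average (ERWA) named in the abstract above is the standard bandit update Q ← (1 − α)·Q + α·r, which weights recent rewards more heavily than old ones. The sketch below shows only that generic update applied to per-variable scores. The reward values, the fixed step size of 0.4 and the helper names `erwa_update` and `pick_variable` are assumptions made for illustration and should not be read as the CHB heuristic's actual reward definition or parameter schedule.

```python
def erwa_update(estimate, reward, step_size):
    """One ERWA step: move the estimate a fraction step_size toward the reward."""
    return (1.0 - step_size) * estimate + step_size * reward


def pick_variable(scores, unassigned):
    # Branch on the unassigned variable with the highest current estimate.
    return max(unassigned, key=lambda v: scores[v])


scores = {"x1": 0.0, "x2": 0.0, "x3": 0.0}

# Hypothetical feedback stream of (variable, reward) pairs, standing in for
# whatever signal a solver would derive from conflict analysis.
feedback = [("x1", 1.0), ("x2", 0.2), ("x1", 0.0), ("x3", 0.9), ("x3", 0.9)]
for var, reward in feedback:
    scores[var] = erwa_update(scores[var], reward, step_size=0.4)

next_branch = pick_variable(scores, ["x1", "x2", "x3"])
```

Because each update discounts the previous estimate by (1 − α), a variable that was rewarded early but not recently (here `x1`) falls behind one that has been rewarded in the latest steps (`x3`).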

477. Counting-Based Search for Constraint Optimization Problems.

【Paper Link】【Pages】:3441-3448

【Authors】: Gilles Pesant

【Abstract】: Branching heuristics based on counting solutions in constraints have been quite good at guiding search to solve constraint satisfaction problems. But do they perform as well for constraint optimization problems? We propose an adaptation of counting-based search for optimization, show how to modify solution density computation for some of the most frequently-occurring constraints, and empirically evaluate its performance on several benchmark problems.

【Keywords】:

Technical Papers: Vision 35

478. Are Elephants Bigger than Butterflies? Reasoning about Sizes of Objects.

【Paper Link】【Pages】:3449-3456

【Authors】: Hessam Bagherinezhad ; Hannaneh Hajishirzi ; Yejin Choi ; Ali Farhadi

【Abstract】: Human vision greatly benefits from the information about sizes of objects. The role of size in several visual reasoning tasks has been thoroughly explored in human perception and cognition. However, the impact of the information about sizes of objects is yet to be determined in AI. We postulate that this is mainly attributed to the lack of a comprehensive repository of size information. In this paper, we introduce a method to automatically infer object sizes, leveraging visual and textual information from the web. By maximizing the joint likelihood of textual and visual observations, our method learns reliable relative size estimates, with no explicit human supervision. We introduce the relative size dataset and show that our method outperforms competitive textual and visual baselines in reasoning about size comparisons.

【Keywords】: language-vision; knowledge extraction; size information

479. Deep Quantization Network for Efficient Image Retrieval.

【Paper Link】【Pages】:3457-3463

【Authors】: Yue Cao ; Mingsheng Long ; Jianmin Wang ; Han Zhu ; Qingfu Wen

【Abstract】: Hashing has been widely applied to approximate nearest neighbor search for large-scale multimedia retrieval. Supervised hashing improves the quality of hash coding by exploiting the semantic similarity on data pairs and has received increasing attention recently. For most existing supervised hashing methods for image retrieval, an image is first represented as a vector of hand-crafted or machine-learned features, then quantized by a separate quantization step that generates binary codes. However, suboptimal hash coding may be produced, since the quantization error is not statistically minimized and the feature representation is not optimally compatible with the hash coding. In this paper, we propose a novel Deep Quantization Network (DQN) architecture for supervised hashing, which learns image representations for hash coding and formally controls the quantization error. The DQN model comprises four key components: (1) a sub-network with multiple convolution-pooling layers to capture deep image representations; (2) a fully connected bottleneck layer to generate dimension-reduced representation optimal for hash coding; (3) a pairwise cosine loss layer for similarity-preserving learning; and (4) a product quantization loss for controlling hashing quality and the quantizability of bottleneck representation. Extensive experiments on standard image retrieval datasets show the proposed DQN model yields substantial boosts over the latest state-of-the-art hashing methods.

【Keywords】: Supervised Quantization; Deep Learning; Image Retrieval
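
As an illustration of components (3) and (4) in the abstract above, the following is a minimal sketch and not the authors' implementation. It assumes a squared-gap form for the pairwise cosine loss and fixed, hand-written codebooks for product quantization; the function names and the toy data are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def pairwise_cosine_loss(z_i, z_j, s_ij):
    """Assumed loss form: squared gap between the label s_ij
    (+1 similar, -1 dissimilar) and cos(z_i, z_j)."""
    return (s_ij - cosine(z_i, z_j)) ** 2


def product_quantize(z, codebooks):
    """Split z into len(codebooks) equal sub-vectors and snap each one to
    its nearest codeword. Returns (codes, reconstruction, squared_error)."""
    m = len(codebooks)
    d = len(z) // m
    codes, recon, err = [], [], 0.0
    for k, book in enumerate(codebooks):
        sub = z[k * d:(k + 1) * d]
        # Squared Euclidean distance from the sub-vector to every codeword.
        dists = [sum((a - b) ** 2 for a, b in zip(sub, c)) for c in book]
        best = min(range(len(book)), key=dists.__getitem__)
        codes.append(best)
        recon.extend(book[best])
        err += dists[best]
    return codes, recon, err


# Hypothetical 4-d bottleneck vector quantized with two 2-d codebooks.
books = [[[1.0, 0.0], [0.0, 1.0]], [[-1.0, 0.0], [0.0, 1.0]]]
codes, recon, err = product_quantize([0.9, 0.1, -0.8, 0.2], books)
```

In a trained network the quantization error would be added to the similarity loss so that the bottleneck representation stays easy to quantize; here the two terms are only computed side by side.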

480. Dynamic Concept Composition for Zero-Example Event Detection.

【Paper Link】【Pages】:3464-3470

【Authors】: Xiaojun Chang ; Yi Yang ; Guodong Long ; Chengqi Zhang ; Alexander G. Hauptmann

【Abstract】: In this paper, we focus on automatically detecting events in unconstrained videos without the use of any visual training exemplars. In principle, zero-shot learning makes it possible to train an event detection model based on the assumption that events (e.g. "birthday party") can be described by multiple mid-level semantic concepts (e.g. "blowing candle", "birthday cake"). Towards this goal, we first pre-train a bundle of concept classifiers using data from other sources. Then we evaluate the semantic correlation of each concept w.r.t. the event of interest and pick the relevant concept classifiers, which are applied on all test videos to get multiple prediction score vectors. While most existing systems combine the predictions of the concept classifiers with fixed weights, we propose to learn the optimal weights of the concept classifiers for each testing video by exploring a set of online available videos with free-form text descriptions of their content. To validate the effectiveness of the proposed approach, we have conducted extensive experiments on the latest TRECVID MEDTest 2014, MEDTest 2013 and CCV datasets. The experimental results confirm the superiority of the proposed approach.

【Keywords】: Event Detection; Zero-Example Event Detection; Dynamic Concept Composition

481. Face Video Retrieval via Deep Learning of Binary Hash Representations.

【Paper Link】【Pages】:3471-3477

【Authors】: Zhen Dong ; Su Jia ; Tianfu Wu ; Mingtao Pei

【Abstract】: Retrieving faces from a large mass of videos is an attractive research topic with a wide range of applications. Its main challenges are large intra-class variations and tremendous time and space complexity. In this paper, we develop a new deep convolutional neural network (deep CNN) to learn discriminative and compact binary representations of faces for face video retrieval. The network integrates feature extraction and hash learning into a unified optimization framework for the optimal compatibility of feature extractor and hash functions. In order to better initialize the network, the low-rank discriminative binary hashing is proposed to pre-learn hash functions during the training procedure. Our method achieves excellent performance on two challenging TV-Series datasets.

【Keywords】: face video retrieval; convolutional neural network

482. Zero-Shot Event Detection by Multimodal Distributional Semantic Embedding of Videos.

【Paper Link】【Pages】:3478-3486

【Authors】: Mohamed Elhoseiny ; Jingen Liu ; Hui Cheng ; Harpreet S. Sawhney ; Ahmed M. Elgammal

【Abstract】: We propose a new zero-shot Event-Detection method by Multi-modal Distributional Semantic embedding of videos. Our model embeds object and action concepts as well as other available modalities from videos into a distributional semantic space. To our knowledge, this is the first Zero-Shot event detection model that is built on top of distributional semantics and extends it in the following directions: (a) semantic embedding of multimodal information in videos (with focus on the visual modalities), (b) semantic embedding of concept definitions, and (c) retrieval of videos by free text event query (e.g., "changing a vehicle tire") based on their content. We first embed the video into the multi-modal semantic space and then measure the similarity between videos and the event query in free text form. We validated our method on the large TRECVID MED (Multimedia Event Detection) challenge. Using only the event title as a query, our method outperformed the state-of-the-art that uses big descriptions from 12.6% to 13.5% with MAP metric and from 0.73 to 0.83 with ROC-AUC metric. It is also an order of magnitude faster.

【Keywords】: Language & Vision; Event Detection; Zero Shot Detection; Action Recognition
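
The retrieval step described in the abstract above (embed videos and a free-text query in one semantic space, then rank by similarity) can be sketched as follows. This is an illustrative sketch only: it assumes cosine similarity as the measure, and the embedding vectors and video identifiers are hypothetical stand-ins for the output of a real embedding model.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def rank_videos(query_vec, video_vecs):
    """Return video ids sorted from most to least similar to the query."""
    scores = {vid: cosine(query_vec, vec) for vid, vec in video_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical 2-d embeddings; a real system would use hundreds of dimensions.
videos = {"a": [0.9, 0.1], "b": [0.0, 1.0], "c": [-1.0, 0.0]}
ranking = rank_videos([1.0, 0.0], videos)
```

No training example of the event is needed at query time, which is what makes the setting zero-shot: only the query text and the video content are embedded.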

483. Concepts Not Alone: Exploring Pairwise Relationships for Zero-Shot Video Activity Recognition.

【Paper Link】【Pages】:3487-

【Authors】: Chuang Gan ; Ming Lin ; Yi Yang ; Gerard de Melo ; Alexander G. Hauptmann

【Abstract】: Vast quantities of videos are now being captured at astonishing rates, but the majority of these are not labelled. To cope with such data, we consider the task of content-based activity recognition in videos without any manually labelled examples, also known as zero-shot video recognition. To achieve this, videos are represented in terms of detected visual concepts, which are then scored as relevant or irrelevant according to their similarity with a given textual query. In this paper, we propose a more robust approach for scoring concepts in order to alleviate many of the brittleness and low precision problems of previous work. We jointly consider semantic relatedness, visual reliability, and discriminative power. In addition, to handle noise and non-linearities in the ranking scores of the selected concepts, we propose a novel pairwise order matrix approach for score aggregation. Extensive experiments on the large-scale TRECVID Multimedia Event Detection data show the superiority of our approach.

【Keywords】: zero-shot; video

484. Transductive Zero-Shot Recognition via Shared Model Space Learning.

【Paper Link】【Pages】:3494-3500

【Authors】: Yuchen Guo ; Guiguang Ding ; Xiaoming Jin ; Jianmin Wang

【Abstract】: Zero-shot Recognition (ZSR) is to learn recognition models for novel classes without labeled data. It is a challenging task and has drawn considerable attention in recent years. The basic idea is to transfer knowledge from seen classes via the shared attributes. This paper focuses on the transductive ZSR, i.e., we have unlabeled data for novel classes. Instead of learning models for seen and novel classes separately as in existing works, we put forward a novel joint learning approach which learns the shared model space (SMS) for models such that the knowledge can be effectively transferred between classes using the attributes. An effective algorithm is proposed for optimization. We conduct comprehensive experiments on three benchmark datasets for ZSR. The results demonstrate that the proposed SMS can significantly outperform the state-of-the-art related approaches, which validates its efficacy for the ZSR task.

【Keywords】: zero-shot learning; image classification; optimization

485. Reading Scene Text in Deep Convolutional Sequences.

【Paper Link】【Pages】:3501-3508

【Authors】: Pan He ; Weilin Huang ; Yu Qiao ; Chen Change Loy ; Xiaoou Tang

【Abstract】: We develop a Deep-Text Recurrent Network (DTRN) that regards scene text reading as a sequence labelling problem. We leverage recent advances of deep convolutional neural networks to generate an ordered high-level sequence from a whole word image, avoiding the difficult character segmentation problem. Then a deep recurrent model, building on long short-term memory (LSTM), is developed to robustly recognize the generated CNN sequences, departing from most existing approaches recognising each character independently. Our model has a number of appealing properties in comparison to existing scene text recognition methods: (i) It can recognise highly ambiguous words by leveraging meaningful context information, allowing it to work reliably without either pre- or post-processing; (ii) the deep CNN feature is robust to various image distortions; (iii) it retains the explicit order information in word image, which is essential to discriminate word strings; (iv) the model does not depend on pre-defined dictionary, and it can process unknown words and arbitrary strings. It achieves impressive results on several benchmarks, advancing the state of the art substantially.

【Keywords】: scene text recognition

486. Structured Output Prediction for Semantic Perception in Autonomous Vehicles.

【Paper Link】【Pages】:3509-3515

【Authors】: Rein Houthooft ; Cedric De Boom ; Stijn Verstichel ; Femke Ongenae ; Filip De Turck

【Abstract】: A key challenge in the realization of autonomous vehicles is the machine's ability to perceive its surrounding environment. This task is tackled through a model that partitions vehicle camera input into distinct semantic classes, by taking into account visual contextual cues. The use of structured machine learning models is investigated, which not only allow for complex input, but also arbitrarily structured output. Towards this goal, an outdoor road scene dataset is constructed with accompanying fine-grained image labelings. For coherent segmentation, a structured predictor is modeled to encode label distributions conditioned on the input images. After optimizing this model through max-margin learning, based on an ontological loss function, efficient classification is realized via graph cuts inference using alpha-expansion. Both quantitative and qualitative analyses demonstrate that by taking into account contextual relations between pixel segmentation regions within a second-degree neighborhood, spurious label assignments are filtered out, leading to highly accurate semantic segmentations for outdoor scenes.

【Keywords】: structured prediction; autonomous vehicles; segmentation

487. Robust Complex Behaviour Modeling at 90Hz.

【Paper Link】【Pages】:3516-3522

【Authors】: Xiangyu Kong ; Yizhou Wang ; Tao Xiang

【Abstract】: Modeling complex crowd behaviour for tasks such as rare event detection has received increasing interest. However, existing methods are limited because (1) they are sensitive to noise, often resulting in a large number of false alarms; and (2) they rely on elaborate models leading to high computational cost and are thus unsuitable for processing a large number of video inputs in real-time. In this paper, we overcome these limitations by introducing a novel complex behaviour modeling framework, which consists of a Binarized Cumulative Directional (BCD) feature as representation, novel spatial and temporal context modeling via an iterative correlation maximization, and a set of behaviour models, each being a simple Bernoulli distribution. Despite its simplicity, our experiments on three benchmark datasets show that it significantly outperforms the state-of-the-art for both temporal video segmentation and rare event detection. Importantly, it is extremely efficient, reaching 90Hz on a normal PC platform using MATLAB.

【Keywords】: Behaviour Analysis; Visual Surveillance; Video Event Detection; Anomaly Detection; Real-time Detection
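
The abstract above models behaviour with simple Bernoulli distributions over binarized features. A minimal sketch of that general idea follows, assuming independent per-dimension Bernoulli parameters with Laplace smoothing and a negative log-likelihood score for flagging rare events. It does not reproduce the BCD feature or the context modeling, and all names and data are hypothetical.

```python
import math


def fit_bernoulli(samples, smoothing=1.0):
    """Estimate P(feature = 1) per dimension from binary vectors,
    with Laplace smoothing so no probability is exactly 0 or 1."""
    n = len(samples)
    dims = len(samples[0])
    return [
        (sum(s[d] for s in samples) + smoothing) / (n + 2.0 * smoothing)
        for d in range(dims)
    ]


def neg_log_likelihood(x, p):
    """Higher values mean the binary vector x is less typical under p."""
    return -sum(math.log(pd if xd else 1.0 - pd) for xd, pd in zip(x, p))


# Hypothetical training frames: dimension 0 is usually on, dimension 1 usually off.
train = [[1, 0], [1, 0], [1, 0], [1, 1]]
p = fit_bernoulli(train)
typical_score = neg_log_likelihood([1, 0], p)
rare_score = neg_log_likelihood([0, 1], p)
```

Both fitting and scoring are a single pass over the feature dimensions, which is consistent with the efficiency argument in the abstract; a threshold on the score would then separate rare events from typical ones.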

488. Exploiting View-Specific Appearance Similarities Across Classes for Zero-Shot Pose Prediction: A Metric Learning Approach.

【Paper Link】【Pages】:3523-3529

【Authors】: Alina Kuznetsova ; Sung Ju Hwang ; Bodo Rosenhahn ; Leonid Sigal

【Abstract】: Viewpoint estimation, especially in case of multiple object classes, remains an important and challenging problem. First, objects under different views undergo extreme appearance variations, often making within-class variance larger than between-class variance. Second, obtaining precise ground truth for real-world images, necessary for training supervised viewpoint estimation models, is extremely difficult and time consuming. As a result, annotated data is often available only for a limited number of classes. Hence it is desirable to share viewpoint information across classes. Additional complexity arises from unaligned pose labels between classes, i.e. a side view of a car might look more like a frontal view of a toaster than its side view. To address these problems, we propose a metric learning approach for joint class prediction and pose estimation. Our approach allows us to circumvent the problem of viewpoint alignment across multiple classes, and does not require dense viewpoint labels. Moreover, we show that the learned metric generalizes to new classes, for which the pose labels are not available, and therefore makes it possible to use only partially annotated training sets, relying on the intrinsic similarities in the viewpoint manifolds. We evaluate our approach on two challenging multi-class datasets, 3DObjects and PASCAL3D+.

【Keywords】: object pose estimation; large margin; metric learning; object detection

489. Labeling the Features Not the Samples: Efficient Video Classification with Minimal Supervision.

【Paper Link】【Pages】:3530-3538

【Authors】: Marius Leordeanu ; Alexandra Radu ; Shumeet Baluja ; Rahul Sukthankar

【Abstract】: Feature selection is essential for effective visual recognition. We propose an efficient joint classifier learning and feature selection method that discovers sparse, compact representations of input features from a vast sea of candidates, with an almost unsupervised formulation. Our method requires only the following knowledge, which we call the feature sign: whether or not a particular feature has on average stronger values over positive samples than over negatives. We show how this can be estimated using as few as a single labeled training sample per class. Then, using these feature signs, we extend an initial supervised learning problem into an (almost) unsupervised clustering formulation that can incorporate new data without requiring ground truth labels. Our method works both as a feature selection mechanism and as a fully competitive classifier. It has important properties, low computational cost and excellent accuracy, especially in difficult cases of very limited training data. We experiment on large-scale recognition in video and show superior speed and performance to established feature selection approaches such as AdaBoost, Lasso, greedy forward-backward selection, and powerful classifiers such as SVM.

【Keywords】: feature selection; video classification; semi-supervised learning

490. Decentralized Robust Subspace Clustering.

【Paper Link】【Pages】:3539-3545

【Authors】: Bo Liu ; Xiao-Tong Yuan ; Yang Yu ; Qingshan Liu ; Dimitris N. Metaxas

【Abstract】: We consider the problem of subspace clustering using the SSC (Sparse Subspace Clustering) approach, which has several desirable theoretical properties and has been shown to be effective in various computer vision applications. We develop a large scale distributed framework for the computation of SSC via an alternating direction method of multipliers (ADMM) algorithm. The proposed framework solves SSC in column blocks and only involves parallel multivariate Lasso regression subproblems and sample-wise operations. This appealing property allows us to allocate multiple cores/machines for the processing of individual column blocks. We evaluate our algorithm on a shared-memory architecture. Experimental results on real-world datasets confirm that the proposed block-wise ADMM framework is substantially more efficient than its matrix counterpart used by SSC, without sacrificing accuracy. Moreover, our approach is directly applicable to decentralized neighborhood selection for Gaussian graphical models structure estimation.

【Keywords】: subspace clustering; semi-supervised learning; sparsity
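
The abstract above builds on Lasso subproblems solved with ADMM. As background, here is a minimal sketch of the textbook ADMM iteration for a single Lasso problem, minimize 0.5*||Ax - b||^2 + lam*||x||_1, written in plain Python. It is not the block-wise distributed framework of the paper; the helper names, the penalty parameter `rho`, and the toy problem are assumptions made for illustration.

```python
import math


def solve(G, rhs):
    """Solve G y = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(G, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    y = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][c] * y[c] for c in range(r + 1, n))
        y[r] = (M[r][n] - s) / M[r][r]
    return y


def soft_threshold(v, t):
    """Proximal operator of t*|.|: shrink v toward zero by t."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def admm_lasso(A, b, lam, rho=1.0, iters=200):
    """Standard ADMM splitting x = z for the Lasso; returns the sparse iterate z."""
    m, n = len(A), len(A[0])
    # G = A^T A + rho*I is fixed across iterations.
    G = [[sum(A[k][i] * A[k][j] for k in range(m)) + (rho if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    Atb = [sum(A[k][i] * b[k] for k in range(m)) for i in range(n)]
    z = [0.0] * n
    u = [0.0] * n
    for _ in range(iters):
        x = solve(G, [Atb[i] + rho * (z[i] - u[i]) for i in range(n)])  # x-update
        z = [soft_threshold(x[i] + u[i], lam / rho) for i in range(n)]  # z-update
        u = [u[i] + x[i] - z[i] for i in range(n)]                      # dual update
    return z


# Toy problem with A = I, where the Lasso solution is soft_threshold(b, lam).
z = admm_lasso([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.5], lam=1.0)
```

Because each column of the self-expression matrix in SSC leads to its own Lasso problem, independent solves of this kind are what make parallel processing of column blocks natural.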

491. Articulated Pose Estimation Using Hierarchical Exemplar-Based Models.

【Paper Link】【Pages】:3546-3552

【Authors】: Jiongxin Liu ; Yinxiao Li ; Peter K. Allen ; Peter N. Belhumeur

【Abstract】: Exemplar-based models have achieved great success on localizing the parts of semi-rigid objects. However, their efficacy on highly articulated objects such as humans is yet to be explored. Inspired by hierarchical object representation and recent application of Deep Convolutional Neural Networks (DCNNs) on human pose estimation, we propose a novel formulation that incorporates both hierarchical exemplar-based models and DCNNs in the spatial terms. Specifically, we obtain more expressive spatial models by assuming independence between exemplars at different levels in the hierarchy; we also obtain stronger spatial constraints by inferring the spatial relations between parts at the same level. As our method strikes a good balance between expressiveness and strength of spatial models, it is both effective and generalizable, achieving state-of-the-art results on different benchmarks: Leeds Sports Dataset and CUB-200-2011.

【Keywords】: Pose Estimation; Part Localization; Exemplar-Based Model

492. Multi-View 3D Human Tracking in Crowded Scenes.

【Paper Link】【Pages】:3553-3559

【Authors】: Xiaobai Liu

【Abstract】: This paper presents a robust multi-view method for tracking people in a 3D scene. Our method distinguishes itself from previous works in two aspects. Firstly, we define a set of binary spatial relationships for individual subjects or pairs of subjects that appear at the same time, e.g. being left or right, being closer or further to the camera, etc. These binary relationships directly reflect relative positions of subjects in the 3D scene and thus should be preserved during inference. Secondly, we introduce a unified probabilistic framework to exploit binary spatial constraints for simultaneous 3D localization and cross-view human tracking. We develop a cluster Markov Chain Monte Carlo method to search for the optimal solution. We evaluate our method on both public video benchmarks and a newly built multi-view video dataset. Results with comparisons showed that our method could achieve state-of-the-art tracking results and meter-level 3D localization on challenging videos.

【Keywords】: tracking; localization; bayesian

493. Face Model Compression by Distilling Knowledge from Neurons.

【Paper Link】【Pages】:3560-3566

【Authors】: Ping Luo ; Zhenyao Zhu ; Ziwei Liu ; Xiaogang Wang ; Xiaoou Tang

【Abstract】: The recent advanced face recognition systems were built on large Deep Neural Networks (DNNs) or their ensembles, which have millions of parameters. However, the expensive computation of DNNs makes their deployment difficult on mobile and embedded devices. This work addresses model compression for face recognition, where the learned knowledge of a large teacher network or its ensemble is utilized as supervision to train a compact student network. Unlike previous works that represent the knowledge by the softened label probabilities, which are difficult to fit, we represent the knowledge by using the neurons at the higher hidden layer, which preserve as much information as the label probabilities, but are more compact. By leveraging the essential characteristics (domain knowledge) of the learned face representation, a neuron selection method is proposed to choose neurons that are most relevant to face recognition. Using the selected neurons as supervision to mimic the single networks of DeepID2+ and DeepID3, which are the state-of-the-art face recognition systems, a compact student with simple network structure achieves better verification accuracy on LFW than its teachers, respectively. When using an ensemble of DeepID2+ as teacher, a mimicked student is able to outperform it and achieves 51.6 times compression ratio and 90 times speed-up in inference, making this cumbersome model applicable on portable devices.

【Keywords】: Deep Learning; Model Compression; Face Recognition; Attribute

494. Learning to Answer Questions from Image Using Convolutional Neural Network.

【Paper Link】【Pages】:3567-3573

【Authors】: Lin Ma ; Zhengdong Lu ; Hang Li

【Abstract】: In this paper, we propose to employ the convolutional neural network (CNN) for the image question answering (QA) task. Our proposed CNN provides an end-to-end framework with convolutional architectures for learning not only the image and question representations, but also their inter-modal interactions to produce the answer. More specifically, our model consists of three CNNs: one image CNN to encode the image content, one sentence CNN to compose the words of the question, and one multimodal convolution layer to learn their joint representation for the classification in the space of candidate answer words. We demonstrate the efficacy of our proposed model on the DAQUAR and COCO-QA datasets, which are two benchmark datasets for image QA, with the performances significantly outperforming the state-of-the-art.

【Keywords】: image and language; image question answering; CNN

495. SentiCap: Generating Image Descriptions with Sentiments.

【Paper Link】【Pages】:3574-3580

【Authors】: Alexander Patrick Mathews ; Lexing Xie ; Xuming He

【Abstract】: The recent progress on image recognition and language modeling is making automatic description of image content a reality. However, stylized, non-factual aspects of the written description are missing from the current systems. One such style is descriptions with emotions, which is commonplace in everyday communication, and influences decision-making and interpersonal relationships. We design a system to describe an image with emotions, and present a model that automatically generates captions with positive or negative sentiments. We propose a novel switching recurrent neural network with word-level regularization, which is able to produce emotional image captions using only 2000+ training sentences containing sentiments. We evaluate the captions with different automatic and crowd-sourcing metrics. Our model compares favourably in common quality metrics for image captioning. In 84.6% of cases the generated positive captions were judged as being at least as descriptive as the factual captions. Of these positive captions 88% were confirmed by the crowd-sourced workers as having the appropriate sentiment.

【Keywords】: computer vision; neural networks; sentiment; natural language generation; image caption

496. Look, Listen and Learn - A Multimodal LSTM for Speaker Identification.

【Paper Link】【Pages】:3581-3587

【Authors】: Jimmy S. J. Ren ; Yongtao Hu ; Yu-Wing Tai ; Chuan Wang ; Li Xu ; Wenxiu Sun ; Qiong Yan

【Abstract】: Speaker identification refers to the task of localizing the face of a person who has the same identity as the ongoing voice in a video. This task not only requires collective perception over both visual and auditory signals; robustness to severe quality degradations and unconstrained content variations is also indispensable. In this paper, we describe a novel multimodal Long Short-Term Memory (LSTM) architecture which seamlessly unifies both visual and auditory modalities from the beginning of each sequence input. The key idea is to extend the conventional LSTM by not only sharing weights across time steps, but also sharing weights across modalities. We show that modeling the temporal dependency across face and voice can significantly improve the robustness to content quality degradations and variations. We also found that our multimodal LSTM is robust to distractors, namely the non-speaking identities. We applied our multimodal LSTM to The Big Bang Theory dataset and showed that our system outperforms the state-of-the-art systems in speaker identification with lower false alarm rate and higher recognition accuracy.

【Keywords】:

497. Toward a Taxonomy and Computational Models of Abnormalities in Images.

【Paper Link】【Pages】:3588-3596

【Authors】: Babak Saleh ; Ahmed M. Elgammal ; Jacob Feldman ; Ali Farhadi

【Abstract】: The human visual system can spot an abnormal image, and reason about what makes it strange. This task has not received enough attention in computer vision. In this paper we study various types of atypicalities in images in a more comprehensive way than has been done before. We propose a new dataset of abnormal images showing a wide range of atypicalities. We design human subject experiments to discover a coarse taxonomy of the reasons for abnormality. Our experiments reveal three major categories of abnormality: object-centric, scene-centric, and contextual. Based on this taxonomy, we propose a comprehensive computational model that can predict all different types of abnormality in images and outperform prior art in abnormality recognition.

【Keywords】: Visual Attributes; Classification; Detection; Reasoning; Probabilistic Graphical Models; Crowdsourcing; Human Perception

498. Domain-Constraint Transfer Coding for Imbalanced Unsupervised Domain Adaptation.

【Paper Link】【Pages】:3597-3603

【Authors】: Yao-Hung Hubert Tsai ; Cheng-An Hou ; Wei-Yu Chen ; Yi-Ren Yeh ; Yu-Chiang Frank Wang

【Abstract】: Unsupervised domain adaptation (UDA) deals with the task in which labeled training data and unlabeled test data are collected from source and target domains, respectively. In this paper, we particularly address the practical and challenging scenario of imbalanced cross-domain data. That is, we do not assume the label numbers across domains to be the same, and we also allow the data in each domain to be collected from multiple datasets/sub-domains. To solve the above task of imbalanced domain adaptation, we propose a novel algorithm of Domain-constraint Transfer Coding (DcTC). Our DcTC is able to exploit latent subdomains within and across data domains, and learns a common feature space for joint adaptation and classification purposes. Without assuming balanced cross-domain data as most existing UDA approaches do, we show that our method performs favorably against state-of-the-art methods on multiple cross-domain visual classification tasks.

【Keywords】:

499. Recognizing Actions in 3D Using Action-Snippets and Activated Simplices.

【Paper Link】【Pages】:3604-3610

【Authors】: Chunyu Wang ; John Flynn ; Yizhou Wang ; Alan L. Yuille

【Abstract】: Pose-based action recognition in 3D is the task of recognizing an action (e.g., walking or running) from a sequence of 3D skeletal poses. This is challenging because of variations due to different ways of performing the same action and inaccuracies in the estimation of the skeletal poses. The training data is usually small and hence complex classifiers risk over-fitting the data. We address this task by action-snippets which are short sequences of consecutive skeletal poses capturing the temporal relationships between poses in an action. We propose a novel representation for action-snippets, called activated simplices. Each activity is represented by a manifold which is approximated by an arrangement of activated simplices. A sequence (of action-snippets) is classified by selecting the closest manifold and outputting the corresponding activity. This is a simple classifier which helps avoid over-fitting the data but which significantly outperforms state-of-the-art methods on standard benchmarks.

【Keywords】: sparse coding; activated simplices; nearest neighbor; action recognition; tight

500. DARI: Distance Metric and Representation Integration for Person Verification.

【Paper Link】【Pages】:3611-3617

【Authors】: Guangrun Wang ; Liang Lin ; Shengyong Ding ; Ya Li ; Qing Wang

【Abstract】: The past decade has witnessed the rapid development of feature representation learning and distance metric learning, whereas the two steps are often discussed separately. To explore their interaction, this work proposes an end-to-end learning framework called DARI, i.e. Distance metric And Representation Integration, and validates the effectiveness of DARI in the challenging task of person verification. Given the training images annotated with the labels, we first produce a large number of triplet units, and each one contains three images, i.e. one person and the matched/mismatched references. For each triplet unit, the distance disparity between the matched pair and the mismatched pair tends to be maximized. We solve this objective by building a deep architecture of convolutional neural networks. In particular, the Mahalanobis distance matrix is naturally factorized as one top fully-connected layer that is seamlessly integrated with other bottom layers representing the image feature. The image feature and the distance metric can thus be simultaneously optimized via the one-shot backward propagation. On several public datasets, DARI shows very promising performance on re-identifying individuals across cameras against various challenges, and outperforms other state-of-the-art approaches.

【Keywords】: Person Verification; Representation Learning; Distance Metric Learning
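
The abstract above factorizes the Mahalanobis matrix as a fully-connected layer and trains on triplets. A minimal sketch of that idea follows, assuming the factorization M = L^T L so that the squared distance is ||L(x - y)||^2, and assuming a hinge loss with a margin as the triplet objective. The hinge form, the margin value, and all names are illustrative choices, not details confirmed by the abstract.

```python
def linear(L, x):
    """Apply a bias-free fully-connected layer with weight matrix L."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in L]


def mahalanobis_sq(L, x, y):
    """Squared Mahalanobis distance with M = L^T L, i.e. ||L(x - y)||^2."""
    diff = [a - b for a, b in zip(x, y)]
    return sum(v * v for v in linear(L, diff))


def triplet_hinge(L, anchor, matched, mismatched, margin=1.0):
    """Assumed objective: penalize triplets where the matched pair is not
    closer than the mismatched pair by at least `margin`."""
    d_pos = mahalanobis_sq(L, anchor, matched)
    d_neg = mahalanobis_sq(L, anchor, mismatched)
    return max(0.0, margin + d_pos - d_neg)


# Hypothetical 2-d image features and an identity metric layer.
identity = [[1.0, 0.0], [0.0, 1.0]]
easy = triplet_hinge(identity, [0.0, 0.0], [1.0, 0.0], [3.0, 0.0])
hard = triplet_hinge(identity, [0.0, 0.0], [1.0, 0.0], [0.5, 0.0])
```

Writing the metric as a layer means its weights receive gradients from the same loss as the feature layers below it, which is how the representation and the metric can be optimized together.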

501. Video Semantic Clustering with Sparse and Incomplete Tags.

【Paper Link】【Pages】:3618-3624

【Authors】: Jingya Wang ; Xiatian Zhu ; Shaogang Gong

【Abstract】: Clustering tagged videos into semantic groups is important but challenging due to the need for jointly learning correlations between heterogeneous visual and tag data. The task is made more difficult by inherently sparse and incomplete tag labels. In this work, we develop a method for accurately clustering tagged videos based on a novel Hierarchical-MultiLabel Random Forest model capable of correlating structured visual and tag information. Specifically, our model exploits hierarchically structured tags of different abstractness of semantics and multiple tag statistical correlations, thus discovers more accurate semantic correlations among different video data, even with highly sparse/incomplete tags.

【Keywords】: VIS: Videos; ML: Clustering

502. Path Following with Adaptive Path Estimation for Graph Matching.

【Paper Link】【Pages】:3625-3631

【Authors】: Tao Wang ; Haibin Ling

【Abstract】: Graph matching plays an important role in many fields in computer vision. It is a well-known general NP-hard problem and has been investigated for decades. Among the large number of algorithms for graph matching, those utilizing the path following strategy have exhibited state-of-the-art performance. However, the main drawback of this category of algorithms lies in their high computational burden. In this paper, we propose a novel path following strategy for graph matching aiming to improve its computational efficiency. We first propose a path estimation method to reduce the computational cost at each iteration, and subsequently a method of adaptive step length to accelerate the convergence. The proposed approach can be integrated into all algorithms that utilize the path following strategy. To validate our approach, we compare it with several recently proposed graph matching algorithms on three benchmark image datasets. Experimental results show that our approach significantly improves the computational efficiency of the original algorithms and offers similar or better matching results.

【Keywords】: graph matching; feature matching; path following

503. Pose-Guided Human Parsing by an AND/OR Graph Using Pose-Context Features.

【Paper Link】【Pages】:3632-3640

【Authors】: Fangting Xia ; Jun Zhu ; Peng Wang ; Alan L. Yuille

【Abstract】: Parsing humans into semantic parts is crucial to human-centric analysis. In this paper, we propose a human parsing pipeline that uses pose cues, e.g., estimates of human joint locations, to provide pose-guided segment proposals for semantic parts. These segment proposals are ranked using standard appearance cues, deep-learned semantic features, and a novel pose feature called pose-context. Then these proposals are selected and assembled using an And-Or graph to output a parse of the person. The And-Or graph is able to deal with large human appearance variability due to pose, choice of clothing, etc. We evaluate our approach on the popular Penn-Fudan pedestrian parsing dataset, showing that it significantly outperforms the state of the art, and perform diagnostics to demonstrate the effectiveness of different stages of our pipeline.

【Keywords】: human parsing; pose feature; AOG

504. Diversified Dynamical Gaussian Process Latent Variable Model for Video Repair.

【Paper Link】【Pages】:3641-3647

【Authors】: Hao Xiong ; Tongliang Liu ; Dacheng Tao

【Abstract】: Videos can be conserved on different media. However, storing on media such as films and hard disks can suffer from unexpected data loss, for instance from physical damage. Repair of missing or damaged pixels is essential for video maintenance and preservation. Most methods seek to fill in missing holes by synthesizing similar textures from local or global frames. However, this can introduce incorrect contexts, especially when the missing hole or number of damaged frames is large. Furthermore, simple texture synthesis can introduce artifacts in undamaged and recovered areas. To address aforementioned problems, we propose the diversified dynamical Gaussian process latent variable model (D2GPLVM) for considering the variety in existing videos and thus introducing a diversity encouraging prior to inducing points. The aim is to ensure that the trained inducing points, which are a smaller set of all observed undamaged frames, are more diverse and resistant for context-aware and artifacts-free based video repair. The defined objective function in our proposed model is initially not analytically tractable and must be solved by variational inference. Finally, experimental testing illustrates the robustness and effectiveness of our method for damaged video repair.

【Keywords】: DGPLVM, inducing points, latent variable, diversity encouraging prior.

505. Metric Embedded Discriminative Vocabulary Learning for High-Level Person Representation.

【Paper Link】【Pages】:3648-3654

【Authors】: Yang Yang ; Zhen Lei ; Shifeng Zhang ; Hailin Shi ; Stan Z. Li

【Abstract】: A variety of encoding methods for the bag-of-words (BoW) model have been proposed to encode the local features in image classification. However, most of them are unsupervised and just employ k-means to form the visual vocabulary, thus reducing the discriminative power of the features. In this paper, we propose a metric embedded discriminative vocabulary learning method for high-level person representation with application to person re-identification. A new and effective term is introduced which aims at making the same persons closer and different ones farther apart in the metric space. With the learned vocabulary, we utilize a linear coding method to encode the image-level features (or holistic image features) for extracting high-level person representation. Different from traditional unsupervised approaches, our method can explore the relationship (same or not) among the persons. Since there is an analytic solution to the linear coding, it is easy to obtain the final high-level features. The experimental results on person re-identification demonstrate the effectiveness of our proposed algorithm.

【Keywords】: Vocabulary Learning; Person Re-identification; High-Level features

506. Large Scale Similarity Learning Using Similar Pairs for Person Verification.

【Paper Link】【Pages】:3655-3661

【Authors】: Yang Yang ; Shengcai Liao ; Zhen Lei ; Stan Z. Li

【Abstract】: In this paper, we propose a novel similarity measure and then introduce an efficient strategy to learn it by using only similar pairs for person verification. Unlike existing metric learning methods, we consider both the difference and commonness of an image pair to increase its discriminativeness. Under a pair-constrained Gaussian assumption, we show how to obtain the Gaussian priors (i.e., corresponding covariance matrices) of dissimilar pairs from those of similar pairs. The application of a log likelihood ratio makes the learning process simple and fast and thus scalable to large datasets. Additionally, our method is able to handle heterogeneous data well. Results on the challenging datasets of face verification (LFW and Pub-Fig) and person re-identification (VIPeR) show that our algorithm outperforms the state-of-the-art methods.

【Keywords】: similarity learning; person verification

507. Unsupervised Co-Activity Detection from Multiple Videos Using Absorbing Markov Chain.

【Paper Link】【Pages】:3662-3668

【Authors】: Donghun Yeo ; Bohyung Han ; Joon Hee Han

【Abstract】: We propose a simple but effective unsupervised learning algorithm to detect a common activity (co-activity) from a set of videos, which is formulated using an absorbing Markov chain in a principled way. In our algorithm, a complete multipartite graph is first constructed, where vertices correspond to subsequences extracted from videos using a temporal sliding window and edges connect vertices originating from different videos; the weight of an edge is proportional to the similarity between the features of its two end vertices. Then, we extend the graph structure by adding edges between temporally overlapped subsequences in a video to handle variable-length co-activities using temporal locality, and create an absorbing vertex connected from all other nodes. The proposed algorithm identifies a subset of subsequences as co-activity by estimating absorption time in the constructed graph efficiently. The great advantage of our algorithm lies in the properties that it can handle more than two videos naturally and identify multiple instances of a co-activity with variable lengths in a video. Our algorithm is evaluated intensively on a challenging dataset and illustrates outstanding performance quantitatively and qualitatively.

【Keywords】: co-activity detection; absorbing Markov chain; unsupervised learning
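
The absorption time mentioned in this abstract is a standard quantity for absorbing Markov chains: if Q holds the transition probabilities among transient states, the expected number of steps to absorption t satisfies (I - Q) t = 1. The following is a minimal sketch of that textbook computation only; the function name and the toy matrix are illustrative assumptions, and the paper's graph construction and its rule for selecting subsequences are not reproduced here.

```python
def expected_absorption_times(Q):
    """Solve (I - Q) t = 1 for t by Gauss-Jordan elimination.

    Q[i][j] is the probability of moving from transient state i to
    transient state j; the leftover mass 1 - sum(Q[i]) goes to the
    absorbing state.
    """
    n = len(Q)
    # Augmented matrix [I - Q | 1].
    A = [[(1.0 if i == j else 0.0) - Q[i][j] for j in range(n)] + [1.0]
         for i in range(n)]
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("I - Q is singular: some state is never absorbed")
        A[col], A[pivot] = A[pivot], A[col]
        pv = A[col][col]
        A[col] = [v / pv for v in A[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and A[r][col] != 0.0:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [row[n] for row in A]


# Toy chain (hypothetical): two transient states, each absorbed with
# probability 0.25 per step.
Q = [[0.5, 0.25],
     [0.25, 0.5]]
times = expected_absorption_times(Q)
```

In this toy chain both states leave the transient set with probability 0.25 per step, so each has an expected absorption time of 4 steps.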

508. Discrete Image Hashing Using Large Weakly Annotated Photo Collections.

【Paper Link】【Pages】:3669-3675

【Authors】: Hanwang Zhang ; Na Zhao ; Xindi Shang ; Huan-Bo Luan ; Tat-Seng Chua

【Abstract】: We address the problem of image hashing by learning binary codes from large and weakly supervised photo collections. Due to the explosive growth of user generated media on the Web, this problem is becoming critical for large-scale visual applications like image retrieval. While most existing hashing methods fail to address this challenge well, our method shows promising improvement due to the following two key advantages. First, we formulate a novel hashing objective that can effectively mine implicit weak supervision by collaborative filtering. Second, we propose a discrete hashing algorithm, offered with efficient optimization, to overcome the inferior optimizations in obtaining binary codes from real-valued solutions. In this way, our method can be considered as a weakly-supervised discrete hashing framework which jointly learns image semantics and their corresponding binary codes. Through training on one million weakly annotated images, our experimental results demonstrate that image retrieval using the proposed hashing method outperforms the other state-of-the-art ones on image and video benchmarks.

【Keywords】: Discrete hashing; Image retrieval; Weakly-supervised Learning

509. Group Cost-Sensitive Boosting for Multi-Resolution Pedestrian Detection.

【Paper Link】【Pages】:3676-3682

【Authors】: Chao Zhu ; Yuxin Peng

【Abstract】: As an important yet challenging problem in computer vision, pedestrian detection has achieved impressive progress in recent years. However, the significant performance decline with decreasing resolution is a major bottleneck of current state-of-the-art methods. For the popular boosting-based detectors, one of the main reasons is that low resolution samples, which are usually more difficult to detect than high resolution ones, are treated with equal costs in the boosting process, with the consequence that they are more easily rejected in early stages and can hardly be recovered in late stages as false negatives. To address this problem, we propose in this paper a new multi-resolution detection approach based on a novel group cost-sensitive boosting algorithm, which extends the popular AdaBoost by exploring different costs for different resolution groups in the boosting process, and places more emphasis on the low resolution group in order to better handle detection of hard samples. The proposed approach is evaluated on the challenging Caltech pedestrian benchmark, and outperforms other state-of-the-art methods on different resolution-specific test sets.

【Keywords】:

510. Learning Cross-Domain Neural Networks for Sketch-Based 3D Shape Retrieval.

【Paper Link】【Pages】:3683-3689

【Authors】: Fan Zhu ; Jin Xie ; Yi Fang

【Abstract】: Sketch-based 3D shape retrieval, which returns a set of relevant 3D shapes based on users' input sketch queries, has been receiving increasing attention in both the graphics community and the vision community. In this work, we address the sketch-based 3D shape retrieval problem with a novel Cross-Domain Neural Networks (CDNN) approach, which is further extended to Pyramid Cross-Domain Neural Networks (PCDNN) by cooperating with a hierarchical structure. In order to alleviate the discrepancies between sketch features and 3D shape features, a neural network pair that forces identical representations at the target layer for instances of the same class is trained for sketches and 3D shapes respectively. By constructing cross-domain neural networks at multiple pyramid levels, a many-to-one relationship is established between a 3D shape feature and sketch features extracted from different scales. We evaluate the effectiveness of both the CDNN and PCDNN approaches on the extended large-scale SHREC 2014 benchmark and compare with some other well established methods. Experimental results suggest that both CDNN and PCDNN can outperform state-of-the-art performance, where PCDNN can further improve CDNN when employing a hierarchical structure.

【Keywords】: Cross-domain neural networks; sketch; 3D shape retrieval

511. MC-HOG Correlation Tracking with Saliency Proposal.

【Paper Link】【Pages】:3690-3696

【Authors】: Guibo Zhu ; Jinqiao Wang ; Yi Wu ; Xiaoyu Zhang ; Hanqing Lu

【Abstract】: Designing effective features and handling the model drift problem are two important aspects of online visual tracking. For feature representation, gradient and color features are most widely used, but how to effectively combine them for visual tracking is still an open problem. In this paper, we propose a rich feature descriptor, MC-HOG, by leveraging rich gradient information across multiple color channels or spaces. Then MC-HOG features are embedded into the correlation tracking framework to estimate the state of the target. For handling the model drift problem caused by occlusion or distractors, we propose saliency proposals as prior information to provide candidates and reduce background interference. In addition to saliency proposals, a ranking strategy is proposed to determine the importance of these proposals by exploiting the learnt appearance filter, historical preserved object samples and the distracting proposals. In this way, the proposed approach could effectively explore the color-gradient characteristics and alleviate the model drift problem. Extensive evaluations performed on the benchmark dataset show the superiority of the proposed method.

【Keywords】: visual tracking; correlation filter; saliency proposal

512. Co-Occurrence Feature Learning for Skeleton Based Action Recognition Using Regularized Deep LSTM Networks.

【Paper Link】【Pages】:3697-3704

【Authors】: Wentao Zhu ; Cuiling Lan ; Junliang Xing ; Wenjun Zeng ; Yanghao Li ; Li Shen ; Xiaohui Xie

【Abstract】: Skeleton based action recognition distinguishes human actions using the trajectories of skeleton joints, which provide a very good representation for describing actions. Considering that recurrent neural networks (RNNs) with Long Short-Term Memory (LSTM) can learn feature representations and model long-term temporal dependencies automatically, we propose an end-to-end fully connected deep LSTM network for skeleton based action recognition. Inspired by the observation that the co-occurrences of the joints intrinsically characterize human actions, we take the skeleton as the input at each time slot and introduce a novel regularization scheme to learn the co-occurrence features of skeleton joints. To train the deep LSTM network effectively, we propose a new dropout algorithm which simultaneously operates on the gates, cells, and output responses of the LSTM neurons. Experimental results on three human action recognition datasets consistently demonstrate the effectiveness of the proposed model.

【Keywords】: Action recognition; recurrent neural network with LSTM; skeleton

Special Track on Cognitive Systems 11

513. Using Multiple Representations to Simultaneously Learn Computational Thinking and Middle School Science.

【Paper Link】【Pages】:3705-3711

【Authors】: Satabdi Basu ; Gautam Biswas ; John S. Kinnebrew

【Abstract】: Computational Thinking (CT) is considered a core competency in problem formulation and problem solving. We have developed the Computational Thinking using Simulation and Modeling (CTSiM) learning environment to help middle school students learn science and CT concepts simultaneously. In this paper, we present an approach that leverages multiple linked representations to help students learn by constructing and analyzing computational models of science topics. Results from a recent study show that students successfully use the linked representations to become better modelers and learners.

【Keywords】: Computational Thinking; Science education; Learning by modeling; Scaffolding; Multiple external representations; Modeling and simulation

514. MIDCA: A Metacognitive, Integrated Dual-Cycle Architecture for Self-Regulated Autonomy.

【Paper Link】【Pages】:3712-3718

【Authors】: Michael T. Cox ; Zohreh Alavi ; Dustin Dannenhauer ; Vahid Eyorokon ; Hector Muñoz-Avila ; Don Perlis

【Abstract】: We present a metacognitive, integrated, dual-cycle architecture whose function is to provide agents with a greater capacity for acting robustly in a dynamic environment and managing unexpected events. We present MIDCA 1.3, an implementation of this architecture which explores a novel approach to goal generation, planning and execution given surprising situations. We formally define the mechanism and report empirical results from this goal generation algorithm. Finally, we describe the similarity between its choices at the cognitive level with those at the metacognitive.

【Keywords】: cognitive architecture; goal-driven autonomy; computational metacognition; cognitive robotics

515. Commonsense Interpretation of Triangle Behavior.

【Paper Link】【Pages】:3719-3725

【Authors】: Andrew S. Gordon

【Abstract】: The ability to infer intentions, emotions, and other unobservable psychological states from people's behavior is a hallmark of human social cognition, and an essential capability for future Artificial Intelligence systems. The commonsense theories of psychology and sociology necessary for such inferences have been a focus of logic-based knowledge representation research, but have been difficult to employ in robust automated reasoning architectures. In this paper we model behavior interpretation as a process of logical abduction, where the reasoning task is to identify the most probable set of assumptions that logically entail the observable behavior of others, given commonsense theories of psychology and sociology. We evaluate our approach using Triangle-COPA, a benchmark suite of 100 challenge problems based on an early social psychology experiment by Fritz Heider and Marianne Simmel. Commonsense knowledge of actions, social relationships, intentions, and emotions are encoded as defeasible axioms in first-order logic. We identify sets of assumptions that logically entail observed behaviors by backchaining with these axioms to a given depth, and order these sets by their joint probability assuming conditional independence. Our approach solves almost all (91) of the 100 questions in Triangle-COPA, and demonstrates a promising approach to robust behavior interpretation that integrates both logical and probabilistic reasoning.

【Keywords】: Commonsense Reasoning; Logical Abduction; Interpretation

516. Surprise-Triggered Reformulation of Design Goals.

【Paper Link】【Pages】:3726-3732

【Authors】: Kazjon Grace ; Mary Lou Maher

【Abstract】: This paper presents a cognitive model of goal formulation in designing that is triggered by surprise. Cognitive system approaches to design synthesis focus on generating alternative designs in response to design goals or requirements. Few existing systems provide models for how goals change during designing, a hallmark of creative design in humans. In this paper we present models of surprise and reformulation as metacognitive processes that transform design goals in order to explore surprising regions of a design search space. The model provides a system with specific goals for exploratory behaviour, whereas previous systems have modelled exploration and novelty-seeking abstractly. We use observed designs to construct a probabilistic model that represents expectations about the design domain, and then reason about the unexpectedness of new designs with that model. We implement our model in the domain of culinary creativity, and demonstrate how the cognitive behaviors of surprise and problem reformulation can be incorporated into design reasoning.

【Keywords】: computational design; goal reasoning; surprise; reformulation

517. Visual Learning of Arithmetic Operation.

【Paper Link】【Pages】:3733-3739

【Authors】: Yedid Hoshen ; Shmuel Peleg

【Abstract】: A simple Neural Network model is presented for end-to-end visual learning of arithmetic operations from pictures of numbers. The input consists of two pictures, each showing a 7-digit number. The output, also a picture, displays the number showing the result of an arithmetic operation (e.g., addition or subtraction) on the two input numbers. The concepts of a number, or of an operator, are not explicitly introduced. This indicates that addition is a simple cognitive task, which can be learned visually using a very small number of neurons. Other operations, e.g., multiplication, were not learnable using this architecture. Some tasks were not learnable end-to-end (e.g., addition with Roman numerals), but were easily learnable once broken into two separate sub-tasks: a perceptual Character Recognition sub-task and a cognitive Arithmetic sub-task. This indicates that while some tasks may be easily learnable end-to-end, others may need to be broken into sub-tasks.

【Keywords】: Arithmetic; Perception; cognition; action cycle; End-to-end learning; Visual learning

518. Modeling Human Ad Hoc Coordination.

【Paper Link】【Pages】:3740-3746

【Authors】: Peter M. Krafft ; Chris L. Baker ; Alex Pentland ; Joshua B. Tenenbaum

【Abstract】: Whether in groups of humans or groups of computer agents, collaboration is most effective between individuals who have the ability to coordinate on a joint strategy for collective action. However, in general a rational actor will only intend to coordinate if that actor believes the other group members have the same intention. This circular dependence makes rational coordination difficult in uncertain environments if communication between actors is unreliable and no prior agreements have been made. An important normative question with regard to coordination in these ad hoc settings is therefore how one can come to believe that other actors will coordinate, and with regard to systems involving humans, an important empirical question is how humans arrive at these expectations. We introduce an exact algorithm for computing the infinitely recursive hierarchy of graded beliefs required for rational coordination in uncertain environments, and we introduce a novel mechanism for multiagent coordination that uses it. Our algorithm is valid in any environment with a finite state space, and extensions to certain countably infinite state spaces are likely possible. We test our mechanism for multiagent coordination as a model for human decisions in a simple coordination game using existing experimental data. We then explore via simulations whether modeling humans in this way may improve human-agent collaboration.

【Keywords】: common knowledge; common p-belief; inference; algorithms

519. Predicting Readers' Sarcasm Understandability by Modeling Gaze Behavior.

【Paper Link】【Pages】:3747-3753

【Authors】: Abhijit Mishra ; Diptesh Kanojia ; Pushpak Bhattacharyya

【Abstract】: Sarcasm understandability, or the ability to understand textual sarcasm, depends upon readers' language proficiency, social knowledge, mental state and attentiveness. We introduce a novel method to predict the sarcasm understandability of a reader. Presence of incongruity in textual sarcasm often elicits distinctive eye-movement behavior by human readers. By recording and analyzing the eye-gaze data, we show that eye-movement patterns vary when sarcasm is understood vis-à-vis when it is not. Motivated by our observations, we propose a system for sarcasm understandability prediction using supervised machine learning. Our system relies on readers' eye-movement parameters and a few textual features, and is thus able to predict sarcasm understandability with an F-score of 93%, which demonstrates its efficacy. The availability of inexpensive embedded eye-trackers on mobile devices creates avenues for applying such research, which benefits web-content creators, review writers and social media analysts alike.

【Keywords】: sarcasm understandability; eye movement; sarcasm understandability prediction; eye tracking; eye movement pattern; context incongruity; multi-instance learning

520. Modeling Human Understanding of Complex Intentional Action with a Bayesian Nonparametric Subgoal Model.

【Paper Link】【Pages】:3754-3760

【Authors】: Ryo Nakahashi ; Chris L. Baker ; Joshua B. Tenenbaum

【Abstract】: Most human behaviors consist of multiple parts, steps, or subtasks. These structures guide our action planning and execution, but when we observe others, the latent structure of their actions is typically unobservable, and must be inferred in order to learn new skills by demonstration, or to assist others in completing their tasks. For example, an assistant who has learned the subgoal structure of a colleague’s task can more rapidly recognize and support their actions as they unfold. Here we model how humans infer subgoals from observations of complex action sequences using a nonparametric Bayesian model, which assumes that observed actions are generated by approximately rational planning over unknown subgoal sequences. We test this model with a behavioral experiment in which humans observed different series of goal-directed actions, and inferred both the number and composition of the subgoal sequences associated with each goal. The Bayesian model predicts human subgoal inferences with high accuracy, and significantly better than several alternative models and straightforward heuristics. Motivated by this result, we simulate how learning and inference of subgoals can improve performance in an artificial user assistance task. The Bayesian model learns the correct subgoals from fewer observations, and better assists users by more rapidly and accurately inferring the goal of their actions than alternative approaches.

【Keywords】: Action Understanding; Social Cognition; Probabilistic Models of Cognition; User Assistance

521. Unsupervised Lexical Simplification for Non-Native Speakers.

【Paper Link】【Pages】:3761-3767

【Authors】: Gustavo H. Paetzold ; Lucia Specia

【Abstract】: Lexical Simplification is the task of replacing complex words with simpler alternatives. We propose a novel, unsupervised approach for the task. It relies on two resources: a corpus of subtitles and a new type of word embeddings model that accounts for the ambiguity of words. We compare the performance of our approach and many others over a new evaluation dataset, which accounts for the simplification needs of 400 non-native English speakers. The experiments show that our approach outperforms state-of-the-art work in Lexical Simplification.

【Keywords】: Lexical Simplification; Text Simplification; Text Adaptation; Word Embeddings

522. QART: A System for Real-Time Holistic Quality Assurance for Contact Center Dialogues.

【Paper Link】【Pages】:3768-3775

【Authors】: Shourya Roy ; Ragunathan Mariappan ; Sandipan Dandapat ; Saurabh Srivastava ; Sainyam Galhotra ; Balaji Peddamuthu

【Abstract】: Quality assurance (QA) and customer satisfaction (C-Sat) analysis are two commonly used practices to measure the goodness of dialogues between agents and customers in contact centers. The practices, however, have a few shortcomings. QA puts sole emphasis on agents’ organizational compliance, whereas C-Sat attempts to measure customers’ satisfaction based only on post-dialogue surveys. As a result, the outcomes of independent QA and C-Sat analysis may not always be in correspondence. Secondly, both processes are retrospective in nature and hence evidence of bad past dialogues (and consequently bad customer experiences) can only be found after hours, days or weeks, depending on their periodicity. Finally, the human-intensive nature of these practices leads to time and cost overhead while being able to analyze only a small fraction of dialogues. In this paper, we introduce an automatic real-time quality assurance system for contact centers, QART (pronounced cart). QART performs multi-faceted analysis on dialogue utterances, as they happen, using sophisticated statistical and rule-based natural language processing (NLP) techniques. It covers various aspects inspired by today’s QA and C-Sat practices and also introduces a novel incremental dialogue summarization capability. The QART front-end is an interactive dashboard providing views of ongoing dialogues at different granularity, enabling agents’ supervisors to monitor and take corrective actions as needed. We demonstrate the effectiveness of different back-end modules as well as the overall system by experimental results on a real-life contact center chat dataset.

【Keywords】: Quality Assurance; C-Sat Analysis; Call Center Automation

523. Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models.

【Paper Link】【Pages】:3776-3784

【Authors】: Iulian Vlad Serban ; Alessandro Sordoni ; Yoshua Bengio ; Aaron C. Courville ; Joelle Pineau

【Abstract】: We investigate the task of building open domain, conversational dialogue systems based on large dialogue corpora using generative models. Generative models produce system responses that are autonomously generated word-by-word, opening up the possibility for realistic, flexible interactions. In support of this goal, we extend the recently proposed hierarchical recurrent encoder-decoder neural network to the dialogue domain, and demonstrate that this model is competitive with state-of-the-art neural language models and back-off n-gram models. We investigate the limitations of this and similar approaches, and show how its performance can be improved by bootstrapping the learning from a larger question-answer pair corpus and from pretrained word embeddings.

【Keywords】: Dialogue Systems, Cognitive Systems, Neural Networks, Generative Probabilistic Models, Word Embeddings, Transfer Learning

Special Track on Computational Sustainability 22

524. Achieving Stable and Fair Profit Allocation with Minimum Subsidy in Collaborative Logistics.

【Paper Link】【Pages】:3785-3792

【Authors】: Lucas Agussurja ; Hoong Chuin Lau ; Shih-Fen Cheng

【Abstract】: With the advent of e-commerce, logistics providers are faced with the challenge of handling fluctuating and sparsely distributed demand, which raises their operational costs significantly. As a result, horizontal cooperation is gaining momentum around the world. One of the major impediments, however, is the lack of a stable and fair profit sharing mechanism. In this paper, we address this problem using the framework of computational cooperative games. We first present the cooperative vehicle routing game as a model for collaborative logistics operations. Using the axioms of the Shapley value as the conditions for fairness, we show that a stable, fair and budget balanced allocation does not exist in many instances of the game. By relaxing budget balance, we then propose an allocation scheme based on the normalized Shapley value. We show that this scheme maintains stability and fairness while requiring minimum subsidy. Finally, using numerical experiments we demonstrate the feasibility of the scheme under various settings.

【Keywords】: cooperative game theory; Shapley value; profit sharing; collaborative logistics
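
As a reminder of the fairness notion this abstract builds on, the Shapley value gives each player the average of its marginal contributions over all orders in which the coalition could form. The sketch below computes it for a three-player toy game and checks the allocation against a stability condition. The savings table, the function names, and the reading of "stable" as the core condition are assumptions for illustration; the paper's normalized Shapley value and subsidy computation are not shown.

```python
from itertools import permutations
from math import factorial


def shapley_values(players, v):
    """Average each player's marginal contribution over all join orders."""
    totals = {p: 0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            joined = coalition | {p}
            totals[p] += v(joined) - v(coalition)
            coalition = joined
    n_orders = factorial(len(players))
    return {p: totals[p] / n_orders for p in players}


def is_stable(alloc, v_table):
    """Core condition (assumed meaning of 'stable'): no coalition is paid
    less in total than it could earn on its own."""
    return all(sum(alloc[p] for p in S) >= worth
               for S, worth in v_table.items())


# Hypothetical cost savings for three logistics providers A, B, C.
SAVINGS = {
    frozenset(): 0,
    frozenset("A"): 0, frozenset("B"): 0, frozenset("C"): 0,
    frozenset("AB"): 60, frozenset("AC"): 60, frozenset("BC"): 0,
    frozenset("ABC"): 90,
}

phi = shapley_values("ABC", SAVINGS.__getitem__)
stable = is_stable(phi, SAVINGS)
```

In this toy game the allocation sums to the grand-coalition value of 90 (budget balance) and also satisfies the core condition; the abstract's point is that in many vehicle routing instances no allocation satisfies all of these requirements at once.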

525. Understanding City Traffic Dynamics Utilizing Sensor and Textual Observations.

【Paper Link】【Pages】:3793-3799

【Authors】: Pramod Anantharam ; Krishnaprasad Thirunarayan ; Surendra Marupudi ; Amit P. Sheth ; Tanvi Banerjee

【Abstract】: Understanding speed and travel-time dynamics in response to various city related events is an important and challenging problem. Sensor data (numerical) containing average speed of vehicles passing through a road link can be interpreted in terms of traffic related incident reports from city authorities and social media data (textual), providing a complementary understanding of traffic dynamics. State-of-the-art research is focused on either analyzing sensor observations or citizen observations; we seek to exploit both in a synergistic manner. We demonstrate the role of domain knowledge in capturing the non-linearity of speed and travel-time dynamics by segmenting speed and travel-time observations into simpler components amenable to description using linear models such as the Linear Dynamical System (LDS). Specifically, we propose the Restricted Switching Linear Dynamical System (RSLDS) to model normal speed and travel time dynamics and thereby characterize anomalous dynamics. We utilize the city traffic events extracted from text to explain anomalous dynamics. We present a large scale evaluation of the proposed approach on a real-world traffic and Twitter dataset collected over a year with promising results.

【Keywords】: Time Series Analysis, Linear Dynamical Systems, Anomaly Detection, Social Data, Sensor Data, Traffic Analytics

526. An Axiomatic Framework for Ex-Ante Dynamic Pricing Mechanisms in Smart Grid.

【Paper Link】【Pages】:3800-3806

【Authors】: Sambaran Bandyopadhyay ; Ramasuri Narayanam ; Pratyush Kumar ; Sarvapali Ramchurn ; Vijay Arya ; Iskandarbin Petra

【Abstract】: In electricity markets, the choice of the right pricing regime is crucial for the utilities because the price they charge to their consumers, in anticipation of their demand in real-time, is a key determinant of their profits and ultimately their survival in competitive energy markets. Among the existing pricing regimes, in this paper, we consider ex-ante dynamic pricing schemes as (i) they help to address the peak demand problem (a crucial problem in smart grids), and (ii) they are transparent and fair to consumers as the cost of electricity can be calculated before the actual consumption. In particular, we propose an axiomatic framework that establishes the conceptual underpinnings of the class of ex-ante dynamic pricing schemes. We first propose five key axioms that reflect the criteria that are vital for energy utilities and their relationship with consumers. We then prove an impossibility theorem to show that there is no pricing regime that satisfies all five axioms simultaneously. We also study multiple cost functions arising from various pricing regimes to examine the subset of axioms that they satisfy. We believe that our proposed framework is the first of its kind to evaluate the class of ex-ante dynamic pricing schemes in a manner that can be operationalised by energy utilities.

【Keywords】: Smart Grid; Ex-ante Dynamic Pricing; Axiomatic Framework; Impossibility Theorem; Demand Response

527. Multi-Instance Multi-Label Class Discovery: A Computational Approach for Assessing Bird Biodiversity.

【Paper Link】【Pages】:3807-3813

【Authors】: Forrest Briggs ; Xiaoli Z. Fern ; Raviv Raich ; Matthew Betts

【Abstract】: We study the problem of analyzing a large volume of bioacoustic data collected in-situ with the goal of assessing the biodiversity of bird species at the data collection site. We are interested in the class discovery problem for this setting. Specifically, given a large collection of audio recordings containing bird and other sounds, we aim to automatically select a fixed size subset of the recordings for human expert labeling such that the maximum number of species/classes is discovered. We employ a multi-instance multi-label representation to address multiple simultaneously vocalizing birds with sounds that overlap in time, and propose new algorithms for species/class discovery using this representation. In a comparative study, we show that the proposed methods discover more species/classes than current state-of-the-art in a real world dataset of 92,095 ten-second recordings collected in field conditions.

【Keywords】:

528. Multiagent-Based Route Guidance for Increasing the Chance of Arrival on Time.

【Paper Link】【Pages】:3814-3820

【Authors】: Zhiguang Cao ; Hongliang Guo ; Jie Zhang ; Ulrich Fastenrath

【Abstract】: Transportation and mobility are central to sustainable urban development, where multiagent-based route guidance is widely applied. Traditional multiagent-based route guidance always seeks LET (least expected travel time) paths. However, drivers usually have specific expectations, i.e., tight or loose deadlines, which may not be all met by LET paths. We thus adopt and extend the probability tail model that aims to maximize the probability of reaching destinations before deadlines. Specifically, we propose a decentralized multiagent approach, where infrastructure agents locally collect intentions of concerned vehicle agents and formulate route guidance as a route assignment problem, to guarantee their arrival on time. Experimental results on real road networks justify its ability to increase the chance of arrival on time.

【Keywords】: Multiagent-based Route Guidance; Probability Tail Model; Intelligent Transportation System

529. Understanding Dominant Factors for Precipitation over the Great Lakes Region.

【Paper Link】【Pages】:3821-3827

【Authors】: Soumyadeep Chatterjee ; Stefan Liess ; Arindam Banerjee ; Vipin Kumar

【Abstract】: Statistical modeling of local precipitation involves understanding local, regional and global factors informative of precipitation variability in a region. Modern machine learning methods for feature selection can potentially be explored for identifying statistically significant features from pool of potential predictors of precipitation. In this work, we consider sparse regression, which simultaneously performs feature selection and regression, followed by random permutation tests for selecting dominant factors. We consider average winter precipitation over Great Lakes Region in order to identify its dominant influencing factors. Experiments show that global climate indices, computed at different temporal lags, offer predictive information for winter precipitation. Further, among the dominant factors identified using randomized permutation tests, multiple climate indices indicate the influence of geopotential height patterns on winter precipitation. Using composite analysis, we illustrate that certain patterns are indeed typical in high and low precipitation years, and offer plausible scientific reasons for variations in precipitation. Thus, feature selection methods can be useful in identifying influential climate processes and variables, and thereby provide useful hypotheses over physical mechanisms affecting local precipitation.

【Keywords】: sparse regression; climate science; feature selection

530. A Unifying Variational Inference Framework for Hierarchical Graph-Coupled HMM with an Application to Influenza Infection.

【Paper Link】【Pages】:3828-3834

【Authors】: Kai Fan ; Chunyuan Li ; Katherine A. Heller

【Abstract】: The Hierarchical Graph-Coupled Hidden Markov Model (hGCHMM) is a useful tool for tracking and predicting the spread of contagious diseases, such as influenza, by leveraging social contact data collected from individual wearable devices. However, the existing inference algorithms depend on the assumption that the infection rates are small in probability, typically close to 0. The purpose of this paper is to build a unified learning framework for latent infection state estimation for the hGCHMM, regardless of the infection rate and transition function. We derive our algorithm based on a dynamic auto-encoding variational inference scheme, thus potentially generalizing the hGCHMM to models other than those that work on highly contagious diseases. We experimentally compare our approach with previous Gibbs EM algorithms and standard variational method mean-field inference, on both semi-synthetic data and app collected epidemiological and social records.

【Keywords】: Variational Inference; hGCHMM; Influenza Infection

531. Topic Models to Infer Socio-Economic Maps.

【Paper Link】【Pages】:3835-3841

【Authors】: Lingzi Hong ; Enrique Frías-Martínez ; Vanessa Frías-Martínez

【Abstract】: Socio-economic maps contain important information regarding the population of a country. Computing these maps is critical given that policy makers oftentimes make important decisions based upon such information. However, the compilation of socio-economic maps requires extensive resources and becomes highly expensive. On the other hand, the ubiquitous presence of cell phones is generating large amounts of spatiotemporal data that can reveal human behavioral traits related to specific socio-economic characteristics. Traditional inference approaches have taken advantage of these datasets to infer regional socio-economic characteristics. In this paper, we propose a novel approach whereby topic models are used to infer socio-economic levels from large-scale spatio-temporal data. Instead of using a pre-determined set of features, we use Latent Dirichlet Allocation (LDA) to extract latent recurring patterns of co-occurring behaviors across regions, which are then used in the prediction of socio-economic levels. We show that our approach improves state of the art prediction results by 9%.

【Keywords】: topic models; spatio-temporal data; mobility patterns; natural disasters

532. Energy- and Cost-Efficient Pumping Station Control.

【Paper Link】【Pages】:3842-3848

【Authors】: Timon V. Kanters ; Frans A. Oliehoek ; Michael Kaisers ; Stan R. van den Bosch ; Joep Grispen ; Jeroen Hermans

【Abstract】: With renewable energy becoming more common, energy prices fluctuate more depending on environmental factors such as the weather. Consuming energy without taking volatile prices into consideration can not only become expensive, but may also increase the peak load, which requires energy providers to generate additional energy using less environment-friendly methods. In the Netherlands, pumping stations that maintain the water levels of polder canals are large energy consumers, but the controller software currently used in the industry does not take real-time energy availability into account. We investigate if existing AI planning techniques have the potential to improve upon the current solutions. In particular, we propose a lightweight but realistic simulator and investigate if an online planning method (UCT) can utilise this simulator to improve the cost-efficiency of pumping station control policies. An empirical comparison with the current control algorithms indicates that substantial cost, and thus peak load, reduction can be attained.

【Keywords】: energy-efficient; cost-efficient; planning; mcts; monte-carlo tree search; uct; pumping stations; water system; weather; uncertainty; energy prices; real-world application; sequential decision problem; pumping station control

533. Shortest Path Based Decision Making Using Probabilistic Inference.

【Paper Link】【Pages】:3849-3856

【Authors】: Akshat Kumar

【Abstract】: We present a new perspective on the classical shortest path routing (SPR) problem in graphs. We show that the SPR problem can be recast to that of probabilistic inference in a mixture of simple Bayesian networks. Maximizing the likelihood in this mixture becomes equivalent to solving the SPR problem. We develop the well known Expectation-Maximization (EM) algorithm for the SPR problem that maximizes the likelihood, and show that it does not get stuck in a locally optimal solution. Using the same probabilistic framework, we then address an NP-Hard network design problem where the goal is to repair a network of roads post some disaster within a fixed budget such that the connectivity between a set of nodes is optimized. We show that our likelihood maximization approach using the EM algorithm scales well for this problem taking the form of message-passing among nodes of the graph, and provides significantly better quality solutions than a standard mixed-integer programming solver.

【Keywords】:

534. Robust Decision Making for Stochastic Network Design.

【Paper Link】【Pages】:3857-3863

【Authors】: Akshat Kumar ; Arambam James Singh ; Pradeep Varakantham ; Daniel Sheldon

【Abstract】: We address the problem of robust decision making for stochastic network design. Our work is motivated by spatial conservation planning where the goal is to take management decisions within a fixed budget to maximize the expected spread of a population of species over a network of land parcels. Most previous work for this problem assumes that accurate estimates of different network parameters (edge activation probabilities, habitat suitability scores) are available, which is an unrealistic assumption. To address this shortcoming, we assume that network parameters are only partially known, specified via interval bounds. We then develop a decision making approach that computes the solution with minimax regret. We provide new theoretical results regarding the structure of the minimax regret solution which help develop a computationally efficient approach. Empirically, we show that previous approaches that work on point estimates of network parameters result in high regret on several standard benchmarks, while our approach provides significantly more robust solutions.

【Keywords】:

535. Optimizing Infrastructure Enhancements for Evacuation Planning.

【Paper Link】【Pages】:3864-3870

【Authors】: Kanal Kumar ; Julia Romanski ; Pascal Van Hentenryck

【Abstract】: With rapid population growth and urbanization, emergency services in various cities around the world worry that the current transportation infrastructure is no longer adequate for large-scale evacuations. This paper considers how to mitigate this issue through infrastructure upgrades, such as the additions of lanes to road segments and the raising of bridges and roads. The paper proposes a MIP model for deciding the most effective infrastructure upgrades as well as a Benders decomposition approach where the master problem jointly plans the upgrades and evacuation routes and the subproblem schedules the evacuation itself. Experimental results demonstrate the practicability of the approach on a real case study, filling a significant need for emergency services.

【Keywords】:

536. Spatially Regularized Streaming Sensor Selection.

【Paper Link】【Pages】:3871-3879

【Authors】: Changsheng Li ; Fan Wei ; Weishan Dong ; Xiangfeng Wang ; Junchi Yan ; Xiaobin Zhu ; Qingshan Liu ; Xin Zhang

【Abstract】: Sensor selection has become an active topic aimed at energy saving, information overload prevention, and communication cost planning in sensor networks. In many real applications, often the sensors' observation regions have overlaps and thus the sensor network is inherently redundant. Therefore it is important to select proper sensors to avoid data redundancy. This paper focuses on how to incrementally select a subset of sensors in a streaming scenario to minimize information redundancy, and meanwhile meet the power consumption constraint. We propose to perform sensor selection in a multi-variate interpolation framework, such that the data sampled by the selected sensors can well predict those of the inactive sensors. Importantly, we incorporate sensors' spatial information as two regularizers, which leads to significantly better prediction performance. We also define a statistical variable to store sufficient information for incremental learning, and introduce a forgetting factor to track sensor streams' evolvement. Experiments on both synthetic and real datasets validate the effectiveness of the proposed method. Moreover, our method is over 10 times faster than the state-of-the-art sensor selection algorithm.

【Keywords】:

537. Preventing Illegal Logging: Simultaneous Optimization of Resource Teams and Tactics for Security.

【Paper Link】【Pages】:3880-3886

【Authors】: Sara Marie McCarthy ; Milind Tambe ; Christopher Kiekintveld ; Meredith L. Gore ; Alex Killion

【Abstract】: Green security (protection of forests, fish and wildlife) is a critical problem in environmental sustainability. We focus on the problem of optimizing the defense of forests against illegal logging, where often we are faced with the challenge of teaming up many different groups, from national police to forest guards to NGOs, each with differing capabilities and costs. This paper introduces a new, yet fundamental problem: Simultaneous Optimization of Resource Teams and Tactics (SORT). SORT contrasts with most previous game-theoretic research for green security, in particular based on security games, that has solely focused on optimizing patrolling tactics, without consideration of team formation or coordination. We develop new models and scalable algorithms to apply SORT towards illegal logging in large forest areas. We evaluate our methods on a variety of synthetic examples, as well as a real-world case study using data from our on-going collaboration in Madagascar.

【Keywords】: Game Theory; Multi-Agents; Security

538. Big-Data Mechanisms and Energy-Policy Design.

【Paper Link】【Pages】:3887-3893

【Authors】: Ankit Pat ; Kate Larson ; Srinivasen Keshav

【Abstract】: A confluence of technical, economic and political forces is revolutionizing the energy sector. Policy-makers, who decide on incentives and penalties for possible courses of actions, play a critical role in determining which outcomes arise. However, designing appropriate energy policies is a complex and challenging task. Our vision is to provide tools and methodologies for policy makers so that they can leverage the power of big data to make evidence-based decisions. In this paper we present an approach we call big-data mechanism design which combines a mechanism design framework with stakeholder surveys and data to allow policy-makers to gauge the costs and benefits of potential policy decisions. We illustrate the effectiveness of this approach in a concrete application domain: the peaksaver PLUS program in Ontario, Canada.

【Keywords】: energy policy, mechanism design, utilities, preference elicitation

539. Benders Decomposition for Large-Scale Prescriptive Evacuations.

【Paper Link】【Pages】:3894-3900

【Authors】: Julia Romanski ; Pascal Van Hentenryck

【Abstract】: This paper considers prescriptive evacuation planning for a region threatened by a natural disaster such as a flood, a wildfire, or a hurricane. It proposes a Benders decomposition that generalizes the two-stage approach proposed in earlier work for convergent evacuation plans. Experimental results show that Benders decomposition provides significant improvements in solution quality in reasonable time: It finds provably optimal solutions to scenarios considered in prior work, closing these instances, and increases the number of evacuees by 10 to 15% on average on more complex flood scenarios.

【Keywords】:

540. Predicting Spatio-Temporal Propagation of Seasonal Influenza Using Variational Gaussian Process Regression.

【Paper Link】【Pages】:3901-3907

【Authors】: Ransalu Senanayake ; Simon Timothy O'Callaghan ; Fabio Ramos

【Abstract】: Understanding and predicting how influenza propagates is vital to reduce its impact. In this paper we develop a nonparametric model based on Gaussian process (GP) regression to capture the complex spatial and temporal dependencies present in the data. A stochastic variational inference approach was adopted to address scalability. Rather than modeling the problem as a time-series as in many studies, we capture the space-time dependencies by combining different kernels. A kernel averaging technique which converts spatially-diffused point processes to an area process is proposed to model geographical distribution. Additionally, to accurately model the variable behavior of the time-series, the GP kernel is further modified to account for non-stationarity and seasonality. Experimental results on two datasets of state-wide US weekly flu-counts consisting of 19,698 and 89,474 data points, ranging over several years, illustrate the robustness of the model as a tool for further epidemiological investigations.

【Keywords】: Gaussian process; variational inference; big data; spatio-temporal model; influenza

541. Intelligent Habitat Restoration Under Uncertainty.

【Paper Link】【Pages】:3908-3914

【Authors】: Tommaso Urli ; Jana Brotánková ; Philip Kilby ; Pascal Van Hentenryck

【Abstract】: Conservation is an ethic of sustainable use of natural resources which focuses on the preservation of biodiversity, i.e., the degree of variation of life. Conservation planning seeks to reach this goal by means of deliberate actions, aimed at the protection (or restoration) of biodiversity features. In this paper we present an intelligent system to assist conservation managers in planning habitat restoration actions, with focus on the activities to be carried out in the islands of the Great Barrier Reef (QLD) and the Pilbara (WA) regions of Australia. In particular, we propose a constrained optimisation formulation of the habitat restoration planning (HRP) problem, capturing aspects such as population dynamics and uncertainty. We show that the HRP is NP-hard, and develop a constraint programming (CP) model and a large neighbourhood search (LNS) procedure to generate activity plans under budgeting constraints.

【Keywords】: Habitat Restoration Planning; Conservation Planning; Constraint programming; Large neighbourhood search; Uncertainty; Optimisation

542. Adaptable Regression Method for Ensemble Consensus Forecasting.

【Paper Link】【Pages】:3915-3921

【Authors】: John K. Williams ; Peter P. Neilley ; Joseph P. Koval ; Jeff McDonald

【Abstract】: Accurate weather forecasts enhance sustainability by facilitating decision making across a broad range of endeavors including public safety, transportation, energy generation and management, retail logistics, emergency preparedness, and many others. This paper presents a method for combining multiple scalar forecasts to obtain deterministic predictions that are generally more accurate than any of the constituents. Exponentially-weighted forecast bias estimates and error covariance matrices are formed at observation sites, aggregated spatially and temporally, and used to formulate a constrained, regularized least squares regression problem that may be solved using quadratic programming. The model is re-trained when new observations arrive, updating the forecast bias estimates and consensus combination weights to adapt to weather regime and input forecast model changes. The algorithm is illustrated for 0-72 hour temperature forecasts at over 1200 sites in the contiguous U.S. based on a 22-member forecast ensemble, and its performance over multiple seasons is compared to a state-of-the-art ensemble-based forecasting system. In addition to weather forecasts, this approach to consensus may be useful for ensemble predictions of climate, wind energy, solar power, energy demand, and numerous other quantities.

【Keywords】: ensemble consensus; optimization; forecasting; prediction; regression; regularization; quadratic programming

543. Optimizing Resilience in Large Scale Networks.

【Paper Link】【Pages】:3922-3928

【Authors】: Xiaojian Wu ; Daniel Sheldon ; Shlomo Zilberstein

【Abstract】: We propose a decision making framework to optimize the resilience of road networks to natural disasters such as floods. Our model generalizes an existing one for this problem by allowing roads with a broad class of stochastic delay models. We then present a fast algorithm based on the sample average approximation (SAA) method and network design techniques to solve this problem approximately. On a small existing benchmark, our algorithm produces near-optimal solutions and the SAA method converges quickly with a small number of samples. We then apply our algorithm to a large real-world problem to optimize the resilience of a road network to failures of stream crossing structures to minimize travel times of emergency medical service vehicles. On medium-sized networks, our algorithm obtains solutions of comparable quality to a greedy baseline method but is 30–60 times faster. Our algorithm is the only existing algorithm that can scale to the full network, which has many thousands of edges.

【Keywords】: Network Design; Resilience Optimization

544. Transfer Learning from Deep Features for Remote Sensing and Poverty Mapping.

【Paper Link】【Pages】:3929-3935

【Authors】: Michael Xie ; Neal Jean ; Marshall Burke ; David Lobell ; Stefano Ermon

【Abstract】: The lack of reliable data in developing countries is a major obstacle to sustainable development, food security, and disaster relief. Poverty data, for example, is typically scarce, sparse in coverage, and labor-intensive to obtain. Remote sensing data such as high-resolution satellite imagery, on the other hand, is becoming increasingly available and inexpensive. Unfortunately, such data is highly unstructured and currently no techniques exist to automatically extract useful insights to inform policy decisions and help direct humanitarian efforts. We propose a novel machine learning approach to extract large-scale socioeconomic indicators from high-resolution satellite imagery. The main challenge is that training data is very scarce, making it difficult to apply modern techniques such as Convolutional Neural Networks (CNN). We therefore propose a transfer learning approach where nighttime light intensities are used as a data-rich proxy. We train a fully convolutional CNN model to predict nighttime lights from daytime imagery, simultaneously learning features that are useful for poverty prediction. The model learns filters identifying different terrains and man-made structures, including roads, buildings, and farmlands, without any supervision beyond nighttime lights. We demonstrate that these learned features are highly informative for poverty mapping, even approaching the predictive performance of survey data collected in the field.

【Keywords】: Deep Learning, Computer Vision, Poverty Mapping, Remote Sensing, Satellite Imagery, Transfer Learning

545. An Algorithm to Coordinate Measurements Using Stochastic Human Mobility Patterns in Large-Scale Participatory Sensing Settings.

【Paper Link】【Pages】:3936-3943

【Authors】: Alexandros Zenonos ; Sebastian Stein ; Nicholas R. Jennings

【Abstract】: Participatory sensing is a promising new low-cost approach for collecting environmental data. However, current large-scale environmental participatory sensing campaigns typically do not coordinate the measurements of participants, which can lead to gaps or redundancy in the collected data. While some work has considered this problem, it has made several unrealistic assumptions. In particular, it assumes that complete and accurate knowledge about the participants' future movements is available and it does not consider constraints on the number of measurements a user is willing to take. To address these shortcomings, we develop a computationally-efficient coordination algorithm (Best-match) to suggest to users where and when to take measurements. Our algorithm exploits human mobility patterns, but explicitly considers the inherent uncertainty of these patterns. We empirically evaluate our algorithm on a real-world human mobility and air quality dataset and show that it outperforms the state-of-the-art greedy and pull-based proximity algorithms in dynamic environments.

【Keywords】: Participatory sensing; Coordination; Gaussian processes; Emissions; Air quality; Human mobility; Dynamic environments

Special Track on Integrated AI Capabilities 3

546. Bagging Ensembles for the Diagnosis and Prognostication of Alzheimer's Disease.

【Paper Link】【Pages】:3944-3950

【Authors】: Peng Dai ; Femida Gwadry-Sridhar ; Michael Bauer ; Michael Borrie

【Abstract】: Alzheimer's disease (AD) is a chronic neurodegenerative disease, which involves the degeneration of various brain functions, resulting in memory loss, cognitive disorder and death. Large amounts of multivariate heterogeneous medical test data are available for the analysis of brain deterioration. How to measure the deterioration remains a challenging problem. In this study, we first investigate how different regions of the human brain change as the patient develops AD. Correlation analysis and feature ranking are performed based on the feature vectors from different stages of the pathologic process in Alzheimer's disease. Then, an automatic diagnosis system is presented, which is based on a hybrid manifold learning for feature embedding and the bootstrap aggregating (Bagging) algorithm for classification. We investigate two different tasks, i.e. diagnosis and progression prediction. Extensive comparison is made against Support Vector Machines (SVM), Random Forest (RF), Decision Tree (DT) and Random Subspace (RS) methods. Experimental results show that our proposed algorithm yields superior results when compared to the other methods, suggesting promising robustness for possible clinical applications.

【Keywords】: Manifold Learning; Alzheimer's Disease; Ensemble Learning; Aging; Diagnosis and Prognostication

547. Affective Personalization of a Social Robot Tutor for Children's Second Language Skills.

【Paper Link】【Pages】:3951-3957

【Authors】: Goren Gordon ; Samuel Spaulding ; Jacqueline Kory Westlund ; Jin Joo Lee ; Luke Plummer ; Marayna Martinez ; Madhurima Das ; Cynthia Breazeal

【Abstract】: Though substantial research has been dedicated towards using technology to improve education, no current methods are as effective as one-on-one tutoring. A critical, though relatively understudied, aspect of effective tutoring is modulating the student's affective state throughout the tutoring session in order to maximize long-term learning gains. We developed an integrated experimental paradigm in which children play a second-language learning game on a tablet, in collaboration with a fully autonomous social robotic learning companion. As part of the system, we measured children's valence and engagement via an automatic facial expression analysis system. These signals were combined into a reward signal that fed into the robot's affective reinforcement learning algorithm. Over several sessions, the robot played the game and personalized its motivational strategies (using verbal and non-verbal actions) to each student. We evaluated this system with 34 children in preschool classrooms for a duration of two months. We saw that (1) children learned new words from the repeated tutoring sessions, (2) the affective policy personalized to students over the duration of the study, and (3) students who interacted with a robot that personalized its affective feedback strategy showed a significant increase in valence, as compared to students who interacted with a non-personalizing robot. This integrated system of tablet-based educational content, affective sensing, affective policy learning, and an autonomous social robot holds great promise for a more comprehensive approach to personalized tutoring.

【Keywords】: affective computing; children education; personal robot; robot tutor; second language learning

548. A Framework for Resolving Open-World Referential Expressions in Distributed Heterogeneous Knowledge Bases.

【Paper Link】【Pages】:3958-3965

【Authors】: Tom Williams ; Matthias Scheutz

【Abstract】: We present a domain-independent approach to reference resolution that allows a robotic or virtual agent to resolve references to entities (e.g., objects and locations) found in open worlds when the information needed to resolve such references is distributed among multiple heterogeneous knowledge bases in its architecture. An agent using this approach can combine information from multiple sources without the computational bottleneck associated with centralized knowledge bases. The proposed approach also facilitates “lazy constraint evaluation”, i.e., verifying properties of the referent through different modalities only when the information is needed. After specifying the interfaces by which a reference resolution algorithm can request information from distributed knowledge bases, we present an algorithm for performing open-world reference resolution within that framework, analyze the algorithm’s performance, and demonstrate its behavior on a simulated robot.

【Keywords】: natural language understanding; human-robot interaction; reference resolution; open worlds; givenness hierarchy; integrated systems

Innovative Applications Deployed Papers 3

549. Deploying PAWS: Field Optimization of the Protection Assistant for Wildlife Security.

【Paper Link】【Pages】:3966-3973

【Authors】: Fei Fang ; Thanh Hong Nguyen ; Rob Pickles ; Wai Y. Lam ; Gopalasamy R. Clements ; Bo An ; Amandeep Singh ; Milind Tambe ; Andrew Lemieux

【Abstract】: Poaching is a serious threat to the conservation of key species and whole ecosystems. While conducting foot patrols is the most commonly used approach in many countries to prevent poaching, such patrols often do not make the best use of limited patrolling resources. To remedy this situation, prior work introduced a novel emerging application called PAWS (Protection Assistant for Wildlife Security); PAWS was proposed as a game-theoretic ("security games") decision aid to optimize the use of patrolling resources. This paper reports on PAWS's significant evolution from a proposed decision aid to a regularly deployed application, reporting on the lessons from the first tests in Africa in Spring 2014, through its continued evolution since then, to current regular use in Southeast Asia and plans for future worldwide deployment. In this process, we have worked closely with two NGOs (Panthera and Rimba) and incorporated extensive feedback from professional patrolling teams. We outline key technical advances that led to PAWS's regular deployment: (i) incorporating complex topographic features, e.g., ridgelines, in generating patrol routes; (ii) handling uncertainties in species distribution (game theoretic payoffs); (iii) ensuring scalability for patrolling large-scale conservation areas with fine-grained guidance; and (iv) handling complex patrol scheduling constraints.

【Keywords】: Game Theory; Computational Sustainability; Wildlife Protection; Security Games; Deployed Application

550. Ontology Re-Engineering: A Case Study from the Automotive Industry.

【Paper Link】【Pages】:3974-3981

【Authors】: Nestor Rychtyckyj ; Venkatesh Raman ; Baskaran Sankaranarayanan ; P. Sreenivasa Kumar ; Deepak Khemani

【Abstract】: For over twenty-five years Ford has been utilizing an AI-based system to manage process planning for vehicle assembly at our assembly plants around the world. The scope of the AI system, known originally as the Direct Labor Management System and now as the Global Study Process Allocation System (GSPAS), has increased over the years to include additional functionality on Ergonomics and Powertrain Assembly (Engine and Transmission plants). The knowledge about Ford’s manufacturing processes is contained in an ontology originally developed using the KL-ONE representation language and methodology. To preserve the viability of the GSPAS ontology and to make it easily usable for other applications within Ford, we needed to re-engineer and convert the KL-ONE ontology into a semantic web OWL/RDF format. In this paper, we will discuss the process by which we re-engineered the existing GSPAS KL-ONE ontology and deployed semantic web technology in our application.

【Keywords】:

551. Deploying nEmesis: Preventing Foodborne Illness by Data Mining Social Media.

【Paper Link】【Pages】:3982-3990

【Authors】: Adam Sadilek ; Henry A. Kautz ; Lauren DiPrete ; Brian Labus ; Eric Portman ; Jack Teitel ; Vincent Silenzio

【Abstract】: Foodborne illness afflicts 48 million people annually in the U.S. alone. Over 128,000 are hospitalized and 3,000 die from the infection. While preventable with proper food safety practices, the traditional restaurant inspection process has limited impact given the predictability and low frequency of inspections, and the dynamic nature of the kitchen environment. Despite this reality, the inspection process has remained largely unchanged for decades. We apply machine learning to Twitter data and develop a system that automatically detects venues likely to pose a public health hazard. Health professionals subsequently inspect individual flagged venues in a double blind experiment spanning the entire Las Vegas metropolitan area over three months. By contrast, previous research in this domain has been limited to indirect correlative validation using only aggregate statistics. We show that the adaptive inspection process is 63% more effective at identifying problematic venues than the current state of the art. The live deployment shows that if every inspection in Las Vegas became adaptive, we can prevent over 9,000 cases of foodborne illness and 557 hospitalizations annually. Additionally, adaptive inspections result in unexpected benefits, including the identification of venues lacking permits, contagious kitchen staff, and fewer customer complaints filed with the Las Vegas health department.

【Keywords】: NLP, computation epidemiology, human computation, social media, location

Innovative Applications Emerging Application Papers 11

552. An Autonomous Override System to Prevent Airborne Loss of Control.

【Paper Link】【Pages】:3991-3996

【Authors】: Sweewarman Balachandran ; Ella Atkins

【Abstract】: Loss of Control (LOC) is the most common precursor to aircraft accidents. This paper presents a Flight Safety Assessment and Management (FSAM) decision system to reduce in-flight LOC risk. FSAM nominally serves as a monitor to detect conditions that pose LOC risk, automatically activating the appropriate control authority if necessary to prevent LOC and restore a safe operational state. This paper contributes an efficient Markov Decision Process (MDP) formulation for FSAM. The state features capture risk associated with aircraft dynamics, configuration, health, pilot behavior and weather. The reward function trades cost of inaction against the cost of overriding the current control authority. A sparse sampling algorithm obtains a near-optimal solution for the MDP online. This approach enables the FSAM MDP to incorporate dynamically changing flight envelope and environment constraints into decision-making. Case studies based on real-world aviation incidents are presented.

【Keywords】: MDP; Sparse Sampling
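
The abstract above says that a sparse sampling algorithm solves the MDP online. As a rough illustration of that general technique only (not the authors' FSAM formulation), the sketch below estimates action values by drawing a fixed number of samples per action from a generative model and recursing to a fixed depth. The toy risk-level model, the action names `monitor` and `override`, and all reward values are assumptions made up for this example.

```python
import random


def sparse_sampling(state, depth, width, actions, simulate, gamma, rng):
    """Return (best_action, estimated_value) for `state`.

    For each action, draw `width` samples from the generative model
    `simulate`, recurse to `depth - 1` on each sampled next state, and
    average the sampled returns.
    """
    if depth == 0:
        return None, 0.0
    best_action, best_q = None, float("-inf")
    for action in actions:
        total = 0.0
        for _ in range(width):
            next_state, reward = simulate(state, action, rng)
            _, future = sparse_sampling(
                next_state, depth - 1, width, actions, simulate, gamma, rng
            )
            total += reward + gamma * future
        q = total / width
        if q > best_q:  # strict comparison: ties keep the earlier action
            best_action, best_q = action, q
    return best_action, best_q


def toy_simulate(risk, action, rng):
    """Hypothetical model: states are risk levels 0..3 (3 = loss of control).

    Deterministic here to keep the example easy to check; a realistic
    generative model would use `rng` to sample stochastic transitions.
    """
    if action == "override":
        return 0, -1.0  # overriding has a small fixed cost and resets risk
    new_risk = min(risk + 1, 3)  # doing nothing lets risk grow
    return new_risk, (-10.0 if new_risk == 3 else 0.0)


ACTIONS = ["monitor", "override"]
rng = random.Random(0)
low_risk_choice = sparse_sampling(0, 2, 3, ACTIONS, toy_simulate, 0.9, rng)
high_risk_choice = sparse_sampling(2, 2, 3, ACTIONS, toy_simulate, 0.9, rng)
print(low_risk_choice[0], high_risk_choice[0])
```

In this toy setting the planner keeps monitoring while risk is low and overrides once inaction would lead to the penalized state, which mirrors the trade-off between the cost of inaction and the cost of overriding that the abstract describes.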

553. Document Type Classification in Online Digital Libraries.

【Paper Link】【Pages】:3997-4002

【Authors】: Cornelia Caragea ; Jian Wu ; Sujatha Das Gollapalli ; C. Lee Giles

【Abstract】: Online digital libraries make it easier for researchers to search for scientific information. They have been proven as powerful resources in many data mining, machine learning and information retrieval applications that require high-quality data. The quality of the data highly depends on the accuracy of classifiers that identify the types of documents that are crawled from the Web, e.g., as research papers, slides, books, etc., for appropriate indexing. These classifiers in turn depend on the choice of the feature representation. We propose novel features that result in high-accuracy classifiers for document type classification. Experimental results on several datasets show that our classifiers outperform models that are employed in current systems.

【Keywords】: Supervised learning; Document type classification; Structural features; Scholarly digital libraries

554. Data-Augmented Software Diagnosis.

【Paper Link】【Pages】:4003-4009

【Authors】: Amir Elmishali ; Roni Stern ; Meir Kalech

【Abstract】: Software fault prediction algorithms predict which software components are likely to contain faults using machine learning techniques. Software diagnosis algorithms identify the faulty software components that caused a failure using model-based or spectrum-based approaches. We show how software fault prediction algorithms can be used to improve software diagnosis. The resulting data-augmented diagnosis algorithm overcomes key problems in software diagnosis algorithms: ranking diagnoses and distinguishing between diagnoses with high probability and low probability. We demonstrate the efficiency of the proposed approach empirically on three open-source domains, showing a significant increase in accuracy of diagnosis and efficiency of troubleshooting. These encouraging results suggest broader use of data-driven methods to complement and improve existing model-based methods.

【Keywords】: Software diagnosis; fault prediction

555. Automated Regression Testing Using Constraint Programming.

【Paper Link】【Pages】:4010-4015

【Authors】: Arnaud Gotlieb ; Mats Carlsson ; Marius Liaaen ; Dusica Marijan ; Alexandre Petillon

【Abstract】: In software validation, regression testing aims to check the absence of regression faults in new releases of a software system. Typically, test cases used in regression testing are executed during a limited amount of time and are selected to check a given set of user requirements. When testing large systems, the number of regression tests grows quickly over the years, and yet the available time slot stays limited. In order to overcome this problem, an approach known as test suite reduction (TSR) has been developed in software engineering to select a smallest subset of test cases, so that each requirement remains covered at least once. However, solving the TSR problem is difficult as the underlying optimization problem is NP-hard, but it is also crucial for vendors interested in reducing the time to market of new software releases. In this paper, we address regression testing and TSR with Constraint Programming (CP). More specifically, we propose new CP models to solve TSR that exploit global constraints, namely NValue and GCC. We reuse a set of preprocessing rules to reduce a priori each instance, and we introduce a structure-aware search heuristic. We evaluated our CP models and proposed improvements against existing approaches, including a simple greedy approach and MINTS, the state-of-the-art tool of the software engineering community. Our experiments show that CP outperforms both the greedy approach and MINTS when it is interfaced with MiniSAT, in terms of percentage of reduction and execution time. When MINTS is interfaced with CPLEX, we show that our CP model performs better only on percentage of reduction. Finally, by working closely with validation engineers from Cisco Systems, Norway, we integrated our CP model into an industrial regression testing process.

【Keywords】: Regression testing; Constraint Programming application
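
The abstract above defines test suite reduction as choosing a small subset of test cases that still covers every requirement, and it mentions a simple greedy approach as one baseline. The sketch below shows one common greedy heuristic for that covering problem. It is not the authors' CP model and may differ from the greedy baseline they evaluated; the test and requirement names are invented for illustration.

```python
def greedy_test_suite_reduction(coverage):
    """Pick tests until every requirement is covered.

    `coverage` maps a test name to the set of requirements it covers.
    At each step the test covering the most still-uncovered requirements
    is chosen; ties go to the alphabetically first test name. The result
    is a valid cover, but not necessarily a smallest one.
    """
    uncovered = set().union(*coverage.values())
    selected = []
    while uncovered:
        best = max(sorted(coverage), key=lambda t: len(coverage[t] & uncovered))
        selected.append(best)
        uncovered -= coverage[best]
    return selected


# Hypothetical coverage matrix: four tests, five requirements.
example = {
    "t_a": {"r1", "r2", "r3"},
    "t_b": {"r3", "r4"},
    "t_c": {"r4", "r5"},
    "t_d": {"r5"},
}
reduced = greedy_test_suite_reduction(example)
print(reduced)
```

Because the greedy choice is only locally optimal, exact methods such as the CP models described in the abstract can find smaller suites on instances where this heuristic does not.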

556. Wikipedia in the Tourism Industry: Forecasting Demand and Modeling Usage Behavior.

【Paper Link】【Pages】:4016-4021

【Authors】: Pejman Khadivi ; Naren Ramakrishnan

【Abstract】: Due to the economic and social impacts of tourism, both private and public sectors are interested in precisely forecasting the tourism demand volume in a timely manner. With recent advances in social networks, more people use online resources to plan their future trips. In this paper we explore the application of Wikipedia usage trends (WUTs) in tourism analysis. We propose a framework that deploys WUTs for forecasting the tourism demand of Hawaii. We also propose a data-driven approach, using WUTs, to estimate the behavior of tourists when they plan their trips.

【Keywords】:

557. Optimizing Energy Costs in a Zinc and Lead Mine.

【Paper Link】【Pages】:4022-4027

【Authors】: Alan Kinsella ; Alan F. Smeaton ; Barry Hurley ; Barry O'Sullivan ; Helmut Simonis

【Abstract】: Boliden Tara Mines Ltd. consumed 184.7 GWh of electricity in 2014, equating to over 1% of the national demand of Ireland or approximately 35,000 homes. Ireland’s industrial electricity prices, at an average of 13 c/kWh in 2014, are amongst the most expensive in Europe. Cost effective electricity procurement is ever more pressing for businesses to remain competitive. In parallel, the proliferation of intelligent devices has led to the industrial Internet of Things paradigm becoming mainstream. As more and more devices become equipped with network connectivity, smart metering is fast becoming a means of giving energy users access to a rich array of consumption data. These modern sensor networks have facilitated the development of applications to process, analyse, and react to continuous data streams in real-time. Subsequently, future procurement and consumption decisions can be informed by a highly detailed evaluation of energy usage. With these considerations in mind, this paper uses variable energy prices from Ireland’s Single Electricity Market, along with smart meter sensor data, to simulate the scheduling of an industrial-sized underground pump station in Tara Mines. The objective is to reduce the overall energy costs whilst still functioning within the system’s operational constraints. An evaluation using real-world electricity prices and detailed sensor data for 2014 demonstrates significant savings of up to 10.72% over the year compared to the existing control systems.

【Keywords】:

558. Automated Capture and Execution of Manufacturability Rules Using Inductive Logic Programming.

【Paper Link】【Pages】:4028-4034

【Authors】: Abha Moitra ; Ravi Palla ; Arvind Rangarajan

【Abstract】: Capturing domain knowledge can be a time-consuming process that typically requires the collaboration of a Subject Matter Expert and a modeling expert to encode the knowledge. In a number of domains and applications, this situation is further exacerbated by the fact that the Subject Matter Expert may find it difficult to articulate the domain knowledge as a procedure or rules, but instead may find it easier to classify instance data. To facilitate this type of knowledge elicitation from Subject Matter Experts, we have developed a system that automatically generates formal and executable rules from provided labeled instance data. We do this by leveraging the techniques of Inductive Logic Programming (ILP) to generate Horn clause based rules to separate out positive and negative instance data. We illustrate our approach on a Design For Manufacturability (DFM) platform where the goal is to design products that are easy to manufacture by providing early manufacturability feedback. Specifically we show how our approach can be used to generate feature recognition rules from positive and negative instance data supplied by Subject Matter Experts. Our platform is interactive, provides visual feedback and is iterative. The feature identification rules generated can be inspected, manually refined and vetted.

【Keywords】: Inductive Logic Programming, Machine Learning, Design for Manufacturability, Feature Recognition

559. MetaSeer.STEM: Towards Automating Meta-Analyses.

【Paper Link】【Pages】:4035-4040

【Authors】: Venkata Kishore Neppalli ; Cornelia Caragea ; Robin Mayes ; Kim Nimon ; Fred Oswald

【Abstract】: Meta-analysis is a principled statistical approach for summarizing quantitative information reported across studies within a research domain of interest. Although the results of meta-analyses can be highly informative, the process of collecting and coding the data for a meta-analysis is often a labor-intensive effort fraught with the potential for human error and idiosyncrasy. This is due to the fact that researchers typically spend weeks poring over published journal articles, technical reports, book chapters and other materials in order to retrieve key data elements that are then manually coded for subsequent analyses (e.g., descriptive statistics, effect sizes, reliability estimates, demographics, and study conditions). In this paper, we propose a machine-learning-based system developed to support automated extraction of data pertinent to STEM education meta-analyses, including educational and human resource initiatives aimed at improving achievement, literacy and interest in the fields of science, technology, engineering, and mathematics.

【Keywords】: meta-analysis; extractor, classification

560. Data Driven Game Theoretic Cyber Threat Mitigation.

【Paper Link】【Pages】:4041-4046

【Authors】: John Robertson ; Vivin Paliath ; Jana Shakarian ; Amanda Thart ; Paulo Shakarian

【Abstract】: Penetration testing is regarded as the gold standard for understanding how well an organization can withstand sophisticated cyber-attacks. However, the recent prevalence of markets specializing in zero-day exploits on the darknet makes exploits widely available to potential attackers. The cost associated with these sophisticated kits generally precludes penetration testers from simply obtaining such exploits -- so an alternative approach is needed to understand what exploits an attacker will most likely purchase and how to defend against them. In this paper, we introduce a data-driven security game framework to model an attacker and provide policy recommendations to the defender. In addition to providing a formal framework and algorithms to develop strategies, we present experimental results from applying our framework, for various system configurations, on real-world exploit market data actively mined from the darknet.

【Keywords】: game theory, cyber threat intelligence

561. Automated Volumetric Intravascular Plaque Classification Using Optical Coherence Tomography (OCT).

【Paper Link】【Pages】:4047-4052

【Authors】: Ronny Shalev ; Daisuke Nakamura ; Setsu Nishino ; Andrew M. Rollins ; Hiram G. Bezerra ; David L. Wilson ; Soumya Ray

【Abstract】: An estimated 17.5 million people died from a cardiovascular disease in 2012, representing 31% of all global deaths. Most acute coronary events result from rupture of the protective fibrous cap overlying an atherosclerotic plaque. The task of early identification of plaque types that can potentially rupture is, therefore, of great importance. The state-of-the-art approach to imaging blood vessels is intravascular optical coherence tomography (IVOCT). However, currently, this is an offline approach where the images are first collected and then manually analyzed a frame at a time to identify regions at risk of thrombosis. This process is extremely laborious, time consuming and prone to human error. We are building a system that, when complete, will provide interactive 3D visualization of a blood vessel as an IVOCT is in progress. The visualization will highlight different plaque types and enable quick identification of regions at risk for thrombosis. In this paper, we describe our approach, focusing on machine learning methods that are a key enabling technology. Our empirical results using real OCT data show that our approach can identify different plaque types efficiently with high accuracy across multiple patients.

【Keywords】: Plaque classification; Optical Coherence Tomography; OCT; machine learning

562. A Hidden Markov Model Approach to Infer Timescales for High-Resolution Climate Archives.

【Paper Link】【Pages】:4053-4061

【Authors】: Mai Winstrup

【Abstract】: We present a Hidden Markov Model-based algorithm for constructing timescales for paleoclimate records by annual layer counting. This objective, statistics-based approach has a number of major advantages over the current manual approach, beginning with speed. Manual layer counting of a single core (up to 3 km in length) can require multiple person-years of time; the StratiCounter algorithm can count up to 100 layers/min, corresponding to a full-length timescale constructed in a few days. Moreover, the algorithm gives rigorous uncertainty estimates for the resulting timescale, which are far smaller than those produced manually. We demonstrate the utility of StratiCounter by applying it to ice-core data from two cores from Greenland and Antarctica. Performance of the algorithm is comparable to a manual approach. When using all available data, false-discovery rates and miss rates are 1-1.2% and 1.2-1.6%, respectively, for the two cores. For one core, even better agreement is found when using only the chemistry series primarily employed by human experts in the manual approach.

【Keywords】: HMM; Forward-Backward algorithm
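
The keywords above name the Forward-Backward algorithm. For readers unfamiliar with it, the following is a minimal sketch of the textbook scaled forward-backward recursion for a small discrete HMM, returning the posterior probability of each hidden state at each time step. It is a generic illustration under assumed toy parameters, not the StratiCounter implementation, whose layer-counting model is more elaborate.

```python
def forward_backward(init, trans, emit, observations):
    """Posterior state probabilities for a discrete HMM.

    init[i]     : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][o]  : probability of state i emitting symbol o
    Returns gamma, where gamma[t][i] = P(state_t = i | all observations).
    Assumes the observation sequence has non-zero probability.
    """
    n, steps = len(init), len(observations)

    # Forward pass, rescaling each step so the values sum to one.
    alpha, scale = [], []
    for t, obs in enumerate(observations):
        if t == 0:
            row = [init[i] * emit[i][obs] for i in range(n)]
        else:
            row = [
                emit[j][obs] * sum(alpha[t - 1][i] * trans[i][j] for i in range(n))
                for j in range(n)
            ]
        c = sum(row)
        scale.append(c)
        alpha.append([v / c for v in row])

    # Backward pass, reusing the forward scaling factors.
    beta = [[1.0] * n for _ in range(steps)]
    for t in range(steps - 2, -1, -1):
        nxt = observations[t + 1]
        for i in range(n):
            beta[t][i] = sum(
                trans[i][j] * emit[j][nxt] * beta[t + 1][j] for j in range(n)
            ) / scale[t + 1]

    # With this scaling, the elementwise product is already normalized.
    return [[alpha[t][i] * beta[t][i] for i in range(n)] for t in range(steps)]


# Toy two-state model with made-up parameters.
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.9, 0.1], [0.2, 0.8]]
posteriors = forward_backward(init, trans, emit, [0, 0, 1, 0])
for row in posteriors:
    print([round(p, 3) for p in row])
```

Per-step posteriors of this kind are what allow an HMM-based method to attach uncertainty estimates to its output, which the abstract highlights as an advantage over manual counting.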

Innovative Applications Challenge Problem Papers 1

563. Infusing Human Factors into Algorithmic Crowdsourcing.

【Paper Link】【Pages】:4062-4064

【Authors】: Han Yu ; Chunyan Miao ; Zhiqi Shen ; Jun Lin ; Cyril Leung ; Qiang Yang

【Abstract】: The emergence of crowdsourcing systems has provided a viable mechanism for incorporating humans into the computational loop at large scale and in real-time. This offers an unprecedented opportunity to study how artificial intelligence (AI) techniques and humans can collaborate to solve problems. An important challenge in crowdsourcing is how to make optimal use of human resources as people have different skills and their availability may be limited. In this paper, we provide the research community with a new dataset derived from an online game-based platform to address this challenge. Six crowdsourcing task allocation scenarios with different overall workload levels and worker population characteristics were presented to over 400 players to solve. With close to 3,000 game sessions and over 300,000 task allocation decisions from human and AI players, the dataset provides an efficient focal point for the research community to design solutions that can sustainably tap into the pool of human resources through crowdsourcing.

【Keywords】: Algorithmic Crowdsourcing; human factors

EAAI Symposium Full Paper 10

564. Using Domain Knowledge to Improve Monte-Carlo Tree Search Performance in Parameterized Poker Squares.

【Paper Link】【Pages】:4065-4070

【Authors】: Robert Arrington ; Clay Langley ; Steven Bogaerts

【Abstract】: Poker Squares is a single-player card game played on a 5 x 5 grid, in which a player attempts to create as many high-scoring Poker hands as possible. As a stochastic single-player game with an extremely large state space, this game offers an interesting area of application for Monte-Carlo Tree Search (MCTS). This paper describes enhancements made to the MCTS algorithm to improve computer play, including pruning in the selection stage and a greedy simulation algorithm. These enhancements make extensive use of domain knowledge in the form of a state evaluation heuristic. Experimental results demonstrate both the general efficacy of these enhancements and their ideal parameter settings.

【Keywords】: Monte-Carlo Tree Search; poker squares; domain knowledge; heuristic

565. BeeMo, a Monte Carlo Simulation Agent for Playing Parameterized Poker Squares.

【Paper Link】【Pages】:4071-4074

【Authors】: Karo Castro-Wunsch ; William Maga ; Calin Anton

【Abstract】: We investigated Parameterized Poker Squares to approximate an optimal game playing agent. We organized our inquiry along three dimensions: partial hand representation, search algorithms, and partial hand utility learning. For each dimension we implemented and evaluated several designs, among which we selected the best strategies to use for BeeMo, our final product. BeeMo uses a parallel flat Monte-Carlo search. The search is guided by a heuristic based on hand pattern utilities, which are learned through an iterative improvement method involving Monte-Carlo simulations and optimized greedy search.

【Keywords】:

566. Conceptualizing Curse of Dimensionality with Parallel Coordinates.

【Paper Link】【Pages】:4075-

【Authors】: G. Devi ; Charu Chauhan ; Sutanu Chakraborti

【Abstract】: We report on a novel use of parallel coordinates as a pedagogical tool for illustrating the non-intuitive properties of high dimensional spaces with special emphasis on the phenomenon of Curse of Dimensionality. Also, we have collated what we believe to be a representative sample of diverse approaches that exist in literature to conceptualize the Curse of Dimensionality. We envisage that the paper will have pedagogical value in structuring the way Curse of Dimensionality is presented in classrooms and associated lab sessions.

【Keywords】: Curse of Dimensionality; High Dimensional Spaces; Parallel Coordinates; Visual Area

567. Teaching Big Data Analytics Skills with Intelligent Workflow Systems.

【Paper Link】【Pages】:3997-4088

【Authors】: Yolanda Gil

【Abstract】: We have designed an open and modular course for data science and big data analytics using a workflow paradigm that allows students to easily experience big data through a sophisticated yet easy-to-use instrument that is an intelligent workflow system. A key aspect of this work is the use of semantic workflows to capture and reuse end-to-end analytic methods that experts would use to analyze big data, and the use of an intelligent workflow system to elaborate the workflow and manage its execution and resulting datasets. Through the exposure of big data analytics in a workflow framework, students will be able to get first-hand experiences with a breadth of big data topics, including multi-step data analytic and statistical methods, software reuse and composition, parallel distributed programming, and high-end computing. In addition, students learn about a range of topics in AI, including semantic representations and ontologies, machine learning, natural language processing, and image analysis.

【Keywords】: Data science, intelligent workflows, education

568. Design of an Online Course on Knowledge-Based AI.

【Paper Link】【Pages】:4089-4094

【Authors】: Ashok K. Goel ; David A. Joyner

【Abstract】: In Fall 2014 we offered an online course on Knowledge-Based Artificial Intelligence (KBAI) to about 200 students as part of the Georgia Tech Online MS in CS program. By now we have offered the course to more than 1000 students. We describe the design, development and delivery of the online KBAI class in Fall 2014.

【Keywords】: Online education, knowledge-based AI, cognitive systems

569. Learning and Using Hand Abstraction Values for Parameterized Poker Squares.

【Paper Link】【Pages】:4095-4100

【Authors】: Todd W. Neller ; Colin M. Messinger ; Zuozhi Yang

【Abstract】: We describe the experimental development of an AI player that adapts to different point systems for Parameterized Poker Squares. After introducing the game and research competition challenge, we describe our static board evaluation utilizing learned evaluations of abstract partial Poker hands. Next, we evaluate various time management strategies and search algorithms. Finally, we show experimentally which of our design decisions most significantly accounted for observed performance.

【Keywords】: game artificial intelligence; reinforcement learning; expectimax

570. Creating Interactive and Visual Educational Resources for AI.

【Paper Link】【Pages】:4101-4106

【Authors】: Sameer Singh ; Sebastian Riedel

【Abstract】: Teaching artificial intelligence is effective if the experience is a visual and interactive one, with educational materials that utilize combinations of various content types such as text, math, and code into an integrated experience. Unfortunately, easy-to-use tools for creating such pedagogical resources are not available to the educators, resulting in most courses being taught using a disconnected set of static materials, which is not only ineffective for learning AI, but further, requires repeated and redundant effort for the instructor. In this paper, we introduce Moro, a software tool for easily creating and presenting AI-friendly teaching materials. Moro notebooks integrate content of different types (text, math, code, images), allow real-time interactions via modifiable and executable code blocks, and are viewable in browsers both as long-form pages and as presentations. Creating notebooks is easy and intuitive; the creation tool is also in-browser, is WYSIWYG for quick iterations of editing, and supports a variety of shortcuts and customizations for efficiency. We present three deployed case studies of Moro that widely differ from each other, demonstrating its utility in a variety of scenarios such as in-class teaching and conference tutorials.

【Keywords】:

571. From the Lab to the Classroom and Beyond: Extending a Game-Based Research Platform for Teaching AI to Diverse Audiences.

【Paper Link】【Pages】:4107-4112

【Authors】: Nicole Sintov ; Debarun Kar ; Thanh Nguyen ; Fei Fang ; Kevin Hoffman ; Arnaud Lyet ; Milind Tambe

【Abstract】: Recent years have seen increasing interest in AI from outside the AI community. This is partly due to applications based on AI that have been used in real-world domains, for example, the successful deployment of game theory-based decision aids in security domains. This paper describes our teaching approach for introducing the AI concepts underlying security games to diverse audiences. We adapted a game-based research platform that served as a testbed for recent research advances in computational game theory into a set of interactive role-playing games. We guided learners in playing these games as part of our teaching strategy, which also included didactic instruction and interactive exercises on broader AI topics. We describe our experience in applying this teaching approach to diverse audiences, including students of an urban public high school, university undergraduates, and security domain experts who protect wildlife. We evaluate our approach based on results from the games and participant surveys.

【Keywords】: AI education; AI applications; Computer-aided education; Reasoning under uncertainty; Game playing/entertainment; educational games; security games; role-playing games; decision making; artificial intelligence

572. The Turing Test in the Classroom.

【Paper Link】【Pages】:4113-4118

【Authors】: Lisa Torrey ; Karen Johnson ; Sid Sondergard ; Pedro Ponce ; Laura Desmond

【Abstract】: This paper discusses the Turing Test as an educational activity for undergraduate students. It describes in detail an experiment that we conducted in a first-year non-CS course. We also suggest other pedagogical purposes that the Turing Test could serve.

【Keywords】:

573. A Survey of Current Practice and Teaching of AI.

【Paper Link】【Pages】:4119-4125

【Authors】: Michael Wollowski ; Robert Selkowitz ; Laura E. Brown ; Ashok K. Goel ; George Luger ; Jim Marshall ; Andrew Neel ; Todd W. Neller ; Peter Norvig

【Abstract】: The field of AI has changed significantly in the past couple of years and will likely continue to do so. Driven by a desire to expose our students to relevant and modern materials, we conducted two surveys, one of AI instructors and one of AI practitioners. The surveys were aimed at gathering information about the current state of the art of introducing AI as well as gathering input from practitioners in the field on techniques used in practice. In this paper, we present and briefly discuss the responses to those two surveys.

【Keywords】: Survey of AI Practice; Survey of AI Education

EAAI Symposium Poster Paper 6

574. IRobot: Teaching the Basics of Artificial Intelligence in High Schools.

【Paper Link】【Pages】:4126-4127

【Authors】: Harald Burgsteiner ; Martin Kandlhofer ; Gerald Steinbauer

【Abstract】: Profound knowledge about Artificial Intelligence (AI) will become increasingly important for careers in science and engineering. Therefore, an innovative educational project teaching fundamental concepts of AI at high school level will be presented in this paper. We developed an AI course covering major topics (problem solving, search, planning, graphs, data structures, automata, agent systems, machine learning) which comprises both theoretical and hands-on components. A pilot project was conducted and empirically evaluated. Results of the evaluation show that the participating pupils have become familiar with those concepts and the various topics addressed. Results and lessons learned from this project form the basis for further projects in different schools which intend to integrate AI in future secondary science education.

【Keywords】: Artificial Intelligence; Computer Science Education; Teaching AI

575. A.I. as an Introduction to Research Methods in Computer Science.

【Paper Link】【Pages】:4128-4129

【Authors】: Raghuram Ramanujan

【Abstract】: While many computer science programs offer courses on research methods, such classes typically tend to be aimed at graduate students. In this paper, we propose a novel means for introducing undergraduate students to research experiences in computer science — via an introductory Artificial Intelligence (A.I.) course. Students explore the content areas typically covered in an upper-level A.I. course (heuristic search, constraint satisfaction, game-playing etc.), while also learning about the mechanics of how empirical research is conducted in this field.

【Keywords】:

576. An Online Logic Programming Development Environment.

【Paper Link】【Pages】:4130-4131

【Authors】: Christian Reotutar ; Mbathio Diagne ; Evgenii Balai ; Edward Wertz ; Peter Lee ; Shao-Lon Yeh ; Yuanlin Zhang

【Abstract】: Recent progress in logic programming, particularly answer set programming, has enabled us to teach it to undergraduate and high school students. We developed an online answer set programming environment with a simple interface and a self-contained file system. It is expected to make the teaching of answer set programming more effective and help us to reach more students.

【Keywords】:

577. Using Declarative Programming in an Introductory Computer Science Course for High School Students.

【Paper Link】【Pages】:4132-4133

【Authors】: Maritza Reyes ; Cynthia Perez ; Rocky Upchurch ; Timothy Yuen ; Yuanlin Zhang

【Abstract】: This paper discusses the design of an introductory computer science course for high school students using declarative programming. Though not often taught at the K-12 level, declarative programming is a viable paradigm for teaching computer science due to its importance in artificial intelligence and in helping students explore and understand problem spaces. This paper describes the authors' implementation of a declarative programming course for high school students during a 4-week summer session.

【Keywords】:

578. Teaching Automated Strategic Reasoning Using Capstone Tournaments.

【Paper Link】【Pages】:4134-4135

【Authors】: Oscar Veliz ; Marcus Gutierrez ; Christopher Kiekintveld

【Abstract】: Courses in artificial intelligence and related topics often cover methods for reasoning under uncertainty, decision theory, and game theory. However, these methods can seem very abstract when students first encounter them, and they are often taught using simple “toy” problems. Our goal is to help students to operationalize this knowledge by designing sophisticated autonomous agents that must make complex decisions in games that capture their interest. We describe a tournament-based pedagogy that we have used in two different courses with two different games based on current research topics in artificial intelligence to engage students in designing agents that use strategic reasoning. Many students find this structure very engaging, and we find that students develop a deeper understanding of the abstract strategic reasoning concepts introduced in the courses.

【Keywords】: Game Theory, Pedagogy

579. Training Watson - A Cognitive Systems Course.

【Paper Link】【Pages】:4136-4138

【Authors】: Michael Wollowski

【Abstract】: We developed a course in which students train an instance of Watson and develop an application that interacts with the trained instance. Additionally, students learn technical information about the Jeopardy! version of Watson and they discuss a future infused with cognitive assistants. In this poster, we justify this course, characterize major assessment items and provide advice on choosing a domain.

【Keywords】: Watson; Cognitive Systems course

EAAI Symposium Model AI Assignments 1

580. Model AI Assignments 2016.

【Paper Link】【Pages】:4139-4141

【Authors】: Todd W. Neller ; Laura E. Brown ; James B. Marshall ; Lisa Torrey ; Nate Derbinsky ; Andrew A. Ward ; Thomas E. Allen ; Judy Goldsmith ; Nahom Muluneh

【Abstract】: The Model AI Assignments session seeks to gather and disseminate the best assignment designs of the Artificial Intelligence (AI) Education community. Recognizing that assignments form the core of student learning experience, we here present abstracts of six AI assignments from the 2016 session that are easily adoptable, playfully engaging, and flexible for a variety of instructor needs.

【Keywords】: model AI assignments

Senior Member Blue Sky Papers 4

581. Indefinite Scalability for Living Computation.

【Paper Link】【Pages】:4142-4146

【Authors】: David H. Ackley

【Abstract】: In a question-and-answer format, this summary paper presents background material for the AAAI-16 Senior Member Presentation Track “Blue Sky Ideas” talk of the same name.

【Keywords】: indefinite scalability, robust-first computing, best-effort computing

582. Embedding Ethical Principles in Collective Decision Support Systems.

【Paper Link】【Pages】:4147-4151

【Authors】: Joshua Greene ; Francesca Rossi ; John Tasioulas ; Kristen Brent Venable ; Brian Williams

【Abstract】: The future will see autonomous machines acting in the same environment as humans, in areas as diverse as driving, assistive technology, and health care. Think of self-driving cars, companion robots, and medical diagnosis support systems. We also believe that humans and machines will often need to work together and agree on common decisions. Thus hybrid collective decision making systems will be in great need. In this scenario, both machines and collective decision making systems should follow some form of moral values and ethical principles (appropriate to where they will act but always aligned to humans'), as well as safety constraints. In fact, humans would more readily accept and trust machines that behave as ethically as other humans in the same environment. Also, these principles would make it easier for machines to determine their actions and explain their behavior in terms understandable by humans. Moreover, often machines and humans will need to make decisions together, either through consensus or by reaching a compromise. This would be facilitated by shared moral values and ethical principles.

【Keywords】:

583. Five Dimensions of Reasoning in the Wild.

【Paper Link】【Pages】:4152-4156

【Authors】: Don Perlis

【Abstract】: Reasoning does not work well when done in isolation from its significance, both to the needs and interests of an agent and with respect to the wider world. Moreover, those issues may best be handled with a new sort of data structure that goes beyond the knowledge base and incorporates aspects of perceptual knowledge and even more, in which a kind of anticipatory action may be key.

【Keywords】: reasoning, envisioning

584. Ethical Dilemmas for Adaptive Persuasion Systems.

【Paper Link】【Pages】:4157-4162

【Authors】: Oliviero Stock ; Marco Guerini ; Fabio Pianesi

【Abstract】: A key acceptability criterion for artificial agents will be the possible moral implications of their actions. In particular, intelligent persuasive systems (systems designed to influence humans via communication) constitute a highly sensitive topic because of their intrinsically social nature. Still, ethical studies in this area are rare and tend to focus on the output of the required action; instead, this work focuses on the acceptability of persuasive acts themselves. Building systems able to persuade while being ethically acceptable requires that they be capable of intervening flexibly and of taking decisions about which specific persuasive strategy to use. We show how, exploiting a behavioral approach, based on human assessment of moral dilemmas, we obtain results that will lead to more ethically appropriate systems. Experiments we have conducted address the type of persuader, the strategies adopted and the circumstances. Dimensions surfaced that can characterize the interpersonal differences concerning moral acceptability of machine performed persuasion, usable for strategy adaptation. We also show that the prevailing preconceived negative attitude toward persuasion by a machine is not predictive of actual moral acceptability judgement when subjects are confronted with specific cases.

【Keywords】: AI and ethics; persuasion systems; moral dilemmas

Senior Member Summary Talks 4

585. Ontology Instance Linking: Towards Interlinked Knowledge Graphs.

【Paper Link】【Pages】:4163-4169

【Authors】: Jeff Heflin ; Dezhao Song

【Abstract】: Due to the decentralized nature of the Semantic Web, the same real-world entity may be described in various data sources with different ontologies and assigned syntactically distinct identifiers. In order to facilitate data utilization and consumption in the Semantic Web, without compromising the freedom of people to publish their data, one critical problem is to appropriately interlink such heterogeneous data. This interlinking process is sometimes referred to as Entity Coreference, i.e., finding which identifiers refer to the same real-world entity. In this paper, we first summarize state-of-the-art algorithms in detecting such coreference relationships between ontology instances. We then discuss various techniques in scaling entity coreference to large-scale datasets. Finally, we present well-adopted evaluation datasets and metrics, and compare the performance of the state-of-the-art algorithms on such datasets.

【Keywords】: Semantic Web; Linked Data; Entity Coreference; Scalability; Domain-Independence

586. Natural Language Processing for Enhancing Teaching and Learning.

【Paper Link】【Pages】:4170-4176

【Authors】: Diane J. Litman

【Abstract】: Advances in natural language processing (NLP) and educational technology, as well as the availability of unprecedented amounts of educationally-relevant text and speech data, have led to an increasing interest in using NLP to address the needs of teachers and students. Educational applications differ in many ways, however, from the types of applications for which NLP systems are typically developed. This paper will organize and give an overview of research in this area, focusing on opportunities as well as challenges.

【Keywords】:

587. Strategic Behaviour When Allocating Indivisible Goods.

【Paper Link】【Pages】:4177-4183

【Authors】: Toby Walsh

【Abstract】: We survey some recent research regarding strategic behaviour in resource allocation problems, focusing on the fair division of indivisible goods. We consider a number of computational questions like how a single strategic agent misreports their preferences to ensure a particular outcome, and how agents compute a Nash equilibrium when they all act strategically. We also identify a number of future directions like dealing with non-additive utilities, and partial or probabilistic information about the preferences of other agents.

【Keywords】: fair division, indivisible goods, strategic behavior

588. Rational Verification: From Model Checking to Equilibrium Checking.

【Paper Link】【Pages】:4184-4191

【Authors】: Michael Wooldridge ; Julian Gutierrez ; Paul Harrenstein ; Enrico Marchioni ; Giuseppe Perelli ; Alexis Toumi

【Abstract】: Rational verification is concerned with establishing whether a given temporal logic formula φ is satisfied in some or all equilibrium computations of a multi-agent system – that is, whether the system will exhibit the behaviour φ under the assumption that agents within the system act rationally in pursuit of their preferences. After motivating and introducing the framework of rational verification, we present formal models through which rational verification can be studied, and survey the complexity of key decision problems. We give an overview of a prototype software tool for rational verification, and conclude with a discussion and related work.

【Keywords】: multi-agent systems; game theory; temporal logic; model checking; equilibrium checking; rational verification; synthesis

Student Abstracts 48

【Paper Link】【Pages】:4192-4193

【Authors】: Ofra Amir ; Barbara J. Grosz ; Krzysztof Z. Gajos

【Abstract】: People collaborate in carrying out such complex activities as treating patients, co-authoring documents and developing software. While technologies such as Dropbox and Github enable groups to work in a distributed manner, coordinating team members' individual activities poses significant challenges. In this paper, we formalize the problem of "information sharing in loosely-coupled extended-duration teamwork." We develop a new representation, Mutual Influence Potential Networks (MIP-Nets), to model collaboration patterns and dependencies among activities, and an algorithm, MIP-DOI, that uses this representation to reason about information sharing.

【Keywords】: Information Sharing; Human Teamwork; Collaboration

590. Weighted A* Algorithms for Unsupervised Feature Selection with Provable Bounds on Suboptimality.

【Paper Link】【Pages】:4194-4195

【Authors】: Hiromasa Arai ; Ke Xu ; Crystal Maung ; Haim Schweitzer

【Abstract】: Identifying a small number of features that can represent the data is believed to be NP-hard. Previous approaches exploit algebraic structure and use randomization. We propose an algorithm based on ideas similar to the Weighted A* algorithm in heuristic search. Our experiments show this new algorithm to be more accurate than the current state of the art.

【Keywords】:
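
The abstract above names Weighted A* as the idea behind the proposed algorithm. As background only, here is a minimal sketch of generic Weighted A* on a small graph. It is not the authors' feature selection method; the graph `GRAPH`, the function names, and the default `weight` are hypothetical and chosen for illustration. With an admissible heuristic and `weight >= 1`, the returned cost is at most `weight` times the optimal cost, which is the kind of suboptimality bound the title refers to.

```python
import heapq

# Hypothetical toy graph: node -> list of (neighbor, edge_cost).
GRAPH = {
    "A": [("B", 1.0), ("C", 4.0)],
    "B": [("C", 1.0)],
    "C": [("D", 1.0)],
    "D": [],
}


def weighted_a_star(start, goal, neighbors, heuristic, weight=1.5):
    """Generic Weighted A*: expand nodes in order of g + weight * h.

    neighbors(node) returns an iterable of (next_node, edge_cost) pairs.
    heuristic(node) returns an estimate of the remaining cost to the goal.
    Returns (cost, path), or None if the goal is unreachable.
    """
    open_heap = [(weight * heuristic(start), 0.0, start)]
    best_cost = {start: 0.0}
    parent = {start: None}
    while open_heap:
        _, cost, node = heapq.heappop(open_heap)
        if cost > best_cost.get(node, float("inf")):
            continue  # stale heap entry; a cheaper route was found later
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return cost, path[::-1]
        for nxt, edge_cost in neighbors(node):
            new_cost = cost + edge_cost
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                parent[nxt] = node
                priority = new_cost + weight * heuristic(nxt)
                heapq.heappush(open_heap, (priority, new_cost, nxt))
    return None


# Usage with a zero heuristic, which reduces the search to uniform-cost search.
result = weighted_a_star("A", "D", lambda n: GRAPH[n], lambda n: 0.0)
```

Larger values of `weight` make the search greedier: it typically expands fewer nodes at the price of a looser bound on solution cost.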

591. Abstraction Using Analysis of Subgames.

【Paper Link】【Pages】:4196-4197

【Authors】: Anjon Basak

【Abstract】: Normal form games are one of the most familiar representations for modeling interactions among multiple agents. However, modeling many realistic interactions between agents results in games that are extremely large. In these cases computing standard solutions like Nash equilibrium may be intractable. To overcome this issue the idea of abstraction has been investigated, most prominently in research on computer Poker. Solving a game using abstraction requires using some method to simplify the game before it is analyzed. We study a new variation for solving normal form games using abstraction that is based on finding and solving suitable subgames. We compare this method with several variations of a common type of abstraction based on clustering similar strategies.

【Keywords】: game theory; abstraction; subgame

592. Bayesian Markov Games with Explicit Finite-Level Types.

【Paper Link】【Pages】:4198-4199

【Authors】: Muthukumaran Chandrasekaran ; Yingke Chen ; Prashant Doshi

【Abstract】: In impromptu or ad hoc settings, participating players are precluded from precoordination. Subsequently, each player's own model is private and includes some uncertainty about the others' types or behaviors. Harsanyi's formulation of a Bayesian game lays emphasis on this uncertainty while the players each play exactly one turn. We propose a new game-theoretic framework where Bayesian players engage in a Markov game and each has private but imperfect information regarding other players' types. Consequently, we construct player types whose structure is explicit and includes a finite level belief hierarchy instead of utilizing Harsanyi's abstract types and a common prior distribution. We formalize this new framework and demonstrate its effectiveness on two standard ad hoc teamwork domains involving two or more ad hoc players.

【Keywords】: algorithmic game theory; constraint satisfaction; finite belief hierarchy; Markov games

593. BRBA: A Blocking-Based Association Rule Hiding Method.

【Paper Link】【Pages】:4200-4201

【Authors】: Peng Cheng ; Ivan Lee ; Li Li ; Kuo-Kun Tseng ; Jeng-Shyang Pan

【Abstract】: Privacy preserving in association mining is an important research topic in the database security field. This paper has proposed a blocking-based method to solve the association rule hiding problem for data sharing. It aims at reducing undesirable side effects and increasing desirable side effects, while ensuring to conceal all sensitive rules. The candidate transactions are selected for sanitization based on their relations with border rules. Comparative experiments on real datasets demonstrate that the proposed method can achieve its goals.

【Keywords】: Association rule hiding; Blocking; Border rules

594. A CP-Based Approach for Popular Matching.

【Paper Link】【Pages】:4202-4203

【Authors】: Danuta Sorina Chisca ; Mohamed Siala ; Gilles Simonin ; Barry O'Sullivan

【Abstract】: We propose a constraint programming approach to the popular matching problem. We show that one can use the Global Cardinality Constraint to encode the problem even in cases that involve ties in the ordinal preferences of the applicants.

【Keywords】: Constraint Programming; Popular matching; Preferences

595. Predicting Prices in the Power TAC Wholesale Energy Market.

【Paper Link】【Pages】:4204-4205

【Authors】: Moinul Morshed Porag Chowdhury

【Abstract】: The Power TAC simulation emphasizes the strategic problems that broker agents face in managing the economics of a smart grid. The brokers must make trades in multiple markets and to be successful, brokers must make many good predictions about future supply, demand, and prices. Clearing price prediction is an important part of the broker’s wholesale market strategy because it helps the broker to make intelligent decisions when purchasing energy at low cost in a day-ahead market. I describe my work on using machine learning methods to predict prices in the Power TAC wholesale market, which will be used in future bidding strategies.

【Keywords】: Artificial Intelligence, Machine Learning, Smart Grid, Multi-agent Systems

596. Robust Execution Strategies for Probabilistic Temporal Planning.

【Paper Link】【Pages】:4206-4207

【Authors】: Sam Dietrich ; Kyle Lund ; James C. Boerkoel

【Abstract】: A critical challenge in temporal planning is robustly dealing with non-determinism introduced by the environment, e.g., the durational uncertainty of an action taken by a robot in the physical world due to slippage or other unexpected influences. Recent advances show that robustness, which accounts for uncertainty in predicting schedule success, is a better measure of solution quality than traditional metrics such as flexibility. This paper introduces the Robust Execution Problem (REP) for finding maximally robust dispatch strategies for general probabilistic temporal planning problems. While the REP is generally intractable in practice, we introduce approximate solution techniques—one that can be computed statically prior to the start of execution while providing robustness guarantees and one that dynamically adjusts to opportunities and setbacks during execution. We show empirically that dynamically optimizing for robustness improves the likelihood of execution success.

【Keywords】: Scheduling; Robustness; Execution Strategy; Robust Execution

597. A Comparison of Supervised Learning Algorithms for Telerobotic Control Using Electromyography Signals.

【Paper Link】【Pages】:4208-4211

【Authors】: Tyler M. Frasca ; Antonio G. Sestito ; Craig Versek ; Douglas E. Dow ; Barry C. Husowitz ; Nate Derbinsky

【Abstract】: Human Computer Interaction (HCI) is central for many applications, including hazardous environment inspection and telemedicine. Whereas traditional methods of HCI for teleoperating electromechanical systems include joysticks, levers, or buttons, our research focuses on using electromyography (EMG) signals to improve intuition and response time. An important challenge is to accurately and efficiently extract and map EMG signals to known positions for real-time control. In this preliminary work, we compare the accuracy and real-time performance of several machine-learning techniques for recognizing specific arm positions. We present results from offline analysis, as well as end-to-end operation using a robotic arm.

【Keywords】: Robotics; Machine Learning; Human-Robot Interaction

598. Trust and Distrust Across Coalitions: Shapley Value Based Centrality Measures for Signed Networks (Student Abstract Version).

【Paper Link】【Pages】:4212-

【Authors】: Varun Gangal ; Abhishek Narwekar ; Balaraman Ravindran ; Ramasuri Narayanam

【Abstract】: We propose Shapley Value based centrality measures for signed social networks. We also demonstrate that they lead to improved precision for the troll detection task.

【Keywords】: AI and the Web; Game Theory

599. Authorship Attribution Using a Neural Network Language Model.

【Paper Link】【Pages】:4212-4213

【Authors】: Zhenhao Ge ; Yufang Sun ; Mark J. T. Smith

【Abstract】: In practice, training language models for individual authors is often expensive because of limited data resources. In such cases, Neural Network Language Models (NNLMs) generally outperform the traditional non-parametric N-gram models. Here we investigate the performance of a feed-forward NNLM on an authorship attribution problem, with moderate author set size and relatively limited data. We also consider how the text topics impact performance. Compared with a well-constructed N-gram baseline method with Kneser-Ney smoothing, the proposed method achieves nearly 2.5% reduction in perplexity and increases author classification accuracy by 3.43% on average, given as few as 5 test sentences. The performance is very competitive with the state of the art in terms of accuracy and demand on test data.

【Keywords】: neural networks; language modeling; text classification

600. Structure Aware L1 Graph for Data Clustering.

【Paper Link】【Pages】:4214-4215

【Authors】: Shuchu Han ; Hong Qin

【Abstract】: In graph-oriented machine learning research, the L1 graph is an efficient way to represent the connections of input data samples. Its construction algorithm is based on a numerical optimization motivated by Compressive Sensing theory. As a result, it is a nonparametric method, which is highly demanded. However, information about the data, such as geometric structure and density distribution, is ignored. In this paper, we propose a Structure Aware (SA) L1 graph to improve the data clustering performance by capturing the manifold structure of input data. We use a local dictionary for each datum while calculating its sparse coefficients. The SA-L1 graph not only preserves the locality of data but also captures the geometric structure of data. The experimental results show that our new algorithm has better clustering performance than the L1 graph.

【Keywords】:

601. Multivariate Conditional Outlier Detection and Its Clinical Application.

【Paper Link】【Pages】:4216-4217

【Authors】: Charmgil Hong ; Milos Hauskrecht

【Abstract】: This paper overviews and discusses our recent work on a multivariate conditional outlier detection framework for clinical applications.

【Keywords】: multivariate data modeling; conditional outlier detection; clinical outlier detection

602. Learning Complex Stand-Up Motion for Humanoid Robots.

【Paper Link】【Pages】:4218-4219

【Authors】: Heejin Jeong ; Daniel D. Lee

【Abstract】: In order for humanoid robots to complete various assigned tasks without any human assistance, they must have the ability to stand up on their own. In this abstract, we introduce complex stand-up motion of humanoid robots learned by using Reinforcement Learning.

【Keywords】: Reinforcement Learning; Robotics; Applications of AI

603. Connecting the Dots Using Contextual Information Hidden in Text and Images.

【Paper Link】【Pages】:4220-4221

【Authors】: Md. Abdul Kader ; Sheikh Motahar Naim ; Arnold P. Boedihardjo ; M. Shahriar Hossain

【Abstract】: Creation of summaries of events of interest from a multitude of unstructured data is a challenging task commonly faced by intelligence analysts while seeking increased situational awareness. This paper proposes a framework called Storyboarding that leverages unstructured text and images to explain events as sets of sub-events. The framework first generates a textual context for each human face detected from images and then builds a chain of coherent documents where two consecutive documents of the chain contain a common theme as well as a context. Storyboarding helps analysts quickly narrow down a large number of possibilities to a few significant ones for further investigation. Empirical studies on Wikipedia documents, images and news articles show that Storyboarding is able to provide deeper insights on events of interest.

【Keywords】:

604. Monte Carlo Tree Search for Multi-Robot Task Allocation.

【Paper Link】【Pages】:4222-4223

【Authors】: Bilal Kartal ; Ernesto Nunes ; Julio Godoy ; Maria L. Gini

【Abstract】: Multi-robot teams are useful in a variety of task allocation domains such as warehouse automation and surveillance. Robots in such domains perform tasks at given locations and specific times, and are allocated tasks to optimize given team objectives. We propose an efficient, satisficing and centralized Monte Carlo Tree Search based algorithm exploiting the branch and bound paradigm to solve the multi-robot task allocation problem with spatial, temporal and other side constraints. Unlike previous heuristics proposed for this problem, our approach offers theoretical guarantees and finds optimal solutions for some non-trivial data sets.

【Keywords】: Monte Carlo Tree Search; Optimization; Search

605. Hierarchy Prediction in Online Communities.

【Paper Link】【Pages】:4224-4225

【Authors】: Denys Katerenchuk ; Andrew Rosenberg

【Abstract】: With the development of the Internet, a big part of social interaction has moved online, and people have unconsciously brought their daily communicational habits to the web. Understanding these communications is important because it will lead to a better understanding of online communities, and can improve areas such as e-commerce, advertisement, topic modeling, security, and others. We propose to develop a natural language based ranking algorithm to predict user influence levels in online communication groups.

【Keywords】: Influence detection; Hierarchy Prediction; Power Identification; User Ranking

606. Handling Class Imbalance in Link Prediction Using Learning to Rank Techniques.

【Paper Link】【Pages】:4226-4227

【Authors】: Bopeng Li ; Sougata Chaudhuri ; Ambuj Tewari

【Abstract】: We consider the link prediction (LP) problem in a partially observed network, where the objective is to make predictions in the unobserved portion of the network. Many existing methods reduce LP to binary classification. However, the dominance of absent links in real world networks makes misclassification error a poor performance metric. Instead, researchers have argued for using ranking performance measures, like AUC, AP and NDCG, for evaluation. We recast the LP problem as a learning to rank problem and use effective learning to rank techniques directly during training which allows us to deal with the class imbalance problem systematically. As a demonstration of our general approach, we develop an LP method by optimizing the cross-entropy surrogate, originally used in the popular ListNet ranking algorithm. We conduct extensive experiments on publicly available co-authorship, citation and metabolic networks to demonstrate the merits of our method.

【Keywords】: Link Prediction; Class Imbalance; Learning to Rank
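
The abstract above mentions optimizing the cross-entropy surrogate used in ListNet. As a rough illustration, the sketch below computes the commonly used top-one form of that loss for a single list of candidate links. It is not the authors' implementation; the function names and the example scores are hypothetical, and a real link prediction method would also need a scoring model and an optimizer.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of real-valued scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def listnet_top_one_loss(predicted_scores, relevance_labels):
    """Cross-entropy between the top-one distributions induced by the
    relevance labels (target) and by the model scores (prediction)."""
    target = softmax(relevance_labels)
    predicted = softmax(predicted_scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))


# Hypothetical example: one observed link (label 1) among three absent links.
labels = [1.0, 0.0, 0.0, 0.0]
loss_good = listnet_top_one_loss([3.0, 0.0, 0.0, 0.0], labels)  # true link ranked first
loss_bad = listnet_top_one_loss([0.0, 0.0, 0.0, 3.0], labels)   # true link ranked low
```

Because the loss depends on the whole score list rather than on each candidate in isolation, ranking the single observed link below an absent one is penalized even when absent links dominate the list.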

607. Predicting Links and Their Building Time: A Path-Based Approach.

【Paper Link】【Pages】:4228-4229

【Authors】: Manling Li ; Yantao Jia ; Yuanzhuo Wang ; Zeya Zhao ; Xueqi Cheng

【Abstract】: Predicting links and their building time in a knowledge network has been extensively studied in recent years. Most structure-based predictive methods consider structures and the time information of edges separately, and so fail to characterize the correlation between them. In this paper, we propose a structure called the Time-Difference-Labeled Path, and a link prediction method (TDLP). Experiments show that TDLP outperforms the state-of-the-art methods.

【Keywords】: Temporal Link Prediction; Knowledge Network; Time-Difference-Labeled Path

【Paper Link】【Pages】:4230-4231

【Authors】: Xin Li ; Yanghui Rao ; Yanjia Chen ; Xuebo Liu ; Huan Huang

【Abstract】: With the development of Web 2.0, many users express their opinions online. This paper is concerned with the classification of social emotions on varied-scale datasets. Different from traditional models which weight training documents equally, the concept of emotional entropy is proposed to estimate the weight and tackle the issue of noisy documents. The topic assignment is also used to distinguish different emotional senses of the same word. Experimental evaluations using different data sets validate the effectiveness of the proposed social emotion classification model.

【Keywords】: Social emotion classification; Emotional entropy; Public opinion mining

609. Two-Stream Contextualized CNN for Fine-Grained Image Classification.

【Paper Link】【Pages】:4232-4233

【Authors】: Jiang Liu ; Chenqiang Gao ; Deyu Meng ; Wangmeng Zuo

【Abstract】: The human cognition system suggests that context information provides a potentially powerful clue while recognizing objects. However, for fine-grained image classification, the contribution of context may vary over different images, and sometimes the context even confuses the classification result. To alleviate this problem, in our work, we develop a novel approach, two-stream contextualized Convolutional Neural Network, which provides a simple but efficient context-content joint classification model under a deep learning framework. The network merely requires the raw image and a coarse segmentation as input to extract both content and context features without need of human interaction. Moreover, our network adopts a weighted fusion scheme to combine the content and the context classifiers, while a subnetwork is introduced to adaptively determine the weight for each image. According to our experiments on public datasets, our approach achieves considerably high recognition accuracy without any tedious human involvement, as compared with the state-of-the-art approaches.

【Keywords】: deep learning; contextualization; fine-grained object classification

610. Decision Sum-Product-Max Networks.

【Paper Link】【Pages】:4234-4235

【Authors】: Mazen Melibari ; Pascal Poupart ; Prashant Doshi

【Abstract】: Sum-Product Networks (SPNs) were recently proposed as a new class of probabilistic graphical models that guarantee tractable inference, even on models with high-treewidth. In this paper, we propose a new extension to SPNs, called Decision Sum-Product-Max Networks (Decision-SPMNs), that makes SPNs suitable for discrete multi-stage decision problems. We present an algorithm that solves Decision-SPMNs in a time that is linear in the size of the network. We also present algorithms to learn the parameters of the network from data.

【Keywords】: Tractable Models; Sum-Product Networks; Decision Making Under Uncertainty

611. Iterative Project Quasi-Newton Algorithm for Training RBM.

【Paper Link】【Pages】:4236-4237

【Authors】: Shuai Mi ; Xiaozhao Zhao ; Yuexian Hou ; Peng Zhang ; Wenjie Li ; Dawei Song

【Abstract】: The restricted Boltzmann machine (RBM) has been used as a building block for many successful deep learning models, e.g., deep belief networks (DBN) and deep Boltzmann machines (DBM). The training of RBM can be extremely slow in pathological regions. Second order optimization methods, such as quasi-Newton methods, were proposed to deal with this problem. However, the non-convexity results in many obstructions for training RBM, including the infeasibility of applying second order optimization methods. In order to overcome this obstruction, we introduce an EM-like iterative project quasi-Newton (IPQN) algorithm. Specifically, we iteratively perform the sampling procedure, where it is not necessary to update parameters, and the sub-training procedure, which is convex. In sub-training procedures, we apply quasi-Newton methods to deal with the pathological problem. We further show that Newton's method turns out to be a good approximation of the natural gradient (NG) method in RBM training. We evaluate IPQN in a series of density estimation experiments on the artificial dataset and the MNIST digit dataset. Experimental results indicate that IPQN achieves improved convergence performance over the traditional CD method.

【Keywords】: RBM; Newton's method; Natural gradient

612. Pseudo-Tree Construction Heuristics for DCOPs with Variable Communication Times.

【Paper Link】【Pages】:4238-4239

【Authors】: Atena M. Tabakhi

【Abstract】: Empirical evaluations of DCOP algorithms are typically done in simulation and under the assumption that the communication times between all pairs of agents are identical, which is unrealistic in many real-world applications. In this abstract, we incorporate non-uniform communication times in the default DCOP model and propose heuristics that exploit these communication times to speed up DCOP algorithms that operate on pseudo-trees.

【Keywords】:

613. A Word Embedding and a Josa Vector for Korean Unsupervised Semantic Role Induction.

【Paper Link】【Pages】:4240-4241

【Authors】: Kyeong-Min Nam ; Yu-Seop Kim

【Abstract】: We propose an unsupervised semantic role labeling method for the Korean language, one of the agglutinative languages, which have complicated suffix structures that carry much syntactic information. First, we construct an argument embedding and then develop an indicator vector of the suffix, such as a Josa. We then construct an argument tuple by concatenating these two vectors. The role induction is performed by clustering the argument tuples. This method achieves up to a 70.16% F1-score and 75.85% accuracy.

【Keywords】:

614. Conquering Adversary Behavioral Uncertainty in Security Games: An Efficient Modeling Robust Based Algorithm.

【Paper Link】【Pages】:4242-4243

【Authors】: Thanh Hong Nguyen ; Arunesh Sinha ; Milind Tambe

【Abstract】: Stackelberg Security Games (SSG) have been widely applied for solving real-world security problems—with a significant research emphasis on modeling attackers’ behaviors to handle their bounded rationality. However, access to real-world data (used for learning an accurate behavioral model) is often limited, leading to uncertainty in attacker’s behaviors while modeling. This paper therefore focuses on addressing behavioral uncertainty in SSG with the following main contributions: 1) we present a new uncertainty game model that integrates uncertainty intervals into a behavioral model to capture behavioral uncertainty; 2) based on this game model, we propose a novel robust algorithm that approximately computes the defender’s optimal strategy in the worst-case scenario of uncertainty—with a bound guarantee on its solution quality.

【Keywords】: security games; behavioral modeling; uncertainty

615. Bayesian AutoEncoder: Generation of Bayesian Networks with Hidden Nodes for Features.

【Paper Link】【Pages】:4244-4245

【Authors】: Kaneharu Nishino ; Mary Inaba

【Abstract】: We propose Bayesian AutoEncoder (BAE) in order to construct a recognition system which uses feedback information. BAE constructs a generative model of input data as a Bayes Net. The network trained by BAE obtains its hidden variables as the features of given data. It can execute inference for each variable through belief propagation, using both feedforward and feedback information. We confirmed that BAE can construct small networks with one hidden layer and extract features as hidden variables from 3x3 and 5x5 pixel input data.

【Keywords】:

616. Human-Robot Trust and Cooperation Through a Game Theoretic Framework.

【Paper Link】【Pages】:4246-4247

【Authors】: Erin Paeng ; Jane Wu ; James C. Boerkoel

【Abstract】: Trust and cooperation are fundamental to human interactions. How much we trust other people directly influences the decisions we make and our willingness to cooperate. It thus seems natural that trust be equally important in successful human-robot interaction (HRI), since how much a human trusts a robot affects how they might interact with it. We propose using a coin entrustment game, a variant of prisoner’s dilemma, to measure trust and cooperation as separate phenomena between human and robot agents. With this game, we test the following hypotheses: (1) Humans will achieve and maintain higher levels of trust when interacting with what they believe to be a robot than with another human; and (2) humans will cooperate more readily with robots and will maintain a higher level of cooperation. This work contributes an experimental paradigm that uses the coin entrustment game as a way to test our hypotheses. Our empirical analysis shows that humans tend to trust robots to a greater degree than other humans, while cooperating equally well in both.

【Keywords】: Trust, Human-robot Interaction

617. Efficient Collaborative Crowdsourcing.

【Paper Link】【Pages】:4248-4249

【Authors】: Zhengxiang Pan ; Han Yu ; Chunyan Miao ; Cyril Leung

【Abstract】: We consider the problem of making efficient quality-time-cost trade-offs in collaborative crowdsourcing systems in which different skills from multiple workers need to be combined to complete a task. We propose CrowdAsm - an approach which helps collaborative crowdsourcing systems determine how to combine the expertise of available workers to maximize the expected quality of results while minimizing the expected delays. Analysis proves that CrowdAsm can achieve close to optimal profit for workers in a given crowdsourcing system if they follow the recommendations.

【Keywords】: Collaborative Crowdsourcing; inter-generational; team formation

618. SPAN: Understanding a Question with Its Support Answers.

【Paper Link】【Pages】:4250-4251

【Authors】: Liang Pang ; Yanyan Lan ; Jiafeng Guo ; Jun Xu ; Xueqi Cheng

【Abstract】: Matching a question to its best answer is a common task in community question answering. In this paper, we focus on non-factoid questions and aim to pick out the best answer from the candidate answers. Most of the existing deep models directly measure the similarity between question and answer by their individual sentence embeddings. In order to tackle the lack of information in question descriptions and the lexical gap between questions and answers, we propose a novel deep architecture, namely SPAN, in this paper. Specifically, we introduce support answers to help understand the question, which are defined as the best answers of questions similar to the original one. Then we can obtain two kinds of similarities: one is between the question and the candidate answer, and the other is between the support answers and the candidate answer. The matching score is finally generated by combining them. Experiments on Yahoo! Answers demonstrate that SPAN can outperform the baseline models.

【Keywords】:

619. Towards Structural Tractability in Hedonic Games.

【Paper Link】【Pages】:4252-4253

【Authors】: Dominik Peters

【Abstract】: Hedonic games are a well-studied model of coalition formation, in which selfish agents are partitioned into disjoint sets, and agents care about the make-up of the coalition they end up in. The computational problem of finding a stable outcome tends to be computationally intractable, even after severely restricting the types of preferences that agents are allowed to report. We investigate a structural way of achieving tractability, by requiring that agents' preferences interact in a well-behaved manner. Precisely, we show that stable outcomes can be found in linear time for hedonic games that satisfy a notion of bounded treewidth and bounded degree.

【Keywords】: hedonic games; bounded treewidth; structural tractability; preference restrictions

620. Heuristic Planning for Hybrid Systems.

【Paper Link】【Pages】:4254-4255

【Authors】: Wiktor Mateusz Piotrowski ; Maria Fox ; Derek Long ; Daniele Magazzeni ; Fabio Mercorio

【Abstract】: Planning in hybrid systems has been gaining research interest in the Artificial Intelligence community in recent years. Hybrid systems allow for a more accurate representation of real world problems, though solving them is very challenging due to complex system dynamics and a large model feature set. We developed DiNo, a new planner designed to tackle problems set in hybrid domains. DiNo is based on the discretise and validate approach and uses the novel Staged Relaxed Planning Graph+ (SRPG+) heuristic.

【Keywords】: Automated Planning; PDDL+; Planning as Model Checking; Planning in Mixed Discrete/Continuous Domains

621. Counter-Transitivity in Argument Ranking Semantics.

【Paper Link】【Pages】:4256-4257

【Authors】: Fuan Pu ; Jian Luo ; Guiming Luo

【Abstract】: The principle of counter-transitivity plays a vital role in argumentation. It states that an argument is strong when its attackers are weak, and is weak when its attackers are strong. In this work, we develop a formal theory about the argument ranking semantics based on this principle. Three approaches, quantity-based, quality-based and the unity of them, are defined to implement the principle. Then, we show an iterative refinement algorithm for capturing the ranking on arguments based on the recursive nature of the principle.

【Keywords】: Abstract argumentation framework; Ranking Semantics; counter-transitivity
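
The abstract of entry 621 does not spell out its iterative refinement algorithm, so the following is only an illustrative sketch of the counter-transitivity idea. It swaps in a well-known fixed-point scoring scheme (an h-categoriser-style update, where an argument's strength is 1 / (1 + sum of its attackers' strengths)), not the authors' method. The function name `rank_arguments` and the toy framework are hypothetical.

```python
def rank_arguments(attackers, iterations=100, tol=1e-9):
    """Score arguments so that weak attackers imply a strong argument.

    attackers maps every argument to the list of arguments attacking it.
    This is an h-categoriser-style update, used here as a stand-in;
    it is NOT the algorithm from the paper.
    """
    strength = {a: 1.0 for a in attackers}
    for _ in range(iterations):
        # Recompute every strength from the previous round's values.
        updated = {
            a: 1.0 / (1.0 + sum(strength[b] for b in attackers[a]))
            for a in attackers
        }
        delta = max(abs(updated[a] - strength[a]) for a in attackers)
        strength = updated
        if delta < tol:
            break
    ranking = sorted(strength, key=strength.get, reverse=True)
    return ranking, strength


# Hypothetical framework: c attacks b, while b and d both attack a.
framework = {"a": ["b", "d"], "b": ["c"], "c": [], "d": []}
ranking, strength = rank_arguments(framework)
print(ranking, strength)
```

Unattacked arguments keep the maximum strength, and an argument attacked by strong arguments ends up ranked low, which is the behaviour the principle describes.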

622. Discriminative Structure Learning of Arithmetic Circuits.

【Paper Link】【Pages】:4258-4259

【Authors】: Amirmohammad Rooshenas ; Daniel Lowd

【Abstract】: The biggest limitation of probabilistic graphical models is the complexity of inference, which is often intractable. An appealing alternative is to use tractable probabilistic models, such as arithmetic circuits (ACs) and sum-product networks (SPNs), in which marginal and conditional queries can be answered efficiently. In this paper, we present the first discriminative structure learning algorithm for ACs, DACLearn (Discriminative AC Learner), which optimizes conditional log-likelihood. Based on our experiments, DACLearn learns models that are more accurate and compact than other tractable generative and discriminative baselines.

【Keywords】: discriminative learning; structure learning; tractable models; graphical models; arithmetic circuits

623. Unsupervised Measure of Word Similarity: How to Outperform Co-Occurrence and Vector Cosine in VSMs.

【Paper Link】【Pages】:4260-4261

【Authors】: Enrico Santus ; Alessandro Lenci ; Tin-Shing Chiu ; Qin Lu ; Chu-Ren Huang

【Abstract】: In this paper, we claim that vector cosine – which is generally considered among the most efficient unsupervised measures for identifying word similarity in Vector Space Models – can be outperformed by an unsupervised measure that calculates the extent of the intersection among the most mutually dependent contexts of the target words. To prove it, we describe and evaluate APSyn, a variant of the Average Precision that, without any optimization, outperforms the vector cosine and the co-occurrence on the standard ESL test set, with an improvement ranging between +9.00% and +17.98%, depending on the number of chosen top contexts.

【Keywords】: Semantic Relations; Semantics; Hypernymy; Entailment; Classifier; Features; Unsupervised; Vector Space Models; VSMs; Distributional Semantic Models; DSMs
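
As a rough illustration of the measure described in entry 623, the sketch below scores two words by the overlap of their top-ranked contexts, weighting each shared context by the inverse of its average rank. The exact weighting, the association scores, and the names `top_contexts` and `apsyn` are assumptions made for this example; the abstract only states that the measure is a variant of Average Precision computed over the intersection of the most mutually dependent contexts.

```python
def top_contexts(weights, n):
    """Return {context: rank} for the n highest-weighted contexts (rank 1 is strongest)."""
    ordered = sorted(weights, key=weights.get, reverse=True)[:n]
    return {ctx: rank for rank, ctx in enumerate(ordered, start=1)}


def apsyn(weights_a, weights_b, n=1000):
    """Sum 1 / (average rank) over contexts shared by both top-n lists.

    Assumed formulation for illustration; not taken verbatim from the paper.
    """
    ranks_a = top_contexts(weights_a, n)
    ranks_b = top_contexts(weights_b, n)
    shared = ranks_a.keys() & ranks_b.keys()
    return sum(1.0 / ((ranks_a[c] + ranks_b[c]) / 2.0) for c in shared)


# Toy association weights (hypothetical; in practice these would be
# corpus-derived scores such as PMI between a word and its contexts).
dog = {"bark": 5.0, "pet": 4.0, "tail": 3.0, "bone": 1.0}
cat = {"pet": 6.0, "purr": 5.0, "tail": 2.0, "milk": 1.0}
car = {"drive": 7.0, "wheel": 4.0, "engine": 3.0, "tail": 0.5}

print(apsyn(dog, cat, n=3))  # shares "pet" and "tail" among the top 3
print(apsyn(dog, car, n=3))  # no shared top-3 contexts
```

Unlike vector cosine, this kind of score ignores low-ranked contexts entirely, so the choice of `n` (the number of top contexts) directly controls how much of each word's distribution is compared.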

624. ROOT13: Spotting Hypernyms, Co-Hyponyms and Randoms.

【Paper Link】【Pages】:4262-4263

【Authors】: Enrico Santus ; Alessandro Lenci ; Tin-Shing Chiu ; Qin Lu ; Chu-Ren Huang

【Abstract】: In this paper, we describe ROOT13, a supervised system for the classification of hypernyms, co-hyponyms and random words. The system relies on a Random Forest algorithm and 13 unsupervised corpus-based features. We evaluate it with a 10-fold cross validation on 9,600 pairs, equally distributed among the three classes and involving several Parts-Of-Speech (i.e. adjectives, nouns and verbs). When all the classes are present, ROOT13 achieves an F1 score of 88.3%, against a baseline of 57.6% (vector cosine). When the classification is binary, ROOT13 achieves the following results: hypernyms-co-hyponyms (93.4% vs. 60.2%), hypernyms-random (92.3% vs. 65.5%) and co-hyponyms-random (97.3% vs. 81.5%). Our results are competitive with state-of-the-art models.

【Keywords】: Semantic Relations; Semantics; Hypernymy; Entailment; Classifier; Features; Unsupervised; Vector Space Models; VSMs; Distributional Semantic Models; DSMs

625. Abstracting Complex Domains Using Modular Object-Oriented Markov Decision Processes.

【Paper Link】【Pages】:4264-4265

【Authors】: Shawn Squire ; Marie desJardins

【Abstract】: We present an initial proposal for modular object-oriented MDPs, an extension of OO-MDPs that abstracts complex domains that are partially observable and stochastic with multiple goals. Modes reduce the curse of dimensionality by reducing the number of attributes, objects, and actions into only the features relevant for each goal. These modes may also be used as an abstracted domain to be transferred to other modes or to another domain.

【Keywords】: markov decision process; mdp; object-oriented; options; affordances; abstraction

626. Image Privacy Prediction Using Deep Features.

【Paper Link】【Pages】:4266-4267

【Authors】: Ashwini Kishore Tonge ; Cornelia Caragea

【Abstract】: Online image sharing in social media sites such as Facebook, Flickr, and Instagram can lead to unwanted disclosure and privacy violations when privacy settings are used inappropriately. With the exponential increase in the number of images that are shared online, the development of effective and efficient prediction methods for image privacy settings is highly needed. In this study, we explore deep visual features and deep image tags for image privacy prediction. The results of our experiments show that models trained on deep visual features outperform those trained on SIFT and GIST. The results also show that deep image tags combined with user tags perform best among all tested features.

【Keywords】: Deep visual feature, Deep tag, User tag, Deep neural network, Deep image tag, Social networking site, Image privacy classification, Neural network, Privacy setting, Deep feature

627. Evaluating the Robustness of Game Theoretic Solutions When Using Abstraction.

【Paper Link】【Pages】:4268-4269

【Authors】: Oscar Samuel Veliz

【Abstract】: Game theory is a tool for modeling multi-agent decision problems and has been used to analyze strategies in domains such as poker, security, and trading agents. One method for solving very large games is to use abstraction techniques to shrink the game by removing detail, solve the reduced game, and then translate the solution back to the original game. We present a methodology for evaluating the robustness of different game-theoretic solution concepts to the errors introduced by the abstraction process. We present an initial empirical study of the robustness of several solution methods when using abstracted games.

【Keywords】: Game Theory; Abstraction; Empirical Game Modeling

628. Text Simplification Using Neural Machine Translation.

【Paper Link】【Pages】:4270-4271

【Authors】: Tong Wang ; Ping Chen ; John Rochford ; Jipeng Qiang

【Abstract】: Text simplification (TS) is the technique of reducing the lexical and syntactic complexity of text. Existing automatic TS systems can simplify text only by lexical simplification or by manually defined rules. Neural Machine Translation (NMT) is a recently proposed approach for Machine Translation (MT) that is receiving a lot of research interest. In this paper, we regard original English and simplified English as two languages, and apply an NMT model, a Recurrent Neural Network (RNN) encoder-decoder, to TS so that the neural network learns text simplification rules by itself. We then discuss challenges and strategies for applying an NMT model to the task of text simplification.

【Keywords】: Text Simplification; RNN; Deep Learning

629. Business Event Curation: Merging Human and Automated Approaches.

【Paper Link】【Pages】:4272-4273

【Authors】: Yiqi Wang ; Huiying Ma ; Nichola Lowe ; Maryann Feldman ; Charles Schmitt

【Abstract】: We present preliminary work to construct a knowledge curation system to advance research in the study of regional economics. The proposed system exploits natural language processing (NLP) techniques to automatically implement business event extraction, provides a user-facing interface to assist human curators, and a feedback loop to improve the performance of the Information Extraction Model for the automated parts of the system. Progress to date has shown that we can improve standard NLP approaches for entity and relationship extraction through heuristic means and provide indexing of extracted relationships to aid curation.

【Keywords】: Information Extraction; Business Intelligence; Natural Language Processing; Curation

630. Direct Discriminative Bag Mapping for Multi-Instance Learning.

【Paper Link】【Pages】:4274-4275

【Authors】: Jia Wu ; Shirui Pan ; Peng Zhang ; Xingquan Zhu

【Abstract】: Multi-instance learning (MIL) is useful for tackling labeling ambiguity in learning tasks, by allowing a bag of instances to share one label. Recently, bag mapping methods, which transform a bag to a single instance in a new space via instance selection, have drawn significant attention. To date, most existing works are developed based on the original space, i.e., utilizing all instances for bag mapping, and instance selection is indirectly tied to the MIL objective. As a result, it is hard to guarantee the discriminative capacity of the selected instances in the new bag mapping space for MIL. In this paper, we propose a direct discriminative mapping approach for multi-instance learning (MILDM), which identifies instances to directly distinguish bags in the new mapping space. Experiments and comparisons on real-world learning tasks demonstrate the algorithm performance.

【Keywords】: Bag, Multi-instance, Classification

631. Mobility Sequence Extraction and Labeling Using Sparse Cell Phone Data.

【Paper Link】【Pages】:4276-4277

【Authors】: Yingxiang Yang ; Peter Widhalm ; Shounak Athavale ; Marta C. González

【Abstract】: Human mobility modeling for either transportation system development or individual location based services has a tangible impact on people's everyday experience. In recent years cell phone data has received a lot of attention as a promising data source because of the wide coverage, long observation period, and low cost. The challenge in utilizing such data is how to robustly extract people's trip sequences from sparse and noisy cell phone data and endow the extracted trips with semantic meaning, i.e., trip purposes. In this study we reconstruct trip sequences from sparse cell phone records. Next we propose a Bayesian trip purpose classification method and compare it to a Markov random field based trip purpose clustering method, representing scenarios with and without labeled training data respectively. This procedure shows how the cell phone data, despite their coarse granularity and sparsity, can be turned into a low cost, long term, and ubiquitous sensor network for mobility related services.

【Keywords】: Sparse spatial temporal traces; Markov random field; Bayesian classification method; Unlabeled training data

632. Epitomic Image Super-Resolution.

【Paper Link】【Pages】:4278-4279

【Authors】: Yingzhen Yang ; Zhangyang Wang ; Zhaowen Wang ; Shiyu Chang ; Ding Liu ; Honghui Shi ; Thomas S. Huang

【Abstract】: We propose Epitomic Image Super-Resolution (ESR) to enhance the current internal SR methods that exploit the self-similarities in the input. Instead of local nearest neighbor patch matching used in most existing internal SR methods, ESR employs epitomic patch matching that features robustness to noise, and both local and non-local patch matching. Extensive objective and subjective evaluation demonstrate the effectiveness and advantage of ESR on various images.

【Keywords】:

633. MicroScholar: Mining Scholarly Information from Chinese Microblogs.

【Paper Link】【Pages】:4280-4281

【Authors】: Yang Yu ; Xiaojun Wan

【Abstract】: For many researchers, one of the biggest issues is the lack of an efficient method to obtain the latest academic progress in related research fields. We notice that many researchers tend to share their research progress or recommend scholarly information they know of on their microblogs. In order to exploit microblogging to benefit scientific research, we build a system called MicroScholar to automatically collect and mine scholarly information from Chinese microblogs. In this paper, we briefly introduce the system framework and focus on the component of scholarly microblog categorization. Several kinds of features have been used in the component and experimental results demonstrate their usefulness.

【Keywords】:

634. Intrinsic and Extrinsic Evaluations of Word Embeddings.

【Paper Link】【Pages】:4282-4283

【Authors】: Michael Zhai ; Johnny Tan ; Jinho D. Choi

【Abstract】: In this paper, we first analyze the semantic composition of word embeddings by cross-referencing their clusters with the manual lexical database, WordNet. We then evaluate a variety of word embedding approaches by comparing their contributions to two NLP tasks. Our experiments show that the word embedding clusters give high correlations to the synonym and hyponym sets in WordNet, and give 0.88% and 0.17% absolute improvements in accuracy to named entity recognition and part-of-speech tagging, respectively.

【Keywords】: embeddings; clustering

635. User-Centric Affective Computing of Image Emotion Perceptions.

【Paper Link】【Pages】:4284-4285

【Authors】: Sicheng Zhao ; Hongxun Yao ; Wenlong Xie ; Xiaolei Jiang

【Abstract】: We propose to predict the personalized emotion perceptions of images for each viewer. Different factors that may influence emotion perceptions, including visual content, social context, temporal evolution, and location influence are jointly investigated via the presented rolling multi-task hypergraph learning. For evaluation, we set up a large scale image emotion dataset from Flickr, named Image-Emotion-Social-Net, with over 1 million images and about 8,000 users. Experiments conducted on this dataset demonstrate the superiority of the proposed method, as compared to state-of-the-art.

【Keywords】: Affective computing; Image emotion; Personalized perception; Hypergraph learning

636. Learning Structural Features of Nodes in Large-Scale Networks for Link Prediction.

【Paper Link】【Pages】:4286-4288

【Authors】: Aakas Zhiyuli ; Xun Liang ; Xiaoping Zhou

【Abstract】: We present an algorithm (LsNet2Vec) that, given a large-scale network (millions of nodes), embeds the structural features of each node into a low, fixed-dimensional vector of real numbers. We experiment and evaluate our proposed approach with twelve datasets collected from SNAP. Results show that our model performs comparably with state-of-the-art methods, such as the Katz method and the Random Walk with Restart method, in various experiment settings.

【Keywords】: large-scale network; link prediction; graph modeling
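
For readers unfamiliar with the Katz method named as a baseline in entry 636, here is a minimal sketch of a truncated Katz index for link prediction. It illustrates the baseline only, not LsNet2Vec. The damping factor `beta`, the walk-length cutoff `max_len`, and the toy graph are assumptions chosen for the example.

```python
def katz_scores(adj, beta=0.1, max_len=4):
    """Truncated Katz index: sum over l = 1..max_len of beta**l * (A**l)[i][j].

    adj is a square adjacency matrix given as a list of lists.
    Longer walks between two nodes contribute exponentially less.
    """
    n = len(adj)
    scores = [[0.0] * n for _ in range(n)]
    power = [row[:] for row in adj]  # current matrix power, starting at A^1
    weight = beta
    for _ in range(max_len):
        for i in range(n):
            for j in range(n):
                scores[i][j] += weight * power[i][j]
        # Advance to the next power of the adjacency matrix.
        power = [
            [sum(power[i][k] * adj[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
        weight *= beta
    return scores


# Hypothetical path graph 0 - 1 - 2 - 3.
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = katz_scores(adj)
print(scores[0][2], scores[0][3])
```

Node pairs that are not yet linked but receive a high score (here, nodes 0 and 2 score higher than nodes 0 and 3) are the ones a Katz-based predictor would propose as likely future links. The dense pure-Python loops are for clarity only and would not scale to networks with millions of nodes.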

Doctoral Consortium 17

637. Interactive Learning and Analogical Chaining for Moral and Commonsense Reasoning.

【Paper Link】【Pages】:4289-4290

【Authors】: Joseph A. Blass

【Abstract】: Autonomous systems must consider the moral ramifications of their actions. Moral norms vary among people and depend on common sense, posing a challenge for encoding them explicitly in a system. I propose to develop a model of repeated analogical chaining and analogical reasoning to enable autonomous agents to interactively learn to apply common sense and model an individual’s moral norms.

【Keywords】: Commonsense Reasoning; Moral Reasoning; Analogical Reasoning

638. Machine Learning for Computational Psychology.

【Paper Link】【Pages】:4291-4292

【Authors】: Sarah M. Brown

【Abstract】: Advances in sensing and imaging have provided psychology researchers new tools to understand how the brain creates the mind and simultaneously revealed the need for a new paradigm of mind-brain correspondence: a set of basic theoretical tenets and an overhauled methodology. I develop machine learning methods to overcome three initial technical barriers to application of the new paradigm. I assess candidate solutions to these problems using two test datasets representing different areas of psychology: the first aiming to build more objective Post-Traumatic Stress Disorder (PTSD) diagnostic tools using virtual reality and peripheral physiology, the second aiming to verify theoretical tenets of the new paradigm in a study of basic affect using functional Magnetic Resonance Imaging (fMRI). Specifically, I address three technical challenges: assessing performance in small, real datasets through stability; learning from labels of varying quality; and probabilistic representations of dynamical systems.

【Keywords】:

639. Robust Learning from Demonstration Techniques and Tools.

【Paper Link】【Pages】:4293-4294

【Authors】: William Curran

【Abstract】: Large state spaces and the curse of dimensionality contribute to the complexity of a task. Learning from demonstration techniques can be combined with reinforcement learning to narrow the exploration space of an agent, but require consistent and accurate demonstrations, as well as the state-action pairs for an entire demonstration. Individuals with severe motor disabilities are often slow and prone to human errors in demonstrations while teaching. My dissertation develops tools to allow persons with severe motor disabilities, and individuals in general, to train these systems. To handle these large state spaces as well as human error, we developed Dimensionality Reduced Reinforcement Learning. To accommodate slower feedback, we will develop a movie-reel style learning from demonstration interface.

【Keywords】: Reinforcement Learning; Feature Selection; Transfer

640. Integrating Planning and Recognition to Close the Interaction Loop.

【Paper Link】【Pages】:4295-4296

【Authors】: Richard G. Freedman

【Abstract】: In many real-world domains, the presence of machines is becoming more ubiquitous to the point that they are usually more than simple automation tools. As part of the environment amongst human users, it is necessary for these computers and robots to be able to interact with them reasonably by either working independently around them or participating in a task, especially one with which a person needs help. This interactive procedure requires several steps: recognizing the user and environment from sensor data, interpreting the user’s activity and motives, determining a responsive behavior, performing the behavior, and then recognizing everything again to confirm the behavior choice and replan if necessary. At the moment, the research areas addressing these steps, activity recognition, plan recognition, intent recognition, and planning, have all been primarily studied independently. However, pipelining each independent process can be risky in real-time situations where there may be enough time to only run a few steps. This leads to a critical question: how do we perform everything under time constraints? In this thesis summary, I propose a framework that integrates these processes by taking advantage of features shared between them.

【Keywords】: Plan Recognition; Activity Recognition; Planning; LDA

641. Apprenticeship Scheduling for Human-Robot Teams.

【Paper Link】【Pages】:4297-4298

【Authors】: Matthew C. Gombolay

【Abstract】: Resource optimization and scheduling is a costly, challenging problem that affects almost every aspect of our lives. One example that affects each of us is health care: Poor systems design and scheduling of resources can lead to higher rates of patient noncompliance and burnout of health care providers, as highlighted by the Institute of Medicine (Brandenburg et al. 2015). In aerospace manufacturing, every minute re-scheduling in response to dynamic disruptions in the build process of a Boeing 747 can cost up to $100,000. The military is also highly invested in the effective use of resources. In missile defense, for example, operators must solve a challenging weapon-to-target problem, balancing the cost of expendable, defensive weapons while hedging against uncertainty in adversaries’ tactics. Researchers in artificial intelligence (AI) planning and scheduling strive to develop algorithms to improve resource allocation. However, there are two primary challenges. First, optimal task allocation and sequencing with upper and lower-bound temporal constraints (i.e., deadlines and wait constraints) is NP-Hard (Bertsimas and Weismantel 2005). Approximation techniques for scheduling exist and typically rely on the algorithm designer crafting heuristics based on domain expertise to decompose or structure the scheduling problem and prioritize the manner in which resources are allocated and tasks are sequenced (Tang and Parker 2005; Jones, Dias, and Stentz 2011). The second problem is this aforementioned reliance on crafting clever heuristics based on domain knowledge. Manually capturing domain knowledge within a scheduling algorithm remains a challenging process and leaves much to be desired (Ryan et al. 2013).
The aim of my thesis is to develop an autonomous system that 1) learns the heuristics and implicit rules-of-thumb developed by domain experts from years of experience, 2) embeds and leverages this knowledge within a scalable resource optimization framework, and 3) provides decision support in a way that engages users and benefits them in their decision-making process. By intelligently leveraging the ability of humans to learn heuristics and the speed of modern computation, we can improve the ability to coordinate resources in these time and safety-critical domains.

【Keywords】: Scheduling; Learning From Example; Resource Optimization; Robotics

【Paper Link】【Pages】:4299-4300

【Authors】: Nadin Kökciyan

【Abstract】: In online social networks (OSNs), users are allowed to create and share content about themselves and others. When multiple entities start distributing content, information can reach unintended individuals and inference can reveal more information about the user. Existing applications do not focus on detecting privacy violations before they occur in the system. This thesis proposes an agent-based representation of a social network, where the agents manage users' privacy requirements and create privacy agreements with agents. The privacy context, such as the relations among users, various content types in the system, and so on are represented with a formal language. By reasoning with this formal language, an agent checks the current state of the system to resolve privacy violations before they occur. We argue that commonsense reasoning could be useful to solve some of privacy examples reported in the literature. We will develop new methods to automatically identify private information using commonsense reasoning, which has never been applied to privacy context. Moreover, agents may have conflicting privacy requirements. We will study how to use agreement technologies in privacy settings for agents to resolve conflicts automatically.

【Keywords】: privacy; online social networks; multiagent systems

【Paper Link】【Pages】:4301-4302

【Authors】: Wen-Yu Lee

【Abstract】: The goal of the research is to discover and summarize data from emerging social media into information of interest. Specifically, leveraging user-contributed data from cross-domain social media, the idea is to perform multi-modal learning for a given photo, aiming to present people’s descriptions or comments, geographical information, and events of interest closely related to the photo. This information can then be used for various purposes, such as serving as a real-time guide for tourists to improve the quality of tourism. As a result, this research investigates modern challenges of image annotation, image retrieval, and cross-media mining, followed by presenting promising ways to conquer the challenges.

【Keywords】: Cross-Media Mining; Social Media; Multi-Modal Learning

644. Unsupervised Learning of HTNs in Complex Adversarial Domains.

【Paper Link】【Pages】:4303-4304

【Authors】: Michael A. Leece

【Abstract】: While Hierarchical Task Networks are frequently cited as flexible and powerful planning models, they are often ignored because of the intensive labor cost for experts/programmers, who must create and refine the model by hand. While recent work has begun to address this issue by working towards learning aspects of an HTN model from demonstration, or even the whole framework, the focus so far has been on simple domains, which lack many of the challenges faced in the real world such as imperfect information and real-time environments. I plan to extend this work using the domain of real-time strategy (RTS) games, which have gained recent popularity as a challenging and complex domain for AI research.

【Keywords】: HTN; StarCraft; Planning; Learning from demonstration

645. Estimating Text Intelligibility via Information Packaging Analysis.

【Paper Link】【Pages】:4305-4306

【Authors】: Junyi Jessy Li

【Abstract】: Effective communication through language involves organizing the content a person or system wishes to convey into text that flows naturally. There are many ways to render the same information, but those appropriate for one group of audience may not be intelligible to another. The goal of this thesis is to analyze and address factors that influence the intelligibility of text from two aspects of information packaging: discourse structure and text specificity.

【Keywords】: discourse; specificity; text analysis

646. Robust Classification under Covariate Shift with Application to Active Learning.

【Paper Link】【Pages】:4307-4308

【Authors】: Anqi Liu

【Abstract】: In supervised machine learning, model performance can decrease significantly when the distribution generating the new data varies from the distribution that generated the training data. One of the situations is covariate shift which happens a lot when labeled training data is missing, hard to get access to or very expensive to uniformly collect. All (probabilistic) classifiers will suffer from covariate shift. This motivates our research. Generally, we try to answer this question: how can we deal with covariate shift and generate predictions that are robust and reliable? We propose to develop a general framework for classification under covariate shift that is robust, flexible and accurate.

【Keywords】: Covariate Shift; Active Learning; Robust Classification

647. Analogical Generalization of Linguistic Constructions.

【Paper Link】【Pages】:4309-4310

【Authors】: Clifton James McFate

【Abstract】: Human language is extraordinarily creative in form and function, and adapting to this ever-shifting linguistic landscape is a daunting task for interactive cognitive systems. Recently, construction grammar has emerged as a linguistic theory for representing these complex and often idiomatic linguistic forms. Furthermore, analogical generalization has been proposed as a learning mechanism for extracting linguistic constructions from input. I propose an account that uses a computational model of analogy to learn and generalize argument structure constructions.

【Keywords】: Analogy; Construction Grammar; Natural Language

648. Writing Stories with Help from Recurrent Neural Networks.

【Paper Link】【Pages】:4311-4312

【Authors】: Melissa Roemmele

【Abstract】: This thesis explores the use of a recurrent neural network model for a novel story generation task. In this task, the model analyzes an ongoing story and generates a sentence that continues the story.

【Keywords】:

649. Scaling-Up MAP and Marginal MAP Inference in Markov Logic.

【Paper Link】【Pages】:4313-4314

【Authors】: Somdeb Sarkhel

【Abstract】: Markov Logic Networks (MLNs) use a few weighted first-order logic formulas to represent large probabilistic graphical models and are ideally suited for representing both relational and probabilistic knowledge in a wide variety of application domains such as NLP, computer vision, and robotics. However, inference in them is hard because the graphical models can be extremely large, having millions of variables and features (potentials). Therefore, several lifted inference algorithms that exploit relational structure and operate at the compact first-order level have been developed in recent years. However, the focus of much of the existing research on lifted inference is on marginal inference, while algorithms for MAP and marginal MAP inference are far less advanced. The aim of the proposed thesis is to fill this void by developing next generation inference algorithms for MAP and marginal MAP inference.

【Keywords】: Markov Logic Network; MAP Inference; Learning

650. Adapting Plans through Communication with Unknown Teammates.

【Paper Link】【Pages】:4315-4316

【Authors】: Trevor Sarratt

【Abstract】: My thesis addresses the problem of planning under teammate behavior uncertainty by introducing the concept of intentional multiagent communication within ad hoc teams. In partially observable multiagent domains, agents must share information regarding aspects of the environment such that uncertainty is reduced across the team, permitting better coordination. Similarly, we consider how communication may be utilized within ad hoc teams to resolve behavioral uncertainty. Transmitting intentional messages allows agents to adjust predictions of a teammate's individual course of action. In short, an ad hoc agent coordinating with an unknown teammate can identify uncertainties within its own predictive model of teammate behavior and then request the appropriate policy information, allowing the agent to adapt its personal plan. The main contribution of this work is the characterization of the interaction between learning, communication, and planning in ad hoc teams.

【Keywords】:

651. Pragmatic Querying in Heterogeneous Knowledge Graphs.

【Paper Link】【Pages】:4317-4318

【Authors】: Amar Viswanathan

【Abstract】: Knowledge Graphs with rich schemas can allow for complex querying. My thesis focuses on providing accessible Knowledge using Gricean notions of Cooperative Answering as a motivation. More specifically, using Query Reformulations, Data Awareness, and a Pragmatic Context, along with the results they can become more responsive to user requirements and user context.

【Keywords】: Query Reformulation, RDF Reformulation

652. Architectural Mechanisms for Situated Natural Language Understanding in Uncertain and Open Worlds.

【Paper Link】【Pages】:4319-4320

【Authors】: Tom Williams

【Abstract】: As natural language capable robots and other agents become more commonplace, the ability of these agents to understand truly natural human speech is becoming increasingly important. What is more, these agents must be able to understand truly natural human speech in realistic scenarios, in which an agent may not have full certainty in its knowledge of its environment, and in which an agent may not have full knowledge of the entities contained in its environment. As such, I am interested in developing architectural mechanisms which will allow robots to understand natural language in uncertain and open worlds. My work towards this goal has primarily focused on two problems: (1) reference resolution, and (2) pragmatic reasoning.

【Keywords】: natural language understanding; human-robot interaction; reference resolution; pragmatic reasoning; open worlds

653. Affective Computing and Applications of Image Emotion Perceptions.

【Paper Link】【Pages】:4321-4323

【Authors】: Sicheng Zhao ; Hongxun Yao

【Abstract】: Images can convey rich semantics and evoke strong emotions in viewers. The research of my PhD thesis focuses on image emotion computing (IEC), which aims to predict the emotion perceptions of given images. The development of IEC is greatly constrained by two main challenges: affective gap and subjective evaluation. Previous works mainly focused on finding features that can express emotions better to bridge the affective gap, such as elements-of-art based features and shape features. According to the emotion representation models, including categorical emotion states (CES) and dimensional emotion space (DES), three different tasks are traditionally performed on IEC: affective image classification, regression and retrieval. The state-of-the-art methods on the three above tasks are image-centric, focusing on the dominant emotions for the majority of viewers. For my PhD thesis, I plan to answer the following questions: (1) Compared to the low-level elements-of-art based features, can we find some higher level features that are more interpretable and have a stronger link to emotions? (2) Are the emotions that are evoked in viewers by an image subjective and different? If they are, how can we tackle the user-centric emotion prediction? (3) For image-centric emotion computing, can we predict the emotion distribution instead of the dominant emotion category?

【Keywords】: Affective computing; Image emotion; Personalized perception; Emotion distribution

What's Hot Papers 9

654. What's Hot in Human Language Technology: Highlights from NAACL HLT 2015.

【Paper Link】【Pages】:4324-4326

【Authors】: Joyce Yue Chai ; Anoop Sarkar ; Rada Mihalcea

【Abstract】: This paper shows a few examples to highlight the trends observed at the NAACL HLT 2015 conference.

【Keywords】: Human Language Technology, Computational Linguistics

655. What's Hot in the Answer Set Programming Competition.

【Paper Link】【Pages】:4327-4329

【Authors】: Martin Gebser ; Marco Maratea ; Francesco Ricca

【Abstract】: Answer Set Programming (ASP) is a declarative programming paradigm with roots in logic programming, knowledge representation, and non-monotonic reasoning. The ASP competition series aims at assessing and promoting the evolution of ASP systems and applications. Its growing range of challenging application-oriented benchmarks inspires and showcases continuous advancements of the state of the art in ASP.

【Keywords】:

656. Inductive Logic Programming: Challenges.

【Paper Link】【Pages】:4330-4332

【Authors】: Katsumi Inoue ; Hayato Ohwada ; Akihiro Yamamoto

【Abstract】: An overview of notable ILP areas, focusing on three invited talks at ILP 2015, two best student papers and the panel discussion on "ILP 25 Years".

【Keywords】: Inductive Logic Programming

657. What's Hot in Intelligent User Interfaces.

【Paper Link】【Pages】:4333-4334

【Authors】: Shimei Pan ; Oliver Brdiczka ; Giuseppe Carenini ; Duen Horng Chau ; Per Ola Kristensson

【Abstract】: The ACM Conference on Intelligent User Interfaces (IUI) is the annual meeting of the intelligent user interface community and serves as a premier international forum for reporting outstanding research and development on intelligent user interfaces. ACM IUI is where the Human-Computer Interaction (HCI) community meets the Artificial Intelligence (AI) community. Here we summarize the latest trends in IUI based on our experience organizing the 20th ACM IUI Conference in Atlanta in 2015.

【Keywords】: Sensor; Data Analytics; Human-centered; Pervasive Computing; Affective Computing

658. General Video Game AI: Competition, Challenges and Opportunities.

【Paper Link】【Pages】:4335-4337

【Authors】: Diego Perez Liebana ; Spyridon Samothrakis ; Julian Togelius ; Tom Schaul ; Simon M. Lucas

【Abstract】: The General Video Game AI framework and competition pose the problem of creating artificial intelligence that can play a wide, and in principle unlimited, range of games. Concretely, it tackles the problem of devising an algorithm that is able to play any game it is given, even if the game is not known a priori. This area of study can be seen as an approximation of General Artificial Intelligence, with very little room for game-dependent heuristics. This short paper summarizes the motivation, infrastructure, results and future plans of General Video Game AI, stressing the findings and first conclusions drawn after two editions of our competition, and outlining our future plans.

【Keywords】: competitions; games; reinforcement learning; evolutionary computation

659. Angry Birds as a Challenge for Artificial Intelligence.

【Paper Link】【Pages】:4338-4339

【Authors】: Jochen Renz ; Xiaoyu Ge ; Rohan Verma ; Peng Zhang

【Abstract】: The Angry Birds AI Competition (aibirds.org) has been held annually since 2012 in conjunction with some of the major AI conferences, most recently with IJCAI 2015. The goal of the competition is to build AI agents that can play new Angry Birds levels as well as or better than the best human players. Successful agents should be able to quickly analyze new levels and to predict physical consequences of possible actions in order to select actions that solve a given level with a high score. Agents have no access to the game's internal physics, but only receive screenshots of the live game. In this paper we describe why this problem is a challenge for AI, and why it is an important step towards building AI that can successfully interact with the real world. We also summarise some highlights of past competitions, including a new competition track we introduced recently.

【Keywords】:

660. What's Hot in Heuristic Search.

【Paper Link】【Pages】:4340-4342

【Authors】: Roni Stern ; Levi H. S. Lelis

【Abstract】: Search in general, and heuristic search in particular, is at the heart of many Artificial Intelligence algorithms and applications. There is now a growing and active community devoted to the empirical and theoretical study of heuristic search algorithms, thanks to the successful application of search-based algorithms to areas such as robotics, domain-independent planning, optimization, and computer games. In this extended abstract we highlight recent efforts in understanding suboptimal search algorithms, as well as ensembles of heuristics and algorithms. The results of these efforts are meta-reasoning methods which are applied to orchestrate the different components of modern search algorithms. Finally, we mention recent innovative applications of search that demonstrate the relevance of the field to general AI.

【Keywords】: Heuristic Search

661. Competition of Distributed and Multiagent Planners (CoDMAP).

【Paper Link】【Pages】:4343-4345

【Authors】: Michal Stolba ; Antonín Komenda ; Daniel L. Kovacs

【Abstract】: As a part of the workshop on Distributed and Multiagent Planning (DMAP) at the International Conference on Automated Planning and Scheduling (ICAPS) 2015, we have organized a competition in distributed and multiagent planning. The main aims of the competition were to consolidate the planners in terms of input format; to promote development of multiagent planners both inside and outside of the multiagent research community; and to provide a proof-of-concept of a potential future multiagent planning track of the International Planning Competition (IPC). In this paper we summarize the course and highlights of the competition.

【Keywords】: multiagent planning; distributed planning; competition

662. What's Hot at RoboCup.

【Paper Link】【Pages】:4346-4348

【Authors】: Peter Stone

【Abstract】: The aim of this paper is to give an overview of the latest and most innovative developments at RoboCup, as well as highlighting some of the current and future challenges upon which today's RoboCup participants are focused.

【Keywords】: RoboCup

Demonstration Papers 29

663. Artificial Intelligence for Predictive and Evidence Based Architecture Design.

【Paper Link】【Pages】:4349-4350

【Authors】: Mehul Bhatt ; Jakob Suchan ; Carl P. L. Schultz ; Vasiliki Kondyli ; Saurabh Goyal

【Abstract】: The evidence-based analysis of people's navigation and wayfinding behaviour in large-scale built-up environments (e.g., hospitals, airports) encompasses the measurement and qualitative analysis of a range of aspects including people's visual perception in new and familiar surroundings, their decision-making procedures and intentions, the affordances of the environment itself, etc. In our research on large-scale evidence-based qualitative analysis of wayfinding behaviour, we construe visual perception and navigation in built-up environments as a dynamic narrative construction process of movement and exploration driven by situation-dependent goals, guided by visual aids such as signage and landmarks, and influenced by environmental (e.g., presence of other people, time of day, lighting) and personal (e.g., age, physical attributes) factors. We employ a range of sensors for measuring the embodied visuo-locomotive experience of building users: eye-tracking, egocentric gaze analysis, external camera based visual analysis to interpret fine-grained behaviour (e.g., stopping, looking around, interacting with other people), and also manual observations made by human experimenters. Observations are processed, analysed, and integrated in a holistic model of the visuo-locomotive narrative experience at the individual and group level. Our model also combines embodied visual perception analysis with analysis of the structure and layout of the environment (e.g., topology, routes, isovists) computed from available 3D models of the building. In this framework, abstract regions like the visibility space, regions of attention, eye movement clusters, are treated as first class visuo-spatial and iconic objects that can be used for interpreting the visual experience of subjects in a high-level qualitative manner. 
The final integrated analysis of the wayfinding experience is such that it can even be presented in a virtual reality environment thereby providing an immersive experience (e.g., using tools such as the Oculus Rift) of the qualitative analysis for single participants, as well as for a combined analysis of large groups. This capability is especially important for experiments in post-occupancy analysis of building performance. Our construction of indoor wayfinding experience as a form of moving image analysis centralizes the role and influence of perceptual visuo-spatial characteristics and morphological features of the built environment into the discourse on wayfinding research. We will demonstrate the impact of this work with several case-studies, particularly focussing on a large-scale experiment conducted at the New Parkland Hospital in Dallas, Texas, USA.

【Keywords】: applied artificial intelligence; visual perception; architectural cognition

664. co-rank: An Online Tool for Collectively Deciding Efficient Rankings Among Peers.

【Paper Link】【Pages】:4351-4352

【Authors】: Ioannis Caragiannis ; George A. Krimpas ; Marianna Panteli ; Alexandros A. Voudouris

【Abstract】: Our aim with co-rank is to facilitate the grading of exams or assignments in massive open online courses (MOOCs).

【Keywords】: peer grading; social choice; MOOCs; Borda count

665. SVVAMP: Simulator of Various Voting Algorithms in Manipulating Populations.

【Paper Link】【Pages】:4353-4354

【Authors】: François Durand ; Fabien Mathieu ; Ludovic Noirie

【Abstract】: We present SVVAMP, a Python package dedicated to the study of voting systems with an emphasis on manipulation analysis.

【Keywords】:

666. Deploying PAWS to Combat Poaching: Game-Theoretic Patrolling in Areas with Complex Terrain (Demonstration).

【Paper Link】【Pages】:4355-4356

【Authors】: Fei Fang ; Thanh Hong Nguyen ; Rob Pickles ; Wai Y. Lam ; Gopalasamy R. Clements ; Bo An ; Amandeep Singh ; Milind Tambe

【Abstract】: The conservation of key wildlife species such as tigers and elephants is threatened by poaching activities. In many conservation areas, foot patrols are conducted to prevent poaching but they may not be well-planned to make the best use of the limited patrolling resources. While prior work has introduced PAWS (Protection Assistant for Wildlife Security) as a game-theoretic decision aid to design effective foot patrol strategies to protect wildlife, the patrol routes generated by PAWS may be difficult to follow in areas with complex terrain. Subsequent research has worked on the significant evolution of PAWS, from an emerging application to regularly deployed software. A key advance of the deployed version of PAWS is that it incorporates the complex terrain information and generates a strategy consisting of easy-to-follow routes. In this demonstration, we provide 1) a video introducing the PAWS system; 2) an interactive visualization of the patrol routes generated by PAWS in an example area with complex terrain; and 3) a machine-human competition in designing patrol strategy given complex terrain and animal distribution.

【Keywords】: Game Theory; Computational Sustainability; Wildlife Protection; Security Games; Deployed Application

【Paper Link】【Pages】:4357-4358

【Authors】: Maria Ivanova Gorinova ; Yoad Lewenberg ; Yoram Bachrach ; Alfredo Kalaitzis ; Michael Fagan ; Dean Carignan ; Nitin Gautam

【Abstract】: We demonstrate a system for predicting gaming related properties from Twitter accounts. Our system predicts various traits of users based on the tweets publicly available in their profiles. Such inferred traits include degrees of tech-savviness and knowledge on computer games, actual gaming performance, preferred platform, degree of originality, humor and influence on others. Our system is based on machine learning models trained on crowd-sourced data. It allows people to select Twitter accounts of their fellow gamers, examine the trait predictions made by our system, and the main drivers of these predictions. We present empirical results on the performance of our system based on its accuracy on our crowd-sourced dataset.

【Keywords】:

668. NLU Framework for Voice Enabling Non-Native Applications on Smart Devices.

【Paper Link】【Pages】:4359-4360

【Authors】: Soujanya Lanka ; Deepika Pathania ; Pooja Kushalappa ; Pradeep Varakantham

【Abstract】: Voice is a critical user interface on smart devices (wearables, phones, speakers, televisions) to access applications (or services) available on them. Unfortunately, only a few native applications (provided by the OS developer) are typically voice enabled in devices of today. Since the utility of a smart device is determined more by the strength of external applications developed for the device, voice enabling non-native applications in a scalable, seamless manner within the device is a critical use case and is the focus of our work. We have developed a Natural Language Understanding (NLU) framework that uses templates supported by the application (as determined by the application developer). This framework can be employed in any mobile OS (Android, iOS, Tizen, Android Wear) for a wide range of devices. To aid this demonstration, we have implemented the framework as a service in Android OS. When the user issues a voice command, the natural language query is obtained by this service (using one of local, cloud based or hybrid speech recognizers). The service then executes our NLU framework to identify the relevant application and particular action details. In this demonstration, we will showcase this NLU framework implemented as an Android service on a set of applications that will be installed on the fly. Specifically, we will show how the voice queries are understood and necessary services are launched on Android smart wearables and phones.

【Keywords】: Natural Language Understanding; Smart Devices

669. Modeling and Experimentation Framework for Fuzzy Cognitive Maps.

【Paper Link】【Pages】:4361-4362

【Authors】: Maikel León Espinosa ; Gonzalo Nápoles Ruiz

【Abstract】: Many papers describe the use of Fuzzy Cognitive Maps as a modeling/representation technique for real-life scenarios’ simulation or prediction. However, not many real software implementations are described or available. In this proposal the authors describe a modeling and experimentation framework where realistic problems can be recreated using Fuzzy Cognitive Maps as a knowledge representation form. Design elements, and descriptions of the algorithms that have been incorporated into the software, and hybridized with Fuzzy Cognitive Maps, are presented in this paper. Case studies were conducted and are illustrated with the intention of demonstrating the success and practical value of the general approach together with the implementation tool.

【Keywords】: Fuzzy Cognitive Maps; modeling/representation technique; real-life scenarios’ simulation

670. Using Convolutional Neural Networks to Analyze Function Properties from Images.

【Paper Link】【Pages】:4363-4364

【Authors】: Yoad Lewenberg ; Yoram Bachrach ; Ian A. Kash ; Peter B. Key

【Abstract】: We propose a system for determining properties of mathematical functions given an image of their graph representation. We demonstrate our approach for two-dimensional graphs (curves of single variable functions) and three-dimensional graphs (surfaces of two variable functions), studying the properties of convexity and symmetry. Our method uses a Convolutional Neural Network which classifies functions according to these properties, without using any hand-crafted features. We propose algorithms for randomly constructing functions with convexity or symmetry properties, and use the images generated by these algorithms to train our network. Our system achieves a high accuracy on this task, even for functions where humans find it difficult to determine the function's properties from its image.

【Keywords】:

671. Predicting Personal Traits from Facial Images Using Convolutional Neural Networks Augmented with Facial Landmark Information.

【Paper Link】【Pages】:4365-4366

【Authors】: Yoad Lewenberg ; Yoram Bachrach ; Sukrit Shankar ; Antonio Criminisi

【Abstract】: We consider the task of predicting various traits of a person given an image of their face. We aim to estimate traits such as gender, ethnicity and age, as well as more subjective traits such as the emotion a person expresses or whether they are humorous or attractive. Due to the recent surge of research on Deep Convolutional Neural Networks (CNNs), we begin by using a CNN architecture, and corroborate that CNNs are promising for facial attribute prediction. To further improve performance, we propose a novel approach that incorporates facial landmark information for input images as an additional channel, helping the CNN learn face-specific features so that the landmarks across various training images hold correspondence. We empirically analyze the performance of our proposed method, showing consistent improvement over the baselines across traits. We demonstrate our system on a sizeable Face Attributes Dataset (FAD), comprising roughly 200,000 labels, for the 10 most sought-after traits, for over 10,000 facial images.

【Keywords】:

672. EKNOT: Event Knowledge from News and Opinions in Twitter.

【Paper Link】【Pages】:4367-4368

【Authors】: Min Li ; Jingjing Wang ; Wenzhu Tong ; Hongkun Yu ; Xiuli Ma ; Yucheng Chen ; Haoyan Cai ; Jiawei Han

【Abstract】: We present the EKNOT system that automatically discovers major events from online news articles, connects each event to its discussion in Twitter, and provides a comprehensive summary of the events from the points of view of both news media and social media. EKNOT takes a time period as input and outputs a complete picture of the events within the given time range along with the public opinions. For each event, EKNOT provides multi-dimensional summaries: a) a summary from news for an objective description; b) a summary from tweets containing opinions/sentiments; c) an entity graph which illustrates the major players involved and their correlations; d) the time span of the event; and e) an opinion (sentiment) distribution. Also, if a user is interested in a particular event, he/she can zoom into this event to investigate its aspects (sub-events) summarized in the same manner. EKNOT is built on real-time crawled news articles and tweets, allowing users to explore the dynamics of major events with minimal delays.

【Keywords】:

673. BBookX: Building Online Open Books for Personalized Learning.

【Paper Link】【Pages】:4369-4370

【Authors】: Chen Liang ; Shuting Wang ; Zhaohui Wu ; Kyle Williams ; Bart Pursel ; Benjamin Bräutigam ; Sherwyn Saul ; Hannah Williams ; Kyle Bowen ; C. Lee Giles

【Abstract】: We demonstrate BBookX, a novel system that automatically builds, in collaboration with a user, online open books by searching open educational resources (OER). This system explores the use of retrieval technologies to dynamically generate zero-cost materials such as textbooks for personalized learning.

【Keywords】:

【Paper Link】【Pages】:4371-4372

【Authors】: Huijie Lin ; Jia Jia ; Jie Huang ; Enze Zhou ; Jingtian Fu ; Yejun Liu ; Huanbo Luan

【Abstract】: In this demo, we build a practical mobile application, Moodee, to help detect and release users’ psychological stress by leveraging users’ social media data in online social networks, and provide an interactive user interface to present users’ and friends’ psychological stress states in a visualized and intuitive way. Given users’ online social media data as input, Moodee intelligently and automatically detects users’ stress states. Moreover, Moodee recommends different links to users to help release their stress. The main technology of this demo is a novel hybrid model, a factor graph model combined with a Deep Neural Network, which can leverage social media content and social interaction information for stress detection. We think that Moodee can be helpful to people’s mental health, which is a vital problem in the modern world.

【Keywords】: Social Media; Wellbeing; Mental Health

675. Write-righter: An Academic Writing Assistant System.

【Paper Link】【Pages】:4373-4374

【Authors】: Yuanchao Liu ; Xin Wang ; Ming Liu ; Xiaolong Wang

【Abstract】: Writing academic articles in English is a challenging task for non-native speakers, as more effort has to be spent to enhance their language expressions. This paper presents an academic writing assistant system called Write-righter, which can provide real-time hints and recommendations by analyzing the input context. To achieve this goal, some novel strategies, e.g., semantic extension based sentence retrieval and LDA based sentence structure identification, have been proposed. Write-righter is expected to help people express their ideas correctly by recommending the top N most likely expressions.

【Keywords】:

676. An Image Analysis Environment for Species Identification of Food Contaminating Beetles.

【Paper Link】【Pages】:4375-4376

【Authors】: Daniel Martin ; Hongjian Ding ; Leihong Wu ; Howard Semey ; Amy Barnes ; Darryl Langley ; Su Inn Park ; Zhichao Liu ; Weida Tong ; Joshua Xu

【Abstract】: Food safety is vital to the well-being of society; therefore, it is important to inspect food products to ensure minimal health risks are present. The presence of certain species of insects, especially storage beetles, is a reliable indicator of possible contamination during storage and food processing. However, the current approach of identifying species by visual examination of insect fragments is rather subjective and time-consuming. To aid this inspection process, we have developed in collaboration with FDA food analysts some image analysis-based machine intelligence to achieve species identification with up to 90% accuracy. The current project is a continuation of this development effort. Here we present an image analysis environment that allows practical deployment of the machine intelligence on computers with limited processing power and memory. Using this environment, users can prepare input sets by selecting images for analysis, and inspect these images through the integrated panning and zooming capabilities. After species analysis, the results panel allows the user to compare the analyzed images with reference images of the proposed species. Further additions to this environment should include a log of previously analyzed images, and eventually extend to interaction with a central cloud repository of images through a web-based interface.

【Keywords】: Food Safety Inspection, Food Contamination, Image Analysis, Machine Learning

677. Jikan to Kukan: A Hands-On Musical Experience in AI, Games and Art.

【Paper Link】【Pages】:4377-4378

【Authors】: Georgia Rossmann Martins ; Mário Escarce Junior ; Leandro Soriano Marcolino

【Abstract】: AI is typically applied in video games in the creation of artificial opponents, in order to make them strong, realistic or even fallible (for the game to be "enjoyable" by human players). We offer a different perspective: we present the concept of "Art Games", a view that opens up many possibilities for AI research and applications. Conference participants will play Jikan to Kukan, an art game where the player dynamically creates the soundtrack with the AI system, while developing her experience in the unconscious world of a character.

【Keywords】: Computer Games; AI & Arts

678. WWDS APIs: Application Programming Interfaces for Efficient Manipulation of World WordNet Database Structure.

【Paper Link】【Pages】:4379-4380

【Authors】: Hanumant Harichandra Redkar ; Sudha Bhingardive ; Kevin Patel ; Pushpak Bhattacharyya ; Neha Prabhugaonkar ; Apurva Nagvenkar ; Ramdas Karmali

【Abstract】: WordNets are useful resources for natural language processing. Various WordNets for different languages have been developed by different groups. Recently, World WordNet Database Structure (WWDS) was proposed by Redkar et al. (2015) as a common platform to store these different WordNets. However, it is underutilized due to the lack of a programming interface. In this paper, we present WWDS APIs, which are designed to address this shortcoming. These WWDS APIs, in conjunction with WWDS, act as a wrapper that enables developers to utilize WordNets without worrying about the underlying storage structure. The APIs are developed in PHP, Java, and Python, as they are the preferred programming languages of most developers and researchers working in language technologies. These APIs can help in various applications like machine translation, word sense disambiguation, multilingual information retrieval, etc.

【Keywords】: WordNet; WWDS; WWDS APIs; World WordNet Database Structure

679. Artificial Swarm Intelligence, a Human-in-the-Loop Approach to A.I.

【Paper Link】【Pages】:4381-4382

【Authors】: Louis Rosenberg

【Abstract】: Most research into Swarm Intelligence explores swarms of autonomous robots or simulated agents. Little work, however, has been done on swarms of networked humans. This paper introduces UNU, an online platform that enables networked users to assemble in real-time swarms and tackle problems as an Artificial Swarm Intelligence (ASI). Modeled after biological swarms, UNU enables large groups of networked users to work together in real-time synchrony, forging a unified dynamic system that can quickly answer questions and make decisions. Early testing suggests that human swarming has significant potential for harnessing the Collective Intelligence (CI) of online groups, often exceeding the natural abilities of individual participants.

【Keywords】: Artificial Swarm Intelligence; A.I.; HCI; Human Computer Interaction; Artificial Intelligence; Collective Intelligence

680. Toward Interactive Relational Learning.

【Paper Link】【Pages】:4383-4384

【Authors】: Ryan A. Rossi ; Rong Zhou

【Abstract】: This paper introduces the Interactive Relational Machine Learning (iRML) paradigm in which users interactively design relational models by specifying the various components, constraints, and relational data representation, as well as perform evaluation, analyze errors, and make adjustments and refinements in a closed-loop. iRML requires fast real-time learning and inference methods capable of interactive rates. Methods are investigated that enable direct manipulation of the various components of the RML method. Visual representation and interaction techniques are also developed for exploring the space of relational models and the trade-offs of the various components and design choices.

【Keywords】: interactive machine learning; interactive relational learning; visual analytics; network visualization; real-time system; web platform; large networks; relational learning; statistical relational learning; semi-supervised learning; visual graph mining

681. EDDIE: An Embodied AI System for Research and Intervention for Individuals with ASD.

【Paper Link】【Pages】:4385-4386

【Authors】: Robert Selkowitz ; Jonathan Rodgers ; P. J. Moskal ; Jon Mrowczynski ; Christine Colson

【Abstract】: We report on the ongoing development of EDDIE (Emotion Demonstration, Decoding, Interpretation, and Encoding), an interactive embodied AI to be deployed as an intervention system for children diagnosed with High-Functioning Autism Spectrum Disorders (HFASD). EDDIE presents the subject with interactive requests to decode facial expressions presented through an avatar, encode requested expressions, or do both in a single session. Facial tracking software interprets the subject’s response, and allows for immediate feedback. The system fills a need in research and intervention for children with HFASD by providing an engaging platform for presentation of exemplar expressions consistent with mechanical systems of facial action measurement integrated with an automatic system for interpreting and giving feedback to the subject’s expressions. Both live interaction with EDDIE and video recordings of human-EDDIE interaction will be demonstrated.

【Keywords】:

682. A Tool to Graphically Edit CP-Nets.

【Paper Link】【Pages】:4387-4388

【Authors】: Aidan Shafran ; Sam Saarinen ; Judy Goldsmith

【Abstract】: Conditional preference networks (CP-nets) are a mathematical formalism for compactly representing preferences over combinatorial domains. The software package presented allows editing of CP-nets through a graphical interface, loads and saves to an XML-based file format, and detects properties of the currently loaded CP-net.

【Keywords】: Preferences, visualization, CP-nets

683. A Visual Semantic Framework for Innovation Analytics.

【Paper Link】【Pages】:4389-4390

【Authors】: Walid Shalaby ; Kripa Rajshekhar ; Wlodek Zadrozny

【Abstract】: In this demo we present a semantic framework for innovation and patent analytics powered by Mined Semantic Analysis (MSA). Our framework provides cognitive assistance to its users through a Web-based visual and interactive interface. First, we describe building a conceptual knowledge graph by mining a user-generated encyclopedic textual corpus for semantic associations. Then, we demonstrate applying the acquired knowledge to support many cognition and knowledge based use cases for innovation analysis, including technology exploration and landscaping, competitive analysis, literature and prior art search, and others.

【Keywords】: semantic analysis; innovation analytics; cognitive assistance; visual framework

684. Multi-Agent System Development MADE Easy.

【Paper Link】【Pages】:4391-4392

【Authors】: Zhiqi Shen ; Han Yu ; Chunyan Miao ; Siyao Li ; Yiqiang Chen

【Abstract】: Agent-Oriented Software Engineering (AOSE) is an emerging software engineering paradigm that advocates the application of best practices in the development of Multi-Agent Systems (MAS) through the use of agents and organizations of agents. This paper outlines the MADE system, which provides an interactive platform for people who are not well-versed in AOSE to contribute to the rapid prototyping of MASs with ease.

【Keywords】: Multi-agent systems; agent-oriented software engineering; goal setting theory

685. A Fraud Resilient Medical Insurance Claim System.

【Paper Link】【Pages】:4393-4394

【Authors】: Yuliang Shi ; Chenfei Sun ; Qingzhong Li ; Lizhen Cui ; Han Yu ; Chunyan Miao

【Abstract】: As many countries in the world start to experience population aging, there are an increasing number of people relying on medical insurance to access healthcare resources. Medical insurance frauds are causing billions of dollars in losses for public healthcare funds. The detection of medical insurance frauds is an important and difficult challenge for the artificial intelligence (AI) research community. This paper outlines HFDA, a hybrid AI approach to effectively and efficiently identify fraudulent medical insurance claims which has been tested in an online medical insurance claim system in China.

【Keywords】: Medical Insurance Fraud Detection; Information Theory; Outlier Analysis; Decision Support

686. DECT: Distributed Evolving Context Tree for Understanding User Behavior Pattern Evolution.

【Paper Link】【Pages】:4395-4396

【Authors】: Xiaokui Shu ; Nikolay Laptev ; Danfeng (Daphne) Yao

【Abstract】: Internet user behavior models characterize user browsing dynamics or the transitions among web pages. The models help Internet companies improve their services by accurately targeting customers and providing them the information they want. For instance, specific web pages can be customized and prefetched for individuals based on sequences of web pages they have visited. Existing user behavior models abstracted as time-homogeneous Markov models cannot efficiently model user behavior variation through time. This demo presents DECT, a scalable time-variant variable-order Markov model. DECT digests terabytes of user session data and yields user behavior patterns through time. We realize DECT using Apache Spark and deploy it on top of Yahoo! infrastructure. We demonstrate the benefits of DECT with anomaly detection and ad click rate prediction applications. DECT enables the detection of higher-order path anomalies and provides deep insights into ad click rates with respect to user visiting paths.

【Keywords】: internet user behavior, user behavior pattern, user behavior, markov model, ad click rate, ad click probability, time series, variable order markov model, higher order, web page, user behavior model

687. Markov Argumentation Random Fields.

【Paper Link】【Pages】:4397-4398

【Authors】: Yuqing Tang ; Nir Oren ; Katia P. Sycara

【Abstract】: We demonstrate an implementation of Markov Argumentation Random Fields (MARFs), a novel formalism combining elements of formal argumentation theory and probabilistic graphical models. In doing so MARFs provide a principled technique for the merger of probabilistic graphical models and non-monotonic reasoning, supporting human reasoning in “messy” domains where the knowledge about conflicts should be applied. Our implementation takes the form of a graphical tool which supports users in interpreting complex information. We have evaluated our implementation in the domain of intelligence analysis, where analysts must reason and determine likelihoods of events using information obtained from conflicting sources.

【Keywords】: Argumentation;Probabilistic Graphical models;Markov Random Fields

688. Shoot to Know What: An Application of Deep Networks on Mobile Devices.

【Paper Link】【Pages】:4399-4400

【Authors】: Jiaxiang Wu ; Qinghao Hu ; Cong Leng ; Jian Cheng

【Abstract】: Convolutional neural networks (CNNs) have achieved impressive performance in a wide range of computer vision areas. However, their application on mobile devices remains intractable due to the high computational complexity. In this demo, we propose the Quantized CNN (Q-CNN), an efficient framework for CNN models, to achieve efficient and accurate image classification on mobile devices. Our Q-CNN framework dramatically accelerates the computation and reduces the storage/memory consumption, so that mobile devices can independently run an ImageNet-scale CNN model. Experiments on the ILSVRC-12 dataset demonstrate 4~6x speed-up and 15~20x compression, with a drop of merely one percent in classification accuracy. Based on the Q-CNN framework, even mobile devices can accurately classify images within one second.

【Keywords】: Convolutional Neural Network; Quantization; Mobile Devices

689. SAPE: A System for Situation-Aware Public Security Evaluation.

【Paper Link】【Pages】:4401-4402

【Authors】: Shu Wu ; Qiang Liu ; Ping Bai ; Liang Wang ; Tieniu Tan

【Abstract】: Public security events occur all over the world, threatening personal and property safety as well as homeland security. It is vital to construct an effective model to evaluate and predict public security. In this work, we establish a Situation-Aware Public Security Evaluation (SAPE) platform. Based on conventional Recurrent Neural Networks (RNN), we develop a new variant of RNN to handle temporal contexts in public security event datasets. The proposed model can achieve better performance than the compared state-of-the-art methods. SAPE offers two demonstrations: global public security evaluation and China public security evaluation. In the global part, based on the Global Terrorism Database from UMD, SAPE can predict, for each country, the risk level and the top-n terrorist organizations that might attack it. Users can also view the actual attacking organizations alongside the predicted results. For each province in China, SAPE can predict the risk level and the probability scores of different types of events in the next month. Users can also view the actual numbers of events and the predicted risk levels for the past year.

【Keywords】:

【Paper Link】【Pages】:4403-4404

【Authors】: Shu Wu ; Qiang Liu ; Yong Liu ; Liang Wang ; Tieniu Tan

【Abstract】: With the growth of online social media, rumors spread fast and are viewed by more and more people on the Internet. Rumors bring significant harm to daily life and public security. It is crucial to evaluate the credibility of information and detect rumors on social media automatically. In this work, we establish a Network Information Credibility Evaluation (NICE) platform, which collects a database of rumors that have been verified on Sina Weibo and automatically evaluates user-generated information on social media that has not yet been verified. Users can use a query to search for related information. If the corresponding information appears in our database, users can immediately identify it as a rumor. Otherwise, NICE shows users real-time results crawled automatically from social media and can calculate the credibility of a specific result with our algorithm. Our algorithm learns dynamic representations for information on social media based on behavior information, dynamic information, user information and comment information. Then, we use an ordinary logistic regression to classify information into rumors and non-rumors. Based on our algorithm, the NICE system achieves satisfactory performance in evaluating information credibility and detecting rumors on social media.

【Keywords】:

691. Productive Aging through Intelligent Personalized Crowdsourcing.

【Paper Link】【Pages】:4405-

【Authors】: Han Yu ; Chunyan Miao ; Siyuan Liu ; Zhengxiang Pan ; Nur Syahidah Bte Khalid ; Zhiqi Shen ; Cyril Leung

【Abstract】: The current generation of senior citizens enjoys unparalleled levels of good health compared with previous generations. The need for personal fulfilment after retirement has driven many of them to participate in productive aging activities such as volunteering. This paper outlines the Silver Productive (SP) mobile app, a system powered by the RTS-P intelligent personalized task sub-delegation approach with dynamic worker effort pricing functions. It provides an algorithmic crowdsourcing platform to enable seniors to contribute their effort through productive aging activities and help organizations efficiently utilize seniors' collective productivity.

【Keywords】: Algorithmic Crowdsourcing; Reputation; Task Delegation; Task Sub-delegation; Dynamic Worker Effort Pricing

Technical Papers 11

1. Inferring Multi-Dimensional Ideal Points for US Supreme Court Justices.

2. Little Is Much: Bridging Cross-Platform Behaviors through Overlapped Crowds.

3. Scientific Ranking over Heterogeneous Academic Hypernetwork.

4. MUST-CNN: A Multilayer Shift-and-Stitch Deep Convolutional Architecture for Sequence-Based Protein Structure Prediction.

5. Hospital Stockpiling Problems with Inventory Sharing.

6. Predicting ICU Mortality Risk by Grouping Temporal Trends from a Multivariate Panel of Physiologic Measurements.

7. Learning to Generate Posters of Scientific Papers.

8. Face Behind Makeup.

9. Social Role-Aware Emotion Contagion in Image Social Networks.

10. Survival Prediction by an Integrated Learning Criterion on Intermittently Varying Healthcare Data.

11. On the Minimum Differentially Resolving Set Problem for Diffusion Source Inference in Networks.

Technical Papers: AI and the Web 33

12. From Tweets to Wellness: Wellness Event Detection from Twitter Streams.

13. "8 Amazing Secrets for Getting More Clicks": Detecting Clickbaits in News Streams Using Article Informality.

14. Business-Aware Visual Concept Discovery from Social Media for Multimodal Business Venue Recognition.

15. Capturing Semantic Correlation for Item Recommendation in Tagging Systems.

16. Identifying Sentiment Words Using an Optimization Model with L1 Regularization.

17. Community-Based Question Answering via Heterogeneous Social Network Learning.

18. College Towns, Vacation Spots, and Tech Hubs: Using Geo-Social Media to Model and Compare Locations.

19. Inferring a Personalized Next Point-of-Interest Recommendation Model with Latent Behavior Patterns.

20. VBPR: Visual Bayesian Personalized Ranking from Implicit Feedback.

21. Improved Neural Machine Translation with SMT Features.

22. A Scalable Framework to Choose Sellers in E-Marketplaces Using POMDPs.

23. Fusing Social Networks with Deep Learning for Volunteerism Tendency Prediction.

24. Detect Overlapping Communities via Ranking Node Popularities.

25. Top-N Recommender System via Matrix Completion.

26. Robust Text Classification in the Presence of Confounding Bias.

27. Predicting the Next Location: A Recurrent Model with Spatial and Temporal Contexts.

28. Fortune Teller: Predicting Your Career Path.

29. Predicting Online Protest Participation of Social Media Users.

30. Context-Sensitive Twitter Sentiment Classification Using Neural Network.

31. ClaimEval: Integrated and Flexible Framework for Claim Evaluation Using Credibility of Sources.

32. On the Effectiveness of Linear Models for One-Class Collaborative Filtering.

33. Supervised Hashing via Uncorrelated Component Analysis.

34. Commonsense in Parts: Mining Part-Whole Relations from the Web and Image Tags.

35. Recommendation with Social Dimensions.

36. Column-Oriented Datalog Materialization for Large Knowledge Graphs.

37. Semantic Community Identification in Large Attribute Networks.

38. Unfolding Temporal Dynamics: Predicting Social Media Popularity Using Multi-scale Temporal Decomposition.

39. Modeling Users' Preferences and Social Links in Social Networking Services: A Joint-Evolving Perspective.

40. Cross-Lingual Taxonomy Alignment with Bilingual Biterm Topic Model.

41. Online Cross-Modal Hashing for Web Image Retrieval.

42. Understanding Emerging Spatial Entities.

43. Building a Large Scale Dataset for Image Emotion Recognition: The Fine Print and The Benchmark.

44. STELLAR: Spatial-Temporal Latent Ranking for Successive Point-of-Interest Recommendation.

Technical Papers: Cognitive Modeling and Cognitive Systems 2

45. Learning the Preferences of Ignorant, Inconsistent Agents.

46. Egocentric Video Search via Physical Interactions.

Technical Papers: Computational Sustainability and AI 2

47. Learning Deep Representation from Big and Heterogeneous Data for Traffic Accident Inference.

48. Autonomous Electricity Trading Using Time-of-Use Tariffs in a Competitive Market.

Technical Papers: Game Playing and Interactive Entertainment 2

49. Reuse of Neural Modules for General Video Game Playing.

50. Poker-CNN: A Pattern Learning Strategy for Making Draws and Bets in Poker Games Using Convolutional Networks.

Technical Papers: Game Theory and Economic Paradigms 42

51. Computing Possible and Necessary Equilibrium Actions (and Bipartisan Set Winners).

52. From Duels to Battlefields: Computing Equilibria of Blotto and Other Games.

53. Maximizing Revenue with Limited Correlation: The Cost of Ex-Post Incentive Compatibility.

54. Blind, Greedy, and Random: Algorithms for Matching and Clustering Using Only Ordinal Information.

55. Strategyproof Peer Selection: Mechanisms, Analyses, and Experiments.

56. A Security Game Combining Patrolling and Alarm-Triggered Responses Under Spatial and Detection Uncertainties.

57. Learning Market Parameters Using Aggregate Demand Queries.

58. An Algorithmic Framework for Strategic Fair Division.

59. One Size Does Not Fit All: A Game-Theoretic Approach for Dynamically and Effectively Screening for Threats.

60. Strategy-Based Warm Starting for Regret Minimization in Games.

61. Using Correlated Strategies for Computing Stackelberg Equilibria in Extensive-Form Games.

62. Assignment and Pricing in Roommate Market.

63. Incentives for Strategic Behavior in Fisher Market Games.

64. Rules for Choosing Societal Tradeoffs.

65. Judgment Aggregation under Issue Dependencies.

66. Price of Pareto Optimality in Hedonic Games.

67. Multiwinner Analogues of the Plurality Rule: Axiomatic and Algorithmic Perspectives.

68. Ad Auctions and Cascade Model: GSP Inefficiency and Algorithms.

69. Variations on the Hotelling-Downs Model.

70. A Geometric Method to Construct Minimal Peer Prediction Mechanisms.

71. Sequence-Form and Evolutionary Dynamics: Realization Equivalence to Agent Form and Logit Dynamics.

72. Who Can Win a Single-Elimination Tournament?