Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, January 25-30, 2015, Austin, Texas, USA. AAAI Press 【DBLP Link】
【Paper Link】 【Pages】:2-8
【Authors】: Takuya Akiba ; Takanori Hayashi ; Nozomi Nori ; Yoichi Iwata ; Yuichi Yoshida
【Abstract】: We propose an indexing scheme for top-k shortest-path distance queries on graphs, which is useful in a wide range of important applications such as network-aware search and link prediction. While considerable effort has been made toward efficiently answering standard (top-1) distance queries, none of the previous methods can be directly extended to top-k distance queries. We propose a new framework for top-k distance queries based on 2-hop cover, and then present an efficient indexing algorithm based on the simple but effective recent notion of pruned landmark labeling. Extensive experimental results on real social and web graphs show the scalability, efficiency and robustness of our method. Moreover, we demonstrate the usefulness of top-k distance queries through an application to link prediction.
【Keywords】: graphs; social networks; web graphs; shortest-path distance; node similarity measure; link prediction
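【Code Sketch】: In a 2-hop cover index, every vertex stores a label of (landmark, distance) pairs, and a query joins the two labels on their common landmarks; for top-k queries, a landmark may carry several distances per vertex so that k combinations survive pruning. Below is a minimal query-side sketch in Python; the label layout is an assumption, and index construction by pruned landmark labeling is omitted.

    import heapq
    from collections import defaultdict

    def topk_distance(label_u, label_v, k):
        """Answer a top-k distance query from 2-hop labels.

        label_u, label_v: lists of (landmark, dist) pairs; a landmark may
        appear multiple times per vertex in a top-k index.
        """
        by_landmark = defaultdict(list)
        for w, d in label_v:
            by_landmark[w].append(d)
        # Combine distances through every shared landmark, keep k smallest.
        candidates = [du + dv
                      for w, du in label_u
                      for dv in by_landmark[w]]
        return heapq.nsmallest(k, candidates)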
【Paper Link】 【Pages】:9-15
【Authors】: Mustafa Al-Bakri ; Manuel Atencia ; Steffen Lalande ; Marie-Christine Rousset
【Abstract】: In this paper we model the problem of data linkage in Linked Data as a reasoning problem on possibly decentralized data. We describe a novel import-by-query algorithm that alternates steps of sub-query rewriting and of tailored querying of the Linked Data cloud in order to import data that is as specific as possible for inferring or contradicting given target same-as facts. Experiments conducted on a real-world dataset have demonstrated the feasibility of this approach and its usefulness in practice for data linkage and disambiguation.
【Keywords】: Linked Open Data; data linkage; forward and backward reasoning; Datalog rules; external queries
【Paper Link】 【Pages】:16-22
【Authors】: Jun Chen ; Chaokun Wang ; Jianmin Wang
【Abstract】: Intelligent item recommendation is a key issue in AI research which enables recommender systems to be more “human-minded” when generating recommendations. However, forgetting, one of the major features of human memory, has barely been discussed with regard to recommender systems. In this paper, we considered people’s forgetting of interest when performing personalized recommendations, and brought forward a personalized framework that integrates the interest-forgetting property with a Markov model. Multiple implementations of the framework were investigated and compared. The experimental evaluation showed that our methods could significantly improve the accuracy of item recommendation, which verified the importance of considering interest-forgetting in recommendations.
【Keywords】: IFMM; Interest-Forgetting; Markov Model; Personalized Recommendation; Interest Retention
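【Code Sketch】: The abstract does not give the exact formulation, so the following is only an illustrative sketch of a first-order Markov recommender whose transition scores are damped by an interest-retention factor; the exponential decay and all names here are assumptions, not the paper's model.

    import math

    def score_next_items(history, transitions, now, decay=0.1):
        """history: list of (item, timestamp); transitions[i][j] = P(j | i).
        Each past item's vote is weighted by an assumed retention factor
        exp(-decay * age), mimicking the forgetting of old interests."""
        scores = {}
        for item, t in history:
            retention = math.exp(-decay * (now - t))
            for nxt, p in transitions.get(item, {}).items():
                scores[nxt] = scores.get(nxt, 0.0) + retention * p
        # Items reachable from recent history rank highest.
        return sorted(scores.items(), key=lambda kv: -kv[1])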
【Paper Link】 【Pages】:23-29
【Authors】: Jun Chen ; Chaokun Wang ; Jianmin Wang
【Abstract】: Short-term reconsumption behaviors, i.e., “reconsuming” the near past, account for a large proportion of people’s everyday activities. In this paper, we first derived four generic features which influence people’s short-term reconsumption behaviors. These features were extracted with respect to the different roles in the process of reconsumption behaviors, i.e., users, items and interactions. Then, we brought forward two fast algorithms, with linear and quadratic kernels, to predict whether a user will perform a short-term reconsumption at a specific time given the context. The experimental results show that our proposed algorithms are more accurate in the prediction tasks than the baselines. Meanwhile, the time complexity of online prediction for our algorithms is O(1), which enables fast prediction in real-world scenarios. The prediction contributes to more intelligent decision-making, e.g., identifying customers likely to revisit, personalized recommendation, and information re-finding.
【Keywords】: STREC; Reconsumption Behaviors; Prediction Algorithms; Feature Extraction; Web User Analysis
【Paper Link】 【Pages】:30-36
【Authors】: Tao Chen ; Hany M. SalahEldeen ; Xiangnan He ; Min-Yen Kan ; Dongyuan Lu
【Abstract】: Image tweets are becoming a prevalent form of social media, but little is known about their content — textual and visual — and the relationship between the two media. Our analysis of image tweets shows that while visual elements certainly play a large role in image-text relationships, other factors, such as emotional elements, also factor into the relationship. We develop Visual-Emotional LDA (VELDA), a novel topic model to capture the image-text correlation from multiple perspectives (namely, visual and emotional). Experiments on real-world image tweets in both English and Chinese and other user-generated content show that VELDA significantly outperforms existing methods on cross-modality image retrieval. Even in other domains where emotion does not factor into image choice directly, our VELDA model demonstrates good generalization ability, achieving higher-fidelity modeling of such multimedia documents.
【Keywords】: image tweets; microblog; image and text; topic model
【Paper Link】 【Pages】:37-43
【Authors】: Xuefeng Chen ; Yifeng Zeng ; Gao Cong ; Shengchao Qin ; Yanping Xiang ; Yuanshun Dai
【Abstract】: Point-of-interest (POI) recommendation has become a valuable service in location-based social networks. Based on the premise that similar users are likely to have similar preferences for POIs, current recommendation techniques mainly focus on users' preferences to provide accurate recommendation results. This tends to generate a list of homogeneous POIs that are clustered into a narrow band of location categories (like food, museum, etc.) in a city. However, users are often more interested in tasting a wide range of flavors exposed in a global set of location categories in the city. In this paper, we formulate a new POI recommendation problem, namely top-K location-category-based POI recommendation, by introducing information coverage to encode the location categories of POIs in a city. The problem is NP-hard. We develop a greedy algorithm and further optimization to solve this challenging problem. The experimental results on two real-world datasets demonstrate the utility of the new POI recommendations and the superior performance of the proposed algorithms.
【Keywords】: Recommendation;Social Networks
【Paper Link】 【Pages】:44-50
【Authors】: Yan-Ying Chen ; Yin-Hsi Kuo ; Chun-Che Wu ; Winston H. Hsu
【Abstract】: A person's name is strongly influenced by his/her cultural background, such as gender and ethnicity, both vital attributes for user profiling, attribute-based retrieval, etc. Typically, the associations between names and attributes (e.g., people named "Amy" are mostly female) are annotated manually or provided by government census data. We propose to associate a name and its likely demographic attributes by exploiting click-throughs between name queries and images with automatically detected facial attributes. This is the first work attempting to translate an abstract name to demographic attributes in a visual-data-driven manner, and it is adaptive to incremental data, more countries and even unseen names (names outside the click-through data) without additional manual labels. In the experiments, the automatic name-attribute associations support gender inference with accuracy competitive to manual labeling. They also benefit profiling social media users and keyword-based face image retrieval, especially contributing a 12% relative improvement in accuracy when adapting to unseen names.
【Keywords】: Click-Through Data; Name Attributes; Image Retrieval; User Profiling
【Paper Link】 【Pages】:51-57
【Authors】: Belkacem Chikhaoui ; Mauricio Chiazzaro ; Shengrui Wang
【Abstract】: This paper addresses a new problem concerning the evolution of influence relationships between communities in dynamic social networks. A weighted temporal multigraph is employed to represent the dynamics of the social networks and analyze the influence relationships between communities over time. To ensure the interpretability of the knowledge discovered, the evolution of the influence relationships is assessed by introducing Granger causality. Through extensive experiments, we empirically demonstrate the suitability of our model for studying the evolution of influence between communities. Moreover, we empirically show how our model is able to accurately predict the influence of communities over time using random forest regression.
【Keywords】: Granger causality; Influence evolution; Dynamic social networks; Influence prediction; Multigraphs; Random forests
【Paper Link】 【Pages】:58-64
【Authors】: Sara Cohen ; Aviv Zohar
【Abstract】: Link prediction functions are important tools that are used to predict the evolution of a network, to locate hidden or surprising links, and to recommend new connections that should be formed. Multiple link prediction functions have been developed in the past. However, their evaluation has mostly been based on experimental work, which has shown that the quality of a link prediction function varies significantly depending on the input domain. There is currently very little understanding of why and how a specific link prediction function works well for a particular domain. The underlying foundations of a link prediction function are often left informal: each function contains implicit assumptions about the dynamics of link formation, and about structural properties that result from these dynamics. We draw upon the motivation used in characterizations of ranking algorithms, as well as other celebrated results from social choice, and present an axiomatic basis for link prediction. This approach seeks to deconstruct each function into basic axioms, or properties, that make explicit its underlying assumptions. Our framework uses "property templates" that can be considered as general choices made by a function designer, such as what score is assigned to a 2-vertex graph, which vertices are irrelevant to the score, how removing edges or contracting vertices affects the score, and more. Using this framework, we fully characterize four well known link prediction functions and show that they are in fact derived from different variants of a single basic set of property templates.
【Keywords】: Link prediction; Axiomatic approach
【Paper Link】 【Pages】:65-71
【Authors】: Peng Cui ; Tianyang Zhang ; Fei Wang ; Peng He
【Abstract】: Collective social and behavioral information commonly exists in nature. There is a widespread intuitive sense that the characteristics of this social and behavioral information are to some extent related to the themes (or semantics) of the activities or targets. In this paper, we explicitly validate the interplay of collective social and behavioral information and group themes using a large-scale real dataset of online groups, and demonstrate the possibility of perceiving group themes from collective social and behavioral information. We propose a REgularized miXEd Regression (REXER) model based on matrix factorization to infer hierarchical semantics (including both group category and group labels) from the collective social and behavioral information of group members. We extensively evaluate the proposed method on a large-scale real online group dataset. For the prediction of group themes, the proposed REXER achieves satisfactory performance on various criteria. More specifically, we can predict the category of a group (among 6 categories) purely from the collective social and behavioral information of the group with a Precision@1 of 55.16%, without any assistance from group labels or conversation contents. We also show, perhaps counterintuitively, that the collective social and behavioral information is more reliable than the titles and labels of groups for inferring the group categories.
【Keywords】:
【Paper Link】 【Pages】:72-78
【Authors】: Aron Culotta ; Nirmal Ravi Kumar ; Jennifer Cutler
【Abstract】: Understanding the demographics of users of online social networks has important applications for health, marketing, and public messaging. In this paper, we predict the demographics of Twitter users based on whom they follow. Whereas most prior approaches rely on a supervised learning approach, in which individual users are labeled with demographics, we instead create a distantly labeled dataset by collecting audience measurement data for 1,500 websites (e.g., 50% of visitors to gizmodo.com are estimated to have a bachelor's degree). We then fit a regression model to predict these demographics using information about the followers of each website on Twitter. The resulting average held-out correlation is .77 across six different variables (gender, age, ethnicity, education, income, and child status). We additionally validate the model on a smaller set of Twitter users labeled individually for ethnicity and gender, finding performance that is surprisingly competitive with a fully supervised approach.
【Keywords】: social media, demographics, regression
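【Code Sketch】: The distant-supervision setup reduces to regressing each website's audience statistic on features of its Twitter followers and reporting held-out correlation. Below is a minimal sketch with synthetic stand-in data; ridge regression and all names are assumptions, not necessarily the paper's exact model.

    import numpy as np
    from scipy.stats import pearsonr
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import cross_val_predict

    rng = np.random.default_rng(0)
    # One row per website; columns stand in for follower-based features.
    X = rng.random((1500, 50))
    # Stand-in audience statistic (e.g., % of visitors with a degree).
    y = X @ rng.normal(size=50) + rng.normal(scale=0.1, size=1500)

    pred = cross_val_predict(Ridge(alpha=1.0), X, y, cv=10)
    r, _ = pearsonr(y, pred)  # held-out correlation, as the paper reports
    print(f"held-out correlation: {r:.2f}")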
【Paper Link】 【Pages】:79-87
【Authors】: Wei Dai ; Abhimanu Kumar ; Jinliang Wei ; Qirong Ho ; Garth A. Gibson ; Eric P. Xing
【Abstract】: As Machine Learning (ML) applications embrace greater data size and model complexity, practitioners turn to distributed clusters to satisfy the increased computational and memory demands. Effective use of clusters for ML programs requires considerable expertise in writing distributed code, but existing highly-abstracted frameworks like Hadoop that pose low barriers to distributed programming have not, in practice, matched the performance seen in highly specialized and advanced ML implementations. The recent Parameter Server (PS) paradigm is a middle ground between these extremes, allowing easy conversion of single-machine parallel ML programs into distributed ones, while maintaining high throughput through relaxed "consistency models" that allow asynchronous (and, hence, inconsistent) parameter reads. However, due to insufficient theoretical study, it is not clear which of these consistency models can really ensure correct ML algorithm output; at the same time, there remain many theoretically-motivated but undiscovered opportunities to maximize computational throughput. Inspired by this challenge, we study both the theoretical guarantees and empirical behavior of iterative-convergent ML algorithms in existing PS consistency models. We then use the gleaned insights to improve a consistency model using an "eager" PS communication mechanism, and implement it as a new PS system that enables ML programs to reach their solution more quickly.
【Keywords】: Large-scale ML, Parameter Server, Distributed Stochastic Optimization, Probability Bounds, Semi-Synchronous Systems
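【Code Sketch】: One family of relaxed consistency models studied in this line of work bounds how stale a read may be relative to the slowest worker. The toy clock below illustrates that bounded-staleness idea only; it is a generic sketch, not the paper's eager communication mechanism.

    class BoundedStalenessClock:
        """A worker at clock c may read parameters only if the slowest
        worker is within `staleness` clocks; otherwise it must wait."""

        def __init__(self, staleness):
            self.staleness = staleness
            self.clock = {}

        def tick(self, worker):
            # A worker advances its clock after each iteration.
            self.clock[worker] = self.clock.get(worker, 0) + 1

        def can_read(self, worker):
            c = self.clock.get(worker, 0)
            slowest = min(self.clock.values(), default=0)
            return c - slowest <= self.staleness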
【Paper Link】 【Pages】:88-94
【Authors】: Jiwei Ding ; Wentao Ding ; Wei Hu ; Yuzhong Qu
【Abstract】: The quantity of entities in Linked Data is increasing rapidly. For entity search and browsing systems, filtering is very useful for users to find the entities they are interested in. Type is a kind of widely-used facet and can be easily obtained from knowledge bases, which makes it possible to create filters by selecting at most K types of an entity collection. However, existing approaches often fail to select high-quality type filters due to complex overlap between types. In this paper, we propose a novel type selection approach based upon Budgeted Maximum Coverage (BMC), which can achieve integral optimization of the coverage quality of type filters. Furthermore, we define a new optimization problem called Extended Budgeted Maximum Coverage (EBMC) and propose an EBMC-based approach, which enhances the BMC-based approach by incorporating the relevance between entities and types, so as to create sensible type filters. Our experimental results show that the EBMC-based approach performs best compared with several representative approaches.
【Keywords】: Budgeted Maximum Coverage; Type Filter; Type Hierarchy
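【Code Sketch】: In the unit-cost case (select at most K types), Budgeted Maximum Coverage admits the classic greedy algorithm that repeatedly takes the type covering the most still-uncovered entities, with the usual (1 - 1/e) guarantee. A minimal sketch under that assumption; the paper's EBMC variant additionally weights entity-type relevance.

    def greedy_type_filters(types, k):
        """types: dict mapping type name -> set of entity ids it covers.
        Returns up to k types chosen greedily by marginal coverage."""
        types = dict(types)          # local copy; chosen types are removed
        covered, chosen = set(), []
        for _ in range(k):
            best = max(types, key=lambda t: len(types[t] - covered),
                       default=None)
            if best is None or not (types[best] - covered):
                break                # no remaining type adds coverage
            chosen.append(best)
            covered |= types.pop(best)
        return chosen, covered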
【Paper Link】 【Pages】:95-101
【Authors】: Valeria Fionda ; Gianluigi Greco
【Abstract】: Due to the openness and decentralization of the Web, mechanisms to represent and reason about the reliability of RDF data become essential. This paper embarks on a formal analysis of RDF data enriched with trust information by focusing on the characterization of its model-theoretic semantics and on the study of relevant reasoning problems. The impact of trust values on the computational complexity of well-known concepts related to the entailment of RDF graphs is studied. In particular, islands of tractability are identified for classes of acyclic and nearly-acyclic graphs. Moreover, an implementation of the framework and an experimental evaluation on real data are discussed.
【Keywords】: RDF reasoning; entailment; trust; islands of tractability
【Paper Link】 【Pages】:102-108
【Authors】: Valeria Fionda ; Giuseppe Pirrò ; Mariano P. Consens
【Abstract】: We introduce Extended Property Paths (EPPs), a significant enhancement of SPARQL property paths. EPPs allow capturing, in a succinct way, a larger class of navigational queries than property paths. We present the syntax and formal semantics of EPPs and introduce two different evaluation strategies. The first is based on an algorithm implemented in a custom query processor. The second strategy leverages a translation algorithm from EPPs into SPARQL queries that can be executed on existing SPARQL processors. We compare the two evaluation strategies on real data to highlight their pros and cons.
【Keywords】: Graph Navigational Languages; Property Paths; Translation into SPARQL; RDF
【Paper Link】 【Pages】:109-115
【Authors】: Birte Glimm ; Yevgeny Kazakov ; Ilianna Kollia ; Giorgos B. Stamou
【Abstract】: The paper presents an approach for optimizing the evaluation of SPARQL queries over OWL ontologies using SPARQL's OWL Direct Semantics entailment regime. The approach is based on the computation of lower and upper bounds, but we allow for much more expressive queries than related approaches. In order to optimize the evaluation of possible query answers in the upper but not in the lower bound, we present a query extension approach that uses schema knowledge from the queried ontology to extend the query with additional parts. We show that the resulting query is equivalent to the original one and we use the additional parts that are simple to evaluate for restricting the bounds of subqueries of the initial query. In an empirical evaluation we show that the proposed query extension approach can lead to a significant decrease in the query execution time of up to four orders of magnitude.
【Keywords】: SPARQL Queries; Query Bounds; Query Extension; Query Answering; OWL Ontologies
【Paper Link】 【Pages】:116-122
【Authors】: Kalpa Gunaratna ; Krishnaprasad Thirunarayan ; Amit P. Sheth
【Abstract】: Semantic Web documents that encode facts about entities on the Web have been growing rapidly in size and evolving over time. Creating summaries on lengthy Semantic Web documents for quick identification of the corresponding entity has been of great contemporary interest. In this paper, we explore automatic summarization techniques that characterize and enable identification of an entity and create summaries that are human friendly. Specifically, we highlight the importance of diversified (faceted) summaries by combining three dimensions: diversity, uniqueness, and popularity. Our novel diversity-aware entity summarization approach mimics human conceptual clustering techniques to group facts and picks representative facts from each group to form concise (i.e., short) and comprehensive (i.e., improved coverage through diversity) summaries. We evaluate our approach against the state-of-the-art techniques and show that our work improves both the quality and the efficiency of entity summarization.
【Keywords】: Entity summary; Hierarchical conceptual clustering; Ranking; RDF; DBpedia
【Paper Link】 【Pages】:123-129
【Authors】: Guibing Guo ; Jie Zhang ; Neil Yorke-Smith
【Abstract】: Collaborative filtering suffers from the problems of data sparsity and cold start, which dramatically degrade recommendation performance. To help resolve these issues, we propose TrustSVD, a trust-based matrix factorization technique. By analyzing the social trust data from four real-world data sets, we conclude that not only the explicit but also the implicit influence of both ratings and trust should be taken into consideration in a recommendation model. Hence, we build on top of a state-of-the-art recommendation algorithm, SVD++, which inherently involves the explicit and implicit influence of rated items, by further incorporating both the explicit and implicit influence of trusted users on the prediction of items for an active user. To our knowledge, the work reported is the first to extend SVD++ with social trust information. Experimental results on the four data sets demonstrate that our approach TrustSVD achieves better accuracy than ten other counterparts, and can better handle the concerned issues.
【Keywords】: Recommender systems; social trust; trust influence
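【Code Sketch】: The prediction rule extends SVD++ with an implicit-feedback term over trusted users. A hedged sketch of that rule only; array shapes and names are assumptions, and training, biases' learning and regularization are omitted.

    import numpy as np

    def predict_rating(u, i, mu, bu, bi, P, Q, Y, W, rated, trusted):
        """SVD++-style prediction augmented with trust: Y[j] carries the
        implicit influence of rated item j, W[v] that of trusted user v.
        rated[u] / trusted[u]: lists of item / user indices for user u."""
        imp_items = (Y[rated[u]].sum(axis=0) / np.sqrt(len(rated[u]))
                     if rated[u] else 0.0)
        imp_trust = (W[trusted[u]].sum(axis=0) / np.sqrt(len(trusted[u]))
                     if trusted[u] else 0.0)
        return mu + bu[u] + bi[i] + Q[i] @ (P[u] + imp_items + imp_trust)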
【Paper Link】 【Pages】:130-136
【Authors】: Dongxiao He ; Dayou Liu ; Di Jin ; Weixiong Zhang
【Abstract】: Discovery of communities in networks is a fundamental data analysis problem. Most of the existing approaches have focused on discovering communities of nodes, while recent studies have shown great advantages and utilities of the knowledge of communities of links. Stochastic models provide a promising class of techniques for the identification of modular structures, but most stochastic models mainly focus on the detection of node communities rather than link communities. We propose a stochastic model, which not only describes the structure of link communities, but also considers the heterogeneous distribution of community sizes, a property which is often ignored by other models. We then learn the model parameters using a method of maximum likelihood based on an expectation-maximization algorithm. To deal with large complex real networks, we extend the method by a strategy of iterative bipartition. The extended method is not only efficient, but is also able to determine the number of communities for a given network. We test our approach on both synthetic benchmarks and real-world networks, including an application to a large biological network, and also compare it with two existing methods. The results demonstrate the superior performance of our approach over the competing methods for detecting link communities.
【Keywords】: Complex Network, Community Structure, Stochastic Model, Expectation-Maximization
【Paper Link】 【Pages】:137-144
【Authors】: Xian-Sheng Hua ; Jin Li
【Abstract】: With the advances in distributed computation, machine learning and deep neural networks, we enter an era in which it is possible to build a real-world image recognition system. There are three essential components to building such a system: 1) creating representative features, 2) designing powerful learning approaches, and 3) identifying massive training data. While extensive research has been done on the first two aspects, much less attention has been paid to the third. In this paper, we present an end-to-end Web knowledge discovery system, Prajna. Starting from an arbitrary set of entities as input, Prajna automatically crawls images from multiple sources, identifies images that are reliably labeled, trains models and builds a recognition system that is capable of recognizing any new images of the entity set. Due to the high cost of manual data labeling, leveraging the massive yet noisy data on the Internet is a natural idea, but the practical engineering aspect is highly challenging. Prajna focuses on separating reliable training data from extensive noisy data, which is key to extending an image recognition system to support arbitrary entities. In this paper, we analyze the intrinsic characteristics of Internet image data and find ways to mine accurate and informative information from those data to build a training set, which is then used to train image recognition models. Prajna is capable of automatically building an image recognition system for any set of entities as long as we can collect a sufficient number of their images on the Web.
【Keywords】:
【Paper Link】 【Pages】:145-150
【Authors】: Mans Hulden ; Miikka Silfverberg ; Jerid Francom
【Abstract】: Text-based geolocation classifiers often operate with a grid-based view of the world. Predicting a document's location of origin based on text content on a geodesic grid is computationally attractive, since many standard methods for supervised document classification carry over unchanged to geolocation in the form of predicting a most probable grid cell for a document. However, the grid-based approach suffers from sparse data problems if one wants to improve classification accuracy by moving to smaller cell sizes. In this paper we investigate an enhancement of common methods for determining the geographic point of origin of a text document by kernel density estimation. For geolocation of tweets we obtain improvements over non-kernel methods on datasets of U.S. and global Twitter content.
【Keywords】: document geolocation; twitter; kernel classifier
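【Code Sketch】: The enhancement replaces a hard cell assignment with a smoothed density over locations. The sketch below is a simplification under stated assumptions: each word's training coordinates define a kernel density, and a test document is placed at the grid point maximizing the summed log-density of its words. Names and the scoring rule are assumptions, not the paper's exact estimator.

    import numpy as np
    from sklearn.neighbors import KernelDensity

    def geolocate(word_coords, doc_words, bandwidth=1.0):
        """word_coords: dict word -> list of (lat, lon) training points."""
        kdes = {w: KernelDensity(bandwidth=bandwidth).fit(np.asarray(pts))
                for w, pts in word_coords.items() if pts}
        # Coarse candidate grid; a finer grid trades speed for precision.
        grid = np.array([(lat, lon)
                         for lat in range(-90, 91, 5)
                         for lon in range(-180, 181, 5)], dtype=float)
        score = np.zeros(len(grid))
        for w in doc_words:
            if w in kdes:
                score += kdes[w].score_samples(grid)  # log-densities
        return tuple(grid[int(np.argmax(score))])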
【Paper Link】 【Pages】:151-159
【Authors】: Cheng Jin ; Wenhui Mao ; Ruiqi Zhang ; Yuejie Zhang ; Xiangyang Xue
【Abstract】: A new algorithm via Canonical Correlation Analysis (CCA) is developed in this paper to support more effective cross-modal image clustering for large-scale annotated image collections. It can be treated as a bi-media multimodal mapping problem and modeled as a correlation distribution over multimodal feature representations. It integrates the multimodal feature generation with the Locality Linear Coding (LLC) and co-occurrence association network, multimodal feature fusion with CCA, and accelerated hierarchical k-means clustering, which aims to characterize the correlations between the inter-related visual features in images and semantic features in captions, and measure their association degree more precisely. Very positive results were obtained in our experiments using a large quantity of public data.
【Keywords】:
【Paper Link】 【Pages】:160-167
【Authors】: Di Jin ; Zheng Chen ; Dongxiao He ; Weixiong Zhang
【Abstract】: An important problem in analyzing complex networks is discovery of modular or community structures embedded in the networks. Although being promising for identifying network communities, the popular stochastic models often do not preserve node degrees, thus reducing their representation power and applicability to real-world networks. Here we address this critical problem. Instead of using a blockmodel, we adopted a random-graph null model to faithfully capture community structures by preserving in the model the expected node degrees. The new model, learned using nonnegative matrix factorization, is more accurate and robust in representing community structures than the existing methods. Our results from extensive experiments on synthetic benchmarks and real-world networks show the superior performance of the new method over the existing methods in detecting both disjoint and overlapping communities.
【Keywords】: Complex Network, Community Detection, Stochastic Model, Nonnegative Matrix Factorization
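【Code Sketch】: For orientation, plain symmetric NMF on the adjacency matrix (A ≈ H H^T) already yields community memberships; the paper's model differs in that it preserves expected node degrees through a random-graph null model, which this baseline sketch does not.

    import numpy as np

    def nmf_communities(A, k, iters=200, eps=1e-9):
        """A: nonnegative (symmetric) adjacency matrix as a numpy array;
        returns a hard community assignment per node."""
        n = A.shape[0]
        H = np.abs(np.random.default_rng(0).normal(size=(n, k)))
        for _ in range(iters):
            # Standard multiplicative update for minimizing ||A - H H^T||.
            H *= (A @ H) / (H @ (H.T @ H) + eps)
        return H.argmax(axis=1)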
【Paper Link】 【Pages】:168-174
【Authors】: Seungyeon Kim ; Joonseok Lee ; Guy Lebanon ; Haesun Park
【Abstract】: Sentiment analysis predicts a one-dimensional quantity describing the positive or negative emotion of an author. Mood analysis extends the one-dimensional sentiment response to a multi-dimensional quantity, describing a diverse set of human emotions. In this paper, we extend sentiment and mood analysis temporally and model emotions as a function of time based on temporal streams of blog posts authored by a specific author. The model is useful for constructing predictive models and discovering scientific models of human emotions.
【Keywords】:
【Paper Link】 【Pages】:175-181
【Authors】: Patrick Koopmann ; Renate A. Schmidt
【Abstract】: Uniform interpolation and the dual task of forgetting restrict the ontology to a specified subset of concept and role names. This makes them useful tools for ontology analysis, ontology evolution and information hiding. Most previous research focused on uniform interpolation of TBoxes. However, especially for applications in privacy and information hiding, it is essential that uniform interpolation methods can deal with ABoxes as well. We present the first method that can compute uniform interpolants of any ALC ontology with ABoxes. ABoxes bring their own challenges when computing uniform interpolants, possibly requiring disjunctive statements or nominals in the resulting ABox. Our method can compute representations of uniform interpolants in ALCO. An evaluation on realistic ontologies shows that these uniform interpolants can be practically computed, and can often even be presented in pure ALC.
【Keywords】: Ontologies; Description Logics; Uniform Interpolation; Forgetting; Approximation; Predicate Hiding; Logical Difference; Ontology Analysis; Automated Reasoning; Resolution
【Paper Link】 【Pages】:182-188
【Authors】: Virgile Landeiro Dos Reis ; Aron Culotta
【Abstract】: Recent work has demonstrated the value of social media monitoring for health surveillance (e.g., tracking influenza or depression rates). It is an open question whether such data can be used to make causal inferences (e.g., determining which activities lead to increased depression rates). Even in traditional, restricted domains, estimating causal effects from observational data is highly susceptible to confounding bias. In this work, we estimate the effect of exercise on mental health from Twitter, relying on statistical matching methods to reduce confounding bias. We train a text classifier to estimate the volume of a user's tweets expressing anxiety, depression, or anger, then compare two groups: those who exercise regularly (identified by their use of physical activity trackers like Nike+), and a matched control group. We find that those who exercise regularly have significantly fewer tweets expressing depression or anxiety; there is no significant difference in rates of tweets expressing anger. We additionally perform a sensitivity analysis to investigate how the many experimental design choices in such a study impact the final conclusions, including the quality of the classifier and the construction of the control group.
【Keywords】: social media, public health
【Paper Link】 【Pages】:189-195
【Authors】: Freddy Lécué ; Jeff Z. Pan
【Abstract】: Deductive reasoning and inductive learning are the most common approaches for deriving knowledge. In real-world applications where data is dynamic and incomplete, especially data exposed by sensors, reasoning is limited by the dynamics of the data while learning is biased by data incompleteness. Therefore, discovering consistent knowledge from incomplete and dynamic data is a challenging open problem. In our approach the semantics of data is captured through ontologies to empower learning (mining) with (Description Logics) reasoning. Consistent knowledge discovery is achieved by applying generic, significant, representative association semantic rules. The experiments have shown scalable, accurate and consistent knowledge discovery with data from Dublin.
【Keywords】: Semantic web; evolving ontology; dynamic ontology; dynamic reasoning; temporal reasoning
【Paper Link】 【Pages】:196-202
【Authors】: He Liu ; Hongliang Yu ; Zhi-Hong Deng
【Abstract】: Multi-document summarization is of great value to many real-world applications since it can help people get the main ideas within a short time. In this paper, we tackle the problem of extracting summary sentences from multi-document sets by applying sparse coding techniques and present a novel framework for this challenging problem. Based on the data reconstruction and sentence denoising assumptions, we present a two-level sparse representation model to depict the process of multi-document summarization. Three requisite properties are proposed to form an ideal reconstructable summary: Coverage, Sparsity and Diversity. We then formalize the task of multi-document summarization as an optimization problem according to the above properties, and use a simulated annealing algorithm to solve it. Extensive experiments on the summarization benchmark datasets DUC2006 and DUC2007 show that our proposed model is effective and outperforms the state-of-the-art algorithms.
【Keywords】: summarization; sparse coding
【Paper Link】 【Pages】:203-209
【Authors】: Qiang Liu ; Shu Wu ; Liang Wang
【Abstract】: With the rapid growth of information on the internet, recommender systems have become fundamental for helping users alleviate the problem of information overload. Since contextual information can be a significant factor in modeling user behavior, various context-aware recommendation methods have been proposed. However, the state-of-the-art context modeling methods treat contexts as other dimensions similar to the dimensions of users and items, and cannot capture the special semantic operation of contexts. On the other hand, some works on multi-domain relation prediction can be used for context-aware recommendation, but they have problems generating recommendations under a large amount of contextual information. In this work, we propose the Contextual Operating Tensor (COT) model, which represents the common semantic effects of contexts as a contextual operating tensor and represents a context as a latent vector. Then, to model the semantic operation of a context combination, we generate a contextual operating matrix from the contextual operating tensor and the latent vectors of contexts. Thus latent vectors of users and items can be operated by the contextual operating matrices. Experimental results show that the proposed COT model yields significant improvements over the competing methods on three typical datasets, i.e., the Food, Adom and Movielens-1M datasets.
【Keywords】: recommender system; context-awareness; matrix factorization; rating prediction
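【Code Sketch】: The core operation is to generate a contextual operating matrix by contracting a shared tensor with the latent vectors of the active contexts, and then apply it to the user and item latent vectors. A hedged one-function sketch; averaging the context vectors and sharing one tensor for users and items are simplifying assumptions, not the paper's exact construction.

    import numpy as np

    def contextual_score(p_u, q_i, context_vecs, T):
        """p_u, q_i: user/item latent vectors of dim d; context_vecs: list
        of context latent vectors of dim c; T: operating tensor (c, d, d)."""
        c = np.mean(context_vecs, axis=0)     # combine context vectors
        M = np.einsum('k,kij->ij', c, T)      # contextual operating matrix
        return (M @ p_u) @ (M @ q_i)          # interaction of operated vectors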
【Paper Link】 【Pages】:210-216
【Authors】: Weiwei Liu ; Zhi-Hong Deng ; Xiuwen Gong ; Frank Jiang ; Ivor W. Tsang
【Abstract】: Effective forecasting of future prevalent topics plays an important role in social network business development. It involves two challenging aspects: predicting whether a topic will become prevalent, and when. This cannot be directly handled by the existing algorithms in topic modeling, item recommendation and action forecasting. The classic forecasting framework based on time series models may be able to predict a hot topic when a series of periodical changes to user-addressed frequency occurs in a systematic way. However, the frequency of topics discussed by users often changes irregularly in social networks. In this paper, a generic probabilistic framework is proposed for hot topic prediction, and machine learning methods are explored to predict hot topic patterns. Two effective models, PreWHether and PreWHen, are introduced to predict whether and when a topic will become prevalent. In the PreWHether model, we simulate the constructed features of previously observed frequency changes for better prediction. In the PreWHen model, distributions of time intervals associated with the emergence to prevalence of a topic are modeled. Extensive experiments on real datasets demonstrate that our method outperforms the baselines and generates more effective predictions.
【Keywords】: social network; time series; probability model
【Paper Link】 【Pages】:217-223
【Authors】: Zhongqi Lu ; Zhicheng Dou ; Jianxun Lian ; Xing Xie ; Qiang Yang
【Abstract】: News recommendation has become a major attraction with which Web search portals retain their users. Two effective approaches are Content-based Filtering and Collaborative Filtering, each serving a specific recommendation scenario. The Content-based Filtering approaches inspect rich contexts of the recommended items, while the Collaborative Filtering approaches predict the interests of long-tail users by collaboratively learning from the interests of related users. We have observed empirically that, for the problem of news topic displaying, both rich contexts of news topics and long-tail users exist. Therefore, in this paper, we propose a Content-based Collaborative Filtering approach (CCF) to bring the two approaches together. We found that combining the two is not an easy task, but the benefits of CCF are impressive. On one hand, CCF makes recommendations based on the rich contexts of the news. On the other hand, CCF collaboratively analyzes the scarce feedback from the long-tail users. We tailored this CCF approach to news topic displaying on the Bing front page and demonstrated great gains in attracting users. In the experiments and analyses part of this paper, we discuss the performance gains and insights into news topic recommendation in Bing.
【Keywords】: Content-based; Collaborative Filtering; News Recommendation
【Paper Link】 【Pages】:224-230
【Authors】: Zongyang Ma ; Aixin Sun ; Quan Yuan ; Gao Cong
【Abstract】: Stack Overflow and MedHelp are examples of domain-specific community-based question answering (CQA) systems. Different from CQA systems for general topics (e.g., Yahoo! Answers, Baidu Knows), questions and answers in domain-specific CQA systems are mostly in the same topical domain, enabling more comprehensive interaction between users on fine-grained topics. In such systems, users are more likely to ask questions on unfamiliar topics and to answer questions matching their expertise. Users can also vote on answers based on their judgements. In this paper, we propose a Tri-Role Topic Model (TRTM) to model the tri-roles of users (i.e., as askers, answerers, and voters, respectively) and the activities of each role, including composing questions, selecting questions to answer, and contributing and voting on answers. The proposed model can be used to enhance CQA systems from many perspectives. As a case study, we conducted experiments on ranking answers for questions on Stack Overflow, a CQA system for professional and enthusiast programmers. Experimental results show that TRTM is effective in helping users obtain ideal rankings of answers, particularly for new and less popular questions. Evaluated on nDCG, TRTM outperforms state-of-the-art methods.
【Keywords】: Topic Model; Question Answering
【Paper Link】 【Pages】:231-237
【Authors】: Boris Motik ; Yavor Nenov ; Robert Edgar Felix Piro ; Ian Horrocks
【Abstract】: Rewriting is widely used to optimise owl:sameAs reasoning in materialisation based OWL 2 RL systems. We investigate issues related to both the correctness and efficiency of rewriting, and present an algorithm that guarantees correctness, improves efficiency, and can be effectively parallelised. Our evaluation shows that our approach can reduce reasoning times on practical data sets by orders of magnitude.
【Keywords】: Datalog; Materialisation; Parallel; Equality; owl:sameAs; RDF;
【Paper Link】 【Pages】:238-246
【Authors】: Stephen Mussmann ; John Moore ; Joseph John Pfeiffer III ; Jennifer Neville
【Abstract】: Due to the recent availability of large complex networks, considerable analysis has focused on understanding and characterizing the properties of these networks. Scalable generative graph models focus on modeling distributions of graphs that match real world network properties and scale to large datasets. Much work has focused on modeling networks with a power law degree distribution, clustering, and small diameter. In network analysis, the assortativity statistic is defined as the correlation between the degrees of linked nodes in the network. The assortativity measure can distinguish between types of networks---social networks commonly exhibit positive assortativity, in contrast to biological or technological networks that are typically disassortative. Despite this, little work has focused on scalable graph models that capture assortativity in networks. The contributions of our work are twofold. First, we prove that an unbounded number of pairs of networks exist with the same degree distribution and assortativity, yet different joint degree distributions. Thus, assortativity as a network measure cannot distinguish between graphs with complex (non-linear) dependence in their joint degree distributions. Motivated by this finding, we introduce a generative graph model that explicitly estimates and models the joint degree distribution. Our Binned Chung Lu method accurately captures both the joint degree distribution and assortativity, while still matching characteristics such as the degree distribution and clustering coefficients. Further, our method has subquadratic learning and sampling methods that enable scaling to large, real world networks. We evaluate performance compared to other scalable graph models on six real world networks, including a citation network with over 14 million edges.
【Keywords】: Generative Graph Models; Social Network Analysis
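【Code Sketch】: For reference, classic Chung-Lu sampling draws each edge endpoint with probability proportional to its degree, which matches expected degrees but not assortativity; the paper's binned variant additionally corrects the joint degree distribution. A sketch of the classic sampler only:

    import random

    def chung_lu_edges(degrees, num_edges):
        """degrees: dict node -> target degree. Returns a set of edges
        whose expected node degrees match the input (self-loops and
        duplicate edges are rejected)."""
        nodes = list(degrees)
        weights = [degrees[v] for v in nodes]
        edges = set()
        while len(edges) < num_edges:
            # Each endpoint is drawn with probability proportional to degree.
            u, v = random.choices(nodes, weights=weights, k=2)
            if u != v:
                edges.add((min(u, v), max(u, v)))
        return edges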
【Paper Link】 【Pages】:247-253
【Authors】: Peter F. Patel-Schneider
【Abstract】: RDF and Description Logics work in an open-world setting where absence of information is not information about absence. Nevertheless, Description Logic axioms can be interpreted in a closed-world setting, and in this setting they can be used for both constraint checking and closed-world recognition against information sources. When the information sources are expressed in well-behaved RDF or RDFS (i.e., RDF graphs interpreted under the RDF or RDFS semantics), this constraint checking and closed-world recognition are simple to describe. Further, this constraint checking can be implemented as SPARQL querying and thus effectively performed.
【Keywords】: Semantic Web; RDF; RDFS; Description Logics; OWL; Constraints
【Paper Link】 【Pages】:254-260
【Authors】: Guilin Qi ; Zhe Wang ; Kewen Wang ; Xuefeng Fu ; Zhiqiang Zhuang
【Abstract】: Model-based approaches provide a semantically well-justified way to revise ontologies. However, in general, model-based revision operators are limited due to a lack of efficient algorithms and the inexpressibility of the revision results. In this paper, we make both theoretical and practical contributions to the efficient computation of model-based revisions in DL-Lite. Specifically, we show that maximal approximations of two well-known model-based revisions for DL-Lite_R can be computed using a syntactic algorithm. However, such a coincidence of model-based and syntactic approaches does not hold when role functionality axioms are allowed. As a result, we identify conditions that guarantee such a coincidence for DL-Lite_FR. Our result shows that both model-based and syntactic revisions can co-exist seamlessly and the advantages of both approaches can be taken in one revision operator. Based on our theoretical results, we develop a graph-based algorithm for the revision operator.
【Keywords】: Belief revision;Description logics;Ontology Evolution
【Paper Link】 【Pages】:261-267
【Authors】: Suhas Ranganath ; Jiliang Tang ; Xia Hu ; Hari Sundaram ; Huan Liu
【Abstract】: The rise of social media provides a great opportunity for people to reach out to their social connections to satisfy their information needs. However, generic social media platforms are not explicitly designed to assist information seeking of users. In this paper, we propose a novel framework to identify the social connections of a user able to satisfy his information needs. The information need of a social media user is subjective and personal, and we investigate the utility of his social context to identify people able to satisfy it. We present questions users post on Twitter as instances of information seeking activities in social media. We infer soft community memberships of the asker and his social connections by integrating network and content information. Drawing concepts from the social foci theory, we identify answerers whose community memberships in the question domain overlap with that of the asker. Our experiments demonstrate that the framework is effective in identifying answerers to social media questions.
【Keywords】: Social Media, Q&A, Social Foci
【Paper Link】 【Pages】:268-274
【Authors】: Marie-Christine Rousset ; Federico Ulliana
【Abstract】: We present a novel semantics for extracting bounded-level modules from RDF ontologies and databases augmented with safe inference rules, a la Datalog. Dealing with a recursive rule language poses challenging issues for defining the module semantics, and also makes module extraction algorithmically unsolvable in some cases. Our results include a set of module extraction algorithms compliant with the novel semantics. Experimental results show that the resulting framework is effective in extracting expressive modules from RDF datasets with formal guarantees, whilst controlling their succinctness.
【Keywords】: RDF, datalog, Semantic Web
【Paper Link】 【Pages】:275-281
【Authors】: Yikang Shen ; Wenge Rong ; Zhiwei Sun ; Yuanxin Ouyang ; Zhang Xiong
【Abstract】: Community-based Question Answering (CQA) has become popular on knowledge sharing sites since it allows users to get answers to complex, detailed, and personal questions directly from other users. Large archives of historical questions and associated answers have been accumulated. Retrieving relevant historical answers that best match a question is an essential component of a CQA service. Most state-of-the-art approaches are based on bag-of-words models, which have proven successful in a range of text matching tasks but are insufficient for capturing the important word sequence information in short text matching. In this paper, a new architecture is proposed to more effectively model the complicated matching relations between questions and answers. It utilises a similarity matrix which contains both lexical and sequential information. Afterwards the information is fed into a deep architecture to find potentially suitable answers. The experimental study shows its potential for improving the matching accuracy of questions and answers.
【Keywords】: Question/Answer Matching; Deep Convolutional Neural Network
【Paper Link】 【Pages】:282-289
【Authors】: Sigal Sina ; Avi Rosenfeld ; Sarit Kraus ; Navot Akiva
【Abstract】: An important area of social network research is identifying missing information which is not explicitly represented in the network or is not visible to all. In this paper, we propose a novel Hybrid Approach of Classifier and Clustering, which we refer to as HACC, to solve the missing node identification problem in social networks. HACC utilizes a classifier as a preprocessing step in order to integrate all known information into one similarity measure and then uses a clustering algorithm to identify missing nodes. Specifically, we used information on the network structure, attributes of known users (nodes) and pictorial information to evaluate HACC and found that it performs significantly better than other missing node algorithms. We also argue that HACC is a general, domain-independent approach that can be easily applied to other domains. We support this claim by evaluating HACC on a second domain, authorship identification, as well.
【Keywords】: Clustering; K-means; missing nodes;
【Paper Link】 【Pages】:290-296
【Authors】: Dongjin Song ; David A. Meyer
【Abstract】: With the rapid development of signed social networks in which the relationships between two nodes can be either positive (indicating relations such as like) or negative (indicating relations such as dislike), producing a personalized ranking list with positive links on the top and negative links at the bottom is becoming an increasingly important task. To accomplish it, we propose a generalized AUC (GAUC) to quantify the ranking performance of potential links (including positive, negative, and unknown status links) in partially observed signed social networks. In addition, we develop a novel link recommendation algorithm by directly optimizing the GAUC loss. We conduct experimental studies based upon Wikipedia, MovieLens, and Slashdot; our results demonstrate the effectiveness and the efficiency of the proposed approach.
【Keywords】: Link Recommendation; Signed Social Networks; AUC
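【Code Sketch】: A generalized AUC over three link statuses can be read as a weighted fraction of correctly ordered pairs: positive links should rank above unknown-status links, and unknown-status links above negative ones. The weighting below is an assumption for illustration; the paper's exact definition may differ.

    def generalized_auc(scores, labels, w=0.5):
        """labels: +1 (positive), -1 (negative), 0 (unknown status)."""
        pos = [s for s, l in zip(scores, labels) if l > 0]
        neg = [s for s, l in zip(scores, labels) if l < 0]
        unk = [s for s, l in zip(scores, labels) if l == 0]

        def frac_ordered(a, b):
            # Fraction of (a, b) pairs where a is ranked above b.
            pairs = [(x, y) for x in a for y in b]
            return (sum(x > y for x, y in pairs) / len(pairs)) if pairs else 0.0

        return w * frac_ordered(pos, unk) + (1 - w) * frac_ordered(unk, neg)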
【Paper Link】 【Pages】:297-303
【Authors】: Wei Sun ; Pengyuan Wang ; Dawei Yin ; Jian Yang ; Yi Chang
【Abstract】: Advertising effectiveness measurement is a fundamental problem in online advertising. Various causal inference methods have been employed to measure the causal effects of ad treatments. However, existing methods mainly focus on linear logistic regression for univariate and binary treatments and are not well suited for complex ad treatments of multiple dimensions, where each dimension could be discrete or continuous. In this paper we propose a novel two-stage causal inference framework for assessing the impact of complex ad treatments. In the first stage, we estimate the propensity parameter via a sparse additive model; in the second stage, a propensity-adjusted regression model is applied for measuring the treatment effect. Our approach is shown to provide an unbiased estimation of ad effectiveness under regularity conditions. To demonstrate the efficacy of our approach, we apply it to a real online advertising campaign to evaluate the impact of three ad treatments: ad frequency, ad channel, and ad size. We show that ad frequency usually has a treatment effect cap when ads are shown on mobile devices. In addition, the strategies for choosing the best ad size are completely different for mobile ads and online ads.
【Keywords】:
【Paper Link】 【Pages】:304-310
【Authors】: Jie Tang ; Chenhui Zhang ; Keke Cai ; Li Zhang ; Zhong Su
【Abstract】: Finding a subset of users to statistically represent the original social network is a fundamental issue in Social Network Analysis (SNA). The problem has not been extensively studied in the existing literature. In this paper, we present a formal definition of the problem of sampling representative users from a social network. We propose two sampling models and theoretically prove their NP-hardness. To efficiently solve the two models, we present an efficient algorithm with provable approximation guarantees. Experimental results on two datasets show that the proposed models for sampling representative users significantly outperform (+6%-23% in terms of Precision@100) several alternative methods using authority or structure information only. The proposed algorithms are also efficient in terms of time complexity: only a few seconds are needed to sample ~300 representative users from a network of 100,000 users. All data and code are publicly available.
【Keywords】: network sampling; social networks
【Paper Link】 【Pages】:311-317
【Authors】: Goutham Tholpadi ; Mrinal Kanti Das ; Trapit Bansal ; Chiranjib Bhattacharyya
【Abstract】: Commenting is a popular facility provided by news sites. Analyzing such user-generated content has recently attracted research interest. However, in multilingual societies such as India, analyzing such user-generated content is hard due to several reasons: (1) There are more than 20 official languages but linguistic resources are available mainly for Hindi. It is observed that people frequently use romanized text as it is easy and quick using an English keyboard, resulting in multi-glyphic comments, where the texts are in the same language but in different scripts. Such romanized texts are almost unexplored in machine learning so far. (2) In many cases, comments are made on a specific part of the article rather than the topic of the entire article. Off-the-shelf methods such as correspondence LDA are insufficient to model such relationships between articles and comments. In this paper, we extend the notion of correspondence to model multi-lingual, multi-script, and inter-lingual topics in a unified probabilistic model called the Multi-glyphic Correspondence Topic Model (MCTM). Using several metrics, we verify our approach and show that it improves over the state-of-the-art.
【Keywords】: unsupervised learning; hierarchical Bayesian models; topic models; user generated content; news; comments; multilingual; multi-glyphic; romanized text
【Paper Link】 【Pages】:318-324
【Authors】: Jinpeng Wang ; Gao Cong ; Wayne Xin Zhao ; Xiaoming Li
【Abstract】: In this paper, we propose to study the problem of identifying and classifying tweets into intent categories. For example, a tweet “I wanna buy a new car” indicates the user’s intent for buying a car. Identifying such intent tweets will have great commercial value among others. In particular, it is important that we can distinguish different types of intent tweets. We propose to classify intent tweets into six categories, namely Food & Drink, Travel, Career & Education, Goods & Services, Event and Activities, and Trifle. We propose a semi-supervised learning approach to categorizing intent tweets into the six categories. We construct a test collection by using a bootstrap method. Our experimental results show that our approach is effective in inferring intent categories for tweets.
【Keywords】:
【Paper Link】 【Pages】:325-331
【Authors】: Senzhang Wang ; Zhao Yan ; Xia Hu ; Philip S. Yu ; Zhoujun Li
【Abstract】: Studying the bursty nature of cascades in social media is practically important in many applications such as product sales prediction, disaster relief, and stock market prediction. Although cascade volume prediction has been extensively studied, how to predict when a burst will come remains an open problem. It is challenging to predict the time of the burst due to the "quick rise and fall" pattern and the diverse time spans of cascades. To this end, this paper proposes a classification-based approach for burst time prediction by utilizing and modeling rich knowledge in information diffusion. Particularly, we first propose a time-window-based approach to predict in which time window the burst will appear. This paves the way to transforming the time prediction task into a classification problem. To address the challenge that the original time series data of cascade popularity alone are not sufficient for predicting cascades with diverse magnitudes and time spans, we explore rich information-diffusion-related knowledge and model it in a scale-independent manner. Extensive experiments on a Sina Weibo reposting dataset demonstrate the superior performance of the proposed approach in accurately predicting the burst time of posts.
【Keywords】: information diffusion; time prediction; social network
【Paper Link】 【Pages】:332-338
【Authors】: Xiangyu Wang ; Dayu He ; Danyang Chen ; Jinhui Xu
【Abstract】: In this paper, we propose a novel collaborative filtering approach for predicting the unobserved links in a network (or graph) with both topological and node features. Our approach improves the well-known compressed sensing based matrix completion method by introducing a new multiple-independent-Bernoulli-distribution model as the data sampling mask. It makes better link predictions since the model is more general and better matches the data distributions in many real-world networks, such as social networks like Facebook. As a result, a satisfying stability of the prediction can be guaranteed. To obtain an accurate multiple-independent-Bernoulli-distribution model of the topological feature space, our approach adjusts the sampling of the adjacency matrix of the network (or graph) using the clustering information in the node feature space. This yields a better performance than those methods which simply combine the two types of features. Experimental results on several benchmark datasets suggest that our approach outperforms the best existing link prediction methods.
【Keywords】: Link Prediction; Compressed Sensing; Matrix Completion; Collaborative Filtering
【Paper Link】 【Pages】:339-345
【Authors】: Yu Wu ; Wei Wu ; Zhoujun Li ; Ming Zhou
【Abstract】: This paper proposes mining query subtopics from questions in community question answering (CQA). The subtopics are represented as a number of clusters of questions with keywords summarizing the clusters. The task is unique in that the subtopics from questions can not only facilitate user browsing in CQA search, but also describe aspects of queries from a question-answering perspective. The challenges of the task include how to group semantically similar questions and how to find keywords capable of summarizing the clusters. We formulate the subtopic mining task as a non-negative matrix factorization (NMF) problem and further extend the model of NMF to incorporate question similarity estimated from metadata of CQA into learning. Compared with existing methods, our method can jointly optimize question clustering and keyword extraction and encourage the former task to enhance the latter. Experimental results on large scale real world CQA datasets show that the proposed method significantly outperforms the existing methods in terms of keyword extraction, while achieving a comparable performance to the state-of-the-art methods for question clustering.
【Keywords】: topic mining; non-negative matrix factorization; keyword extraction
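To make the joint clustering-plus-keyword view concrete, here is an editor-added sketch using plain NMF from scikit-learn on a toy question set; the paper's extension that regularizes NMF with CQA metadata similarity is not reproduced, and the example questions are invented:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import NMF

questions = [
    "how do I reset my android phone",
    "best way to back up android photos",
    "is the iphone camera better than android",
    "how to transfer contacts to a new iphone",
]

# Factorize the question-term matrix V ~= W H; rows of H define subtopics.
vec = TfidfVectorizer(stop_words="english")
V = vec.fit_transform(questions)
nmf = NMF(n_components=2, random_state=0)
W = nmf.fit_transform(V)          # question-to-subtopic weights (clustering)
H = nmf.components_               # subtopic-to-term weights (keywords)

terms = vec.get_feature_names_out()
clusters = W.argmax(axis=1)
for k, row in enumerate(H):
    top = [terms[i] for i in row.argsort()[::-1][:3]]
    members = [q for q, c in zip(questions, clusters) if c == k]
    print(f"subtopic {k}: keywords={top}, questions={members}")
```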
【Paper Link】 【Pages】:346-352
【Authors】: Miao Xie ; Qiusong Yang ; Qing Wang ; Gao Cong ; Gerard de Melo
【Abstract】: Studying the spread of phenomena in social networks is critical but still not fully solved. Existing influence maximization models assume a static network, disregarding its evolution over time. We introduce the continuous time constrained influence maximization problem for dynamic diffusion networks, based on a novel diffusion model called DynaDiffuse. Although the problem is NP-hard, the influence spread functions are monotonic and submodular, enabling fast approximations on top of an innovative stochastic model checking approach. Experiments on real social network data show that our model finds higher quality solutions and our algorithm outperforms state-of-the-art alternatives.
【Keywords】: Social Network; Influence Maximization; Influence Diffusion; Dynamic Networks; Stochastic Model Checking; Greedy Algorithm
【Paper Link】 【Pages】:353-359
【Authors】: Xiaohui Yan ; Jiafeng Guo ; Yanyan Lan ; Jun Xu ; Xueqi Cheng
【Abstract】: Bursty topic discovery in microblogs is important for people to grasp essential and valuable information. However, the task is challenging since microblog posts are particularly short and noisy. This work develops a novel probabilistic model, namely the Bursty Biterm Topic Model (BBTM), to deal with the task. BBTM extends the Biterm Topic Model (BTM) by incorporating the burstiness of biterms as prior knowledge for bursty topic modeling, which enjoys the following merits: 1) like BTM, it alleviates the data sparsity problem in topic modeling over short texts; 2) it can automatically discover high-quality bursty topics in microblogs in a principled and efficient way. Extensive experiments on a standard Twitter dataset show that our approach significantly outperforms the state-of-the-art baselines.
【Keywords】: short text; topic model; text mining; bursty topic; event detection;
【Paper Link】 【Pages】:360-366
【Authors】: Bo Yang ; Xuehua Zhao
【Abstract】: The stochastic blockmodel (SBM) enables us to decompose and analyze an exploratory network without a priori knowledge about its intrinsic structure. However, the task of effectively and efficiently learning an SBM from a large-scale network is still challenging due to the high computational cost of its model selection and parameter estimation. To address this issue, we present a novel SBM learning algorithm referred to as BLOS (BLOckwise Sbm learning). Distinct from the literature, the model selection and parameter estimation of SBM are concurrently, rather than alternately, executed in BLOS by embedding the minimum message length criterion into a block-wise EM algorithm, which greatly reduces the time complexity of SBM learning without losing learning accuracy and modeling flexibility. Its effectiveness and efficiency have been tested through rigorous comparisons with the state-of-the-art methods on both synthetic and real-world networks.
【Keywords】: social network analysis;stochastic blockmodel;community detection;graph mining;model selection
【Paper Link】 【Pages】:367-373
【Authors】: Yang Yang ; Jie Tang ; Cane Wing-ki Leung ; Yizhou Sun ; Qicong Chen ; Juanzi Li ; Qiang Yang
【Abstract】: Information diffusion, which studies how information is propagated in social networks, has attracted considerable research effort recently. However, most existing approaches do not distinguish social roles that nodes may play in the diffusion process. In this paper, we study the interplay between users' social roles and their influence on information diffusion. We propose a Role-Aware INformation diffusion model (RAIN) that integrates social role recognition and diffusion modeling into a unified framework. We develop a Gibbs-sampling based algorithm to learn the proposed model using historical diffusion data. The proposed model can be applied to different scenarios. For instance, at the micro-level, the proposed model can be used to predict whether an individual user will repost a specific message; while at the macro-level, we can use the model to predict the scale and the duration of a diffusion process. We evaluate the proposed model on a real social media data set. Our model performs much better in both micro- and macro-level prediction than several alternative methods.
【Keywords】: information diffusion; social role-aware; RAIN
【Paper Link】 【Pages】:374-380
【Authors】: Weilong Yao ; Jing He ; Hua Wang ; Yanchun Zhang ; Jie Cao
【Abstract】: Pair-wise ranking methods have been widely used in recommender systems to deal with implicit feedback. They attempt to discriminate between a handful of observed items and the large set of unobserved items. In these approaches, however, user preferences and item characteristics cannot be estimated reliably due to overfitting given highly sparse data. To alleviate this problem, in this paper, we propose a novel hierarchical Bayesian framework which incorporates ``bag-of-words'' type meta-data on items into pair-wise ranking models for one-class collaborative filtering. The main idea of our method lies in extending pair-wise ranking with probabilistic topic modeling. Instead of regularizing item factors through a zero-mean Gaussian prior, our method introduces item-specific topic proportions as priors for item factors. As a by-product, interpretable latent factors for users and items may help explain recommendations in some applications. We conduct an experimental study on a real and publicly available dataset, and the results show that our algorithm is effective in providing accurate recommendations and interpreting user factors and item factors.
【Keywords】: Collaborative Filtering;Sparsity Reduction;Implicit Feedback;Collaborative Ranking
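For readers unfamiliar with pair-wise ranking, this editor-added sketch shows one SGD step of standard Bayesian Personalized Ranking in Python; the hierarchical topic-proportion priors on item factors that this paper contributes are not included, and all sizes and hyperparameters are hypothetical:

```python
import numpy as np

def bpr_step(U, V, u, i, j, lr=0.05, reg=0.01):
    """One SGD step of Bayesian Personalized Ranking: user u should
    rank observed item i above unobserved item j."""
    uu, vi, vj = U[u].copy(), V[i].copy(), V[j].copy()
    g = 1.0 / (1.0 + np.exp(uu @ (vi - vj)))   # gradient of -log sigmoid
    U[u] += lr * (g * (vi - vj) - reg * uu)
    V[i] += lr * (g * uu - reg * vi)
    V[j] += lr * (-g * uu - reg * vj)

rng = np.random.default_rng(0)
n_users, n_items, d = 100, 200, 16
U = rng.normal(scale=0.1, size=(n_users, d))   # user factors
V = rng.normal(scale=0.1, size=(n_items, d))   # item factors
feedback = [(0, 5), (0, 17), (3, 42)]          # (user, consumed item) pairs
for _ in range(1000):
    u, i = feedback[rng.integers(len(feedback))]
    j = rng.integers(n_items)                  # sampled negative item
    bpr_step(U, V, u, i, j)
```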
【Paper Link】 【Pages】:381-388
【Authors】: Quanzeng You ; Jiebo Luo ; Hailin Jin ; Jianchao Yang
【Abstract】: Sentiment analysis of online user generated content is important for many social media analytics tasks. Researchers have largely relied on textual sentiment analysis to develop systems to predict political elections, measure economic indicators, and so on. Recently, social media users are increasingly using images and videos to express their opinions and share their experiences. Sentiment analysis of such large scale visual content can help better extract user sentiments toward events or topics, such as those in image tweets, so that prediction of sentiment from visual content is complementary to textual sentiment analysis. Motivated by the needs in leveraging large scale yet noisy training data to solve the extremely challenging problem of image sentiment analysis, we employ Convolutional Neural Networks (CNN). We first design a suitable CNN architecture for image sentiment analysis. We obtain half a million training samples by using a baseline sentiment algorithm to label Flickr images. To make use of such noisy machine labeled data, we employ a progressive strategy to fine-tune the deep network. Furthermore, we improve the performance on Twitter images by inducing domain transfer with a small number of manually labeled Twitter images. We have conducted extensive experiments on manually labeled Twitter images. The results show that the proposed CNN can achieve better performance in image sentiment analysis than competing algorithms.
【Keywords】: Visual Sentiment Analysis; Convolutional Neural Network; Social Multimedia; Data Mining
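The paper designs its own CNN and trains it progressively on noisy machine-labeled Flickr images; as a loose, editor-added analogue of the fine-tuning and domain-transfer step, here is a PyTorch sketch that adapts a generic pretrained backbone to a two-class sentiment head (torchvision's ResNet-18 is purely a stand-in, not the authors' architecture):

```python
import torch
import torch.nn as nn
from torchvision import models

# Start from a generic pretrained CNN; replace the classifier head with a
# binary (positive/negative) sentiment output.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in model.parameters():
    p.requires_grad = False                        # freeze generic features
model.fc = nn.Linear(model.fc.in_features, 2)      # new trainable head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def train_step(images, labels):                    # images: (B,3,224,224)
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()

# A "progressive" schedule could later unfreeze deeper blocks, e.g.:
# for p in model.layer4.parameters(): p.requires_grad = True
```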
【Paper Link】 【Pages】:389-395
【Authors】: Chenyi Zhang ; Ke Wang ; Ee-Peng Lim ; Qinneng Xu ; Jianling Sun ; Hongkun Yu
【Abstract】: Typically a user prefers an item (e.g., a movie) because she likes certain features of the item (e.g., director, genre, producer). This observation motivates us to consider a feature-centric recommendation approach to item recommendation: instead of directly predicting the rating on items, we predict the rating on the features of items, and use such ratings to derive the rating on an item. This approach offers several advantages over the traditional item-centric approach: it incorporates more information about why a user chooses an item, it generalizes better due to the denser feature rating data, and it explains the prediction of item ratings through the predicted feature ratings. Another contribution is turning a principled item-centric solution into a feature-centric solution, instead of inventing a new algorithm that is feature-centric. This approach maximally leverages previous research. We demonstrate this approach by turning the traditional item-centric latent factor model into a feature-centric solution and demonstrate its superiority over item-centric approaches.
【Keywords】:
【Paper Link】 【Pages】:396-402
【Authors】: Hongyi Zhang ; Irwin King ; Michael R. Lyu
【Abstract】: Community detection is an important technique to understand structures and patterns in complex networks. Recently, overlapping community detection has become a trend due to the ubiquity of overlapping and nested communities in the real world. However, existing approaches have ignored the use of implicit link preference information, i.e., links can reflect a node's preference on the targets of connections it wants to build. This information has a strong impact on community detection since a node prefers to build links with nodes inside its community rather than with those outside it. In this paper, we propose a preference-based nonnegative matrix factorization (PNMF) model to incorporate implicit link preference information. Unlike conventional matrix factorization approaches, which simply approximate the original adjacency matrix in value, our model maximizes the likelihood of the preference order for each node, following the intuition that a node prefers its neighbors to other nodes. Our model overcomes the indiscriminate penalty problem, in which non-linked pairs inside one community are penalized in objective functions equally to those across two communities. We propose a learning algorithm which can learn a node-community membership matrix via stochastic gradient descent with bootstrap sampling. We evaluate our PNMF model on several real-world networks. Experimental results show that our model outperforms state-of-the-art approaches and can be applied to large datasets.
【Keywords】:
【Paper Link】 【Pages】:403-409
【Authors】: Qi Zhang ; Yeyun Gong ; Ya Guo ; Xuanjing Huang
【Abstract】: The task of predicting retweet behavior is an important and essential step for various social network applications, such as business intelligence, popular event prediction, and so on. Due to increasing requirements, in recent years the task has attracted extensive attention. In this work, we propose a novel method using non-parametric statistical models to combine structural, textual, and temporal information together to predict retweet behavior. To evaluate the proposed method, we collect a large number of microblogs and their corresponding social networks from a real microblog service. Experimental results on the constructed dataset demonstrate that the proposed method can achieve better performance than state-of-the-art methods. The relative improvement of the proposed method over the method using only textual information is more than 38.5% in terms of F1-Score.
【Keywords】:
【Paper Link】 【Pages】:410-416
【Authors】: Weinan Zhang ; Zhaoyan Ming ; Yu Zhang ; Ting Liu ; Tat-Seng Chua
【Abstract】: Question retrieval in current community-based question answering (CQA) services does not, in general, work well for long and complex queries. One of the main difficulties lies in the word mismatch between queries and candidate questions. Existing solutions try to expand the queries at word level, but they usually fail to consider concept level enrichment. In this paper, we explore a pivot language translation based approach to derive the paraphrases of key concepts. We further propose a unified question retrieval model which integrates the key concepts and their paraphrases for the query question. Experimental results demonstrate that the paraphrase enhanced retrieval model significantly outperforms the state-of-the-art models in question retrieval.
【Keywords】:
【Paper Link】 【Pages】:417-424
【Authors】: Xinjie Zhou ; Xiaojun Wan ; Jianguo Xiao
【Abstract】: User-generated reviews are valuable resources for decision making. Identifying the aspect categories discussed in a given review sentence (e.g., “food” and “service” in restaurant reviews) is an important task of sentiment analysis and opinion mining. Given a predefined aspect category set, most previous studies leverage hand-crafted features and a classification algorithm to accomplish the task. The crucial step to achieve better performance is feature engineering, which consumes much human effort and may be unstable when the product domain changes. In this paper, we propose a representation learning approach to automatically learn useful features for aspect category detection. Specifically, a semi-supervised word embedding algorithm is first proposed to obtain continuous word representations on a large set of reviews with noisy labels. Afterwards, we propose to generate deeper and hybrid features through neural networks stacked on the word vectors. A logistic regression classifier is finally trained with the hybrid features to predict the aspect category. The experiments are carried out on a benchmark dataset released by SemEval-2014. Our approach achieves the state-of-the-art performance and outperforms the best participating team as well as a few strong baselines.
【Keywords】: aspect category detection; opinion mining; representation learning
【Paper Link】 【Pages】:425-431
【Authors】: Virginia Ortiz Andersson ; Ricardo Matsumura de Araújo
【Abstract】: Uniquely identifying individuals using anthropometric and gait data allows for passive biometric systems, where cooperation from the subjects being identified is not required. In this paper, we report on experiments using a novel data set composed of 140 individuals walking in front of a Microsoft Kinect sensor. We provide a methodology to extract anthropometric and gait features from this data and show results of applying different machine learning algorithms on subject identification tasks. Focusing on KNN classifiers, we discuss how accuracy varies in different settings, including the number of individuals in the gallery, the types of attributes used, and the number of neighbors considered. Finally, we compare the obtained results with other results in the literature, showing that our approach has comparable accuracy for large galleries.
【Keywords】: biometrics; kinect; machine learning
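A minimal editor-added sketch of the KNN identification setup described above, with scikit-learn; the feature matrix here is random stand-in data (real experiments would use Kinect-derived anthropometric and gait features), so the printed accuracy is meaningless except as a smoke test:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Rows: one walk sequence per sample; columns: anthropometric features
# (e.g., limb lengths) plus gait features (e.g., stride length, speed).
rng = np.random.default_rng(0)
X = rng.normal(size=(140 * 4, 12))   # hypothetical: 4 sequences per subject
y = np.repeat(np.arange(140), 4)     # subject identity labels (gallery of 140)

clf = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=3))
scores = cross_val_score(clf, X, y, cv=4)
print("identification accuracy:", scores.mean())
```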
【Paper Link】 【Pages】:432-438
【Authors】: Sarah M. Erfani ; Mahsa Baktashmotlagh ; Sutharshan Rajasegarar ; Shanika Karunasekera ; Christopher Leckie
【Abstract】: The problem of unsupervised anomaly detection arises in a wide variety of practical applications. While one-class support vector machines have demonstrated their effectiveness as an anomaly detection technique, their ability to model large datasets is limited due to their memory and time complexity for training. To address this issue for supervised learning of kernel machines, there has been growing interest in random projection methods as an alternative to the computationally expensive problems of kernel matrix construction and support vector optimisation. In this paper we leverage the theory of nonlinear random projections and propose the Randomised One-class SVM (R1SVM), which is an efficient and scalable anomaly detection technique that can be trained on large-scale datasets. Our empirical analysis on several real-life and synthetic datasets shows that our randomised 1SVM algorithm achieves comparable or better accuracy to deep autoencoder and traditional kernelised approaches for anomaly detection, while being approximately 100 times faster in training and testing.
【Keywords】: Anomaly detection, Outlier detection, One-class SVM, Randomisation, Random Projection
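The R1SVM idea (a nonlinear random projection followed by a linear one-class SVM) can be approximated with off-the-shelf scikit-learn pieces; this editor-added sketch uses random Fourier features via RBFSampler, which may differ from the paper's exact randomisation, and the data is synthetic:

```python
import numpy as np
from sklearn.kernel_approximation import RBFSampler
from sklearn.pipeline import make_pipeline
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X_train = rng.normal(size=(5000, 20))                   # mostly normal points
X_test = np.vstack([rng.normal(size=(50, 20)),          # normal
                    rng.normal(loc=6, size=(50, 20))])  # anomalous

# Nonlinear random projection (random Fourier features) followed by a
# *linear* one-class SVM, avoiding the O(n^2) kernel matrix.
model = make_pipeline(
    RBFSampler(gamma=0.1, n_components=100, random_state=0),
    OneClassSVM(kernel="linear", nu=0.05),
)
model.fit(X_train)
pred = model.predict(X_test)                            # +1 normal, -1 anomaly
print("flagged anomalies:", (pred == -1).sum())
```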
【Paper Link】 【Pages】:439-445
【Authors】: Xiaomin Fang ; Rong Pan ; Guoxiang Cao ; Xiuqiang He ; Wenyuan Dai
【Abstract】: Personalized tag recommendation systems recommend a list of tags to a user who is about to annotate an item, exploiting the individual's preferences and the characteristics of the items. Tensor factorization techniques have been applied to many applications, such as tag recommendation. Models based on Tucker Decomposition can achieve good performance but require a lot of computation power. On the other hand, models based on Canonical Decomposition can run in linear time and are more feasible for online recommendation. In this paper, we propose a novel method for personalized tag recommendation that can be considered a nonlinear extension of Canonical Decomposition. Unlike linear tensor factorization, we exploit the Gaussian radial basis function to increase the model's capacity. The experimental results show that our proposed method outperforms the state-of-the-art methods for tag recommendation on real datasets and performs well even with a small number of features, which verifies that our models can make better use of features.
【Keywords】: tensor decomposition, nonlinear, personalized recommendation
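An editor-added sketch contrasting a linear Canonical Decomposition score with one plausible Gaussian-RBF variant for (user, item, tag) triples; the paper's exact nonlinear formulation and training objective are not reproduced, and all factor matrices and function names here are hypothetical placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, n_tags, d = 20, 30, 15, 4
U = rng.normal(size=(n_users, d))   # user factors (would be learned)
I = rng.normal(size=(n_items, d))   # item factors
T = rng.normal(size=(n_tags, d))    # tag factors

def score_cp(u, i, t):
    """Linear Canonical Decomposition score: sum_k U_uk * I_ik * T_tk."""
    return np.sum(U[u] * I[i] * T[t])

def score_rbf(u, i, t, gamma=1.0):
    """Nonlinear variant: replace inner products with Gaussian RBF
    similarities between the factor vectors (illustrative only)."""
    ui = np.exp(-gamma * np.sum((U[u] - I[i]) ** 2))
    ut = np.exp(-gamma * np.sum((U[u] - T[t]) ** 2))
    it = np.exp(-gamma * np.sum((I[i] - T[t]) ** 2))
    return ui * ut * it

# Recommend tags for (user 0, item 0) by ranking scores.
ranked = np.argsort([-score_rbf(0, 0, t) for t in range(n_tags)])
```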
【Paper Link】 【Pages】:446-453
【Authors】: Marzyeh Ghassemi ; Marco A. F. Pimentel ; Tristan Naumann ; Thomas Brennan ; David A. Clifton ; Peter Szolovits ; Mengling Feng
【Abstract】: The ability to determine patient acuity (or severity of illness) has immediate practical use for clinicians. We evaluate the use of multivariate timeseries modeling with multi-task Gaussian process (GP) models using noisy, incomplete, sparse, heterogeneous and unevenly-sampled clinical data, including both physiological signals and clinical notes. The learned multi-task GP (MTGP) hyperparameters are then used to assess and forecast patient acuity. Experiments were conducted with two real clinical data sets acquired from ICU patients: firstly, estimating cerebrovascular pressure reactivity, an important indicator of secondary damage for traumatic brain injury patients, by learning the interactions between intracranial pressure and mean arterial blood pressure signals, and secondly, mortality prediction using clinical progress notes. In both cases, MTGPs provided improved results: an MTGP model provided better results than single-task GP models for signal interpolation and forecasting (0.91 vs 0.69 RMSE), and the use of MTGP hyperparameters obtained improved results when used as additional classification features (0.812 vs 0.788 AUC).
【Keywords】: Multivariate timeseries modeling; Modeling of sparse and heterogeneous data; clinical data modeling; severity of illness assessment; Multi-task Gaussian Process
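An editor-added numpy sketch of the core multi-task GP computation: an intrinsic-coregionalization kernel shares statistical strength across two channels, so a sparsely sampled signal borrows structure from a densely sampled one. The task-correlation matrix B, length-scale, and noise level would normally be learned; here they are fixed, hypothetical values:

```python
import numpy as np

def mtgp_posterior_mean(t, task, y, t_star, task_star, B, ell=1.0, noise=0.1):
    """Posterior mean of a multi-task GP with an intrinsic coregionalization
    kernel: K((t,i),(t',j)) = B[i,j] * rbf(t,t')."""
    def rbf(a, b):
        return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell**2)
    K = B[np.ix_(task, task)] * rbf(t, t)
    K_star = B[np.ix_(task_star, task)] * rbf(t_star, t)
    alpha = np.linalg.solve(K + noise**2 * np.eye(len(t)), y)
    return K_star @ alpha

# Two correlated channels; channel 1 is sampled very sparsely.
t0 = np.linspace(0, 10, 50); y0 = np.sin(t0)
t1 = np.array([1.0, 4.0, 8.0]); y1 = np.sin(t1) + 0.5
t = np.concatenate([t0, t1]); y = np.concatenate([y0, y1])
task = np.concatenate([np.zeros(50, int), np.ones(3, int)])
B = np.array([[1.0, 0.9], [0.9, 1.0]])          # assumed task correlation
t_star = np.linspace(0, 10, 5)                  # forecast channel 1 here
mean = mtgp_posterior_mean(t, task, y, t_star, np.ones(5, int), B)
```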
【Paper Link】 【Pages】:454-460
【Authors】: Fei Mi ; Dit-Yan Yeung
【Abstract】: With the enormous scale of massive open online courses (MOOCs), peer grading is vital for addressing the assessment challenge for open-ended assignments or exams while at the same time providing students with an effective learning experience through involvement in the grading process. Most existing MOOC platforms use simple schemes for aggregating peer grades, e.g., taking the median or mean. To enhance these schemes, some recent research attempts have developed machine learning methods under either the cardinal setting (for absolute judgment) or the ordinal setting (for relative judgment). In this paper, we seek to study both cardinal and ordinal aspects of peer grading within a common framework. First, we propose novel extensions to some existing probabilistic graphical models for cardinal peer grading. Not only do these extensions give superior performance in cardinal evaluation, but they also outperform conventional ordinal models in ordinal evaluation. Next, we combine cardinal and ordinal models by augmenting ordinal models with cardinal predictions as prior. Such combination can achieve further performance boosts in both cardinal and ordinal evaluations, suggesting a new research direction to pursue for peer grading on MOOCs. Extensive experiments have been conducted using real peer grading data from a course called “Science, Technology, and Society in China I” offered by HKUST on the Coursera platform.
【Keywords】: Peer Grading; Peer Assessment; MOOCs; Cardinal Peer Grading; Ordinal Peer Grading;
【Paper Link】 【Pages】:461-469
【Authors】: Piotr Lech Szczepanski ; Mateusz Krzysztof Tarkowski ; Tomasz Pawel Michalak ; Paul Harrenstein ; Michael Wooldridge
【Abstract】: Solution concepts from cooperative game theory, such as the Shapley value or the Banzhaf index, have recently been advocated as interesting extensions of standard measures of node centrality in networks. While this direction of research is promising, the computation of game-theoretic centrality can be challenging. In an attempt to address the computational issues of game-theoretic network centrality, we present a generic framework for constructing game-theoretic network centralities. We prove that all extensions that can be expressed in this framework are computable in polynomial time. Using our framework, we present the first game-theoretic extensions of weighted and normalized degree centralities, impact factor centrality, distance-scaled and normalized betweenness centrality, and closeness and normalized closeness centralities.
【Keywords】: game-theoretic network centrality; Shapley value; semivalue
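As one concrete instance of polynomial-time game-theoretic centrality, the closed-form Shapley value of the "sphere of influence" degree game from Michalak et al. (2013) is easy to compute; this editor-added snippet illustrates that known prior result, not this paper's more general framework:

```python
import networkx as nx

def shapley_degree_centrality(G):
    """Closed-form Shapley value of the coalitional game
    nu(C) = |{nodes in C or adjacent to C}| (Michalak et al., 2013):
    SV(v) = sum over u in {v} union N(v) of 1 / (1 + deg(u))."""
    return {v: sum(1.0 / (1 + G.degree(u))
                   for u in [v, *G.neighbors(v)])
            for v in G}

G = nx.karate_club_graph()
sv = shapley_degree_centrality(G)
print(max(sv, key=sv.get))   # most central node under this game
```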
【Paper Link】 【Pages】:470-476
【Authors】: Suhang Wang ; Jiliang Tang ; Huan Liu
【Abstract】: Sparse learning has been proven to be a powerful technique in supervised feature selection, which allows feature selection to be embedded into the classification (or regression) problem. In recent years, increasing attention has been paid to applying sparse learning in unsupervised feature selection. Due to the lack of label information, the vast majority of these algorithms usually generate cluster labels via clustering algorithms and then formulate unsupervised feature selection as sparse-learning-based supervised feature selection with these generated cluster labels. In this paper, we propose a novel unsupervised feature selection algorithm, EUFS, which directly embeds feature selection into a clustering algorithm via sparse learning without the transformation. The Alternating Direction Method of Multipliers is used to address the optimization problem of EUFS. Experimental results on various benchmark datasets demonstrate the effectiveness of the proposed framework EUFS.
【Keywords】: Unsupervised Feature Selection; Sparse Learning; Clustering
【Paper Link】 【Pages】:477-484
【Authors】: Yongqing Wang ; Huawei Shen ; Shenghua Liu ; Xueqi Cheng
【Abstract】: Predicting cascade dynamics has important implications for understanding information propagation and launching viral marketing. Previous works mainly adopt a pair-wise manner, modeling the propagation probability between pairs of users using n^2 independent parameters for n users. Consequently, these models suffer from a severe overfitting problem, especially for pairs of users without direct interactions, limiting their prediction accuracy. Here we propose to model cascade dynamics by learning two low-dimensional user-specific vectors from observed cascades, capturing each user's influence and susceptibility respectively. This model requires far fewer parameters and thus combats the overfitting problem. Moreover, it can naturally model context-dependent factors like the cumulative effect in information propagation. Extensive experiments on a synthetic dataset and a large-scale microblogging dataset demonstrate that this model outperforms the existing pair-wise models at predicting cascade dynamics, cascade size, and "who will be retweeted."
【Keywords】: information cascades; influence; susceptibility; viral marketing; social media
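An editor-added sketch of the modeling idea: give each user a low-dimensional influence vector and susceptibility vector, so activation probabilities need 2·n·d parameters instead of n^2 pair-wise ones. The logistic link and SGD update below are one plausible instantiation, not necessarily the paper's exact likelihood:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, d = 50, 8
I = rng.normal(scale=0.1, size=(n_users, d))   # influence vectors
S = rng.normal(scale=0.1, size=(n_users, d))   # susceptibility vectors

def sgd_step(u, v, activated, lr=0.1, reg=0.01):
    """Assumed model P(u activates v) = sigmoid(I_u . S_v);
    one logistic-regression SGD step on an observed (non-)activation."""
    p = 1.0 / (1.0 + np.exp(-I[u] @ S[v]))
    g = activated - p                          # gradient of log-likelihood
    Iu = I[u].copy()
    I[u] += lr * (g * S[v] - reg * I[u])
    S[v] += lr * (g * Iu - reg * S[v])

# Observed cascades flattened to (influencer, target, did_repost) triples.
events = [(0, 1, 1), (0, 2, 0), (3, 1, 1)]
for _ in range(500):
    u, v, a = events[rng.integers(len(events))]
    sgd_step(u, v, a)
```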
【Paper Link】 【Pages】:485-491
【Authors】: David Balduzzi ; Hastagiri Vanchinathan ; Joachim M. Buhmann
【Abstract】: Error backpropagation is an extremely effective algorithm for assigning credit in artificial neural networks. However, weight updates under Backprop depend on lengthy recursive computations and require separate output and error messages — features that are not shared by biological neurons and that are perhaps unnecessary. In this paper, we revisit Backprop and the credit assignment problem. We first decompose Backprop into a collection of interacting learning algorithms; provide regret bounds on the performance of these sub-algorithms; and factorize Backprop's error signals. Using these results, we derive a new credit assignment algorithm for nonparametric regression, Kickback, that is significantly simpler than Backprop. Finally, we provide a sufficient condition for Kickback to follow error gradients, and show that Kickback matches Backprop's performance on real-world regression benchmarks.
【Keywords】: neural networks; deep learning; error backpropagation; credit assignment; gradient descent
【Paper Link】 【Pages】:492-500
【Authors】: Josefina Sierra-Santibáñez
【Abstract】: This paper presents an agent-based model of the emergence and transmission of a language system for the expression of logical combinations of propositions. The model assumes the agents have some cognitive capacities for invention, adoption, repair, induction and adaptation, a common vocabulary for basic categories, and the ability to construct complex concepts using recursive combinations of basic categories with logical categories. It also supposes the agents initially do not have a vocabulary for logical categories (i.e. logical connectives), nor grammatical constructions for expressing logical combinations of basic categories through language. The results of the experiments we have performed show that a language system for the expression of logical combinations emerges as a result of a process of self-organisation of the agents' linguistic interactions. Such a language system is concise, because it only uses words and grammatical constructions for three logical categories (i.e. and, or, not). It is also expressive, since it allows the communication of logical combinations of categories of the same complexity as propositional logic formulas, using linguistic devices such as syntactic categories, word order and auxiliary words. Furthermore, it is easy to learn and reliably transmitted across generations, according to the results of our experiments.
【Keywords】: Cognitive Modeling; Symbolic AI; Simulating Humans; Logic; Language Acquisition; Adaptive Behavior
【Paper Link】 【Pages】:501-507
【Authors】: Joseph A. Blass ; Kenneth D. Forbus
【Abstract】: Moral reasoning is important to accurately model as AI systems become ever more integrated into our lives. Moral reasoning is rapid and unconscious; analogical reasoning, which can be unconscious, is a promising approach to model moral reasoning. This paper explores the use of analogical generalizations to improve moral reasoning. Analogical reasoning has already been used to successfully model moral reasoning in the MoralDM model, but it exhaustively matches across all known cases, which is computationally intractable and cognitively implausible for human-scale knowledge bases. We investigate the performance of an extension of MoralDM to use the MAC/FAC model of analogical retrieval over three conditions, across a set of highly confusable moral scenarios.
【Keywords】: Analogy; Moral Reasoning; Artificial Intelligence; Cognitive Science
【Paper Link】 【Pages】:508-514
【Authors】: Erik Cambria ; Jie Fu ; Federica Bisio ; Soujanya Poria
【Abstract】: Predicting the affective valence of unknown multi-word expressions is key for concept-level sentiment analysis. AffectiveSpace 2 is a vector space model, built by means of random projection, that allows for reasoning by analogy on natural language concepts. By reducing the dimensionality of affective common-sense knowledge, the model allows semantic features associated with concepts to be generalized and, hence, allows concepts to be intuitively clustered according to their semantic and affective relatedness. Such an affective intuition (so called because it does not rely on explicit features, but rather on implicit analogies) enables the inference of emotions and polarity conveyed by multi-word expressions, thus achieving efficient concept-level sentiment analysis.
【Keywords】:
【Paper Link】 【Pages】:515-521
【Authors】: Alfredo Gabaldon ; Pat Langley
【Abstract】: In recent work, Langley et al. (2014) introduced UMBRA, a system for plan and dialogue understanding. The program applies a form of abductive inference to generate explanations incrementally from relational descriptions of observed behavior and knowledge in the form of rules. Although UMBRA's creators described the system architecture, knowledge, and inferences, along with experimental studies of its operation, they did not provide a formalization of its structures or processes. In this paper, we analyze both aspects of the architecture in terms of the Situation Calculus — a classical logic for reasoning about dynamical systems — and give a specification of the inference task the system performs. After this, we state some properties of this formalization that are desirable for the task of incremental dialogue understanding. We conclude by discussing related work and describing our plans for additional research.
【Keywords】: Dialogue understanding, Belief Ascription, Incremental processing
【Paper Link】 【Pages】:522-528
【Authors】: JungWoo Ha ; Kyung-Min Kim ; Byoung-Tak Zhang
【Abstract】: Learning mutually-grounded vision-language knowledge is a foundational task for cognitive systems and human-level artificial intelligence. Most knowledge-learning techniques focus on single-modality representations in a static environment with a fixed set of data. Here, we explore an ecologically more plausible setting by using a stream of cartoon videos to build vision-language concept hierarchies continuously. This approach is motivated by the literature on cognitive development in early childhood. We present the model of deep concept hierarchy (DCH) that enables the progressive abstraction of concept knowledge at multiple levels. We develop a stochastic method for graph construction, i.e. a graph Monte Carlo algorithm, to efficiently search the huge compositional space of vision-language concepts. The concept hierarchies are built incrementally and can handle concept drift, allowing them to be deployed in lifelong learning environments. Using a series of approximately 200 episodes of educational cartoon videos, we demonstrate the emergence and evolution of the concept hierarchies as the video stories unfold. We also present the application of the deep concept hierarchies for context-dependent translation between vision and language, i.e. the transcription of a visual scene into text and the generation of visual imagery from text.
【Keywords】: Deep Concept Hierarchy; Multimodal Concept Learning; Hypergraphs; Graph Monte Carlo; Visual-Linguistic Knowledge; Vision-Language Translation; Cartoon Videos
【Paper Link】 【Pages】:529-536
【Authors】: Jesse Hoey ; Tobias Schröder
【Abstract】: Notions of identity and of the self have long been studied in social psychology and sociology as key guiding elements of social interaction and coordination. In the AI of the future, these notions will also play a role in producing natural, socially appropriate artificially intelligent agents that encompass subtle and complex human social and affective skills. We propose here a Bayesian generalization of the sociological affect control theory of self as a theoretical foundation for socio-affectively skilled artificial agents. This theory posits that each human maintains an internal model of his or her deep sense of "self" that captures their emotional, psychological, and socio-cultural sense of being in the world. The "self" is then externalised as an identity within any given interpersonal and institutional situation, and this situational identity is the person's local (in space and time) representation of the self. Situational identities govern the actions of humans according to affect control theory. Humans will seek situations that allow them to enact identities consistent with their sense of self. This consistency is cumulative over time: if some parts of a person's self are not actualized regularly, the person will have a growing feeling of inauthenticity that they will seek to resolve. In our present generalisation, the self is represented as a probability distribution, allowing it to be multi-modal (a person can maintain multiple different identities), uncertain (a person can be unsure about who they really are), and learnable (agents can learn the identities and selves of other agents). We show how the Bayesian affect control theory of self can underpin artificial agents that are socially intelligent.
【Keywords】: Affect Control Theory; Probabilistic Reasoning; Markov decision process; Identity; Self; Affective Computing
【Paper Link】 【Pages】:537-543
【Authors】: Pat Langley ; Adam Arvay
【Abstract】: This paper presents a novel approach to inductive process modeling, the task of constructing a quantitative account of dynamical behavior from time-series data and background knowledge. We review earlier work on this topic, noting its reliance on methods that evaluate entire model structures and use repeated simulation to estimate parameters, which together make severe computational demands. In response, we present an alternative method for process model induction that assumes each process has a rate, that this rate is determined by an algebraic expression, and that changes due to a process are directly proportional to its rate. We describe RPM, an implemented system that incorporates these ideas, and we report analyses and experiments that suggest it scales well to complex domains and data sets. In closing, we discuss related research and outline ways to extend the framework.
【Keywords】: Scientific discovery, Process models, Explanation
【Paper Link】 【Pages】:544-550
【Authors】: Justin Li ; John E. Laird
【Abstract】: This paper presents the first functional evaluation of spontaneous, uncued retrieval from long-term memory in a cognitive architecture. The key insight is that current deliberate cued retrieval mechanisms require the agent to have knowledge of when and what to retrieve --- knowledge that may be missing or incorrect. Spontaneous uncued retrieval eliminates these requirements through automatic retrievals that use the agent's problem solving context as a heuristic for relevance, thus supplementing deliberate cued retrieval. Using constraints derived from this insight, we sketch the space of spontaneous retrieval mechanisms and describe an implementation of spontaneous retrieval in Soar together with an agent that takes advantage of that mechanism. Empirical evidence is provided in the Missing Link word-puzzle domain, where agents using spontaneous retrieval outperform agents without that capability, leading us to conclude that spontaneous retrieval can be a useful mechanism and is worth further exploration.
【Keywords】: spontaneous retrieval; missing link
【Paper Link】 【Pages】:551-557
【Authors】: Chen Liang ; Kenneth D. Forbus
【Abstract】: Fast and efficient learning over large bodies of commonsense knowledge is a key requirement for cognitive systems. Semantic web knowledge bases provide an important new resource of ground facts from which plausible inferences can be learned. This paper applies structured logistic regression with analogical generalization (SLogAn) to make use of structural as well as statistical information to achieve rapid and robust learning. SLogAn achieves state-of-the-art performance in a standard triplet classification task on two data sets and, in addition, can provide understandable explanations for its answers.
【Keywords】: analogy; machine learning; semantic web; reasoning
【Paper Link】 【Pages】:558-564
【Authors】: Peter Lindes ; Deryle W. Lonsdale ; David W. Embley
【Abstract】: Machine reading is a relatively new field that features computer programs designed to read flowing text and extract fact assertions expressed by the narrative content. This task involves two core technologies: natural language processing (NLP) and information extraction (IE). In this paper we describe a machine reading system that we have developed within a cognitive architecture. We show how we have integrated into the framework several levels of knowledge for a particular domain, ideas from cognitive semantics and construction grammar, plus tools from prior NLP and IE research. The result is a system that is capable of reading and interpreting complex and fairly idiosyncratic texts in the family history domain. We describe the architecture and performance of the system. After presenting the results from several evaluations that we have carried out, we summarize possible future directions.
【Keywords】: cognitive agent; cognitive architecture; ontology
【Paper Link】 【Pages】:565-571
【Authors】: Matthew D. McLure ; Scott E. Friedman ; Kenneth D. Forbus
【Abstract】: Concept learning is a central problem for cognitive systems. Generalization techniques can help organize examples by their commonalities, but comparisons with non-examples, near-misses, can provide discrimination. Early work on near-misses required hand-selected examples by a teacher who understood the learner’s internal representations. This paper introduces Analogical Learning by Integrating Generalization and Near-misses (ALIGN) and describes three key advances. First, domain-general cognitive models of analogical processes are used to handle a wider range of examples. Second, ALIGN’s analogical generalization process constructs multiple probabilistic representations per concept via clustering, and hence can learn disjunctive concepts. Finally, ALIGN uses unsupervised analogical retrieval to find its own near-miss examples. We show that ALIGN out-performs analogical generalization on two perceptual data sets: (1) hand-drawn sketches; and (2) geospatial concepts from strategy-game maps.
【Keywords】: analogy; concept learning; near-misses; sketch recognition
【Paper Link】 【Pages】:572-578
【Authors】: Marjorie McShane ; Petr Babkin
【Abstract】: Ellipsis is a linguistic process that makes certain aspects of text meaning not directly traceable to surface text elements and, therefore, inaccessible to most language processing technologies. However, detecting and resolving ellipsis is an indispensable capability for language-enabled intelligent agents. The key insight of the work presented here is that not all cases of ellipsis are equally difficult: some can be detected and resolved with high confidence even before we are able to build agents with full human-level semantic and pragmatic understanding of text. This paper describes a fully automatic, implemented and evaluated method of treating one class of ellipsis: elided scopes of modality. Our cognitively-inspired approach, which centrally leverages linguistic principles, has also been applied to overt referring expressions with equally promising results.
【Keywords】: ellipsis; reference resolution; modality
【Paper Link】 【Pages】:579-585
【Authors】: Daniel R. Schlegel ; Stuart C. Shapiro
【Abstract】: There are very few reasoners which combine natural deduction and subsumption reasoning, and there are none which do so while supporting concurrency. Inference Graphs are a graph-based inference mechanism using an expressive first-order logic, capable of subsumption and natural deduction reasoning using concurrency. Evaluation of concurrency characteristics on a combined natural deduction and subsumption reasoning problem has shown linear speedup with the number of processors.
【Keywords】: Automated Reasoning; Inference Graphs; Subsumption; Natural Deduction; SNePS
【Paper Link】 【Pages】:586-592
【Authors】: Miaolong Yuan ; Bo Tian ; Vui Ann Shim ; Huajin Tang ; Haizhou Li
【Abstract】: Hippocampal place cells and entorhinal grid cells have been hypothesized to be able to form map-like spatial representation of the environment, namely cognitive map. In most prior approaches, either neural network methods or only hippocampal models are used for building cognitive maps, lacking biological fidelity to the entorhinal-hippocampal system. This paper presents a novel computational model to build cognitive maps of real environments using both place cells and grid cells. The proposed model includes two major components: (1) A competitive Hebbian learning algorithm is used to select velocity-coupled grid cell population activities, which path-integrate self-motion signals to determine computation of place cell population activities; (2) Visual cues of environments are used to correct the accumulative errors intrinsically associated with the path integration process. Experiments performed on a mobile robot show that cognitive maps of the real environment can be efficiently built. The proposed model would provide an alternative neuro-inspired approach for robotic mapping, navigation and localization.
【Keywords】: cognitive map; robotic spatial cognition; brain-inspired SLAM
【Paper Link】 【Pages】:593-600
【Authors】: Keyang Zhang ; Kenny Q. Zhu ; Seung-won Hwang
【Abstract】: To judge how much a pair of words (or texts) are semantically related is a cognitive process. However, previous algorithms for computing semantic relatedness are largely based on co-occurrences within textual windows, and do not actively leverage cognitive human perceptions of relatedness. To bridge this perceptional gap, we propose to utilize free association as signals to capture such human perceptions. However, free association, being manually evaluated, has limited lexical coverage and is inherently sparse. We propose to expand lexical coverage and overcome sparseness by constructing an association network of terms and concepts that combines signals from free association norms and five types of co-occurrences extracted from the rich structures of Wikipedia. Our evaluation results validate that simple algorithms on this network give competitive results in computing semantic relatedness between words and between short texts.
【Keywords】: Concept Association; Cognitive computing; Semantic Similarity
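One example of a simple relatedness algorithm on such an association network is symmetrized personalized PageRank; this editor-added toy uses networkx (a recent version that tolerates partial personalization dicts) and an invented six-term graph, and is not necessarily the measure the authors evaluate:

```python
import networkx as nx

# Toy association network mixing free-association and co-occurrence edges;
# all terms and weights are invented for illustration.
G = nx.Graph()
G.add_weighted_edges_from([
    ("doctor", "nurse", 0.9), ("doctor", "hospital", 0.8),
    ("nurse", "hospital", 0.7), ("car", "road", 0.9),
    ("hospital", "ambulance", 0.6), ("ambulance", "car", 0.4),
])

def relatedness(G, a, b, alpha=0.85):
    """Symmetrized personalized-PageRank relatedness between two terms."""
    pr_a = nx.pagerank(G, alpha=alpha, personalization={a: 1.0})
    pr_b = nx.pagerank(G, alpha=alpha, personalization={b: 1.0})
    return pr_a.get(b, 0.0) + pr_b.get(a, 0.0)

print(relatedness(G, "doctor", "ambulance"))   # > relatedness(doctor, road)
```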
【Paper Link】 【Pages】:601-607
【Authors】: Saima Aman ; Charalampos Chelmis ; Viktor K. Prasanna
【Abstract】: Applications in sustainability domains such as energy, transportation, and natural resource and environment monitoring increasingly use sensors for collecting data and sending it back to centrally located processing nodes. While data can usually be collected by the sensors at a very high speed, in many cases it cannot be sent back to central nodes at the frequency required for fast and real-time modeling and decision-making. This may be due to physical limitations of the transmission networks, or due to consumers limiting frequent transmission of data from sensors located at their premises for security and privacy concerns. We propose a novel solution to the problem of making short-term predictions in the absence of real-time data from sensors. A key implication of our work is that by using real-time data from only a small subset of influential sensors, we are able to make predictions for all sensors. We evaluated our approach with large real-world electricity consumption data collected from smart meters in Los Angeles, and the results show that between prediction horizons of 2 to 8 hours, despite the lack of real-time data, our influence model outperforms the baseline model that uses real-time data. Also, when using partial real-time data from only ≈ 7% of smart meters (the influential ones), we witness a prediction error increase of only ≈ 0.5% over the baseline, thus demonstrating the usefulness of our method for practical scenarios.
【Keywords】: Prediction Models; Partial Data; Smart Grid
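An editor-added sketch of the "predict every meter from a few influential meters" idea, using plain ridge regression on synthetic readings; the paper's influence model and its procedure for selecting influential meters are not reproduced (here the subset is chosen at random):

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
T, n_meters, n_influential = 500, 100, 7          # ~7% of meters
X_all = rng.normal(size=(T, n_meters)).cumsum(axis=0)  # hypothetical readings
influential = rng.choice(n_meters, n_influential, replace=False)

horizon = 4                            # predict 4 steps ahead (e.g., 4 hours)
X = X_all[:-horizon][:, influential]   # real-time data from the subset only
models = {}
for m in range(n_meters):
    y = X_all[horizon:, m]             # each meter's future consumption
    models[m] = Ridge(alpha=1.0).fit(X, y)

# At prediction time, only the influential meters report in real time.
latest = X_all[-1, influential].reshape(1, -1)
forecast = {m: models[m].predict(latest)[0] for m in models}
```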
【Paper Link】 【Pages】:608-614
【Authors】: Filippo Bistaffa ; Alessandro Farinelli ; Sarvapali D. Ramchurn
【Abstract】: We consider the Social Ridesharing (SR) problem, where a set of commuters, connected through a social network, arrange one-time rides at short notice. In particular, we focus on the associated optimisation problem of forming cars to minimise the travel cost of the overall system, modelling this problem as a graph-constrained coalition formation (GCCF) problem, where the set of feasible coalitions is restricted by a graph (i.e., the social network). Moreover, we significantly extend the state-of-the-art algorithm for GCCF, i.e., the CFSS algorithm, to solve our GCCF model of the SR problem. Our empirical evaluation uses real datasets for both spatial (GeoLife) and social data (Twitter) to validate the applicability of our approach in a realistic application scenario. Empirical results show that our approach computes optimal solutions for systems of medium scale (up to 100 agents), providing significant cost reductions (up to -36.22%). Moreover, we can provide approximate solutions for very large systems (i.e., up to 2000 agents) with good quality guarantees (i.e., an approximation ratio of 1.41 in the worst case) within minutes (i.e., 100 seconds).
【Keywords】: Coalition Formation; Social Networks; Ridesharing
【Paper Link】 【Pages】:615-621
【Authors】: Frits de Nijs ; Matthijs T. J. Spaan ; Mathijs de Weerdt
【Abstract】: Renewable power sources such as wind and solar are inflexible in their energy production, which requires demand to rapidly follow supply in order to maintain energy balance. Promising controllable demands are air-conditioners and heat pumps which use electric energy to maintain a temperature at a setpoint. Such Thermostatically Controlled Loads (TCLs) have been shown to be able to follow a power curve using reactive control. In this paper we investigate the use of planning under uncertainty to pro-actively control an aggregation of TCLs to overcome temporary grid imbalance. We present a formal definition of the planning problem under consideration, which we model using the Multi-Agent Markov Decision Process (MMDP) framework. Since we are dealing with hundreds of agents, solving the resulting MMDPs directly is intractable. Instead, we propose to decompose the problem by decoupling the interactions through arbitrage. Decomposition of the problem means relaxing the joint power consumption constraint, which means that joining the plans together can cause overconsumption. Arbitrage acts as a conflict resolution mechanism during policy execution, using the future expected value of policies to determine which TCLs should receive the available energy. We experimentally compare several methods to plan with arbitrage, and conclude that a best response-like mechanism is a scalable approach that returns near-optimal solutions.
【Keywords】: Planning under uncertainty; Multi-Agent Markov Decision Processes; Decomposition; Arbitrage; Smart Grids
【Paper Link】 【Pages】:622-628
【Authors】: John P. Dickerson ; Tuomas Sandholm
【Abstract】: The preferred treatment for kidney failure is a transplant; however, demand for donor kidneys far outstrips supply. Kidney exchange, an innovation where willing but incompatible patient-donor pairs can exchange organs — via barter cycles and altruist-initiated chains — provides a life-saving alternative. Typically, fielded exchanges act myopically, considering only the current pool of pairs when planning the cycles and chains. Yet kidney exchange is inherently dynamic, with participants arriving and departing. Also, many planned exchange transplants do not go to surgery due to various failures. So, it is important to consider the future when matching. Motivated by our experience running the computational side of a large nationwide kidney exchange, we present FutureMatch, a framework for learning to match in a general dynamic model. FutureMatch takes as input a high-level objective (e.g., ``maximize graft survival of transplants over time'') decided on by experts, then automatically (i) learns based on data how to make this objective concrete and (ii) learns the ``means'' to accomplish this goal — a task, in our experience, that humans handle poorly. It uses data from all live kidney transplants in the US since 1987 to learn the quality of each possible match; it then learns the potentials of elements of the current input graph offline (e.g., potentials of pairs based on features such as donor and patient blood types), translates these to weights, and performs a computationally feasible batch matching that incorporates dynamic, failure-aware considerations through the weights. We validate FutureMatch on real fielded exchange data. It results in higher values of the objective. Furthermore, even under economically inefficient objectives that enforce equity, it yields better solutions for the efficient objective (which does not incorporate equity) than traditional myopic matching that uses the efficiency objective.
【Keywords】: Kidney exchange; dynamic matching
【Paper Link】 【Pages】:629-635
【Authors】: Ehsan Elhamifar ; Shankar Sastry
【Abstract】: In this paper, we consider the problem of energy disaggregation, i.e., decomposing a whole home electricity signal into its component appliances. We propose a new supervised algorithm, which in the learning stage, automatically extracts signature consumption patterns of each device by modeling the device as a mixture of dynamical systems. In order to extract signature consumption patterns of a device corresponding to its different modes of operation, we define appropriate dissimilarities between energy snippets of the device and use them in a subset selection scheme, which we generalize to deal with time-series data. We then form a dictionary that consists of extracted power signatures across all devices. We cast the disaggregation problem as an optimization over a representation in the learned dictionary and incorporate several novel priors such as device-sparsity, knowledge about devices that do or do not work together as well as temporal consistency of the disaggregated solution. Real experiments on a publicly available energy dataset demonstrate that our proposed algorithm achieves promising results for energy disaggregation.
【Keywords】:
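The disaggregation step, representing a whole-home signal sparsely in a dictionary of device signatures, can be illustrated with orthogonal matching pursuit; in this editor-added sketch the dictionary is random rather than learned from dynamical-system models, and the paper's richer priors (devices that co-occur, temporal consistency) are reduced to a hard nonzero budget:

```python
import numpy as np
from sklearn.linear_model import orthogonal_mp

# Dictionary columns: per-device signature consumption snippets
# (learned in the paper; random placeholders here).
rng = np.random.default_rng(0)
snippet_len, n_signatures = 60, 40
D = np.abs(rng.normal(size=(snippet_len, n_signatures)))
D /= np.linalg.norm(D, axis=0)

# Aggregate signal = a few active device signatures plus noise.
active = [3, 17, 25]
aggregate = D[:, active].sum(axis=1) + 0.01 * rng.normal(size=snippet_len)

# Device-sparsity prior: explain the whole-home signal with few signatures.
coef = orthogonal_mp(D, aggregate, n_nonzero_coefs=3)
print("recovered signatures:", np.flatnonzero(coef))
```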
【Paper Link】 【Pages】:636-643
【Authors】: Stefano Ermon ; Ronan Le Bras ; Santosh K. Suram ; John M. Gregoire ; Carla P. Gomes ; Bart Selman ; Robert Bruce van Dover
【Abstract】: Identifying important components or factors in large amounts of noisy data is a key problem in machine learning and data mining. Motivated by a pattern decomposition problem in materials discovery, aimed at discovering new materials for renewable energy, e.g. for fuel and solar cells, we introduce CombiFD, a framework for factor based pattern decomposition that allows the incorporation of a priori knowledge as constraints, including complex combinatorial constraints. In addition, we propose a new pattern decomposition algorithm, called AMIQO, based on solving a sequence of (mixed-integer) quadratic programs. Our approach considerably outperforms the state of the art on the materials discovery problem, scaling to larger datasets and recovering more precise and physically meaningful decompositions. We also show the effectiveness of our approach for enforcing background knowledge on other application domains.
【Keywords】: Combinatorial Optimization; Computational Sustainability; Materials Discovery; Source Separation; Factor Analysis; Matrix Factorization
【Paper Link】 【Pages】:644-650
【Authors】: Stefano Ermon ; Yexiang Xue ; Russell Toth ; Bistra N. Dilkina ; Richard Bernstein ; Theodoros Damoulas ; Patrick Clark ; Steve DeGloria ; Andrew Mude ; Christopher Barrett ; Carla P. Gomes
【Abstract】: Understanding spatio-temporal resource preferences is paramount in the design of policies for sustainable development. Unfortunately, resource preferences are often unknown to policy-makers and have to be inferred from data. In this paper we consider the problem of inferring agents' preferences from observed movement trajectories, and formulate it as an Inverse Reinforcement Learning (IRL) problem. With the goal of informing policy-making, we take a probabilistic approach and consider generative models that can be used to simulate behavior under new circumstances such as changes in resource availability, access policies, or climate. We study the Dynamic Discrete Choice (DDC) models from econometrics and prove that they generalize the Max-Entropy IRL model, a widely used probabilistic approach from the machine learning literature. Furthermore, we develop SPL-GD, a new learning algorithm for DDC models that is considerably faster than the state of the art and scales to very large datasets. We consider an application in the context of pastoralism in the arid and semi-arid regions of Africa, where migratory pastoralists face regular risks due to resource availability, droughts, and resource degradation from climate change and development. We show how our approach based on satellite and survey data can accurately model migratory pastoralism in East Africa and that it considerably outperforms other approaches on a large-scale real-world dataset of pastoralists' movements in Ethiopia collected over 3 years.
【Keywords】:
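The Max-Entropy IRL model this paper generalizes has a soft-Bellman core; the editor-added sketch below computes the soft value function and stochastic policy for a fixed reward on a tiny synthetic MDP. A full IRL loop would additionally update the reward to match the feature expectations of observed trajectories, and the paper's DDC/SPL-GD machinery is not reproduced:

```python
import numpy as np

def soft_value_iteration(P, r, gamma=0.95, iters=200):
    """MaxEnt-IRL style soft Bellman backup: V = log sum_a exp(Q_a),
    with Q_a = r + gamma * P_a V. Transition tensor P has shape (A, S, S)."""
    A, S, _ = P.shape
    V = np.zeros(S)
    for _ in range(iters):
        Q = r[None, :] + gamma * P @ V          # (A, S) action values
        V = np.log(np.exp(Q).sum(axis=0))       # soft max over actions
    policy = np.exp(Q - V[None, :])             # stochastic MaxEnt policy
    return V, policy

# Tiny 3-state, 2-action world with a rewarding third state.
P = np.array([[[.9, .1, 0], [0, .9, .1], [0, 0, 1]],
              [[1, 0, 0], [1, 0, 0], [0, 1, 0]]])
r = np.array([0.0, 0.0, 1.0])
V, pi = soft_value_iteration(P, r)
```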
【Paper Link】 【Pages】:651-657
【Authors】: Vitor Campanholo Guizilini ; Fabio Tozeto Ramos
【Abstract】: We introduce a novel method for the continuous online prediction of particulate matter in the air (more specifically, PM10 and PM2.5) given sparse sensor information. A nonparametric model is developed using Gaussian Processes, which eschews the need for an explicit formulation of internal -- and usually very complex -- dependencies between meteorological variables. Instead, it uses historical data to extrapolate pollutant values both spatially (in areas with no sensor information) and temporally (the near future). Each prediction also contains a respective variance, indicating its uncertainty level and thus allowing a probabilistic treatment of results. A novel training methodology (Structural Cross-Validation) is presented, which preserves the spatio-temporal structure of available data during the hyperparameter optimization process. Tests were conducted using a real-time feed from a sensor network in an area of roughly 50x80 km, alongside comparisons with other techniques for air pollution prediction. The promising results motivated the development of a smartphone application and a website, currently in use to increase the efficiency of air quality monitoring and control in the area.
【Keywords】: Gaussian Processes;Machine Learning;Spatio-Temporal;Online learning;RT-AQF;environment monitoring
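A minimal editor-added sketch of spatio-temporal GP regression for pollutant prediction with scikit-learn, using an anisotropic RBF kernel over (latitude, longitude, time) plus a noise term; inputs, targets, and length-scales are synthetic placeholders, and the paper's Structural Cross-Validation procedure is not reproduced:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Inputs: (latitude, longitude, time); target: PM10 reading.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 3)) * [0.5, 0.8, 24.0]   # ~50x80 km area, 24 h
y = 30 + 10 * np.sin(X[:, 2] / 4) + rng.normal(scale=2, size=200)

# Separate length-scales let space and time decorrelate at different rates.
kernel = RBF(length_scale=[0.1, 0.1, 3.0]) + WhiteKernel(noise_level=1.0)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)

# Predict at an unsensed location one hour ahead, with uncertainty.
mean, std = gp.predict(np.array([[0.25, 0.4, 25.0]]), return_std=True)
```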
【Paper Link】 【Pages】:658-664
【Authors】: Hassan L. Hijazi ; Terrence W. K. Mak ; Pascal Van Hentenryck
【Abstract】: We address the problem of power system restoration after a significant blackout. Prior work focuses on optimization methods for finding high-quality restoration plans. Optimal solutions consist of a sequence of grid repairs and corresponding steady states. However, such approaches lack formal guarantees on the transient stability of restoration actions, a key property to avoid additional grid damage and cascading failures. In this paper, we show how to integrate transient stability into the optimization procedure by capturing the rotor dynamics of power generators. Our approach reasons about the differential equations describing the dynamics and their underlying transient states. The key contribution lies in modeling and solving optimization problems that return stable generator dispatches minimizing the difference with respect to steady-state solutions. Computational efficiency is increased using preprocessing procedures along with traditional reduction techniques. Experimental results on existing benchmarks confirm the feasibility of the new approach.
【Keywords】:
【Paper Link】 【Pages】:665-671
【Authors】: Micha Kahlen ; Wolfgang Ketter
【Abstract】: Electric vehicles will play a crucial role in balancing the future electrical grid, which is complicated by many intermittent renewable energy sources. We developed an algorithm that determines, for a fleet of electric vehicles, which EV to commit to the operating reserve market, and at what price and location, to either absorb excess capacity or provide electricity during shortages (vehicle-2-grid). The algorithm takes the value of immobility into account by using carsharing fees as a reference point. A virtual power plant autonomously replaces cars that are committed to the operating reserves and then rented out, with other idle cars, to pool the risks of uncertainty. We validate our model with data from a free-float carsharing fleet of 500 electric vehicles. An analysis of expected future developments (2015, 2018, and 2022) in operating reserve demand and battery costs yields that the gross profits for a carsharing operator increase by 7-12% with a negligible decrease in car availability (<0.01%).
【Keywords】: electric vehicles; virtual power plants; renwable energy; sustainability; agents
【Paper Link】 【Pages】:672-678
【Authors】: Liangda Li ; Hongyuan Zha
【Abstract】: Energy disaggregation, the task of taking a whole-home electricity signal and decomposing it into its component appliances, has proved essential in energy conservation research. One powerful cue for breaking down the entire household's energy consumption is users' daily energy usage behavior, which has so far received little attention: existing work on energy disaggregation has mostly ignored the relationship between the energy usages of various appliances across different time slots. To model such relationships, we combine topic models with Hawkes processes, and propose a novel probabilistic model based on marked Hawkes processes that enables the modeling of marked event data. The proposed model seeks to capture the influence of the occurrence and marks of one usage event on the occurrence and marks of subsequent usage events in the future. We also develop an inference algorithm based on variational inference for model parameter estimation. Experimental results on both synthetic data and three real-world data sets demonstrate the effectiveness of our model, which outperforms state-of-the-art approaches in decomposing the total consumed energy into per-appliance consumption. Analyzing the influence captured by the proposed model provides further insights into numerous interesting energy usage behavior patterns.
【Keywords】: Energy disaggregation;Energy Usage Behavior;Marked Hawkes Process
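The core machinery here is a self-exciting point process. A minimal sketch of the conditional intensity of a Hawkes process with an exponential kernel is shown below; the paper's marked variant additionally lets the excitation depend on event marks, and the parameters mu, alpha, beta and the event times are illustrative assumptions.

import math

def intensity(t, history, mu=0.2, alpha=0.8, beta=1.5):
    """lambda(t) = mu + sum over events t_i < t of alpha * exp(-beta*(t - t_i))."""
    return mu + sum(alpha * math.exp(-beta * (t - ti)) for ti in history if ti < t)

# Past usage events (times in hours). Each event temporarily raises the
# intensity, i.e., the chance of follow-up events shortly after -- the
# cross-time-slot influence the model exploits.
events = [1.0, 1.3, 4.0]
for t in [0.5, 1.5, 2.5, 4.2]:
    print(f"t={t:.1f}  lambda={intensity(t, events):.3f}")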
【Paper Link】 【Pages】:679-686
【Authors】: BoonPing Lim ; Menkes van den Briel ; Sylvie Thiébaux ; Scott Backhaus ; Russell Bent
【Abstract】: Energy consumption in commercial and educational buildings is impacted by group activities such as meetings, workshops, classes and exams, and can be reduced by scheduling these activities to take place at times and locations that are favorable from an energy standpoint. This paper improves on the effectiveness of energy-aware room-booking and occupancy scheduling approaches, by allowing the scheduling decisions to rely on an explicit model of the building's occupancy-based HVAC control. The core component of our approach is a mixed-integer linear programming (MILP) model which optimally solves the joint occupancy scheduling and occupancy-based HVAC control problem. To scale up to realistic problem sizes, we embed this MILP model into a large neighbourhood search (LNS). We obtain substantial energy reduction in comparison with occupancy-based HVAC control using arbitrary schedules or using schedules obtained by existing heuristic energy-aware scheduling approaches.
【Keywords】: Smart buildings; Occupancy scheduling; Mixed integer programming; Large neighborhood search; HVAC control; Planning and scheduling;
【Paper Link】 【Pages】:687-694
【Authors】: Eoin O'Mahony ; David B. Shmoys
【Abstract】: Bike-sharing systems are becoming increasingly prevalent in urban environments. They provide a low-cost, environmentally friendly transportation alternative for cities. The management of these systems gives rise to many optimization problems. Chief among these problems is the issue of bicycle rebalancing. Users imbalance the system by creating demand in an asymmetric pattern. This necessitates action to put the system back in balance with the requisite levels of bicycles at each station to facilitate future use. In this paper, we tackle the problem of maintaining system balance during peak rush-hour usage as well as rebalancing overnight to prepare the system for rush-hour usage. We provide novel problem formulations that have been motivated by both a close collaboration with the New York City bike share (Citibike) and a careful analysis of system usage data. We analyze system data to discover the best placement of bikes to facilitate usage. We solve routing problems for overnight shifts as well as clustering problems for handling mid-rush-hour usage. The tools developed from this research are currently in daily use at NYC Bike Share LLC, operators of Citibike.
【Keywords】: Computational Sustainability; Bike Sharing; Optimization
【Paper Link】 【Pages】:695-701
【Authors】: Athanasios Aris Panagopoulos ; Georgios Chalkiadakis ; Nicholas Robert Jennings
【Abstract】: The power output of photovoltaic systems (PVS) increases with the use of effective and efficient solar tracking techniques. However, current techniques suffer from several drawbacks in their tracking policy: (i) they usually do not consider the forecasted or prevailing weather conditions; even when they do, they (ii) rely on complex closed-loop controllers and sophisticated instruments; and (iii) typically, they do not take the energy consumption of the trackers into account. In this paper, we propose a policy iteration method (along with specialized variants), which is able to calculate near-optimal trajectories for effective and efficient day-ahead solar tracking, based on weather forecasts coming from on-line providers. To account for the energy needs of the tracking system, the technique employs a novel and generic consumption model. Our simulations show that the proposed methods can increase the power output of a PVS considerably, when compared to standard solar tracking techniques.
【Keywords】: Solar tracking; Energy Efficiency; Smart Grid; MDP; Dynamic programming; Policy Iteration; Alternating
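Below is a minimal sketch of generic policy iteration, the dynamic-programming method the paper specializes to day-ahead tracking; the toy MDP (transition tensor P, reward matrix R, discount gamma) is an illustrative assumption, not the authors' tracking or consumption model.

import numpy as np

n_states, n_actions, gamma = 3, 2, 0.95
# P[a][s][s'] = transition probability; R[s][a] = expected one-step reward.
P = np.array([[[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.8]],
              [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.1, 0.0, 0.9]]])
R = np.array([[1.0, 0.0], [0.5, 1.5], [0.0, 2.0]])

policy = np.zeros(n_states, dtype=int)
while True:
    # Policy evaluation: solve (I - gamma * P_pi) V = R_pi exactly.
    P_pi = np.array([P[policy[s], s] for s in range(n_states)])
    R_pi = np.array([R[s, policy[s]] for s in range(n_states)])
    V = np.linalg.solve(np.eye(n_states) - gamma * P_pi, R_pi)
    # Policy improvement: act greedily with respect to V.
    Q = R + gamma * np.einsum('ast,t->sa', P, V)
    new_policy = Q.argmax(axis=1)
    if np.array_equal(new_policy, policy):
        break
    policy = new_policy
print("optimal policy:", policy, "state values:", np.round(V, 2))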
【Paper Link】 【Pages】:702-708
【Authors】: Sandhya Saisubramanian ; Pradeep Varakantham ; Hoong Chuin Lau
【Abstract】: In emergency medical systems, arriving at the incident location a few seconds early can save a human life. Thus, this paper is motivated by the need to reduce the response time (the time taken to arrive at the incident location after receiving the emergency call) of Emergency Response Vehicles (ERVs, e.g., ambulances and fire rescue vehicles) for as many requests as possible. We expect to achieve this primarily by positioning the "right" number of ERVs at the "right" places and at the "right" times. Given the exponentially large action space (with respect to the number of ERVs and their placement) and the stochasticity in the location and timing of emergency incidents, this problem is computationally challenging. To that end, our contributions, building on existing data-driven approaches, are threefold: (1) Based on real-world evaluation metrics, we provide a risk-based optimization criterion to learn from past incident data. Instead of minimizing expected response time, we minimize the largest value of response time such that the risk of finding requests with a higher value is bounded (e.g., only 10% of requests should have a response time greater than 8 minutes). (2) We develop a mixed-integer linear optimization formulation to learn and compute an allocation from a set of input requests while considering the risk criterion. (3) To allow for "live" reallocation of ambulances, we provide a decomposition method based on Lagrangian Relaxation to significantly reduce the run-time of the optimization formulation. Finally, we provide an exhaustive evaluation on real-world datasets from two Asian cities that demonstrates the improvement provided by our approach over current practice and the best known approach from the literature.
【Keywords】: Emergency Medical Systems; Optimization; Lagrangian Dual Decomposition
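Contribution (1) above replaces the mean objective with a risk-bounded one. A small numerical illustration: with a 10% risk bound, the quantity to minimize is the 90th percentile of response times. The times below are made-up values in minutes.

import numpy as np

response_times = np.array([3.2, 4.1, 5.0, 5.5, 6.0, 6.3, 7.1, 7.8, 9.5, 12.0])
mean_obj = response_times.mean()
risk_obj = np.percentile(response_times, 90)  # at most 10% of requests exceed this
print(f"mean objective: {mean_obj:.2f} min, risk objective (90th pct): {risk_obj:.2f} min")
# An allocation chosen under the risk objective may accept a slightly worse
# mean in exchange for fewer very slow responses.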
【Paper Link】 【Pages】:709-715
【Authors】: Hermann Schichl ; Meinolf Sellmann
【Abstract】: We develop a new approach for a pre-disaster planning problem which consists in computing an optimal investment plan to strengthen a transportation network, given that a future disaster probabilistically destroys links in the network. We show how the problem can be formulated as a non-linear integer program and devise an AI algorithm to solve it. In particular, we introduce a new type of extreme resource constraint and develop a practically efficient propagation algorithm for it. Experiments show several orders of magnitude improvements over existing approaches, allowing us to close an existing real-world benchmark and to solve to optimality other, more challenging benchmarks.
【Keywords】: Stochastic Optimization; Pre-Disaster Planning; Resilient Society
【Paper Link】 【Pages】:716-722
【Authors】: Bochao Shen ; Balakrishnan Narayanaswamy ; Ravi Sundaram
【Abstract】: Peak demand for electricity continues to surge around the world. The supply-demand imbalance manifests itself in many forms, from rolling brownouts in California to power cuts in India. It is often suggested that exposing consumers to real-time pricing will incentivize them to change their usage and mitigate the problem - akin to increasing tolls at peak commute times. We show that risk-averse consumers of electricity react to price fluctuations by scaling back on their total demand, not just their peak demand, leading to the unintended consequence of an overall decrease in production/consumption and reduced economic efficiency. We propose a new scheme that allows homes to shift their demand away from peak hours in exchange for greater electricity consumption in non-peak hours - akin to how airlines incentivize a passenger to move from an over-booked flight in exchange for, say, two tickets in the future. We present a formal framework for the incentive model that is applicable to different forms of the electricity market. We show that our scheme not only enables increased consumption and consumer social welfare but also allows the distribution company to increase profits. This is achieved by allowing load to be shifted while insulating consumers from real-time price fluctuations. This win-win is important if these methods are to be embraced in practice.
【Keywords】: smart grid; demand response; incentive mechanism
【Paper Link】 【Pages】:723-729
【Authors】: Adish Singla ; Marco Santoni ; Gábor Bartók ; Pratik Mukerji ; Moritz Meenen ; Andreas Krause
【Abstract】: Bike sharing systems have been recently adopted by a growing number of cities as a new means of transportation offering citizens a flexible, fast and green alternative for mobility. Users can pick up or drop off the bicycles at a station of their choice without prior notice or time planning. This increased flexibility comes with the challenge of unpredictable and fluctuating demand as well as irregular flow patterns of the bikes. As a result, these systems can incur imbalance problems such as the unavailability of bikes or parking docks at stations. In this light, operators deploy fleets of vehicles which redistribute the bikes in order to guarantee a desirable service level. Can we engage the users themselves to solve the imbalance problem in bike sharing systems? In this paper, we address this question and present a crowdsourcing mechanism that incentivizes the users in the bike repositioning process by providing them with alternate choices to pick or return bikes in exchange for monetary incentives. We design the complete architecture of the incentives system, which employs optimal pricing policies using the approach of regret minimization in online learning. We investigate the incentive compatibility of our mechanism and extensively evaluate it through simulations based on data collected via a survey study. Finally, we deployed the proposed system through a smartphone app among users of a large-scale bike sharing system operated by a public transport company, and we provide results from this experimental deployment. To our knowledge, this is the first dynamic incentive system for bike redistribution ever deployed in a real-world bike sharing system.
【Keywords】: Bike sharing systems; mobility; crowdsourcing; mechanism design; incentive compatibility; smartphone app
【Paper Link】 【Pages】:730-736
【Authors】: Xuan Song ; Quanshi Zhang ; Yoshihide Sekimoto ; Ryosuke Shibasaki ; Nicholas Jing Yuan ; Xing Xie
【Abstract】: The frequency and intensity of natural disasters have significantly increased over the past decades and this trend is predicted to continue. Facing these possible and unexpected disasters, understanding and simulating human emergency mobility following disasters will become a critical issue for planning effective humanitarian relief, disaster management, and long-term societal reconstruction. However, due to the uniqueness of various disasters and the unavailability of reliable and large-scale human mobility data, such research is very difficult to perform. Hence, in this paper, we collect big and heterogeneous data (e.g., 1.6 million users' GPS records over three years, records of 17,520 earthquakes in Japan over four years, news reports, and transportation network data) to capture and analyze human emergency mobility following different disasters. By mining these big data, we aim to understand what basic laws govern human mobility following disasters, and to develop a general model of human emergency mobility for generating and simulating large numbers of human emergency movements. The experimental results and validations demonstrate the efficiency of our simulation model, and suggest that human mobility following disasters may be significantly more predictable and more easily simulated than previously thought.
【Keywords】: human mobility;big data;disaster informatics
【Paper Link】 【Pages】:737-744
【Authors】: Alexander David Styler ; Illah Reza Nourbakhsh
【Abstract】: With increasing numbers of electric and hybrid vehicles on the road, transportation presents a unique opportunity to leverage data-driven intelligence to realize large scale impact in energy use and emissions. Energy management in these vehicles is highly sensitive to upcoming power load on the vehicle, which is not considered in conventional reactive policies calculated at design time. Advancements in cheap sensing and computation have enabled on-board upcoming load predictions which can be used to optimize energy management. In this work, we propose and evaluate a novel, real-time optimization strategy that leverages predictions from prior data in a simulated hybrid battery-supercapacitor energy management task. We demonstrate a complete adaptive system that improves over the lifetime of the vehicle as more data is collected and prediction accuracy improves. Using thousands of miles of real-world data collected from both petrol and electric vehicles, we evaluate the performance of our optimization strategy with respect to our cost function. The system achieves performance within 10% of the optimal upper bound calculated using a priori knowledge of the upcoming loads. This performance implies improved battery thermal stability, efficiency, and longevity. Our strategy can be applied to optimize energy use in gas-electric hybrids, battery cooling in electric vehicles, and many other load-sensitive tasks in transportation.
【Keywords】: Learning; Prediction; Optimization; Energy; Transportation
【Paper Link】 【Pages】:745-752
【Authors】: Umair Z. Ahmed ; Krishnendu Chatterjee ; Sumit Gulwani
【Abstract】: Simple board games, like Tic-Tac-Toe and CONNECT-4, play an important role not only in the development of mathematical and logical skills, but also in emotional and social development. In this paper, we address the problem of generating targeted starting positions for such games. This can facilitate new approaches for bringing novice players to mastery, and also leads to the discovery of interesting game variants. We present an approach that generates starting states of varying hardness levels for player 1 in a two-player board game, given the rules of the board game, the desired number of steps required for player 1 to win, and the expertise levels of the two players. Our approach leverages symbolic methods and iterative simulation to efficiently search the extremely large state space. We present experimental results that include the discovery of states of varying hardness levels for several simple grid-based board games. The presence of such states for standard game variants like 4x4 Tic-Tac-Toe opens up new games to be played that have so far gone unplayed, because the default start state is heavily biased.
【Keywords】: Board Games; Games on Graphs; Symbolic methods and Binary Decision Diagrams (BDDs); Problem generation for games
【Paper Link】 【Pages】:753-762
【Authors】: Quentin Galvane ; Rémi Ronfard ; Christophe Lino ; Marc Christie
【Abstract】: We describe an optimization-based approach for automatically creating well-edited movies from a 3D animation. While previous work has mostly focused on the problem of placing cameras to produce nice-looking views of the action, the problem of cutting and pasting shots from all available cameras has never been addressed extensively. In this paper, we review the main causes of editing errors in the literature and propose an editing model relying on a minimization of such errors. We make a plausible semi-Markov assumption, resulting in a dynamic programming solution which is computationally efficient. We also show that our method can generate movies with different editing rhythms and validate the results through a user study. Combined with state-of-the-art cinematography, our approach therefore promises to significantly extend the expressiveness and naturalness of virtual movie-making.
【Keywords】: Storytelling ; Video Editing ; Vision ; Continuity rules ;
【Paper Link】 【Pages】:763-769
【Authors】: Michael Albert ; Vincent Conitzer ; Giuseppe Lopomo
【Abstract】: In a classic result in the mechanism design literature, Cremer and McLean (1985) show that if buyers’ valuations are sufficiently correlated, a mechanism exists that allows the seller to extract the full surplus from efficient allocation as revenue. This result is commonly seen as “too good to be true” (in practice), casting doubt on its modeling assumptions. In this paper, we use an automated mechanism design approach to assess how sensitive the Cremer-McLean result is to relaxing its main technical assumption. That assumption implies that each valuation that a bidder can have results in a unique conditional distribution over the external signal(s). We relax this, allowing multiple valuations to be consistent with the same distribution over the external signal(s). Using similar insights to Cremer-McLean, we provide a highly efficient algorithm for computing the optimal revenue in this more general case. Using this algorithm, we observe that indeed, as the number of valuations consistent with a distribution grows, the optimal revenue quickly drops to that of a reserve-price mechanism. Thus, automated mechanism design allows us to gain insight into the precise sense in which Cremer-McLean is “too good to be true.”
【Keywords】: Mechanism Design; Automated Mechanism Design; Simple Mechanisms; Incomplete Information; Optimal Mechanisms
【Paper Link】 【Pages】:770-776
【Authors】: Kareem Amin ; Rachel Cummings ; Lili Dworkin ; Michael Kearns ; Aaron Roth
【Abstract】: We consider the problem of learning from revealed preferences in an online setting. In our framework, each period a consumer buys an optimal bundle of goods from a merchant according to her (linear) utility function and current prices, subject to a budget constraint. The merchant observes only the purchased goods, and seeks to adapt prices to optimize his profits. We give an efficient algorithm for the merchant's problem that consists of a learning phase in which the consumer's utility function is (perhaps partially) inferred, followed by a price optimization step. We also give an alternative online learning algorithm for the setting where prices are set exogenously, but the merchant would still like to predict the bundle that will be bought by the consumer, for purposes of inventory or supply chain management. In contrast with most prior work on the revealed preferences problem, we demonstrate that by making stronger assumptions on the form of utility functions, efficient algorithms for both learning and profit maximization are possible, even in adaptive, online settings.
【Keywords】:
【Paper Link】 【Pages】:777-783
【Authors】: Elliot Anshelevich ; Onkar Bhardwaj ; John Postl
【Abstract】: We examine the quality of social choice mechanisms using a utilitarian view, in which all of the agents have costs for each of the possible alternatives. While these underlying costs determine what the optimal alternative is, they may be unknown to the social choice mechanism; instead the mechanism must decide on a good alternative based only on the ordinal preferences of the agents which are induced by the underlying costs. Due to its limited information, such a social choice mechanism cannot simply select the alternative that minimizes the total social cost (or minimizes some other objective function). Thus, we seek to bound the distortion: the worst-case ratio between the social cost of the alternative selected and the optimal alternative. Distortion measures how good a mechanism is at approximating the alternative with minimum social cost, while using only ordinal preference information. The underlying costs can be arbitrary, implicit, and unknown; our only assumption is that the agent costs form a metric space, which is a natural assumption in many settings. We quantify the distortion of many well-known social choice mechanisms. We show that for both total social cost and median agent cost, many positional scoring rules have large distortion, while on the other hand Copeland and similar mechanisms perform optimally or near-optimally, always obtaining a distortion of at most 5. We also give lower bounds on the distortion that could be obtained by any deterministic social choice mechanism, and extend our results on median agent cost to more general objective functions.
【Keywords】: social choice, distortion, metric preferences
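A worked toy example of distortion on a line metric, with hypothetical voter and candidate positions; the mechanism here is simple majority between two alternatives (for which the worst-case distortion is known to be 3), not the Copeland rule analyzed in the paper.

voters = [0.49, 0.49, 1.0]          # voter locations on the line
candidates = {'A': 0.0, 'B': 1.0}   # candidate locations

def social_cost(c):
    return sum(abs(v - candidates[c]) for v in voters)

# The mechanism sees only ordinal information: each voter's closer candidate.
profile = [min(candidates, key=lambda c: abs(v - candidates[c])) for v in voters]
chosen = max(set(profile), key=profile.count)   # majority choice
optimal = min(candidates, key=social_cost)
ratio = social_cost(chosen) / social_cost(optimal)
print(f"chosen: {chosen}, optimal: {optimal}, distortion: {ratio:.2f}")
# Majority picks A (two voters are marginally closer to it), but B roughly
# halves the social cost, so the distortion is close to 2.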
【Paper Link】 【Pages】:784-790
【Authors】: Haris Aziz ; Markus Brill ; Vincent Conitzer ; Edith Elkind ; Rupert Freeman ; Toby Walsh
【Abstract】: We consider approval-based committee voting, i.e., the setting where each voter approves a subset of candidates, and these votes are then used to select a fixed-size set of winners (committee). We propose a natural axiom for this setting, which we call justified representation (JR). This axiom requires that if a large enough group of voters exhibits agreement by supporting the same candidate, then at least one voter in this group has an approved candidate in the winning committee. We show that for every list of ballots it is possible to select a committee that provides JR. We then check if this axiom is fulfilled by well-known approval-based voting rules. We show that the answer is negative for most of the rules we consider, with notable exceptions of PAV (Proportional Approval Voting), an extreme version of RAV (Reweighted Approval Voting), and, for a restricted preference domain, MAV (Minimax Approval Voting). We then introduce a stronger version of the JR axiom, which we call extended justified representation (EJR), and show that PAV satisfies EJR, while other rules do not. We also consider several other questions related to JR and EJR, including the relationship between JR/EJR and unanimity, and the complexity of the associated algorithmic problems.
【Keywords】:
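A committee W of size k fails JR exactly when some candidate outside W is approved by at least n/k voters, none of whom approves any member of W; this yields a simple polynomial-time check, sketched below on a made-up example.

def provides_jr(approvals, committee, k):
    """approvals: one set of approved candidates per voter; committee: set of size k."""
    n = len(approvals)
    unrepresented = [A for A in approvals if not (A & committee)]
    candidates = set().union(*approvals)
    for c in candidates - committee:
        # A cohesive group: >= n/k unrepresented voters who all approve c.
        if sum(1 for A in unrepresented if c in A) * k >= n:
            return False
    return True

# n=4 voters, committee size k=2. The last two voters all approve 'd' but get
# no representative under {a, b}, so that committee violates JR.
ballots = [{'a'}, {'b'}, {'d'}, {'d'}]
print(provides_jr(ballots, {'a', 'b'}, k=2))  # False
print(provides_jr(ballots, {'a', 'd'}, k=2))  # True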
【Paper Link】 【Pages】:791-797
【Authors】: Jeremiah Blocki ; Nicolas Christin ; Anupam Datta ; Ariel D. Procaccia ; Arunesh Sinha
【Abstract】: Modern organizations (e.g., hospitals, social networks, government agencies) rely heavily on audit to detect and punish insiders who inappropriately access and disclose confidential information. Recent work on audit games models the strategic interaction between an auditor with a single audit resource and auditees as a Stackelberg game, augmenting associated well-studied security games with a configurable punishment parameter. We significantly generalize this audit game model to account for multiple audit resources where each resource is restricted to audit a subset of all potential violations, thus enabling application to practical auditing scenarios. We provide an FPTAS that computes an approximately optimal solution to the resulting non-convex optimization problem. The main technical novelty is in the design and correctness proof of an optimization transformation that enables the construction of this FPTAS. In addition, we experimentally demonstrate that this transformation significantly speeds up computation of solutions for a class of audit games and security games.
【Keywords】: Auditing; Game Theory
【Paper Link】 【Pages】:798-804
【Authors】: Avrim Blum ; Yishay Mansour ; Jamie Morgenstern
【Abstract】: Auction theory traditionally assumes that bidders’ valuation distributions are known to the auctioneer, such as in the celebrated, revenue-optimal Myerson auction (Myerson 1981). However, this theory does not describe how the auctioneer comes to possess this information. Recent work (Cole and Roughgarden 2014) showed that an approximation based on a finite sample of independent draws from each bidder’s distribution is sufficient to produce a near-optimal auction. In this work, we consider the problem of learning bidders’ valuation distributions from much weaker forms of observations. Specifically, we consider a setting where there is a repeated, sealed-bid auction with n bidders, but all we observe for each round is who won, but not how much they bid or paid. We can also participate (i.e., submit a bid) ourselves, and observe when we win. From this information, our goal is to (approximately) recover the inherently recoverable part of the underlying bid distributions. We also consider extensions where different subsets of bidders participate in each round, and where bidders’ valuations have a common-value component added to their independent private values.
【Keywords】:
【Paper Link】 【Pages】:805-811
【Authors】: Branislav Bosanský ; Jiri Cermak
【Abstract】: Stackelberg equilibrium is a solution concept prescribing for a player an optimal strategy to commit to, assuming the opponent knows this commitment and plays a best response. Although this solution concept is a cornerstone of many security applications, existing works typically do not consider situations where the players can observe and react to the actions of the opponent during the course of the game. We extend the existing algorithmic work to extensive-form games and introduce a novel algorithm for computing Stackelberg equilibria that exploits the compact sequence-form representation of strategies. Our algorithm reduces the size of the linear programs from exponential in the baseline approach to linear in the size of the game tree. Experimental evaluation on randomly generated games and a security-inspired search game demonstrates a significant improvement in scalability compared to the baseline approach.
【Keywords】: Stackelberg Equilibrium; extensive-form games; mixed-integer linear programming
【Paper Link】 【Pages】:812-818
【Authors】: Branislav Bosanský ; Albert Xin Jiang ; Milind Tambe ; Christopher Kiekintveld
【Abstract】: Many search and security games played on a graph can be modeled as normal-form zero-sum games with strategies consisting of sequences of actions. The size of the strategy space provides a computational challenge when solving these games. This complexity is tackled either by using the compact representation of sequential strategies and linear programming, or by incremental strategy generation of iterative double-oracle methods. In this paper, we present a novel hybrid of these two approaches: the compact-strategy double-oracle (CS-DO) algorithm, which combines the advantages of the compact representation with incremental strategy generation. We experimentally compare CS-DO with the standard approaches and analyze the impact of the size of the support on the performance of the algorithms. Results show that CS-DO dramatically improves the convergence rate in games with non-trivial support.
【Keywords】: normal-form games with sequential strategies; incremental strategy generation; zero-sum games
【Paper Link】 【Pages】:819-826
【Authors】: Markus Brill ; Vincent Conitzer
【Abstract】: Models of strategic candidacy analyze the incentives of candidates to run in an election. Most work on this topic assumes that strategizing only takes place among candidates, whereas voters vote truthfully. In this paper, we extend the analysis to also include strategic behavior on the part of the voters. (We also study cases where only candidates or only voters are strategic.) We consider two settings in which strategic voting is well-defined and has a natural interpretation: majority-consistent voting with single-peaked preferences and voting by successive elimination. In the former setting, we analyze the type of strategic behavior required in order to guarantee desirable voting outcomes. In the latter setting, we determine the complexity of computing the set of potential outcomes if both candidates and voters act strategically.
【Keywords】:
【Paper Link】 【Pages】:827-834
【Authors】: Benedikt Bünz ; Sven Seuken ; Benjamin Lubin
【Abstract】: Computing prices in core-selecting combinatorial auctions is a computationally hard problem. Auctions with many bids can only be solved using a recently proposed core constraint generation (CCG) algorithm, which may still take days on hard instances. In this paper, we present a new algorithm that significantly outperforms the current state of the art. Towards this end, we first provide an alternative definition of the set of core constraints, where each constraint is weakly stronger, and prove that together these constraints define the identical polytope to the previous definition. Using these new theoretical insights we develop two new algorithmic techniques which generate additional constraints in each iteration of the CCG algorithm by 1) exploiting separability in allocative conflicts between participants in the auction, and 2) by leveraging non-optimal solutions. We show experimentally that our new algorithm leads to significant speed-ups on a variety of large combinatorial auction problems. Our work provides new insights into the structure of core constraints and advances the state of the art in fast algorithms for computing core prices in large combinatorial auctions.
【Keywords】: Combinatorial Auctions; Core; Constraint Generation;
【Paper Link】 【Pages】:835-841
【Authors】: Mithun Chakraborty ; Sanmay Das ; Justin Peabody
【Abstract】: The logarithmic market scoring rule (LMSR), the most common automated market making rule for prediction markets, is typically studied in the framework of dealer markets, where the market maker takes one side of every transaction. The continuous double auction (CDA) is a much more widely used microstructure for general financial markets in practice. In this paper, we study the properties of CDA prediction markets with zero-intelligence traders in which an LMSR-style market maker participates actively. We extend an existing idea of Robin Hanson for integrating LMSR with limit order books in order to provide a new, self-contained market making algorithm that does not need “special” access to the order book and can participate as another trader. We find that, as expected, the presence of the market maker leads to generally lower bid-ask spreads and higher trader surplus (or price improvement), but, surprisingly, does not necessarily improve price discovery and market efficiency; this latter effect is more pronounced when there is higher variability in trader beliefs.
【Keywords】: Prediction Market; Continuous Double Auction; Market Maker; Logarithmic Market Scoring Rule
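For concreteness, a minimal sketch of the LMSR pricing rule the market maker builds on: cost function C(q) = b*log(sum_i exp(q_i/b)), instantaneous prices proportional to exp(q_i/b), and a trade priced as the cost difference. The liquidity parameter b and the trade below are illustrative assumptions; the paper's contribution is how such a market maker participates in a CDA order book, which is not reproduced here.

import math

def lmsr_cost(q, b=10.0):
    return b * math.log(sum(math.exp(x / b) for x in q))

def lmsr_prices(q, b=10.0):
    z = sum(math.exp(x / b) for x in q)
    return [math.exp(x / b) / z for x in q]

q0 = [0.0, 0.0]                     # outstanding shares on two outcomes
print(lmsr_prices(q0))              # [0.5, 0.5] before any trades
q1 = [5.0, 0.0]                     # after buying 5 shares of outcome 0
print(f"trade cost: {lmsr_cost(q1) - lmsr_cost(q0):.3f}")
print(lmsr_prices(q1))              # price of outcome 0 rises above 0.5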
【Paper Link】 【Pages】:842-850
【Authors】: Hau Chan ; Luis E. Ortiz
【Abstract】: Roughly speaking, Interdependent Defense (IDD) games, previously proposed, model the situation where an attacker wants to cause as much damage as possible to a network by attacking one of the sites in the network. Each site must make an investment decision regarding security to protect itself against a direct or indirect attack, the latter due to potential transfer-risk from an unprotected neighboring site. The work introducing IDD games discusses potential applications to model the essence of real-world scenarios such as the 2006 transatlantic aircraft plot. In this paper, our focus is the study of the problem of computing a Nash Equilibrium (NE) in IDD games. We show that an efficient algorithm to determine whether some attacker’s strategy can be a part of a NE in an instance of IDD games is unlikely to exist. Yet, we provide a dynamic programming algorithm to compute an approximate NE when the graph/network structure of the game is a directed tree with a single source, and show that it is an FPTAS. We also introduce an improved heuristic to compute an approximate NE on arbitrary graph structures. Our experiments show that our heuristic is more efficient, and provides better approximations, than best-response-gradient dynamics for the case of Internet games, a class of games introduced and studied in the original work on IDD games.
【Keywords】:
【Paper Link】 【Pages】:851-857
【Authors】: Yiling Chen ; Kobbi Nissim ; Bo Waggoner
【Abstract】: In a search task, a group of agents compete to be the first to find the solution. Each agent has different private information to incorporate into its search. This problem is inspired by settings such as scientific research, Bitcoin hash inversion, or hunting for some buried treasure. A social planner such as a funding agency, mining pool, or pirate captain might like to convince the agents to collaborate, share their information, and greatly reduce the cost of searching. However, this cooperation is in tension with the individuals' competitive desire to each be the first to win the search. The planner's proposal should incentivize truthful information sharing, reduce the total cost of searching, and satisfy fairness properties that preserve the spirit of the competition. We design contract-based mechanisms for information sharing without money. The planner solicits the agents' information and assigns search locations to the agents, who may then search only within their assignments. Truthful reporting of information to the mechanism maximizes an agent's chance to win the search. Epsilon-voluntary participation is satisfied for large search spaces. In order to formalize the planner's goals of fairness and reduced search cost, we propose a simplified, simulated game as a benchmark and quantify fairness and search cost relative to this benchmark scenario. The game is also used to implement our mechanisms. Finally, we extend to the case where coalitions of agents may participate in the mechanism, forming larger coalitions recursively.
【Keywords】: mechanism design, fairness, cooperation
【Paper Link】 【Pages】:858-864
【Authors】: John A. Doucette ; Kate Larson ; Robin Cohen
【Abstract】: Deciding the outcome of an election when voters have provided only partial orderings over their preferences requires voting rules that accommodate missing data. While existing techniques, including considerable recent work, address missingness through circumvention, we propose the novel application of conventional machine learning techniques to predict the missing components of ballots via latent patterns in the information that voters are able to provide. We show that suitable predictive features can be extracted from the data, and demonstrate the high performance of our new framework on the ballots from many real world elections, including comparisons with existing techniques for voting with partial orderings. Our technique offers a new and interesting conceptualization of the problem, with stronger connections to machine learning than conventional social choice techniques.
【Keywords】: Social Choice; Partial Preferences; Imputation
【Paper Link】 【Pages】:865-871
【Authors】: Edith Elkind ; Piotr Faliszewski ; Martin Lackner ; Svetlana Obraztsova
【Abstract】: We study the complexity of deciding if a given profile of incomplete votes (i.e., a profile of partial orders over a given set of alternatives) can be extended to a single-crossing profile of complete votes (total orders). This problem models settings where we have partial knowledge regarding voters' preferences and we would like to understand whether the given preference profile may be single-crossing. We show that this problem admits a polynomial-time algorithm when the order of votes is fixed and the input profile consists of top orders, but becomes NP-complete if we are allowed to permute the votes and the input profile consists of weak orders or independent-pairs orders. Also, we identify a number of practical special cases of both problems that admit polynomial-time algorithms.
【Keywords】: single-crossing preferences; incomplete preferences; algorithms
【Paper Link】 【Pages】:872-878
【Authors】: Uriel Feige ; Michal Feldman ; Nicole Immorlica ; Rani Izsak ; Brendan Lucier ; Vasilis Syrgkanis
【Abstract】: We introduce a new hierarchy over monotone set functions, that we refer to as MPH (Maximum over Positive Hypergraphs). Levels of the hierarchy correspond to the degree of complementarity in a given function. The highest level of the hierarchy, MPH-m (where m is the total number of items) captures all monotone functions. The lowest level, MPH-1, captures all monotone submodular functions, and more generally, the class of functions known as XOS. Every monotone function that has a positive hypergraph representation of rank k (in the sense defined by Abraham, Babaioff, Dughmi and Roughgarden [EC 2012]) is in MPH-k. Every monotone function that has supermodular degree k (in the sense defined by Feige and Izsak [ITCS 2013]) is in MPH-(k+1). In both cases, the converse direction does not hold, even in an approximate sense. We present additional results that demonstrate the expressiveness power of MPH-k. One can obtain good approximation ratios for some natural optimization problems, provided that functions are required to lie in low levels of the MPH hierarchy. We present two such applications. One shows that the maximum welfare problem can be approximated within a ratio of k+1 if all players hold valuation functions in MPH-k. The other is an upper bound of 2k on the price of anarchy of simultaneous first price auctions.
【Keywords】: Combinatorial Auctions;Complementarities;Welfare Maximization;Price of Anarchy;Hypergraph Valuations
【Paper Link】 【Pages】:879-885
【Authors】: Michal Feldman ; Ofir Geri
【Abstract】: We study strong equilibria in symmetric capacitated cost-sharing games. In these games, a graph with designated source s and sink t is given, and each edge is associated with some cost. Each agent chooses strategically an s-t path, knowing that the cost of each edge is shared equally between all agents using it. Two variants of cost-sharing games have been previously studied: (i) games where coalitions can form, and (ii) games where edges are associated with capacities; both variants are inspired by real-life scenarios. In this work we combine these variants and analyze strong equilibria (profiles where no coalition can deviate) in capacitated games. This combination gives rise to new phenomena that do not occur in the previous variants. Our contribution is two-fold. First, we provide a topological characterization of networks that always admit a strong equilibrium. Second, we establish tight bounds on the efficiency loss that may be incurred due to strategic behavior, as quantified by the strong price of anarchy (and stability) measures. Interestingly, our results are qualitatively different than those obtained in the analysis of each variant alone, and the combination of coalitions and capacities entails the introduction of more refined topology classes than previously studied.
【Keywords】: network congestion games; cost-sharing games; strong price of anarchy; strong equilibrium; coalitions; capacities; network topology
【Paper Link】 【Pages】:886-892
【Authors】: Diodato Ferraioli ; Carmine Ventre ; Gabor Aranyi
【Abstract】: In this paper, we study protocols that allow us to discern conscious and unconscious decisions of human beings; i.e., protocols that measure awareness. Consciousness is a central research theme in Neuroscience and AI, which remains, to date, an obscure phenomenon of human brains. Our starting point is a recent experiment, called Post Decision Wagering (PDW) (Persaud, McLeod, and Cowey 2007), that attempts to align experimenters' and subjects' objectives by leveraging financial incentives. We note a similarity with mechanism design, a research area which aims at the design of protocols that reconcile often divergent objectives through incentive-compatibility. We look at the issue of measuring awareness from this perspective. We abstract the setting underlying the PDW experiment and identify three factors that could make it ineffective: the rationality, risk attitude, and bias of subjects. Using mechanism design tools, we study the barrier between possibility and impossibility of incentive compatibility with respect to the aforementioned characteristics of subjects. We complete this study by showing how to use our mechanisms to potentially get a better understanding of consciousness.
【Keywords】:
【Paper Link】 【Pages】:893-899
【Authors】: Aris Filos-Ratsikas ; Minming Li ; Jie Zhang ; Qiang Zhang
【Abstract】: We study the problem of locating a single facility on a real line based on the reports of self-interested agents, when agents have double-peaked preferences, with the peaks being on opposite sides of their locations. We observe that double-peaked preferences capture real-life scenarios and thus complement the well-studied notion of single-peaked preferences. We mainly focus on the case where peaks are equidistant from the agents' locations and discuss how our results extend to more general settings. We show that most of the results for single-peaked preferences do not directly apply to this setting; this makes the problem essentially more challenging. As our main contribution, we present a simple truthful-in-expectation mechanism that achieves an approximation ratio of 1+b/c for both the social and the maximum cost, where b is the distance of the agent from the peak and c is the minimum cost of an agent. For the latter case, we provide a 3/2 lower bound on the approximation ratio of any truthful-in-expectation mechanism. We also study deterministic mechanisms under some natural conditions, proving lower bounds and approximation guarantees. We prove that among a large class of reasonable mechanisms, there is no deterministic mechanism that outperforms our truthful-in-expectation mechanism.
【Keywords】: Facility location, double-peaked preferences, approximation ratio, mechanism design.
【Paper Link】 【Pages】:900-906
【Authors】: Rafael M. Frongillo ; Yiling Chen ; Ian A. Kash
【Abstract】: We study the problem of eliciting and aggregating probabilistic information from multiple agents. In order to successfully aggregate the predictions of agents, the principal needs to elicit some notion of confidence from agents, capturing how much experience or knowledge led to their predictions. To formalize this, we consider a principal who wishes to learn the distribution of a random variable. A group of Bayesian agents has each privately observed some independent samples of the random variable. The principal wishes to elicit enough information from each agent, so that her posterior is the same as if she had directly received all of the samples herself. Leveraging techniques from Bayesian statistics, we represent confidence as the number of samples an agent has observed, which is quantified by a hyperparameter from a conjugate family of prior distributions. This then allows us to show that if the principal has access to a few samples, she can achieve her aggregation goal by eliciting predictions from agents using proper scoring rules. In particular, with access to one sample, she can successfully aggregate the agents' predictions if and only if every posterior predictive distribution corresponds to a unique value of the hyperparameter, a property which holds for many common distributions of interest. When this uniqueness property does not hold, we construct a novel and intuitive mechanism where a principal with two samples can elicit and optimally aggregate the agents' predictions.
【Keywords】: information elicitation, information aggregation, scoring rules; conjugate priors
【Paper Link】 【Pages】:907-913
【Authors】: Etsushi Fujita ; Julien Lesca ; Akihisa Sonoda ; Taiki Todo ; Makoto Yokoo
【Abstract】: Core-selection is a crucial property of social choice functions, or rules, in social choice literature. It is also desirable to address the incentive of agents to cheat by misreporting their preferences. This paper investigates an exchange problem where each agent may have multiple indivisible goods, agents' preferences over sets of goods are assumed to be lexicographic, and side payments are not allowed. We propose an exchange rule called augmented top-trading-cycles (ATTC) procedure based on the original TTC procedure. We first show that the ATTC procedure is core-selecting. We then show that finding a beneficial misreport under the ATTC procedure is NP-hard. Under the ATTC procedure, we finally clarify the relationship between preference misreport and splitting, which is a different type of manipulation.
【Keywords】: Mechanism Design; Exchange; Top-Trading-Cycles; Core; Splittings; Complexity
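A minimal sketch of the classical TTC procedure that ATTC augments, in the basic one-good-per-agent housing market; the preference lists below are made up, and the paper's setting further allows multiple goods per agent with lexicographic preferences.

def ttc(prefs):
    """prefs[i]: agent i's ranking of owners (most preferred good first).
    Returns a dict mapping each agent to the agent whose good they receive."""
    remaining = set(prefs)
    assignment = {}
    while remaining:
        # Each remaining agent points at the owner of its favorite remaining
        # good; following the pointers must eventually run into a cycle.
        point = {i: next(j for j in prefs[i] if j in remaining) for i in remaining}
        walk, cur = [], next(iter(remaining))
        while cur not in walk:
            walk.append(cur)
            cur = point[cur]
        cycle = walk[walk.index(cur):]
        for i in cycle:                  # trade along the cycle
            assignment[i] = point[i]
        remaining -= set(cycle)
    return assignment

# Agents 0 and 1 each prefer the other's good; agent 2 prefers its own.
print(ttc({0: [1, 0, 2], 1: [0, 2, 1], 2: [2, 0, 1]}))
# agents 0 and 1 swap goods; agent 2 keeps its own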
【Paper Link】 【Pages】:914-920
【Authors】: Jiarui Gan ; Bo An ; Yevgeniy Vorobeychik
【Abstract】: Stackelberg security games have been widely deployed in recent years to schedule security resources. An assumption in most existing security game models is that one security resource assigned to a target only protects that target. However, in many important real-world security scenarios, when a resource is assigned to a target, it exhibits protection externalities: that is, it also protects other “neighbouring” targets. We investigate such Security Games with Protection Externalities (SPEs). First, we demonstrate that computing a strong Stackelberg equilibrium for an SPE is NP-hard, in contrast with traditional Stackelberg security games which can be solved in polynomial time. On the positive side, we propose a novel column-generation-based approach—CLASPE—to solve SPEs. CLASPE features the following novelties: 1) a novel mixed-integer linear programming formulation for the slave problem; 2) an extended greedy approach with a constant-factor approximation ratio to speed up the slave problem; and 3) a linear-scale linear programming that efficiently calculates the upper bounds of target-defined subproblems for pruning. Our experimental evaluation demonstrates that CLASPE enables us to scale to realistic-sized SPE problem instances.
【Keywords】: Game theory; Stackelberg game; Protection externality
【Paper Link】 【Pages】:921-928
【Authors】: Chen Hajaj ; John P. Dickerson ; Avinatan Hassidim ; Tuomas Sandholm ; David Sarne
【Abstract】: We present a credit-based matching mechanism for dynamic barter markets — and kidney exchange in particular — that is both strategy-proof and efficient, that is, it guarantees truthful disclosure of donor-patient pairs from the transplant centers and results in the maximum global matching. Furthermore, the mechanism is individually rational in the sense that, in the long run, it guarantees each transplant center more matches than the center could have achieved alone. The mechanism does not require assumptions about the underlying distribution of compatibility graphs — a nuance that has previously produced conflicting results in other aspects of theoretical kidney exchange. Our results apply not only to matching via 2-cycles: the matchings can also include cycles of any length and altruist-initiated chains, which is important at least in kidney exchanges. The mechanism can also be adjusted to guarantee immediate individual rationality at the expense of economic efficiency, while preserving strategy-proofness via the credits. This circumvents a well-known impossibility result in static kidney exchange concerning the existence of an individually rational, strategy-proof, and maximal mechanism. We show empirically that the mechanism results in significant gains on data from a national kidney exchange that includes 59% of all US transplant centers.
【Keywords】: Dynamic mechanism design; kidney exchange
【Paper Link】 【Pages】:929-935
【Authors】: Martin Hoefer ; Daniel Vaz ; Lisa Wagner
【Abstract】: Coalition formation is a fundamental problem in the organization of many multi-agent systems. In large populations, the formation of coalitions is often restricted by structural visibility and locality constraints under which agents can reorganize. We capture and study this aspect using a novel network-based model for dynamic locality within the popular framework of hedonic coalition formation games. We analyze the effects of network-based visibility and structure on the convergence of coalition formation processes to stable states. Our main result is a tight characterization of the structures based on which dynamic coalition formation can stabilize quickly. Maybe surprisingly, polynomial-time convergence can be achieved if and only if coalition formation is based on complete or star graphs.
【Keywords】: Coalition Formation; Hedonic Games; Stable Matching; Locality
【Paper Link】 【Pages】:936-943
【Authors】: Hadi Hosseini ; Kate Larson ; Robin Cohen
【Abstract】: We consider the problem of repeatedly matching a set of alternatives to a set of agents with dynamic ordinal preferences. Despite a recent focus on designing one-shot matching mechanisms in the absence of monetary transfers, little study has been done on the strategic behavior of agents in sequential assignment problems. We formulate a generic dynamic matching problem via a sequential stochastic matching process. We design a mechanism based on random serial dictatorship (RSD) that, given any history of preferences and matching decisions, guarantees global stochastic strategyproofness while satisfying desirable local properties. We further investigate the notion of envy-freeness in such sequential settings.
【Keywords】: Matching; Random Assignment; Strategyproofness; Dynamic Preferences; Mechanism Design
【Paper Link】 【Pages】:944-950
【Authors】: Anna Karlin ; Eric Lei
【Abstract】: Consider a scenario in which there are multiple employers competing to hire the best possible employee. How does the competition between the employers affect their hiring strategies or their ability to hire one of the best possible candidates? In this paper, we address this question by studying a generalization of the classical secretary problem from optimal stopping theory: a set of ranked employers compete to hire from the same random stream of employees, and each employer wishes to hire the best candidate in the bunch. We show how to derive subgame-perfect Nash equilibrium strategies in this game and analyze the impact the competition has on the quality of the hires as a function of the rank of the employer. We present numerical results from simulations of these strategies.
【Keywords】: algorithmic game theory; secretary problem; optimal stopping; subgame-perfect equilibrium
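A simulation sketch of the classical single-employer baseline that the paper generalizes: skip the first n/e candidates, then hire the first one better than everything seen so far, which secures the best candidate with probability about 1/e. The competitive multi-employer equilibrium strategies of the paper are not reproduced here.

import math, random

def secretary_trial(n):
    ranks = list(range(n))          # 0 denotes the best candidate
    random.shuffle(ranks)
    cutoff = int(n / math.e)
    best_seen = min(ranks[:cutoff], default=n)
    for r in ranks[cutoff:]:
        if r < best_seen:
            return r == 0           # hired this candidate; was it the best?
    return False                    # walked past everyone without hiring

n, trials = 100, 20000
wins = sum(secretary_trial(n) for _ in range(trials))
print(f"P(hire the best) ~ {wins / trials:.3f}  (theory: 1/e ~ {1 / math.e:.3f})")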
【Paper Link】 【Pages】:951-957
【Authors】: Ryoji Kurata ; Masahiro Goto ; Atsushi Iwasaki ; Makoto Yokoo
【Abstract】: School choice programs are implemented to give students/parents an opportunity to choose the public school the students attend. Controlled school choice programs need to provide choices for students/parents while maintaining distributional constraints on the composition of students, typically in terms of socioeconomic status. Previous work shows that setting soft-bounds, which flexibly change the priorities of students based on their types, is more appropriate than setting hard-bounds, which strictly limit the number of accepted students for each type. We consider a case where soft-bounds are imposed and one student can belong to multiple types, e.g., "financially-distressed" and "minority" types. We first show that when we apply a model that is a straightforward extension of an existing model for disjoint types, there is a chance that no stable matching exists. Thus, we propose an alternative model and an alternative stability definition, where a school has reserved seats for each type. We show that a stable matching is guaranteed to exist in this model, and develop a mechanism called Deferred Acceptance for Overlapping Types (DA-OT). The DA-OT mechanism is strategy-proof and obtains the student-optimal matching within all stable matchings. Computer simulation results illustrate that DA-OT outperforms an artificial cap mechanism, where the number of seats for each type is fixed.
【Keywords】: Matching theory, School choice, Affirmative action, Strategy-proof, Stability
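A minimal sketch of plain student-proposing deferred acceptance, the procedure DA-OT extends with per-type reserved seats; the preferences, priorities, and capacities below are hypothetical.

def deferred_acceptance(student_prefs, school_priority, capacity):
    next_choice = {s: 0 for s in student_prefs}   # next school each student tries
    held = {c: [] for c in capacity}              # tentative acceptances
    free = [s for s in student_prefs]
    while free:
        s = free.pop()
        if next_choice[s] >= len(student_prefs[s]):
            continue                               # student exhausted all schools
        c = student_prefs[s][next_choice[s]]
        next_choice[s] += 1
        held[c].append(s)
        if len(held[c]) > capacity[c]:
            # Over capacity: reject the lowest-priority tentatively held student.
            held[c].sort(key=school_priority[c].index)
            free.append(held[c].pop())
    return held

students = {'s1': ['A', 'B'], 's2': ['A', 'B'], 's3': ['B', 'A']}
priority = {'A': ['s2', 's1', 's3'], 'B': ['s1', 's3', 's2']}
print(deferred_acceptance(students, priority, {'A': 1, 'B': 2}))
# s2 takes the single seat at A; s1 is rejected there and joins s3 at B.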
【Paper Link】 【Pages】:958-964
【Authors】: Aron Laszka ; Yevgeniy Vorobeychik ; Xenofon D. Koutsoukos
【Abstract】: To penetrate sensitive computer networks, attackers can use spear phishing to sidestep technical security mechanisms by exploiting the privileges of careless users. In order to maximize their success probability, attackers have to target the users that constitute the weakest links of the system. The optimal selection of these target users takes into account both the damage that can be caused by a user and the probability of a malicious e-mail being delivered to and opened by a user. Since attackers select their targets in a strategic way, the optimal mitigation of these attacks requires the defender to also personalize the e-mail filters by taking into account the users' properties. In this paper, we assume that a learned classifier is given and propose strategic per-user filtering thresholds for mitigating spear-phishing attacks. We formulate the problem of filtering targeted and non-targeted malicious e-mails as a Stackelberg security game. We characterize the optimal filtering strategies and show how to compute them in practice. Finally, we evaluate our results using two real-world datasets and demonstrate that the proposed thresholds lead to lower losses than non-strategic thresholds.
【Keywords】: game theory; spear-phishing; machine learning; e-mail filtering; targeted attacks
【Paper Link】 【Pages】:965-971
【Authors】: Hooyeon Lee ; Yoav Shoham
【Abstract】: We consider the situation in which an organizer is trying to convene an event, and needs to choose whom out of a given set of agents to invite. Agents have preferences over how many attendees should be at the event and possibly also who the attendees should be. This induces a stability requirement: all invited agents should prefer attending to not attending, and all the other agents should not regret not being invited. The organizer's objective is to find an invitation of maximum size, subject to the stability requirement. We investigate the computational complexity of finding such an invitation when agents are truthful, as well as the mechanism design problem when agents act strategically.
【Keywords】: stable invitation; group scheduling
【Paper Link】 【Pages】:972-978
【Authors】: Omer Lev ; Joel Oren ; Craig Boutilier ; Jeffrey S. Rosenschein
【Abstract】: We study a game with strategic vendors (the agents) who own multiple items and a single buyer with a submodular valuation function. The goal of the vendors is to maximize their revenue via pricing of the items, given that the buyer will buy the set of items that maximizes his net payoff (valuation minus prices). We show this game may not always have a pure Nash equilibrium, in contrast to previous results for the special case where each vendor owns a single item. We do so by relating our game to an intermediate, discrete game in which the vendors only choose the available items, and their prices are set exogenously afterwards. We further make use of the intermediate game to provide tight bounds on the price of anarchy for the subset games that have pure Nash equilibria; we find that the optimal PoA reached in the previous special cases does not hold, but only a logarithmic one. Finally, we show that for a special case of submodular functions, efficient pure Nash equilibria always exist.
【Keywords】: Game theory; Equilibrium; Price of anarchy; Price of stability
【Paper Link】 【Pages】:979-985
【Authors】: Yuqian Li ; Vincent Conitzer
【Abstract】: In cooperative game theory, it is typically assumed that the value of each coalition is known. We depart from this, assuming that v(S) is only a noisy estimate of the true value V (S), which is not yet known. In this context, we investigate which solution concepts maximize the probability of ex-post stability (after the true values are revealed). We show how various conditions on the noise characterize the least core and the nucleolus as optimal. Modifying some aspects of these conditions to (arguably) make them more realistic, we obtain characterizations of new solution concepts as being optimal, including the partial nucleolus, the multiplicative least core, and the multiplicative nucleolus.
【Keywords】: cooperative game theory; uncertainty; characterizing solution concepts
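【Code Sketch】: A minimal Python sketch of the classical least-core linear program that the paper's characterizations build on, solved with scipy's linprog; the toy 3-player game at the end is illustrative only, not an example from the paper.
from itertools import combinations
import numpy as np
from scipy.optimize import linprog

def least_core(n, v):
    # v: value function over frozensets of players 0..n-1
    # solves  min eps  s.t.  x(S) + eps >= v(S) for all nonempty proper S,
    #                        x(N) = v(N)
    players = list(range(n))
    c = np.zeros(n + 1)
    c[-1] = 1.0                                    # objective: minimize eps
    A_ub, b_ub = [], []
    for r in range(1, n):
        for S in combinations(players, r):
            row = np.zeros(n + 1)
            row[list(S)] = -1.0
            row[-1] = -1.0                         # -x(S) - eps <= -v(S)
            A_ub.append(row)
            b_ub.append(-v(frozenset(S)))
    A_eq = np.ones((1, n + 1))
    A_eq[0, -1] = 0.0                              # efficiency: x(N) = v(N)
    b_eq = [v(frozenset(players))]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(None, None)] * (n + 1))
    return res.x[:-1], res.x[-1]                   # payoff vector, eps

# toy 3-player game: any coalition of size >= 2 is worth 1
x, eps = least_core(3, lambda S: 1.0 if len(S) >= 2 else 0.0)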
【Paper Link】 【Pages】:986-992
【Authors】: Reshef Meir ; David C. Parkes
【Abstract】: We put forward a new model of congestion games where agents have uncertainty over the routes used by other agents. We take a non-probabilistic approach, assuming that each agent knows that the number of agents using an edge is within a certain range. Given this uncertainty, we model agents who either minimize their worst-case cost (WCC) or their worst-case regret (WCR), and study the implications for equilibrium existence, convergence through adaptive play, and efficiency. Under the WCC behavior the game reduces to a modified congestion game, and welfare improves when agents have moderate uncertainty. Under WCR behavior the game is not, in general, a congestion game, but we show convergence and efficiency bounds for a simple class of games.
【Keywords】: congestion games; uncertainty; potential; routing
【Paper Link】 【Pages】:993-999
【Authors】: Svetlana Obraztsova ; Evangelos Markakis ; Maria Polukarov ; Zinovi Rabinovich ; Nicholas R. Jennings
【Abstract】: We study convergence properties of iterative voting procedures. Such procedures are defined by a voting rule and a (restricted) iterative process, where at each step one agent can modify his vote towards a better outcome for himself. It is already known that if the iteration dynamics (the manner in which voters are allowed to modify their votes) are unrestricted, then the voting process may not converge. For most common voting rules this may be observed even under the best response dynamics limitation. It is therefore important to investigate whether and which natural restrictions on the dynamics of iterative voting procedures can guarantee convergence. To this end, we provide two general conditions on the dynamics based on iterative myopic improvements, each of which is sufficient for convergence. We then identify several classes of voting rules (including Positional Scoring Rules, Maximin, Copeland and Bucklin), along with their corresponding iterative processes, for which at least one of these conditions holds.
【Keywords】:
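【Code Sketch】: A minimal Python simulation of iterative plurality voting under best-response dynamics, the kind of process whose convergence the paper studies; complete strict rankings and lexicographic tie-breaking are assumptions of this sketch.
def iterative_plurality(prefs, max_rounds=100):
    # prefs[i]: voter i's strict ranking, most preferred first; starting from
    # truthful votes, voters repeatedly switch to a best response
    cands = sorted({c for p in prefs for c in p})
    votes = [p[0] for p in prefs]

    def winner(vs):
        tally = {c: sum(v == c for v in vs) for c in cands}
        top = max(tally.values())
        return min(c for c in cands if tally[c] == top)  # lexicographic ties

    for _ in range(max_rounds):
        moved = False
        for i, p in enumerate(prefs):
            current = winner(votes)
            # winners voter i can induce by changing his single vote
            reachable = [c for c in cands
                         if winner(votes[:i] + [c] + votes[i + 1:]) == c]
            target = min(reachable, key=p.index)
            if p.index(target) < p.index(current):
                votes[i] = target
                moved = True
        if not moved:
            return votes, winner(votes)            # converged
    return None                                    # no convergence observed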
【Paper Link】 【Pages】:1000-1006
【Authors】: Ariel D. Procaccia ; Nisarg Shah ; Yair Zick
【Abstract】: We present the first model of optimal voting under adversarial noise. From this viewpoint, voting rules are seen as error-correcting codes: their goal is to correct errors in the input rankings and recover a ranking that is close to the ground truth. We derive worst-case bounds on the relation between the average accuracy of the input votes, and the accuracy of the output ranking. Empirical results from real data show that our approach produces significantly more accurate rankings than alternative approaches.
【Keywords】: Voting rules, Ground truth, Adversarial noise
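【Code Sketch】: Under the error-correcting view above, a natural aggregator returns the ranking closest to the input votes. A brute-force Python sketch of such a distance-minimizing rule, using Kendall-tau distance; this illustrates the decoding step only, not the paper's worst-case analysis.
from itertools import combinations, permutations

def kendall_tau(r1, r2):
    # number of candidate pairs ordered differently by the two rankings
    pos1 = {c: i for i, c in enumerate(r1)}
    pos2 = {c: i for i, c in enumerate(r2)}
    return sum((pos1[a] < pos1[b]) != (pos2[a] < pos2[b])
               for a, b in combinations(r1, 2))

def aggregate(votes):
    # return the ranking minimizing total distance to the votes
    # (brute force; exponential in the number of candidates)
    return min(permutations(votes[0]),
               key=lambda r: sum(kendall_tau(r, v) for v in votes))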
【Paper Link】 【Pages】:1007-1013
【Authors】: Zinovi Rabinovich ; Svetlana Obraztsova ; Omer Lev ; Evangelos Markakis ; Jeffrey S. Rosenschein
【Abstract】: Following recent studies of iterative voting and its effects on plurality vote outcomes, we provide characterisations and complexity results for three models of iterative voting under the plurality rule. Our focus is on providing a better understanding regarding the set of equilibria attainable by iterative voting processes. We start with the basic model of plurality voting. We first establish some useful properties of equilibria, reachable by iterative voting, which enable us to show that deciding whether a given profile is an iteratively reachable equilibrium is NP-complete. We then proceed to combine iterative voting with the concept of truth bias, a model where voters prefer to be truthful when they cannot affect the outcome. We fully characterise the set of attainable truth-biased equilibria, and show that it is possible to determine all such equilibria in polynomial time. Finally, we also examine the model of lazy voters, in which a voter may choose to abstain from the election. We establish convergence of the iterative process, albeit not necessarily to a Nash equilibrium. As in the case with truth bias, we also provide a polynomial time algorithm to find all the attainable equilibria.
【Keywords】:
【Paper Link】 【Pages】:1014-1020
【Authors】: Goran Radanovic ; Boi Faltings
【Abstract】: The modern web critically depends on aggregation of information from self-interested agents, for example opinion polls, product ratings, or crowdsourcing. We consider a setting where multiple objects (questions, products, tasks) are evaluated by a group of agents. We first construct a minimal peer prediction mechanism that elicits honest evaluations from a homogeneous population of agents with different private beliefs. Second, we show that it is impossible to strictly elicit honest evaluations from a heterogeneous group of agents with different private beliefs. Nevertheless, we provide a modified version of a divergence-based Bayesian Truth Serum that incentivizes agents to report consistently, making truthful reporting a weak equilibrium of the mechanism.
【Keywords】: Mechanism Design; Information Elicitation; Peer Prediction
【Paper Link】 【Pages】:1021-1028
【Authors】: Erel Segal-Halevi ; Avinatan Hassidim ; Yonatan Aumann
【Abstract】: We consider the problem of fair division of a two-dimensional heterogeneous good among several agents. Applications include division of land as well as ad space in print and electronic media. Classical cake cutting protocols either consider a one-dimensional resource, or allocate each agent several disconnected pieces. In practice, however, the two-dimensional shape of the allotted piece is of crucial importance in many applications, e.g., squares or bounded aspect-ratio rectangles are most useful for building houses as well as advertisements. We thus introduce and study the problem of envy-free two-dimensional division wherein the utility of the agents depends on the geometric shape of the allocated pieces (as well as the location and size). In addition to envy-freeness, we require that the fraction allocated to each agent be at least a certain constant that depends only on the shape of the cake and the number of agents. We focus on the case where the allotted pieces must be square and the cakes are either squares or the unbounded plane. We provide algorithms for the problem for settings with two and three agents.
【Keywords】: cake-cutting; envy-free; two-dimensional; square; quarter-plane; fair-division; moving-knife
【Paper Link】 【Pages】:1029-1035
【Authors】: Paolo Serafino ; Carmine Ventre
【Abstract】: In this paper, we consider the facility location problem under a novel model recently proposed in the literature, which combines the no-money constraint (i.e., the impossibility of employing monetary transfers between the mechanism and the agents) with the presence of heterogeneous facilities, i.e., facilities serving different purposes. Agents thus have a significantly different cost model w.r.t. the classical model with homogeneous facilities studied in the literature. We initiate the study of non-utilitarian optimization functions under this novel model. In particular, we consider the case where the optimization goal consists of minimizing the maximum connection cost of the agents. In this setting, we investigate both deterministic and randomized algorithms and derive both lower and upper bounds regarding the approximability of strategyproof mechanisms.
【Keywords】: Algorithmic Mechanism Design; Mechanisms without Money; Facility Location
【Paper Link】 【Pages】:1036-1042
【Authors】: Oskar Skibski ; Tomasz P. Michalak ; Yuko Sakurai ; Michael Wooldridge ; Makoto Yokoo
【Abstract】: We propose a novel representation for coalitional games with externalities, called Partition Decision Trees. This representation is based on rooted directed trees, where non-leaf nodes are labelled with agents' names, leaf nodes are labelled with payoff vectors, and edges indicate membership of agents in coalitions. We show that this representation is fully expressive, and for certain classes of games significantly more concise than an extensive representation. Most importantly, Partition Decision Trees are the first formalism in the literature under which most of the direct extensions of the Shapley value to games with externalities can be computed in polynomial time.
【Keywords】: coalitional games; Shapley value; externalities; partition function games; decision trees
【Paper Link】 【Pages】:1043-1049
【Authors】: Rohith Dwarakanath Vallam ; Priyanka Bhatt ; Debmalya Mandal ; Y. Narahari
【Abstract】: Increased interest in web-based education has spurred the proliferation of online learning environments. However, these platforms suffer from high dropout rates due to lack of sustained motivation among the students taking the course. In an effort to address this problem, we propose an incentive-based, instructor-driven approach to orchestrate the interactions in online educational forums (OEFs). Our approach takes into account the heterogeneity in skills among the students as well as the limited budget available to the instructor. We first analytically model OEFs in a non-strategic setting using ideas from lumpable continuous time Markov chains and compute expected aggregate transient net-rewards for the instructor and the students. We next consider a strategic setting where we use the rewards computed above to set up a mixed-integer linear program which views an OEF as a single-leader-multiple-followers Stackelberg game and recommends an optimal plan to the instructor for maximizing student participation. Our experimental results reveal several interesting phenomena including a striking non-monotonicity in the level of participation of students vis-a-vis the instructor's arrival rate.
【Keywords】: online educational forums; incentive design; Stackelberg game; continuous time Markov chains; mixed integer linear program; instructor-student interactions
【Paper Link】 【Pages】:1050-1056
【Authors】: Mason Wright ; Yevgeniy Vorobeychik
【Abstract】: Team formation is a core problem in AI. Remarkably, little prior work has addressed the problem of mechanism design for team formation, accounting for the need to elicit agents' preferences over potential teammates. Coalition formation in the related hedonic games has received much attention, but only from the perspective of coalition stability, with little emphasis on the mechanism design objectives of true preference elicitation, social welfare, and equity. We present the first formal mechanism design framework for team formation, building on recent combinatorial matching market design literature. We exhibit four mechanisms for this problem: two novel, and two simple extensions of known mechanisms from other domains. Two of these (one new, one known) have desirable theoretical properties. However, we use extensive experiments to show that our second novel mechanism, despite having no theoretical guarantees, empirically achieves good incentive compatibility, welfare, and fairness.
【Keywords】: mechanism design; hedonic games; coalition formation
【Paper Link】 【Pages】:1057-1063
【Authors】: Haifeng Xu ; Zinovi Rabinovich ; Shaddin Dughmi ; Milind Tambe
【Abstract】: Stackelberg security games have been widely deployed to protect real-world assets. The main solution concept there is the Strong Stackelberg Equilibrium (SSE), which optimizes the defender's random allocation of limited security resources. However, solely deploying the SSE mixed strategy has limitations. In the extreme case, there are security games where the defender is able to defend all the assets "almost perfectly" at the SSE, but she still sustains significant loss. In this paper, we propose an approach for improving the defender's utility in such scenarios. Perhaps surprisingly, our approach is to strategically reveal to the attacker information about the sampled pure strategy. Specifically, we propose a two-stage security game model, where in the first stage the defender allocates resources and the attacker selects a target to attack, and in the second stage the defender strategically reveals local information about that target, potentially deterring the attacker's attack plan. We then study how the defender can play optimally in both stages. We show, theoretically and experimentally, that the two-stage security game model allows the defender to gain strictly better utility than the SSE.
【Keywords】: Security Games; Equilibrium Computation; Strategic Information
【Paper Link】 【Pages】:1064-1070
【Authors】: Dengji Zhao ; Sarvapali D. Ramchurn ; Enrico H. Gerding ; Nicholas R. Jennings
【Abstract】: We consider dual-role exchange markets, where traders can offer to both buy and sell the same commodity in the exchange but, if they transact, they can only be either a buyer or a seller, which is determined by the market mechanism. To design desirable mechanisms for such exchanges, we show that existing solutions may not be incentive compatible, and more importantly, cause the market maker to suffer a significant deficit. Hence, to combat this problem, following McAfee's trade reduction approach, we propose a new trade reduction mechanism, called balanced trade reduction, that is incentive compatible and also provides flexible trade-offs between efficiency and deficit.
【Keywords】: Double Auction; Trade Reduction; Budget Balance; Ridesharing; EV-Charging
【Paper Link】 【Pages】:1071-1078
【Authors】: Song Zuo ; Pingzhong Tang
【Abstract】: The problem of computing the optimal strategy to commit to in various games has attracted intense research interest and has important real-world applications such as security (attacker-defender) games. In this paper, we consider the problem of computing the optimal leader's machine to commit to in a two-person repeated game, where the follower also plays a machine strategy. A machine strategy is a generalization of an automaton strategy, in which the number of states in the automaton may be infinite. We begin with the simple case where both players are confined to automaton strategies, and then extend to general (possibly randomized) machine strategies. We first give a concise linear program to compute the optimal leader's strategy and give two efficient implementations of the linear program: one via enumeration of a convex hull and the other via randomization. We then investigate the case where the two machines have different levels of intelligence, in the sense that one machine is able to record more history information than the other. We show that an intellectually superior leader, sometimes considered to be exploited by the follower, can figure out the follower's machine by brute force and exploit the follower in return.
【Keywords】: Stackelberg game, Repeated game, Bounded Rationality, Machine strategy
【Paper Link】 【Pages】:1079-1085
【Authors】: Hiromasa Arai ; Crystal Maung ; Haim Schweitzer
【Abstract】: Approximating a matrix by a small subset of its columns is a known problem in numerical linear algebra. Algorithms that address this problem have been used in areas which include, among others, sparse approximation, unsupervised feature selection, data mining, and knowledge representation. Such algorithms have been investigated since the 1960s, with recent results that use randomization. The problem is believed to be NP-hard, and to the best of our knowledge there are no previously published algorithms aimed at computing optimal solutions. We show how to model the problem as a graph search, and propose a heuristic based on eigenvalues of related matrices. Applying the A* search strategy with this heuristic is guaranteed to find the optimal solution. Experimental results on common datasets show that the proposed algorithm can effectively select columns from moderate-size matrices, typically improving the run time of exhaustive search by orders of magnitude. We also show how to combine the proposed algorithm with other non-optimal (but much faster) algorithms in a "two stage" framework, which is guaranteed to improve the accuracy of the other algorithms.
【Keywords】: A*, Unsupervised feature selection
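【Code Sketch】: A minimal numpy sketch of A* over column subsets with an eigenvalue-flavored admissible heuristic; the specific bound used here (a tail of singular values of the residual) is this sketch's assumption, not necessarily the paper's heuristic. It is admissible because adding m columns can reduce the rank of the residual by at most m, so the remaining singular values lower-bound the error of any completion.
import heapq
from itertools import count
import numpy as np

def _residual(A, cols):
    # residual of projecting A onto the span of the chosen columns
    if not cols:
        return A.copy()
    Q, _ = np.linalg.qr(A[:, sorted(cols)])
    return A - Q @ (Q.T @ A)

def _bound(A, cols, k):
    # admissible lower bound on the final Frobenius error of any size-k superset
    s = np.linalg.svd(_residual(A, cols), compute_uv=False)
    m = k - len(cols)
    return float(np.sqrt(np.sum(s[m:] ** 2)))

def astar_columns(A, k):
    # best-first search over subsets; the first size-k subset popped is
    # optimal, since the bound is exact at goal states
    n = A.shape[1]
    tie = count()                                   # tie-breaker for the heap
    heap = [(_bound(A, frozenset(), k), next(tie), frozenset())]
    seen = {frozenset()}
    while heap:
        f, _, S = heapq.heappop(heap)
        if len(S) == k:
            return sorted(S), f
        for j in range(n):
            T = S | {j}
            if j not in S and T not in seen:
                seen.add(T)
                heapq.heappush(heap, (_bound(A, T, k), next(tie), T))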
【Paper Link】 【Pages】:1086-1092
【Authors】: Joseph Kelly Barker ; Richard E. Korf
【Abstract】: We present an intuitive explanation for the limited effectiveness of front-to-end bidirectional heuristic search, supported with extensive evidence from many commonly-studied domains. While previous work has proved the limitations of specific algorithms, we show that any front-to-end bidirectional heuristic search algorithm will likely be dominated by unidirectional heuristic search or bidirectional brute-force search. We also demonstrate a pathological case where bidirectional heuristic search is the dominant algorithm, so a stronger claim cannot be made. Finally, we show that on the four-peg Towers of Hanoi with arbitrary start and goal states, bidirectional brute-force search outperforms unidirectional heuristic search using pattern-database heuristics.
【Keywords】: Bidirectional Search; Pathfinding; Heuristic Search
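【Code Sketch】: A minimal Python sketch of bidirectional brute-force (breadth-first) search on an explicit unweighted graph, the kind of baseline the paper compares against; the adjacency-dict representation is an assumption of this sketch.
def bidirectional_bfs(graph, source, target):
    # shortest-path length in an unweighted graph (adjacency dict), growing
    # the smaller frontier each step and stopping when the frontiers meet
    if source == target:
        return 0
    dist_s, dist_t = {source: 0}, {target: 0}
    frontier_s, frontier_t = {source}, {target}
    while frontier_s and frontier_t:
        if len(frontier_s) > len(frontier_t):      # always expand the cheaper side
            frontier_s, frontier_t = frontier_t, frontier_s
            dist_s, dist_t = dist_t, dist_s
        nxt = set()
        for u in frontier_s:
            for v in graph[u]:
                if v in dist_t:                    # frontiers met
                    return dist_s[u] + 1 + dist_t[v]
                if v not in dist_s:
                    dist_s[v] = dist_s[u] + 1
                    nxt.add(v)
        frontier_s = nxt
    return None                                    # disconnected
The efficiency argument is visible in the sketch: each direction only needs to reach roughly half the solution depth, so the total number of expansions is about twice the square root of what unidirectional brute-force search would generate.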
【Paper Link】 【Pages】:1093-1099
【Authors】: Nawal Benabbou ; Patrice Perny
【Abstract】: This paper proposes incremental preference elicitation methods for multiobjective state space search. Our approach consists in integrating weight elicitation and search to determine, in a vector-valued state-space graph, a solution path that best fits the Decision Maker's preferences. We first assume that the objective weights are imprecisely known and propose a state space search procedure to determine the set of possibly optimal solutions. Then, we introduce incremental elicitation strategies during the search that use queries to progressively reduce the set of admissible weights until a nearly-optimal path can be identified. The validity of our algorithms is established and numerical tests are provided to test their efficiency both in terms of number of queries and solution times.
【Keywords】: multiobjective optimisation; state space search; preference elicitation
【Paper Link】 【Pages】:1100-1106
【Authors】: Adi Botea ; Ben Strasser ; Daniel Harabor
【Abstract】: In this work we give a first tractability analysis of Compressed Path Databases, space efficient oracles used to very quickly identify the first arc on a shortest path. We study the complexity of computing an optimal compressed path database for general directed and undirected graphs. We find that in both cases the problem is NP-complete. We also show that, for graphs which can be decomposed along articulation points, the problem can be decomposed into independent parts, with a corresponding reduction in its level of difficulty. In particular, this leads to simple and tractable algorithms which yield optimal compression results for trees.
【Keywords】:
【Paper Link】 【Pages】:1107-1113
【Authors】: Shaowei Cai ; Jinkun Lin ; Kaile Su
【Abstract】: Minimum Vertex Cover (MinVC) is a well known NP-hard combinatorial optimization problem, and local search has been shown to be one of the most effective approaches to this problem. State-of-the-art MinVC local search algorithms employ edge weighting techniques and prefer to select vertices with higher weighted score. These algorithms are not robust and especially have poor performance on instances with structures which defeat greedy heuristics. In this paper, we propose a vertex weighting scheme to address this shortcoming, and incorporate it into the current best MinVC local search algorithm NuMVC, leading to a new algorithm called TwMVC. Our experiments show that TwMVC outperforms NuMVC on the standard benchmarks, namely DIMACS and BHOSLIB. To the best of our knowledge, TwMVC is the first MinVC algorithm that attains the best known solution for all instances in both benchmarks. Further, TwMVC shows superiority on a benchmark of real-world networks.
【Keywords】:
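【Code Sketch】: A minimal Python sketch of an edge-weighting local search skeleton in the NuMVC style that the paper builds on; this is not the paper's vertex weighting scheme, and the naive score recomputation (which real solvers replace with incremental bookkeeping) is a simplification.
import random

def weighted_vc_search(edges, steps=20000, seed=0):
    # uncovered edges accumulate weight over time, steering the search away
    # from configurations that defeat purely greedy heuristics
    rng = random.Random(seed)
    w = {e: 1 for e in edges}

    def uncovered(cover):
        return [e for e in edges if e[0] not in cover and e[1] not in cover]

    def loss(v, cover):  # weight of edges covered only by v
        return sum(w[(a, b)] for a, b in edges
                   if v in (a, b) and (b if v == a else a) not in cover)

    cover = set()
    while uncovered(cover):                     # greedy initial cover
        deg = {}
        for a, b in uncovered(cover):
            deg[a] = deg.get(a, 0) + 1
            deg[b] = deg.get(b, 0) + 1
        cover.add(max(deg, key=deg.get))
    best = set(cover)

    for _ in range(steps):
        if not uncovered(cover):
            best = set(cover)                   # record, then try one smaller
            cover.remove(min(cover, key=lambda v: loss(v, cover)))
            continue
        if cover:                               # swap out the cheapest vertex
            cover.remove(min(cover, key=lambda v: loss(v, cover)))
        a, b = rng.choice(uncovered(cover))
        gain = lambda v: sum(w[e] for e in uncovered(cover) if v in e)
        cover.add(a if gain(a) >= gain(b) else b)
        for e in uncovered(cover):
            w[e] += 1                           # edge weighting update
    return best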
【Paper Link】 【Pages】:1114-1120
【Authors】: Katharina Eggensperger ; Frank Hutter ; Holger H. Hoos ; Kevin Leyton-Brown
【Abstract】: Hyperparameter optimization is crucial for achieving peak performance with many machine learning algorithms; however, the evaluation of new optimization techniques on real-world hyperparameter optimization problems can be very expensive. Therefore, experiments are often performed using cheap synthetic test functions with characteristics rather different from those of real benchmarks of interest. In this work, we introduce another option: cheap-to-evaluate surrogates of real hyperparameter optimization benchmarks that share the same hyperparameter spaces and feature similar response surfaces. Specifically, we train regression models on data describing a machine learning algorithm’s performance depending on its hyperparameter setting, and then cheaply evaluate hyperparameter optimization methods using the model’s performance predictions in lieu of running the real algorithm. We evaluated a wide range of regression techniques, both in terms of how well they predict the performance of new hyperparameter settings and in terms of the quality of surrogate benchmarks obtained. We found that tree-based models capture the performance of several machine learning algorithms well and yield surrogate benchmarks that closely resemble real-world benchmarks, while being much easier to use and orders of magnitude cheaper to evaluate.
【Keywords】: Sequential Model-based Bayesian Optimization; Hyperparameter optimization; Exploitation of benchmarks and experimentation; Performance modeling
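【Code Sketch】: A minimal Python sketch of the surrogate-benchmark idea: fit a regression model to logged (hyperparameter setting, performance) pairs and answer optimizer queries from the model instead of training the real algorithm; the synthetic data and the choice of a random forest regressor are assumptions of this sketch.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# placeholder for logged benchmark data: rows are hyperparameter settings,
# y is the measured validation error of the real algorithm at each setting
X = rng.uniform(size=(500, 3))
y = (X[:, 0] - 0.3) ** 2 + 0.1 * X[:, 1] + rng.normal(0.0, 0.01, size=500)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

def surrogate_benchmark(config):
    # answer a hyperparameter query from the model; orders of magnitude
    # cheaper per evaluation than running the real training job
    return float(model.predict(np.asarray(config, dtype=float).reshape(1, -1))[0])

# any optimizer can now be tested against the surrogate, e.g. random search:
best = min(rng.uniform(size=(1000, 3)).tolist(), key=surrogate_benchmark)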
【Paper Link】 【Pages】:1121-1127
【Authors】: Caroline Even ; Victor Pillac ; Pascal Van Hentenryck
【Abstract】: Evacuation planning is a critical aspect of disaster preparedness and response to minimize the number of people exposed to a threat. Controlled evacuations aim at managing the flow of evacuees as efficiently as possible and have been shown to produce significant benefits compared to self-evacuations. However, existing approaches do not capture the delays introduced by diverging and crossing evacuation routes, although evidence from actual evacuations highlights that these can lead to significant congestion. This paper introduces the concept of convergent evacuation plans to tackle this issue. It presents a MIP model to obtain optimal convergent evacuation plans which, unfortunately, does not scale to realistic instances. The paper then proposes a two-stage approach that separates the route design and the evacuation scheduling. Experimental results on a real case study show that the two-stage approach produces better primal bounds than the MIP model and is two orders of magnitude faster; it also produces dual bounds stronger than the linear relaxation of the MIP model. Finally, simulations of the evacuation demonstrate that convergent evacuation plans outperform existing approaches for realistic driver behaviors.
【Keywords】:
【Paper Link】 【Pages】:1128-1135
【Authors】: Matthias Feurer ; Jost Tobias Springenberg ; Frank Hutter
【Abstract】: Model selection and hyperparameter optimization is crucial in applying machine learning to a novel dataset. Recently, a subcommunity of machine learning has focused on solving this problem with Sequential Model-based Bayesian Optimization (SMBO), demonstrating substantial successes in many applications. However, for computationally expensive algorithms the overhead of hyperparameter optimization can still be prohibitive. In this paper we mimic a strategy human domain experts use: speed up optimization by starting from promising configurations that performed well on similar datasets. The resulting initialization technique integrates naturally into the generic SMBO framework and can be trivially applied to any SMBO method. To validate our approach, we perform extensive experiments with two established SMBO frameworks (Spearmint and SMAC) with complementary strengths; optimizing two machine learning frameworks on 57 datasets. Our initialization procedure yields mild improvements for low-dimensional hyperparameter optimization and substantially improves the state of the art for the more complex combined algorithm selection and hyperparameter optimization problem.
【Keywords】: Machine Learning; Meta-Learning; Bayesian Optimization; Hyperparameter Optimization; Sequential Model-based Optimization
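【Code Sketch】: A minimal numpy sketch of the warm-start idea: rank previously seen datasets by meta-feature similarity and seed the optimizer with their best known configurations; the Euclidean metric and z-normalization are assumptions of this sketch.
import numpy as np

def warmstart_configs(meta_features, best_configs, new_meta, t=5):
    # meta_features[i]: meta-feature vector of previously seen dataset i
    # best_configs[i]: the best known configuration on dataset i
    # returns the best configurations of the t most similar datasets
    M = np.asarray(meta_features, dtype=float)
    mu, sd = M.mean(0), M.std(0) + 1e-12
    M = (M - mu) / sd                              # normalize meta-features
    q = (np.asarray(new_meta, dtype=float) - mu) / sd
    order = np.argsort(np.linalg.norm(M - q, axis=1))
    return [best_configs[i] for i in order[:t]]
These t configurations then replace the usual random initial design of the SMBO run, which is why the technique plugs into any SMBO method unchanged.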
【Paper Link】 【Pages】:1136-1143
【Authors】: Andreas Fröhlich ; Armin Biere ; Christoph M. Wintersteiger ; Youssef Hamadi
【Abstract】: Satisfiability Modulo Theories (SMT) is essential for many practical applications, e.g., in hard- and software verification, and increasingly also in other scientific areas like computational biology. A large number of applications in these areas benefit from bit-precise reasoning over finite-domain variables. Current approaches in this area translate a formula over bit-vectors to an equisatisfiable propositional formula, which is then given to a SAT solver. In this paper, we present a novel stochastic local search (SLS) algorithm to solve SMT problems, especially those in the theory of bit-vectors, directly on the theory level. We explain how several successful techniques used in modern SLS solvers for SAT can be lifted to the SMT level. Experimental results show that our approach can compete with state-of-the-art bit-vector solvers on many practical instances and, sometimes, outperform existing solvers. This offers interesting possibilities in combining our approach with existing techniques, and, moreover, new insights into the importance of exploiting problem structure in SLS solvers for SAT. Our approach is modular and, therefore, extensible to support other theories, potentially allowing SLS to become part of the more general SMT framework.
【Keywords】: Satisfiability Modulo Theories; Stochastic Local Search; Bit-Vectors
【Paper Link】 【Pages】:1144-1150
【Authors】: Daisuke Hatano ; Takuro Fukunaga ; Takanori Maehara ; Ken-ichi Kawarabayashi
【Abstract】: In this paper, we formulate a new problem related to the well-known influence maximization problem in the context of computational advertising. Our new problem considers allocating marketing channels (e.g., TV, newspaper, and websites) to advertisers from the viewpoint of a matchmaker, a consideration not taken into account in previous studies on influence maximization. The objective of the problem is to find an allocation such that each advertiser can influence some given number of customers while the slots of marketing channels are limited. We propose an algorithm based on Lagrangian decomposition. We empirically show that our algorithm computes better quality solutions than existing algorithms, scales up to graphs of 10M vertices, and performs well particularly in a parallel environment.
【Keywords】: combinatorial optimization; computational advertisement
【Paper Link】 【Pages】:1151-1157
【Authors】: Matthew Hatem ; Scott Kiesel ; Wheeler Ruml
【Abstract】: There are two major paradigms for linear-space heuristic search: iterative deepening (IDA*) and recursive best-first search (RBFS). While the node regeneration overhead of IDA* is easily characterized in terms of the heuristic branching factor, the overhead of RBFS depends on how widely the promising nodes are separated in the search tree, and is harder to anticipate. In this paper, we present two simple techniques for improving the performance of RBFS while maintaining its advantages over IDA*. While these techniques work well in practice, they do not provide any theoretical bounds on the amount of regeneration overhead. To this end, we introduce RBFScr, the first method for provably bounding the regeneration overhead of RBFS. We show empirically that this improves its performance in several domains, both for optimal and suboptimal search, and also yields a better linear-space anytime heuristic search. RBFScr is the first linear-space best-first search robust enough to solve a variety of domains with varying operator costs.
【Keywords】: linear-space search; recursive best-first search; heuristics
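【Code Sketch】: For reference, a minimal Python implementation of plain RBFS (Korf's recursive best-first search), the algorithm whose regeneration overhead the paper bounds; the paper's enhancements and RBFScr itself are not reproduced here.
import math

def rbfs(start, is_goal, successors, h):
    # expands nodes in best-first order using memory linear in the search
    # depth, re-expanding subtrees whose backed-up f-values become best again
    def search(state, g, f_state, bound, path):
        if is_goal(state):
            return path, f_state
        succ = []
        for child, cost in successors(state):
            if child in path:                      # avoid cycles along the path
                continue
            succ.append([max(g + cost + h(child), f_state), g + cost, child])
        if not succ:
            return None, math.inf
        while True:
            succ.sort(key=lambda s: s[0])
            best = succ[0]
            if best[0] > bound:                    # back up the f-value to the parent
                return None, best[0]
            alt = succ[1][0] if len(succ) > 1 else math.inf
            result, best[0] = search(best[2], best[1], best[0],
                                     min(bound, alt), path + [best[2]])
            if result is not None:
                return result, best[0]

    return search(start, 0, h(start), math.inf, [start])[0]
The regeneration overhead the paper targets comes from the `while True` loop: whenever the best child's backed-up value exceeds the sibling bound, its entire subtree is discarded and may be regenerated later.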
【Paper Link】 【Pages】:1158-1164
【Authors】: Carlos Hernández ; Roberto Asín ; Jorge A. Baier
【Abstract】: Generalized Adaptive A* (GAA*) is an incremental algorithm that replans using A* when solving goal-directed navigation problems in dynamic terrain. Immediately after each A* search, it runs an efficient procedure that updates the heuristic values of states that were just expanded by A*, making them more informed. Those updates allow GAA* to speed up subsequent A* searches. Being based on A*, it is simple to describe and communicate; however, it is outperformed at goal-directed navigation by other incremental algorithms like the state-of-the-art D* Lite algorithm. In this paper we show how GAA* can be modified to exploit more information from a previous search in addition to the updated heuristic function. Specifically, we show how GAA* can be modified to utilize the paths found by a previous A* search. Our algorithm — Multipath Generalized Adaptive A* (MPGAA*) — has the same theoretical properties as GAA* and differs from it by only a few lines of pseudocode. Arguably, MPGAA* is simpler to understand than D* Lite. We evaluate MPGAA* over various realistic dynamic terrain settings, and observe that it generally outperforms the state-of-the-art algorithm D* Lite in scenarios resembling outdoor and indoor navigation.
【Keywords】: D* Lite; Generalized Adaptive A*; A*
【Paper Link】 【Pages】:1165-1173
【Authors】: Bojun Huang
【Abstract】: In this paper we show that the alpha-beta algorithm and its successor MT-SSS*, as two classic minimax search algorithms, can be implemented as rollout algorithms, a generic algorithmic paradigm widely used in many domains. Specifically, we define a family of rollout algorithms, in which the rollout policy is restricted to select successor nodes only from a certain subset of the children list. We show that any rollout policy in this family (either deterministic or randomized) is guaranteed to evaluate the game tree correctly with a finite number of rollouts. Moreover, we identify simple rollout policies in this family that "implement" alpha-beta and MT-SSS*. Specifically, given any game tree, the rollout algorithms with these particular policies always visit the same set of leaf nodes in the same order as alpha-beta and MT-SSS*, respectively. Our results suggest that traditional pruning techniques and the recent Monte Carlo Tree Search algorithms, as two competing approaches to game tree evaluation, may be unified under the rollout paradigm.
【Keywords】:
【Paper Link】 【Pages】:1174-1181
【Authors】: Tiep Le ; Tran Cao Son ; Enrico Pontelli ; William Yeoh
【Abstract】: This paper explores the use of answer set programming (ASP) in solving distributed constraint optimization problems (DCOPs). It makes the following contributions: (i) it shows how one can formulate DCOPs as logic programs; (ii) it introduces ASP-DPOP, the first DCOP algorithm that is based on logic programming; (iii) it experimentally shows that ASP-DPOP can be up to two orders of magnitude faster than DPOP (its imperative-programming counterpart) as well as solve some problems that DPOP fails to solve due to memory limitations; and (iv) it demonstrates the applicability of ASP to the wide array of multi-agent problems currently modeled as DCOPs.
【Keywords】: DCOP; answer set programming;
【Paper Link】 【Pages】:1182-1190
【Authors】: Tyler Lu ; Craig Boutilier
【Abstract】: Data-driven analytics — in areas ranging from consumer marketing to public policy — often allow behavior prediction at the level of individuals rather than population segments, offering the opportunity to improve decisions that impact large populations. Modeling such (generalized) assignment problems as linear programs, we propose a general value-directed compression technique for solving such problems at scale. We dynamically segment the population into cells using a form of column generation, constructing groups of individuals who can provably be treated identically in the optimal solution. This compression allows problems that are unsolvable using standard LP techniques to be solved effectively. Indeed, once a compressed LP is constructed, problems can be solved in milliseconds. We provide a theoretical analysis of the methods, outline the distributed implementation of the requisite data processing, and show how a single compressed LP can be used to solve multiple variants of the original LP near-optimally in real time (e.g., to support scenario analysis). We also show how the method can be leveraged in integer programming models. Experimental results on marketing contact optimization and political legislature problems validate the performance of our technique.
【Keywords】: optimization; linear programming; abstraction; marketing optimization; social choice; large-scale optimization; generalized assignment problem; generalized matching; parallel optimization; map reduce;
【Paper Link】 【Pages】:1191-1197
【Authors】: Jincheng Mei ; Kang Zhao ; Bao-Liang Lu
【Abstract】: With the extensive application of submodularity, its generalizations are constantly being proposed. However, most of them are tailored for special problems. In this paper, we focus on quasi-submodularity, a universal generalization, which satisfies weaker properties than submodularity but still enjoys favorable performance in optimization. Analogous to the diminishing return property of submodularity, we first define a corresponding property called single sub-crossing, and then propose two algorithms for unconstrained quasi-submodular function minimization and maximization, respectively. The proposed algorithms return the reduced lattices in O(n) iterations, and guarantee that the objective function values are strictly monotonically increased or decreased after each iteration. Moreover, any local and global optima are definitely contained in the reduced lattices. Experimental results verify the effectiveness and efficiency of the proposed algorithms on lattice reduction.
【Keywords】:
【Paper Link】 【Pages】:1198-1204
【Authors】: Seyed Hamid Mirisaee ; Éric Gaussier ; Alexandre Termier
【Abstract】: Rank-K Binary Matrix Factorization (BMF) approximates a binary matrix by the product of two binary matrices of lower rank, K, using either the L1 or L2 norm. In this paper, we first show that BMF with the L2 norm can be reformulated as an Unconstrained Binary Quadratic Programming (UBQP) problem. We then review several local search strategies that can be used to improve the BMF solutions obtained by previously proposed methods, before introducing a new local search dedicated to the BMF problem. We show in particular that the proposed solution is in general faster than the previously proposed ones. We then assess its behavior on several collections and methods and show that it significantly improves methods targeting the L2 norm on all the datasets considered; for the L1 norm, the improvement is also significant for real, structured datasets and for the BMF problem without the binary reconstruction constraint.
【Keywords】: Matrix factorization; Local search; heuristics
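【Code Sketch】: A minimal numpy sketch of local search for rank-K Boolean matrix factorization by greedy single-bit flips; the naive full re-evaluation after each candidate flip is a simplification (making such moves fast is precisely what dedicated local searches address).
import numpy as np

def bmf_local_search(X, K, iters=200, seed=0):
    # flip the single entry of U or V that most reduces the reconstruction
    # error ||X - (U o V)||, where (U o V) = (U @ V >= 1) is the Boolean product
    rng = np.random.default_rng(seed)
    n, m = X.shape
    U = rng.integers(0, 2, (n, K))
    V = rng.integers(0, 2, (K, m))

    def err(U, V):
        return int(np.sum(X != (U @ V >= 1)))

    current = err(U, V)
    for _ in range(iters):
        best = (0, None)
        for M in (U, V):
            for idx in np.ndindex(M.shape):
                M[idx] ^= 1                        # try flipping one bit
                delta = err(U, V) - current
                M[idx] ^= 1                        # undo
                if delta < best[0]:
                    best = (delta, (M, idx))
        if best[1] is None:
            break                                  # local optimum reached
        M, idx = best[1]
        M[idx] ^= 1
        current += best[0]
    return U, V, current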
【Paper Link】 【Pages】:1205-1211
【Authors】: Hossein Mobahi ; John W. Fisher III
【Abstract】: Optimization via continuation method is a widely used approach for solving nonconvex minimization problems. While this method generally does not provide a global minimum, empirically it often achieves a superior local minimum compared to alternative approaches such as gradient descent. However, theoretical analysis of this method is largely unavailable. Here, we provide a theoretical analysis that bounds the endpoint solution of the continuation method. The derived bound depends on a problem-specific characteristic that we refer to as optimization complexity. We show that this characteristic can be analytically computed when the objective function is expressed in some suitable basis functions. Our analysis combines elements of scale-space theory, regularization and differential equations.
【Keywords】: nonconvex optimization; continuation method
【Paper Link】 【Pages】:1212-1218
【Authors】: Danny Munera ; Daniel Diaz ; Salvador Abreu ; Francesca Rossi ; Vijay A. Saraswat ; Philippe Codognet
【Abstract】: Stable matching problems have several practical applications. If preference lists are truncated and contain ties, finding a stable matching with maximal size is computationally difficult. We address this problem using a local search technique, based on Adaptive Search and present experimental evidence that this approach is much more efficient than state-of-the-art exact and approximate methods. Moreover, parallel versions (particularly versions with communication) improve performance so much that very large and hard instances can be solved quickly.
【Keywords】: Stable Matching; Parallel Local Search; Cooperative Multi-Walk
【Paper Link】 【Pages】:1219-1225
【Authors】: Masaaki Nishino ; Norihito Yasuda ; Shin-ichi Minato ; Masaaki Nagata
【Abstract】: Dynamic programming (DP) is a fundamental tool used to obtain exact, optimal solutions for many combinatorial optimization problems. Among these problems, important ones including the knapsack problems and the computation of edit distances between string pairs can be solved with a kind of DP that corresponds to solving the shortest path problem on a directed acyclic graph (DAG). These problems can be solved efficiently with DP; however, in practical situations, we want to solve customized problems made by adding logical constraints to the original ones. Developing an algorithm specifically for each combination of a problem and a constraint set is unrealistic. The proposed method, BDD-Constrained Search (BCS), exploits a Binary Decision Diagram (BDD) that represents the logical constraints in combination with the DAG that represents the problem. BCS runs DP on the DAG while using the BDD to check the equivalence and validity of intermediate solutions, to efficiently solve the problem. An important feature of BCS is that it can be applied to problems with various types of logical constraints in a unified way once we represent the constraints as a BDD. We give a theoretical analysis of the time complexity of BCS and also conduct experiments to compare its performance to that of a state-of-the-art integer linear programming solver.
【Keywords】:
【Paper Link】 【Pages】:1226-1232
【Authors】: Shunji Umetani
【Abstract】: We present a data mining approach for reducing the search space of local search algorithms in large-scale set partitioning problems (SPPs). We construct a k-nearest neighbor graph by extracting variable associations from the instance to be solved, in order to identify promising pairs of flipping variables in the large neighborhood search. We incorporate the search space reduction technique into a 2-flip neighborhood local search algorithm with an efficient incremental evaluation of solutions and an adaptive control of penalty weights. We also develop a 4-flip neighborhood local search algorithm that flips four variables alternately along 4-paths or 4-cycles in the k-nearest neighbor graph. According to computational comparison with the latest solvers, our algorithm performs effectively for large-scale SPP instances with up to 2.57 million variables.
【Keywords】: local search; set partitioning problem
【Paper Link】 【Pages】:1233-1240
【Authors】: Emre Yamangil ; Russell Bent ; Scott Backhaus
【Abstract】: Modern society is critically dependent on the services provided by engineered infrastructure networks. When natural disasters (e.g. Hurricane Sandy) occur, the ability of these networks to provide service is often degraded because of physical damage to network components. One of the most critical of these networks is the electrical distribution grid, with medium voltage circuits often suffering the most severe damage. However, well-placed upgrades to these distribution grids can greatly improve post-event network performance. We formulate an optimal electrical distribution grid design problem as a two-stage, stochastic mixed-integer program with damage scenarios from natural disasters modeled as a set of stochastic events. We develop and investigate the tractability of an exact and several heuristic algorithms based on decompositions that are hybrids of techniques developed by the AI and operations research communities. We provide computational evidence that these algorithms have significant benefits when compared with commercial, mixed-integer programming software.
【Keywords】:
【Paper Link】 【Pages】:1241-1247
【Authors】: Yeqin Zhang ; Martin Müller
【Abstract】: Temperature Discovery Search (TDS) is a forward search method for computing or approximating the temperature of a combinatorial game. Temperature and mean are important concepts in combinatorial game theory, which can be used to develop efficient algorithms for playing well in a sum of subgames. A new algorithm TDS+ with five enhancements of TDS is developed, which greatly speeds up both exact and approximate versions of TDS. Means and temperatures can be computed faster, and fixed-time approximations which are important for practical play can be computed with higher accuracy than before.
【Keywords】: Temperature Discovery Search; TDS+; combinatorial game theory; sum games; game tree search; Amazons
【Paper Link】 【Pages】:1248-1255
【Authors】: Yichao Zhou ; Jianyang Zeng
【Abstract】: A* search is a fundamental topic in artificial intelligence. Recently, general purpose computation on graphics processing units (GPGPU) has been widely used to accelerate numerous computational tasks. In this paper, we propose the first parallel variant of the A* search algorithm such that the search process of an agent can be accelerated by a single GPU processor in a massively parallel fashion. Our experiments demonstrate that the GPU-accelerated A* search is efficient in solving multiple real-world search tasks, including combinatorial optimization problems, pathfinding and game solving. Compared to the traditional sequential CPU-based A* implementation, our GPU-based A* algorithm can achieve a significant speedup of up to 45x on large-scale search problems.
【Keywords】: A* Search; GPGPU; Massive Parallel
【Paper Link】 【Pages】:1256-1262
【Authors】: Praphul Chandra ; Yadati Narahari ; Debmalya Mandal ; Prasenjit Dey
【Abstract】: Motivated by current day crowdsourcing platforms and emergence of online labor markets, this work addresses the problem of task allocation and payment decisions when unreliable and strategic workers arrive over time to work on tasks which must be completed within a deadline. We consider the following scenario: a requester has a set of tasks that must be completed before a deadline; agents (aka crowd workers) arrive over time and it is required to make sequential decisions regarding task allocation and pricing. Agents may have different costs for providing service and these costs are private information of the agents. We assume that agents are not strategic about their arrival times but could be strategic about their costs of service. In addition, agents could be unreliable in the sense of not being able to complete the assigned tasks within the allocated time; these tasks must then be reallocated to other agents to ensure on-time completion of the set of tasks by the deadline. For this setting, we propose two mechanisms: DPM (Dynamic Price Mechanism) and ABM (Auction Based Mechanism). Both mechanisms are dominant-strategy incentive compatible, budget feasible, and also satisfy ex-post individual rationality for agents who complete the allocated tasks. These mechanisms can be implemented in current day crowdsourcing platforms with minimal changes to the current interaction model.
【Keywords】: Crowdsourcing, Mechanism Design, Auctions, Reliability, Online
【Paper Link】 【Pages】:1263-1269
【Authors】: Preethi Jyothi ; Mark Hasegawa-Johnson
【Abstract】: Transcribed speech is a critical resource for building statistical speech recognition systems. Recent work has looked towards soliciting transcriptions for large speech corpora from native speakers of the language using crowdsourcing techniques. However, native speakers of the target language may not be readily available for crowdsourcing. We examine the following question: can humans unfamiliar with the target language help transcribe? We follow an information-theoretic approach to this problem: (1) We learn the characteristics of a noisy channel that models the transcribers' systematic perception biases. (2) We use an error-correcting code, specifically a repetition code, to encode the inputs to this channel, in conjunction with a maximum-likelihood decoding rule. To demonstrate the feasibility of this approach, we transcribe isolated Hindi words with the help of Mechanical Turk workers unfamiliar with Hindi. We successfully recover Hindi words with an accuracy of over 85% (and 94% in a 4-best list) using a 15-fold repetition code. We also estimate the conditional entropy of the input to this channel (Hindi words) given the channel output (transcripts from crowdsourced workers) to be less than 2 bits; this serves as a theoretical estimate of the average number of bits of auxiliary information required for errorless recovery.
【Keywords】: Crowdsourcing; Repeated labeling; Speech transcription; Noisy-channel models
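【Code Sketch】: A minimal Python sketch of the decoding side of the approach: maximum-likelihood recovery of a word from repeated noisy transcripts under a learned per-character confusion model; the character-by-character alignment and the lexicon are simplifying assumptions of this sketch.
import math

def decode(transcripts, lexicon, channel):
    # channel[(observed, true)]: per-character confusion probability learned
    # from calibration data; each transcript is one use of the noisy channel,
    # and repetition makes this a repetition code over the channel
    def loglik(word, t):
        if len(t) != len(word):                    # sketch assumes aligned lengths
            return -math.inf
        return sum(math.log(channel.get((o, c), 1e-9))
                   for o, c in zip(t, word))
    return max(lexicon, key=lambda w: sum(loglik(w, t) for t in transcripts))
Because the transcripts are modeled as independent channel uses, their log-likelihoods add, which is why accuracy improves with the repetition factor.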
【Paper Link】 【Pages】:1270-1276
【Authors】: Yuezhou Lv ; Thomas Moscibroda
【Abstract】: In a basic economic system, each participant receives a (financial) reward according to his own contribution to the system. In this work, we study an alternative approach — Incentive Networks — in which a participant's reward depends not only on his own contribution, but also in part on the contributions made by his social contacts or friends. We show that the key parameter affecting the efficiency of such an Incentive Network-based economic system is the participant's degree of directed altruism: the extent to which someone is willing to work if his work results in a payment to his friend, rather than to himself. Specifically, we characterize the condition under which an Incentive Network-based economy is more efficient than the basic "pay-for-your-contribution" economy. We quantify by how much incentive networks can reduce the total reward that needs to be paid to the participants in order to achieve a certain overall contribution. Finally, we study the impact of the network topology and various exogenous parameters on the efficiency of incentive networks. Our results suggest that in many practical settings, Incentive Network-based reward systems or compensation structures could be more efficient than the ubiquitous "pay-for-your-contribution" schemes.
【Keywords】: Reward; Network; Mechanism Design; Incentive; Crowdsourcing
【Paper Link】 【Pages】:1277-1283
【Authors】: Diego Noble ; Marcelo O. R. Prates ; Daniel Bossle ; Luís C. Lamb
【Abstract】: Recent studies have suggested that current agent-based models are not sufficiently sophisticated to reproduce results achieved by human collaborative learning and reasoning. Such studies suggest that humans are diverse and dynamic when solving problems socially. However, despite their relevance to problem-solving, these two behavioral features have not yet been fully investigated. In this paper we analyse a recent social problem-solving model and attempt to address its shortcomings. Specifically, we investigate the effects of separating exploitation from exploration in agent behaviors and explore the concept of diversity in such models. We found that diverse populations outperform homogeneous ones in both efficient and inefficient networks. Finally, we show that agent diversity is more relevant than the strategic behavioral dynamics. This work contributes towards understanding the role of diverse and dynamic behaviors in social problem-solving as well as the advancement of state-of-the-art social problem-solving models.
【Keywords】:
【Paper Link】 【Pages】:1284-1290
【Authors】: David Sanchez-Charles ; Victor Muntés-Mulero ; Marc Solé ; Jordi Nin
【Abstract】: Although crowdsourcing has been proven efficient as a mechanism to solve independent tasks for on-line production, it is still unclear how to define and manage workflows in complex tasks that require the participation and coordination of different workers. Despite the existence of different frameworks to define workflows, we still lack a commonly accepted solution that is able to describe the most common workflows in current and future platforms. In this paper, we propose CrowdWON, a new graphical framework to describe and monitor crowd processes. The proposed language is able to represent the workflow of most well-known existing applications, extends previous modelling frameworks, and can assist in the future generation of crowdsourcing platforms. Beyond previous proposals, CrowdWON allows for the formal definition of adaptive workflows that depend on the skills of the crowd workers and/or process deadlines. CrowdWON also allows expressing constraints on workers based on previous individual contributions. Finally, we show how our proposal can be used to describe well-known crowdsourcing workflows.
【Keywords】: crowdsourcing; modelling language; process
【Paper Link】 【Pages】:1291-1297
【Authors】: Nihar B. Shah ; Dengyong Zhou
【Abstract】: Human computation or crowdsourcing involves joint inference of the ground-truth-answers and the worker-abilities by optimizing an objective function, for instance, by maximizing the data likelihood based on an assumed underlying model. A variety of methods have been proposed in the literature to address this inference problem. As far as we know, none of the objective functions in existing methods is convex. In machine learning and applied statistics, a convex function such as the objective function of support vector machines (SVMs) is generally preferred, since it can leverage the high-performance algorithms and rigorous guarantees established in the extensive literature on convex optimization. One may thus wonder if there exists a meaningful convex objective function for the inference problem in human computation. In this paper, we investigate this convexity issue for human computation. We take an axiomatic approach by formulating a set of axioms that impose two mild and natural assumptions on the objective function for the inference. Under these axioms, we show that it is unfortunately impossible to ensure convexity of the inference problem. On the other hand, we show that interestingly, in the absence of a requirement to model "spammers", one can construct reasonable objective functions for crowdsourcing that guarantee convex inference.
【Keywords】: crowdsourcing;inference;convex;axiomatic
【Paper Link】 【Pages】:1298-1304
【Authors】: Long Tran-Thanh ; Trung Dong Huynh ; Avi Rosenfeld ; Sarvapali D. Ramchurn ; Nicholas R. Jennings
【Abstract】: We consider the problem of task allocation in crowdsourcing systems with multiple complex workflows, each of which consists of a set of inter-dependent micro-tasks. We propose Budgeteer, an algorithm to solve this problem under a budget constraint. In particular, our algorithm first calculates an efficient way to allocate budget to each workflow. It then determines the number of inter-dependent micro-tasks and the price to pay for each task within each workflow, given the corresponding budget constraints. We empirically evaluate it on a well-known crowdsourcing-based text correction workflow using Amazon Mechanical Turk, and show that Budgeteer can achieve similar levels of accuracy to current benchmarks, but is on average 45% cheaper.
【Keywords】: crowdsourcing; budget constraints; complex workflow; task allocation
【Paper Link】 【Pages】:1305-1312
【Authors】: Han Yu ; Chunyan Miao ; Zhiqi Shen ; Cyril Leung ; Yiqiang Chen ; Qiang Yang
【Abstract】: Reputation-based approaches allow a crowdsourcing system to identify reliable workers to whom tasks can be delegated. In crowdsourcing systems that can be modeled as multi-agent trust networks consisting of resource-constrained trustee agents (i.e., workers), workers may need to further sub-delegate tasks to others if they determine that they cannot complete all pending tasks before the stipulated deadlines. Existing reputation-based decision-making models cannot help workers decide when and to whom to sub-delegate tasks. In this paper, we propose a reputation-aware task sub-delegation (RTS) approach to bridge this gap. By jointly considering a worker's reputation, workload, the price of its effort and its trust relationships with others, RTS can be implemented as an intelligent agent that helps workers make sub-delegation decisions in a distributed manner. The resulting task allocation maximizes social welfare through efficient utilization of the collective capacity of a crowd, and provides provable performance guarantees. Experimental comparisons with state-of-the-art approaches based on the Epinions trust network demonstrate significant advantages of RTS under high workload conditions.
【Keywords】: Crowdsourcing; Human Computation; Reputation; Trust Networks; Task Sub-delegation
【Paper Link】 【Pages】:1313-1319
【Authors】: Avshalom Elmalech ; David Sarne ; Avi Rosenfeld ; Eden Shalom Erez
【Abstract】: This paper represents a paradigm shift in what advice agents should provide people. Contrary to what was previously thought, we empirically show that agents that dispense optimal advice will not necessarily facilitate the best improvement in people's strategies. Instead, we claim that agents should at times give suboptimal advice. We provide results demonstrating the effectiveness of a suboptimal advising approach in extensive experiments in two canonical mixed agent-human advice-giving domains. Our proposed guideline for suboptimal advising is to rely on the level of intuitiveness of the optimal advice as a measure of how far the suboptimal advice presented to the user should drift from the optimal value.
【Keywords】: advice provisioning; advice to people; suboptimal advising
【Paper Link】 【Pages】:1320-1327
【Authors】: Ariel Rosenfeld ; Sarit Kraus
【Abstract】: Argumentative discussion is a highly demanding task. In order to help people in such situations, this paper provides an innovative methodology for developing an agent that can support people in argumentative discussions by proposing possible arguments to them. By analyzing more than 130 human discussions and 140 questionnaires answered by people, we show that the well-established Argumentation Theory is not a good predictor of people's choice of arguments. We then present a model that has 76% accuracy in predicting people's top three argument choices given a partial deliberation. We present the Predictive and Relevance based Heuristic agent (PRH), which uses this model together with a heuristic that estimates the relevance of possible arguments to the last argument given, in order to propose possible arguments. Through extensive studies with over 200 human subjects, we show that people's satisfaction with the PRH agent is significantly higher than with agents that propose arguments based on Argumentation Theory, on the predictive model alone, or on the heuristic alone. People also use the PRH agent's proposed arguments significantly more often than those proposed by the other agents.
【Keywords】: Human-Computer Interaction;Argumentation Theory;Human Argumentation
【Paper Link】 【Pages】:1328-1335
【Authors】: Biqiao Zhang ; Emily Mower Provost ; Robert Swedberg ; Georg Essl
【Abstract】: Emotion affects our understanding of the opinions and sentiments of others. Research has demonstrated that humans are able to recognize emotions in various domains, including speech and music, and that there are potential shared features that shape the emotion in both domains. In this paper, we investigate acoustic and visual features that are relevant to emotion perception in the domains of singing and speaking. We train regression models using two paradigms: (1) within-domain, in which models are trained and tested on the same domain and (2) cross-domain, in which models are trained on one domain and tested on the other domain. This strategy allows us to analyze the similarities and differences underlying the relationship between audio-visual feature expression and emotion perception and how this relationship is affected by domain of expression. We use kernel density estimation to model emotion as a probability distribution over the perception associated with multiple evaluators on the valence-activation space. This allows us to model the variation inherent in the reported perception. Results suggest that activation can be modeled more accurately across domains, compared to valence. Furthermore, visual features capture cross-domain emotion more accurately than acoustic features. The results provide additional evidence for a shared mechanism underlying spoken and sung emotion perception.
【Keywords】: Emotion Perception; Emotion Modeling; Singing; Speaking
【Paper Link】 【Pages】:1336-1342
【Authors】: Abdeslam Boularias ; James Andrew Bagnell ; Anthony Stentz
【Abstract】: We present a fully autonomous robotic system for grasping objects in dense clutter. The objects are unknown and have arbitrary shapes. Therefore, we cannot rely on prior models. Instead, the robot learns online, from scratch, to manipulate the objects by trial and error. Grasping objects in clutter is significantly harder than grasping isolated objects, because the robot needs to push and move objects around in order to create sufficient space for the fingers. These pre-grasping actions do not have an immediate utility, and may result in unnecessary delays. The utility of a pre-grasping action can be measured only by looking at the complete chain of consecutive actions and effects. This is a sequential decision-making problem that can be cast in the reinforcement learning framework. We solve this problem by learning the stochastic transitions between the observed states, using nonparametric density estimation. The learned transition function is used only for re-calculating the values of the executed actions in the observed states, with different policies. Values of new state-actions are obtained by regressing the values of the executed actions. The state of the system at a given time is a depth (3D) image of the scene. We use spectral clustering for detecting the different objects in the image. The performance of our system is assessed on a robot with real-world objects.
【Keywords】: Robotics; Reinforcement Learning; Grasping; Manipulation
【Paper Link】 【Pages】:1343-1349
【Authors】: Goren Gordon ; Cynthia Breazeal
【Abstract】: Effective tutoring requires personalization of the interaction to each student. Continuous and efficient assessment of the student's skills is a prerequisite for such personalization. We developed a Bayesian active-learning algorithm that continuously and efficiently assesses a child's word-reading skills and implemented it in a social robot. We then developed an integrated experimental paradigm in which a child plays a novel story-creation tablet game with the robot. The robot is portrayed as a younger peer who wishes to learn to read, framing the assessment of the child's word-reading skills as well as empowering the child. We show that our algorithm results in an accurate representation of the child's word-reading skills for a large age range, 4-8 year old children, and a large range of initial reading skills. We also show that employing child-specific, assessment-based tutoring results in learning that is independent of age and initial reading skill, compared to random tutoring. Finally, our integrated system enables us to show that implementing the same learning algorithm on the robot's reading skills results in knowledge that is comparable to what the child thinks the robot has learned. The child's perception of the robot's knowledge is age-dependent and may facilitate an indirect assessment of the development of theory-of-mind.
【Keywords】: assessment;expected information gain;personalization
【Paper Link】 【Pages】:1350-1356
【Authors】: GeunSik Jo ; Kee-Sung Lee ; Devy Chandra ; Chol-Hee Jang ; Myung-Hyun Ga
【Abstract】: A homography matrix is used in the computer vision field to solve the correspondence problem between a pair of stereo images. The RANSAC algorithm is often used to calculate the homography matrix by iteratively selecting random sets of features. The CS-RANSAC algorithm proposed in this paper converts RANSAC into a two-layer algorithm. The first layer addresses the sampling problem: knowledge about degenerate features is described by means of a Constraint Satisfaction Problem (CSP). By dividing the input image into an N × N grid and treating feature points as discrete domains, we can model the image as a CSP and efficiently filter out degenerate feature samples in the first layer, so that the computation of the homography matrix in the model estimation step of the second layer can be skipped for such samples. The experimental results show that the proposed CS-RANSAC algorithm outperforms most variants of RANSAC without sacrificing execution time.
【Keywords】: RANSAC; Constraint Satisfaction Problems; CS-RANSAC
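To make the two-layer idea concrete, the following sketch filters random 4-point samples through simple grid-based constraints before any homography is estimated. The grid size, the distinct-cell rule, and the collinearity test are illustrative assumptions, not the paper's exact CSP encoding:

    import numpy as np

    def grid_cell(pt, img_w, img_h, n=8):
        # Map an (x, y) feature point to its cell in an n x n grid.
        return (int(pt[0] * n / img_w), int(pt[1] * n / img_h))

    def satisfies_constraints(sample, img_w, img_h, n=8):
        # First layer: reject degenerate 4-point samples. Keep a sample
        # only if its points occupy distinct grid cells and no three of
        # them are (nearly) collinear.
        if len({grid_cell(p, img_w, img_h, n) for p in sample}) < 4:
            return False
        pts = np.asarray(sample, dtype=float)
        for i in range(4):
            a, b, c = np.delete(pts, i, axis=0)
            area2 = abs((b[0] - a[0]) * (c[1] - a[1])
                        - (b[1] - a[1]) * (c[0] - a[0]))
            if area2 < 1e-3:           # near-collinear triple
                return False
        return True

    def cs_ransac_samples(points, img_w, img_h, iters=1000, seed=0):
        # Yield only constraint-satisfying samples; the second layer
        # (homography estimation and inlier counting) runs on these alone.
        rng = np.random.default_rng(seed)
        for _ in range(iters):
            idx = rng.choice(len(points), size=4, replace=False)
            sample = [points[i] for i in idx]
            if satisfies_constraints(sample, img_w, img_h):
                yield sample

The point of the design is that the cheap discrete checks run before the expensive model estimation, which is where the reported speed advantage over plain RANSAC comes from.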
【Paper Link】 【Pages】:1357-1363
【Authors】: Jaume Jordán ; Eva Onaindia
【Abstract】: When two or more self-interested agents put their plans into execution in the same environment, conflicts may arise as a consequence, for instance, of shared use of resources. In this case, an agent can postpone the execution of a particular action, if this resolves the conflict, or it can resort to executing a different plan if the agent's payoff significantly diminishes due to the action deferral. In this paper, we present a game-theoretic approach to non-cooperative planning that helps predict, before execution, which plan schedules agents will adopt so that the set of strategies of all agents constitutes a Nash equilibrium. We perform some experiments and discuss the solutions obtained with our game-theoretic approach, analyzing how the conflicts between the plans determine the strategic behavior of the agents.
【Keywords】: game-theory; multi-agent systems; planning
【Paper Link】 【Pages】:1364-1370
【Authors】: KinMing Kam ; Shouyi Wang ; Stephen R. Bowen ; Wanpracha Art Chaovalitwongse
【Abstract】: Motion-adaptive radiotherapy techniques are promising to deliver truly ablative radiation doses to tumors with minimal normal tissue exposure by accounting for real-time tumor movement. However, a major challenge for successful application of these techniques is the real-time prediction of breathing-induced tumor motion to accommodate system delivery latencies. Predicting respiratory motion in real-time is challenging. The current respiratory motion prediction approaches are still not satisfactory in terms of accuracy and interpretability due to the complexity of breathing patterns and the high inter-individual variability across patients. In this paper, we propose a novel respiratory motion prediction framework which integrates four key components: a personalized monitoring window generator, an orthogonal polynomial approximation-based pattern library builder, a variant best neighbor pattern searcher, and a statistical prediction decision maker. The four functional components work together as a real-time prediction system that is capable of performing personalized tumor position prediction during radiotherapy. Based on a study of the respiratory motion of 27 patients with lung cancer, the proposed approach consistently generated better prediction performance than current respiratory motion prediction approaches, particularly for long prediction horizons.
【Keywords】:
【Paper Link】 【Pages】:1371-1379
【Authors】: Jean H. Oh ; Arne Suppé ; Felix Duvallet ; Abdeslam Boularias ; Luis E. Navarro-Serment ; Martial Hebert ; Anthony Stentz ; Jerry Vinokurov ; Oscar J. Romero ; Christian Lebiere ; Robert Dean
【Abstract】: Robots are increasingly becoming key players in human-robot teams. To become effective teammates, robots must possess profound understanding of an environment, be able to reason about the desired commands and goals within a specific context, and be able to communicate with human teammates in a clear and natural way. To address these challenges, we have developed an intelligence architecture that combines cognitive components to carry out high-level cognitive tasks, semantic perception to label regions in the world, and a natural language component to reason about the command and its relationship to the objects in the world. This paper describes recent developments using this architecture on a fielded mobile robot platform operating in unknown urban environments. We report a summary of extensive outdoor experiments; the results suggest that a multidisciplinary approach to robotics has the potential to create competent human-robot teams.
【Keywords】: semantic perception; symbolic navigation; prediction
【Paper Link】 【Pages】:1380-1386
【Authors】: Carlotta Schatten ; Ruth Janning ; Lars Schmidt-Thieme
【Abstract】: Correct evaluation of Machine Learning based sequencers requires large data availability, large scale experiments, and consideration of different evaluation measures. Such constraints make the construction of ad-hoc Intelligent Tutoring Systems (ITS) unfeasible and impose early integration into an already existing ITS, which possesses a large number of tasks to be sequenced. However, such systems were not designed to be combined with Machine Learning methods and require several adjustments. As a consequence, more than half of the components based on recommender technology are never evaluated with an online experiment. In this paper we show how we adapted a Matrix Factorization based performance predictor and a score based policy for task sequencing to be integrated in a commercial ITS with over 2000 tasks on 20 topics. We evaluated the experiment from different perspectives in comparison with the ITS sequencer designed by experts over the years. As a result we achieve the same post-test results and outperform the current sequencer in the perceived experience questionnaire with almost no curriculum authoring effort. We also show that our sequencer possesses better user modeling, adapting better to the knowledge acquisition rate of the students.
【Keywords】: Machine Learning;Sequencing; Matrix Factorization; Vygotsky;ITS;
【Paper Link】 【Pages】:1387-1393
【Authors】: Tom Williams ; Gordon Briggs ; Bradley Oosterveld ; Matthias Scheutz
【Abstract】: The ultimate goal of human natural language interaction is to communicate intentions. However, these intentions are often not directly derivable from the semantics of an utterance (e.g., when linguistic modulations are employed to convey politeness, respect, and social standing). Robotic architectures with simple command-based natural language capabilities are thus not equipped to handle more liberal, yet natural uses of linguistic communicative exchanges. In this paper, we propose novel mechanisms for inferring intentions from utterances and generating clarification requests that will allow robots to cope with a much wider range of task-based natural language interactions. We demonstrate the potential of these inference algorithms for natural human-robot interactions by running them as part of an integrated cognitive robotic architecture on a mobile robot in a dialogue-based instruction task.
【Keywords】: speech act theory; intention understanding; human-robot interaction; integrated robot architectures; Dempster-Shafer theory
【Paper Link】 【Pages】:1394-1400
【Authors】: Shiqi Zhang ; Peter Stone
【Abstract】: In order to be fully robust and responsive to a dynamically changing real-world environment, intelligent robots will need to engage in a variety of simultaneous reasoning modalities. In particular, in this paper we consider their needs to i) reason with commonsense knowledge, ii) model their nondeterministic action outcomes and partial observability, and iii) plan toward maximizing long-term rewards. On one hand, Answer Set Programming (ASP) is good at representing and reasoning with commonsense and default knowledge, but is ill-equipped to plan under probabilistic uncertainty. On the other hand, Partially Observable Markov Decision Processes (POMDPs) are strong at planning under uncertainty toward maximizing long-term rewards, but are not designed to incorporate commonsense knowledge and inference. This paper introduces the CORPP algorithm which combines P-log, a probabilistic extension of ASP, with POMDPs to integrate commonsense reasoning with planning under uncertainty. Our approach is fully implemented and tested on a shopping request identification problem both in simulation and on a real robot. Compared with existing approaches using P-log or POMDPs individually, we observe significant improvements in both efficiency and accuracy.
【Keywords】:
【Paper Link】 【Pages】:1401-1409
【Authors】: Dawei Zhou ; Jiebo Luo ; Vincent M. B. Silenzio ; Yun Zhou ; Jile Hu ; Glenn Currier ; Henry A. Kautz
【Abstract】: Mental illness is becoming a major plague in modern societies and poses challenges to the capacity of current public health systems worldwide. With the widespread adoption of social media and mobile devices, and rapid advances in artificial intelligence, a unique opportunity arises for tackling mental health problems. In this study, we investigate how users’ online social activities and physiological signals detected through ubiquitous sensors can be utilized in realistic scenarios for monitoring their mental health states. First, we extract a suite of multimodal time-series signals using modern computer vision and signal processing techniques, from recruited participants while they are immersed in online social media that elicit emotions and emotion transitions. Next, we use machine learning techniques to build a model that establishes the connection between mental states and the extracted multimodal signals. Finally, we validate the effectiveness of our approach using two groups of recruited subjects.
【Keywords】: mental health, affect signals, multimodal analysis, social media, computer vision, machine learning
【Paper Link】 【Pages】:1410-1416
【Authors】: Ana Armas Romero ; Mark Kaminski ; Bernardo Cuenca Grau ; Ian Horrocks
【Abstract】: Module extraction — the task of computing a (preferably small) fragment M of an ontology T that preserves entailments over a signature S — has found many applications in recent years. Extracting modules of minimal size is, however, computationally hard, and often algorithmically infeasible. Thus, practical techniques are based on approximations, where M provably captures the relevant entailments, but is not guaranteed to be minimal. Existing approximations, however, ensure that M preserves all second-order entailments of T w.r.t. S, which is stronger than is required in many applications, and may lead to large modules in practice. In this paper we propose a novel approach in which module extraction is reduced to a reasoning problem in datalog. Our approach not only generalises existing approximations in an elegant way, but it can also be tailored to preserve only specific kinds of entailments, which allows us to extract significantly smaller modules. An evaluation on widely-used ontologies has shown very encouraging results.
【Keywords】: ontologies; module extraction; datalog reasoning
【Paper Link】 【Pages】:1417-1423
【Authors】: Alessandro Artale ; Roman Kontchakov ; Vladislav Ryzhikov ; Michael Zakharyaschev
【Abstract】: We design a tractable Horn fragment of the Halpern-Shoham temporal logic and extend it to interval-based temporal description logics, instance checking in which is P-complete for both combined and data complexity.
【Keywords】: interval temporal logic; temporal description logic
【Paper Link】 【Pages】:1424-1430
【Authors】: Joseph Babb ; Joohyung Lee
【Abstract】: Action languages are formal models of parts of natural language that are designed to describe effects of actions. Many of these languages can be viewed as high level notations of answer set programs structured to represent transition systems. However, the form of answer set programs considered in the earlier work is quite limited in comparison with the modern Answer Set Programming (ASP) language, which allows several useful constructs for knowledge representation, such as choice rules, aggregates, and abstract constraint atoms. We propose a new action language called BC+, which closes the gap between action languages and the modern ASP language. Language BC+ is defined as a high level notation of propositional formulas under the stable model semantics. Due to the generality of the underlying language, BC+ is expressive enough to encompass many modern ASP language constructs and the best features of several other action languages, such as B, C, C+ and BC. Computational methods available in ASP solvers are readily applicable to compute BC+, which led us to implement the language by extending system Cplus2ASP.
【Keywords】:
【Paper Link】 【Pages】:1431-1438
【Authors】: Harald Beck ; Minh Dao-Tran ; Thomas Eiter ; Michael Fink
【Abstract】: The recent rise of smart applications has drawn interest to logical reasoning over data streams. Different query languages and stream processing/reasoning engines were proposed. However, due to a lack of theoretical foundations, the expressivity and semantics of these diverse approaches were only informally discussed. Towards clear specifications and means for analytic study, a formal framework is needed to characterize their semantics in precise terms. We present LARS, a Logic-based framework for Analyzing Reasoning over Streams, i.e., a rule-based formalism with a novel window operator providing a flexible mechanism to represent views on streaming data. We establish complexity results for central reasoning tasks and show how the prominent Continuous Query Language (CQL) can be captured. Moreover, the relation between LARS and ETALIS, a system for complex event processing, is discussed. We thus demonstrate the capability of LARS to serve as the desired formal foundation for expressing and analyzing different semantic approaches to stream processing/reasoning and engines.
【Keywords】: Knowledge Representation and Reasoning; Answer Set Programming; Stream Reasoning; Nonmonotonic Reasoning
【Paper Link】 【Pages】:1439-1445
【Authors】: Sebastian Binnewies ; Zhiqiang Zhuang ; Kewen Wang
【Abstract】: The recent years have seen several proposals aimed at placing the revision of logic programs within the belief change frameworks established for classical logic. A crucial challenge of this task lies in the nonmonotonicity of standard logic programming semantics. Existing approaches have thus used the monotonic characterisation via SE-models to develop semantic revision operators, which however neglect any syntactic information, or reverted to a syntax-oriented belief base approach altogether. In this paper, we bridge the gap between semantic and syntactic techniques by adapting the idea of a partial meet construction from classical belief change. This type of construction allows us to define new model-based operators for revising as well as contracting logic programs that preserve the syntactic structure of the programs involved. We demonstrate the rationality of our operators by testing them against the classic AGM or alternative belief change postulates adapted to the logic programming setting. We further present an algorithm that reduces the partial meet revision or contraction of a logic program to performing revision or contraction only on the relevant subsets of that program.
【Keywords】:
【Paper Link】 【Pages】:1446-1452
【Authors】: Alexander Bochman ; Vladimir Lifschitz
【Abstract】: We provide a logical representation of Pearl's structural causal models in the causal calculus of McCain and Turner (1997) and its first-order generalization by Lifschitz. It will be shown that, under this representation, the nonmonotonic semantics of the causal calculus describes precisely the solutions of the structural equations (the causal worlds of the causal model), while the causal logic from Bochman (2004) is adequate for describing the behavior of causal models under interventions (forming submodels).
【Keywords】: causation; theories of action and change
【Paper Link】 【Pages】:1453-1459
【Authors】: Bart Bogaerts ; Joost Vennekens ; Marc Denecker
【Abstract】: Algebraical fixpoint theory is an invaluable instrument for studying semantics of logics. For example, all major semantics of logic programming, autoepistemic logic, default logic and, more recently, abstract argumentation have been shown to be induced by the different types of fixpoints defined in approximation fixpoint theory (AFT). In this paper, we add a new type of fixpoint to AFT: a grounded fixpoint of a lattice operator O : L → L is defined as a lattice element x ∈ L such that O(x) = x and, for all v ∈ L such that O(v ∧ x) ≤ v, it holds that x ≤ v. On the algebraical level, we show that all grounded fixpoints are minimal fixpoints approximated by the well-founded fixpoint and that all stable fixpoints are grounded. On the logical level, grounded fixpoints provide a new, mathematically simple and compact type of semantics for any logic with a (possibly non-monotone) semantic operator. We explain the intuition underlying this semantics in the context of logic programming by pointing out that grounded fixpoints of the immediate consequence operator are interpretations that have no non-trivial unfounded sets. We also analyse the complexity of the induced semantics. In summary, grounded fixpoint semantics is a new, and probably the simplest and most compact, member of the family of semantics that capture basic intuitions and principles of various non-monotonic logics.
【Keywords】: Approximation Fixpoint Theory; Lattice operator; Stable semantics; Well-founded semantics; Groundedness
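On a finite powerset lattice (subsets of a vocabulary ordered by ⊆, with ∧ as intersection), the definition of a grounded fixpoint can be checked by brute force. The sketch below is only illustrative; the three-atom operator is a hypothetical immediate-consequence-style example, not taken from the paper:

    from itertools import chain, combinations

    def subsets(universe):
        # All subsets of a finite universe, as frozensets.
        s = list(universe)
        return [frozenset(c) for c in
                chain.from_iterable(combinations(s, r)
                                    for r in range(len(s) + 1))]

    def is_grounded_fixpoint(O, x, universe):
        # x is grounded iff O(x) = x and, for every v with
        # O(v /\ x) <= v (here: O(v & x) <= v as sets), already x <= v.
        if O(x) != x:
            return False
        return all(x <= v for v in subsets(universe) if O(v & x) <= v)

    # Operator of the (hypothetical) program {p.  q :- p.  r :- r.}:
    def O(I):
        out = {"p"}
        if "p" in I: out.add("q")
        if "r" in I: out.add("r")
        return frozenset(out)

    U = {"p", "q", "r"}
    fps = [x for x in subsets(U) if O(x) == x]      # {p,q} and {p,q,r}
    print([set(x) for x in fps if is_grounded_fixpoint(O, x, U)])
    # -> [{'p', 'q'}]: {p,q,r} is a fixpoint, but r only supports itself
    #    (a non-trivial unfounded set), so it is not grounded.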
【Paper Link】 【Pages】:1460-1466
【Authors】: Adrian Boteanu ; Sonia Chernova
【Abstract】: Analogies are a fundamental human reasoning pattern that relies on relational similarity. Understanding how analogies are formed facilitates the transfer of knowledge between contexts. The approach presented in this work focuses on obtaining precise interpretations of analogies. We leverage noisy semantic networks to answer and explain a wide spectrum of analogy questions. The core of our contribution, the Semantic Similarity Engine, consists of methods for extracting and comparing graph-contexts that reveal the relational parallelism that analogies are based on, while mitigating uncertainty in the semantic network. We demonstrate these methods in two tasks: answering multiple choice analogy questions and generating human readable analogy explanations. We evaluate our approach on two datasets totaling 600 analogy questions. Our results show reliable performance and low false-positive rate in question answering; human evaluators agreed with 96% of our analogy explanations.
【Keywords】: semantic; semantic networks; analogy; interpretable; context
【Paper Link】 【Pages】:1467-1474
【Authors】: Gerhard Brewka ; James P. Delgrande ; Javier Romero ; Torsten Schaub
【Abstract】: In this paper we describe asprin, a general, flexible, and extensible framework for handling preferences among the stable models of a logic program. We show how complex preference relations can be specified through user-defined preference types and their arguments. We describe how preference specifications are handled internally by so-called preference programs, which are used for dominance testing. We also give algorithms for computing one, or all, optimal stable models of a logic program. Notably, our algorithms depend on the complexity of the dominance tests and make use of multi-shot answer set solving technology.
【Keywords】: preferences, logic programming, answer set programming
【Paper Link】 【Pages】:1475-1481
【Authors】: Federico Cerutti ; Ilias Tachmazidis ; Mauro Vallati ; Sotirios Batsakis ; Massimiliano Giacomin ; Grigoris Antoniou
【Abstract】: The abstract argumentation framework (AF) is a unifying framework able to encompass a variety of nonmonotonic reasoning approaches, logic programming and computational argumentation. Yet, efficient approaches for most of the decision and enumeration problems associated with AFs are missing, thus limiting the efficacy of argumentation-based approaches in real domains. In this paper, we present an algorithm for enumerating the preferred extensions of abstract argumentation frameworks which exploits parallel computation. To this purpose, the SCC-recursive semantics definition schema is adopted, where extensions are defined at the level of specific sub-frameworks. The algorithm shows significant performance improvements in large frameworks, in terms of number of solutions found and speedup.
【Keywords】:
【Paper Link】 【Pages】:1482-1488
【Authors】: James P. Delgrande ; Kewen Wang
【Abstract】: In this paper, we present an approach to forgetting in disjunctive logic programs, where forgetting an atom from a program amounts to a reduction in the signature of that program. Notably, the approach is syntax-independent, so that if two programs are strongly equivalent, then the result of forgetting a given atom in each program is also strongly equivalent. Our central definition of forgetting is abstract: forgetting an atom from program P is characterised by the set of those SE consequences of P that do not mention the atom to be forgotten. We provide an equivalent, syntactic, characterization in which forgetting an atom p is given by those rules in the program that do not mention p, together with rules obtained by a single inference step from those rules that do mention p. Forgetting is shown to have appropriate properties; in particular, answer sets are preserved in forgetting an atom. As well, forgetting an atom via the syntactic characterization results in a modest (at worst quadratic) blowup in the program size. Finally, we provide a prototype implementation of this approach to forgetting.
【Keywords】: logic programs; answer set programming; forgetting
【Paper Link】 【Pages】:1489-1495
【Authors】: Jianfeng Du ; Kewen Wang ; Yi-Dong Shen
【Abstract】: ABox abduction plays an important role in reasoning over description logic (DL) ontologies. However, it does not work with inconsistent DL ontologies. To tackle this problem while achieving tractability, we generalize ABox abduction from the classical semantics to an inconsistency-tolerant semantics, namely the Intersection ABox Repair (IAR) semantics, and propose the notion of IAR-explanations in inconsistent DL ontologies. We show that computing all minimal IAR-explanations is tractable in data complexity for first-order rewritable ontologies. However, the computational method may still not be practical due to a possibly large number of minimal IAR-explanations. Hence we propose to use preference information to reduce the number of explanations to be computed.
【Keywords】:
【Paper Link】 【Pages】:1496-1502
【Authors】: Xiuyi Fan ; Francesca Toni
【Abstract】: Argumentation can be viewed as a process of generating explanations. However, existing argumentation semantics are developed for identifying acceptable arguments within a set, rather than giving concrete justifications for them. In this work, we propose a new argumentation semantics, related admissibility, designed for giving explanations to arguments in both Abstract Argumentation and Assumption-based Argumentation. We identify different types of explanations defined in terms of the new semantics. We also give a correct computational counterpart for explanations using dispute forests.
【Keywords】:
【Paper Link】 【Pages】:1503-1510
【Authors】: Dietmar Jannach ; Thomas Schmitz ; Kostyantyn M. Shchekotykhin
【Abstract】: Model-Based Diagnosis techniques have been successfully applied to support a variety of fault-localization tasks both for hardware and software artifacts. In many applications, Reiter's hitting set algorithm has been used to determine the set of all diagnoses for a given problem. In order to construct the diagnoses with increasing cardinality, Reiter proposed a breadth-first search scheme in combination with different tree-pruning rules. Since many of today's computing devices have multi-core CPU architectures, we propose techniques to parallelize the construction of the tree to better utilize the computing resources without losing any diagnoses. Experimental evaluations using different benchmark problems show that parallelization can help to significantly reduce the required running times. Additional simulation experiments were performed to understand how the characteristics of the underlying problem structure impact the achieved performance gains.
【Keywords】: Model-Based Diagnosis; Parallelization
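A minimal sketch of level-wise parallel construction of Reiter's hitting set tree, assuming the conflict sets are given up front (in model-based diagnosis they come from a theorem prover) and omitting node reuse and Reiter's other pruning rules:

    from concurrent.futures import ThreadPoolExecutor

    def find_unhit_conflict(conflicts, h):
        # Stand-in for the theorem-prover call in Reiter's algorithm:
        # return some conflict set not hit by h, or None if h hits all.
        for c in conflicts:
            if not (c & h):
                return c
        return None

    def parallel_hs_tree(conflicts, max_workers=4):
        # Breadth-first HS-tree in which all nodes of one tree level are
        # labelled in parallel; diagnoses appear in order of cardinality.
        diagnoses, level = [], [frozenset()]
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            while level:
                labels = list(pool.map(
                    lambda h: find_unhit_conflict(conflicts, h), level))
                nxt = set()
                for h, c in zip(level, labels):
                    if c is None:            # h hits every conflict set
                        diagnoses.append(h)
                    else:
                        for e in c:          # branch on conflict elements
                            nxt.add(h | {e})
                # prune supersets of already-found (minimal) diagnoses
                level = [h for h in nxt
                         if not any(d <= h for d in diagnoses)]
        return diagnoses

    print(parallel_hs_tree([{"A", "B"}, {"B", "C"}, {"A", "C"}]))
    # -> the three minimal diagnoses {A,B}, {A,C}, {B,C}

Note that with CPython's global interpreter lock a ProcessPoolExecutor would be needed for true multi-core speedups on CPU-bound consistency checks, which is the setting the paper targets.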
【Paper Link】 【Pages】:1511-1517
【Authors】: Jianmin Ji ; Hai Wan ; Ziwei Huo ; Zhenfeng Yuan
【Abstract】: Lifschitz and Turner introduced the notion of the splitting set and provided a method to divide a logic program into two parts. They showed that the task of computing the answer sets of the program can be converted into the tasks of computing the answer sets of these parts. However, the empty set and the set of all atoms are the only two splitting sets for many programs, so these programs cannot be divided by the splitting method. In this paper, we extend Lifschitz and Turner's splitting set theorem to allow the program to be split by an arbitrary set of atoms, while some new atoms may be introduced in the process. To illustrate the usefulness of the result, we show that for some typical programs the splitting process is efficient and that the program simplification problem can be investigated using the concept of splitting.
【Keywords】: Splitting set; Splitting a Logic Program; Program Simplification
【Paper Link】 【Pages】:1518-1524
【Authors】: Jianmin Ji ; Hai Wan ; Peng Xiao
【Abstract】: This paper proposes an alternative definition of elementary loops and extends the notion of proper loops for disjunctive logic programs. In contrast to normal logic programs, recognizing elementary loops and proper loops for disjunctive programs is coNP-complete. To address this problem, we introduce weaker versions of both elementary loops and proper loops and provide polynomial time algorithms for identifying them respectively. On the other hand, based on the notion of elementary loops, the class of Head-Elementary-loop-Free (HEF) programs was presented, which can be turned into equivalent normal logic programs by shifting head atoms into bodies. However, the problem of recognizing an HEF program is coNP-complete. We therefore present a subclass of HEF programs which generalizes the class of Head-Cycle-Free programs and provide a polynomial time algorithm to identify them. Finally, experiments show that both elementary loops and proper loops can be replaced by their weak versions in practice.
【Keywords】: Elementary Loops; Proper Loops; Disjunctive Logic Programs
【Paper Link】 【Pages】:1525-1531
【Authors】: Egor V. Kostylev ; Juan L. Reutter ; Domagoj Vrgoc
【Abstract】: Applications of description logics (DLs) such as ontology-based data access (OBDA) require understanding of how to pose database queries over DL knowledge bases. While there have been many studies regarding traditional relational query formalisms such as conjunctive queries and their extensions, little attention has been paid to graph database queries, despite the fact that graph databases have essentially the same structure as knowledge bases. In particular, not much is known about the interplay between DLs and XPath. The latter is a powerful formalism for querying semistructured data: it is in the core of most practical query languages for XML trees, and it is also gaining popularity in theory and practice of graph databases. In this paper we make a step towards coupling knowledge bases and graph databases by studying how to answer powerful XPath-style queries over simple DLs like DL-Lite and EL. We start with adapting the definition of XPath to the DL context, and then proceed to study the complexity of evaluating XPath queries over knowledge bases. Results show that, while query answering is undecidable for the full XPath, by carefully tuning the shape of negation allowed in the queries we can arrive at XPath fragments that have a potential to be used in practice.
【Keywords】: Description Logics; EL; DL-Lite; Ontology-based query answering; OBDA; graph databases; XPath
【Paper Link】 【Pages】:1532-1538
【Authors】: Yuliya Lierler ; Miroslaw Truszczynski
【Abstract】: Modularity is an essential aspect of knowledge representation theory and practice. It has received substantial attention. We introduce model-based modular systems, an abstract framework for modular knowledge representation formalisms, similar in scope to multi-context systems but employing a simpler information-flow mechanism. We establish the precise relationship between the two frameworks, showing that they can simulate each other. We demonstrate that recently introduced modular knowledge representation formalisms integrating logic programming with satisfiability and, more generally, with constraint satisfaction can be cast as modular systems in our sense. These results show that our formalism offers a simple unifying framework for studies of modularity in knowledge representation.
【Keywords】: knowledge representation; modularity; multi-context systems
【Paper Link】 【Pages】:1539-1545
【Authors】: Xudong Liu ; Miroslaw Truszczynski
【Abstract】: We introduce partial lexicographic preference trees (PLP-trees) as a formalism for compact representations of preferences over combinatorial domains. Our main results concern the problem of passive learning of PLP-trees. Specifically, for several classes of PLP-trees, we study how to learn (i) a PLP-tree consistent with a dataset of examples, possibly subject to requirements on the size of the tree, and (ii) a PLP-tree correctly ordering as many of the examples as possible in case the dataset of examples is inconsistent. We establish the complexity of these problems and, in all cases where the problem is in the class P, propose polynomial time algorithms.
【Keywords】: preference learning, knowledge representation
【Paper Link】 【Pages】:1546-1552
【Authors】: Thomas Lukasiewicz ; Maria Vanina Martinez ; Andreas Pieris ; Gerardo I. Simari
【Abstract】: Querying inconsistent ontologies is an intriguing new problem that gave rise to a flourishing research activity in the description logic (DL) community. The computational complexity of consistent query answering under the main DLs is rather well understood; however, little is known about existential rules. The goal of the current work is to perform an in-depth analysis of the complexity of consistent query answering under the main decidable classes of existential rules enriched with negative constraints. Our investigation focuses on one of the most prominent inconsistency-tolerant semantics, namely, the AR semantics. We establish a generic complexity result, which demonstrates the tight connection between classical and consistent query answering. This result allows us to obtain in a uniform way a relatively complete picture of the complexity of our problem.
【Keywords】: Inconsistency; Consistent query answering; Conjunctive queries; Existential rules; Tuple-generating dependencies; Computational complexity
【Paper Link】 【Pages】:1553-1559
【Authors】: Hua Meng ; Hui Kou ; Sanjiang Li
【Abstract】: In order to properly regulate iterated belief revision, Darwiche and Pearl (1997) model belief revision as revising epistemic states by propositions. An epistemic state in their sense consists of a belief set and a set of conditional beliefs. Although the denotation of an epistemic state can be indirectly captured by a total preorder on the set of worlds, it is unclear how to directly capture the structure in terms of the beliefs and conditional beliefs it contains. In this paper, we first provide an axiomatic characterisation for epistemic states by using nine rules about beliefs and conditional beliefs, and then argue that the last two rules are too strong and should be eliminated for characterising the belief state of an agent. We call a structure which satisfies the first seven rules a general epistemic state (GEP). To provide a semantical characterisation of GEPs, we introduce a mathematical structure called belief algebra, which is in essence a certain binary relation defined on the power set of worlds. We then establish a 1-1 correspondence between GEPs and belief algebras, and show that total preorders on worlds are special cases of belief algebras. Furthermore, using the notion of belief algebras, we extend the classical iterated belief revision rules of Darwiche and Pearl to our setting of general epistemic states.
【Keywords】: Belief Change; Belief Revision; Epistemic State; General Epistemic State; Belief Algebra
【Paper Link】 【Pages】:1560-1568
【Authors】: Boris Motik ; Yavor Nenov ; Robert Edgar Felix Piro ; Ian Horrocks
【Abstract】: Datalog-based systems often materialise all consequences of a datalog program and the data, allowing users' queries to be evaluated directly in the materialisation. This process, however, can be computationally intensive, so most systems update the materialisation incrementally when input data changes. We argue that existing solutions, such as the well-known Delete/Rederive (DRed) algorithm, can be inefficient in cases when facts have many alternate derivations. As a possible remedy, we propose a novel Backward/Forward (B/F) algorithm that tries to reduce the amount of work by a combination of backward and forward chaining. In our evaluation, the B/F algorithm was several orders of magnitude more efficient than the DRed algorithm on some inputs, and it was never significantly less efficient.
【Keywords】: Datalog; Materialisation; Incremental; Deletion; RDF;
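The core question in the B/F algorithm is whether a fact still has a derivation that avoids the deleted input. This can be sketched as a backward search over recorded rule instantiations; the sketch is an illustrative reduction, not the paper's algorithm, which interleaves backward and forward chaining over the datalog program itself:

    def survives(fact, edb, derived_from, removed, visiting=frozenset()):
        # Does `fact` still have a non-circular derivation once the facts
        # in `removed` are retracted? `derived_from` maps each materialised
        # fact to the rule-body instantiations (sets of facts) producing it.
        if fact in edb and fact not in removed:
            return True                    # explicitly given, not deleted
        if fact in visiting:
            return False                   # circular support is no proof
        for body in derived_from.get(fact, ()):
            if all(survives(b, edb, derived_from, removed,
                            visiting | {fact}) for b in body):
                return True                # alternate derivation found
        return False

    # Toy transitive-closure materialisation (hypothetical instance):
    edb = {("edge", "a", "b"), ("edge", "b", "c"), ("edge", "a", "c")}
    derived_from = {
        ("path", "a", "b"): [{("edge", "a", "b")}],
        ("path", "b", "c"): [{("edge", "b", "c")}],
        ("path", "a", "c"): [{("edge", "a", "c")},
                             {("path", "a", "b"), ("path", "b", "c")}],
    }
    # path(a,c) survives deleting edge(a,b): an alternate derivation exists.
    print(survives(("path", "a", "c"), edb, derived_from,
                   removed={("edge", "a", "b")}))   # True

The contrast with DRed is visible here: DRed would first over-delete everything reachable from edge(a,b) and then rederive path(a,c), whereas the backward check confirms the alternate derivation without deleting anything.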
【Paper Link】 【Pages】:1569-1575
【Authors】: Claudia Schulz ; Francesca Toni
【Abstract】: Logic Programming and Argumentation Theory have existed side by side as two separate, yet related, techniques in the field of Knowledge Representation and Reasoning for many years. When Assumption-Based Argumentation (ABA) was first introduced in the nineties, the authors showed how a logic program can be encoded in an ABA framework and proved that the stable semantics of a logic program corresponds to the stable extension semantics of the ABA framework encoding this logic program. We revisit this initial work by proving that the 3-valued stable semantics of a logic program coincides with the complete semantics of the encoding ABA framework, and that the L-stable semantics of this logic program coincides with the semi-stable semantics of the encoding ABA framework. Furthermore, we show how to graphically represent the structure of a logic program encoded in an ABA framework, and that not only logic programming and ABA semantics but also Abstract Argumentation semantics can be easily applied to a logic program using these graphical representations.
【Keywords】: Assumption-Based Argumentation; Logic Programming; Complete Semantics; Semi-Stable Semantics; 3-valued Stable Semantics; L-Stable Semantics; Labelling
【Paper Link】 【Pages】:1576-1582
【Authors】: Anika Schumann ; Freddy Lécué
【Abstract】: Many different types of sensors from complex devices collect data across a city. Their underlying data representations follow manufacturer-specific specifications, whose descriptions (ontologies) may have incomplete alignments. This paper addresses the problem of determining accurate and complete matchings of ontologies given some common descriptions and their pre-determined high-level alignments. In this context the problem of ontology matching consists of automatically determining all matchings given the latter alignments, and manually verifying the matching results. Especially for applications where it is crucial that ontologies are matched correctly, the latter can turn into a very time-consuming task for the user. This paper tackles this challenge and addresses the problem of computing the minimum number of user inputs needed to verify all matchings. We show how to represent this problem as a reasoning problem over a bipartite graph and how to encode it with pseudo-Boolean constraints. Experiments show that our approach can be successfully applied to real-world data sets.
【Keywords】:
【Paper Link】 【Pages】:1583-1589
【Authors】: Christoph Schwering ; Gerhard Lakemeyer
【Abstract】: A fundamental task in reasoning about action and change is projection, which refers to determining what holds after a number of actions have occurred. A powerful method for solving the projection problem is regression, which reduces reasoning about the future to reasoning about the initial state. In particular, regression has played an important role in the situation calculus and its epistemic extensions. Recently, a modal variant of the situation calculus was proposed, which allows an agent to revise its beliefs based on so-called belief conditionals as part of its knowledge base. In this paper, we show how regression can be extended to reduce beliefs about the future to initial beliefs in the presence of belief conditionals. Moreover, we show how any remaining belief operators can be eliminated as well, thus reducing the belief projection problem to ordinary first-order entailments.
【Keywords】: Action, Change, and Causality; Reasoning with Beliefs
【Paper Link】 【Pages】:1590-1596
【Authors】: Nicolas Schwind ; Katsumi Inoue ; Gauvain Bourgne ; Sébastien Konieczny ; Pierre Marquis
【Abstract】: Belief revision games (BRGs) are concerned with the dynamics of the beliefs of a group of communicating agents. BRGs are "zero-player" games where at each step every agent revises her own beliefs by taking into account the beliefs of her acquaintances. Each agent is associated with a belief state defined on some finite propositional language. We provide a general definition for such games where each agent has her own revision policy, and show that the belief sequences of agents can always be finitely characterized. We then define a set of revision policies based on belief merging operators. We point out a set of appealing properties for BRGs and investigate the extent to which these properties are satisfied by the merging-based policies under consideration.
【Keywords】: Belief Revision Games; Belief Change; Social Networks
【Paper Link】 【Pages】:1597-1603
【Authors】: Kostyantyn M. Shchekotykhin
【Abstract】: Broad application of answer set programming (ASP) for declarative problem solving requires the development of tools supporting the coding process. Program debugging is one of the crucial activities within this process. Modern ASP debugging approaches allow efficient computation of possible explanations of a fault. However, even for a small program a debugger might return a large number of possible explanations, and selection of the correct one must be done manually. In this paper we present an interactive query-based ASP debugging method which extends previous approaches and finds the preferred explanation by means of observations. The system automatically generates a sequence of queries to a programmer, asking whether a set of ground atoms must be true in all (cautiously) or some (bravely) answer sets of the program. Since some queries can be more informative than others, we discuss query selection strategies which, given the user's preferences for an explanation, can find the most informative query, reducing the overall number of queries required for the identification of a preferred explanation.
【Keywords】: Logic programming; Answer set programming; Debugging
【Paper Link】 【Pages】:1604-1610
【Authors】: Tran Cao Son ; Enrico Pontelli ; Chitta Baral ; Gregory Gelfond
【Abstract】: The paper proposes a condition for preserving the KD45 property of a Kripke model when a sequence of update models is applied to it. The paper defines the notions of a primitive update model and a semi-reflexive KD45 (or sr-KD45) Kripke model. It proves that updating an sr-KD45 Kripke model using a primitive update model results in an sr-KD45 Kripke model, i.e., a primitive update model preserves the properties of an sr-KD45 Kripke model. It shows that several update models for modeling well-known actions found in the literature are primitive. This result provides guarantees that can be useful in the presence of multiple applications of actions in multi-agent systems (e.g., multi-agent planning).
【Keywords】: Kripke model; update model; belief and knowledge
【Paper Link】 【Pages】:1611-1617
【Authors】: Giorgio Stefanoni ; Boris Motik
【Abstract】: Answering conjunctive queries (CQs) over EL knowledge bases (KBs) with complex role inclusions is PSPACE-hard and in PSPACE in certain cases; however, if complex role inclusions are restricted to role transitivity, a tight upper complexity bound has so far been unknown. Furthermore, the existing algorithms cannot handle reflexive roles, and they are not practicable. Finally, the problem is tractable for acyclic CQs and ELH, and NP-complete for unrestricted CQs and ELHO KBs. In this paper we complete the complexity landscape of CQ answering for several important cases. In particular, we present a practicable NP algorithm for answering CQs over ELHOs KBs—a logic containing all of OWL 2 EL, but with complex role inclusions restricted to role transitivity. Our preliminary evaluation suggests that the algorithm can be suitable for practical use. Moreover, we show that, even for a restricted class of so-called arborescent acyclic queries, CQ answering over EL KBs becomes NP-hard in the presence of either transitive or reflexive roles. Finally, we show that answering arborescent CQs over ELHO KBs is tractable, whereas answering acyclic CQs is NP-hard.
【Keywords】: OWL 2 EL; Conjunctive Queries; Acyclic Conjunctive Queries; Transitive Roles; Reflexive Roles
【Paper Link】 【Pages】:1618-1624
【Authors】: Roni Tzvi Stern ; Meir Kalech ; Shelly Rogov ; Alexander Feldman
【Abstract】: A known limitation of many diagnosis algorithms is that the number of diagnoses they return can be very large. This raises the question of how to use such a large set of diagnoses. For example, presenting hundreds of diagnoses to a human operator (charged with repairing the system) is meaningless. In various settings, including decision support for a human operator and automated troubleshooting processes, it is sufficient to be able to answer a basic diagnostic question: is a given component faulty? We propose a way to aggregate an arbitrarily large set of diagnoses to return an estimate of the likelihood of a given component to be faulty. The resulting mapping of components to their likelihood of being faulty is called the system's health state. We propose two metrics for evaluating the accuracy of a health state and show that an accurate health state can be found without finding all diagnoses. An empirical study explores the question of how many diagnoses are needed to obtain an accurate enough health state, and a simple online stopping criterion is proposed.
【Keywords】: Diagnosis; Reasoning
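The aggregation step can be sketched as follows; weighting each diagnosis by component fault priors is one standard choice, assumed here for illustration rather than taken from the paper:

    from collections import defaultdict

    def health_state(diagnoses, priors):
        # Aggregate a set of diagnoses into a health state: for each
        # component, an estimate of its probability of being faulty.
        # Each diagnosis is a frozenset of components assumed faulty;
        # `priors` gives per-component a-priori fault probabilities.
        def weight(diag, components):
            w = 1.0
            for c in components:
                w *= priors[c] if c in diag else (1.0 - priors[c])
            return w

        components = set(priors)
        total = sum(weight(d, components) for d in diagnoses)
        state = defaultdict(float)
        for d in diagnoses:
            w = weight(d, components) / total
            for c in d:
                state[c] += w
        return {c: state[c] for c in components}

    diags = [frozenset({"A"}), frozenset({"B", "C"})]
    print(health_state(diags, {"A": 0.1, "B": 0.1, "C": 0.1}))
    # -> A is faulty with probability 0.9; B and C with 0.1 each,
    #    since the single-fault diagnosis carries most of the weight.

Because the estimate can be updated after each newly found diagnosis, one can monitor how much it still changes, which is the kind of signal an online stopping criterion can exploit.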
【Paper Link】 【Pages】:1625-1631
【Authors】: Hannes Strass
【Abstract】: We analyze the relative expressiveness of the two-valued semantics of abstract argumentation frameworks, normal logic programs and abstract dialectical frameworks. By expressiveness we mean the ability to encode a desired set of two-valued interpretations over a given propositional vocabulary A using only atoms from A. While the computational complexity of the two-valued model existence problem for all these languages is (almost) the same, we show that the languages form a neat hierarchy with respect to their expressiveness. We then demonstrate that this hierarchy collapses once we allow to introduce a linear number of new vocabulary elements.
【Keywords】: expressiveness; logic programming; abstract argumentation
【Paper Link】 【Pages】:1632-1640
【Authors】: Akshaya Thippur ; Chris Burbridge ; Lars Kunze ; Marina Alberti ; John Folkesson ; Patric Jensfelt ; Nick Hawes
【Abstract】: Object recognition systems can be unreliable when run in isolation, depending only on image based features, but their performance can be improved by taking scene context into account. In this paper, we present techniques to model and infer object labels in real scenes based on a variety of spatial relations (geometric features which capture how objects co-occur) and compare their efficacy in the context of augmenting perception based object classification in real-world table-top scenes. We utilise a long-term dataset of office table-tops for qualitatively comparing the performances of these techniques. On this dataset, we show that more intricate techniques have superior performance but do not generalise well on small training data. We also show that techniques using coarser information perform crudely but sufficiently well in standalone scenarios and generalise well on small training data. We conclude the paper by expanding on the insights we have gained through these comparisons and commenting on a few fundamental topics with respect to long-term autonomous robots.
【Keywords】: Spatial relations; Spatial contexts; Long-term Autonomy; Joint object classification
【Paper Link】 【Pages】:1641-1648
【Authors】: Guy Van den Broeck ; Adnan Darwiche
【Abstract】: Knowledge compilation is a powerful reasoning paradigm with many applications across AI and computer science more broadly. We consider the problem of bottom-up compilation of knowledge bases, which is usually predicated on the existence of a polytime function for combining compilations using Boolean operators (usually called an Apply function). While such a polytime Apply function is known to exist for certain languages (e.g., OBDDs) and not exist for others (e.g., DNNFs), its existence for certain languages remains unknown. Among the latter is the recently introduced language of Sentential Decision Diagrams (SDDs): while a polytime Apply function exists for SDDs, it was unknown whether such a function exists for the important subset of compressed SDDs which are canonical. We resolve this open question in this paper and consider some of its theoretical and practical implications. Some of the findings we report question the common wisdom on the relationship between bottom-up compilation, language canonicity and the complexity of the Apply function.
【Keywords】: knowledge compilation; sentential decision diagrams; decision diagrams
【Paper Link】 【Pages】:1649-1655
【Authors】: Yisong Wang ; Kewen Wang ; Zhe Wang ; Zhiqiang Zhuang
【Abstract】: The theory of (variable) forgetting has received significant attention in nonmonotonic reasoning and, especially, in answer set programming. However, the problem of establishing a theory of forgetting for expressive nonmonotonic logics such as McCarthy's circumscription has rarely been explored. In this paper a theory of forgetting for propositional circumscription is proposed, which is not a straightforward adaptation of existing approaches. In particular, some properties that are essential for existing proposals no longer hold or have to be reformulated. Several useful properties of the new forgetting are proved, which demonstrate the suitability of the forgetting for circumscription. A sound and complete algorithm for the forgetting is developed and an analysis of its computational complexity is given.
【Keywords】: Circumscription; Forgetting; Algorithms; Complexity
【Paper Link】 【Pages】:1656-1662
【Authors】: Zhe Wang ; Kewen Wang ; Zhiqiang Zhuang ; Guilin Qi
【Abstract】: The development and maintenance of large and complex ontologies are often time-consuming and error-prone. Thus, automated ontology learning and evolution have attracted intensive research interest. In data-centric applications where ontologies are designed from the data or automatically learnt from it, when new data instances are added that contradict the ontology, it is often desirable to incrementally revise the ontology according to the added data. In description logics, this problem can be intuitively formulated as the operation of TBox contraction, i.e., the rational elimination of certain axioms from the logical consequences of a TBox, performed with respect to a given ABox. In this paper we introduce a model-theoretic approach to such a contraction problem by using an alternative semantic characterisation of DL-Lite TBoxes. We show that entailment checking (without necessarily first computing the contraction result) is in coNP, which does not shift the corresponding complexity in propositional logic, and that the problem is tractable when the size of the new data is bounded.
【Keywords】: ontology change, belief revision, DL-Lite
【Paper Link】 【Pages】:1663-1670
【Authors】: Fei Wu ; Jun Song ; Yi Yang ; Xi Li ; Zhongfei (Mark) Zhang ; Yueting Zhuang
【Abstract】: We consider the problem of embedding entities and relations of knowledge bases into low-dimensional continuous vector spaces (distributed representations). Unlike most existing approaches, which are primarily efficient for modelling pairwise relations between entities, we attempt to explicitly model both pairwise relations and long-range interactions between entities, by interpreting them as linear operators on the low-dimensional embeddings of the entities. Therefore, in this paper we introduce Path-Ranking to capture the long-range interactions of a knowledge graph while at the same time preserving its pairwise relations; we call the resulting model 'structured embedding via pairwise relations and long-range interactions' (SePLi). Compared with state-of-the-art models, SePLi achieves better embedding performance.
【Keywords】: structure embedding; pairwise relation; long-range interaction; knowledge graph; relational data
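The idea of relations as linear operators can be sketched generically: a relation path is scored by chaining the relations' matrices over the head embedding and comparing with the tail. This is a common compositional formulation assumed for illustration, not SePLi's exact objective; entities, relations and dimensions below are hypothetical:

    import numpy as np

    def path_score(head, tail, path_relations, ent_emb, rel_mats):
        # Apply each relation's linear operator to the head embedding in
        # sequence (long-range interaction = operator chain), then score
        # by negative distance to the tail embedding.
        x = ent_emb[head]
        for r in path_relations:
            x = rel_mats[r] @ x
        return -np.linalg.norm(x - ent_emb[tail])

    rng = np.random.default_rng(0)
    ent_emb = {"paris": rng.normal(size=8), "france": rng.normal(size=8),
               "europe": rng.normal(size=8)}
    rel_mats = {"capital_of": rng.normal(size=(8, 8)),
                "located_in": rng.normal(size=(8, 8))}
    # pairwise relation: paris -capital_of-> france
    print(path_score("paris", "france", ["capital_of"], ent_emb, rel_mats))
    # long-range: paris -capital_of-> france -located_in-> europe
    print(path_score("paris", "europe",
                     ["capital_of", "located_in"], ent_emb, rel_mats))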
【Paper Link】 【Pages】:1671-1677
【Authors】: Dongmo Zhang ; Michael Thielscher
【Abstract】: This paper introduces a modal logic for reasoning about game strategies. The logic is based on a variant of the well-known game description language for describing game rules and further extends it with two modalities for reasoning about actions and strategies. We develop an axiomatic system and prove its soundness and completeness with respect to a specific semantics based on the state transition model of games. Interestingly, the completeness proof makes use of forgetting techniques that have been widely used in the KR&R literature. We demonstrate how general game-playing systems can apply the logic to develop game strategies.
【Keywords】: Strategic reasoning, general game playing, multi-agent systems
【Paper Link】 【Pages】:1678-1685
【Authors】: Heng Zhang ; Yan Zhang ; Jia-Huai You
【Abstract】: Finite chase, or alternatively chase termination, is an important condition to ensure the decidability of existential rule languages. In the past few years, a number of rule languages with finite chase have been studied. In this work, we propose a novel approach for classifying the rule languages with finite chase. Using this approach, a family of decidable rule languages, which extend the existing languages with the finite chase property, are naturally defined. We then study the complexity of these languages. Although all of them are tractable for data complexity, we show that their combined complexity can be arbitrarily high. Furthermore, we prove that all the rule languages with finite chase that extend the weakly acyclic language are of the same expressiveness as the weakly acyclic one, while rule languages with higher combined complexity are in general more succinct than those with lower combined complexity.
【Keywords】: existential rules; finite chase; complexity; expressiveness; succinctness
【Paper Link】 【Pages】:1686-1692
【Authors】: Sachinthaka Abeywardana ; Fabio Ramos
【Abstract】: Quantile regression deals with the problem of computing robust estimators when the conditional mean and standard deviation of the predicted function are inadequate to capture its variability. The technique has an extensive list of applications, including health sciences, ecology and finance. In this work we present a non-parametric method of inferring quantiles and derive a novel Variational Bayesian (VB) approximation to the marginal likelihood, leading to an elegant Expectation Maximisation algorithm for learning the model. Our method is nonparametric, has strong convergence guarantees, and can deal with nonsymmetric quantiles seamlessly. We compare the method to other parametric and non-parametric Bayesian techniques, as well as alternative approximations based on expectation propagation, demonstrating the benefits of our framework on toy problems and real datasets.
【Keywords】: Quantile Regression; Variational Bayes; Gaussian Processes
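For reference, quantile regression replaces the squared error with the standard asymmetric pinball (check) loss, the usual starting point for such models (this is the textbook definition, not the paper's specific likelihood):

    % Pinball (check) loss for quantile level \tau \in (0, 1),
    % with residual u = y - f(x):
    \rho_\tau(u) =
      \begin{cases}
        \tau \, u       & \text{if } u \ge 0, \\
        (\tau - 1) \, u & \text{if } u < 0.
      \end{cases}
    % The \tau-quantile of y minimises E[\rho_\tau(y - q)] over q;
    % \tau = 0.5 recovers the median (absolute-error) case, and
    % \tau \ne 0.5 gives the nonsymmetric quantiles mentioned above.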
【Paper Link】 【Pages】:1693-1699
【Authors】: Ognjen Arandjelovic
【Abstract】: Clinical trial adaptation refers to any adjustment of the trial protocol after the onset of the trial. The main goal is to make the process of introducing new medical interventions to patients more efficient by reducing the cost and the time associated with evaluating their safety and efficacy. The principal question is how adaptation should be performed so as to minimize the chance of distorting the outcome of the trial. We propose a novel method for achieving this. Unlike previous work, our approach focuses on trial adaptation by sample size adjustment. We adopt a recently proposed stratification framework based on collected auxiliary data and show that this information, together with the primary measured variables, can be used to make a probabilistically informed choice of the particular sub-group a sample should be removed from. Experiments on simulated data are used to illustrate the effectiveness of our method and its application in practice.
【Keywords】: Medicine; clinical; RCT; CCT; Bayesian
【Paper Link】 【Pages】:1700-1706
【Authors】: Sarah Marie Brown ; Andrea Webb ; Rami Mangoubi ; Jennifer Dy
【Abstract】: Current diagnostic methods for mental pathologies, including Post-Traumatic Stress Disorder (PTSD), involve a clinician-coded interview, which can be subjective. Heart rate and skin conductance, as well as other peripheral physiology measures, have previously shown utility in predicting binary diagnostic decisions. The binary decision problem is easier, but misses important information on the severity of the patient’s condition. This work utilizes a novel experimental set-up that exploits virtual reality videos and peripheral physiology for PTSD diagnosis. In pursuit of an automated physiology-based objective diagnostic method, we propose a learning formulation that integrates the description of the experimental data and expert knowledge on desirable properties of a physiological diagnostic score. From a list of desired criteria, we derive a new cost function that combines regression and classification while learning the salient features for predicting physiological score. The physiological score produced by Sparse Combined Regression-Classification (SCRC) is assessed with respect to three sets of criteria chosen to reflect design goals for an objective, physiological PTSD score: parsimony and context of selected features, diagnostic score validity, and learning generalizability. For these criteria, we demonstrate that Sparse Combined Regression-Classification performs better than more generic learning approaches.
【Keywords】: PTSD; physiology based diagnosis; cost function; classification; regression
【Paper Link】 【Pages】:1707-1713
【Authors】: Zheng Chen ; Minmin Chen ; Kilian Q. Weinberger ; Weixiong Zhang
【Abstract】: Link prediction and multi-label learning on graphs are two important but challenging machine learning problems that have broad applications in diverse fields. Not only are the two problems inherently correlated and often appear concurrently, they are also exacerbated by incomplete data. We develop a novel algorithm to solve these two problems jointly under a unified framework, which helps reduce the impact of graph noise and benefits both tasks individually. We reduce the multi-label learning problem to an additional link prediction task and solve both problems with marginalized denoising, which we co-regularize with Laplacian smoothing. This approach combines both learning tasks into a single convex objective function, which we optimize efficiently with iterative closed-form updates. The resulting approach performs significantly better than prior work on several important real-world applications with great consistency.
【Keywords】: link prediction; multi-label learning; marginalized denoising; protein-protein interaction; social networks
【Paper Link】 【Pages】:1714-1720
【Authors】: Xin-Yu Dai ; Jianbing Zhang ; Shujian Huang ; Jiajun Chen ; Zhi-Hua Zhou
【Abstract】: In many learning tasks with structural properties, structural sparsity methods help induce sparse models, usually leading to better interpretability and higher generalization performance. One popular approach is to use group sparsity regularization that enforces sparsity on the clustered groups of features, while another popular approach is to adopt graph sparsity regularization that considers sparsity on the link structure of graph embedded features. Both the group and graph structural properties co-exist in many applications. However, group sparsity and graph sparsity have not been considered simultaneously yet. In this paper, we propose a g²-regularization that takes group and graph sparsity into joint consideration, and present an effective approach for its optimization. Experiments on both synthetic and real data show that enforcing group-graph sparsity leads to better performance than using group sparsity or graph sparsity only.
【Keywords】: Group Lasso, Graph Sparsity, Sparse Model; Feature Selection
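One plausible reading of such a combined penalty, shown only as an illustration (the paper's exact g²-regularizer may differ), is the sum of group L2 norms plus a total-variation term over the edges of the feature graph:

    import numpy as np

    def g2_penalty(w, groups, edges, lam_group=1.0, lam_graph=1.0):
        # Group sparsity: L2 norm per feature group (group-lasso style).
        group_term = sum(np.linalg.norm(w[list(g)]) for g in groups)
        # Graph sparsity: encourage linked features to share weights.
        graph_term = sum(abs(w[i] - w[j]) for i, j in edges)
        return lam_group * group_term + lam_graph * graph_term

    w = np.array([1.0, 0.9, 0.0, 0.0, 2.0])
    print(g2_penalty(w, groups=[[0, 1], [2, 3], [4]], edges=[(0, 1), (3, 4)]))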
【Paper Link】 【Pages】:1721-1727
【Authors】: Huiji Gao ; Jiliang Tang ; Xia Hu ; Huan Liu
【Abstract】: The rapid urban expansion has greatly extended the physical boundary of users' living area and developed a large number of POIs (points of interest). POI recommendation is a task that facilitates users' urban exploration and helps them filter uninteresting POIs for decision making. While existing work of POI recommendation on location-based social networks (LBSNs) discovers the spatial, temporal, and social patterns of user check-in behavior, the use of content information has not been systematically studied. The various types of content information available on LBSNs could be related to different aspects of a user's check-in action, providing a unique opportunity for POI recommendation. In this work, we study the content information on LBSNs w.r.t. POI properties, user interests, and sentiment indications. We model the three types of information under a unified POI recommendation framework with the consideration of their relationship to check-in actions. The experimental results exhibit the significance of content information in explaining user behavior, and demonstrate its power to improve POI recommendation performance on LBSNs.
【Keywords】: POI Recommendation; Location-based Social Networks; Content Aware
【Paper Link】 【Pages】:1728-1734
【Authors】: Matthew Gingerich ; Cristina Conati
【Abstract】: A user-adaptive information visualization system capable of learning models of users and the visualization tasks they perform could provide interventions optimized for helping specific users in specific task contexts. In this paper, we investigate the accuracy of predicting visualization tasks, user performance on tasks, and user traits from gaze data. We show that predictions made with a logistic regression model are significantly better than a baseline classifier, with particularly strong results for predicting task type and user performance. Furthermore, we compare classifiers built with interface-independent and interface-dependent features, and show that the interface-independent features are comparable or superior to interface-dependent ones. Finally, we discuss how the accuracy of predictive models is affected if they are trained with data from trials that had highlighting interventions added to the visualization.
【Keywords】: eye-tracking; user modelling; applied machine learning
【Paper Link】 【Pages】:1735-1741
【Authors】: Anshul Gupta ; Ricardo Gutierrez-Osuna ; Matthew Christy ; Boris Capitanu ; Loretta Auvil ; Liz Grumbach ; Richard Furuta ; Laura Mandell
【Abstract】: Mass digitization of historical documents is a challenging problem for optical character recognition (OCR) tools. Issues include noisy backgrounds and faded text due to aging, border/marginal noise, bleed-through, skewing, warping, as well as irregular fonts and page layouts. As a result, OCR tools often produce a large number of spurious bounding boxes (BBs) in addition to those that correspond to words in the document. This paper presents an iterative classification algorithm to automatically label BBs (i.e., as text or noise) based on their spatial distribution and geometry. The approach uses a rule-based classifier to generate initial text/noise labels for each BB, followed by an iterative classifier that refines the initial labels by incorporating local information about each BB: its spatial location, shape and size. When evaluated on a dataset containing over 72,000 manually-labeled BBs from 159 historical documents, the algorithm can classify BBs with 0.95 precision and 0.96 recall. Further evaluation on a collection of 6,775 documents with ground-truth transcriptions shows that the algorithm can also be used to predict document quality (0.7 correlation) and improve OCR transcriptions in 85% of the cases.
【Keywords】: optical character recognition;machine learning;digital humanities;document triage
【Paper Link】 【Pages】:1742-1748
【Authors】: Nils Yannick Hammerla ; James Fisher ; Peter Andras ; Lynn Rochester ; Richard Walker ; Thomas Ploetz
【Abstract】: Management of Parkinson's Disease (PD) could be improved significantly if reliable, objective information about fluctuations in disease severity can be obtained in ecologically valid surroundings such as the private home. Although automatic assessment in PD has been studied extensively, so far no approach has been devised that is useful for clinical practice. Analysis approaches common in the field lack the capability of exploiting data from realistic environments, which represents a major barrier towards practical assessment systems. The very unreliable and infrequent labelling of ambiguous, low resolution movement data collected in such environments represents a very challenging analysis setting, where advances would have significant societal impact in our ageing population. In this work we propose an assessment system that abides by practical usability constraints and applies deep learning to differentiate disease state in data collected in naturalistic settings. Based on a large data-set collected from 34 people with PD, we illustrate that deep learning outperforms other approaches in generalisation performance, despite the unreliable labelling characteristic of this problem setting, and how such systems could improve current clinical practice.
【Keywords】: Deep learning, Parkinson's Disease, Accelerometer, Naturalistic Environments
【Paper Link】 【Pages】:1749-1755
【Authors】: Jiazhen He ; James Bailey ; Benjamin I. P. Rubinstein ; Rui Zhang
【Abstract】: Massive Open Online Courses (MOOCs) have received widespread attention for their potential to scale higher education, with multiple platforms such as Coursera, edX and Udacity recently appearing. Despite their successes, a major problem faced by MOOCs is low completion rates. In this paper, we explore the accurate early identification of students who are at risk of not completing courses. We build predictive models weekly, over multiple offerings of a course. Furthermore, we envision student interventions that present meaningful probabilities of failure, enacted only for marginal students. To be effective, predicted probabilities must be both well-calibrated and smoothed across weeks. Based on logistic regression, we propose two transfer learning algorithms to trade off smoothness and accuracy by adding a regularization term to minimize the difference of failure probabilities between consecutive weeks. Experimental results on two offerings of a Coursera MOOC establish the effectiveness of our algorithms.
【Keywords】: MOOCs; transfer learning; logistic regression
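The paper's regularizer penalises differences in predicted failure probabilities between consecutive weeks; the sketch below substitutes the simpler surrogate of penalising weight differences, just to show the general shape of such a smoothed weekly logistic regression (function and variable names are hypothetical):

    import numpy as np
    from scipy.optimize import minimize

    def fit_week(X, y, w_prev, lam=1.0):
        # y in {-1, +1}; lam pulls this week's weights toward last
        # week's, smoothing predictions across consecutive weeks.
        def objective(w):
            nll = np.sum(np.logaddexp(0.0, -y * (X @ w)))
            return nll + lam * np.sum((w - w_prev) ** 2)
        return minimize(objective, w_prev, method="L-BFGS-B").x

    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 5))
    y = np.sign(X @ np.ones(5) + 0.5 * rng.standard_normal(200))
    w1 = fit_week(X, y, np.zeros(5), lam=0.1)
    w2 = fit_week(X, y, w1, lam=10.0)  # larger lam -> smoother trajectory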
【Paper Link】 【Pages】:1756-1762
【Authors】: Mohamed Hamza Ibrahim ; Christopher J. Pal ; Gilles Pesant
【Abstract】: One key challenge in statistical relational learning (SRL) is scalable inference. Unfortunately, most real-world problems in SRL have expressive models that translate into large grounded networks, representing a bottleneck for any inference method and weakening its scalability. In this paper we introduce Preference Relaxation (PR), a two-stage strategy that uses the determinism present in the underlying model to improve the scalability of relational inference. The basic idea of PR is that if the underlying model involves mandatory (i.e. hard) constraints as well as preferences (i.e. soft constraints) then it is potentially wasteful to allocate memory for all constraints in advance when performing inference. To avoid this, PR starts by relaxing preferences and performing inference with hard constraints only. It then removes variables that violate hard constraints, thereby avoiding irrelevant computations involving preferences. In addition it uses the removed variables to enlarge the evidence database. This reduces the effective size of the grounded network. Our approach is general and can be applied to various inference methods in relational domains. Experiments on real-world applications show how PR substantially scales relational inference with a minor impact on accuracy.
【Keywords】: Statistical Relational Learning; Relational Inference; Markov Logic Networks
【Paper Link】 【Pages】:1763-1769
【Authors】: Been Kim ; Kayur Patel ; Afshin Rostamizadeh ; Julie A. Shah
【Abstract】: The majority of machine learning research has been focused on building models and inference techniques with sound mathematical properties and cutting edge performance. Little attention has been devoted to the development of data representations that can be used to improve a user's ability to interpret the data and machine learning models to solve real-world problems. In this paper, we quantitatively and qualitatively evaluate an efficient, accurate and scalable feature-compression method using latent Dirichlet allocation for discrete data. This representation can effectively communicate the characteristics of high-dimensional, complex data points. We show that the improvement in a user's interpretability through the use of a topic modeling-based compression technique is statistically significant, according to a number of metrics, when compared with other representations. Also, we find that this representation is scalable: it maintains alignment with human classification accuracy as an increasing number of data points are shown. In addition, the learned topic layer can semantically deliver meaningful information to users that could potentially aid human reasoning about data characteristics in connection with the compressed topic space.
【Keywords】:
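The compression step itself is standard topic modelling: each document is summarised by its K-dimensional topic proportions. A small scikit-learn sketch of this kind of LDA-based compression (an assumed pipeline for illustration, not the authors' exact setup):

    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.feature_extraction.text import CountVectorizer

    docs = ["the cat sat on the mat", "dogs and cats are pets",
            "stocks fell as markets slid", "traders sold shares"]
    X = CountVectorizer().fit_transform(docs)
    lda = LatentDirichletAllocation(n_components=2, random_state=0)
    theta = lda.fit_transform(X)  # each row: a 2-dim topic summary of a doc
    print(theta.round(2))         # the compact view shown to users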
【Paper Link】 【Pages】:1770-1776
【Authors】: Phillip B. Kirlin ; David Jensen
【Abstract】: The overarching goal of music theory is to explain the inner workings of a musical composition by examining the structure of the composition. Schenkerian music theory supposes that Western tonal compositions can be viewed as hierarchies of musical objects. The process of Schenkerian analysis reveals this hierarchy by identifying connections between notes or chords of a composition that illustrate both the small- and large-scale construction of the music. We present a new probabilistic model of this variety of music analysis, details of how the parameters of the model can be learned from a corpus, an algorithm for deriving the most probable analysis for a given piece of music, and both quantitative and human-based evaluations of the algorithm's performance. This represents the first large-scale data-driven computational approach to hierarchical music analysis.
【Keywords】: music informatics; Schenkerian analysis; supervised learning
【Paper Link】 【Pages】:1777-1783
【Authors】: Thomas A. Lasko
【Abstract】: Sampling repeated clinical laboratory tests with appropriate timing is challenging because the latent physiologic function being sampled is in general nonstationary. When ordering repeated tests, clinicians adopt various simple strategies that may or may not be well suited to the behavior of the function. Previous research on this topic has been primarily focused on cost-driven assessments of oversampling. But for monitoring physiologic state or for retrospective analysis, undersampling can be much more problematic than oversampling. In this paper we analyze hundreds of observation sequences of four different clinical laboratory tests to provide principled, data-driven estimates of undersampling and oversampling, and to assess whether the sampling adapts to changing volatility of the latent function. To do this, we developed a new method for fitting a Gaussian process to samples of a nonstationary latent function. Our method includes an explicit estimate of the latent function's volatility over time, which is deterministically related to its nonstationarity. We find on average that the degree of undersampling is up to an order of magnitude greater than that of oversampling, and that only a small minority of sequences are sampled with an adaptive strategy.
【Keywords】: Gaussian Processes; Nonstationarity; Approximate Inference; Medicine; Secondary Use; Clinical Laboratory Tests; Sampling Strategies; Utilization Management
【Paper Link】 【Pages】:1784-1790
【Authors】: Qing Li ; LiLing Jiang ; Ping Li ; Hsinchun Chen
【Abstract】: Stock movements are essentially driven by new information. Market data, financial news, and social sentiment are believed to have impacts on stock markets. To study the correlation between information and stock movements, previous works typically concatenate the features of different information sources into one super feature vector. However, such concatenated vector approaches treat each information source separately and ignore their interactions. In this article, we model the multi-faceted investors’ information and their intrinsic links with tensors. To identify the nonlinear patterns between stock movements and new information, we propose a supervised tensor regression learning approach to investigate the joint impact of different information sources on stock markets. Experiments on CSI 100 stocks in the year 2011 show that our approach outperforms the state-of-the-art trading strategies.
【Keywords】: Tensor;stock;news;social media;trading strategy
【Paper Link】 【Pages】:1791-1797
【Authors】: Zhe Lim ; Benjamin I. P. Rubinstein
【Abstract】: Matching and merging data from conflicting sources is the bread and butter of data integration, which drives search verticals, e-commerce comparison sites and cyber intelligence. Schema matching lifts data integration - traditionally focused on well-structured data - to highly heterogeneous sources. While schema matching has enjoyed significant success in matching data attributes, inconsistencies can exist at a deeper level, making full integration difficult or impossible. We propose a more fine-grained approach that focuses on correspondences between the values of attributes across data sources. Since the semantics of attribute values derive from their use and co-occurrence, we argue for the suitability of canonical correlation analysis (CCA) and its variants. We demonstrate the superior statistical and computational performance of multiple sparse CCA compared to a suite of baseline algorithms, on two datasets which we are releasing to stimulate further research. Our crowd-annotated data covers both cases where it is relatively easy for humans to supply ground truth and cases that are inherently difficult for human computation.
【Keywords】: Canonical Correlation Analysis; CCA; Schema Matching; Entity Resolution; Merging
【Paper Link】 【Pages】:1798-1804
【Authors】: Zitao Liu ; Milos Hauskrecht
【Abstract】: Linear Dynamical System (LDS) is an elegant mathematical framework for modeling and learning Multivariate Time Series (MTS). However, in general, it is difficult to set the dimension of an LDS's hidden state space. A small number of hidden states may not be able to model the complexities of an MTS, while a large number of hidden states can lead to overfitting. In this paper, we study learning methods that impose various regularization penalties on the transition matrix of the LDS model and propose a regularized LDS learning framework (rLDS) which aims to (1) automatically shut down LDSs' spurious and unnecessary dimensions, and consequently, address the problem of choosing the optimal number of hidden states; (2) prevent the overfitting problem given a small amount of MTS data; and (3) support accurate MTS forecasting. To learn the regularized LDS from data, we incorporate a second order cone program and a generalized gradient descent method into the Maximum a Posteriori framework and use Expectation Maximization to obtain a low-rank transition matrix of the LDS model. We propose two priors for modeling the matrix which lead to two instances of our rLDS. We show that our rLDS is able to recover well the intrinsic dimensionality of the time series dynamics and it improves the predictive performance when compared to baselines on both synthetic and real-world MTS datasets.
【Keywords】: Linear Dynamical System; Multivariate Time Series; Low Rank Approximation
【Paper Link】 【Pages】:1805-1811
【Authors】: Canyi Lu ; Changbo Zhu ; Chunyan Xu ; Shuicheng Yan ; Zhouchen Lin
【Abstract】: This work studies the Generalized Singular Value Thresholding (GSVT) operator associated with a nonconvex function g defined on the singular values of X. We prove that GSVT can be obtained by performing the proximal operator of g on the singular values, since Prox_g(·) is monotone when g is lower bounded. If the nonconvex g satisfies some conditions (many popular nonconvex surrogates of the l0-norm, e.g., the lp-norm with 0 < p < 1, are special cases), a general solver to find Prox_g(b) is proposed for any b ≥ 0. GSVT greatly generalizes the known Singular Value Thresholding (SVT), which is a basic subroutine in many convex low rank minimization methods. We are able to solve the nonconvex low rank minimization problem by using GSVT in place of SVT.
【Keywords】: nonconvex optimization, low rank, singular value
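The GSVT operator itself is easy to state: take an SVD and apply the prox of g to the singular values. A NumPy sketch, with soft-thresholding as the special case that recovers classical SVT:

    import numpy as np

    def gsvt(X, prox_g):
        # Apply the proximal operator of g to the singular values of X.
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        return U @ np.diag(prox_g(s)) @ Vt

    # g = tau * ||.||_1 on singular values gives the classical SVT.
    tau = 0.5
    svt_prox = lambda s: np.maximum(s - tau, 0.0)
    X = np.random.default_rng(0).standard_normal((6, 4))
    print(np.linalg.matrix_rank(gsvt(X, svt_prox)))  # small singular values zeroed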
【Paper Link】 【Pages】:1812-1818
【Authors】: Baharan Mirzasoleiman ; Ashwinkumar Badanidiyuru ; Amin Karbasi ; Jan Vondrák ; Andreas Krause
【Abstract】: Is it possible to maximize a monotone submodular function faster than the widely used lazy greedy algorithm (also known as accelerated greedy), both in theory and practice? In this paper, we develop the first linear-time algorithm for maximizing a general monotone submodular function subject to a cardinality constraint. We show that our randomized algorithm, STOCHASTIC-GREEDY, can achieve a (1 − 1/e − ε) approximation guarantee, in expectation, to the optimum solution in time linear in the size of the data and independent of the cardinality constraint. We empirically demonstrate the effectiveness of our algorithm on submodular functions arising in data summarization, including training large-scale kernel methods, exemplar-based clustering, and sensor placement. We observe that STOCHASTIC-GREEDY practically achieves the same utility value as lazy greedy but runs much faster. More surprisingly, we observe that in many practical scenarios STOCHASTIC-GREEDY does not even evaluate every data point once and still achieves results indistinguishable from those of lazy greedy.
【Keywords】: submodular functions, approximation algorithms, greedy algorithms
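The algorithm is short enough to state in full: at each of the k steps, draw a random subsample of size (n/k)·log(1/ε) and add the element of largest marginal gain. A self-contained sketch on a toy coverage function:

    import numpy as np

    def stochastic_greedy(ground_set, f, k, eps=0.1, seed=0):
        # Per step, scan only a random subsample instead of all elements.
        rng = np.random.default_rng(seed)
        n = len(ground_set)
        m = min(n, int(np.ceil((n / k) * np.log(1.0 / eps))))
        S, remaining = [], list(ground_set)
        for _ in range(k):
            idx = rng.choice(len(remaining), size=min(m, len(remaining)),
                             replace=False)
            base = f(S)
            best = max((remaining[i] for i in idx),
                       key=lambda e: f(S + [e]) - base)
            S.append(best)
            remaining.remove(best)
        return S

    # Toy monotone submodular function: set coverage.
    sets = {i: set(np.random.default_rng(i).integers(0, 50, 10))
            for i in range(100)}
    f = lambda S: len(set().union(*(sets[e] for e in S))) if S else 0
    print(stochastic_greedy(list(sets), f, k=5))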
【Paper Link】 【Pages】:1819-1825
【Authors】: George D. Montanez ; Saeed Amizadeh ; Nikolay Laptev
【Abstract】: Faced with the problem of characterizing systematic changes in multivariate time series in an unsupervised manner, we derive and test two methods of regularizing hidden Markov models for this task. Regularization on state transitions provides smooth transitioning among states, such that the sequences are split into broad, contiguous segments. Our methods are compared with a recent hierarchical Dirichlet process hidden Markov model (HDP-HMM) and a baseline standard hidden Markov model, of which the former suffers from poor performance on moderate-dimensional data and sensitivity to parameter settings, while the latter suffers from rapid state transitioning, over-segmentation and poor performance on a segmentation task involving human activity accelerometer data from the UCI Repository. The regularized methods developed here are able to perfectly characterize change of behavior in the human activity data for roughly half of the real-data test cases, with accuracy of 94% and low variation of information. In contrast to the HDP-HMM, our methods provide simple, drop-in replacements for standard hidden Markov model update rules, allowing standard expectation maximization (EM) algorithms to be used for learning.
【Keywords】: state-persistent HMMs; hidden Markov models; segmenting time series; multivariate time series
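One drop-in regularization of the standard EM update that yields the kind of state persistence described here is a self-transition bonus in the M-step (a generic sticky-HMM device; the authors' exact regularizers may differ):

    import numpy as np

    def regularized_transition_update(expected_counts, kappa=5.0):
        # Add a pseudo-count bonus on the diagonal before normalising,
        # discouraging rapid state switching and over-segmentation.
        K = expected_counts.shape[0]
        counts = expected_counts + kappa * np.eye(K)
        return counts / counts.sum(axis=1, keepdims=True)

    C = np.array([[8.0, 2.0], [3.0, 7.0]])  # expected transition counts from E-step
    print(regularized_transition_update(C, kappa=5.0))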
【Paper Link】 【Pages】:1826-1832
【Authors】: Richard Jayadi Oentaryo ; Stephanus Daniel Handoko ; Hoong Chuin Lau
【Abstract】: The abundance of algorithms developed to solve different problems has given rise to an important research question: How do we choose the best algorithm for a given problem? Known as algorithm selection, this issue has been prevailing in many domains, as no single algorithm can perform best on all problem instances. Traditional algorithm selection and portfolio construction methods typically treat the problem as a classification or regression task. In this paper, we present a new approach that provides a more natural treatment of algorithm selection and portfolio construction as a ranking task. Accordingly, we develop a Ranking-Based Algorithm Selection (RAS) method, which employs a simple polynomial model to capture the ranking of different solvers for different problem instances. We devise an efficient iterative algorithm that can gracefully optimize the polynomial coefficients by minimizing a ranking loss function, which is derived from a sound probabilistic formulation of the ranking problem. Experiments on the SAT 2012 competition dataset show that our approach yields competitive performance to that of more sophisticated algorithm selection methods.
【Keywords】: algorithm selection; ranking; satisfiability problem
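A sound probabilistic ranking loss of the kind the abstract mentions is the Plackett-Luce negative log-likelihood of an observed solver ranking given model scores; a sketch (illustrative only; the paper's exact loss and polynomial scoring model are not reproduced):

    import numpy as np

    def plackett_luce_nll(scores, ranking):
        # P(ranking) = prod over positions of
        #   exp(s_i) / sum_{j still unranked} exp(s_j)
        nll, remaining = 0.0, list(ranking)
        for i in ranking:
            logits = np.array([scores[j] for j in remaining])
            nll -= scores[i] - np.log(np.sum(np.exp(logits)))
            remaining.remove(i)
        return nll

    scores = np.array([2.0, 0.5, -1.0])          # per-solver scores
    print(plackett_luce_nll(scores, [0, 1, 2]))  # consistent ranking: low NLL
    print(plackett_luce_nll(scores, [2, 1, 0]))  # inverted ranking: high NLL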
【Paper Link】 【Pages】:1833-1839
【Authors】: Buyue Qian ; Xiang Wang ; Ian Davidson
【Abstract】: Learning to rank is an emerging learning task that opens up a diverse set of applications. However, most existing work focuses on learning a single ranking function whilst in many real world applications, there can be many ranking functions to fulfill various retrieval tasks on the same data set. How to train many ranking functions is challenging due to the limited availability of training data which is further compounded when plentiful training data is available for a small subset of the ranking functions. This is particularly true in settings, such as personalized ranking/retrieval, where each person requires a unique ranking function according to their preference, but only the functions of the persons who provide sufficient ratings (of objects, such as movies and music) can be well trained. To address this, we propose to construct a graph where each node corresponds to a retrieval task, and then propagate ranking functions on the graph. We illustrate the usefulness of the idea of propagating ranking functions and our method by exploring two real world applications.
【Keywords】: Learning to Rank; Distribution Propagation; Linear Ranking function; Graph Diffusion; The Hellinger Distance
【Paper Link】 【Pages】:1840-1846
【Authors】: Jimmy S. J. Ren ; Li Xu
【Abstract】: We recently have witnessed many ground-breaking results in machine learning and computer vision, generated by using deep convolutional neural networks (CNN). While the success mainly stems from the large volume of training data and the deep network architectures, the vector processing hardware (e.g. GPU) undisputedly plays a vital role in modern CNN implementations to support massive computation. Though much attention was paid in the extant literature to understanding the algorithmic side of deep CNNs, little research was dedicated to the vectorization for scaling up CNNs. In this paper, we studied the vectorization process of key building blocks in deep CNNs, in order to better understand and facilitate parallel implementation. Key steps in training and testing deep CNNs are abstracted as matrix and vector operators, upon which parallelism can be easily achieved. We developed and compared six implementations with various degrees of vectorization, with which we illustrate the impact of vectorization on the speed of model training and testing. In addition, a unified CNN framework for both high-level and low-level vision tasks is provided, along with a vectorized Matlab implementation with state-of-the-art speed performance.
【Keywords】: convolutional neural networks;deblur;denoise;detection
【Paper Link】 【Pages】:1847-1853
【Authors】: Pedro Henrique Santana ; Spencer Lane ; Eric Timmons ; Brian Charles Williams ; Carlos Forster
【Abstract】: Innovative methods have been developed for diagnosis, activity monitoring, and state estimation that achieve high accuracy through the use of stochastic models involving hybrid discrete and continuous behaviors. A key bottleneck is the automated acquisition of these hybrid models, and recent methods have focused predominantly on Jump Markov processes and piecewise autoregressive models. In this paper, we present a novel algorithm capable of performing unsupervised learning of guarded Probabilistic Hybrid Automata (PHA) models, which extends prior work by allowing stochastic discrete mode transitions in a hybrid system to have a functional dependence on its continuous state. Our experiments indicate that guarded PHA models can yield significant performance improvements when used by hybrid state estimators, particularly when diagnosing the true discrete mode of the system, without any noticeable impact on their real-time performance.
【Keywords】: Probabilistic Hybrid Automata; Guarded Transitions; Diagnosis; Expectation Maximization
【Paper Link】 【Pages】:1854-1860
【Authors】: Weiwei Shen ; Jun Wang
【Abstract】: Merton's portfolio optimization problem in the presence of transaction costs for multiple assets has been an important and challenging problem in both theory and practice. Most existing work suffers from the curse of dimensionality and has difficulty generalizing. In this paper, we develop an approximate dynamic programming method that synergistically combines the Löwner-John ellipsoid approximation with conventional value function iteration to quantify the associated optimal trading policy. By constructing Löwner-John ellipsoids to parameterize the optimal policy and taking Euclidean projections onto the constructed ellipsoids to implement the trading policy, the proposed algorithm cuts computational costs by up to a factor of five hundred while achieving near-optimal risk-adjusted returns across both synthetic and real-world market datasets.
【Keywords】: Portfolio optimization; Approximate dynamic programming
【Paper Link】 【Pages】:1861-1867
【Authors】: Can Wang ; Chi-Hung Chi ; Wei Zhou ; Raymond K. Wong
【Abstract】: In real-world applications, heterogeneous interdependent attributes that consist of both discrete and numerical variables can be observed ubiquitously. The usual representation of these data sets is an information table, assuming the independence of attributes. However, very often, they are actually interdependent on one another, either explicitly or implicitly. Limited research has been conducted in analyzing such attribute interactions, which causes the analysis results to be more local than global. This paper proposes coupled heterogeneous attribute analysis to capture the interdependence among mixed data by addressing coupling context and coupling weights in unsupervised learning. Such global couplings integrate the interactions within discrete attributes, within numerical attributes and across them to form the coupled representation for mixed type objects based on dimension conversion and feature selection. This work takes one step forward towards explicitly modeling the interdependence of heterogeneous attributes among mixed data, verified by the applications in data structure analysis, data clustering evaluation, and density comparison. Substantial experiments on 12 UCI data sets show that our approach can effectively capture the global couplings of heterogeneous attributes and outperforms the state-of-the-art methods, supported by statistical analysis.
【Keywords】: Coupling Relationship; Interdependence; Heterogeneous Attributes; Mixed Data
【Paper Link】 【Pages】:1868-1874
【Authors】: Xin Wang ; Ying Wang ; Wanli Zuo ; Guoyong Cai
【Abstract】: With the pervasion of social media, topic identification in short texts has attracted increasing attention in recent years. However, the texts of social media are inherently short and noisy, and their structures are sparse and dynamic, making it difficult to identify topic categories exactly from online social media. Inspired by social science findings that preference consistency and social contagion are observed in social media, we investigate topic identification in short and noisy texts by exploring social context from the perspective of the social sciences. In particular, we present a mathematical optimization formulation that incorporates the preference consistency and social contagion theories into a supervised learning method, and conduct feature selection to tackle short and noisy texts in social media, resulting in a Sociological framework for Topic Identification (STI). Experimental results on real-world datasets from Twitter and Citation Network demonstrate the effectiveness of the proposed framework. Further experiments are conducted to understand the importance of social context in topic identification.
【Keywords】: Topic Identification; Short and Noisy Texts; Preference Consistency; Social Contagion; Lasso
【Paper Link】 【Pages】:1875-1881
【Authors】: Ying Wang ; Xin Wang ; Jiliang Tang ; Wanli Zuo ; Guoyong Cai
【Abstract】: With the pervasion of social media, trust has been playing an increasingly important role in helping online users collect reliable information. In reality, user-specified trust relations are often very sparse; hence, inferring unknown trust relations has attracted increasing attention in recent years. Social status is one of the most important concepts in trust, and status theory has been developed to help us understand the important role of social status in the formation of trust relations. In this paper, we investigate how to exploit social status in trust prediction by modeling status theory. We first verify status theory in trust relations, then provide a principled way to model it mathematically, and propose a novel framework, sTrust, which incorporates status theory for trust prediction. Experimental results on real-world datasets demonstrate the effectiveness of the proposed framework. Further experiments are conducted to understand the importance of status theory in trust prediction.
【Keywords】: Trust Prediction, Status Theory, Matrix Factorization
【Paper Link】 【Pages】:1882-1888
【Authors】: Lan Wei ; YongHong Tian ; Yaowei Wang ; Tiejun Huang
【Abstract】: Human gait has been shown to be an efficient biometric measure for person identification at a distance. However, it often needs different gait features to handle various covariate conditions including viewing angles, walking speed, carrying an object and wearing different types of shoes. In order to improve the robustness of gait-based person re-identification on such multi-covariate conditions, a novel Swiss-system based cascade ranking model is proposed in this paper. Since the ranking model is able to learn a subspace where the potential true match is given the highest ranking, we formulate the gait-based person re-identification as a bipartite ranking problem and utilize it as an effective way for multi-feature ensemble learning. Then a Swiss multi-round competition system is developed for the cascade ranking model to optimize its effectiveness and efficiency. Extensive experiments on three indoor and outdoor public datasets demonstrate that our model outperforms several state-of-the-art methods remarkably.
【Keywords】: Person re-identification; Gait recognition; Cascade ranking; Swiss system
【Paper Link】 【Pages】:1889-1895
【Authors】: Eric Wong ; J. Zico Kolter
【Abstract】: Motivated by problems such as molecular energy prediction, we derive an (improper) kernel between geometric inputs that is able to capture the relevant rotational and translational invariances in geometric data. Since many physical simulations based upon geometric data produce derivatives of the output quantity with respect to the input positions, we derive an approach that incorporates derivative information into our kernel learning. We further show how to exploit the low rank structure of the resulting kernel matrices to speed up learning. Finally, we evaluate the method in the context of molecular energy prediction, showing good performance in modeling previously unseen molecular configurations. Integrating the approach into Bayesian optimization, we show substantial improvement over the state of the art in molecular energy optimization.
【Keywords】: kernel; Gaussian process; singular value decomposition; derivatives
【Paper Link】 【Pages】:1896-1902
【Authors】: Pengtao Xie ; Yulong Pei ; Yuan Xie ; Eric P. Xing
【Abstract】: Personal photos are enjoying explosive growth with the popularity of photo-taking devices and social media. The vast amount of online photos largely exhibits users' interests, emotions and opinions. Mining user interests from personal photos can boost a number of utilities, such as advertising, interest-based community detection and photo recommendation. In this paper, we study the problem of mining user interests from personal photos. We propose a User Image Latent Space Model to jointly model user interests and image contents. User interests are modeled as latent factors and each user is assumed to have a distribution over them. By inferring the latent factors and users' distributions, we can discover what the users are interested in. We model image contents with a four-level hierarchical structure where the layers correspond to themes, semantic regions, visual words and pixels respectively. Users' latent interests are embedded in the theme layer. Given image contents, users' interests can be discovered by doing posterior inference. We use variational inference to approximate the posteriors of latent variables and learn model parameters. Experiments on 180K Flickr photos demonstrate the effectiveness of our model.
【Keywords】:
【Paper Link】 【Pages】:1903-1909
【Authors】: Pengtao Xie ; Eric P. Xing
【Abstract】: Image clustering and visual codebook learning are two fundamental problems in computer vision and they are tightly related. On one hand, a good codebook can generate effective feature representations which largely affect clustering performance. On the other hand, class labels obtained from image clustering can serve as supervised information to guide codebook learning. Traditionally, these two processes are conducted separately and their correlation is generally ignored. In this paper, we propose a Double Layer Gaussian Mixture Model (DLGMM) to simultaneously perform image clustering and codebook learning. In DLGMM, two tasks are seamlessly coupled and can mutually promote each other. Cluster labels and codebook are jointly estimated to achieve the overall best performance. To incorporate the spatial coherence between neighboring visual patches, we propose a Spatially Coherent DLGMM which uses a Markov Random Field to encourage neighboring patches to share the same visual word label. We use variational inference to approximate the posterior of latent variables and learn model parameters. Experiments on two datasets demonstrate the effectiveness of the two models.
【Keywords】:
【Paper Link】 【Pages】:1910-1916
【Authors】: Bo Xin ; Lingjing Hu ; Yizhou Wang ; Wen Gao
【Abstract】: Neuroimage analysis usually involves learning thousands or even millions of variables using only a limited number of samples. In this regard, sparse models, e.g. the lasso, are applied to select the optimal features and achieve high diagnosis accuracy. The lasso, however, usually results in independent unstable features. Stability, a manifestation of the reproducibility of statistical results subject to reasonable perturbations to data and the model (Yu 2013), is an important focus in statistics, especially in the analysis of high dimensional data. In this paper, we explore a nonnegative generalized fused lasso model for stable feature selection in the diagnosis of Alzheimer's disease. In addition to sparsity, our model incorporates two important pathological priors: the spatial cohesion of lesion voxels and the positive correlation between the features and the disease labels. To optimize the model, we propose an efficient algorithm by proving a novel link between total variation and fast network flow algorithms via conic duality. Experiments show that the proposed nonnegative model performs much better in exploring the intrinsic structure of data via selecting stable features compared with other state-of-the-art methods.
【Keywords】: stable feature selection; nonnegative generalized fused lasso; Alzheimer's disease
【Paper Link】 【Pages】:1917-1923
【Authors】: Xin Xin ; Chunwei Lu ; Yashen Wang ; Heyan Huang
【Abstract】: Accurate road speed predictions can help drivers in smart route planning. Although the issue has been studied previously, most existing work focuses on arterial roads only, where sensors are configured closely enough to collect complete real-time data. For collector roads, which sensors cover only sparsely, speed predictions are often ignored. With GPS-equipped floating car signals available nowadays, we aim at forecasting collector road speeds by utilizing these signals. The main challenge compared with arterial roads comes from the missing data. In a given time slot in the real data, over 90% of collector roads are not covered by enough floating cars. Thus most traditional approaches for arterial roads, which rely on complete historical data, cannot be employed directly. Aiming at solving this problem, we propose a multi-view road speed prediction framework. In the first view, temporal patterns are modeled by a layered hidden Markov model; in the second view, spatial patterns are modeled by a collective matrix factorization model. The two models are learned and inferred simultaneously in a co-regularized manner. Experiments conducted on the Beijing road network, based on 10K taxi signals over 2 years, demonstrate that the approach outperforms traditional approaches by 10% in MAE and RMSE.
【Keywords】: Dealing with Missing Values;Matrix Factorization;Applications;Multi-view Learning;Hidden Markov Model
【Paper Link】 【Pages】:1924-1930
【Authors】: Chang Xu ; Dacheng Tao ; Chao Xu
【Abstract】: In multi-label learning, an example is represented by a descriptive feature associated with several labels. Simply considering labels as independent or correlated is crude; it would be beneficial to define and exploit the causality between multiple labels. For example, an image label 'lake' implies the label 'water', but not vice versa. Since the original features are a disorderly mixture of the properties originating from different labels, it is intuitive to factorize these raw features to clearly represent each individual label and its causality relationship. Following the large-margin principle, we propose an effective approach to discover the causal features of multiple labels, thus revealing the causality between labels from the perspective of features. We show theoretically that the proposed approach is a tight approximation of the empirical multi-label classification error, and that the causality revealed strengthens the consistency of the algorithm. Extensive experiments using synthetic and real-world data demonstrate that the proposed algorithm effectively discovers label causality, generates causal features, and improves multi-label learning.
【Keywords】: Multi-label learning; label relationship
【Paper Link】 【Pages】:1931-1937
【Authors】: Linli Xu ; Aiqing Huang ; Jianhui Chen ; Enhong Chen
【Abstract】: In multi-task learning, multiple related tasks are considered simultaneously, with the goal to improve the generalization performance by utilizing the intrinsic sharing of information across tasks. This paper presents a multi-task learning approach by modeling the task-feature relationships. Specifically, instead of assuming that similar tasks have similar weights on all the features, we start with the motivation that the tasks should be related in terms of subsets of features, which implies a co-cluster structure. We design a novel regularization term to capture this task-feature co-cluster structure. A proximal algorithm is adopted to solve the optimization problem. Convincing experimental results demonstrate the effectiveness of the proposed algorithm and justify the idea of exploiting the task-feature relationships.
【Keywords】: Multi-Task Learning; Co-Cluster Structure; Task-Feature Relationships
【Paper Link】 【Pages】:1938-1944
【Authors】: Linli Xu ; Yitan Li ; Yubo Wang ; Enhong Chen
【Abstract】: We examine the fundamental problem of background modeling, which is to model the background scenes in video sequences and segment the moving objects from the background. A novel approach is proposed based on the Restricted Boltzmann Machine (RBM) while exploiting the temporal nature of the problem. In particular, we augment the standard RBM to take a window of sequential video frames as input and generate the background model while enforcing the background to adapt smoothly to temporal changes. As a result, the augmented temporally adaptive model can generate a stable background given noisy inputs and adapt quickly to changes in the background, while keeping all the advantages of RBMs, including exact inference and an effective learning procedure. Experimental results demonstrate the effectiveness of the proposed method in modeling the temporal nature of the background.
【Keywords】: background modeling; background subtraction; Restricted Boltzmann Machines; unsupervised learning; temporality; video sequence
【Paper Link】 【Pages】:1945-1951
【Authors】: Junchi Yan ; Chao Zhang ; Hongyuan Zha ; Min Gong ; Changhua Sun ; Jin Huang ; Stephen M. Chu ; Xiaokang Yang
【Abstract】: Sales pipeline win-propensity prediction is fundamental to effective sales management. In contrast to using subjective human rating, we propose a modern machine learning paradigm to estimate the win-propensity of sales leads over time. A profile-specific two-dimensional Hawkes process model is developed to capture the influence of sellers' activities on their leads' win outcomes, coupled with leads' personalized profiles. It is motivated by two observations: i) sellers tend to frequently focus their selling activities and efforts on a few leads during a relatively short time. This is evidenced and reflected by their concentrated interactions with the pipeline, including logging in, browsing and updating the sales leads, all of which are logged by the system; ii) a pending opportunity is prone to reach its win outcome shortly after such temporally concentrated interactions. Our model is deployed and in continual use at a large, global, B2B multinational technology enterprise (Fortune 500), and we present a case study. Due to the generality and flexibility of the model, it is also potentially applicable to other real-world problems.
【Keywords】:
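The building block behind this model is the self-exciting Hawkes intensity, under which recent activity raises the instantaneous rate of further events. A univariate sketch with an exponential kernel (the paper's profile-specific two-dimensional variant adds further structure not shown here):

    import numpy as np

    def hawkes_intensity(t, events, mu=0.1, alpha=0.5, beta=1.0):
        # lambda(t) = mu + sum_{t_i < t} alpha * exp(-beta * (t - t_i))
        past = np.asarray([ti for ti in events if ti < t])
        return mu + np.sum(alpha * np.exp(-beta * (t - past)))

    events = [1.0, 1.2, 1.3, 5.0]              # temporally concentrated burst
    print(hawkes_intensity(1.4, events))       # high: bursty recent activity
    print(hawkes_intensity(4.0, events))       # decayed back toward mu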
【Paper Link】 【Pages】:1952-1958
【Authors】: Bo Yang ; Xuehua Zhao ; Xueyan Liu
【Abstract】: There has been an increasing interest in exploring signed networks with positive and negative links in that they contain more information than unsigned networks. As fundamental problems of signed network analysis, community detection and sign (or attitude) prediction are still primary challenges. To address them, we propose a generative Bayesian approach, in which 1) a signed stochastic blockmodel is proposed to characterize the community structure in context of signed networks, by means of explicitly formulating the distributions of both density and frustration of signed links from a stochastic perspective, and 2) a model learning algorithm is proposed by theoretically deriving a variational Bayes EM for parameter estimation and a variation based approximate evidence for model selection. Through the comparisons with state-of-the-art methods on synthetic and real-world networks, the proposed approach shows its superiority in both community detection and sign prediction for exploratory networks.
【Keywords】: signed network community detection; sign prediction; stochastic blockmodeling; variational Bayes approach
【Paper Link】 【Pages】:1959-1965
【Authors】: Quanming Yao ; James T. Kwok
【Abstract】: Colorization aims at recovering the original color of a monochrome image from only a few color pixels. A state-of-the-art approach is based on matrix completion, which assumes that the target color image is low-rank. However, this low-rank assumption is often invalid on natural images. In this paper, we propose a patch-based approach that divides the image into patches and then imposes a low-rank structure only on groups of similar patches. Each local matrix completion problem is solved by an accelerated version of the alternating direction method of multipliers (ADMM), and each ADMM subproblem is solved efficiently by divide-and-conquer. Experiments on a number of benchmark images demonstrate that the proposed method outperforms existing approaches.
【Keywords】:
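The per-group subproblem is a small low-rank matrix completion. The sketch below uses plain iterative singular value thresholding as a generic stand-in for the paper's accelerated ADMM solver:

    import numpy as np

    def complete_patch_group(M, mask, tau=1.0, iters=100):
        # Rows of M are vectorised similar patches; mask marks the
        # observed (colored) entries, which are kept fixed.
        X = np.where(mask, M, 0.0)
        for _ in range(iters):
            U, s, Vt = np.linalg.svd(X, full_matrices=False)
            X_lr = U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt
            X = np.where(mask, M, X_lr)
        return X

    rng = np.random.default_rng(0)
    M = np.outer(rng.random(20), rng.random(16))   # rank-1 patch stack
    mask = rng.random(M.shape) < 0.3               # 30% of pixels observed
    print(np.abs(complete_patch_group(M, mask, tau=0.01) - M).max())  # reconstruction error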
【Paper Link】 【Pages】:1966-1972
【Authors】: Shandian Zhe ; Zenglin Xu ; Yuan Qi ; Peng Yu
【Abstract】: In the analysis and diagnosis of many diseases, such as the Alzheimer's disease (AD), two important and related tasks are usually required: i) selecting genetic and phenotypical markers for diagnosis, and ii) identifying associations between genetic and phenotypical features. While previous studies treat these two tasks separately, they are tightly coupled due to the same underlying biological basis. To harness their potential benefits for each other, we propose a new sparse Bayesian approach to jointly carry out the two important and related tasks. In our approach, we extract common latent features from different data sources by sparse projection matrices and then use the latent features to predict disease severity levels; in return, the disease status can guide the learning of sparse projection matrices, which not only reveal interactions between data sources but also select groups of related biomarkers. In order to boost the learning of sparse projection matrices, we further incorporate graph Laplacian priors encoding the valuable linkage disequilibrium (LD) information. To efficiently estimate the model, we develop a variational inference algorithm. Analysis on an imaging genetics dataset for AD study shows that our model discovers biologically meaningful associations between single nucleotide polymorphisms (SNPs) and magnetic resonance imaging (MRI) features, and achieves significantly higher accuracy for predicting ordinal AD stages than competitive methods.
【Keywords】: Alzheimer's Disease; Association Study; Bayesian Sparse Modeling; Multiview Learning; Spike and Slab priors
【Paper Link】 【Pages】:1973-1979
【Authors】: Shuai Zheng ; Xiao Cai ; Chris H. Q. Ding ; Feiping Nie ; Heng Huang
【Abstract】: Real life data often includes information from different channels. For example, in computer vision, we can describe an image using different image features, such as pixel intensity, color, HOG, GIST, and SIFT features. These different aspects of the same objects are often called multi-view (or multi-modal) data. The low-rank regression model has been proven to be an effective learning mechanism that explores the low-rank structure of real life data. However, previous low-rank regression models only work on single-view data. In this paper, we propose a multi-view low-rank regression model by imposing low-rank constraints on the multi-view regression model. Most importantly, we provide a closed-form solution to the multi-view low-rank regression model. Extensive experiments on 4 multi-view datasets show that the multi-view low-rank regression model outperforms single-view regression models and reveal that the multi-view low-rank structure is very helpful.
【Keywords】:
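For context, the single-view building block has a classical closed form: reduced-rank regression projects the OLS fit onto its best rank-r subspace. A sketch of that baseline (the multi-view extension and its closed-form solution are the paper's contribution and are not reproduced here):

    import numpy as np

    def reduced_rank_regression(X, Y, r):
        # Fit OLS, then project the coefficients onto the top-r right
        # singular subspace of the fitted values X @ B_ols.
        B_ols = np.linalg.lstsq(X, Y, rcond=None)[0]
        _, _, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
        V_r = Vt[:r].T
        return B_ols @ V_r @ V_r.T

    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 8))
    B_true = rng.standard_normal((8, 2)) @ rng.standard_normal((2, 6))
    Y = X @ B_true + 0.1 * rng.standard_normal((100, 6))
    print(np.linalg.matrix_rank(reduced_rank_regression(X, Y, r=2)))  # 2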
【Paper Link】 【Pages】:1980-1987
【Authors】: Xiaowei Zhong ; Linli Xu ; Yitan Li ; Zhiyuan Liu ; Enhong Chen
【Abstract】: Recently, solving rank minimization problems by leveraging nonconvex relaxations has received significant attention. Some theoretical analyses demonstrate that it can provide a better approximation of original problems than convex relaxations. However, designing an effective algorithm to solve nonconvex optimization problems remains a big challenge. In this paper, we propose an Iterative Shrinkage-Thresholding and Reweighted Algorithm (ISTRA) to solve rank minimization problems using the nonconvex weighted nuclear norm as a low rank regularizer. We prove theoretically that under certain assumptions our method achieves a high-quality local optimal solution efficiently. Experimental results on synthetic and real data show that the proposed ISTRA algorithm outperforms state-of-the-art methods in both accuracy and efficiency.
【Keywords】: low rank; optimization; nonconvex relaxation; matrix completion
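The core iteration behind weighted-nuclear-norm solvers of this kind is a reweighted shrinkage of singular values: larger singular values receive smaller thresholds, so dominant structure is preserved. A schematic single step (step sizes and the exact reweighting rule of ISTRA are omitted):

    import numpy as np

    def weighted_svt(X, lam=1.0, eps=1e-3):
        # One reweighted shrinkage-thresholding step on singular values:
        # threshold sigma_i by w_i = lam / (sigma_i + eps), so small
        # (noise) components are shrunk harder than dominant ones.
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        weights = lam / (s + eps)
        return U @ np.diag(np.maximum(s - weights, 0.0)) @ Vt

    X = np.random.default_rng(0).standard_normal((8, 6))
    print(np.linalg.matrix_rank(weighted_svt(X, lam=2.0)))  # rank reduced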
【Paper Link】 【Pages】:1988-1994
【Authors】: Stefano Vittorino Albrecht ; Jacob William Crandall ; Subramanian Ramamoorthy
【Abstract】: Many multiagent applications require an agent to learn quickly how to interact with previously unknown other agents. To address this problem, researchers have studied learning algorithms which compute posterior beliefs over a hypothesised set of policies, based on the observed actions of the other agents. The posterior belief is complemented by the prior belief, which specifies the subjective likelihood of policies before any actions are observed. In this paper, we present the first comprehensive empirical study on the practical impact of prior beliefs over policies in repeated interactions. We show that prior beliefs can have a significant impact on the long-term performance of such methods, and that the magnitude of the impact depends on the depth of the planning horizon. Moreover, our results demonstrate that automatic methods can be used to compute prior beliefs with consistent performance effects. This indicates that prior beliefs could be eliminated as a manual parameter and instead be computed automatically.
【Keywords】: Prior Beliefs, Policy Types, Bayesian Games, Multiagent Systems
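The posterior computation the study builds on is a standard Bayesian update over a hypothesised policy set. A sketch of that generic formulation (the interface `policies[i](obs)` returning an action distribution is an assumption for illustration):

    import numpy as np

    def policy_posterior(prior, policies, history):
        # P(pi | h) ∝ P(pi) * prod_t pi(a_t | obs_t), computed in log space.
        log_post = np.log(np.asarray(prior, dtype=float))
        for i, pi in enumerate(policies):
            for obs, action in history:
                log_post[i] += np.log(pi(obs).get(action, 1e-12))
        log_post -= log_post.max()            # numerical stability
        post = np.exp(log_post)
        return post / post.sum()

    policies = [lambda o: {"c": 0.9, "d": 0.1},   # mostly cooperates
                lambda o: {"c": 0.1, "d": 0.9}]   # mostly defects
    history = [(None, "c"), (None, "c"), (None, "d")]
    print(policy_posterior([0.5, 0.5], policies, history))  # favors policy 0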
【Paper Link】 【Pages】:1995-2002
【Authors】: Christopher Amato ; Frans A. Oliehoek
【Abstract】: Online, sample-based planning algorithms for POMDPs have shown great promise in scaling to problems with large state spaces, but they become intractable for large action and observation spaces. This is particularly problematic in multiagent POMDPs where the action and observation space grows exponentially with the number of agents. To combat this intractability, we propose a novel scalable approach based on sample-based planning and factored value functions that exploits structure present in many multiagent settings. This approach applies not only in the planning case, but also in the Bayesian reinforcement learning setting. Experimental results show that we are able to provide high quality solutions to large multiagent planning and learning problems.
【Keywords】: POMDPs; Multiagent POMDPs; Reinforcement Learning; Online Planning
【Paper Link】 【Pages】:2003-2009
【Authors】: Ofra Amir ; Guni Sharon ; Roni Stern
【Abstract】: This paper proposes a mapping between multi-agent pathfinding (MAPF) and combinatorial auctions (CAs). In MAPF, agents need to reach their goal destinations without colliding. Algorithms for solving MAPF aim at assigning agents non-conflicting paths that minimize agents' travel costs. In CA problems, agents bid over bundles of items they desire. Auction mechanisms aim at finding an allocation of bundles that maximizes social welfare. In the proposed mapping of MAPF to CAs, agents bid on paths to their goals and the auction allocates non-colliding paths to the agents. Using this formulation, auction mechanisms can be naturally used to solve a range of MAPF problem variants. In particular, auction mechanisms can be applied to non-cooperative settings with self-interested agents while providing optimality guarantees and robustness to manipulations by agents. The paper further shows how to efficiently implement an auction mechanism for MAPF, utilizing methods and representations from both the MAPF and CA literatures.
【Keywords】: Multi-agent pathfinding; Combinatorial auctions
【Paper Link】 【Pages】:2010-2016
【Authors】: Samuel Barrett ; Peter Stone
【Abstract】: Many scenarios require that robots work together as a team in order to effectively accomplish their tasks. However, pre-coordinating these teams may not always be possible given the growing number of companies and research labs creating these robots. Therefore, it is desirable for robots to be able to reason about ad hoc teamwork and adapt to new teammates on the fly. Past research on ad hoc teamwork has focused on relatively simple domains, but this paper demonstrates that agents can reason about ad hoc teamwork in complex scenarios. To handle these complex scenarios, we introduce a new algorithm, PLASTIC-Policy, that builds on an existing ad hoc teamwork approach. Specifically, PLASTIC-Policy learns policies to cooperate with past teammates and reuses these policies to quickly adapt to new teammates. This approach is tested in the 2D simulation soccer league of RoboCup using the half field offense task.
【Keywords】: Ad hoc teamwork; Multiagent systems; Robot soccer; Reinforcement learning
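The policy-reuse loop admits a compact illustration: keep a belief over stored teammate models, discount each by how poorly it predicts the new teammates, and act with the policy of the current best model. The Python below is a hedged toy reading, not the authors' exact update rule; the names and loss scale are assumptions.

def update_beliefs(beliefs, losses, eta=0.2):
    """Discount each stored teammate model's probability by its prediction
    loss on newly observed behaviour (losses assumed in [0, 1])."""
    new = {m: b * (1.0 - eta * losses[m]) for m, b in beliefs.items()}
    z = sum(new.values())
    return {m: b / z for m, b in new.items()}

beliefs = {"past_team_A": 0.5, "past_team_B": 0.5}
beliefs = update_beliefs(beliefs, {"past_team_A": 0.1, "past_team_B": 0.9})
best = max(beliefs, key=beliefs.get)   # reuse the policy learned with it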
【Paper Link】 【Pages】:2017-2023
【Authors】: Rahmatollah Beheshti ; Awrad Mohammed Ali ; Gita Reese Sukthankar
【Abstract】: In many cases, creating long-term solutions to sustainability issues requires not only innovative technology, but also large-scale public adoption of the proposed solutions. Social simulations are a valuable but underutilized tool that can help public policy researchers understand when sustainable practices are likely to make the delicate transition from being an individual choice to becoming a social norm. In this paper, we introduce a new normative multi-agent architecture, Cognitive Social Learners (CSL), that models bottom-up norm emergence through a social learning mechanism, while using BDI (Belief/Desire/Intention) reasoning to handle adoption and compliance. CSL preserves a greater sense of cognitive realism than influence propagation or infectious transmission approaches, enabling the modeling of complex beliefs and contradictory objectives within an agent-based simulation. In this paper, we demonstrate the use of CSL for modeling norm emergence of recycling practices and public participation in a smoke-free campus initiative.
【Keywords】: normative multiagent systems; social simulation; BDI architectures; social learning
【Paper Link】 【Pages】:2024-2030
【Authors】: Adi Botea ; Pavel Surynek
【Abstract】: Much of the literature on multi-agent path finding focuses on undirected graphs, where motion is permitted in both directions along a graph edge. Despite this, travelling on directed graphs is relevant in navigation domains, such as pathfinding in games, and asymmetric communication networks. We consider multi-agent path finding on strongly biconnected directed graphs. We show that all instances with at least two unoccupied positions can be solved or proven unsolvable. We present a polynomial-time algorithm for this class of problems, and analyze its complexity. Our work may be the first formal study of multi-agent path finding on directed graphs.
【Keywords】:
【Paper Link】 【Pages】:2031-2037
【Authors】: Diego Calvanese ; Giorgio Delzanno ; Marco Montali
【Abstract】: We study the extension of relational multiagent systems (RMASs), where agents manipulate full-fledged relational databases, with data types and facets equipped with domain-specific, rigid relations (such as total orders). Specifically, we focus on design-time verification of RMASs against rich first-order temporal properties expressed in a variant of first-order mu-calculus with quantification across states. We build on previous decidability results under the state-bounded assumption, i.e., in each single state only a bounded number of data objects is stored in the agent databases, while unboundedly many can be encountered over time. We recast this condition, showing decidability in the presence of dense linear orders and facets defined on top of them. Our approach is based on the construction of a finite-state, sound and complete abstraction of the original system, in which dense linear orders are reformulated as non-rigid relations working on the active domain of the system only. We also show undecidability when including a data type equipped with the successor relation.
【Keywords】: relational multiagent systems; data-aware dynamic systems; temporal logics; formal verification
【Paper Link】 【Pages】:2038-2044
【Authors】: Petr Cermák ; Alessio Lomuscio ; Aniello Murano
【Abstract】: Strategy Logic (SL) has recently come to the fore as a useful specification language to reason about multi-agent systems. Its one-goal fragment, or SL[1G], is of particular interest as it strictly subsumes widely used logics such as ATL*, while maintaining attractive complexity features. In this paper we put forward an automata-based methodology for verifying and synthesising multi-agent systems against specifications given in SL[1G]. We show that the algorithm is sound and optimal from a computational point of view. A key feature of the approach is that all data structures and operations on them can be performed on BDDs. We report on a BDD-based model checker implementing the algorithm and evaluate its performance on fair process scheduler synthesis.
【Keywords】: Model Checking; Synthesis; Strategy Logic
【Paper Link】 【Pages】:2045-2051
【Authors】: Jiehua Chen ; Piotr Faliszewski ; Rolf Niedermeier ; Nimrod Talmon
【Abstract】: We study the computational complexity of candidate control in elections with few voters (that is, we take the number of voters as a parameter). We consider both the standard scenario of adding and deleting candidates, where one asks if a given candidate can become a winner (or, in the destructive case, can be precluded from winning) by adding/deleting some candidates, and a combinatorial scenario where adding/deleting a candidate automatically means adding/deleting a whole group of candidates. Our results show that the parameterized complexity of candidate control (with the number of voters as the parameter) is much more varied than in the setting with many voters.
【Keywords】: Voting; NP-hard election control problems; parameterized complexity; W-hardness; fixed-parameter tractability
【Paper Link】 【Pages】:2052-2059
【Authors】: Amit K. Chopra ; Munindar P. Singh
【Abstract】: We propose Cupid, a language for specifying commitments that supports their information-centric aspects, and offers crucial benefits. One, Cupid is first-order, enabling a systematic treatment of commitment instances. Two, Cupid supports features needed for real-world scenarios such as deadlines, nested commitments, and complex event expressions for capturing the lifecycle of commitment instances. Three, Cupid maps to relational database queries and thus provides a set-based semantics for retrieving commitment instances in states such as being violated, discharged, and so on. We prove that Cupid queries are safe. Four, to aid commitment modelers, we propose the notion of well-identified commitments, and finitely violable and finitely expirable commitments. We give syntactic restrictions for obtaining such commitments.
【Keywords】: commitments; commitment protocols; information; databases; relational algebra
【Paper Link】 【Pages】:2060-2066
【Authors】: Akin Günay ; Songzheng Song ; Yang Liu ; Jie Zhang
【Abstract】: Commitment protocols provide an effective formalism for the regulation of agent interaction. Although existing work mainly focuses on the design-time development of static commitment protocols, recent studies propose methods to create them dynamically at run-time with respect to the goals of the agents. These methods require agents to verify new commitment protocols, taking into account their goals and their beliefs about the other agents' behavior. Accordingly, in this paper, we first propose a probabilistic model to formally capture commitment protocols according to agents' beliefs. Secondly, we identify a set of important properties for the verification of a new commitment protocol from an agent's perspective and formalize these properties in our model. Thirdly, we develop probabilistic model checking algorithms with advanced reduction for efficient verification of these properties. Finally, we implement these algorithms as a tool and evaluate the proposed properties over different commitment protocols.
【Keywords】: commitments; verification; probabilistic model checking
【Paper Link】 【Pages】:2067-2073
【Authors】: Mohammad Rashedul Hasan ; Anita Raja ; Ana L. C. Bazzan
【Abstract】: In this paper, we design a distributed mechanism that is able to create a social convention within a large convention space for multiagent systems (MAS) operating on various topologies. Specifically, we investigate a language coordination problem in which agents in a dynamic MAS construct a common lexicon in a decentralized fashion. Agent interactions are modeled using a language game where every agent repeatedly plays with its neighbors. Each agent stochastically updates its lexicons based on the utility values of the received lexicons from its immediate neighbors. We present a novel topology-aware utility computation mechanism and equip the agents with the ability to reorganize their neighborhood based on this utility estimate to expedite the convention formation process. Extensive simulation results indicate that our proposed mechanism is both effective (able to converge to a large majority convention state with more than 90% of agents sharing a high-quality lexicon) and efficient (faster) as compared to state-of-the-art approaches for social conventions in large convention spaces.
【Keywords】: Convention formation; Lexicon Convention; Dynamic Networks; Scale-free Network; Small-world network; Network thinking; Link diversity
【Paper Link】 【Pages】:2074-2080
【Authors】: Daisuke Hatano ; Yuichi Yoshida
【Abstract】: We introduce a new framework for solving distributed constraint optimization problems that extends the domain of each variable into a simplex. We propose two methods for searching the extended domain for good assignments. The first one relaxes the problem using linear programming, finds the optimum LP solution, and rounds it to an assignment. The second one plays a cost-minimization game, finds a certain kind of equilibrium, and rounds it to an assignment. Both methods are realized by performing the multiplicative weights method in a distributed manner. We experimentally demonstrate that our methods have good scalability, and in particular, the second method outperforms existing algorithms in terms of solution quality and efficiency.
【Keywords】:
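Both methods above run the multiplicative weights method over each variable's simplex. As a point of reference, here is a generic (centralised, single-variable) multiplicative-weights loop in Python; the distributed execution and the LP/game constructions of the paper are omitted, and all names are illustrative.

def multiplicative_weights(cost_of, domain, rounds=200, eta=0.1):
    """Maintain a point in the simplex over `domain` and multiplicatively
    down-weight values in proportion to their cost (costs in [0, 1])."""
    w = {v: 1.0 for v in domain}
    for _ in range(rounds):
        z = sum(w.values())
        dist = {v: wv / z for v, wv in w.items()}
        for v, c in cost_of(dist).items():
            w[v] *= (1.0 - eta) ** c
    z = sum(w.values())
    return {v: wv / z for v, wv in w.items()}

# Toy usage: value 1 is always cheaper, so the simplex point drifts to it.
dist = multiplicative_weights(lambda d: {0: 0.9, 1: 0.1}, [0, 1])
assignment = max(dist, key=dist.get)   # round the simplex point to a value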
【Paper Link】 【Pages】:2081-2088
【Authors】: Panagiotis Kouvaros ; Alessio Lomuscio
【Abstract】: We study parameterised verification of robot swarms against temporal-epistemic specifications. We relax some of the significant restrictions assumed in the literature and present a counter abstraction approach that enables us to verify a potentially much smaller abstract model when checking a formula on a swarm of any size. We present an implementation and discuss experimental results obtained for the alpha algorithm for robot swarms.
【Keywords】: Verification; Model Checking; Parameterised Verification; Robotic Swarms
【Paper Link】 【Pages】:2089-2095
【Authors】: Haifang Li ; Fei Tian ; Wei Chen ; Tao Qin ; Zhiming Ma ; Tie-Yan Liu
【Abstract】: For Internet applications like sponsored search, caution needs to be taken when using machine learning to optimize their mechanisms (e.g., auctions), since self-interested agents in these applications may change their behaviors (and thus the data distribution) in response to the mechanisms. To tackle this problem, a framework called game-theoretic machine learning (GTML) was recently proposed, which first learns a Markov behavior model to characterize agents' behaviors, and then learns the optimal mechanism by simulating agents' behavior changes in response to the mechanism. While GTML has demonstrated practical success, its generalization analysis is challenging because the behavior data are non-i.i.d. and dependent on the mechanism. To address this challenge, first, we decompose the generalization error for GTML into the behavior learning error and the mechanism learning error; second, for the behavior learning error, we obtain novel non-asymptotic error bounds for both parametric and non-parametric behavior learning methods; third, for the mechanism learning error, we derive a uniform convergence bound based on a new concept called the nested covering number of the mechanism space and the generalization analysis techniques developed for mixing sequences.
【Keywords】: Markov behavior model; Empirical risk minimization; Game-theoretic machine learning; Generalization analysis
【Paper Link】 【Pages】:2096-2102
【Authors】: Patrick MacAlpine ; Eric Price ; Peter Stone
【Abstract】: Teams of mobile robots often need to divide up subtasks efficiently. In spatial domains, a key criterion for doing so may depend on distances between robots and the subtasks' locations. This paper considers a specific such criterion, namely how to assign interchangeable robots, represented as point masses, to a set of target goal locations within an open two-dimensional space such that the makespan (time for all robots to reach their target locations) is minimized while also preventing collisions among robots. We present scalable (computable in polynomial time) role assignment algorithms that we classify as being SCRAM (Scalable Collision-avoiding Role Assignment with Minimal-makespan). SCRAM role assignment algorithms use a graph-theoretic approach to map agents to target goal locations such that our objectives for both minimizing the makespan and avoiding agent collisions are met. A system using SCRAM role assignment was originally designed to allow for decentralized coordination among physically realistic simulated humanoid soccer playing robots in the partially observable, non-deterministic, noisy, dynamic, and limited communication setting of the RoboCup 3D simulation league. In its current form, SCRAM role assignment generalizes well to many realistic and real-world multiagent systems, and scales to thousands of agents.
【Keywords】:
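The minimal-makespan half of the objective is a bottleneck assignment problem, solvable by binary-searching the candidate distances and testing feasibility with a bipartite matching. The Python sketch below (using networkx) illustrates only that part; SCRAM's collision-avoiding tie-breaking among makespan-optimal matchings is omitted, and it assumes at least as many targets as agents.

import math
import networkx as nx

def min_makespan_assignment(agents, targets):
    """Bottleneck assignment: minimise the maximum agent-target distance."""
    dists = sorted({math.dist(a, t) for a in agents for t in targets})

    def feasible(limit):
        g = nx.Graph()
        g.add_nodes_from(range(len(agents)))
        g.add_nodes_from(range(len(agents), len(agents) + len(targets)))
        for i, a in enumerate(agents):
            for j, t in enumerate(targets):
                if math.dist(a, t) <= limit:
                    g.add_edge(i, len(agents) + j)
        m = nx.bipartite.maximum_matching(g, top_nodes=range(len(agents)))
        return sum(1 for k in m if k < len(agents)) == len(agents), m

    lo, hi = 0, len(dists) - 1
    while lo < hi:                       # binary search over edge lengths
        mid = (lo + hi) // 2
        ok, _ = feasible(dists[mid])
        if ok:
            hi = mid
        else:
            lo = mid + 1
    _, match = feasible(dists[lo])
    return {i: match[i] - len(agents) for i in range(len(agents))}

agents, targets = [(0.0, 0.0), (4.0, 0.0)], [(1.0, 0.0), (5.0, 0.0)]
print(min_makespan_assignment(agents, targets))   # {0: 0, 1: 1}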
【Paper Link】 【Pages】:2103-2109
【Authors】: Reshef Meir
【Abstract】: Understanding the nature of strategic voting is the holy grail of social choice theory, where game theory, social science and recently computational approaches are all applied in order to model the incentives and behavior of voters. In a recent paper, Meir et al. [EC'14] made another step in this direction, by suggesting a behavioral game-theoretic model for voters under uncertainty. For a specific variation of best-response heuristics, they proved initial existence and convergence results in the Plurality voting system. This paper extends the model in multiple directions, considering voters with different uncertainty levels, simultaneous strategic decisions, and a more permissive notion of best-response. It is proved that a voting equilibrium exists even in the most general case. Further, any society voting in an iterative setting is guaranteed to converge to an equilibrium. An alternative behavior is analyzed, where voters try to minimize their worst-case regret. As it turns out, the two behaviors coincide in the simple setting of Meir et al. [EC'14], but not in the general case.
【Keywords】: voting; convergence; equilibrium; regret minimization
【Paper Link】 【Pages】:2110-2116
【Authors】: Ernesto Nunes ; Maria L. Gini
【Abstract】: We propose an auction algorithm to allocate tasks that have temporal constraints to cooperative robots. Temporal constraints are expressed as time windows, within which a task must be executed. There are no restrictions on the time windows, which are allowed to overlap. Robots model their temporal constraints using a simple temporal network, enabling them to maintain consistent schedules. When bidding on a task, a robot takes into account its own current commitments and an optimization objective, which is to minimize the time of completion of the last task alone or in combination with minimizing the distance traveled. The algorithm works both when all the tasks are known upfront and when tasks arrive dynamically. We show the performance of the algorithm in simulation with different numbers of tasks and robots, and compare it with a baseline greedy algorithm and a state-of-the-art auction algorithm. Our algorithm is computationally frugal and consistently allocates more tasks than the competing algorithms.
【Keywords】: Multi-robot task allocation; Overlapping time windows; Auctions
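The schedule-consistency check that each robot performs on its simple temporal network reduces to negative-cycle detection on the STN's distance graph. A minimal Bellman-Ford version in Python (an editorial sketch, not the paper's implementation):

def stn_consistent(n, edges):
    """edges: (u, v, w) encodes the STN constraint t_v - t_u <= w. The
    network is consistent iff the distance graph has no negative cycle;
    the all-zero start acts as a virtual source reaching every node."""
    dist = [0.0] * n
    for _ in range(n + 1):               # the (n+1)-th pass exposes cycles
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            return True
    return False

# A task constrained to start between 2 and 5 time units after node 0:
print(stn_consistent(2, [(0, 1, 5), (1, 0, -2)]))   # True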
【Paper Link】 【Pages】:2117-2123
【Authors】: Luke Riley ; Katie Atkinson ; Paul E. Dunne ; Terry R. Payne
【Abstract】: Within characteristic function games, agents have the option of joining one of many different coalitions, based on the utility value of each candidate coalition. However, determining this utility value can be computationally complex since the number of coalitions increases exponentially with the number of agents available. Various approaches have been proposed that mediate this problem by distributing the computational load so that each agent calculates only a subset of coalition values. However, current approaches are either highly inefficient due to redundant calculations, or make the benevolence assumption (i.e. are not suitable for adversarial environments). We introduce DCG, a novel algorithm that distributes the calculations of coalition utility values across a community of agents, such that: (i) no inter-agent communication is required; (ii) the coalition value calculations are (approximately) equally partitioned into shares, one for each agent; (iii) the utility value is calculated only once for each coalition, thus redundant calculations are eliminated; (iv) there is an equal number of operations for agents with equal sized shares; and (v) an agent is only allocated those coalitions in which it is a potential member. The DCG algorithm is presented and illustrated by means of an example. We formally prove that our approach allocates all of the coalitions to the agents, and that each coalition is assigned once and only once.
【Keywords】: Coalition formation; coalition value calculations; distributed problem solving; coordination without communication; power-set distribution;
【Paper Link】 【Pages】:2124-2130
【Authors】: Piotr Krzysztof Skowron ; Piotr Faliszewski
【Abstract】: We consider the problem of winner determination under Chamberlin-Courant's multiwinner voting rule with approval utilities. This problem is equivalent to the well-known NP-complete MaxCover problem (i.e., a version of the SetCover problem where we aim to cover as many elements as possible) and, so, the best polynomial-time approximation algorithm for it has approximation ratio 1 - 1/e. We show exponential-time/FPT approximation algorithms that, on one hand, achieve arbitrarily good approximation ratios and, on the other hand, have running times much better than known exact algorithms. We focus on the cases where the voters have to approve of at most/at least a given number of candidates.
【Keywords】:
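For orientation, the 1 - 1/e polynomial-time baseline the abstract refers to is the textbook greedy routine, shown here in the Chamberlin-Courant reading where the elements are voters and each candidate's set is its approvers. Standard algorithm, illustrative data:

def greedy_max_cover(approvers, k):
    """approvers: candidate -> set of approving voters. Greedily add the
    candidate covering the most still-uncovered voters; guarantees a
    1 - 1/e fraction of the optimal coverage."""
    covered, committee = set(), []
    for _ in range(k):
        rest = [c for c in approvers if c not in committee]
        best = max(rest, key=lambda c: len(approvers[c] - covered))
        committee.append(best)
        covered |= approvers[best]
    return committee, covered

committee, represented = greedy_max_cover(
    {"a": {1, 2, 3}, "b": {3, 4}, "c": {5, 6}}, k=2)
print(committee, len(represented))   # ['a', 'c'] 5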
【Paper Link】 【Pages】:2131-2137
【Authors】: Piotr Krzysztof Skowron ; Piotr Faliszewski ; Jérôme Lang
【Abstract】: We consider the following problem: There is a set of items (e.g., movies) and a group of agents (e.g., passengers on a plane); each agent has some intrinsic utility for each of the items. Our goal is to pick a set of K items that maximize the total derived utility of all the agents (i.e., in our example we are to pick K movies that we put on the plane's entertainment system). However, the actual utility that an agent derives from a given item is only a fraction of its intrinsic one, and this fraction depends on how the agent ranks the item among the chosen, available, ones. We provide a formal specification of the model and provide concrete examples and settings where it is applicable. We show that the problem is hard in general, but we show a number of tractability results for its natural special cases.
【Keywords】:
【Paper Link】 【Pages】:2138-2145
【Authors】: Kevin Waugh ; Dustin Morrill ; James Andrew Bagnell ; Michael H. Bowling
【Abstract】: We propose a novel online learning method for minimizing regret in large extensive-form games. The approach learns a function approximator online to estimate the regret for choosing a particular action. A no-regret algorithm uses these estimates in place of the true regrets to define a sequence of policies. We prove the approach sound by providing a bound relating the quality of the function approximation and the regret of the algorithm. A corollary is that the method is guaranteed to converge to a Nash equilibrium in self-play so long as the regrets are ultimately realizable by the function approximator. Our technique can be understood as a principled generalization of existing work on abstraction in large games; in our work, both the abstraction and the equilibrium are learned during self-play. We demonstrate empirically that the method achieves higher quality strategies than state-of-the-art abstraction techniques given the same resources.
【Keywords】: Extensive-form Games; Abstraction; Nash Equilibrium; Regret; Counterfactual Regret
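The no-regret machinery on top of the estimates is standard regret matching; the paper's contribution is to replace the regret table below with a learned function approximator. A tabular Python sketch for intuition (toy utilities, illustrative names):

def regret_matching(cum_regret):
    """Play each action with probability proportional to its positive
    cumulative regret; fall back to uniform when none is positive."""
    pos = {a: max(r, 0.0) for a, r in cum_regret.items()}
    z = sum(pos.values())
    if z == 0.0:
        return {a: 1.0 / len(pos) for a in pos}
    return {a: p / z for a, p in pos.items()}

cum_regret = {"fold": 0.0, "call": 0.0}
utility = {"fold": 0.0, "call": 1.0}
policy = regret_matching(cum_regret)
value = sum(policy[a] * utility[a] for a in utility)
for a in utility:                        # regret of not having played a
    cum_regret[a] += utility[a] - value
print(regret_matching(cum_regret))       # all mass shifts toward "call"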
【Paper Link】 【Pages】:2146-2152
【Authors】: Danushka Bollegala ; Takanori Maehara ; Yuichi Yoshida ; Ken-ichi Kawarabayashi
【Abstract】: Attributes of words and relations between two words are central to numerous tasks in Artificial Intelligence such as knowledge representation, similarity measurement, and analogy detection. Often when two words share one or more attributes in common, they are connected by some semantic relations. On the other hand, if there are numerous semantic relations between two words, we can expect some of the attributes of one of the words to be inherited by the other. Motivated by this close connection between attributes and relations, given a relational graph in which words are inter-connected via numerous semantic relations, we propose a method to learn a latent representation for the individual words. The proposed method considers not only the co-occurrences of words as done by existing approaches for word representation learning, but also the semantic relations in which two words co-occur. To evaluate the accuracy of the word representations learnt using the proposed method, we use the learnt word representations to solve semantic word analogy problems. Our experimental results show that it is possible to learn better word representations by using the semantic relations between words.
【Keywords】: Semantic Relations; Word Representations
【Paper Link】 【Pages】:2153-2159
【Authors】: Ziqiang Cao ; Furu Wei ; Li Dong ; Sujian Li ; Ming Zhou
【Abstract】: We develop a Ranking framework upon Recursive Neural Networks (R2N2) to rank sentences for multi-document summarization. It formulates the sentence ranking task as a hierarchical regression process, which simultaneously measures the salience of a sentence and its constituents (e.g., phrases) in the parsing tree. This enables us to draw on word-level to sentence-level supervisions derived from reference summaries. In addition, recursive neural networks are used to automatically learn ranking features over the tree, with hand-crafted feature vectors of words as inputs. Hierarchical regressions are then conducted with learned features concatenated with raw features. Ranking scores of sentences and words are utilized to effectively select informative and non-redundant sentences to generate summaries. Experiments on the DUC 2001, 2002 and 2004 multi-document summarization datasets show that R2N2 outperforms state-of-the-art extractive summarization approaches.
【Keywords】:
【Paper Link】 【Pages】:2160-2166
【Authors】: Song Feng ; Sujith Ravi ; Ravi Kumar ; Polina Kuznetsova ; Wei Liu ; Alexander C. Berg ; Tamara L. Berg ; Yejin Choi
【Abstract】: We study Refer-to-as relations as a new type of semantic knowledge. Compared to the much-studied Is-a relation, which concerns factual taxonomy knowledge, Refer-to-as relations aim to address pragmatic semantic knowledge. For example, a “penguin” is a “bird” from a taxonomy point of view, but people rarely refer to a “penguin” as a “bird” in vernacular use. This observation closely relates to the entry-level categorization studied in Prototype Theory in Psychology. We posit that Refer-to-as relations can be learned from data, and that both textual and visual information would be helpful in inferring the relations. By integrating existing lexical structure knowledge with language statistics and visual similarities, we formulate a collective inference approach to map all object names in an encyclopedia to commonly used names for each object. Our contributions include a new labeled data set, the inference and optimization approach, and the computed mappings and similarities.
【Keywords】:
【Paper Link】 【Pages】:2167-2173
【Authors】: Rahul Jha ; Reed Coke ; Dragomir R. Radev
【Abstract】: We investigate the task of generating coherent survey articles for scientific topics. We introduce an extractive summarization algorithm that combines a content model with a discourse model to generate coherent and readable summaries of scientific topics using text from scientific articles relevant to the topic. Human evaluation on 15 topics in computational linguistics shows that our system produces significantly more coherent summaries than previous systems. Specifically, our system improves the ratings for coherence by 36% in human evaluation compared to C-LexRank, a state-of-the-art system for scientific article summarization.
【Keywords】: Summarization;Multi-Document Summarization;Scientific Summarization;Coherent Summarization;Discourse Based Summarization
【Paper Link】 【Pages】:2174-2180
【Authors】: Khang Nhut Lam ; Feras Al Tarouti ; Jugal K. Kalita
【Abstract】: This paper proposes approaches to automatically create a large number of new bilingual dictionaries for low-resource languages, especially resource-poor and endangered languages, from a single input bilingual dictionary. Our algorithms produce translations of words in a source language to plentiful target languages using available Wordnets and a machine translator (MT). Since our approaches rely on just one input dictionary, available Wordnets and an MT, they are applicable to any bilingual dictionary as long as one of the two languages is English or has a Wordnet linked to the Princeton Wordnet. Starting with 5 available bilingual dictionaries, we create 48 new bilingual dictionaries. Of these, 30 pairs of languages are not supported by the popular MTs: Google and Bing.
【Keywords】: bilingual dictionaries; Wordnets; multilingual
【Paper Link】 【Pages】:2181-2187
【Authors】: Yankai Lin ; Zhiyuan Liu ; Maosong Sun ; Yang Liu ; Xuan Zhu
【Abstract】: Knowledge graph completion aims to perform link prediction between entities. In this paper, we consider the approach of knowledge graph embeddings. Recently, models such as TransE and TransH build entity and relation embeddings by regarding a relation as translation from head entity to tail entity. We note that these models simply put both entities and relations within the same semantic space. In fact, an entity may have multiple aspects and various relations may focus on different aspects of entities, which makes a common space insufficient for modeling. In this paper, we propose TransR to build entity and relation embeddings in separate entity space and relation spaces. Afterwards, we learn embeddings by first projecting entities from entity space to corresponding relation space and then building translations between projected entities. In experiments, we evaluate our models on three tasks including link prediction, triple classification and relational fact extraction. Experimental results show significant and consistent improvements compared to state-of-the-art baselines including TransE and TransH.
【Keywords】: knowledge graph embedding; knowledge graph completion; relation extraction; knowledge representation
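The scoring function at the heart of TransR is compact enough to state directly: project both entities into the relation space with the relation-specific matrix, then measure the translation error. The numpy rendering below is schematic; the dimensions, initialisation, and margin comment are illustrative.

import numpy as np

def transr_score(h, t, r, M_r):
    """TransR plausibility of (h, r, t): project entity vectors through
    M_r (entity_dim x relation_dim), then score ||h M_r + r - t M_r||^2.
    Lower scores mean more plausible triples."""
    h_r, t_r = h @ M_r, t @ M_r
    return float(np.sum((h_r + r - t_r) ** 2))

rng = np.random.default_rng(0)
d_e, d_r = 8, 4
h, t = rng.normal(size=d_e), rng.normal(size=d_e)
r, M_r = rng.normal(size=d_r), rng.normal(size=(d_e, d_r))
print(transr_score(h, t, r, M_r))
# Training minimises a margin ranking loss over corrupted triples:
# max(0, gamma + score(h, r, t) - score(h', r, t')).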
【Paper Link】 【Pages】:2188-2194
【Authors】: Zhaohui Wu ; C. Lee Giles
【Abstract】: Human languages are naturally ambiguous, which makes it difficult to automatically understand the semantics of text. Most vector space models (VSM) treat all occurrences of a word as the same and build a single vector to represent the meaning of a word, which fails to capture any ambiguity. We present sense-aware semantic analysis (SaSA), a multi-prototype VSM for word representation based on Wikipedia, which could account for homonymy and polysemy. The "sense-specific" prototypes of a word are produced by clustering Wikipedia pages based on both local and global contexts of the word in Wikipedia. Experimental evaluations on semantic relatedness for both isolated words and words in sentential contexts and word sense induction demonstrate its effectiveness.
【Keywords】: Sense-aware Semantic Analysis; Multi-prototype Word Representation; Wikipedia
【Paper Link】 【Pages】:2195-2202
【Authors】: Yu Zhao ; Zhiyuan Liu ; Maosong Sun
【Abstract】: Compositional semantics aims at constructing the meaning of phrases or sentences according to the compositionality of word meanings. In this paper, we propose to synchronously learn the representations of individual words and extracted high-frequency phrases. Representations of extracted phrases are considered as gold standard for constructing more general operations to compose the representation of unseen phrases. We propose a grammatical-type-specific model that improves the composition flexibility by adopting vector-tensor-vector operations. Our model embodies the compositional characteristics of the traditional additive and multiplicative models. Empirical results show that our model outperforms state-of-the-art composition methods in the task of computing phrase similarities.
【Keywords】: semantic composition;phrase representation;tensor indexing model
【Paper Link】 【Pages】:2203-2209
【Authors】: Zsolt Bitvai ; Trevor Cohn
【Abstract】: Peer-to-peer lending is a new highly liquid market for debt, which is rapidly growing in popularity. Here we consider modelling market rates, developing a non-linear Gaussian Process regression method which incorporates both structured data and unstructured text from the loan application. We show that the peer-to-peer market is predictable, and identify a small set of key factors with high predictive power. Our approach outperforms baseline methods for predicting market rates, and generates substantial profit in a trading simulation.
【Keywords】:
【Paper Link】 【Pages】:2210-2216
【Authors】: Ziqiang Cao ; Sujian Li ; Yang Liu ; Wenjie Li ; Heng Ji
【Abstract】: Topic modeling techniques have the benefits of modeling words and documents uniformly under a probabilistic framework. However, they also suffer from the limitations of sensitivity to initialization and unigram topic distribution, which can be remedied by deep learning techniques. To explore the combination of topic modeling and deep learning techniques, we first explain the standard topic model from the perspective of a neural network. Based on this, we propose a novel neural topic model (NTM) where the representation of words and documents are efficiently and naturally combined into a uniform framework. Extending from NTM, we can easily add a label layer and propose the supervised neural topic model (sNTM) to tackle supervised tasks. Experiments show that our models are competitive in both topic discovery and classification/regression tasks.
【Keywords】:
【Paper Link】 【Pages】:2217-2223
【Authors】: Devendra Singh Chaplot ; Pushpak Bhattacharyya ; Ashwin Paranjape
【Abstract】: Word Sense Disambiguation is a difficult problem to solve in the unsupervised setting. This is because in this setting inference becomes more dependent on the interplay between different senses in the context, due to the unavailability of learning resources. Using two basic ideas, sense dependency and selective dependency, we model the WSD problem as a Maximum A Posteriori (MAP) inference query on a Markov Random Field (MRF) built using WordNet and the Link Parser or the Stanford Parser. To the best of our knowledge this combination of dependency and MRF is novel, and our graph-based unsupervised WSD system beats the state-of-the-art system on the SensEval-2, SensEval-3 and SemEval-2007 English all-words datasets while being over 35 times faster.
【Keywords】: Natural Language Processing; Unsupervised Word Sense Disambiguation; Machine Learning; Markov Random Field; Dependency Parser; WSD; MRF
【Paper Link】 【Pages】:2224-2231
【Authors】: Xingyuan Chen ; Yunqing Xia ; Peng Jin ; John A. Carroll
【Abstract】: Manually labeling documents for training a text classifier is expensive and time-consuming. Moreover, a classifier trained on labeled documents may suffer from overfitting and adaptability problems. Dataless text classification (DLTC) has been proposed as a solution to these problems, since it does not require labeled documents. Previous research in DLTC has used explicit semantic analysis of Wikipedia content to measure semantic distance between documents, which is in turn used to classify test documents based on nearest neighbours. The semantic-based DLTC method has a major drawback in that it relies on a large-scale, finely-compiled semantic knowledge base, which is difficult to obtain in many scenarios. In this paper we propose a novel kind of model, descriptive LDA (DescLDA), which performs DLTC with only category description words and unlabeled documents. In DescLDA, the LDA model is assembled with a describing device to infer Dirichlet priors from prior descriptive documents created with category description words. The Dirichlet priors are then used by LDA to induce category-aware latent topics from unlabeled documents. Experimental results with the 20Newsgroups and RCV1 datasets show that: (1) our DLTC method is more effective than the semantic-based DLTC baseline method; and (2) the accuracy of our DLTC method is very close to state-of-the-art supervised text classification methods. As neither external knowledge resources nor labeled documents are required, our DLTC method is applicable to a wider range of scenarios.
【Keywords】: Dataless text classification; descriptive topic model; Dirichlet prior;
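One plausible, simplified reading of the describing device (an editorial sketch, not the authors' exact construction): turn each category's description words into an asymmetric Dirichlet prior over that category's topic, so the topic-word distributions start biased toward the descriptions.

import numpy as np

def seeded_topic_priors(vocab, descriptions, base=0.01, boost=1.0):
    """Build a topics x vocabulary matrix of Dirichlet parameters: every
    entry starts at `base`; each category's description words get an
    extra `boost` in that category's row."""
    idx = {w: i for i, w in enumerate(vocab)}
    beta = np.full((len(descriptions), len(vocab)), base)
    for k, words in enumerate(descriptions):
        for w in words:
            if w in idx:
                beta[k, idx[w]] += boost
    return beta

vocab = ["goal", "match", "stock", "market"]
beta = seeded_topic_priors(vocab, [["goal", "match"], ["stock", "market"]])
print(beta)   # row 0 favours the sports words, row 1 the finance words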
【Paper Link】 【Pages】:2232-2238
【Authors】: Lan Du ; John K. Pate ; Mark Johnson
【Abstract】: Documents from the same domain usually discuss similar topics in a similar order. However, the number of topics and the exact topics discussed in each individual document can vary. In this paper we present a simple topic model that uses generalised Mallows models and incomplete topic orderings to incorporate this ordering regularity into the probabilistic generative process of the new model. We show how to reparameterise the new model so that a point-wise sampling algorithm from the Bayesian word segmentation literature can be used for inference. This algorithm jointly samples not only the topic orders and the topic assignments but also topic segmentations of documents. Experimental results show that our model performs significantly better than the other ordering-based topic models on nearly all the corpora that we used, and competitively with other state-of-the-art topic segmentation models on corpora that have a strong ordering regularity.
【Keywords】: topic segmentation; topic model; GMM; ordering
【Paper Link】 【Pages】:2239-2245
【Authors】: Simone Filice ; Danilo Croce ; Roberto Basili
【Abstract】: In kernel-based learning, the targeted phenomenon is summarized by a set of explanatory examples derived from the training set. When the model size grows with the complexity of the task, such approaches are so computationally demanding that the adoption of comprehensive models is not always viable. In this paper, a general framework aimed at minimizing this problem is proposed: multiple classifiers are stratified and dynamically invoked according to increasing levels of complexity corresponding to incrementally more expressive representation spaces. Computationally expensive inferences are thus adopted only when the classification at lower levels is too uncertain over an individual instance. The application of complex functions is thus avoided where possible, with a significant reduction of the overall costs. The proposed strategy has been integrated within two well-known algorithms: Support Vector Machines and the Passive-Aggressive Online classifier. A significant cost reduction (up to 90%), with a negligible performance drop, is observed on two Natural Language Processing tasks, i.e. Question Classification and Sentiment Analysis in Twitter.
【Keywords】: Kernel-based Learning; Stratified Learning; Online Learning
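Operationally the strategy is a classifier cascade: consult cheap models first and escalate only when their margin is too small. A minimal Python sketch (the threshold, scorers and names are assumptions, not the paper's configuration):

def stratified_classify(x, classifiers, margin=0.5):
    """classifiers: scoring functions ordered cheapest to most expensive,
    each returning a signed margin. Stop at the first confident level;
    only uncertain instances reach the expensive model at the end."""
    for clf in classifiers[:-1]:
        score = clf(x)
        if abs(score) >= margin:
            return 1 if score > 0 else -1
    return 1 if classifiers[-1](x) > 0 else -1

cheap = lambda x: 0.1 * x                    # fast, rarely confident
costly = lambda x: 1.0 if x > 0 else -1.0    # slow, always confident
print(stratified_classify(0.2, [cheap, costly]))   # escalates, returns 1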
【Paper Link】 【Pages】:2246-2252
【Authors】: Dan Garrette ; Chris Dyer ; Jason Baldridge ; Noah A. Smith
【Abstract】: Combinatory Categorial Grammar (CCG) is a lexicalized grammar formalism in which words are associated with categories that, in combination with a small universal set of rules, specify the syntactic configurations in which they may occur. Categories are selected from a large, recursively-defined set; this leads to high word-to-category ambiguity, which is one of the primary factors that make learning CCG parsers difficult, especially in the face of little data. Previous work has shown that learning sequence models for CCG tagging can be improved by using linguistically-motivated prior probability distributions over potential categories. We extend this approach to the task of learning a CCG parser from weak supervision. We present a Bayesian formulation for CCG parser induction that assumes only supervision in the form of an incomplete tag dictionary mapping some word types to sets of potential categories. Our approach outperforms a baseline model trained with uniform priors by exploiting universal, intrinsic properties of the CCG formalism to bias the model toward simpler, more cross-linguistically common categories.
【Keywords】: natural language processing; machine learning; computational linguistics; parsing; grammar; ccg; combinatory categorial grammar
【Paper Link】 【Pages】:2253-2259
【Authors】: Dishan Gupta ; Jaime G. Carbonell ; Anatole Gershman ; Steve Klein ; David Miller
【Abstract】: Unsupervised discovery of synonymous phrases is useful in a variety of tasks ranging from text mining and search engines to semantic analysis and machine translation. This paper presents an unsupervised corpus-based conditional model: Near-Synonym System (NeSS) for finding phrasal synonyms and near synonyms that requires only a large monolingual corpus. The method is based on maximizing information-theoretic combinations of shared contexts and is parallelizable for large-scale processing. An evaluation framework with crowd-sourced judgments is proposed and results are compared with alternate methods, demonstrating considerably superior results to the literature and to thesaurus look up for multi-word phrases. Moreover, the results show that the statistical scoring functions and overall scalability of the system are more important than language specific NLP tools. The method is language-independent and practically useable due to accuracy and real-time performance via parallel decomposition.
【Keywords】: Phrasal Synonyms; Paraphrase Acquisition; Monolingual Corpora; Distributional Similarity
【Paper Link】 【Pages】:2260-2266
【Authors】: Seungyeon Kim ; Joonseok Lee ; Guy Lebanon ; Haesun Park
【Abstract】: The n-gram model has been widely used to capture the local ordering of words, yet its exploding feature space often causes an estimation issue. This paper presents local context sparse coding (LCSC), a non-probabilistic topic model that effectively handles large feature spaces using sparse coding. In addition, it introduces a new concept of locality, local contexts, which provides a representation that can generate locally coherent topics and document representations. Our model efficiently finds topics and representations by applying greedy coordinate descent updates. The model is useful for discovering local topics and the semantic flow of a document, as well as constructing predictive models.
【Keywords】:
【Paper Link】 【Pages】:2267-2273
【Authors】: Siwei Lai ; Liheng Xu ; Kang Liu ; Jun Zhao
【Abstract】: Text classification is a foundational task in many NLP applications. Traditional text classifiers often rely on many human-designed features, such as dictionaries, knowledge bases and special tree kernels. In contrast to traditional methods, we introduce a recurrent convolutional neural network for text classification without human-designed features. In our model, we apply a recurrent structure to capture contextual information as far as possible when learning word representations, which may introduce considerably less noise compared to traditional window-based neural networks. We also employ a max-pooling layer that automatically judges which words play key roles in text classification to capture the key components in texts. We conduct experiments on four commonly used datasets. The experimental results show that the proposed method outperforms the state-of-the-art methods on several datasets, particularly on document-level datasets.
【Keywords】: neural network; text classification; word embedding
【Paper Link】 【Pages】:2274-2280
【Authors】: Emmanuel Lassalle ; Pascal Denis
【Abstract】: This paper introduces a new structured model for learning anaphoricity detection and coreference resolution in a joint fashion. Specifically, we use a latent tree to represent the full coreference and anaphoric structure of a document at a global level, and we jointly learn the parameters of the two models using a version of the structured perceptron algorithm. Our joint structured model is further refined by the use of pairwise constraints which help the model to capture accurately certain patterns of coreference. Our experiments on the CoNLL-2012 English datasets show large improvements in both coreference resolution and anaphoricity detection, compared to various competing architectures. Our best coreference system obtains a CoNLL score of 81.97 on gold mentions, which is to date the best score reported on this setting.
【Keywords】: Coreference resolution; Anaphoricity detection; Latent tree models
【Paper Link】 【Pages】:2281-2287
【Authors】: Junyi Jessy Li ; Ani Nenkova
【Abstract】: Recent studies have demonstrated that specificity is an important characterization of texts potentially beneficial for a range of applications such as multi-document news summarization and analysis of science journalism. The feasibility of automatically predicting sentence specificity from a rich set of features has also been confirmed in prior work. In this paper we present a practical system for predicting sentence specificity which exploits only features that require minimum processing and is trained in a semi-supervised manner. Our system outperforms the state-of-the-art method for predicting sentence specificity and does not require part of speech tagging or syntactic parsing as the prior methods did. With the tool that we developed, Speciteller, we study the role of specificity in sentence simplification. We show that specificity is a useful indicator for finding sentences that need to be simplified and a useful objective for simplification, descriptive of the differences between original and simplified sentences.
【Keywords】: specificity; general; specific; sentence property
【Paper Link】 【Pages】:2288-2294
【Authors】: Changsong Liu ; Joyce Yue Chai
【Abstract】: In human-robot dialogue, although a robot and its human partner are co-present in a shared environment, they have significantly mismatched perceptual capabilities (e.g., recognizing objects in the surroundings). When a shared perceptual basis is missing, it becomes difficult for the robot to identify referents in the physical world that are referred to by the human (i.e., a problem of referential grounding). To overcome this problem, we have developed an optimization based approach that allows the robot to detect and adapt to perceptual differences. Through online interaction with the human, the robot can learn a set of weights indicating how reliably/unreliably each dimension (e.g., object type, object color, etc.) of its perception of the environment maps to the human's linguistic descriptors and thus adjust its word models accordingly. Our empirical evaluation has shown that this weight-learning approach can successfully adjust the weights to reflect the robot's perceptual limitations. The learned weights, together with updated word models, can lead to a significant improvement for referential grounding in future dialogues.
【Keywords】: Referential Grounding; Weight Learning; Human-Robot Dialogue
【Paper Link】 【Pages】:2295-2301
【Authors】: Yang Liu ; Maosong Sun
【Abstract】: Word alignment is an important natural language processing task that indicates the correspondence between natural languages. Recently, unsupervised learning of log-linear models for word alignment has received considerable attention as it combines the merits of generative and discriminative approaches. However, a major challenge still remains: it is intractable to calculate the expectations of non-local features that are critical for capturing the divergence between natural languages. We propose a contrastive approach that aims to differentiate observed training examples from noises. It not only introduces prior knowledge to guide unsupervised learning but also cancels out partition functions. Based on the observation that the probability mass of log-linear models for word alignment is usually highly concentrated, we propose to use top-n alignments to approximate the expectations with respect to posterior distributions. This allows for efficient and accurate calculation of expectations of non-local features. Experiments show that our approach achieves significant improvements over state-of-the-art unsupervised word alignment methods.
【Keywords】: contrastive learning; latent-variable log-linear models; sampling
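The top-n device can be stated in a few lines: renormalise the model scores over the n best alignments and accumulate feature expectations there, exploiting the concentration of posterior mass. Schematic Python; the alignment objects and feature function are placeholders.

import math

def topn_feature_expectation(scored, feature_fn, n=10):
    """scored: list of (alignment, model_score). Approximate E[f] under
    the log-linear posterior using only the n highest-scoring alignments."""
    top = sorted(scored, key=lambda x: x[1], reverse=True)[:n]
    m = max(s for _, s in top)
    ws = [math.exp(s - m) for _, s in top]   # numerically stable softmax
    z = sum(ws)
    expectation = {}
    for (alignment, _), w in zip(top, ws):
        for feat, val in feature_fn(alignment).items():
            expectation[feat] = expectation.get(feat, 0.0) + (w / z) * val
    return expectation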
【Paper Link】 【Pages】:2302-2310
【Authors】: Tom M. Mitchell ; William W. Cohen ; Estevam R. Hruschka Jr. ; Partha Pratim Talukdar ; Justin Betteridge ; Andrew Carlson ; Bhavana Dalvi Mishra ; Matthew Gardner ; Bryan Kisiel ; Jayant Krishnamurthy ; Ni Lao ; Kathryn Mazaitis ; Thahir Mohamed ; Ndapandula Nakashole ; Emmanouil Antonios Platanios ; Alan Ritter ; Mehdi Samadi ; Burr Settles ; Richard C. Wang ; Derry Tanti Wijaya ; Abhinav Gupta ; Xinlei Chen ; Abulhair Saparov ; Malcolm Greaves ; Joel Welling
【Abstract】: Whereas people learn many different types of knowledge from diverse experiences over many years, most current machine learning systems acquire just a single function or data model from just a single data set. We propose a never-ending learning paradigm for machine learning, to better reflect the more ambitious and encompassing type of learning performed by humans. As a case study, we describe the Never-Ending Language Learner (NELL), which achieves some of the desired properties of a never-ending learner, and we discuss lessons learned. NELL has been learning to read the web 24 hours/day since January 2010, and so far has acquired a knowledge base with over 80 million confidence-weighted beliefs (e.g., servedWith(tea, biscuits)). NELL has also learned millions of features and parameters that enable it to read these beliefs from the web. Additionally, it has learned to reason over these beliefs to infer new beliefs, and is able to extend its ontology by synthesizing new relational predicates. NELL can be tracked online at http://rtw.ml.cmu.edu, and followed on Twitter at @CMUNELL.
【Keywords】: never ending learning, machine learning, read the web
【Paper Link】 【Pages】:2311-2317
【Authors】: Yanchuan Sim ; Bryan R. Routledge ; Noah A. Smith
【Abstract】: We explore the idea that authoring a piece of text is an act of maximizing one's expected utility. To make this idea concrete, we consider the societally important decisions of the Supreme Court of the United States. Extensive past work in quantitative political science provides a framework for empirically modeling the decisions of justices and how they relate to text. We incorporate into such a model texts authored by amici curiae ("friends of the court" separate from the litigants) who seek to weigh in on the decision, then explicitly model their goals in a random utility model. We demonstrate the benefits of this approach in improved vote prediction and the ability to perform counterfactual analysis.
【Keywords】: natural language processing; text mining; nlp; utility; supreme court; political science; scotus; votes
【Paper Link】 【Pages】:2318-2324
【Authors】: Andrei Arsene Simion ; Michael Collins ; Cliff Stein
【Abstract】: Recently, a new convex formulation of IBM Model 2 was introduced. In this paper we develop the theory further and introduce a class of convex relaxations for latent variable models which include IBM Model 2. When applied to IBM Model 2, our relaxation class subsumes the previous relaxation as a special case. As proof of concept, we study a new relaxation of IBM Model 2 which is simpler than the previous algorithm: the new relaxation relies on the use of nothing more than a multinomial EM algorithm, does not require the tuning of a learning rate, and has some favorable comparisons to IBM Model 2 in terms of F-Measure. The ideas presented could be applied to a wide range of NLP and machine learning problems.
【Keywords】: convex optimization; machine learning; NLP; word alignment
【Paper Link】 【Pages】:2325-2331
【Authors】: Svitlana Volkova ; Benjamin Van Durme
【Abstract】: Latent author attribute prediction in social media provides a novel set of conditions for the construction of supervised classification models. With individual authors as training and test instances, their associated content (“features”) is made available incrementally over time, as they converse over discussion forums. We propose various approaches to handling this dynamic data, from traditional batch training and testing, to incremental bootstrapping, and then active learning via crowdsourcing. Our underlying model relies on an intuitive application of Bayes rule, which should be easy to adopt by the community, thus allowing for a general shift towards online modeling for social media.
【Keywords】: NLPML;NLPTM;APP
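The "intuitive application of Bayes rule" suggests a streaming naive Bayes reading: per-class token counts updated as an author's posts arrive, with the posterior recomputed on demand. A hedged Python sketch; the class names, smoothing and vocabulary size are assumptions, not the paper's setup.

import math
from collections import Counter

class StreamingAttributeClassifier:
    """Incremental naive Bayes over an author's accumulating posts."""
    def __init__(self, classes, vocab_size=50000):
        self.counts = {c: Counter() for c in classes}
        self.totals = {c: 0 for c in classes}
        self.V = vocab_size

    def observe(self, label, tokens):
        """Fold one labelled post into the per-class counts."""
        self.counts[label].update(tokens)
        self.totals[label] += len(tokens)

    def posterior(self, tokens):
        """Bayes rule with add-one smoothing and a uniform class prior."""
        logp = {c: sum(math.log((self.counts[c][w] + 1) /
                                (self.totals[c] + self.V)) for w in tokens)
                for c in self.counts}
        m = max(logp.values())
        unnorm = {c: math.exp(v - m) for c, v in logp.items()}
        z = sum(unnorm.values())
        return {c: v / z for c, v in unnorm.items()}

clf = StreamingAttributeClassifier(["democrat", "republican"])
clf.observe("democrat", ["healthcare", "reform"])
clf.observe("republican", ["tax", "cuts"])
print(clf.posterior(["healthcare"]))   # posterior shifts as posts stream in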
【Paper Link】 【Pages】:2332-2338
【Authors】: Fangzhao Wu ; Yangqiu Song ; Yongfeng Huang
【Abstract】: Microblog sentiment classification is an important research topic which has wide applications in both academia and industry. Because microblog messages are short, noisy and contain masses of acronyms and informal words, microblog sentiment classification is a very challenging task. Fortunately, the contextual information about these idiosyncratic words collectively provides knowledge about their sentiment orientations. In this paper, we propose to use the microblogs' contextual knowledge mined from a large amount of unlabeled data to help improve microblog sentiment classification. We define two kinds of contextual knowledge: word-word association and word-sentiment association. The contextual knowledge is formulated as regularization terms in supervised learning algorithms. An efficient optimization procedure is proposed to learn the model. Experimental results on benchmark datasets show that our method can consistently and significantly outperform the state-of-the-art methods.
【Keywords】: Sentiment Classification; Microblog; Contextual Knowledge
【Paper Link】 【Pages】:2339-2345
【Authors】: Jun Xie ; Chao Ma ; Janardhan Rao Doppa ; Prashanth Mannem ; Xiaoli Z. Fern ; Thomas G. Dietterich ; Prasad Tadepalli
【Abstract】: Easy-first, a search-based structured prediction approach, has been applied to many NLP tasks including dependency parsing and coreference resolution. This approach employs a learned greedy policy (action scoring function) to make easy decisions first, which constrains the remaining decisions and makes them easier. We formulate greedy policy learning in the Easy-first approach as a novel non-convex optimization problem and solve it via an efficient Majorization-Minimization (MM) algorithm. Results on within-document coreference and cross-document joint entity and event coreference tasks demonstrate that the proposed approach achieves statistically significant performance improvement over existing training regimes for Easy-first and is less susceptible to overfitting.
【Keywords】: Structured Prediction; Learning for Search; Imitation Learning; Coreference Resolution
【Paper Link】 【Pages】:2346-2352
【Authors】: Ran Xu ; Caiming Xiong ; Wei Chen ; Jason J. Corso
【Abstract】: Recently, joint video-language modeling has been attracting more and more attention. However, most existing approaches focus on exploring the language model on top of a fixed visual model. In this paper, we propose a unified framework that jointly models video and the corresponding text sentences. The framework consists of three parts: a compositional semantics language model, a deep video model and a joint embedding model. In our language model, we propose a dependency-tree structure model that embeds sentences into a continuous vector space, which preserves visually grounded meanings and word order. In the visual model, we leverage deep neural networks to capture essential semantic information from videos. In the joint embedding model, we minimize the distance between the outputs of the deep video model and the compositional language model in the joint space, and update these two models jointly. Based on these three parts, our system is able to accomplish three tasks: 1) natural language generation, 2) video retrieval, and 3) language retrieval. In the experiments, the results show our approach outperforms SVM, CRF and CCA baselines in predicting Subject-Verb-Object triplets and natural sentence generation, and is better than CCA in video retrieval and language retrieval tasks.
【Keywords】: Natural Language Generation; Deep Learning; Video Content Analysis; Video to Text; Multi-Modality
【Paper Link】 【Pages】:2353-2360
【Authors】: Min Yang ; Tianyi Cui ; Wenting Tu
【Abstract】: Topic modeling of textual corpora is an important and challenging problem. In most previous work, the “bag-of-words” assumption is usually made, which ignores the ordering of words. This assumption simplifies the computation, but it unrealistically loses the ordering information and the semantics of words in context. In this paper, we present a Gaussian Mixture Neural Topic Model (GMNTM) which incorporates both the ordering of words and the semantic meaning of sentences into topic modeling. Specifically, we represent each topic as a cluster of multi-dimensional vectors and embed the corpus into a collection of vectors generated by the Gaussian mixture model. Each word is affected not only by its topic, but also by the embedding vectors of its surrounding words and the context. The Gaussian mixture components and the topics of documents, sentences and words can be learnt jointly. Extensive experiments show that our model can learn better topics and more accurate word distributions for each topic. Quantitatively, compared to state-of-the-art topic modeling approaches, GMNTM obtains significantly better performance in terms of perplexity, retrieval accuracy and classification accuracy.
【Keywords】: Topic Modeling; Distributed Representations; Natural Language Processing
【Paper Link】 【Pages】:2361-2367
【Authors】: Hadi Amiri ; Hal Daumé III
【Abstract】: We consider the problem of classifying micro-posts as churny or non-churny with respect to a given brand. Using Twitter data about three brands, we find that standard machine learning techniques clearly outperform keyword based approaches. However, the three machine learning techniques we employed (linear classification, support vector machines, and logistic regression) do not perform as well on churn classification as on other text classification problems. We investigate demographic, content, and context churn indicators in microblogs and examine factors that make this problem more challenging. Experimental results show an average F1 performance of 75% for target-dependent churn classification in microblogs.
【Keywords】: Churn prediction; Churn classification
【Paper Link】 【Pages】:2368-2374
【Authors】: Wei-Te Chen ; Claire Bonial ; Martha Palmer
【Abstract】: This research describes the development of a supervised classifier of English light verb constructions, for example, "take a walk" and "make a speech." This classifier relies on features from dependency parses, OntoNotes sense tags, WordNet hypernyms and WordNet lexical file information. Evaluation shows that this system achieves an 89% F1 score (four points above the state of the art) on the BNC test set used by Tu & Roth (2011), and an F1 score of 80.68 on the OntoNotes test set, which is significantly more challenging. We attribute the superior F1 score to the use of our rich linguistic features, including the use of WordNet synset and hypernym relations for the detection of previously unattested light verb constructions. We describe the classifier and its features, as well as the characteristics of the OntoNotes light verb construction test set, which relies on linguistically motivated PropBank annotation.
【Keywords】: Light Verb Constructions; Lexical Semantics; Natural Language Processing
【Paper Link】 【Pages】:2375-2381
【Authors】: Chen Chen ; Vincent Ng
【Abstract】: Pronoun resolution and common noun phrase resolution are the two most challenging subtasks of coreference resolution. While a lot of work has focused on pronoun resolution, common noun phrase resolution has almost always been tackled in the context of the larger coreference resolution task. In fact, to our knowledge, there has been no attempt to address Chinese common noun phrase resolution as a standalone task. In this paper, we propose a generative model for unsupervised Chinese common noun phrase resolution that not only allows easy incorporation of linguistic constraints on coreference but also performs joint resolution and anaphoricity determination. When evaluated on the Chinese portion of the OntoNotes 5.0 corpus, our model rivals its supervised counterpart in performance.
【Keywords】:
【Paper Link】 【Pages】:2382-2388
【Authors】: Grant DeLozier ; Jason Baldridge ; Loretta London
【Abstract】: Toponym resolution, or grounding names of places to their actual locations, is an important problem in analysis of both historical corpora and present-day news and web content. Recent approaches have shifted from rule-based spatial minimization methods to machine learned classifiers that use features of the text surrounding a toponym. Such methods have been shown to be highly effective, but they crucially rely on gazetteers and are unable to handle unknown place names or locations. We address this limitation by modeling the geographic distributions of words over the earth's surface: we calculate the geographic profile of each word based on local spatial statistics over a set of geo-referenced language models. These geo-profiles can be further refined by combining in-domain data with background statistics from Wikipedia. Our resolver computes the overlap of all geo-profiles in a given text span; without using a gazetteer, it performs on par with existing classifiers. When combined with a gazetteer, it achieves state-of-the-art performance for two standard toponym resolution corpora (TR-CoNLL and Civil War). Furthermore, it dramatically improves recall when toponyms are identified by named entity recognizers, which often (correctly) find non-standard variants of toponyms.
【Keywords】: Toponym Resolution; Georeferencing; Named Entity Disambiguation
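One way to realize the "overlap of all geo-profiles" described above is a pointwise product of per-word probability surfaces over a geographic grid, as in the hypothetical numpy sketch below; the paper's exact combination rule, grid resolution and profile estimation may differ.

```python
import numpy as np

# Hypothetical geo-profiles: per-word probability surfaces over a lat/lon grid.
GRID = (36, 72)  # 5-degree cells, purely illustrative

def normalize(surface):
    return surface / surface.sum()

rng = np.random.default_rng(1)
geo_profile = {
    "washington": normalize(rng.random(GRID)),
    "potomac": normalize(rng.random(GRID)),
    "river": normalize(rng.random(GRID)),
}

def resolve(span_words):
    """Pointwise product (overlap) of the geo-profiles in a text span."""
    overlap = np.ones(GRID)
    for w in span_words:
        if w in geo_profile:
            overlap *= geo_profile[w]
    cell = np.unravel_index(overlap.argmax(), GRID)
    lat = -90 + (cell[0] + 0.5) * 180 / GRID[0]
    lon = -180 + (cell[1] + 0.5) * 360 / GRID[1]
    return lat, lon

print(resolve(["washington", "potomac", "river"]))
```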
【Paper Link】 【Pages】:2389-2395
【Authors】: Xiao Ding ; Ting Liu ; Junwen Duan ; Jian-Yun Nie
【Abstract】: Social media platforms are often used by people to express their needs and desires. Such data offer great opportunities to identify users’ consumption intention from user-generated contents, so that better tailored products or services can be recommended. However, there have been few efforts on mining commercial intents from social media contents. In this paper, we investigate the use of social media data to identify consumption intentions for individuals. We develop a Consumption Intention Mining Model (CIMM) based on a convolutional neural network (CNN) for identifying whether the user has a consumption intention. The task is domain-dependent, and learning a CNN requires a large number of annotated instances, which may be available only in some domains. Hence, we investigate the possibility of transferring the CNN mid-level sentence representation learned from one domain to another by adding an adaptation layer. To demonstrate the effectiveness of CIMM, we conduct experiments on two domains. Our results show that CIMM offers a powerful paradigm for effectively identifying users’ consumption intention based on their social media data. Moreover, our results also confirm that the CNN learned in one domain can be effectively transferred to another domain. This suggests a great potential for our model to significantly increase the effectiveness of product recommendations and targeted advertising.
【Keywords】: consumption intention; convolutional neural network; domain adaptive
【Paper Link】 【Pages】:2396-2403
【Authors】: Chikara Hashimoto ; Kentaro Torisawa ; Julien Kloetzer ; Jong-Hoon Oh
【Abstract】: Event causality knowledge is indispensable for intelligent natural language understanding. The problem is that any method for extracting event causalities from text is insufficient; it is likely that some event causalities that we can recognize in this world are not written in a corpus, no matter its size. We propose a method of hypothesizing unseen event causalities from known event causalities extracted from the web, using the semantic relations between nouns. For example, our method can hypothesize "deploy a security camera" -> "avoid crimes" from "deploy a mosquito net" -> "avoid malaria" through the semantic relation between the nouns involved. Our experiments show that, from 2.4 million event causalities extracted from the web, our method generated more than 300,000 hypotheses, which were not in the input, with 70% precision. We also show that our method outperforms a state-of-the-art hypothesis generation method.
【Keywords】:
【Paper Link】 【Pages】:2404-2410
【Authors】: Wenyi Huang ; Zhaohui Wu ; Liang Chen ; Prasenjit Mitra ; C. Lee Giles
【Abstract】: Automatic citation recommendation can be very useful for authoring a paper and is an AI-complete problem due to the challenge of bridging the semantic gap between citation context and the cited paper. It is not always easy for knowledgeable researchers to give an accurate citation context for a cited paper or to find the right paper to cite given context. To help with this problem, we propose a novel neural probabilistic model that jointly learns the semantic representations of citation contexts and cited papers. The probability of citing a paper given a citation context is estimated by training a multi-layer neural network. We implement and evaluate our model on the entire CiteSeer dataset, which at the time of this work consists of 10,760,318 citation contexts from 1,017,457 papers. We show that the proposed model significantly outperforms other state-of-the-art models in recall, MAP, MRR, and nDCG.
【Keywords】: Citation recommendation; Neural Probabilistic Model; Distributed Representations
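A stripped-down version of the scoring step such a model enables: papers and context words live in a shared embedding space, and candidate papers are ranked by a softmax over dot products with the averaged context representation. The embeddings below are random placeholders standing in for learned ones.

```python
import numpy as np

rng = np.random.default_rng(2)
d, n_papers, vocab = 32, 50, 200

paper_emb = rng.normal(scale=0.1, size=(n_papers, d))  # one vector per candidate paper
word_emb = rng.normal(scale=0.1, size=(vocab, d))      # context word vectors

def recommend(context_word_ids, k=5):
    """Rank papers by softmax over dot products with the averaged context."""
    ctx = word_emb[context_word_ids].mean(axis=0)
    scores = paper_emb @ ctx
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()
    return np.argsort(-probs)[:k]

print(recommend([3, 17, 42]))
```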
【Paper Link】 【Pages】:2411-2417
【Authors】: Huayi Li ; Arjun Mukherjee ; Jianfeng Si ; Bing Liu
【Abstract】: Identifying aspect-based opinions has been studied extensively in recent years. However, existing work primarily focused on adjective, adverb, and noun expressions. Clearly, verb expressions can imply opinions too. We found that in many domains verb expressions can be even more important to applications because they often describe major issues of products or services. These issues enable brands and businesses to directly improve their products or services. To the best of our knowledge, this problem has not received much attention in the literature. In this paper, we make an attempt to solve this problem. Our proposed method first extracts verb expressions from reviews and then employs Markov Networks to model rich linguistic features and long distance relationships to identify negative issue expressions. Since our training data is obtained from titles of reviews whose labels are automatically inferred from review ratings, our approach is applicable to any domain without manual involvement. Experimental results using real-life review datasets show that our approach outperforms strong baselines.
【Keywords】: Verb Expression; Opinion Mining
【Paper Link】 【Pages】:2418-2424
【Authors】: Yang Liu ; Zhiyuan Liu ; Tat-Seng Chua ; Maosong Sun
【Abstract】: Most word embedding models typically represent each word using a single vector, which makes these models indiscriminative for ubiquitous homonymy and polysemy. In order to enhance discriminativeness, we employ latent topic models to assign topics for each word in the text corpus, and learn topical word embeddings (TWE) based on both words and their topics. In this way, contextual word embeddings can be flexibly obtained to measure contextual word similarity. We can also build document representations, which are more expressive than some widely-used document models such as latent topic models. In the experiments, we evaluate the TWE models on two tasks, contextual word similarity and text classification. The experimental results show that our models outperform typical word embedding models including the multi-prototype version on contextual word similarity, and also exceed latent topic models and other representative document models on text classification.
【Keywords】: word embeddings; topic models; document representation
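The mechanic of a topical word embedding can be sketched as follows: pairing a word vector with a topic vector yields different contextual representations of the same word under different topics. The concatenation below follows the idea only in spirit; names, dimensions and vectors are assumptions.

```python
import numpy as np

rng = np.random.default_rng(9)
d, V, K = 8, 20, 3
word_emb = rng.normal(size=(V, d))    # placeholder word vectors
topic_emb = rng.normal(size=(K, d))   # placeholder topic vectors

def contextual_vec(word_id, topic_id):
    """Contextual embedding: concatenate the word and topic vectors."""
    return np.concatenate([word_emb[word_id], topic_emb[topic_id]])

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# The same word under two different topics yields different context vectors,
# which is what makes contextual word similarity measurable.
bank_finance = contextual_vec(5, 0)
bank_river = contextual_vec(5, 1)
print(cos(bank_finance, bank_river))
```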
【Paper Link】 【Pages】:2425-2431
【Authors】: Yassine Mrabet ; Claire Gardent ; Muriel Foulonneau ; Elena Simperl ; Eric Ras
【Abstract】: While the Web of data is attracting increasing interest and rapidly growing in size, the main carriers of information on the surface Web are still multimedia documents. Semantic annotation of texts is one of the main processes intended to facilitate meaning-based information exchange between computational agents. However, such annotation faces several challenges, such as the heterogeneity of natural language expressions, the heterogeneity of document structure, and context dependencies. While a broad range of annotation approaches rely mainly or partly on the target textual context to disambiguate the extracted entities, in this paper we present an approach that relies mainly on formalized knowledge expressed in RDF datasets to categorize and disambiguate noun phrases. In the proposed method, we represent the reference knowledge bases as co-occurrence matrices and the disambiguation problem as a 0-1 Integer Linear Programming (ILP) problem. The proposed approach is unsupervised and can be ported to any RDF knowledge base. The system implementing this approach, called KODA, shows very promising results w.r.t. state-of-the-art annotation tools in cross-domain experiments.
【Keywords】: Semantic Annotation; Entity Linking; RDF; Open Data; Text Mining
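The 0-1 ILP view of disambiguation can be illustrated by brute force on a tiny instance: choose one knowledge-base entity per noun phrase so that the total pairwise co-occurrence score is maximized. A real solver (and KODA itself) would treat this as an ILP; the candidates and scores below are invented.

```python
from itertools import product

# Hypothetical entity candidates per noun phrase and toy co-occurrence scores.
candidates = {
    "jaguar": ["db:Jaguar_Cars", "db:Jaguar_(animal)"],
    "engine": ["db:Engine"],
}
cooc = {("db:Jaguar_Cars", "db:Engine"): 8.0,
        ("db:Jaguar_(animal)", "db:Engine"): 0.5}

def score(assignment):
    """Total co-occurrence over all ordered entity pairs in the assignment."""
    s = 0.0
    for a in assignment:
        for b in assignment:
            s += cooc.get((a, b), 0.0)
    return s

mentions = list(candidates)
best = max(product(*(candidates[m] for m in mentions)), key=score)
print(dict(zip(mentions, best)))  # picks db:Jaguar_Cars for "jaguar"
```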
【Paper Link】 【Pages】:2432-2439
【Authors】: Ashequl Qadir ; Pablo N. Mendes ; Daniel Gruhl ; Neal Lewis
【Abstract】: With the rise of social media, learning from informal text has become increasingly important. We present a novel semantic lexicon induction approach that is able to learn new vocabulary from social media. Our method is robust to the idiosyncrasies of informal and open-domain text corpora. Unlike previous work, it does not impose restrictions on the lexical features of candidate terms — e.g. by restricting entries to nouns or noun phrases — while still being able to accurately learn multiword phrases of variable length. Starting with a few seed terms for a semantic category, our method first explores the context around seed terms in a corpus, and identifies context patterns that are relevant to the category. These patterns are used to extract candidate terms — i.e. multiword segments that are further analyzed to ensure meaningful term boundary segmentation. We show that our approach is able to learn high quality semantic lexicons from informally written social media text of Twitter, and can achieve accuracy as high as 92% in the top 100 learned category members.
【Keywords】: lexicon induction; social media
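A toy single-word version of the bootstrapping loop described above: score context patterns by how often they attach to seed terms, then harvest new terms that occur with the reliable patterns. The corpus and threshold are illustrative, and the paper's method additionally handles multiword segmentation.

```python
from collections import Counter

# Toy corpus of tokenized posts; seeds for a "beverage" category (illustrative).
posts = [
    "i love drinking tea every morning".split(),
    "drinking coffee keeps me awake".split(),
    "we kept drinking beer on friday".split(),
]
seeds = {"tea", "coffee"}

# Step 1: score left-context patterns by how often they attach to seed terms.
pattern_hits = Counter()
for toks in posts:
    for i in range(1, len(toks)):
        if toks[i] in seeds:
            pattern_hits[toks[i - 1]] += 1
relevant = {p for p, c in pattern_hits.items() if c >= 2}

# Step 2: harvest candidate terms that occur with the relevant patterns.
candidates = Counter()
for toks in posts:
    for i in range(1, len(toks)):
        if toks[i - 1] in relevant and toks[i] not in seeds:
            candidates[toks[i]] += 1
print(candidates.most_common())  # -> [('beer', 1)]
```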
【Paper Link】 【Pages】:2440-2446
【Authors】: Likun Qiu ; Yue Zhang
【Abstract】: Word segmentation is a necessary first step for automatic syntactic analysis of Chinese text. Chinese segmentation is highly accurate on news data, but the accuracies drop significantly on other domains, such as science and literature. For scientific domains, a significant portion of out-of-vocabulary words are domain-specific terms, and therefore lexicons can be used to improve segmentation significantly. For the literature domain, however, there is not a fixed set of domain terms. For example, each novel can contain a specific set of person, organization and location names. We investigate a method for automatically mining common noun entities for each novel using information extraction techniques, and use the resulting entities to improve a state-of-the-art segmentation model for the novel. In particular, we design a novel double-propagation algorithm that mines noun entities together with common contextual patterns, and uses them as plug-in features to a model trained on the source domain. An advantage of our method is that no retraining of the segmentation model is needed for each novel, and hence it can be applied efficiently given the huge number of novels on the web. Results on five different novels show significantly improved accuracies, in particular for OOV words.
【Keywords】: Natural Language Processing; Evaluation and Analysis; Information Extraction
【Paper Link】 【Pages】:2447-2452
【Authors】: Anders Søgaard ; Barbara Plank ; Héctor Martínez Alonso
【Abstract】: Knowledge bases have the potential to advance artificial intelligence, but often suffer from recall problems, i.e., lack of knowledge of new entities and relations. On the contrary, social media such as Twitter provide an abundance of data in a timely manner: information spreads at an incredible pace and is posted long before it makes it into more commonly used resources for knowledge extraction. In this paper we address the question of whether we can exploit social media to extract new facts, which may at first seem like finding needles in haystacks. We collect tweets about 60 entities in Freebase and compare four methods to extract binary relation candidates, based on syntactic and semantic parsing and a simple mechanism for factuality scoring. The extracted facts are manually evaluated in terms of their correctness and relevance for search. We show that moving from bottom-up syntactic or semantic dependency parsing formalisms to top-down frame-semantic processing improves the robustness of knowledge extraction, producing more intelligible fact candidates of better quality. In order to evaluate the quality of frame semantic parsing on Twitter intrinsically, we make a multiply frame-annotated dataset of tweets publicly available.
【Keywords】: frame semantics; knowledge bases; twitter
【Paper Link】 【Pages】:2453-2459
【Authors】: Jiwei Tan ; Xiaojun Wan ; Jianguo Xiao
【Abstract】: In this paper, we propose and address a novel task of recommending quotes for writing. Quote is short for quotation: the repetition of someone else’s statement or thoughts. It is common in writing to cite someone’s statement, such as a proverb or a remark by a famous person, to make the composition more elegant or convincing. However, sometimes we would like to make a citation somewhere but have no idea which quote is relevant to express our idea, since knowing or remembering so many quotes is not easy. It is therefore appealing to have a system recommend relevant quotes while writing. In this paper we tackle this AI task and build a learning framework for quote recommendation. We collect abundant quotes from the Internet, and mine real contexts containing these quotes from a large number of electronic books, to build a dataset for experiments. We explore the particular features of this task, and propose several useful features to model the characteristics of quotes and the relevance of quotes to contexts. We apply a supervised learning-to-rank model to integrate multiple features. Experimental results show that our proposed approach is appropriate for this task and outperforms other recommendation methods.
【Keywords】: quote recommendation; learning to rank
【Paper Link】 【Pages】:2460-2467
【Authors】: Andrew Yates ; Nazli Goharian ; Ophir Frieder
【Abstract】: The potential benefits of mining social media to learn about adverse drug reactions (ADRs) are rapidly increasing with the increasing popularity of social media. Unknown ADRs have traditionally been discovered by expensive post-marketing trials, but recent work has suggested that some unknown ADRs may be discovered by analyzing social media. We propose three methods for extracting ADRs from forum posts and tweets, and compare our methods with several existing methods. Our methods outperform the existing methods in several scenarios; our filtering method achieves the highest F1 and precision on forum posts, and our CRF method achieves the highest precision on tweets. Furthermore, we address the difficulty of annotating social media on a large scale with an alternate evaluation scheme that takes advantage of the ADRs listed on drug labels. We investigate how well this alternate evaluation approximates a traditional evaluation using human annotations.
【Keywords】:
【Paper Link】 【Pages】:2468-2475
【Authors】: Deyu Zhou ; Liangyu Chen ; Yulan He
【Abstract】: Twitter, as a popular microblogging service, has become a new information channel for users to receive and exchange the most up-to-date information on current events. However, since there is no control on how users can publish messages on Twitter, finding newsworthy events from Twitter becomes a difficult task, like "finding a needle in a haystack". In this paper we propose a general unsupervised framework to explore events from tweets, which consists of a pipeline process of filtering, extraction and categorization. To filter out noisy tweets, the filtering step exploits a lexicon-based approach to separate tweets that are event-related from those that are not. Then, based on these event-related tweets, the structured representations of events are extracted and categorized automatically using an unsupervised Bayesian model without the use of any labelled data. Moreover, the categorized events are assigned event type labels without human intervention. The proposed framework has been evaluated on over 60 million tweets collected over one month in December 2010. A precision of 70.49% is achieved in event extraction, outperforming a competitive baseline by nearly 6%. Events are also clustered into coherent groups with the automatically assigned event type labels.
【Keywords】:
【Paper Link】 【Pages】:2476-2482
【Authors】: Tameem Adel ; Alexander Wong
【Abstract】: The aim of domain adaptation algorithms is to establish a learner, trained on labeled data from a source domain, that can classify samples from a target domain, in which few or no labeled data are available for training. Covariate shift, a primary assumption in several works on domain adaptation, assumes that the labeling functions of source and target domains are identical. We present a domain adaptation algorithm that assumes a relaxed version of covariate shift where the assumption that the labeling functions of the source and target domains are identical holds with a certain probability. Assuming a source deterministic large margin binary classifier, the farther a target instance is from the source decision boundary, the higher the probability that covariate shift holds. In this context, given a target unlabeled sample and no target labeled data, we develop a domain adaptation algorithm that bases its labeling decisions both on the source learner and on the similarities between the target unlabeled instances. The source labeling function decisions associated with probabilistic covariate shift, along with the target similarities are concurrently expressed on a similarity graph. We evaluate our proposed algorithm on a benchmark sentiment analysis (and domain adaptation) dataset, where state-of-the-art adaptation results are achieved. We also derive a lower bound on the performance of the algorithm.
【Keywords】: Machine Learning; Domain Adaptation; Spectral Grouping
【Paper Link】 【Pages】:2483-2489
【Authors】: Ibrahim M. Alabdulmohsin ; Xin Gao ; Xiangliang Zhang
【Abstract】: Active learning is a subfield of machine learning that has been successfully used in many applications including text classification and bioinformatics. One of the fundamental branches of active learning is query synthesis, where the learning agent constructs artificial queries from scratch in order to reveal sensitive information about the true decision boundary. Nevertheless, the existing literature on membership query synthesis has focused on finite concept classes with a limited extension to real-world applications. In this paper, we present an efficient spectral algorithm for membership query synthesis for halfspaces, whose sample complexity is experimentally shown to be near-optimal. At each iteration, the algorithm consists of two steps. First, a convex optimization problem is solved that provides an approximate characterization of the version space. Second, a principal component is extracted, which yields a synthetic query that shrinks the version space exponentially fast. Unlike traditional methods in active learning, the proposed method can be readily extended into the batch setting by solving for the top k eigenvectors in the second step. Experimentally, it exhibits a significant improvement over traditional approaches such as uncertainty sampling and representative sampling. For example, to learn a halfspace in a 25-dimensional Euclidean space with an estimation error of 1E-4, the proposed algorithm uses less than 3% of the number of queries required by uncertainty sampling.
【Keywords】: Active Learning; Query Synthesis; Halfspace; Linear Classifiers
【Paper Link】 【Pages】:2490-2496
【Authors】: Kareem Amin ; Satyen Kale ; Gerald Tesauro ; Deepak S. Turaga
【Abstract】: We consider a budgeted variant of the problem of learning from expert advice with N experts. Each queried expert incurs a cost and there is a given budget B on the total cost of experts that can be queried in any prediction round. We provide an online learning algorithm for this setting with regret after T prediction rounds bounded by O(sqrt(C log(N)T/B)), where C is the total cost of all experts. We complement this upper bound with a nearly matching lower bound Omega(sqrt(CT/B)) on the regret of any algorithm for this problem. We also provide experimental validation of our algorithm.
【Keywords】:
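For flavor, here is a generic exponential-weights learner that queries only a budget-feasible subset of experts each round and importance-weights the observed losses. This is not the paper's algorithm, just the standard mechanic it builds on; the costs, losses and the rough inclusion-probability estimate are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T, B = 10, 500, 3.0
cost = rng.uniform(0.5, 1.5, size=N)   # per-query expert costs (illustrative)
w = np.ones(N) / N
eta = 0.1

def greedy_budget_subset(order, budget):
    """Pick experts in the given order while the total cost fits the budget."""
    chosen, spent = [], 0.0
    for i in order:
        if spent + cost[i] <= budget:
            chosen.append(i)
            spent += cost[i]
    return chosen

total_loss = 0.0
for t in range(T):
    # Query a random budget-feasible subset of experts.
    S = greedy_budget_subset(rng.permutation(N), B)
    q = len(S) / N                     # rough inclusion probability (by symmetry)
    losses = rng.random(N)             # stand-in for the true per-expert losses
    est = np.zeros(N)
    est[S] = losses[S] / q             # importance-weighted loss estimates
    w *= np.exp(-eta * est)            # multiplicative-weights update
    w /= w.sum()
    total_loss += w @ losses
print("avg mixture loss:", total_loss / T)
```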
【Paper Link】 【Pages】:2497-2503
【Authors】: Olov Andersson ; Fredrik Heintz ; Patrick Doherty
【Abstract】: Reinforcement learning for robot control tasks in continuous environments is a challenging problem due to the dimensionality of the state and action spaces, time and resource costs for learning with a real robot as well as constraints imposed for its safe operation. In this paper we propose a model-based reinforcement learning approach for continuous environments with constraints. The approach combines model-based reinforcement learning with recent advances in approximate optimal control. This results in a bounded-rationality agent that makes decisions in real-time by efficiently solving a sequence of constrained optimization problems on learned sparse Gaussian process models. Such a combination has several advantages. No high-dimensional policy needs to be computed or stored while the learning problem often reduces to a set of lower-dimensional models of the dynamics. In addition, hard constraints can easily be included and objectives can also be changed in real-time to allow for multiple or dynamic tasks. The efficacy of the approach is demonstrated on both an extended cart pole domain and a challenging quadcopter navigation task using real data.
【Keywords】: Reinforcement Learning, Gaussian Processes, Optimization, Robotics
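The decision-making loop described above amounts to receding-horizon constrained optimization over a learned model. The sketch below substitutes a trivial, known 1-D model for the learned sparse Gaussian process and uses scipy's bounded optimizer; every name and number is illustrative.

```python
import numpy as np
from scipy.optimize import minimize

# Toy 1-D dynamics standing in for a learned sparse-GP model: x' = x + 0.1*u.
def model_step(x, u):
    return x + 0.1 * u

H = 10                  # planning horizon
x0, target = 0.0, 1.0   # current state and goal

def objective(us):
    """Roll the model forward under controls us and accumulate cost."""
    x, cost = x0, 0.0
    for u in us:
        x = model_step(x, u)
        cost += (x - target) ** 2 + 0.01 * u ** 2
    return cost

# Hard actuator constraints enter as bounds; re-solve at every time step
# (receding horizon) and apply only the first action.
res = minimize(objective, np.zeros(H), bounds=[(-1.0, 1.0)] * H)
print("first action:", res.x[0])
```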
【Paper Link】 【Pages】:2504-2510
【Authors】: Haitham Bou-Ammar ; Eric Eaton ; Paul Ruvolo ; Matthew E. Taylor
【Abstract】: The success of applying policy gradient reinforcement learning (RL) to difficult control tasks hinges crucially on the ability to determine a sensible initialization for the policy. Transfer learning methods tackle this problem by reusing knowledge gleaned from solving other related tasks. In the case of multiple task domains, these algorithms require an inter-task mapping to facilitate knowledge transfer across domains. However, there are currently no general methods to learn an inter-task mapping without requiring either background knowledge that is not typically present in RL settings, or an expensive analysis of an exponential number of inter-task mappings in the size of the state and action spaces. This paper introduces an autonomous framework that uses unsupervised manifold alignment to learn inter-task mappings and effectively transfer samples between different task domains. Empirical results on diverse dynamical systems, including an application to quadrotor control, demonstrate its effectiveness for cross-domain transfer in the context of policy gradient RL.
【Keywords】: transfer learning; reinforcement learning; policy gradients; manifold alignment; cross-domain transfer
【Paper Link】 【Pages】:2511-2517
【Authors】: Thomas Boucher ; C. J. Carey ; Sridhar Mahadevan ; Melinda Darby Dyar
【Abstract】: Current manifold alignment methods can effectively align data sets that are drawn from a non-intersecting set of manifolds. However, as data sets become increasingly high-dimensional and complex, this assumption may not hold. This paper proposes a novel manifold alignment algorithm, low rank alignment (LRA), that uses a low rank representation (instead of a nearest neighbor graph construction) to embed and align data sets drawn from mixtures of manifolds. LRA does not require the tuning of a sensitive nearest neighbor hyperparameter or prior knowledge of the number of manifolds, both of which are common drawbacks with existing techniques. We demonstrate the effectiveness of our algorithm in two real-world applications: a transfer learning task in spectroscopy and a canonical information retrieval task.
【Keywords】: manifold alignment; transfer learning; low rank representation
【Paper Link】 【Pages】:2518-2524
【Authors】: Wei Cao ; Liang Hu ; Longbing Cao
【Abstract】: The global financial crisis that occurred in 2008, its contagion to other regions, and its long-lasting impact on different markets show that it is increasingly important to understand the complicated coupling relationships across financial markets. This is indeed very difficult, as complex hidden coupling relationships exist between different financial markets in various countries, which are very hard to model. The couplings involve interactions between homogeneous markets from various countries (which we call intra-market coupling), interactions between heterogeneous markets (inter-market coupling) and interactions between current and past market behaviors (temporal coupling). Very limited work has been done towards modeling such complex couplings, whereas some existing methods predict market movement by simply aggregating indicators from various markets while ignoring the inbuilt couplings. As a result, these methods are highly sensitive to observations, and may often fail when financial indicators change slightly. In this paper, a coupled deep belief network is designed to accommodate the above three types of couplings across financial markets. With a deep-architecture model to capture the high-level coupled features, the proposed approach can infer market trends. Experimental results on data of stock and currency markets from three countries show that our approach outperforms other baselines, from both technical and business perspectives.
【Keywords】: Complex Couplings; Deep Belief Network; Time Series Model
【Paper Link】 【Pages】:2525-2531
【Authors】: Kai-Wei Chang ; Shyam Upadhyay ; Gourab Kundu ; Dan Roth
【Abstract】: Training a structured prediction model involves performing several loss-augmented inference steps. Over the lifetime of the training, many of these inference problems, although different, share the same solution. We propose AI-DCD, an Amortized Inference framework for the Dual Coordinate Descent method, an approximate learning algorithm that accelerates the training process by exploiting this redundancy of solutions, without compromising the performance of the model. We show the efficacy of our method by training a structured SVM using dual coordinate descent for an entity-relation extraction task. Our method learns the same model as an exact training algorithm would, but calls the inference engine in only 10% – 24% of the inference problems encountered during training. We observe similar gains on a multi-label classification task and with a Structured Perceptron model for the entity-relation task.
【Keywords】: Structured Learning; Amortized Inference
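The core saving can be caricatured as memoization of inference calls that recur during training, as in the sketch below. The paper's amortization is stronger: it also reuses solutions across different but related inference problems when verifiable conditions hold. The keys and the stand-in solver here are invented.

```python
# Caricature of amortized inference: reuse solutions of inference problems
# that recur across training epochs. Keys and the solver are illustrative.
cache = {}
calls = {"solver": 0}

def solve_inference(problem_key):
    calls["solver"] += 1
    return hash(problem_key) % 5          # stand-in for argmax inference

def amortized_solve(problem_key):
    if problem_key not in cache:
        cache[problem_key] = solve_inference(problem_key)
    return cache[problem_key]

for epoch in range(3):
    for key in ["ex1", "ex2", "ex1"]:     # repeated problems across training
        amortized_solve(key)
print("solver calls:", calls["solver"])   # 2 instead of 9
```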
【Paper Link】 【Pages】:2532-2538
【Authors】: Xiaojun Chang ; Feiping Nie ; Zhigang Ma ; Yi Yang ; Xiaofang Zhou
【Abstract】: Spectral clustering is a fundamental technique in the field of data mining and information processing. Most existing spectral clustering algorithms integrate dimensionality reduction into the clustering process assisted by manifold learning in the original space. However, the manifold in the reduced-dimensional subspace is likely to exhibit altered properties in contrast with the original space. Thus, applying manifold information obtained from the original space to the clustering process in a low-dimensional subspace is prone to inferior performance. Aiming to address this issue, we propose a novel convex algorithm that mines the manifold structure in the low-dimensional subspace. In addition, our unified learning process makes the manifold learning particularly tailored for the clustering. Compared with other related methods, the proposed algorithm produces more structured clustering results. To validate the efficacy of the proposed algorithm, we perform extensive experiments on several benchmark datasets in comparison with some state-of-the-art clustering approaches. The experimental results demonstrate that the proposed algorithm has quite promising clustering performance.
【Keywords】: Spectral Clustering; Manifold Learning
【Paper Link】 【Pages】:2539-2546
【Authors】: Jaesik Choi ; Eyal Amir ; Tianfang Xu ; Albert J. Valocchi
【Abstract】: The Kalman Filter (KF) is pervasively used to control a vast array of consumer, health and defense products. By grouping sets of symmetric state variables, the Relational Kalman Filter (RKF) enables us to scale the exact KF for large-scale dynamic systems. In this paper, we provide a parameter learning algorithm for RKF, and a regrouping algorithm that prevents the degeneration of the relational structure for efficient filtering. The proposed algorithms significantly expand the applicability of the RKFs by solving the following questions: (1) how to learn parameters for RKF from partial observations; and (2) how to regroup the degenerated state variables by noisy real-world observations. To our knowledge, this is the first paper on learning parameters in relational continuous probabilistic models. We show that our new algorithms significantly improve the accuracy and the efficiency of filtering large-scale dynamic systems.
【Keywords】: Kalman filtering, Statistical Relational Learning, Probabilistic Relational Models, Linear Gaussian Models, Lifted Inference
【Paper Link】 【Pages】:2547-2553
【Authors】: Ujjwal Das Gupta ; Erik Talvitie ; Michael Bowling
【Abstract】: Much of the focus on finding good representations in reinforcement learning has been on learning complex non-linear predictors of value. Policy gradient algorithms, which directly represent the policy, often need fewer parameters to learn good policies. However, they typically employ a fixed parametric representation that may not be sufficient for complex domains. This paper introduces the Policy Tree algorithm, which can learn an adaptive representation of policy in the form of a decision tree over different instantiations of a base policy. Policy gradient is used both to optimize the parameters and to grow the tree by choosing splits that enable the maximum local increase in the expected return of the policy. Experiments show that this algorithm can choose genuinely helpful splits and significantly improve upon the commonly used linear Gibbs softmax policy, which we choose as our base policy.
【Keywords】: Reinforcement Learning; Policy Gradient; Decision Trees; Representation Learning
【Paper Link】 【Pages】:2554-2560
【Authors】: Charanpal Dhanjal ; Romaric Gaudel ; Stéphan Clémençon
【Abstract】: In recommendation systems, one is interested in the ranking of the predicted items as opposed to other losses such as the mean squared error. Although a variety of ways to evaluate rankings exist in the literature, here we focus on the Area Under the ROC Curve (AUC), as it is widely used and has a strong theoretical underpinning. In practical recommendation, only items at the top of the ranked list are presented to the users. With this in mind, we propose a class of objective functions that primarily represent a smooth surrogate for the real AUC, and in a special case we show how to prioritise the top of the list. This loss is differentiable and is optimised through a carefully designed stochastic gradient-descent-based algorithm which scales linearly with the size of the data. We mitigate sample bias present in the data by sampling observations according to a certain power-law based distribution. In addition, we provide computational results on the efficacy of the proposed method using synthetic and real data.
【Keywords】: Recommender Systems; Matrix Factorization; AUC; Local AUC
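A minimal sketch of optimizing a smooth pairwise surrogate for AUC by stochastic gradient descent, here with a logistic surrogate and uniform pair sampling. The paper's objective additionally prioritizes the top of the list and samples according to a power law; the data below is synthetic.

```python
import numpy as np

rng = np.random.default_rng(4)
d, n = 5, 400
X = rng.normal(size=(n, d))
y = (X @ rng.normal(size=d) + 0.3 * rng.normal(size=n)) > 0
pos, neg = np.where(y)[0], np.where(~y)[0]

w, lr = np.zeros(d), 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for t in range(2000):
    i, j = rng.choice(pos), rng.choice(neg)       # sample one positive/negative pair
    margin = np.clip(X[i] @ w - X[j] @ w, -30, 30)
    # Gradient of the logistic surrogate log(1 + exp(-margin)) for the
    # AUC indicator 1[score(pos) > score(neg)].
    g = -sigmoid(-margin) * (X[i] - X[j])
    w -= lr * g

scores = X @ w
auc = np.mean([scores[i] > scores[j] for i in pos for j in neg])
print("empirical AUC:", round(float(auc), 3))
```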
【Paper Link】 【Pages】:2561-2567
【Authors】: Hu Ding ; Jinhui Xu
【Abstract】: Support Vector Machine (SVM) is a fundamental technique in machine learning. A long-standing challenge facing SVM is how to deal with outliers (caused by mislabeling), as they could make the classes in SVM non-separable. Existing techniques, such as soft margin SVM, ν-SVM, and Core-SVM, can alleviate the problem to a certain extent, but cannot completely resolve the issue. Recently, techniques have also become available for explicit outlier removal, but they suffer from high time complexity and cannot guarantee the quality of the solution. In this paper, we present a new combinatorial approach, called Random Gradient Descent Tree (or RGD-tree), to explicitly deal with outliers; this results in a new algorithm called RGD-SVM. Our technique yields provably good solutions and can be efficiently implemented for practical purposes. The time and space complexities of our approach depend only linearly on the input size and the dimensionality of the space, which is significantly better than those of existing methods. Experiments on benchmark datasets suggest that our technique considerably outperforms several popular techniques in most cases.
【Keywords】: SVM; Outliers; Robust algorithms; Random sampling; Gradient Descent; Boosting
【Paper Link】 【Pages】:2568-2574
【Authors】: Yi Ding ; Peilin Zhao ; Steven C. H. Hoi ; Yew-Soon Ong
【Abstract】: Learning to maximize AUC performance is an important research problem in machine learning. Unlike traditional batch learning methods for maximizing AUC, which often suffer from poor scalability, recent years have witnessed some emerging studies that attempt to maximize AUC by single-pass online learning approaches. Despite the encouraging results reported, the existing online AUC maximization algorithms often adopt simple stochastic gradient descent approaches, which fail to exploit the geometry of the data observed in the online learning process, and thus could suffer from relatively slow convergence. To overcome the limitation of the existing studies, in this paper, we propose a novel algorithm of Adaptive Online AUC Maximization (AdaOAM), which applies an adaptive gradient method that exploits the knowledge of historical gradients to perform more informative online learning. The new adaptive updating strategy of AdaOAM is less sensitive to parameter settings due to its natural effect of tuning the learning rate. In addition, the time complexity of the new algorithm remains the same as that of the previous non-adaptive algorithms. To demonstrate the effectiveness of the proposed algorithm, we analyze its theoretical bound, and further evaluate its empirical performance on both public benchmark datasets and anomaly detection datasets. The encouraging empirical results clearly show the effectiveness and efficiency of the proposed algorithm.
【Keywords】:
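The adaptive-gradient mechanic that AdaOAM applies to AUC maximization is the familiar per-coordinate AdaGrad-style update; a generic sketch follows. Combining it with a pairwise AUC loss such as the one sketched after the previous entry gives the flavor of the method; the learning rate and gradients below are arbitrary.

```python
import numpy as np

def adagrad_update(w, g, h, lr=0.5, eps=1e-8):
    """Per-coordinate adaptive step: accumulate squared gradients in h."""
    h += g * g
    w -= lr * g / (np.sqrt(h) + eps)
    return w, h

# Usage on an arbitrary stream of gradients: frequently updated coordinates
# automatically receive smaller effective learning rates.
w, h = np.zeros(3), np.zeros(3)
for g in [np.array([1.0, 0.1, 0.0]), np.array([0.5, 0.1, 0.2])]:
    w, h = adagrad_update(w, g, h)
print(w)
```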
【Paper Link】 【Pages】:2575-2581
【Authors】: Finale Doshi-Velez ; Byron C. Wallace ; Ryan Adams
【Abstract】: Topic modeling is a powerful tool for uncovering latent structure in many domains, including medicine, finance, and vision. The goals for the model vary depending on the application: sometimes the discovered topics are used for prediction or another downstream task. In other cases, the content of the topic may be of intrinsic scientific interest. Unfortunately, even when one uses modern sparse techniques, discovered topics are often difficult to interpret due to the high dimensionality of the underlying space. To improve topic interpretability, we introduce Graph-Sparse LDA, a hierarchical topic model that uses knowledge of relationships between words (e.g., as encoded by an ontology). In our model, topics are summarized by a few latent concept-words from the underlying graph that explain the observed words. Graph-Sparse LDA recovers sparse, interpretable summaries on two real-world biomedical datasets while matching state-of-the-art prediction performance.
【Keywords】:
【Paper Link】 【Pages】:2582-2588
【Authors】: Changying Du ; Shandian Zhe ; Fuzhen Zhuang ; Yuan Qi ; Qing He ; Zhongzhi Shi
【Abstract】: Supervised dimensionality reduction has shown great advantages in finding predictive subspaces. Previous methods rarely consider the popular maximum margin principle and are prone to overfitting to usually small training data, especially for those under the maximum likelihood framework. In this paper, we present a posterior-regularized Bayesian approach to combine Principal Component Analysis (PCA) with the max-margin learning. Based on the data augmentation idea for max-margin learning and the probabilistic interpretation of PCA, our method can automatically infer the weight and penalty parameter of max-margin learning machine, while finding the most appropriate PCA subspace simultaneously under the Bayesian framework. We develop a fast mean-field variational inference algorithm to approximate the posterior. Experimental results on various classification tasks show that our method outperforms a number of competitors.
【Keywords】: Supervised dimensionality reduction; Principal Component Analysis (PCA); Maximum margin principle; Variational Bayesian
【Paper Link】 【Pages】:2589-2595
【Authors】: Jun Du ; Zhihua Cai
【Abstract】: In classification problem, we assume that the samples around the class boundary are more likely to be incorrectly annotated than others, and propose boundary-conditional class noise (BCN). Based on the BCN assumption, we use unnormalized Gaussian and Laplace distributions to directly model how class noise is generated, in symmetric and asymmetric cases. In addition, we demonstrate that Logistic regression and Probit regression can also be reinterpreted from this class noise perspective, and compare them with the proposed models. The empirical study shows that, the proposed asymmetric models overall outperform the benchmark linear models, and the asymmetric Laplace-noise model achieves the best performance among all.
【Keywords】:
【Paper Link】 【Pages】:2596-2602
【Authors】: Zhouyu Fu ; Feifei Pan ; Cheng Deng ; Wei Liu
【Abstract】: Multiple-Instance (MI) learning is an important supervised learning technique which deals with collections of instances called bags. While existing research in MI learning has mainly focused on classification, in this paper we propose a new approach for MI retrieval to enable effective similarity retrieval of bags of instances, where training data is presented in the form of similar and dissimilar bag pairs. An embedding scheme is devised to encode each bag into a single bag feature vector by exploiting a similarity-based transformation. In this way, the original MI problem is converted into a single-instance version. Furthermore, we develop a principled approach for optimizing bag features specific to similarity retrieval by leveraging pairwise label information at the bag level. The experimental results demonstrate the effectiveness of the proposed approach in comparison with the alternatives for MI retrieval.
【Keywords】:
【Paper Link】 【Pages】:2603-2609
【Authors】: Longwen Gao ; Shuigeng Zhou
【Abstract】: Group sparsity has drawn much attention in machine learning. However, existing work can handle only datasets with certain group structures, where each sample has a certain membership with one or more groups. This paper investigates the learning of sparse representations from datasets with uncertain group structures, where each sample has an uncertain membership with all groups in terms of a probability distribution. We call this problem uncertain group sparse representation (UGSR for short), which is a generalization of the standard group sparse representation (GSR). We formulate the UGSR model and propose an efficient algorithm to solve this problem. We apply UGSR to text emotion classification and aging face recognition. Experiments show that UGSR outperforms standard sparse representation (SR) and standard GSR as well as fuzzy kNN classification.
【Keywords】: Sparse representation; Group sparsity; Uncertainty
【Paper Link】 【Pages】:2610-2616
【Authors】: Debarghya Ghoshdastidar ; Ambedkar Dukkipati
【Abstract】: Spectral clustering, a graph partitioning technique, has gained immense popularity in machine learning in the context of unsupervised learning. This is due to convincing empirical studies, the elegant approaches involved and the theoretical guarantees provided in the literature. To tackle some challenging problems that have arisen in computer vision and elsewhere, a need has recently surfaced to develop spectral methods that incorporate multi-way similarity measures. This, in turn, leads to a hypergraph partitioning problem. In this paper, we formulate a criterion for partitioning uniform hypergraphs, and show that a relaxation of this problem is related to the multilinear singular value decomposition (SVD) of symmetric tensors. Using this, we provide a spectral technique for clustering based on higher order affinities, and derive a theoretical bound on the error incurred by this method. We also study the complexity of the algorithm and use Nyström's method and column sampling techniques to develop approximate methods with significantly reduced complexity. Experiments on geometric grouping and motion segmentation demonstrate the practical significance of the proposed methods.
【Keywords】: Clustering, Hypergraph partitioning, Multilinear SVD
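A toy rendering of the tensor-spectral recipe: build a third-order affinity tensor over points, unfold it, take the leading singular vectors, and cluster the rows. The affinity function, sizes and data are invented, and the paper's relaxation and sampling schemes are more careful than this.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(5)

# Toy 1-D points in two groups; third-order affinity measures how close a
# triple of points is (purely illustrative choice of affinity).
x = np.concatenate([rng.normal(0, 0.1, 10), rng.normal(5, 0.1, 10)])
n = len(x)
T = np.zeros((n, n, n))
for i in range(n):
    for j in range(n):
        for k in range(n):
            T[i, j, k] = np.exp(-(x[i] - x[j]) ** 2 - (x[j] - x[k]) ** 2)

# Multilinear-SVD-style step: unfold the tensor along mode 1, take leading
# left singular vectors, then cluster the rows.
unfolded = T.reshape(n, n * n)
U, _, _ = np.linalg.svd(unfolded, full_matrices=False)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(U[:, :2])
print(labels)
```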
【Paper Link】 【Pages】:2617-2623
【Authors】: Maxim Grechkin ; Maryam Fazel ; Daniela M. Witten ; Su-In Lee
【Abstract】: Graphical models provide a rich framework for summarizing the dependencies among variables. The graphical lasso approach attempts to learn the structure of a Gaussian graphical model (GGM) by maximizing the log likelihood of the data, subject to an l1 penalty on the elements of the inverse covariance matrix. Most algorithms for solving the graphical lasso problem do not scale to a very large number of variables. Furthermore, the learned network structure is hard to interpret. To overcome these challenges, we propose a novel GGM structure learning method that exploits the fact that for many real-world problems we have prior knowledge that certain edges are unlikely to be present. For example, in gene regulatory networks, a pair of genes that does not participate together in any of the cellular processes, typically referred to as pathways, is less likely to be connected. In computer vision applications in which each variable corresponds to a pixel, each variable is likely to be connected to the nearby variables. In this paper, we propose the pathway graphical lasso, which learns the structure of a GGM subject to pathway-based constraints. In order to solve this problem, we decompose the network into smaller parts, and use a message-passing algorithm in order to communicate among the subnetworks. Our algorithm has orders of magnitude improvement in run time compared to the state-of-the-art optimization methods for the graphical lasso problem that were modified to handle pathway-based constraints.
【Keywords】: structure learning, graphical models, computational biology, model decomposition
【Paper Link】 【Pages】:2624-2630
【Authors】: Zhaohan Guo ; Emma Brunskill
【Abstract】: In many real-world situations a decision maker may make decisions across many separate reinforcement learning tasks in parallel, yet there has been very little work on concurrent RL. Building on the efficient exploration RL literature, we introduce two new concurrent RL algorithms and bound their sample complexity. We show that under some mild conditions, both when the agent is known to be acting in many copies of the same MDP, and when they are not the same but are taken from a finite set, we can gain linear improvements in the sample complexity over not sharing information. This is quite exciting as a linear speedup is the most one might hope to gain. Our preliminary experiments confirm this result and show empirical benefits.
【Keywords】: Reinforcement Learning
【Paper Link】 【Pages】:2631-2637
【Abstract】: Feature grouping has been demonstrated to be promising in learning with high-dimensional data. It helps reduce the variances in the estimation and improves the stability of feature selection. One major limitation of existing feature grouping approaches is that some similar but different feature groups are often mis-fused, leading to impaired performance. In this paper, we propose a Discriminative Feature Grouping (DFG) method to discover feature groups with enhanced discrimination. Different from existing methods, DFG adopts a novel regularizer for the feature coefficients to trade off between fusing and discriminating feature groups. The proposed regularizer consists of an $\ell_1$ norm to enforce feature sparsity and a pairwise $\ell_\infty$ norm to encourage the absolute differences among any three feature coefficients to be similar. To achieve better asymptotic properties, we generalize the proposed regularizer to an adaptive one in which the feature coefficients are weighted based on the solution of some estimator with root-n consistency. For optimization, we employ the alternating direction method of multipliers to solve the proposed methods efficiently. Experimental results on synthetic and real-world datasets demonstrate that the proposed methods have good performance compared with the state-of-the-art feature grouping methods.
【Keywords】: Feature Selection; Feature Grouping
【Paper Link】 【Pages】:2638-2644
【Abstract】: In multi-task learning (MTL), multiple related tasks are learned jointly by sharing information across them. Many MTL algorithms have been proposed to learn the underlying task groups. However, those methods are limited to learning the task groups at only a single level, which may not be sufficient for modeling the complex structure among tasks in many real-world applications. In this paper, we propose a Multi-Level Task Grouping (MeTaG) method to learn a multi-level grouping structure instead of only one level among tasks. Specifically, assuming the number of levels to be H, we decompose the parameter matrix into a sum of H component matrices, each of which is regularized with an $\ell_2$ norm on the pairwise difference among parameters of all the tasks to construct level-specific task groups. For optimization, we employ the smoothing proximal gradient method to efficiently solve the objective function of the MeTaG model. Moreover, we provide theoretical analysis to show that under certain conditions the MeTaG model can recover the true parameter matrix and the true task groups in each level with high probability. We experiment with our approach on both synthetic and real-world datasets, showing competitive performance over state-of-the-art MTL methods.
【Keywords】: Multi-Task Learning; Task Grouping; Multi-Level
【Paper Link】 【Pages】:2645-2651
【Authors】: Kazuo Hara ; Ikumi Suzuki ; Masashi Shimbo ; Kei Kobayashi ; Kenji Fukumizu ; Milos Radovanovic
【Abstract】: Hubness has been recently identified as a problematic phenomenon occurring in high-dimensional space. In this paper, we address a different type of hubness that occurs when the number of samples is large. We investigate the difference between the hubness in high-dimensional data and the one in large-sample data. One finding is that centering, which is known to reduce the former, does not work for the latter. We then propose a new hub-reduction method, called localized centering. It is an extension of centering, yet works effectively for both types of hubness. Using real-world datasets consisting of a large number of documents, we demonstrate that the proposed method improves the accuracy of k-nearest neighbor classification.
【Keywords】: Hubness; Centering; k nearest neighbor method
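One plausible reading of the localized-centering correction is sketched below: discount a candidate neighbor's similarity by its similarity to its own local centroid, which is where hubs tend to concentrate. The exact definition in the paper may differ; this form, and the data, are assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.normal(size=(50, 8))
X /= np.linalg.norm(X, axis=1, keepdims=True)   # cosine similarity = dot product

S = X @ X.T
k = 10

# Local centroid of each point: mean of its k nearest neighbors (by similarity).
local_cent = np.empty_like(X)
for i in range(len(X)):
    nn = np.argsort(-S[i])[1:k + 1]             # skip the point itself
    local_cent[i] = X[nn].mean(axis=0)

# Localized centering: penalize candidates that are close to their own
# local centroid (illustrative form of the correction).
penalty = np.sum(X * local_cent, axis=1)
S_lc = S - penalty[None, :]

query = 0
print(np.argsort(-S_lc[query])[1:6])            # hub-corrected neighbors of point 0
```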
【Paper Link】 【Pages】:2652-2658
【Authors】: Anna Harutyunyan ; Sam Devlin ; Peter Vrancx ; Ann Nowé
【Abstract】: Effectively incorporating external advice is an important problem in reinforcement learning, especially as it moves into the real world. Potential-based reward shaping is a way to provide the agent with a specific form of additional reward, with the guarantee of policy invariance. In this work we give a novel way to incorporate an arbitrary reward function with the same guarantee, by implicitly translating it into the specific form of dynamic advice potentials, which are maintained as an auxiliary value function learnt at the same time. We show that advice provided in this way captures the input reward function in expectation, and demonstrate its efficacy empirically.
【Keywords】:
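The shaping mechanism referenced above, in its basic potential-based form: the extra reward for a transition is F(s, s') = gamma * Phi(s') - Phi(s), which preserves optimal policies. In the paper, Phi is a dynamic advice potential maintained as an auxiliary value function learned online; the static, hand-written potential below is only for illustration.

```python
# Potential-based shaping: the extra reward for a transition s -> s' is
# F(s, s') = gamma * phi(s') - phi(s), which preserves the optimal policy.
gamma = 0.99

def phi(state):
    # Illustrative potential: negative distance to a goal located at 10.
    return -abs(10 - state)

def shaped_reward(r, s, s_next):
    return r + gamma * phi(s_next) - phi(s)

print(shaped_reward(0.0, s=3, s_next=4))   # moving toward the goal adds reward
```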
【Paper Link】 【Pages】:2659-2665
【Authors】: Wei-Ning Hsu ; Hsuan-Tien Lin
【Abstract】: Pool-based active learning is an important technique that helps reduce labeling efforts within a pool of unlabeled instances. Currently, most pool-based active learning strategies are constructed based on some human-designed philosophy; that is, they reflect what human beings assume to be “good labeling questions.” However, while such human-designed philosophies can be useful on specific data sets, it is often difficult to establish the theoretical connection of those philosophies to the true learning performance of interest. In addition, given that a single human-designed philosophy is unlikely to work on all scenarios, choosing and blending those strategies under different scenarios is an important but challenging practical task. This paper tackles this task by letting the machines adaptively “learn” from the performance of a set of given strategies on a particular data set. More specifically, we design a learning algorithm that connects active learning with the well-known multi-armed bandit problem. Further, we postulate that, given an appropriate choice for the multi-armed bandit learner, it is possible to estimate the performance of different strategies on the fly. Extensive empirical studies of the resulting ALBL algorithm confirm that it performs better than state-of-the-art strategies and a leading blending algorithm for active learning, all of which are based on human-designed philosophy.
【Keywords】: active learning; machine learning
【Paper Link】 【Pages】:2666-2672
【Authors】: Junjie Hu ; Haiqin Yang ; Irwin King ; Michael R. Lyu ; Anthony Man-Cho So
【Abstract】: Online learning from imbalanced streaming data to capture the nonlinearity and heterogeneity of the data is significant in machine learning and data mining. To tackle this problem, we propose a kernelized online imbalanced learning (KOIL) algorithm to directly maximize the area under the ROC curve (AUC). We address two more challenges: 1) How to control the number of support vectors without sacrificing model performance; and 2) how to restrict the fluctuation of the learned decision function to attain smooth updating. To this end, we introduce two buffers with fixed budgets (buffer sizes) for positive class and negative class, respectively, to store the learned support vectors, which can allow us to capture the global information of the decision boundary. When determining the weight of a new support vector, we confine its influence only to its $k$-nearest opposite support vectors. This can restrict the effect of new instances and prevent the harm of outliers. More importantly, we design a sophisticated scheme to compensate the model after replacement is conducted when either buffer is full. With this compensation, the learned model approaches the one learned with infinite budgets. We present both theoretical analysis and extensive experimental comparison to demonstrate the effectiveness of our proposed KOIL.
【Keywords】:
【Paper Link】 【Pages】:2673-2679
【Authors】: De-An Huang ; Amir Massoud Farahmand ; Kris M. Kitani ; James Andrew Bagnell
【Abstract】: Maximum entropy inverse optimal control (MaxEnt IOC) is an effective means of discovering the underlying cost function of demonstrated human activity and can be used to predict human behavior over low-dimensional state spaces (i.e., forecasting of 2D trajectories). To enable inference in very large state spaces, we introduce an approximate MaxEnt IOC procedure to address the fundamental computational bottleneck stemming from calculating the partition function via dynamic programming. Approximate MaxEnt IOC is based on two components: approximate dynamic programming and Monte Carlo sampling. We analyze this approximation approach and provide a finite-sample error upper bound on its excess loss. We validate the proposed method in the context of analyzing dual-agent interactions from video, where we use approximate MaxEnt IOC to simulate mental images of a single agent's body pose sequence (a high-dimensional image space). We experiment with sequences of image data taken from RGB and RGBD data and show that it is possible to learn cost functions that lead to accurate predictions in high-dimensional problems that were previously intractable.
【Keywords】:
【Paper Link】 【Pages】:2680-2686
【Authors】: Gao Huang ; Jianwen Zhang ; Shiji Song ; Zheng Chen
【Abstract】: This paper proposes a new approach for discriminative clustering. The intuition is, for a good clustering, one should be able to learn a classifier from the clustering labels with high generalization accuracy. Thus we define a novel metric to evaluate the quality of a clustering labeling, named Minimum Separation Probability (MSP), which is a lower bound of the generalization accuracy of a classifier learnt from the clustering labeling. We take MSP as the objective to maximize and propose our approach Maximin Separation Probability Clustering (MSPC), which has several attractive properties, such as invariance to anisotropic feature scaling and intuitive probabilistic explanation for clustering quality. We present three efficient optimization strategies for MSPC, and analyze their interesting connections to existing clustering approaches, such as maximum margin clustering (MMC) and discriminative k-means. Empirical results on real world data sets verify that MSP is a robust and effective clustering quality measure. It is also shown that the proposed algorithms compare favorably to state-of-the-art clustering algorithms in both accuracy and efficiency.
【Keywords】: discriminative clustering, minimax probability machine
【Paper Link】 【Pages】:2687-2693
【Authors】: Rui Huang ; Fengyuan Zhu ; Pheng-Ann Heng
【Abstract】: We develop the Dynamic Chinese Restaurant Process (DCRP), which incorporates a time-evolutionary feature into dependent Dirichlet Process mixture models. This model can capture the dynamic change of mixture components, allowing clusters to emerge, vanish and vary over time. All these macroscopic changes are controlled by tracing the birth and death of every single element. We investigate the properties of the dependent Dirichlet Process mixture model based on DCRP and develop a corresponding Gibbs sampler for posterior inference. We also conduct simulation and empirical studies to compare this model with the traditional CRP and related models. The results show that this model can provide better results for sequential data, especially for data with heterogeneous lifetime distributions.
【Keywords】: Dynamic Chinese Restaurant Process; Birth-and-Death Process; Nonparametric Bayesian; Evolutionary Clustering
【Paper Link】 【Pages】:2694-2700
【Authors】: Lu Jiang ; Deyu Meng ; Qian Zhao ; Shiguang Shan ; Alexander G. Hauptmann
【Abstract】: Curriculum learning (CL) or self-paced learning (SPL) represents a recently proposed learning regime inspired by the learning process of humans and animals, in which training gradually proceeds from easy to more complex samples. The two methods share a similar conceptual learning paradigm but differ in their specific learning schemes. In CL, the curriculum is predetermined by prior knowledge and remains fixed thereafter. Therefore, this type of method heavily relies on the quality of prior knowledge while ignoring feedback about the learner. In SPL, the curriculum is dynamically determined to adjust to the learning pace of the learner. However, SPL is unable to deal with prior knowledge, rendering it prone to overfitting. In this paper, we discover the missing link between CL and SPL, and propose a unified framework named self-paced curriculum learning (SPCL). SPCL is formulated as a concise optimization problem that takes into account both prior knowledge known before training and the learning progress during training. In comparison to human education, SPCL is analogous to an "instructor-student-collaborative" learning mode, as opposed to the "instructor-driven" mode in CL or the "student-driven" mode in SPL. Empirically, we show the advantage of SPCL on two tasks.
【Keywords】: Self-paced Learning; Curriculum Learning; Prior Knowledge
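A minimal self-paced loop under the standard hard-threshold weighting: admit only samples whose current loss is below a pace parameter, refit on them, then relax the pace. SPCL additionally constrains the sample weights with a curriculum region derived from prior knowledge; the data and schedule below are invented.

```python
import numpy as np

rng = np.random.default_rng(7)
n, d = 200, 3
X = rng.normal(size=(n, d))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + rng.normal(scale=0.1, size=n)
y[:20] += rng.normal(scale=5.0, size=20)        # a few hard/noisy samples

w, lam = np.zeros(d), 1.0
for it in range(10):
    # Self-paced step: admit only samples whose current loss is below lambda.
    losses = (X @ w - y) ** 2
    v = (losses < lam).astype(float)
    # Model step: least squares on the admitted (easy) samples only.
    w = np.linalg.lstsq(X * v[:, None], y * v, rcond=None)[0]
    lam *= 1.5                                  # grow the pace: admit harder samples
print(w)
```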
【Paper Link】 【Pages】:2701-2707
【Authors】: Itamar Katz ; Koby Crammer
【Abstract】: We derive a convex optimization problem for the task of segmenting sequential data, which explicitly treats the presence of outliers. We describe two algorithms for solving this problem, one exact and one a novel top-down approach, and we derive consistency results for the case of two segments and no outliers. Robustness to outliers is evaluated on two real-world tasks related to speech segmentation. Our algorithms outperform baseline segmentation algorithms.
【Keywords】: Segmentation, Unsupervised learning, Outlier robustness; Time series
【Paper Link】 【Pages】:2708-2714
【Authors】: Nathaniel Korda ; Prashanth L. A. ; Rémi Munos
【Abstract】: Online learning algorithms often need to recompute least-squares regression estimates of parameters. We study improving the computational complexity of such algorithms by using stochastic gradient descent (SGD) type schemes in place of classic regression solvers. We show that SGD schemes efficiently track the true solutions of the regression problems, even in the presence of a drift. This finding, coupled with an $O(d)$ improvement in complexity, where $d$ is the dimension of the data, makes them attractive for implementation in \textit{big data} settings. In the case when strong convexity of the regression problem is guaranteed, we provide bounds on the error both in expectation and in high probability (the latter is often needed to provide theoretical guarantees for higher-level algorithms), despite the drifting least-squares solution. As an example of this case, we prove that the regret performance of an SGD version of the PEGE linear bandit algorithm is worse than that of PEGE itself only by a factor of $O(\log^4 n)$. When strong convexity of the regression problem cannot be guaranteed, we investigate using an adaptive regularisation. We make an empirical study of an adaptively regularised, SGD version of LinUCB in a news article recommendation application, which uses the large-scale news recommendation dataset from the Yahoo! front page. These experiments show a large gain in computational complexity and a consistently low tracking error.
【Keywords】: Stochastic Gradient Descent; Drifting; Linear Bandits; Least Squares Regression
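A minimal sketch of the core idea: replacing repeated least-squares solves with $O(d)$ SGD updates that track a drifting solution. The drift magnitude, step size, and noise level below are arbitrary toy choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, T, step = 10, 5000, 0.01
w_true = rng.normal(size=d)                  # slowly drifting target parameter
w_sgd = np.zeros(d)

for t in range(T):
    w_true += 0.001 * rng.normal(size=d)     # drift in the true LS solution
    x = rng.normal(size=d)
    y = x @ w_true + 0.1 * rng.normal()
    # one O(d) SGD step on the squared loss, in place of a full regression solve
    w_sgd -= step * (x @ w_sgd - y) * x

print(np.linalg.norm(w_sgd - w_true))        # tracking error stays small
```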
【Paper Link】 【Pages】:2715-2721
【Authors】: Alex Kulesza ; Nan Jiang ; Satinder P. Singh
【Abstract】: Predictive state representations (PSRs) are models of dynamical systems that represent state as a vector of predictions about future observable events (tests) conditioned on past observed events (histories). If a practitioner selects finite sets of tests and histories that are known to be sufficient to completely capture the system, an exact PSR can be learned in polynomial time using spectral methods. However, most real-world systems are complex, and in practice computational constraints limit us to small sets of tests and histories which are therefore never truly sufficient. How, then, should we choose these sets? Existing theory offers little guidance here, and yet we show that the choice is highly consequential -- tests and histories selected at random or by a naive rule significantly underperform the best sets. In this paper we approach the problem both theoretically and empirically. While any fixed system can be represented by an infinite number of equivalent but distinct PSRs, we show that in the computationally unconstrained setting, where existing theory guarantees accurate predictions, the PSRs learned by spectral methods always satisfy a particular spectral bound. Adapting this idea, we propose a simple algorithmic technique to search for sets of tests and histories that approximately satisfy the bound while respecting computational limits. Empirically, our method significantly reduces prediction errors compared to standard spectral learning approaches.
【Keywords】:
【Paper Link】 【Pages】:2722-2728
【Authors】: Chandrashekar Lakshminarayanan ; Shalabh Bhatnagar
【Abstract】: Markov decision processes (MDPs) with a large number of states are of high practical interest. However, conventional algorithms for solving MDPs are computationally infeasible in this scenario. Approximate dynamic programming (ADP) methods tackle this issue by computing approximate solutions. A widely applied ADP method is the approximate linear program (ALP), which makes use of linear function approximation and offers theoretical performance guarantees. Nevertheless, the ALP is difficult to solve due to the presence of a large number of constraints, and in practice a reduced linear program (RLP) is solved instead. The RLP has a tractable number of constraints sampled from the original constraints of the ALP. Though the RLP is known to perform well in experiments, theoretical guarantees are available only for a specific RLP obtained under idealized assumptions. In this paper, we generalize the RLP to define a generalized reduced linear program (GRLP) which has a tractable number of constraints that are obtained as positive linear combinations of the original constraints of the ALP. The main contribution of this paper is the novel theoretical framework developed to obtain error bounds for any given GRLP. Central to our framework are two max-norm contraction operators. Our result theoretically justifies linear approximation of constraints. We discuss the implications of our results in the contexts of ADP and reinforcement learning. We also demonstrate via an example in the domain of controlled queues that the experiments conform to the theory.
【Keywords】:
【Paper Link】 【Pages】:2729-2735
【Authors】: Johannes Lederer ; Christian Müller
【Abstract】: Lasso is a popular method for high-dimensional variable selection, but it hinges on a tuning parameter that is difficult to calibrate in practice. In this study, we introduce TREX, an alternative to Lasso with an inherent calibration to all aspects of the model. This adaptation to the entire model renders TREX an estimator that does not require any calibration of tuning parameters. We show that TREX can outperform cross-validated Lasso in terms of variable selection and computational efficiency. We also introduce a bootstrapped version of TREX that can further improve variable selection. We illustrate the promising performance of TREX both on synthetic data and on two biological data sets from the fields of genomics and proteomics.
【Keywords】: tuning parameter; variable selection; Lasso; high-dimensional regression
【Paper Link】 【Pages】:2736-2742
【Authors】: Kibok Lee ; Junmo Kim
【Abstract】: Linear discriminant analysis (LDA) is a popular dimensionality reduction and classification method that simultaneously maximizes between-class scatter and minimizes within-class scatter. In this paper, we verify the equivalence of LDA and least squares (LS) with a set of dependent variable matrices. The equivalence is in the sense that the LDA solution matrix and the LS solution matrix have the same range. The resulting LS provides an intuitive interpretation in which its solution performs data clustering according to class labels. Further, the fact that LDA and LS have the same range allows us to design a two-stage algorithm that computes the LDA solution given by generalized eigenvalue decomposition (GEVD), much faster than computing the original GEVD. Experimental results demonstrate the equivalence of the LDA solution and the proposed LS solution.
【Keywords】: linear discriminant analysis; least squares; dimensionality reduction
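The claimed range equivalence is easy to check numerically in the two-class case, where least squares on centered class-indicator targets recovers the LDA direction up to scale. The simple indicator coding below is an assumption for illustration; the paper's construction uses a set of dependent variable matrices to cover the multi-class case.

```python
import numpy as np

rng = np.random.default_rng(0)
X0 = rng.normal(0, 1, (100, 3)); X1 = rng.normal(2, 1, (120, 3))
X = np.vstack([X0, X1]); y = np.r_[np.zeros(100), np.ones(120)]

# LDA direction: Sw^{-1} (m1 - m0), with Sw the within-class scatter
m0, m1 = X0.mean(0), X1.mean(0)
Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
w_lda = np.linalg.solve(Sw, m1 - m0)

# Least squares on centered data with centered class-indicator targets
Xc = X - X.mean(0); t = y - y.mean()
w_ls, *_ = np.linalg.lstsq(Xc, t, rcond=None)

# The two directions coincide up to scale, i.e. they span the same range
cos = np.abs(w_lda @ w_ls) / (np.linalg.norm(w_lda) * np.linalg.norm(w_ls))
print(cos)  # ~1.0
```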
【Paper Link】 【Pages】:2743-2749
【Authors】: Chao Li ; Qibin Zhao ; Junhua Li ; Andrzej Cichocki ; Lili Guo
【Abstract】: In multi-data learning, it is usually assumed that common latent factors exist among multi-datasets, but this assumption may lead to deteriorated performance when datasets are heterogeneous and unbalanced. In this paper, we propose a novel common structure for multi-data learning. Instead of common latent factors, we assume that datasets share a Common Adjacency Graph (CAG) structure, which is more robust to heterogeneity and unbalance of datasets. Furthermore, we utilize the CAG structure to develop a new method for multi-tensor completion, which exploits the common structure in datasets to improve the completion performance. Numerical results demonstrate that the proposed method not only outperforms state-of-the-art methods for video in-painting, but can also recover missing data well even in cases where conventional methods are not applicable.
【Keywords】: Tensor Completion; Multi-task Learning; Common Structure
【Paper Link】 【Pages】:2750-2756
【Authors】: Yeqing Li ; Feiping Nie ; Heng Huang ; Junzhou Huang
【Abstract】: In this paper, we address the problem of large-scale multi-view spectral clustering. In many real-world applications, data can be represented in various heterogeneous features or views. Different views often provide different aspects of information that are complementary to each other. Several previous methods of clustering have demonstrated that better accuracy can be achieved using integrated information of all the views than just using each view individually. One important class of such methods is multi-view spectral clustering, which is based on the graph Laplacian. However, existing methods are not applicable to large-scale problems due to their high computational complexity. To this end, we propose a novel large-scale multi-view spectral clustering approach based on the bipartite graph. Our method uses local manifold fusion to integrate heterogeneous features. To improve efficiency, we approximate the similarity graphs using bipartite graphs. Furthermore, we show that our method can be easily extended to handle the out-of-sample problem. Extensive experimental results on five benchmark datasets demonstrate the effectiveness and efficiency of the proposed method, where our method runs up to nearly 3000 times faster than the state-of-the-art methods.
【Keywords】: Large-Scale; Multi-view; Spectral Clustering; Bipartite Graph
【Paper Link】 【Pages】:2757-2763
【Authors】: Wenzhao Lian ; Piyush Rai ; Esther Salazar ; Lawrence Carin
【Abstract】: We present a probabilistic framework for learning with heterogeneous multiview data where some views are given as ordinal, binary, or real-valued feature matrices, and some views as similarity matrices. Our framework has the following distinguishing aspects: (i) a unified latent factor model for integrating information from diverse feature (ordinal, binary, real) and similarity based views, and predicting the missing data in each view, leveraging view correlations; (ii) seamless adaptation to binary/multiclass classification where data consists of multiple feature and/or similarity-based views; and (iii) an efficient, variational inference algorithm which is especially flexible in modeling the views with ordinal-valued data (by learning the cutpoints for the ordinal data), and extends naturally to streaming data settings. Our framework subsumes methods such as multiview learning and multiple kernel learning as special cases. We demonstrate the effectiveness of our framework on several real-world and benchmark datasets.
【Keywords】:
【Paper Link】 【Pages】:2764-2770
【Authors】: Anqi Liu ; Lev Reyzin ; Brian D. Ziebart
【Abstract】: Existing approaches to active learning are generally optimistic about their certainty with respect to data shift between labeled and unlabeled data. They assume that unknown datapoint labels follow the inductive biases of the active learner. As a result, the most useful datapoint labels—ones that refute current inductive biases—are rarely solicited. We propose a shift-pessimistic approach to active learning that assumes the worst-case about the unknown conditional label distribution. This closely aligns model uncertainty with generalization error, enabling more useful label solicitation. We investigate the theoretical benefits of this approach and demonstrate its empirical advantages on probabilistic binary classification tasks.
【Keywords】: Active Learning; Covariate Shift; Robust Classification
【Paper Link】 【Pages】:2771-2777
【Authors】: April H. Liu ; Leonard K. M. Poon ; Nevin Lianwen Zhang
【Abstract】: This paper is concerned with model-based clustering of discrete data. Latent class models (LCMs) are usually used for the task. An LCM consists of a latent variable and a number of attributes. It makes the overly restrictive assumption that the attributes are mutually independent given the latent variable. We propose a novel method to relax the assumption. The key idea is to partition the attributes into groups such that correlations among the attributes in each group can be properly modeled by using one single latent variable. The latent variables for the attribute groups are then used to build a number of models and one of them is chosen to produce the clustering results. Extensive empirical studies have been conducted to compare the new method with LCM and several other methods (K-means, kernel K-means and spectral clustering) that are not model-based. The new method outperforms the alternative methods in most cases and the differences are often large.
【Keywords】: Clustering; Latent Tree Models; Local Independence
【Paper Link】 【Pages】:2778-2784
【Authors】: Meng Liu ; Yong Luo ; Dacheng Tao ; Chao Xu ; Yonggang Wen
【Abstract】: Multi-label image classification is of significant interest due to its major role in real-world web image analysis applications such as large-scale image retrieval and browsing. Recently, matrix completion (MC) has been developed to deal with multi-label classification tasks. MC has distinct advantages, such as robustness to missing entries in the feature and label spaces and a natural ability to handle multi-label problems. However, current MC-based multi-label image classification methods only consider data represented by a single-view feature and therefore do not precisely characterize images that contain several semantic concepts. An intuitive way to utilize multiple features taken from different views is to concatenate the different features into a long vector; however, this concatenation is prone to over-fitting and leads to high time complexity in MC-based image classification. Therefore, we present a novel multi-view learning model for MC-based image classification, called low-rank multi-view matrix completion (lrMMC), which first seeks a low-dimensional common representation of all views by utilizing the proposed low-rank multi-view learning (lrMVL) algorithm. In lrMVL, the common subspace is constrained to be low rank so that it is suitable for MC. In addition, combination weights are learned to explore complementarity between different views. An efficient solver based on fixed-point continuation (FPC) is developed for optimization, and the learned low-rank representation is then incorporated into MC-based image classification. Extensive experimentation on the challenging PASCAL VOC'07 dataset demonstrates the superiority of lrMMC compared to other multi-label image classification approaches.
【Keywords】: multi-view; low-rank; matrix completion; image classification; multi-label
【Paper Link】 【Pages】:2785-2791
【Authors】: Song Liu ; Taiji Suzuki ; Masashi Sugiyama
【Abstract】: We study the problem of learning sparse structure changes between two Markov networks P and Q. Rather than fitting two Markov networks separately to two sets of data and figuring out their differences, a recent work proposed to learn changes directly via estimating the ratio between two Markov network models. Such a direct approach was demonstrated to perform excellently in experiments, although its theoretical properties remained unexplored. In this paper, we give sufficient conditions for successful change detection with respect to the sample sizes $n_p$ and $n_q$, the dimension of data $m$, and the number of changed edges $d$.
【Keywords】: Markov Network; Change Detection; Density Ratio Estimation
【Paper Link】 【Pages】:2792-2799
【Authors】: Wei Liu ; Cun Mu ; Rongrong Ji ; Shiqian Ma ; John R. Smith ; Shih-Fu Chang
【Abstract】: Metric learning has become a widely used tool in machine learning. To reduce the expensive costs brought about by increasing dimensionality, low-rank metric learning arises as it can be more economical in storage and computation. However, existing low-rank metric learning algorithms usually adopt nonconvex objectives, and are hence sensitive to the choice of a heuristic low-rank basis. In this paper, we propose a novel low-rank metric learning algorithm to yield bilinear similarity functions. This algorithm scales linearly with input dimensionality in both space and time, and is therefore applicable to high-dimensional data domains. A convex objective free of heuristics is formulated by leveraging trace norm regularization to promote low-rankness. Crucially, we prove that all globally optimal metric solutions must retain a certain low-rank structure, which enables our algorithm to decompose the high-dimensional learning task into two steps: an SVD-based projection and a metric learning problem with reduced dimensionality. The latter step can be tackled efficiently through employing a linearized Alternating Direction Method of Multipliers. The efficacy of the proposed algorithm is demonstrated through experiments performed on four benchmark datasets with tens of thousands of dimensions.
【Keywords】: metric learning; similarity; high dimensions
【Paper Link】 【Pages】:2800-2806
【Authors】: Weiwei Liu ; Ivor W. Tsang
【Abstract】: Canonical correlation analysis (CCA) and maximum margin output coding (MMOC) methods have shown promising results for multi-label prediction, where each instance is associated with multiple labels. However, these methods require an expensive decoding procedure to recover the multiple labels of each testing instance. The testing complexity becomes unacceptable when there are many labels. To avoid decoding completely, we present a novel large margin metric learning paradigm for multi-label prediction. In particular, the proposed method learns a distance metric to discover label dependency such that instances with very different multiple labels will be moved far away. To handle many labels, we present an accelerated proximal gradient procedure to speed up the learning process. Comprehensive experiments demonstrate that our proposed method is significantly faster than CCA and MMOC in terms of both training and testing complexities. Moreover, our method achieves superior prediction performance compared with state-of-the-art methods.
【Keywords】: Multi-label prediction, Metric Learning, $k$ Nearest Neighbors
【Paper Link】 【Pages】:2807-2813
【Authors】: Xinwang Liu ; Lei Wang ; Jianping Yin ; Yong Dou ; Jian Zhang
【Abstract】: Multiple kernel learning (MKL) optimally combines the multiple channels of each sample to improve classification performance. However, existing MKL algorithms cannot effectively handle the situation where some channels are missing, which is common in practical applications. This paper proposes an absent MKL (AMKL) algorithm to address this issue. Different from existing approaches where missing channels are first imputed and then a standard MKL algorithm is deployed on the imputed data, our algorithm directly classifies each sample with its observed channels. In particular, we define a margin for each sample in its own relevant space, which corresponds to the observed channels of that sample. The proposed AMKL algorithm then maximizes the minimum of all sample-based margins, which leads to a difficult optimization problem. We show that this problem can be reformulated as a convex one by applying the representer theorem, making it readily solvable via existing convex optimization packages. Extensive experiments are conducted on five MKL benchmark data sets to compare the proposed algorithm with existing imputation-based methods. As observed, our algorithm achieves superior performance, and the improvement is more significant as the missing ratio increases.
【Keywords】: Multiple Kernel Learning, Absent Feature Learning, Max Margin
【Paper Link】 【Pages】:2814-2820
【Authors】: Yong Liu ; Shizhong Liao
【Abstract】: The selection of the kernel function, which determines the mapping between the input space and the feature space, is of crucial importance to kernel methods. Existing kernel selection approaches commonly use some measure of generalization error, which is usually difficult to estimate and has a slow convergence rate. In this paper, we propose a novel measure for kernel selection, called the eigenvalues ratio (ER), which tightly bounds the generalization error. ER is the ratio between the sum of the main eigenvalues and the sum of the tail eigenvalues of the kernel matrix. Different from most existing measures, ER is defined on the kernel matrix, so it can be estimated easily from the available training data, which makes it usable for kernel selection. We establish tight ER-based generalization error bounds of order $O(\frac{1}{n})$ for several kernel-based methods under certain general conditions, while for most existing measures the convergence rate is at most $O(\frac{1}{\sqrt{n}})$. Finally, to guarantee good generalization performance, we propose a novel kernel selection criterion that minimizes the derived tight generalization error bounds. Theoretical analysis and experimental results demonstrate that our kernel selection criterion is a good choice for kernel selection.
【Keywords】: kernel selection; matrix spectra; error bound
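Since ER is defined directly on the kernel matrix, it can be computed from training data alone. The sketch below computes a head-to-tail eigenvalue ratio for candidate kernels; the exact head size k and the comparison protocol are assumptions of this sketch, not the paper's criterion.

```python
import numpy as np

def eigenvalues_ratio(K, k):
    """Ratio of the sum of the k largest (main) eigenvalues of the kernel
    matrix K to the sum of the remaining (tail) eigenvalues."""
    lam = np.sort(np.linalg.eigvalsh(K))[::-1]
    return lam[:k].sum() / max(lam[k:].sum(), 1e-12)

# Example: compare two RBF bandwidths on the same data by their ER values.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
sq = ((X[:, None] - X[None]) ** 2).sum(-1)   # pairwise squared distances
for gamma in (0.01, 1.0):
    print(gamma, eigenvalues_ratio(np.exp(-gamma * sq), k=5))
```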
【Paper Link】 【Pages】:2821-2827
【Authors】: Kian Hsiang Low ; Jiangbo Yu ; Jie Chen ; Patrick Jaillet
【Abstract】: The expressive power of a Gaussian process (GP) model comes at a cost of poor scalability in the data size. To improve its scalability, this paper presents a low-rank-cum-Markov approximation (LMA) of the GP model that is novel in leveraging the dual computational advantages stemming from complementing a low-rank approximate representation of the full-rank GP based on a support set of inputs with a Markov approximation of the resulting residual process; the latter approximation is guaranteed to be closest in the Kullback-Leibler distance criterion subject to some constraint and is considerably more refined than that of existing sparse GP models utilizing low-rank representations due to its more relaxed conditional independence assumption (especially with larger data). As a result, our LMA method can trade off between the size of the support set and the order of the Markov property to (a) incur lower computational cost than such sparse GP models while achieving predictive performance comparable to them and (b) accurately represent features/patterns of any scale. Interestingly, varying the Markov order produces a spectrum of LMAs with PIC approximation and full-rank GP at the two extremes. An advantage of our LMA method is that it is amenable to parallelization on multiple machines/cores, thereby gaining greater scalability. Empirical evaluation on three real-world datasets in clusters of up to 32 computing nodes shows that our centralized and parallel LMA methods are significantly more time-efficient and scalable than state-of-the-art sparse and full-rank GP regression methods while achieving comparable predictive performances.
【Keywords】: Gaussian process; parallel machine learning; scalability; big data
【Paper Link】 【Pages】:2828-2834
【Authors】: Zhiwu Lu ; Xin Gao ; Liwei Wang ; Ji-Rong Wen ; Songfang Huang
【Abstract】: This paper presents a large-scale sparse coding algorithm to deal with the challenging problem of noise-robust semi-supervised learning over very large data with only few noisy initial labels. By giving an L1-norm formulation of Laplacian regularization directly based upon the manifold structure of the data, we transform noise-robust semi-supervised learning into a generalized sparse coding problem so that noise reduction can be imposed upon the noisy initial labels. Furthermore, to keep the scalability of noise-robust semi-supervised learning over very large data, we make use of both nonlinear approximation and dimension reduction techniques to solve this generalized sparse coding problem in linear time and space complexity. Finally, we evaluate the proposed algorithm in the challenging task of large-scale semi-supervised image classification with only few noisy initial labels. The experimental results on several benchmark image datasets show the promising performance of the proposed algorithm.
【Keywords】:
【Paper Link】 【Pages】:2835-2841
【Authors】: Tengfei Ma ; Issei Sato ; Hiroshi Nakagawa
【Abstract】: The hierarchical Dirichlet process (HDP) is a powerful nonparametric Bayesian approach to modeling groups of data which allows the mixture components in each group to be shared. However, in many cases the groups themselves also fall into latent groups (categories), which may strongly impact the modeling. In order to utilize the unknown category information of grouped data, we present the hybrid nested/hierarchical Dirichlet process (hNHDP), a prior that blends the desirable aspects of both the HDP and the nested Dirichlet process (NDP). Specifically, we introduce a clustering structure for the groups. The prior distribution for each cluster is a realization of a Dirichlet process. Moreover, the set of cluster-specific distributions can share a subset of atoms across groups, and the shared atoms and specific atoms are generated separately. We apply the hNHDP to document modeling and bring in a mechanism to identify discriminative words and topics. We derive an efficient Markov chain Monte Carlo scheme for posterior inference and present experiments on document modeling.
【Keywords】: nested; hierarchical; Dirichlet process
【Paper Link】 【Pages】:2842-2848
【Authors】: Patrick MacAlpine ; Mike Depinet ; Peter Stone
【Abstract】: Layered learning is a hierarchical machine learning paradigm that enables learning of complex behaviors by incrementally learning a series of sub-behaviors. A key feature of layered learning is that higher layers directly depend on the learned lower layers. In its original formulation, lower layers were frozen prior to learning higher layers. This paper considers an extension to the paradigm that allows learning certain behaviors independently, and then later stitching them together by learning at the "seams" where their influences overlap. The UT Austin Villa 2014 RoboCup 3D simulation team, using such overlapping layered learning, learned a total of 19 layered behaviors for a simulated soccer-playing robot, organized both in series and in parallel. To the best of our knowledge this is more than three times the number of layered behaviors in any prior layered learning system. Furthermore, the complete learning process is repeated on four different robot body types, showcasing its generality as a paradigm for efficient behavior learning. The resulting team won the RoboCup 2014 championship with an undefeated record, scoring 52 goals and conceding none. This paper includes a detailed experimental analysis of the team's performance and the overlapping layered learning approach that led to its success.
【Keywords】:
【Paper Link】 【Pages】:2849-2856
【Authors】: Travis Mandel ; Yun-En Liu ; Emma Brunskill ; Zoran Popovic
【Abstract】: Current algorithms for the standard multi-armed bandit problem have good empirical performance and optimal regret bounds. However, real-world problems often differ from the standard formulation in several ways. First, feedback may be delayed instead of arriving immediately. Second, the real world often contains structure which suggests heuristics, which we wish to incorporate while retaining the best-known theoretical guarantees. Third, we may wish to make use of an arbitrary prior dataset without negatively impacting performance. Fourth, we may wish to efficiently evaluate algorithms using a previously collected dataset. Surprisingly, these seemingly-disparate problems can be addressed using algorithms inspired by a recently-developed queueing technique. We present the Stochastic Delayed Bandits (SDB) algorithm as a solution to these four problems, which takes black-box bandit algorithms (including heuristic approaches) as input while achieving good theoretical guarantees. We present empirical results from both synthetic simulations and real-world data drawn from an educational game. Our results show that SDB outperforms state-of-the-art approaches to handling delay, heuristics, prior data, and evaluation.
【Keywords】:
【Paper Link】 【Pages】:2857-2863
【Authors】: David Martínez Martínez ; Guillem Alenyà ; Carme Torras
【Abstract】: Reinforcement learning (RL) is a common paradigm for learning tasks in robotics. However, a lot of exploration is usually required, making RL too slow for high-level tasks. We present V-MIN, an algorithm that integrates teacher demonstrations with RL to learn complex tasks faster. The algorithm combines active demonstration requests and autonomous exploration to find policies yielding rewards higher than a given threshold Vmin. This threshold sets the degree of quality with which the robot is expected to complete the task, thus allowing the user to either opt for very good policies that require many learning experiences, or to be more permissive with sub-optimal policies that are easier to learn. The threshold can also be increased online to force the system to improve its policies until the desired behavior is obtained. Furthermore, the algorithm generalizes previously learned knowledge, adapting well to changes. The performance of V-MIN has been validated through experimentation, including domains from the international planning competition. Our approach achieves the desired behavior where previous algorithms failed.
【Keywords】: Model-based Reinforcement Learning; Teacher Demonstrations; Relational Learning; Active Learning
【Paper Link】 【Pages】:2864-2870
【Authors】: Charles Mathy ; Nate Derbinsky ; José Bento ; Jonathan Rosenthal ; Jonathan S. Yedidia
【Abstract】: We describe a new instance-based learning algorithm called the Boundary Forest (BF) algorithm, which can be used for supervised and unsupervised learning. The algorithm builds a forest of trees whose nodes store previously seen examples. It can be shown data points one at a time and updates itself incrementally, hence it is naturally online. Few instance-based algorithms have this property while being simultaneously fast, which the BF is. This is crucial for applications where one needs to respond to input data in real time. The number of children of each node is not set beforehand but obtained from the training procedure, which makes the algorithm very flexible with regard to what data manifolds it can learn. We test its generalization performance and speed on a range of benchmark datasets and detail in which settings it outperforms the state of the art. Empirically we find that training time scales as $O(DN\log N)$ and testing as $O(D\log N)$, where $D$ is the dimensionality and $N$ the amount of data.
【Keywords】: Machine learning; Real time; tree-based search; Classification; Regression; Retrieval
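A minimal sketch of a single boundary tree in the spirit of the description above: queries greedily descend toward the nearest stored example, and a training example is added as a new child only when the retrieved node's label disagrees with it. Details such as the candidate set and the child cap are simplified guesses at the algorithm's structure, not the paper's exact procedure.

```python
import numpy as np

class BoundaryTree:
    def __init__(self, max_children=50):
        self.examples = []        # (point, label) per node
        self.children = []        # children[i]: indices of node i's children
        self.max_children = max_children

    def _descend(self, x):
        """Greedy walk: move to the closest child; stop when the current node
        itself is closest (it competes only while it can accept children)."""
        cur = 0
        while True:
            cand = list(self.children[cur])
            if len(cand) < self.max_children:
                cand.append(cur)
            nxt = min(cand, key=lambda i: np.linalg.norm(self.examples[i][0] - x))
            if nxt == cur:
                return cur
            cur = nxt

    def train(self, x, label):
        if not self.examples:
            self.examples.append((x, label)); self.children.append([])
        else:
            near = self._descend(x)
            if self.examples[near][1] != label:   # boundary case: grow the tree
                self.examples.append((x, label)); self.children.append([])
                self.children[near].append(len(self.examples) - 1)

    def predict(self, x):
        return self.examples[self._descend(x)][1]

# Online usage: interleave predict/train over a stream of (x, y) pairs;
# a forest averages the votes of several trees trained on shuffled streams.
```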
【Paper Link】 【Pages】:2871-2877
【Authors】: Shike Mei ; Xiaojin Zhu
【Abstract】: We investigate a problem at the intersection of machine learning and security: training-set attacks on machine learners. In such attacks an attacker contaminates the training data so that a specific learning algorithm would produce a model profitable to the attacker. Understanding training-set attacks is important as more intelligent agents (e.g. spam filters and robots) are equipped with learning capability and can potentially be hacked via data they receive from the environment. This paper identifies the optimal training-set attack on a broad family of machine learners. First we show that the optimal training-set attack can be formulated as a bilevel optimization problem. Then we show that for machine learners with certain Karush-Kuhn-Tucker conditions we can solve the bilevel problem efficiently using gradient methods on an implicit function. As examples, we demonstrate optimal training-set attacks on Support Vector Machines, logistic regression, and linear regression with extensive experiments. Finally, we discuss potential defenses against such attacks.
【Keywords】: machine teaching; security; optimal attack; implicit function
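For a learner with a closed-form solution, the bilevel problem collapses and the implicit gradient is explicit. The sketch below poisons the training labels of a ridge regression learner to pull its solution toward an attacker-chosen target; the attack surface (labels only), step size, and target are illustrative assumptions, and the paper's KKT-based machinery is what handles learners without closed forms.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 200, 5, 1.0
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)
w_target = np.ones(d)                       # model the attacker wants learned

A_inv = np.linalg.inv(X.T @ X + lam * np.eye(d))
y_adv = y.copy()
for _ in range(200):
    w = A_inv @ X.T @ y_adv                 # learner's closed-form response
    # implicit gradient of ||w - w_target||^2 w.r.t. the training labels:
    # dw/dy = A_inv X^T, so grad_y = X A_inv (2 (w - w_target))
    g = X @ A_inv @ (2 * (w - w_target))
    y_adv -= 100.0 * g                      # step size tuned for this toy problem

print(np.linalg.norm(A_inv @ X.T @ y_adv - w_target))  # near zero
```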
【Paper Link】 【Pages】:2878-2886
【Authors】: Aniruddh Nath ; Pedro M. Domingos
【Abstract】: Sum-product networks (SPNs) are a recently-proposed deep architecture that guarantees tractable inference, even on certain high-treewidth models. SPNs are a propositional architecture, treating the instances as independent and identically distributed. In this paper, we introduce Relational Sum-Product Networks (RSPNs), a new tractable first-order probabilistic architecture. RSPNs generalize SPNs by modeling a set of instances jointly, allowing them to influence each other's probability distributions, as well as modeling probabilities of relations between objects. We also present LearnRSPN, the first algorithm for learning high-treewidth tractable statistical relational models. LearnRSPN is a recursive top-down structure learning algorithm for RSPNs, based on Gens and Domingos' LearnSPN algorithm for propositional SPN learning. We evaluate the algorithm on three datasets; the RSPN learning algorithm outperforms Markov Logic Networks in both running time and predictive accuracy.
【Keywords】: Statistical Relational Learning; Tractable Models; Sum-Product Networks
【Paper Link】 【Pages】:2887-2893
【Authors】: Tu Dinh Nguyen ; Truyen Tran ; Dinh Q. Phung ; Svetha Venkatesh
【Abstract】: Restricted Boltzmann Machines (RBMs) are an important class of latent variable models for representing vector data. An under-explored area is multimode data, where each data point is a matrix or a tensor. Standard RBMs applying to such data would require vectorizing matrices and tensors, thus resulting in unnecessarily high dimensionality and, at the same time, destroying the inherent higher-order interaction structures. This paper introduces Tensor-variate Restricted Boltzmann Machines (TvRBMs) which generalize RBMs to capture the multiplicative interaction between data modes and the latent variables. TvRBMs are highly compact in that the number of free parameters grows only linearly with the number of modes. We demonstrate the capacity of TvRBMs on three real-world applications: handwritten digit classification, face recognition and EEG-based alcoholic diagnosis. The learnt features of the model are more discriminative than those of the rivals, resulting in better classification performance.
【Keywords】: tensor; rbm; restricted boltzmann machine; tvrbm; multiplicative interaction; eeg;
【Paper Link】 【Pages】:2894-2900
【Authors】: Mingdong Ou ; Peng Cui ; Jun Wang ; Fei Wang ; Wenwu Zhu
【Abstract】: Due to their simplicity and efficiency, many hashing methods have recently been developed for large-scale similarity search. Most of the existing hashing methods focus on mapping low-level features to binary codes, but neglect attributes that are commonly associated with data samples. Attribute data, such as image tags, product brands, and user profiles, can represent human recognition better than low-level features. However, attributes have specific characteristics, including high dimensionality, sparsity, and categorical properties, which are hardly leveraged by existing hashing learning frameworks. In this paper, we propose a hashing learning framework, Probabilistic Attributed Hashing (PAH), to integrate attributes with low-level features. The connections between attributes and low-level features are built through sharing a common set of latent binary variables, i.e. hash codes, through which attributes and features can complement each other. Finally, we develop an efficient iterative learning algorithm, which is generally feasible for large-scale applications. Extensive experiments and comparison studies are conducted on two public datasets, i.e., DBLP and NUS-WIDE. The results clearly demonstrate that the proposed PAH method substantially outperforms the peer methods.
【Keywords】: Hashing; Attributes
【Paper Link】 【Pages】:2901-2907
【Authors】: Mahdi Pakdaman Naeini ; Gregory F. Cooper ; Milos Hauskrecht
【Abstract】: Learning probabilistic predictive models that are well calibrated is critical for many prediction and decision-making tasks in artificial intelligence. In this paper we present a new non-parametric calibration method called Bayesian Binning into Quantiles (BBQ) which addresses key limitations of existing calibration methods. The method post-processes the output of a binary classification algorithm; thus, it can be readily combined with many existing classification algorithms. The method is computationally tractable and empirically accurate, as evidenced by the set of experiments reported here on both real and simulated datasets.
【Keywords】: Bayesian binning; classifier calibration; accurate probability; calibrated probability; Bayesian Scoring
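BBQ builds on simple binning calibrators. As background, here is a single equal-frequency binning calibrator: within each bin, the calibrated probability is the empirical positive rate. BBQ itself averages many such binning models under a Bayesian score, which this sketch does not attempt; the bin count and fallback rate are arbitrary choices.

```python
import numpy as np

def fit_histogram_binning(scores, labels, n_bins=10):
    """Equal-frequency (quantile) binning of classifier scores."""
    edges = np.quantile(scores, np.linspace(0, 1, n_bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf
    idx = np.searchsorted(edges, scores, side="right") - 1
    rates = np.array([labels[idx == b].mean() if np.any(idx == b) else 0.5
                      for b in range(n_bins)])
    return edges, rates

def calibrate(scores, edges, rates):
    idx = np.searchsorted(edges, scores, side="right") - 1
    return rates[np.clip(idx, 0, len(rates) - 1)]

# Toy usage: scores are over-confident relative to the true positive rate.
rng = np.random.default_rng(0)
s = rng.random(1000)
y = (rng.random(1000) < s ** 2).astype(float)
edges, rates = fit_histogram_binning(s, y)
print(calibrate(np.array([0.3, 0.9]), edges, rates))
```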
【Paper Link】 【Pages】:2908-2913
【Authors】: Brandon Shane Parker ; Latifur Khan
【Abstract】: As the proliferation of constant data feeds increases from social media, embedded sensors, and other sources, the capability to provide predictive concept labels to these data streams will become ever more important and lucrative. However, the dynamic, non-stationary nature, and effectively infinite length of data streams pose additional challenges for stream data mining algorithms. The sparse quantity of training data also limits the use of algorithms that are heavily dependent on supervised training. To address all these issues, we propose an incremental semi-supervised method that provides general concept class label predictions, but it also tracks concept clusters within the feature space using an innovative new online clustering algorithm. Each concept cluster contains an embedded stream classifier, creating a diverse ensemble for data instance classification within the generative model used for detecting emerging concepts in the stream. Unlike other recent novel class detection methods, our method goes beyond detecting, and continues to differentiate and track the emerging concepts. We show the effectiveness of our method on several synthetic and real world data sets, and we compare the results against other leading baseline methods.
【Keywords】: Fast Data; Novel class detection; Non-Stationary stream classification; semi-supervised learning; stream clustering
【Paper Link】 【Pages】:2914-2920
【Authors】: Leto Peel ; Aaron Clauset
【Abstract】: Interactions among people or objects are often dynamic in nature and can be represented as a sequence of networks, each providing a snapshot of the interactions over a brief period of time. An important task in analyzing such evolving networks is change-point detection, in which we both identify the times at which the large-scale pattern of interactions changes fundamentally and quantify how large and what kind of change occurred. Here, we formalize for the first time the network change-point detection problem within an online probabilistic learning framework and introduce a method that can reliably solve it. This method combines a generalized hierarchical random graph model with a Bayesian hypothesis test to quantitatively determine if, when, and precisely how a change point has occurred. We analyze the detectability of our method using synthetic data with known change points of different types and magnitudes, and show that this method is more accurate than several previously used alternatives. Applied to two high-resolution evolving social networks, this method identifies a sequence of change points that align with known external ``shocks'' to these networks.
【Keywords】: dynamic networks; change-point detection; generative models; model comparison
【Paper Link】 【Pages】:2921-2927
【Authors】: Yuxin Peng
【Abstract】: Learning from imbalanced data sets, in which the number of negative examples far exceeds the number of positive examples, is one of the challenging problems in machine learning. The main problems of existing methods are: (1) the degree of re-sampling, a key factor that greatly affects performance, needs to be pre-fixed, which makes the optimal choice difficult; (2) many useful negative samples are discarded in under-sampling; and (3) the effectiveness of algorithm-level methods is limited because they just use the original training data for a single classifier. To address the above issues, a novel approach of adaptive sampling with optimal cost is proposed for class-imbalance learning in this paper. The novelty of the proposed approach mainly lies in: adaptively over-sampling the minority positive examples and under-sampling the majority negative examples, forming different sub-classifiers from different subsets of training data with the best cost ratio adaptively chosen, and combining these sub-classifiers according to their accuracy to create a strong classifier. It aims to make full use of the whole training data and improve the performance of the class-imbalance learning classifier. Solid experiments are conducted to compare the performance of the proposed approach with 12 state-of-the-art methods on 16 challenging UCI data sets using 3 evaluation metrics, and the results show that the proposed approach achieves superior performance in class-imbalance learning.
【Keywords】: classification; class-imbalance learning; cost-sensitive learning; sampling
【Paper Link】 【Pages】:2928-2934
【Authors】: Matteo Pirotta ; Simone Parisi ; Marcello Restelli
【Abstract】: This paper is about learning a continuous approximation of the Pareto frontier in Multi-Objective Markov Decision Problems (MOMDPs). We propose a policy-based approach that exploits gradient information to generate solutions close to the Pareto ones. Differently from previous policy-gradient multi-objective algorithms, where n optimization routines are used to obtain n solutions, our approach performs a single gradient-ascent run that at each step generates an improved continuous approximation of the Pareto frontier. The idea is to exploit a gradient-based approach to optimize the parameters of a function that defines a manifold in the policy parameter space, so that the corresponding image in the objective space gets as close as possible to the Pareto frontier. Besides deriving how to compute and estimate such a gradient, we will also discuss the non-trivial issue of defining a metric to assess the quality of the candidate Pareto frontiers. Finally, the properties of the proposed approach are empirically evaluated on two interesting MOMDPs.
【Keywords】: Reinforcement Learning; MDP;
【Paper Link】 【Pages】:2935-2941
【Authors】: Chao Qian ; Yang Yu ; Zhi-Hua Zhou
【Abstract】: Ensemble learning is among the state-of-the-art learning techniques, which trains and combines many base learners. Ensemble pruning removes some of the base learners of an ensemble, and has been shown to be able to further improve the generalization performance. However, the two goals of ensemble pruning, i.e., maximizing the generalization performance and minimizing the number of base learners, can conflict when being pushed to the limit. Most previous ensemble pruning approaches solve objectives that mix the two goals. In this paper, motivated by recent theoretical advances in evolutionary optimization, we investigate solving the two goals explicitly in a bi-objective formulation and propose the PEP (Pareto Ensemble Pruning) approach. We show that PEP not only achieves significantly better performance than the state-of-the-art approaches, but also gains theoretical support.
【Keywords】:
【Paper Link】 【Pages】:2942-2948
【Authors】: Piyush Rai ; Yingjian Wang ; Lawrence Carin
【Abstract】: We present a probabilistic model for tensor decomposition where one or more tensor modes may have side-information about the mode entities in the form of their features and/or their adjacency network. We consider a Bayesian approach based on the Canonical PARAFAC (CP) decomposition and enrich this single-layer decomposition approach with a two-layer decomposition. The second layer fits a factor model for each layer-one factor matrix and models the factor matrix via the mode entities' features and/or the network between the mode entities. The second-layer decomposition of each factor matrix also learns a binary latent representation for the entities of that mode, which can be useful in its own right. Our model can handle both continuous and binary tensor observations. Another appealing aspect of our model is the simplicity of the model inference, with easy-to-sample Gibbs updates. We demonstrate the results of our model on several benchmark datasets, consisting of both real-valued and binary tensors.
【Keywords】: bayesian methods; tensor decomposition
【Paper Link】 【Pages】:2949-2955
【Authors】: Sashank Jakkam Reddi ; Barnabás Póczos ; Alexander J. Smola
【Abstract】: Covariate shift correction allows one to perform supervised learning even when the distribution of the covariates on the training set does not match that on the test set. This is achieved by re-weighting observations. Such a strategy removes bias, potentially at the expense of greatly increased variance. We propose a simple strategy for removing bias while retaining small variance. It uses a biased, low variance estimate as a prior and corrects the final estimate relative to the prior. We prove that this yields an efficient estimator and demonstrate good experimental performance.
【Keywords】:
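A minimal sketch of the bias-corrected strategy described above, for linear regression: fit an unweighted (biased but low-variance) estimate first, then apply importance-weighted least squares only to the residuals relative to that prior. The density-ratio weights are assumed known here for illustration; in practice they must be estimated.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 5
X = rng.normal(size=(n, d))
y = X @ np.ones(d) + 0.5 * rng.normal(size=n)

# hypothetical known importance weights w(x) = p_test(x) / p_train(x),
# here a shift of the test distribution along the first coordinate
w = np.exp(X[:, 0])
w /= w.mean()

# Stage 1: biased but low-variance prior -- plain unweighted least squares
beta0, *_ = np.linalg.lstsq(X, y, rcond=None)

# Stage 2: importance-weighted correction of the residuals w.r.t. the prior
r = y - X @ beta0
Xw = X * np.sqrt(w)[:, None]
delta, *_ = np.linalg.lstsq(Xw, np.sqrt(w) * r, rcond=None)
beta = beta0 + delta   # final estimate: prior plus weighted correction
```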
【Paper Link】 【Pages】:2956-2964
【Authors】: Peter Schulam ; Fredrick Wigley ; Suchi Saria
【Abstract】: Diseases such as autism, cardiovascular disease, and the autoimmune disorders are difficult to treat because of the remarkable degree of variation among affected individuals. Subtyping research seeks to refine the definition of such complex, multi-organ diseases by identifying homogeneous patient subgroups. In this paper, we propose the Probabilistic Subtyping Model (PSM) to identify subgroups based on clustering individual clinical severity markers. This task is challenging due to the presence of nuisance variability — variations in measurements that are not due to disease subtype — which, if not accounted for, generate biased estimates for the group-level trajectories. Measurement sparsity and irregular sampling patterns pose additional challenges in clustering such data. PSM uses a hierarchical model to account for these different sources of variability. Our experiments demonstrate that by accounting for nuisance variability, PSM is able to more accurately model the marker data. We also discuss novel subtypes discovered using PSM and the resulting clinical hypotheses that are now the subject of follow up clinical experiments.
【Keywords】: Machine Learning; Computational Medicine; Computational Phenotyping; Computational Endotyping; Time Series; Latent Variable Models; Disease Subtyping; Patient Similarity
【Paper Link】 【Pages】:2965-2971
【Authors】: Bin Shen ; Bao-Di Liu ; Qifan Wang ; Yi Fang ; Jan P. Allebach
【Abstract】: As one of the most important state-of-the-art classification techniques, Support Vector Machine (SVM) has been widely adopted in many real-world applications, such as object detection, face recognition, text categorization, etc., due to its competitive practical performance and elegant theoretical interpretation. However, it treats all samples independently, and ignores the fact that, in many real situations, especially when data are in high dimensional space, samples typically lie on low dimensional manifolds of the feature space and thus a sample can be related to its neighbors by being represented as a linear combination of other samples on the same manifold. This linear representation, which is usually sparse, reflects the structure of underlying manifolds. It has been extensively explored in the recent literature and proven to be critical for the performance of classification. To benefit from both the underlying low dimensional manifold structure and the large margin classifier, this paper proposes a novel method called Sparsity Preserving Support Vector Machine (SP-SVM), which explicitly considers the sparse representation of samples while maximizing the margin between different classes. Consequently, SP-SVM inherits both the discriminative power of the support vector machine and the merits of sparsity. A set of experiments on real-world benchmark data sets shows that SP-SVM achieves significantly higher precision on recognition tasks than various competitive baselines, including the traditional SVM, the sparse representation based method and the classical nearest neighbor classifier.
【Keywords】: SVM; sparse representation; manifold learning; face recognition; image classification
【Paper Link】 【Pages】:2972-2978
【Authors】: Yangqiu Song ; Chenguang Wang ; Ming Zhang ; Hailong Sun ; Qiang Yang
【Abstract】: With the recent growth of online content on the Web, there is more user-generated data with noisy and missing labels, e.g., social tags and voted labels from Amazon's Mechanical Turk. Most machine learning methods require accurate label sets and cannot be trusted when the label sets are unreliable. In this paper, we provide a text label refinement algorithm to adjust the labels for such datasets with noisy and missing labels. We assume that the label sets can be refined based on the labels held with certain confidence, and on the similarity between data points being consistent with the labels. We propose a label smoothness ratio criterion to measure the smoothness of the labels and the consistency between labels and data. We demonstrate the effectiveness of the label refining algorithm on eight labeled document datasets, and validate that the results are useful for generating better labels.
【Keywords】:
【Paper Link】 【Pages】:2979-2985
【Authors】: Paul A. Szerlip ; Gregory Morse ; Justin K. Pugh ; Kenneth O. Stanley
【Abstract】: Unlike unsupervised approaches such as autoencoders that learn to reconstruct their inputs, this paper introduces an alternative approach to unsupervised feature learning called divergent discriminative feature accumulation (DDFA) that instead continually accumulates features that make novel discriminations among the training set. Thus DDFA features are inherently discriminative from the start even though they are trained without knowledge of the ultimate classification problem. Interestingly, DDFA also continues to add new features indefinitely (so it does not depend on a hidden layer size), is not based on minimizing error, and is inherently divergent instead of convergent, thereby providing a unique direction of research for unsupervised feature learning. In this paper the quality of its learned features is demonstrated on the MNIST dataset, where its performance confirms that indeed DDFA is a viable technique for learning useful features.
【Keywords】: Evolutionary Computation; Neural Networks; Unsupervised Feature Learning; Neuroevolution; Novelty Search; Deep Learning
【Paper Link】 【Pages】:2986-2992
【Authors】: Erik Talvitie
【Abstract】: While model-based reinforcement learning is often studied under the assumption that a fully accurate model is contained within the model class, this is rarely true in practice. When the model class may be fundamentally limited, it can be difficult to obtain theoretical guarantees. Under some conditions the DAgger algorithm promises a policy nearly as good as the plan obtained from the most accurate model in the class, but only if the planning algorithm is near-optimal, which is also rarely the case in complex problems. This paper explores the interaction between DAgger and Monte Carlo planning, specifically showing that DAgger may perform poorly when coupled with a sub-optimal planner. A novel variation of DAgger specifically for use with Monte Carlo planning is derived and is shown to behave far better in some cases where DAgger fails.
【Keywords】: reinforcement learning; system identification; model-based reinforcement learning
【Paper Link】 【Pages】:2993-2999
【Authors】: Aviv Tamar ; Yonatan Glassner ; Shie Mannor
【Abstract】: Conditional Value at Risk (CVaR) is a prominent risk measure that is used extensively in various domains. We develop a new formula for the gradient of the CVaR in the form of a conditional expectation. Based on this formula, we propose a novel sampling-based estimator for the gradient of the CVaR, in the spirit of the likelihood-ratio method. We analyze the bias of the estimator, and prove the convergence of a corresponding stochastic gradient descent algorithm to a local CVaR optimum. Our method allows us to consider CVaR optimization in new domains. As an example, we consider a reinforcement learning application, and learn a risk-sensitive controller for the game of Tetris.
【Keywords】: CVaR; Likelihood Ratio Method; Reinforcement Learning; MDP
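A sketch of the sampling-based estimator suggested by the conditional-expectation formula described above: average score-function gradients weighted by each tail sample's loss excess over the empirical VaR. The tail convention (upper tail of mass 1 - alpha) and the array shapes are assumptions of this sketch, not necessarily the paper's exact estimator.

```python
import numpy as np

def cvar_gradient_estimate(losses, score_grads, alpha=0.95):
    """losses: (N,) sampled losses; score_grads: (N, d) gradients of the
    log-likelihood of each sample w.r.t. the policy parameters. Estimates
    grad CVaR ~ E[ grad log p * (loss - VaR) | loss >= VaR ]."""
    var = np.quantile(losses, alpha)        # empirical VaR (alpha-quantile)
    tail = losses >= var
    excess = losses[tail] - var
    return (score_grads[tail] * excess[:, None]).mean(axis=0)

# A stochastic gradient descent loop would repeatedly sample trajectories,
# call this estimator, and step the parameters against the estimated gradient.
```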
【Paper Link】 【Pages】:3000-3006
【Authors】: Philip S. Thomas ; Georgios Theocharous ; Mohammad Ghavamzadeh
【Abstract】: Many reinforcement learning algorithms use trajectories collected from the execution of one or more policies to propose a new policy. Because execution of a bad policy can be costly or dangerous, techniques for evaluating the performance of the new policy without requiring its execution have been of recent interest in industry. Such off-policy evaluation methods, which estimate the performance of a policy using trajectories collected from the execution of other policies, heretofore have not provided confidences regarding the accuracy of their estimates. In this paper we propose an off-policy method for computing a lower confidence bound on the expected return of a policy.
【Keywords】: policy evaluation; high-confidence; concentration inequality
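The shape of such a lower confidence bound is easy to illustrate with per-trajectory importance sampling plus a Hoeffding-style correction. This is a simplified stand-in: it requires an a priori bound b on the weighted returns, and the paper's actual concentration inequality is different and better suited to heavy-tailed importance weights.

```python
import numpy as np

def ois_lower_bound(returns, rho, b, delta=0.05):
    """returns: (n,) observed returns under the behavior policy; rho: (n,)
    per-trajectory importance ratios (products of pi_eval / pi_behavior over
    steps); b: a priori upper bound on rho * return. Returns a 1 - delta
    lower confidence bound on the evaluation policy's expected return."""
    z = rho * np.asarray(returns)
    n = len(z)
    return z.mean() - b * np.sqrt(np.log(1.0 / delta) / (2.0 * n))
```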
【Paper Link】 【Pages】:3007-3015
【Authors】: Jan Van Haaren ; Andrey Kolobov ; Jesse Davis
【Abstract】: The traditional way of obtaining models from data, inductive learning, has proved itself both in theory and in many practical applications. However, in domains where data is difficult or expensive to obtain, e.g., medicine, deep transfer learning is a more promising technique. It circumvents the model acquisition difficulties caused by scarce data in a target domain by carrying over structural properties of a model learned in a source domain where training data is ample. Nonetheless, the lack of a principled view of transfer learning so far has limited its adoption. In this paper, we address this issue by regarding transfer learning as a process that biases learning in a target domain in favor of patterns useful in a source domain. Specifically, we consider a first-order logic model of the data as an instantiation of a set of second-order templates. Hence, the usefulness of a model is partly determined by the learner's prior distribution over these template sets. The main insight of our work is that transferring knowledge amounts to acquiring a posterior over the second-order template sets by learning in the source domain and using this posterior when learning in the target setting. Our experimental evaluation demonstrates our approach to outperform the existing transfer learning techniques in terms of accuracy and runtime.
【Keywords】: Transfer learning
【Paper Link】 【Pages】:3016-3023
【Authors】: Joel Veness ; Marc G. Bellemare ; Marcus Hutter ; Alvin Chua ; Guillaume Desjardins
【Abstract】: This paper describes a new information-theoretic policy evaluation technique for reinforcement learning. This technique converts any compression or density model into a corresponding estimate of value. Under appropriate stationarity and ergodicity conditions, we show that the use of a sufficiently powerful model gives rise to a consistent value function estimator. We also study the behavior of this technique when applied to various Atari 2600 video games, where the use of suboptimal modeling techniques is unavoidable. We consider three fundamentally different models, all too limited to perfectly model the dynamics of the system. Remarkably, we find that our technique provides sufficiently accurate value estimates for effective on-policy control. We conclude with a suggestive study highlighting the potential of our technique to scale to large problems.
【Keywords】: Reinforcement Learning; Compression; Policy Evaluation; Density Estimation; On-policy Control
【Paper Link】 【Pages】:3024-3030
【Authors】: Arun Venkatraman ; Martial Hebert ; J. Andrew Bagnell
【Abstract】: Most typical statistical and machine learning approaches to time series modeling optimize a single-step prediction error. In multiple-step simulation, the learned model is iteratively applied, feeding through the previous output as its new input. Any such predictor, however, inevitably introduces errors, and these compounding errors change the input distribution for future prediction steps, breaking the train-test i.i.d. assumption common in supervised learning. We present an approach that reuses training data to make a no-regret learner robust to errors made during multi-step prediction. Our insight is to formulate the problem as imitation learning; the training data serves as a "demonstrator" by providing corrections for the errors made during multi-step prediction. By this reduction of multi-step time series prediction to imitation learning, we establish theoretically a strong performance guarantee on the relation between training error and the multi-step prediction error. We present experimental results of our method, DaD, and show significant improvement over the traditional approach in two notably different domains, dynamic system modeling and video texture prediction.
【Keywords】: Time series modeling; Learning dynamics models; Imitation learning; Cascading predictions; Meta-Algorithms
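The DaD loop lends itself to a compact sketch. Below is a minimal, illustrative version of the data-aggregation idea described in the abstract, with a linear learner as a stand-in; the learner choice, trajectory format, and iteration count are assumptions of this sketch, not the paper's exact setup.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def dad_train(trajectories, n_iters=5):
    """Data-as-Demonstrator style training loop: fit a one-step model, roll
    it forward along each training trajectory, and aggregate (state the
    model reached -> true next state) pairs so the model learns to correct
    its own compounding errors. `trajectories` is a list of (T, d) arrays."""
    X = np.vstack([t[:-1] for t in trajectories])
    Y = np.vstack([t[1:] for t in trajectories])
    model = LinearRegression().fit(X, Y)
    for _ in range(n_iters):
        X_new, Y_new = [], []
        for traj in trajectories:
            s = traj[0]
            for t in range(len(traj) - 1):
                s = model.predict(s.reshape(1, -1))[0]  # model's own rollout
                X_new.append(s)            # where the model actually lands
                Y_new.append(traj[t + 1])  # training data as "demonstrator"
        X = np.vstack([X, np.array(X_new)])
        Y = np.vstack([Y, np.array(Y_new)])
        model = LinearRegression().fit(X, Y)  # aggregate and retrain
    return model
```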
【Paper Link】 【Pages】:3031-3037
【Authors】: Cheng Wan ; Xiaoming Jin ; Guiguang Ding ; Dou Shen
【Abstract】: Restricted Boltzmann Machine (RBM) has been applied to a wide variety of tasks due to its advantage in feature extraction. Implementing sparsity constraint in the activated hidden units of RBM is an important improvement on RBM. The sparsity constraints in the existing methods are usually specified by users and are independent of the input data. However, the input data could be heterogeneous in content and thus naturally demand elastic and adaptive settings of the sparsity constraints. To solve this problem, we propose a generalized model with adaptive sparsity constraint, named Gaussian Cardinality Restricted Boltzmann Machines (GC-RBM). In this model, the thresholds of hidden unit activations are decided by the input data and a given Gaussian distribution in the pre-training phase. We provide a principled method to train the GC-RBM with a Gaussian prior. Experimental results on two real world data sets justify the effectiveness of the proposed method and its superiority over CaRBM in terms of classification accuracy.
【Keywords】: RBM; sparsity; Gaussian
【Paper Link】 【Pages】:3038-3044
【Authors】: Boyu Wang ; Joelle Pineau
【Abstract】: The related problems of transfer learning and multitask learning have attracted significant attention, generating a rich literature of models and algorithms. Yet most existing approaches are studied in an offline fashion, implicitly assuming that data from different domains are given as a batch. Such an assumption is not valid in many real-world applications where data samples arrive sequentially, and one wants a good learner even from few examples. The goal of our work is to provide sound extensions to existing transfer and multitask learning algorithms such that they can be used in an anytime setting. More specifically, we propose two novel online boosting algorithms, one for transfer learning and one for multitask learning, both designed to leverage the knowledge of instances in other domains. The experimental results show state-of-the-art empirical performance on standard benchmarks, and we present results of using our methods for effectively detecting new seizures in patients with epilepsy from very few previous samples.
【Keywords】:
【Paper Link】 【Pages】:3045-3051
【Authors】: Hanmo Wang ; Liang Du ; Peng Zhou ; Lei Shi ; Yi-Dong Shen
【Abstract】: Active learning is a machine learning technique that selects a subset of an unlabeled dataset for labeling and trains a classifier on the labeled result. Recently, batch mode active learning, which selects a batch of samples to label in parallel, has attracted a lot of attention. Its challenge lies in the choice of criteria used for guiding the search of the optimal batch. In this paper, we propose a novel approach to selecting the optimal batch of queries by minimizing the α-relative Pearson divergence (RPE) between the labeled and the original datasets. This particular divergence is chosen since it can distinguish the optimal batch more easily than other measures, especially when available candidates are similar. The proposed objective is a min-max optimization problem, and it is difficult to solve due to the involvement of both minimization and maximization. We find that the objective has an equivalent convex form, and thus a global optimal solution can be obtained. Then the subgradient method can be applied to solve the simplified convex problem. Our empirical studies on UCI datasets demonstrate the effectiveness of the proposed approach compared with the state-of-the-art batch mode active learning methods.
【Keywords】: Active Learning; Batch Mode Active Learning; alpha-relative Pearson divergence; RPE; distribution comparison; maximum mean discrepancy; MMD; convex; minimax; min-max
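As a simpler illustration of the distribution-matching idea behind this batch selection, here is a greedy sketch using MMD (which the keywords also mention) rather than the paper's RPE objective and convex reformulation; the RBF kernel and the greedy scheme are assumptions of this sketch.

```python
import numpy as np

def rbf_gram(X, gamma=1.0):
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def greedy_mmd_batch(X, batch_size, gamma=1.0):
    """Greedily grow a batch whose empirical distribution matches the full
    unlabeled pool X, by minimizing the squared MMD between batch and pool
    (the constant pool-pool term is dropped)."""
    K = rbf_gram(X, gamma)
    pool_mean = K.mean(axis=1)  # average kernel value to the whole pool
    selected = []
    for _ in range(batch_size):
        best, best_val = None, np.inf
        for i in range(len(X)):
            if i in selected:
                continue
            cand = selected + [i]
            val = K[np.ix_(cand, cand)].mean() - 2.0 * pool_mean[cand].mean()
            if val < best_val:
                best, best_val = i, val
        selected.append(best)
    return selected
```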
【Paper Link】 【Pages】:3052-3058
【Authors】: Hao Wang ; Xingjian Shi ; Dit-Yan Yeung
【Abstract】: Tag recommendation has become one of the most important ways of organizing and indexing online resources like articles, movies, and music. Since tagging information is usually very sparse, effective learning of the content representation for these resources is crucial to accurate tag recommendation. Recently, models proposed for tag recommendation, such as collaborative topic regression and its variants, have demonstrated promising accuracy. However, a limitation of these models is that, by using topic models like latent Dirichlet allocation as the key component, the learned representation may not be compact and effective enough. Moreover, since relational data exist as an auxiliary data source in many applications, it is desirable to incorporate such data into tag recommendation models. In this paper, we start with a deep learning model called stacked denoising autoencoder (SDAE) in an attempt to learn more effective content representation. We propose a probabilistic formulation for SDAE and then extend it to a relational SDAE (RSDAE) model. RSDAE jointly performs deep representation learning and relational learning in a principled way under a probabilistic framework. Experiments conducted on three real datasets show that both learning more effective representation and learning from relational data are beneficial steps to take to advance the state of the art.
【Keywords】: Deep Learning; Recommender System; Collaborative Filtering; Social Network
【Paper Link】 【Pages】:3059-3065
【Authors】: Hua Wang ; Feiping Nie ; Heng Huang
【Abstract】: Locality preserving projection (LPP) is an effective dimensionality reduction method based on manifold learning, which is defined over the graph weighted squared L2-norm distances in the projected subspace. Since squared L2-norm distance is prone to outliers, it is desirable to develop a robust LPP method. In this paper, motivated by existing studies that improve the robustness of statistical learning models via L1-norm or not-squared L2-norm formulations, we propose a robust LPP (rLPP) formulation to minimize the p-th order of the L2-norm distances, which can better tolerate large outlying data samples because it suppresses the introduced bias more than the L1-norm or not-squared L2-norm minimizations. However, solving the formulated objective is very challenging because it is not only non-smooth but also non-convex. As an important theoretical contribution of this work, we systematically derive an efficient iterative algorithm to solve the general p-th order L2-norm minimization problem, which, to the best of our knowledge, is solved for the first time in the literature. Extensive empirical evaluations of the proposed rLPP method have been performed, in which our new method outperforms the related state-of-the-art methods in a variety of experimental settings, demonstrating its effectiveness in seeking better subspaces on both noiseless and noisy data.
【Keywords】: Unsupervised learning; Robust Locality Preserving Projection; p-Order Minimization
【Paper Link】 【Pages】:3066-3072
【Authors】: Qifan Wang ; Luo Si ; Bin Shen
【Abstract】: Hashing techniques have been widely applied for large scale similarity search problems due to their computational and memory efficiency. However, most existing hashing methods assume data examples are independently and identically distributed. But there often exist various additional dependency/structure information between data examples in many real world applications. Ignoring this structure information may limit the performance of existing hashing algorithms. This paper explores the research problem of learning to Hash on Structured Data (HSD) and formulates a novel framework that considers additional structure information. In particular, the hashing function is learned in a unified learning framework by simultaneously ensuring the structural consistency and preserving the similarities between data examples. An iterative gradient descent algorithm is designed as the optimization procedure. Furthermore, we improve the effectiveness of the hashing function through orthogonal transformation by minimizing the quantization error. Experimental results on two datasets clearly demonstrate the advantages of the proposed method over several state-of-the-art hashing methods.
【Keywords】: Hashing; Structured Data; Similarity Search
【Paper Link】 【Pages】:3073-3079
【Authors】: Wei Wang ; Hao Wang ; Chen Zhang ; Fanjiang Xu
【Abstract】: Learning an appropriate feature representation across source and target domains is one of the most effective solutions to domain adaptation problems. Conventional cross-domain feature learning methods rely on the Reproducing Kernel Hilbert Space (RKHS) induced by a single kernel. Recently, Multiple Kernel Learning (MKL), which bases classifiers on combinations of kernels, has shown improved performance in tasks without distribution difference between domains. In this paper, we generalize the framework of MKL for cross-domain feature learning and propose a novel Transfer Feature Representation (TFR) algorithm. TFR learns a convex combination of multiple kernels and a linear transformation in a single optimization which integrates the minimization of distribution difference with the preservation of discriminating power across domains. As a result, standard machine learning models trained in the source domain can be reused for the target domain data. After being rewritten into a differentiable formulation, TFR can be optimized by a reduced gradient method and is guaranteed to converge. Experiments in two real-world applications verify the effectiveness of our proposed method.
【Keywords】: domain adaptation; feature representation; multiple kernel learning;
【Paper Link】 【Pages】:3080-3086
【Authors】: Martha White ; Junfeng Wen ; Michael Bowling ; Dale Schuurmans
【Abstract】: Autoregressive moving average (ARMA) models are a fundamental tool in time series analysis that offer intuitive modeling capability and efficient predictors. Unfortunately, the lack of globally optimal parameter estimation strategies for these models remains a problem: application studies often adopt the simpler autoregressive model that can be easily estimated by maximizing (a posteriori) likelihood. We develop a (regularized, imputed) maximum likelihood criterion that admits efficient global estimation via structured matrix norm optimization methods. An empirical evaluation demonstrates the benefits of globally optimal parameter estimation over local and moment matching approaches.
【Keywords】: Time-Series
【Paper Link】 【Pages】:3087-3093
【Authors】: Robert William Wright ; Xingye Qiao ; Steven Loscalzo ; Lei Yu
【Abstract】: Approximate value iteration (AVI) is a widely used technique in reinforcement learning. Most AVI methods do not take full advantage of the sequential relationship between samples within a trajectory in deriving value estimates, due to the challenges in dealing with the inherent bias and variance in the n-step returns. We propose a bounding method which uses a negatively biased but relatively low variance estimator generated from a complex return to provide a lower bound on the observed value of a traditional one-step return estimator. In addition, we develop a new Bounded FQI algorithm, which efficiently incorporates the bounding method into an AVI framework. Experiments show that our method produces more accurate value estimates than existing approaches, resulting in improved policies.
【Keywords】: Reinforcement Learning; Approximate Value Iteration; Complex Returns; Off-Policy
【Paper Link】 【Pages】:3094-3100
【Authors】: Ga Wu ; Scott Sanner ; Rodrigo F. S. C. Oliveira
【Abstract】: Naive Bayes (NB) is well-known to be a simple but effective classifier, especially when combined with feature selection. Unfortunately, feature selection methods are often greedy and thus cannot guarantee an optimal feature set is selected. An alternative to feature selection is to use Bayesian model averaging (BMA), which computes a weighted average over multiple predictors; when the different predictor models correspond to different feature sets, BMA has the advantage over feature selection that its predictions tend to have lower variance on average in comparison to any single model. In this paper, we show for the first time that it is possible to exactly evaluate BMA over the exponentially-sized powerset of NB feature models in linear time in the number of features; this yields an algorithm about as expensive to train as a single NB model with all features, yet it provably converges to the globally optimal feature subset in the asymptotic limit of data. We evaluate this novel BMA-NB classifier on a range of datasets showing that it never underperforms NB (as expected) and sometimes offers performance competitive (or superior) to classifiers such as SVMs and logistic regression while taking a fraction of the time to train.
【Keywords】: Naive Bayes; Bayesian Model Averaging; Feature Selection
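The linear-time claim rests on a factorization of the sum over the 2^D feature subsets into a product of D two-term factors. A toy sketch of that scoring step, assuming an independent per-feature inclusion prior rho; the paper's exact priors and weighting differ in detail.

```python
import numpy as np

def bma_nb_log_score(log_prior_y, log_lik, rho=0.5):
    """Log of  P(y) * sum_{S subset of features} P(S) prod_{i in S} P(x_i|y)
    under an independent per-feature inclusion prior P(i in S) = rho (an
    assumption of this sketch). The 2^D-term sum factorizes into D two-term
    factors, which is the source of the linear-time claim. log_lik[i] is
    log P(x_i | y) for the observed feature value x_i."""
    score = log_prior_y
    for ll_i in log_lik:
        # each feature independently contributes rho*P(x_i|y) + (1 - rho)
        score += np.logaddexp(np.log(rho) + ll_i, np.log1p(-rho))
    return score

# Classify by comparing bma_nb_log_score across classes y and normalizing.
```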
【Paper Link】 【Pages】:3101-3107
【Authors】: Hongteng Xu ; Licheng Yu ; Dixin Luo ; Hongyuan Zha ; Yi Xu
【Abstract】: In this paper, we propose a novel dictionary learning method in the semi-supervised setting by dynamically coupling graph and group structures. To this end, samples are represented by sparse codes inheriting their graph structure while the labeled samples within the same class are represented with group sparsity, sharing the same atoms of the dictionary. Instead of statically combining graph and group structures, we take advantage of them in a mutually reinforcing way — in the dictionary learning phase, we introduce the unlabeled samples into groups by an entropy-based method and then update the corresponding local graph, resulting in a more structured and discriminative dictionary. We analyze the relationship between the two structures and prove the convergence of our proposed method. Focusing on image classification task, we evaluate our approach on several datasets and obtain superior performance compared with the state-of-the-art methods, especially in the case of only a few labeled samples and limited dictionary size.
【Keywords】: Dictionary learning; Group-graph structures; Mutually reinforcing; Sparse Representation; Classification
【Paper Link】 【Pages】:3108-3114
【Authors】: Hongteng Xu ; Hongyuan Zha ; Ren-Cang Li ; Mark A. Davenport
【Abstract】: In this paper, we propose an interpretation of active learning from a pure algebraic view and combine it with semi-supervised manifold learning. The proposed active manifold learning algorithm aims to learn the low-dimensional parameter space of the manifold with high accuracy from smartly labeled samples. We demonstrate that this problem is equivalent to a condition number minimization problem of the alignment matrix. Focusing on this problem, we first give a theoretical upper bound for the solution. Then we develop a heuristic but effective sample selection algorithm with the help of the Gershgorin circle theorem. We investigate the rationality, the feasibility, the universality and the complexity of the proposed method and demonstrate that our method yields encouraging active learning results.
【Keywords】: Active Learning; Semi-supervised Manifold Learning; Gershgorin Circle; Heuristic Algorithm
【Paper Link】 【Pages】:3115-3121
【Authors】: Zenglin Xu ; Rong Jin ; Bin Shen ; Shenghuo Zhu
【Abstract】: Nystrom approximation is an effective approach to accelerate the computation of kernel matrices in many kernel methods. In this paper, we consider the Nystrom approximation for sparse kernel methods. Instead of relying on the low-rank assumption of the original kernels, which does not hold in some applications, we take advantage of the restricted eigenvalue condition, which has been proved to be robust for sparse kernel methods. Based on the restricted eigenvalue condition, we have provided not only the approximation bound for the original kernel matrix but also the recovery bound for the sparse solutions of sparse kernel regression. In addition to the theoretical analysis, we also demonstrate the good performance of the Nystrom approximation for sparse kernel regression on real world data sets.
【Keywords】: nystrom approximation, kernel methods, regression
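For orientation, the standard Nystrom construction the paper builds on can be sketched in a few lines; the uniform landmark sampling and the kernel below are generic assumptions, not the paper's analysis.

```python
import numpy as np

def nystrom_approx(X, m, kernel, rng=None):
    """Standard Nystrom approximation K ~= C W^+ C^T using m landmark
    columns of the kernel matrix; a generic sketch."""
    rng = np.random.default_rng(rng)
    idx = rng.choice(len(X), size=m, replace=False)
    C = kernel(X, X[idx])            # n x m block of the kernel matrix
    W = C[idx, :]                    # m x m block on the landmarks
    return C @ np.linalg.pinv(W) @ C.T

# usage: K_hat = nystrom_approx(X, 50, lambda A, B: A @ B.T)  # linear kernel
```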
【Paper Link】 【Pages】:3122-3128
【Authors】: Yuto Yamaguchi ; Christos Faloutsos ; Hiroyuki Kitagawa
【Abstract】: If we know most of Smith’s friends are from Boston, what can we say about the rest of Smith’s friends? In this paper, we focus on the node classification problem on networks, which is one of the most important topics in AI and Web communities. Our proposed algorithm, referred to as OMNI-Prop, has the following properties: (a) seamless and accurate; it works well on any label correlations (i.e., homophily, heterophily, and mixtures of them) (b) fast; it is efficient and guaranteed to converge on arbitrary graphs (c) quasi-parameter free; it has just one well-interpretable parameter with a heuristic default value of 1. We also prove the theoretical connections of our algorithm to the semi-supervised learning (SSL) algorithms and to random-walks. Experiments on four different real network datasets demonstrate the benefits of the proposed algorithm, where OMNI-Prop outperforms the top competitors.
【Keywords】:
【Paper Link】 【Pages】:3129-3135
【Authors】: Yuya Yoshikawa ; Tomoharu Iwata ; Hiroshi Sawada
【Abstract】: Gaussian process (GP) regression is a widely used method for non-linear prediction. The performance of the GP regression depends on whether it can properly capture the covariance structure of target variables, which is represented by kernels between input data. However, when the input is represented as a set of features, e.g. bag-of-words, it is difficult to calculate desirable kernel values because the co-occurrence of different but relevant words cannot be reflected in the kernel calculation. To overcome this problem, we propose a Gaussian process latent variable set model (GP-LVSM), which is a non-linear regression model effective for bag-of-words data. With the GP-LVSM, a latent vector is associated with each word, and each document is represented as a distribution of the latent vectors for words appearing in the document. We efficiently represent the distributions by using the framework of kernel embeddings of distributions that can hold high-order moment information of distributions without need for explicit density estimation. By learning latent vectors so as to maximize the posterior probability, kernels that reflect relations between words are obtained, and also words are visualized in a low-dimensional space. In experiments using 25 item review datasets, we demonstrate the effectiveness of the GP-LVSM in prediction and visualization.
【Keywords】:
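The kernel-embedding step can be sketched directly: with learned latent vectors in hand, the kernel between two documents is the average pairwise kernel between the latent vectors of their words. A minimal sketch, with the latent matrix V assumed given (the posterior-maximization learning step is not shown).

```python
import numpy as np

def doc_kernel(doc_a, doc_b, V, gamma=1.0):
    """Kernel between two documents via kernel mean embeddings: each document
    is treated as the empirical distribution of the latent vectors of its
    words, and the kernel is the average RBF value over all word pairs.
    `doc_a`, `doc_b` are integer arrays of word ids; V (vocab_size x dim)
    holds the latent word vectors."""
    A, B = V[doc_a], V[doc_b]
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2).mean()
```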
【Paper Link】 【Pages】:3136-3142
【Authors】: Takayuki Yoshizumi
【Abstract】: The solutions or states of optimization problems or simulations are evaluated by using objective functions. The weights for these objective functions usually have to be estimated from experts' evaluations, which are likely to be qualitative and somewhat subjective. Although such estimation tasks are normally regarded as quite suitable for machine learning, we propose a mathematical programming-based method for better estimation. The key idea of our method is to use an ordinal scale for measuring paired differences of the objective values as well as the paired objective values. By using an ordinal scale, experts' qualitative and subjective evaluations can be appropriately expressed as simultaneous linear inequalities, which can be handled by a mathematical programming solver. This allows us to extract more information from experts' evaluations compared to machine-learning-based algorithms, which increases the accuracy of our estimation. We show that our method outperforms machine-learning-based algorithms in a test of finding appropriate weights for an objective function.
【Keywords】: machine learning; mathematical programming
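A minimal sketch of how such ordinal judgements become simultaneous linear inequalities solvable by a mathematical programming solver. The slack-minimization formulation, nonnegativity bounds, and interfaces here are assumptions; the paper's full method also places ordinal constraints on paired differences of objective values.

```python
import numpy as np
from scipy.optimize import linprog

def estimate_weights(features, prefs, eps=1.0):
    """Encode ordinal judgements 'solution a is rated above solution b' as
    simultaneous linear inequalities  w . (f_a - f_b) + xi >= eps  with
    nonnegative slack xi, then minimize total slack with an LP."""
    n_pairs, d = len(prefs), features.shape[1]
    c = np.concatenate([np.zeros(d), np.ones(n_pairs)])  # minimize sum xi
    A_ub, b_ub = [], []
    for k, (a, b) in enumerate(prefs):
        row = np.zeros(d + n_pairs)
        row[:d] = features[a] - features[b]
        row[d + k] = 1.0
        A_ub.append(-row)   # w.(f_a - f_b) + xi_k >= eps  <=>  -row.x <= -eps
        b_ub.append(-eps)
    bounds = [(0, None)] * (d + n_pairs)  # nonnegative weights and slacks
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub), bounds=bounds)
    return res.x[:d]
```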
【Paper Link】 【Pages】:3143-3149
【Authors】: Hongyang Zhang ; Zhouchen Lin ; Chao Zhang ; Edward Y. Chang
【Abstract】: Subspace recovery from noisy or even corrupted data is critical for various applications in machine learning and data analysis. To detect outliers, Robust PCA (R-PCA) via Outlier Pursuit was proposed and has found many successful applications. However, the current theoretical analysis on Outlier Pursuit only shows that it succeeds when the sparsity of the corruption matrix is of O(n/r), where n is the number of the samples and r is the rank of the intrinsic matrix, which may be comparable to n. Moreover, the regularization parameter is suggested as 3/(7√(γn)), where γ is a parameter that is not known a priori. In this paper, with an incoherence condition and a proposed ambiguity condition we prove that Outlier Pursuit succeeds when the rank of the intrinsic matrix is of O(n/log n) and the sparsity of the corruption matrix is of O(n). We further show that the orders of both bounds are tight. Thus R-PCA via Outlier Pursuit is able to recover an intrinsic matrix of higher rank and identify much denser corruptions than what the existing results could predict. Moreover, we suggest that the regularization parameter be chosen as 1/√(log n), which is definite. Our analysis waives the necessity of tuning the regularization parameter and also significantly extends the working range of Outlier Pursuit. Experiments on synthetic and real data verify our theories.
【Keywords】: Robust PCA, Low Rank, Sparsity, Subspace Recovery, Outlier Detection
【Paper Link】 【Pages】:3150-3157
【Authors】: Kun Zhang ; Mingming Gong ; Bernhard Schölkopf
【Abstract】: This paper is concerned with the problem of domain adaptation with multiple sources from a causal point of view. In particular, we use causal models to represent the relationship between the features X and class label Y, and consider possible situations where different modules of the causal model change with the domain. In each situation, we investigate what knowledge is appropriate to transfer and find the optimal target-domain hypothesis. This gives an intuitive interpretation of the assumptions underlying certain previous methods and motivates new ones. We finally focus on the case where Y is the cause for X with changing P(Y) and P(X|Y), that is, P(Y) and P(X|Y) change independently across domains. Under appropriate assumptions, the availability of multiple source domains allows a natural way to reconstruct the conditional distribution on the target domain; we propose to model P(X|Y) (the process to generate effect X from cause Y) on the target domain as a linear mixture of those on source domains, and estimate all involved parameters by matching the target-domain feature distribution. Experimental results on both synthetic and real-world data verify our theoretical results.
【Keywords】: domain adaptation; causal knowledge; target shift; conditional shift
【Paper Link】 【Pages】:3158-3164
【Authors】: Lijun Zhang ; Tianbao Yang ; Rong Jin ; Zhi-Hua Zhou
【Abstract】: In online bandit learning, the learner aims to minimize a sequence of losses, while only observing the value of each loss at a single point. Although various algorithms and theories have been developed for online bandit learning, most of them are limited to convex losses. In this paper, we investigate the problem of online bandit learning with non-convex losses, and develop an efficient algorithm with formal theoretical guarantees. To be specific, we consider a class of losses which is a composition of a non-increasing scalar function and a linear function. This setting models a wide range of supervised learning applications such as online classification with a non-convex loss. Theoretical analysis shows that our algorithm achieves an O(poly(d) T^{2/3}) regret bound when the variation of the loss function is small. To the best of our knowledge, this is the first work in online bandit learning that does not rely on convexity.
【Keywords】: Online Bandit Learning; Non-convex Losses; Exploration-exploitation
【Paper Link】 【Pages】:3165-3173
【Authors】: Shengping Zhang ; Shiva Kasiviswanathan ; Pong C. Yuen ; Mehrtash Harandi
【Abstract】: Symmetric Positive Definite (SPD) matrices in the form of region covariances are considered rich descriptors for images and videos. Recent studies suggest that exploiting the Riemannian geometry of the SPD manifolds could lead to improved performances for vision applications. For tasks involving processing large-scale and dynamic data in computer vision, the underlying model is required to progressively and efficiently adapt itself to the new and unseen observations. Motivated by these requirements, this paper studies the problem of online dictionary learning on the SPD manifolds. We make use of the Stein divergence to recast the problem of online dictionary learning on the manifolds to a problem in Reproducing Kernel Hilbert Spaces, for which we develop efficient algorithms by taking into account the geometric structure of the SPD manifolds. To the best of our knowledge, our work is the first study that provides a solution for online dictionary learning on the SPD manifolds. Empirical results on both a large-scale image classification task and dynamic video processing tasks validate the superior performance of our approach as compared to several state-of-the-art algorithms.
【Keywords】: sparse coding; online dictionary learning; Symmetric Positive Definite Manifolds
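The Stein divergence at the core of this kernelization has a simple closed form; a minimal sketch of that one ingredient (the surrounding online dictionary-learning machinery is not reproduced here).

```python
import numpy as np

def stein_divergence(X, Y):
    """Symmetric Stein (Jensen-Bregman LogDet) divergence between SPD
    matrices: S(X, Y) = log det((X + Y) / 2) - 0.5 * log det(X Y).
    slogdet is used for numerical stability on SPD inputs."""
    _, ld_mid = np.linalg.slogdet((X + Y) / 2.0)
    _, ld_x = np.linalg.slogdet(X)
    _, ld_y = np.linalg.slogdet(Y)
    return ld_mid - 0.5 * (ld_x + ld_y)
```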
【Paper Link】 【Pages】:3174-3180
【Authors】: Xianchao Zhang ; Linlin Zong ; Xinyue Liu ; Hong Yu
【Abstract】: Existing multi-view clustering algorithms require that the data is completely or partially mapped between each pair of views. However, this requirement could not be satisfied in most practical settings. In this paper, we tackle the problem of multi-view clustering for unmapped data in the framework of NMF based clustering. With the help of inter-view constraints, we define the disagreement between each pair of views by the fact that the indicator vectors of two instances from two different views should be similar if they belong to the same cluster and dissimilar otherwise. The overall objective of our algorithm is to minimize the loss function of NMF in each view as well as the disagreement between each pair of views. Experimental results show that, with a small number of constraints, the proposed algorithm gets good performance on unmapped data, and outperforms existing algorithms on partially mapped data and completely mapped data.
【Keywords】: multi-view clustering; unmapped data; NMF
【Paper Link】 【Pages】:3181-3187
【Authors】: Yu Zhang
【Abstract】: In this paper, we study multi-task algorithms from the perspective of the algorithmic stability. We give a definition of the multi-task uniform stability, a generalization of the conventional uniform stability, which measures the maximum difference between the loss of a multi-task algorithm trained on a data set and that of the multi-task algorithm trained on the same data set but with a data point removed in each task. In order to analyze multi-task algorithms based on multi-task uniform stability, we prove a generalized McDiarmid's inequality which assumes the difference bound condition holds by changing multiple input arguments instead of only one in the conventional McDiarmid's inequality. By using the generalized McDiarmid's inequality as a tool, we can analyze the generalization performance of general multi-task algorithms in terms of the multi-task uniform stability. Moreover, as applications, we prove generalization bounds of several representative regularized multi-task algorithms.
【Keywords】: Multi-Task Learning; Stability
【Paper Link】 【Pages】:3188-3195
【Authors】: Han Zhao ; Pascal Poupart ; Yongfeng Zhang ; Martin Lysy
【Abstract】: We propose SoF (Soft-cluster matrix Factorization), a probabilistic clustering algorithm which softly assigns each data point into clusters. Unlike model-based clustering algorithms, SoF does not make assumptions about the data density distribution. Instead, we take an axiomatic approach to define 4 properties that the probability of co-clustered pairs of points should satisfy. Based on the properties, SoF utilizes a distance measure between pairs of points to induce the conditional co-cluster probabilities. The objective function in our framework establishes an important connection between probabilistic clustering and constrained symmetric Nonnegative Matrix Factorization (NMF), hence providing a theoretical interpretation for NMF-based clustering algorithms. To optimize the objective, we derive a sequential minimization algorithm using a penalty method. Experimental results on both synthetic and real-world datasets show that SoF significantly outperforms previous NMF-based algorithms and that it is able to detect non-convex patterns as well as cluster boundaries.
【Keywords】: Nonnegative matrix factorization; Probabilistic clustering; Optimization
【Paper Link】 【Pages】:3196-3202
【Authors】: Qian Zhao ; Deyu Meng ; Lu Jiang ; Qi Xie ; Zongben Xu ; Alexander G. Hauptmann
【Abstract】: Matrix factorization (MF) has been attracting much attention due to its wide applications. However, since MF models are generally non-convex, most of the existing methods are easily stuck into bad local minima, especially in the presence of outliers and missing data. To alleviate this deficiency, in this study we present a new MF learning methodology by gradually including matrix elements into MF training from easy to complex. This corresponds to a recently proposed learning fashion called self-paced learning (SPL), which has been demonstrated to be beneficial in avoiding bad local minima. We also generalize the conventional binary (hard) weighting scheme for SPL to a more effective real-valued (soft) weighting manner. The effectiveness of the proposed self-paced MF method is substantiated by a series of experiments on synthetic, structure from motion and background subtraction data.
【Keywords】:
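The hard-versus-soft weighting distinction can be made concrete with a small sketch; the linear soft scheme below is one common real-valued variant and may differ from the paper's exact choice.

```python
import numpy as np

def spl_weights(losses, lam, soft=True):
    """Self-paced sample weights from per-entry reconstruction losses.
    Hard scheme: include an entry iff its loss is below the age parameter
    lam. Soft scheme (one common real-valued variant): ramp the weight
    linearly from 1 at zero loss down to 0 at loss >= lam."""
    if soft:
        return np.clip(1.0 - losses / lam, 0.0, 1.0)
    return (losses < lam).astype(float)

# Self-paced MF then alternates: (1) a weighted factorization step on
# sum_ij w_ij * (M_ij - (U V^T)_ij)^2, (2) recomputing w = spl_weights(...)
# from the residuals, (3) growing lam so harder entries enter training.
```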
【Paper Link】 【Pages】:3203-3209
【Authors】: Yi Zhen ; Piyush Rai ; Hongyuan Zha ; Lawrence Carin
【Abstract】: We present a probabilistic framework for learning pairwise similarities between objects belonging to different modalities, such as drugs and proteins, or text and images. Our framework is based on learning a binary code based representation for objects in each modality, and has the following key properties: (i) it can leverage both pairwise as well as easy-to-obtain relative preference based cross-modal constraints, (ii) the probabilistic framework naturally allows querying for the most useful/informative constraints, facilitating an active learning setting (existing methods for cross-modal similarity learning do not have such a mechanism), and (iii) the binary code length is learned from the data. We demonstrate the effectiveness of the proposed approach on two problems that require computing pairwise similarities between cross-modal object pairs: cross-modal link prediction in bipartite graphs, and hashing based cross-modal similarity search.
【Keywords】: similarity learning
【Paper Link】 【Pages】:3210-3216
【Authors】: Quan Zhou ; Wenlin Chen ; Shiji Song ; Jacob R. Gardner ; Kilian Q. Weinberger ; Yixin Chen
【Abstract】: Algorithmic reductions are one of the cornerstones of theoretical computer science. Surprisingly, to date, they have only played a limited role in machine learning. In this paper we introduce a formal and practical reduction between two of the most widely used machine learning algorithms: from the Elastic Net (and the Lasso as a special case) to the Support Vector Machine. First, we derive the reduction and summarize it in only 11 lines of MATLAB. Then, we demonstrate its high impact potential by translating recent advances in parallelizing SVM solvers directly to the Elastic Net. The resulting algorithm is a parallel solver for the Elastic Net (and Lasso) that naturally utilizes GPU and multi-core CPUs. We evaluate it on twelve real world data sets, and show that it yields identical results as the popular (and highly optimized) glmnet implementation but is up to two orders of magnitude faster.
【Keywords】: Elastic Net; SVM; Reduction; Sparsity; Parallel computation
【Paper Link】 【Pages】:3217-3224
【Authors】: Feiyun Zhu ; Bin Fan ; Xinliang Zhu ; Ying Wang ; Shiming Xiang ; Chunhong Pan
【Abstract】: Subset selection from massive data with noised information is increasingly popular for various applications. This problem is still highly challenging as current methods are generally slow in speed and sensitive to outliers. To address the above two issues, we propose an accelerated robust subset selection (ARSS) method. Extensive experiments on ten benchmark datasets verify that our method not only outperforms state-of-the-art methods, but also runs 10,000+ times faster than the most related method.
【Keywords】:
【Paper Link】 【Pages】:3225-3231
【Authors】: Meysam Aghighi ; Peter Jonsson ; Simon Ståhlberg
【Abstract】: Causal graphs are widely used to analyze the complexity of planning problems. Many tractable classes have been identified with their aid and state-of-the-art heuristics have been derived by exploiting such classes. In particular, Katz and Keyder have studied causal graphs that are hourglasses (which is a generalization of forks and inverted-forks) and shown that the corresponding cost-optimal planning problem is tractable under certain restrictions. We continue this work by studying polytrees (which is a generalization of hourglasses) under similar restrictions. We prove tractability of cost-optimal planning by providing an algorithm based on a novel notion of variable isomorphism. Our algorithm also sheds light on the k-consistency procedure for identifying unsolvable planning instances. We speculate that this may, at least partially, explain why merge-and-shrink heuristics have been successful for recognizing unsolvable instances.
【Keywords】: automated planning; causal graph; polynomial-time algorithm; cost-optimal planning; polytree
【Paper Link】 【Pages】:3232-3238
【Authors】: Christer Bäckström
【Abstract】: Bäckström studied the parameterised complexity of planning when the domain-transition graphs (DTGs) are acyclic. He used the parameters d (domain size), k (number of paths in the DTGs) and w (treewidth of the causal graph), and showed that planning is fixed-parameter tractable (fpt) in these parameters, and fpt in only the parameter k if the causal graph is a polytree. We continue this work by considering some additional cases of non-acyclic DTGs. In particular, we consider the case where each strongly connected component (SCC) in a DTG must be a simple cycle, and we show that planning is fpt for this case if the causal graph is a polytree. This is done by first preprocessing the instance to construct an equivalent abstraction and then applying Bäckström's technique to this abstraction. We use the parameters d and k, reinterpreting k as the number of paths in the condensation of a DTG, and the two new parameters c (the number of contracted cycles along a path) and p_max (an upper bound for walking around cycles, when not unbounded).
【Keywords】: Planning; Parameterised Complexity; Fixed Parameter Tractability
【Paper Link】 【Pages】:3239-3246
【Authors】: Jeb Brooks ; Emilia Reed ; Alexander Gruver ; James C. Boerkoel
【Abstract】: Flexibility in agent scheduling increases the resilience of temporal plans in the face of new constraints. However, current metrics of flexibility ignore domain knowledge about how such constraints might arise in practice, e.g., due to the uncertain duration of a robot’s transition time from one location to another. Probabilistic temporal planning accounts for actions whose uncertain durations can be modeled with probability density functions. We introduce a new metric called robustness that measures the likelihood of success for probabilistic temporal plans. We show empirically that in multi-robot planning, robustness may be a better metric for assessing the quality of temporal plans than flexibility, thus reframing many popular scheduling optimization problems.
【Keywords】: Simple Temporal Problem; Multiagent Scheduling; Scheduling under Uncertainty
【Paper Link】 【Pages】:3247-3253
【Authors】: Daniel Bryce ; Sicun Gao ; David J. Musliner ; Robert P. Goldman
【Abstract】: PDDL+ planning involves reasoning about mixed discrete-continuous change over time. Nearly all PDDL+ planners assume that continuous change is linear. We present a new technique that accommodates nonlinear change by encoding problems as nonlinear hybrid systems. Using this encoding, we apply a Satisfiability Modulo Theories (SMT) solver to find plans. We show that it is important to use novel planning-specific heuristics for variable and value selection in SMT solving, an approach inspired by recent advances in planning as SAT. We show the promising performance of the resulting solver on challenging nonlinear problems.
【Keywords】: Planning; Satisfiability Modulo Theories; Heuristics
【Paper Link】 【Pages】:3254-3260
【Authors】: Alessandro Cimatti ; Andrea Micheli ; Marco Roveri
【Abstract】: In many practical domains, planning systems are required to reason about durative actions. A common assumption in the literature is that the executor is allowed to decide the duration of each action. However, this assumption may be too restrictive for applications. In this paper, we tackle the problem of temporal planning with uncontrollable action durations. We show how to generate robust plans that guarantee goal achievement despite the uncontrollability of the actual duration of the actions. We extend the state-space temporal planning framework, integrating recent techniques for solving temporal problems under uncertainty. We discuss different ways of lifting the total order plans generated by the heuristic search to partial order plans, showing (in)completeness results for each of them. We implemented our approach on top of COLIN, a state-of-the-art planner. An experimental evaluation over several benchmark problems shows the practical feasibility of the proposed approach.
【Keywords】: Temporal Planning with Uncontrollable Durations; Temporal uncertainty; Temporal Planning; Temporal Problems
【Paper Link】 【Pages】:3261-3267
【Authors】: Hao Cui ; Roni Khardon ; Alan Fern ; Prasad Tadepalli
【Abstract】: This paper investigates stochastic planning problems with large factored state and action spaces. We show that even with a moderate increase in the size of existing challenge problems, the performance of state-of-the-art algorithms deteriorates rapidly, making them ineffective. To address this problem we propose a family of simple but scalable online planning algorithms that combine sampling, as in Monte Carlo tree search, with “aggregation,” where the aggregation approximates a distribution over random variables by the product of their marginals. The algorithms are correct under some rather strong technical conditions and can serve as an unsound but effective heuristic when the conditions do not hold. An extensive experimental evaluation demonstrates that the new algorithms provide significant improvement over the state of the art when solving large problems in a number of challenge benchmark domains.
【Keywords】: Markov decision processes; Monte-Carlo tree search
【Paper Link】 【Pages】:3268-3274
【Authors】: Nina Ghanbari Ghooshchi ; Majid Namazi ; M. A. Hakim Newton ; Abdul Sattar
【Abstract】: We present a planner named Transition Constraints for Parallel Planning (TCPP). TCPP constructs a new constraint model from the domain transition graphs (DTGs) of a given planning problem. TCPP encodes the constraint model by using table constraints that allow don't cares or wild cards as cell values. TCPP uses the constraint solver Minion to solve the constraint model and returns the parallel plan. Empirical results exhibit the efficiency of our planning system over state-of-the-art constraint-based planners.
【Keywords】: Constraint-Based Planning; Planning; Constraints
【Paper Link】 【Pages】:3275-3282
【Authors】: Robert P. Goldman ; Ugur Kuter
【Abstract】: In this paper we present a plan-plan distance metric based on Kolmogorov (algorithmic) complexity. Generating diverse sets of plans is useful for tasks such as probing user preferences and reasoning about vulnerability to cyber attacks. Generating diverse plans, and comparing different diverse planning approaches, requires a domain-independent, theoretically motivated definition of the diversity distance between plans. Previously proposed diversity measures are not theoretically motivated, and can provide inconsistent results on the same plans. We define the diversity of plans in terms of how surprising one plan is given another or, its inverse, the conditional information in one plan given another. Kolmogorov complexity provides a domain-independent theory of conditional information. While Kolmogorov complexity is not computable, a related metric, Normalized Compression Distance (NCD), provides a well-behaved approximation. In this paper we introduce NCD as an alternative diversity metric, and analyze its performance empirically, in comparison with previous diversity measures, showing strengths and weaknesses of each. We also examine the use of different compressors in NCD. We show how NCD can be used to select a training set for HTN learning, giving an example of the utility of diversity metrics. We conclude with suggestions for future work on improving, extending, and applying it to serve new applications.
【Keywords】: Planning; Plan diversity; Diversity metric; Algorithmic complexity; Kolmogorov complexity; Compression distance
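NCD itself is a one-liner given any off-the-shelf compressor. A minimal sketch with zlib standing in for the compressors the paper compares; serializing plans to byte strings first is an assumption of this sketch.

```python
import zlib

def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance:
       NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C(.) is the compressed length under some real compressor
    (here zlib, one of several choices one could plug in)."""
    cx, cy = len(zlib.compress(x)), len(zlib.compress(y))
    cxy = len(zlib.compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)

# e.g. ncd(b"pickup A; stack A B", b"pickup B; stack B A")
```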
【Paper Link】 【Pages】:3283-3290
【Authors】: Eric A. Hansen ; Ibrahim Abdoulahi
【Abstract】: Fully observable decision-theoretic planning problems are commonly modeled as stochastic shortest path (SSP) problems. For this class of planning problems, heuristic search algorithms (including LAO*, RTDP, and related algorithms), as well as the value iteration algorithm on which they are based, lack an efficient test for convergence to an ε-optimal policy (except in the special case of discounting). We introduce a simple and efficient test for convergence that applies to SSP problems with positive action costs. The test can detect whether a policy is proper, that is, whether it achieves the goal state with probability 1. If proper, it gives error bounds that can be used to detect convergence to an ε-optimal solution. The convergence test incurs no extra overhead besides computing the Bellman residual, and the performance guarantee it provides substantially improves the utility of this class of planning algorithms.
【Keywords】:
【Paper Link】 【Pages】:3291-3297
【Authors】: Robert C. Holte ; Yusra Alkhazraji ; Martin Wehrle
【Abstract】: Pruning techniques have recently been shown to speed up search algorithms by reducing the branching factor of large search spaces. One such technique is sleep sets, which were originally introduced as a pruning technique for model checking, and which have recently been investigated on a theoretical level for planning. In this paper, we propose a generalization of sleep sets and prove its correctness. While the original sleep sets were based on the commutativity of operators, generalized sleep sets are based on a more general notion of operator sequence redundancy. As a result, our approach dominates the original sleep sets variant in terms of pruning power. On a practical level, our experimental evaluation shows the potential of sleep sets and their generalizations on a large and common set of planning benchmarks.
【Keywords】: pruning techniques, sleep sets, operator redundancy
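For orientation, the original commutativity-based sleep-set pruning can be sketched as a DFS invariant; the interfaces below (`applicable`, `apply_op`, `commutes`) are assumptions, and the paper's generalization replaces the commutativity test with a broader operator-sequence-redundancy test.

```python
def dfs_sleep(state, sleep, applicable, apply_op, commutes, visited):
    """Depth-first search with commutativity-based sleep sets: operators in
    `sleep` are pruned at `state`; after branching on operator o, every
    earlier-branched sibling that commutes with o stays asleep in the
    subtree below o. States are assumed hashable. Schematic sketch only."""
    visited.add(state)
    ops = [o for o in applicable(state) if o not in sleep]
    for i, o in enumerate(ops):
        child = apply_op(state, o)
        if child in visited:
            continue
        child_sleep = {p for p in (sleep | set(ops[:i])) if commutes(p, o)}
        dfs_sleep(child, child_sleep, applicable, apply_op, commutes, visited)
```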
【Paper Link】 【Pages】:3298-3304
【Authors】: Sarah Keren ; Avigdor Gal ; Erez Karpas
【Abstract】: Goal recognition design involves the offline analysis of goal recognition models by formulating measures that assess the ability to perform goal recognition within a model and finding efficient ways to compute and optimize them. In this work we present goal recognition design for non-optimal agents, which extends previous work by accounting for agents that behave non-optimally either intentionally or naïvely. The analysis we present includes a new generalized model for goal recognition design and the worst case distinctiveness (wcd) measure. For two special cases of sub-optimal agents we present methods for calculating the wcd, parts of which are based on novel compilations to classical planning problems. Our empirical evaluation shows the proposed solutions to be effective in computing and optimizing the wcd.
【Keywords】: Goal Recognition;Classical Planning Compilation;Intention Detection
【Paper Link】 【Pages】:3305-3312
【Authors】: Martin Kronegger ; Sebastian Ordyniak ; Andreas Pfandler
【Abstract】: Backdoors are a powerful tool to obtain efficient algorithms for hard problems. Recently, two new notions of backdoors to planning were introduced. However, for one of the new notions (i.e., variable-deletion) only hardness results are known so far. In this work we improve the situation by defining a new type of variable-deletion backdoors based on the extended causal graph of a planning instance. For this notion of backdoors several fixed-parameter tractable algorithms are identified. Furthermore, we explore the capabilities of polynomial time preprocessing, i.e., we check whether there exists a polynomial kernel. Our results also show the close connection between planning and verification problems such as Vector Addition System with States (VASS).
【Keywords】: Planning; Fixed-parameter tractable algorithms; (Parameterized) complexity theory; Backdoors
【Paper Link】 【Pages】:3313-3319
【Authors】: Meilun Li ; Zhikun She ; Andrea Turrini ; Lijun Zhang
【Abstract】: The classical planning problem can be enriched with quantitative and qualitative user-defined preferences on how the system behaves on achieving the goal. In this paper, we propose the probabilistic preference planning problem for Markov decision processes, where the preferences are based on an enriched probabilistic LTL-style logic. We develop P4Solver, an SMT-based planner computing the preferred plan by reducing the problem to a quadratic programming problem, which can be solved using SMT solvers such as Z3. We illustrate the framework by applying our approach to two selected case studies.
【Keywords】: Markov Decision Process; Planning Problem; Probabilistic Planning; User Preference; Probabilistic LTL
【Paper Link】 【Pages】:3320-3326
【Authors】: Hang Ma ; Joelle Pineau
【Abstract】: Planning in large partially observable Markov decision processes (POMDPs) is challenging especially when a long planning horizon is required. A few recent algorithms successfully tackle this case but at the expense of a weaker information-gathering capacity. In this paper, we propose Information Gathering and Reward Exploitation of Subgoals (IGRES), a randomized POMDP planning algorithm that leverages information in the state space to automatically generate "macro-actions" to tackle tasks with long planning horizons, while locally exploring the belief space to allow effective information gathering. Experimental results show that IGRES is an effective multi-purpose POMDP solver, providing state-of-the-art performance for both long horizon planning tasks and information-gathering tasks on benchmark domains. Additional experiments with an ecological adaptive management problem indicate that IGRES is a promising tool for POMDP planning in real-world settings.
【Keywords】: POMDPs; planning under uncertainty; robot navigation
【Paper Link】 【Pages】:3327-3334
【Authors】: Christian J. Muise ; Vaishak Belle ; Paolo Felli ; Sheila A. McIlraith ; Tim Miller ; Adrian R. Pearce ; Liz Sonenberg
【Abstract】: Many AI applications involve the interaction of multiple autonomous agents, requiring those agents to reason about their own beliefs, as well as those of other agents. However, planning involving nested beliefs is known to be computationally challenging. In this work, we address the task of synthesizing plans that necessitate reasoning about the beliefs of other agents. We plan from the perspective of a single agent with the potential for goals and actions that involve nested beliefs, non-homogeneous agents, co-present observations, and the ability for one agent to reason as if it were another. We formally characterize our notion of planning with nested belief, and subsequently demonstrate how to automatically convert such problems into problems that appeal to classical planning technology. Our approach represents an important first step towards applying the well-established field of automated planning to the challenging task of planning involving nested beliefs of multiple agents.
【Keywords】: planning; epistemic reasoning; multi-agent; nested belief
【Paper Link】 【Pages】:3335-3341
【Authors】: Florian Pommerening ; Malte Helmert ; Gabriele Röger ; Jendrik Seipp
【Abstract】: Operator cost partitioning is a well-known technique to make admissible heuristics additive by distributing the operator costs among individual heuristics. Planning tasks are usually defined with non-negative operator costs and therefore it appears natural to demand the same for the distributed costs. We argue that this requirement is not necessary and demonstrate the benefit of using general cost partitioning. We show that LP heuristics for operator-counting constraints are cost-partitioned heuristics and that the state equation heuristic computes a cost partitioning over atomic projections. We also introduce a new family of potential heuristics and show their relationship to general cost partitioning.
【Keywords】: classical planning;cost-optimal planning;cost partitioning;abstraction heuristics;operator-counting constraints;potential heuristics
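For reference, the admissibility condition behind cost partitioning, paraphrased from the standard definition rather than quoted from the paper; general cost partitioning keeps only the sum constraint and drops nonnegativity of the component costs.

```latex
% Split the cost function c among k component heuristics h_1, ..., h_k.
% The non-negative variant additionally requires c_i(o) >= 0 for all i, o;
% general cost partitioning (as advocated above) allows negative c_i(o).
\[
  \sum_{i=1}^{k} c_i(o) \le c(o) \;\; \text{for every operator } o
  \quad\Longrightarrow\quad
  \sum_{i=1}^{k} h_i(s; c_i) \;\le\; h^*(s; c),
\]
% i.e., the sum of the component heuristics, each admissible under its own
% cost function c_i, remains admissible for the original cost function c.
```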
【Paper Link】 【Pages】:3342-3348
【Authors】: Pascal Poupart ; Aarti Malhotra ; Pei Pei ; Kee-Eung Kim ; Bongseok Goh ; Michael Bowling
【Abstract】: In many situations, it is desirable to optimize a sequence of decisions by maximizing a primary objective while respecting some constraints with respect to secondary objectives. Such problems can be naturally modeled as constrained partially observable Markov decision processes (CPOMDPs) when the environment is partially observable. In this work, we describe a technique based on approximate linear programming to optimize policies in CPOMDPs. The optimization is performed offline and produces a finite state controller with desirable performance guarantees. The approach outperforms a constrained version of point-based value iteration on a suite of benchmark problems.
【Keywords】: Constrained POMDPs; Approximate Linear Programming; Finite State Controller
【Paper Link】 【Pages】:3349-3355
【Authors】: Jussi Rintanen
【Abstract】: The problem of planning or discrete control for timed systems has earlier been solved with various constraint-based solution methods, including Constraint Programming, SAT solvers, SAT modulo Theories solvers, and Mixed Integer-Linear Programming. In this work we investigate the encoding of time in such constraint-based representations. A main issue with existing encodings is the necessity to allow arbitrary interleavings of concurrent actions' starting and ending times. The complex combinatorics of this can lead to poor scalability of leading search methods. We show how real or rational time in temporal models can in many practically important cases be replaced by integer time, and how this leads to far simpler encodings of planning as constraints. We demonstrate that the simplified encodings substantially improve the scalability of constraint-based planning.
【Keywords】: SAT; SMT; planning; temporal planning
【Paper Link】 【Pages】:3356-3363
【Authors】: Yash Satsangi ; Shimon Whiteson ; Frans A. Oliehoek
【Abstract】: A key challenge in the design of multi-sensor systems is the efficient allocation of scarce resources such as bandwidth, CPU cycles, and energy, leading to the dynamic sensor selection problem in which a subset of the available sensors must be selected at each timestep. While partially observable Markov decision processes (POMDPs) provide a natural decision-theoretic model for this problem, the computational cost of POMDP planning grows exponentially in the number of sensors, making it feasible only for small problems. We propose a new POMDP planning method that uses greedy maximization to greatly improve scalability in the number of sensors. We show that, under certain conditions, the value function of a dynamic sensor selection POMDP is submodular and use this result to bound the error introduced by performing greedy maximization. Experimental results on a real-world dataset from a multi-camera tracking system in a shopping mall show it achieves similar performance to existing methods but incurs only a fraction of the computational cost, leading to much better scalability in the number of cameras.
【Keywords】: Planning and scheduling; Sensor selection; POMDPs
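The greedy maximization the paper analyzes is the classic marginal-gain loop; a minimal sketch with a toy coverage objective standing in for the POMDP value function (the submodularity conditions and the error bound are the paper's contribution and are not shown here).

```python
def greedy_max(ground_set, k, f):
    """Greedy maximization of a set function f: add, k times, the element
    with the largest marginal gain. For monotone submodular f this is the
    classic (1 - 1/e)-approximation such error bounds build on."""
    S = []
    for _ in range(k):
        gains = {x: f(S + [x]) - f(S) for x in ground_set if x not in S}
        S.append(max(gains, key=gains.get))
    return S

# Toy submodular objective: number of cells covered by the chosen sensors.
coverage = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d", "e"}}
f = lambda S: len(set().union(*(coverage[s] for s in S))) if S else 0
print(greedy_max(list(coverage), 2, f))  # -> [3, 1]
```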
【Paper Link】 【Pages】:3364-3370
【Authors】: Jendrik Seipp ; Silvan Sievers ; Malte Helmert ; Frank Hutter
【Abstract】: Sequential planning portfolios exploit the complementary strengths of different planners. Similarly, automated algorithm configuration tools can customize parameterized planning algorithms for a given type of tasks. Although some work has been done towards combining portfolios and algorithm configuration, the problem of automatically generating a sequential planning portfolio from a parameterized planner for a given type of tasks is still largely unsolved. Here, we present Cedalion, a conceptually simple approach for this problem that greedily searches for the pair of parameter configuration and runtime which, when appended to the current portfolio, maximizes portfolio improvement per additional runtime spent. We show theoretically that Cedalion yields portfolios provably within a constant factor of optimal for the training set distribution. We evaluate Cedalion empirically by applying it to construct sequential planning portfolios based on component planners from the highly parameterized Fast Downward (FD) framework. Results for a broad range of planning settings demonstrate that -- without any knowledge of planning or FD -- Cedalion constructs sequential FD portfolios that rival, and in some cases substantially outperform, manually-built FD portfolios.
【Keywords】: classical planning; sequential portfolios
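The greedy selection rule described above is simple enough to sketch end-to-end. The scoring below (newly solved training tasks per second of added runtime) and the synthetic runtime matrix are assumptions standing in for Cedalion's actual training objective and configuration space.

```python
import numpy as np

def build_portfolio(runtimes, budget, step=10.0):
    """Greedy portfolio construction in the spirit of the approach above:
    repeatedly append the (configuration, time slice) pair maximizing
    portfolio improvement per additional second. `runtimes[c, t]` is the
    time configuration c needs on training task t (np.inf if unsolved)."""
    n_configs, _ = runtimes.shape
    portfolio, used = [], 0.0
    solved = np.zeros(runtimes.shape[1], dtype=bool)
    while used < budget:
        best = None  # (gain per second, config, allocated time)
        for c in range(n_configs):
            for t_alloc in np.arange(step, budget - used + 1e-9, step):
                newly = np.sum(~solved & (runtimes[c] <= t_alloc))
                gain = newly / t_alloc
                if best is None or gain > best[0]:
                    best = (gain, c, t_alloc)
        if best is None or best[0] == 0:
            break  # no remaining task is solvable within the leftover budget
        _, c, t_alloc = best
        portfolio.append((c, t_alloc))
        solved |= runtimes[c] <= t_alloc
        used += t_alloc
    return portfolio
```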
【Paper Link】 【Pages】:3371-3377
【Authors】: Alexander Shleyfman ; Michael Katz ; Malte Helmert ; Silvan Sievers ; Martin Wehrle
【Abstract】: Heuristic search is a state-of-the-art approach to classical planning. Several heuristic families were developed over the years to automatically estimate goal distance information from problem descriptions. Orthogonally to the development of better heuristics, recent years have seen an increasing interest in symmetry-based state space pruning techniques that aim at reducing the search effort. However, little work has dealt with how the heuristics behave under symmetries. We investigate the symmetry properties of existing heuristics and reveal that many of them are invariant under symmetries.
【Keywords】: Classical Planning; Symmetries; Delete-relaxation Heuristics; Critical Path Heuristics; Landmark Heuristics
【Paper Link】 【Pages】:3378-3385
【Authors】: Silvan Sievers ; Martin Wehrle ; Malte Helmert ; Alexander Shleyfman ; Michael Katz
【Abstract】: Merge-and-shrink heuristics crucially rely on effective reduction techniques, such as bisimulation-based shrinking, to avoid the combinatorial explosion of abstractions. We propose the concept of factored symmetries for merge-and-shrink abstractions based on the established concept of symmetry reduction for state-space search. We investigate under which conditions factored symmetry reduction yields perfect heuristics and discuss the relationship to bisimulation. We also devise practical merging strategies based on this concept and experimentally validate their utility.
【Keywords】: merge-and-shrink heuristics; symmetries; heuristic search; abstraction heuristics
【Paper Link】 【Pages】:3386-3392
【Authors】: Sriram Srinivasan ; Erik Talvitie ; Michael H. Bowling
【Abstract】: Monte-Carlo planning has been proven successful in many sequential decision-making settings, but it suffers from poor exploration when the rewards are sparse. In this paper, we improve exploration in UCT by generalizing across similar states using a given distance metric. We show that this algorithm, like UCT, converges asymptotically to the optimal action. When the state space does not have a natural distance metric, we show how we can learn a local manifold from the transition graph of states in the near future to obtain a distance metric. On domains inspired by video games, empirical evidence shows that our algorithm is more sample efficient than UCT, particularly when rewards are sparse.
【Keywords】: UCT; Exploration; Manifolds
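A rough Python sketch of how value estimates can be generalized across similar states inside a UCB-style selection rule, as the abstract describes; the kernel form, statistics layout, and exploration constant are illustrative assumptions, not the paper's algorithm.

```python
import math

def similarity_ucb(stats, state, actions, distance, bandwidth=1.0, c=1.4):
    """UCB-style action selection that smooths value estimates across
    nearby states with a kernel built from a given distance metric.

    stats maps (state, action) -> (visit_count, mean_reward) for visited
    pairs; the kernel and constants are assumptions for illustration.
    """
    def kernel(s1, s2):
        return math.exp(-distance(s1, s2) / bandwidth)

    visits_here = sum(n for (s, _), (n, _) in stats.items() if s == state)
    best_action, best_score = None, -float('inf')
    for action in actions:
        # kernel-weighted pseudo-count and value estimate for this action
        w = sum(kernel(state, s) * n
                for (s, a), (n, _) in stats.items() if a == action)
        q = (sum(kernel(state, s) * n * mean
                 for (s, a), (n, mean) in stats.items() if a == action) / w
             if w > 0 else 0.0)
        bonus = c * math.sqrt(math.log(visits_here + 1) / (w + 1e-9))
        if q + bonus > best_score:
            best_action, best_score = action, q + bonus
    return best_action
```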
【Paper Link】 【Pages】:3393-3401
【Authors】: Siddharth Srivastava ; Shlomo Zilberstein ; Abhishek Gupta ; Pieter Abbeel ; Stuart J. Russell
【Abstract】: We create a unified framework for analyzing and synthesizing plans with loops for solving problems with non-deterministic numeric effects and a limited form of partial observability. Three different action models---with deterministic, qualitative non-deterministic and Boolean non-deterministic semantics---are handled using a single abstract representation. We establish the conditions under which the correctness and termination of solutions, represented as abstract policies, can be verified. We also examine the feasibility of learning abstract policies from examples. We demonstrate our techniques on several planning problems and show that they apply to challenging real-world tasks such as doing the laundry with a PR2 robot. These results resolve a number of open questions about planning with loops and facilitate the development of new algorithms and applications.
【Keywords】: sequential decision making; planning; robotics, safety and termination guarantees; household robots
【Paper Link】 【Pages】:3402-3408
【Authors】: Luis Gustavo Rocha Vianna ; Leliane N. de Barros ; Scott Sanner
【Abstract】: Recent advances in Symbolic Dynamic Programming (SDP) combined with the extended algebraic decision diagram (XADD) have provided exact solutions for expressive subclasses of finite-horizon Hybrid Markov Decision Processes (HMDPs) with mixed continuous and discrete state and action parameters. Unfortunately, SDP suffers from two major drawbacks: (1) it solves for all states and can be intractable for many problems that inherently have large optimal XADD value function representations; and (2) it cannot maintain compact (pruned) XADD representations for domains with nonlinear dynamics and reward due to the need for nonlinear constraint checking. In this work, we simultaneously address both of these problems by introducing real-time SDP (RTSDP). RTSDP addresses (1) by focusing the solution and value representation only on regions reachable from a set of initial states, and RTSDP addresses (2) by using visited states as witnesses of reachable regions to assist in pruning irrelevant or unreachable (nonlinear) regions of the value function. To this end, RTSDP enjoys provable convergence over the set of initial states and substantial space and time savings over SDP, as we demonstrate in a variety of hybrid domains ranging from inventory to reservoir to traffic control.
【Keywords】: Hybrid MDPs, Continuous Planning, Symbolic Dynamic Programming
【Paper Link】 【Pages】:3409-3417
【Authors】: David Wang ; Brian C. Williams
【Abstract】: Planning for and controlling a network of interacting devices requires a planner that accounts for the automatic timed transitions of devices, while meeting deadlines and achieving durative goals. Consider a planner for an imaging satellite with a camera that cannot tolerate exhaust. The planner would need to determine that opening a valve causes a chain reaction that ignites the engine, and thus needs to shield the camera. While planners exist that support deadlines and durative goals, currently, no planners can handle automatic timed transitions. We present tBurton, a temporal planner that supports these features, while additionally producing a temporally least-commitment plan. tBurton uses a divide and conquer approach: dividing the problem using causal-graph decomposition and conquering each factor with heuristic forward search. The "sub-plans" from each factor are then unified in a conflict directed search, guided by the causal graph structure. We describe why this approach is fast and efficient, and demonstrate its ability to improve the performance of existing planners on factorable problems through benchmarks from the International Planning Competition.
【Keywords】: planning, temporal planner, automata, timeline, regression, history, episode, temporal network
【Paper Link】 【Pages】:3418-3424
【Authors】: Kyle Hollins Wray ; Shlomo Zilberstein ; Abdel-Illah Mouaddib
【Abstract】: Sequential decision problems that involve multiple objectives are prevalent. Consider for example a driver of a semi-autonomous car who may want to optimize competing objectives such as travel time and the effort associated with manual driving. We introduce a rich model called Lexicographic MDP (LMDP) and a corresponding planning algorithm called LVI that generalize previous work by allowing for conditional lexicographic preferences with slack. We analyze the convergence characteristics of LVI and establish its game theoretic properties. The performance of LVI in practice is tested within a realistic benchmark problem in the domain of semi-autonomous driving. Finally, we demonstrate how GPU-based optimization can improve the scalability of LVI and other value iteration algorithms for MDPs.
【Keywords】: multi-objective; momdp; lmdp; mdp; lexicographic preferences
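A small Python sketch of value iteration under lexicographic preferences with slack, in the spirit of LVI as summarized above; the model interfaces (P, rewards, slacks) and the restart-per-objective scheme are illustrative assumptions rather than the paper's exact algorithm.

```python
def q_value(s, a, R, V, P, gamma):
    """One-step lookahead value of action a for reward function R."""
    return R(s, a) + gamma * sum(p * V[t] for p, t in P[s][a])

def lexicographic_vi(states, actions, P, rewards, slacks,
                     gamma=0.95, sweeps=200):
    """Value iteration under lexicographic preferences with slack.

    rewards -- one reward function per objective, highest priority first
    slacks  -- per-objective slack: actions within `slack` of optimal
               survive to be judged by the next objective
    P[s][a] -- list of (probability, next_state) pairs
    """
    admissible = {s: list(actions) for s in states}
    for R, slack in zip(rewards, slacks):
        V = {s: 0.0 for s in states}
        for _ in range(sweeps):
            V = {s: max(q_value(s, a, R, V, P, gamma) for a in admissible[s])
                 for s in states}
        for s in states:                      # prune to near-optimal actions
            best = max(q_value(s, a, R, V, P, gamma) for a in admissible[s])
            admissible[s] = [a for a in admissible[s]
                             if q_value(s, a, R, V, P, gamma) >= best - slack]
    return admissible
```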
【Paper Link】 【Pages】:3425-3431
【Authors】: Peng Yu ; Cheng Fang ; Brian Williams
【Abstract】: When scheduling tasks for field-deployable systems, our solutions must be robust to the uncertainty inherent in the real world. Although human intuition is trusted to balance reward and risk, humans perform poorly in risk assessment at the scale and complexity of real world problems. In this paper, we present a decision aid system that helps human operators diagnose the source of risk and manage uncertainty in temporal problems. The core of the system is a conflict-directed relaxation algorithm, called Conflict-Directed Chance-constraint Relaxation (CDCR), which specializes in resolving over-constrained temporal problems with probabilistic durations and a chance constraint bounding the risk of failure. Given a temporal problem with uncertain duration, CDCR proposes execution strategies that operate at acceptable risk levels and pinpoints the source of risk. If no such strategy can be found that meets the chance constraint, it can help humans to repair the over-constrained problem by trading off between desirability of solution and acceptable risk levels. The decision aid has been incorporated in a mission advisory system for assisting oceanographers to schedule activities in deep-sea expeditions, and demonstrated its effectiveness in scenarios with realistic uncertainty.
【Keywords】: temporal problems; probabilistic temporal constraints; chance-constrained scheduling; over-constrained problems; temporal relaxations
【Paper Link】 【Pages】:3432-3438
【Authors】: Zizhen Zhang ; Huang He ; Zhixing Luo ; Hu Qin ; Songshan Guo
【Abstract】: The split-delivery vehicle routing problem (SDVRP) is a natural extension of the classical vehicle routing problem (VRP) that allows the same customer to be served by more than one vehicle. This problem is a very challenging combinatorial optimization problem and has attracted much academic attention. To solve it, most articles in the literature adopt heuristic approaches in which the solution is represented by a set of delivery patterns and the search operators are derived from traditional VRP operators. In contrast, our approach employs the combination of a set of routes and a forest to represent the solution. Several forest-based operators are accordingly introduced. We integrate the new operators into a simple tabu search framework and then demonstrate the efficiency of our approach by conducting experiments on existing benchmark instances.
【Keywords】: split-delivery; vehicle routing; forest-based heuristic
【Paper Link】 【Pages】:3439-3446
【Authors】: Hankz Hankui Zhuo
【Abstract】: AI planning techniques often require a given set of action models provided as input. Creating action models is, however, a difficult task that costs much manual effort. The problem of action-model acquisition has drawn a lot of interest from researchers in the past. Despite the success of previous systems, they are all based on the assumption that there are enough training examples for learning high-quality action models. In many real-world applications, e.g., military operations, collecting a large amount of training examples is often both difficult and costly. Instead of collecting training examples, we assume there are abundant annotators, i.e., the crowd, available to provide information for learning action models. Specifically, we first build a set of soft constraints based on the labels (true or false) given by the crowd or annotators. We then build a set of soft constraints based on the input plan traces. After that, we put all the constraints together, solve them using a weighted MAX-SAT solver, and convert the solution of the solver to action models. Finally, we show experimentally that our approach is effective.
【Keywords】: Planning
【Paper Link】 【Pages】:3447-3453
【Authors】: Ehsan Abbasnejad ; Justin Domke ; Scott Sanner
【Abstract】: Bayesian decision theory underpins robust decision-making in applications ranging from plant control to robotics, where hedging action selection against state uncertainty is critical for minimizing low-probability but potentially catastrophic outcomes (e.g., uncontrollable plant conditions or robots falling into stairwells). Unfortunately, belief state distributions in such settings are often complex and/or high dimensional, thus prohibiting the efficient application of analytical techniques for expected utility computation when real-time control is required. This leaves Monte Carlo evaluation as one of the few viable (and hence frequently used) techniques for online action selection. However, loss-insensitive Monte Carlo methods may require large numbers of samples to identify optimal actions with high certainty, since they may sample from high-probability regions that do not disambiguate action utilities. In this paper we remedy this problem by deriving an optimal proposal distribution for a loss-calibrated Monte Carlo importance sampler that bounds the regret of using an estimated optimal action. Empirically, we show that using our loss-calibrated Monte Carlo method yields high-accuracy optimal action selections in a fraction of the number of samples required by conventional loss-insensitive samplers.
【Keywords】:
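For context, a minimal self-normalized importance-sampling estimator of expected utility, the quantity the paper's loss-calibrated proposal is designed to estimate efficiently; the proposal here is an arbitrary stand-in, not the regret-optimal one derived in the paper.

```python
def expected_utility_is(utility, belief_pdf, proposal_pdf, proposal_sample,
                        action, n_samples=10000):
    """Self-normalized importance-sampling estimate of E_belief[U(action, s)].

    belief_pdf and proposal_pdf are density functions; proposal_sample()
    draws a state from the proposal. Any valid proposal works in this
    generic estimator; its quality governs the sample efficiency.
    """
    num = den = 0.0
    for _ in range(n_samples):
        s = proposal_sample()
        w = belief_pdf(s) / proposal_pdf(s)   # importance weight
        num += w * utility(action, s)
        den += w
    return num / den
```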
【Paper Link】 【Pages】:3454-3460
【Authors】: Yossiri Adulyasak ; Pradeep Varakantham ; Asrar Ahmed ; Patrick Jaillet
【Abstract】: Markov Decision Problems (MDPs) offer an effective mechanism for planning under uncertainty. However, due to unavoidable uncertainty over models, it is difficult to obtain an exact specification of an MDP. We are interested in solving MDPs where transition and reward functions are not exactly specified. Existing research has primarily focused on computing infinite-horizon stationary policies when optimizing robustness, regret and percentile based objectives. We focus specifically on finite horizon problems, with a special emphasis on objectives that are separable over individual instantiations of model uncertainty (i.e., objectives that can be expressed as a sum over instantiations of model uncertainty): (a) First, we identify two separable objectives for uncertain MDPs: Average Value Maximization (AVM) and Confidence Probability Maximization (CPM). (b) Second, we provide optimization based solutions to compute policies for uncertain MDPs with such objectives. In particular, we exploit the separability of AVM and CPM objectives by employing Lagrangian dual decomposition (LDD). (c) Finally, we demonstrate the utility of the LDD approach on a benchmark problem from the literature.
【Keywords】: Markov Decision Problems (MDPs), Lagrangian Dual Decomposition; Bayesian Reinforcement Learning; Robust MDPs
【Paper Link】 【Pages】:3461-3467
【Authors】: Hadi Mohasel Afshar ; Scott Sanner ; Ehsan Abbasnejad
【Abstract】: Many real-world Bayesian inference problems such as preference learning or trader valuation modeling in financial markets naturally use piecewise likelihoods. Unfortunately, exact closed-form inference in the underlying Bayesian graphical models is intractable in the general case and existing approximation techniques provide few guarantees on both approximation quality and efficiency. While (Markov Chain) Monte Carlo methods provide an attractive asymptotically unbiased approximation approach, rejection sampling and Metropolis-Hastings both prove inefficient in practice, and analytical derivation of Gibbs samplers requires exponential space and time in the amount of data. In this work, we show how to transform problematic piecewise likelihoods into equivalent mixture models and then provide a blocked Gibbs sampling approach for this transformed model that achieves an exponential-to-linear reduction in space and time compared to a conventional Gibbs sampler. This enables fast, asymptotically unbiased Bayesian inference in a new expressive class of piecewise graphical models and empirically requires orders of magnitude less time than rejection, Metropolis-Hastings, and conventional Gibbs sampling methods to achieve the same level of accuracy.
【Keywords】: Gibbs sampling, piecewise graphical models
【Paper Link】 【Pages】:3468-3474
【Authors】: Rehan Abdul Aziz ; Geoffrey Chu ; Christian J. Muise ; Peter James Stuckey
【Abstract】: Model counting is the problem of computing the number of models that satisfy a given propositional theory. It has recently been applied to solving inference tasks in probabilistic logic programming, where the goal is to compute the probability of given queries being true provided a set of mutually independent random variables, a model (a logic program) and some evidence. The core of solving this inference task involves translating the logic program to a propositional theory and using a model counter. In this paper, we show that for some problems that involve inductive definitions like reachability in a graph, the translation of logic programs to SAT can be expensive for the purpose of solving inference tasks. For such problems, direct implementation of stable model semantics allows for more efficient solving. We present two implementation techniques, based on unfounded set detection, that extend a propositional model counter to a stable model counter. Our experiments show that for particular problems, our approach can outperform a state-of-the-art probabilistic logic programming solver by several orders of magnitude in terms of running time and space requirements, and can solve instances of significantly larger sizes on which the current solver runs out of time or memory.
【Keywords】: Model Counting; Stable Model Semantics; Probabilistic Logic Programming
【Paper Link】 【Pages】:3475-3481
【Authors】: Elias Bareinboim ; Jin Tian
【Abstract】: Selection and confounding biases are two of the most challenging problems that appear in data analysis in the empirical sciences as well as in artificial intelligence tasks. The combination of previously studied methods for each of these biases in isolation is not directly applicable to certain non-trivial cases in which selection and confounding biases are simultaneously present. In this paper, we tackle these instances non-parametrically and in full generality. We provide graphical and algorithmic conditions for recoverability of interventional distributions when selection and confounding biases are both present. Our treatment completely characterizes the class of causal effects that are recoverable in Markovian models, and is sufficient for Semi-Markovian models.
【Keywords】: sampling bias; experimental design; causal effects;
【Paper Link】 【Pages】:3482-3488
【Authors】: Nahla Ben Amor ; Fatma Essghaier ; Hélène Fargier
【Abstract】: This paper raises the question of collective decision-making under possibilistic uncertainty. We study four egalitarian decision rules and show that, in the context of a possibilistic representation of uncertainty, the use of an egalitarian collective utility function allows one to get rid of the Timing Effect. Making a step further, we prove that if both the agents' preferences and the collective ranking of the decisions satisfy Dubois and Prade's axioms (1995), particularly risk aversion and Pareto Unanimity, then the egalitarian collective aggregation is compulsory. This result can be seen as an ordinal counterpart of Harsanyi's theorem (1955).
【Keywords】: Decision under Uncertainty, Possibility Theory, Collective Choice, Egalitarianism, Timing Effect.
【Paper Link】 【Pages】:3489-3495
【Authors】: David Buchman ; David Poole
【Abstract】: We consider the problem of, given a probabilistic model on a set of random variables, how to add a new variable that depends on the other variables, without changing the original distribution. In particular, we consider relational models (such as Markov logic networks (MLNs)), where we cannot directly define conditional probabilities. In relational models, there may be an unbounded number of parents in the grounding, and conditional distributions need to be defined in terms of aggregators. The question we ask is whether and when it is possible to represent conditional probabilities at all in various relational models. Some aggregators have been shown to be representable by MLNs, by adding auxiliary variables; however it was unknown whether they could be defined without auxiliary variables. For other aggregators, it was not known whether they can be represented by MLNs at all. We obtained surprisingly strong negative results on the capability of flexible undirected relational models such as MLNs to represent aggregators without affecting the original model's distribution. We provide a map of what aspects of the models, including the use of auxiliary variables and quantifiers, result in the ability to represent various aggregators. In addition, we provide proof techniques which can be used to facilitate future theoretic results on relational models, and demonstrate them on relational logistic regression (RLR).
【Keywords】:
【Paper Link】 【Pages】:3496-3502
【Authors】: Krishnendu Chatterjee ; Martin Chmelik ; Raghav Gupta ; Ayush Kanodia
【Abstract】: We consider partially observable Markov decision processes (POMDPs) with a set of target states, where every transition is associated with an integer cost. The optimization objective we study asks to minimize the expected total cost till the target set is reached, while ensuring that the target set is reached almost-surely (with probability 1). We show that for integer costs approximating the optimal cost is undecidable. For positive costs, our results are as follows: (i) we establish matching lower and upper bounds for the optimal cost and the bound is double exponential; (ii) we show that the problem of approximating the optimal cost is decidable and present approximation algorithms developing on the existing algorithms for POMDPs with finite-horizon objectives. While the worst-case running time of our algorithm is double exponential, we present efficient stopping criteria for the algorithm and show experimentally that it performs well in many examples of interest.
【Keywords】: POMDPs; Reachability objectives; Total cost; Approximation algorithms
【Paper Link】 【Pages】:3503-3510
【Authors】: Suming Jeremiah Chen ; Arthur Choi ; Adnan Darwiche
【Abstract】: There are many criteria for measuring the value of information (VOI), each based on a different principle that is usually suitable for specific applications. We propose a new criterion for measuring the value of information, which values information that leads to robust decisions (i.e., ones that are unlikely to change due to new information). We also introduce an algorithm for Naive Bayes networks that selects features with maximal VOI under the new criterion. We discuss the application of the new criterion to classification tasks, showing how it can be used to trade off the budget allotted for acquiring information against the classification accuracy. In particular, we show empirically that the new criterion can reduce the expended budget significantly while reducing the classification accuracy only slightly. We also show empirically that the new criterion leads to decisions that are much more robust than those based on traditional VOI criteria, such as information gain and classification loss. This makes the new criterion particularly suitable for certain decision-making applications.
【Keywords】: value of information; same-decision probability; decision-making; classification; feature selection; exact probabilistic inference; branch and bound; knapsack problem
【Paper Link】 【Pages】:3511-3518
【Authors】: Yuxin Chen ; Shervin Javdani ; Amin Karbasi ; J. Andrew Bagnell ; Siddhartha S. Srinivasa ; Andreas Krause
【Abstract】: How should we gather information to make effective decisions? A classical answer to this fundamental problem is given by the decision-theoretic value of information. Unfortunately, optimizing this objective is intractable, and myopic (greedy) approximations are known to perform poorly. In this paper, we introduce DiRECt, an efficient yet near-optimal algorithm for nonmyopically optimizing value of information. Crucially, DiRECt uses a novel surrogate objective that is: (1) aligned with the value of information problem (2) efficient to evaluate and (3) adaptive submodular. This latter property enables us to utilize an efficient greedy optimization while providing strong approximation guarantees. We demonstrate the utility of our approach on four diverse case-studies: touch-based robotic localization, comparison-based preference learning, wild-life conservation management, and preference elicitation in behavioral economics. In the first application, we demonstrate DiRECt in closed-loop on an actual robotic platform.
【Keywords】: Sequential Decision Making; Value of Information; Adaptive Submodularity; Decision Region Determination; Touch-based Localization
【Paper Link】 【Pages】:3519-3525
【Authors】: Fábio Gagliardi Cozman ; Denis Deratani Mauá
【Abstract】: We examine the inferential complexity of Bayesian networks specified through logical constructs. We first consider simple propositional languages, and then move to relational languages. We examine both the combined complexity of inference (as network size and evidence size are not bounded) and the data complexity of inference (where network size is bounded); we also examine the connection to liftability through domain complexity. Combined and data complexity of several inference problems are presented, ranging from polynomial to exponential classes.
【Keywords】: Bayesian networks; relational Bayesian networks; complexity theory
【Paper Link】 【Pages】:3526-3532
【Authors】: Xiannian Fan ; Changhe Yuan
【Abstract】: Several heuristic search algorithms such as A* and breadth-first branch and bound have been developed for learning Bayesian network structures that optimize a scoring function. These algorithms rely on a lower bound function called k-cycle conflict heuristic in guiding the search to explore the most promising search spaces. The heuristic takes as input a partition of the random variables of a data set; the importance of the partition opens up opportunities for further research. This work introduces a new partition method based on information extracted from the potential optimal parent sets (POPS) of the variables. Empirical results show that the new partition can significantly improve the efficiency and scalability of heuristic search-based structure learning algorithms.
【Keywords】: Structure Learning; Bayesian Network; heuristic search
【Paper Link】 【Pages】:3533-3539
【Authors】: Darrell Hoy ; Evdokia Nikolova
【Abstract】: Mitigating risk in decision-making has been a long-standing problem. Due to the mathematical challenge of its nonlinear nature, especially in adaptive decision-making problems, finding optimal policies is typically intractable. With a focus on efficient algorithms, we ask how well we can approximate the optimal policies for the difficult case of general utility models of risk. Little is known about efficient algorithms beyond the very special cases of linear (risk-neutral) and exponential utilities since general utilities are not separable and preclude the use of traditional dynamic programming techniques. In this paper, we consider general utility functions and investigate efficient computation of approximately optimal routing policies, where the goal is to maximize the expected utility of arriving at a destination around a given deadline. We present an adaptive discretization variant of successive approximation which gives an ε-optimal policy in polynomial time. The main insight is to perform discretization at the utility level space, which results in a nonuniform discretization of the domain, and applies for any monotone utility function.
【Keywords】: risk-aversion; planning under uncertainty; routing; markov decision process; approximation
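A short Python sketch of discretizing at the utility-level space, the main insight named in the abstract: grid points are chosen so that utilities, not domain values, are evenly spaced. The bisection-based inversion and the function signature are illustrative assumptions.

```python
def utility_level_grid(u, lo, hi, n, tol=1e-9):
    """Discretize [lo, hi] at points whose utilities are evenly spaced.

    u must be monotone increasing on [lo, hi]; each grid point is found
    by bisection on u. The grid is nonuniform in the domain but uniform
    in utility space.
    """
    levels = [u(lo) + k * (u(hi) - u(lo)) / n for k in range(n + 1)]
    grid = []
    for level in levels:
        a, b = lo, hi
        while b - a > tol:                    # invert u by bisection
            mid = (a + b) / 2.0
            if u(mid) < level:
                a = mid
            else:
                b = mid
        grid.append((a + b) / 2.0)
    return grid
```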
【Paper Link】 【Pages】:3540-3547
【Authors】: Thomas Keller ; Florian Geißer
【Abstract】: We introduce the MDP-Evaluation Stopping Problem, the optimization problem faced by participants of the International Probabilistic Planning Competition 2014 that focus on their own performance. It can be constructed as a meta-MDP where actions correspond to the application of a policy on a base-MDP, which is intractable in practice. Our theoretical analysis reveals that there are tractable special cases where the problem can be reduced to an optimal stopping problem. We derive approximate strategies of high quality by relaxing the general problem to an optimal stopping problem, and show both theoretically and experimentally that it not only pays off to pursue luck in the execution of the optimal policy, but that there are even cases where it is better to be lucky than good as the execution of a suboptimal base policy is part of an optimal strategy in the meta-MDP.
【Keywords】: Optimal Stopping Problem; Secretary Problem; MDP; Planning under Uncertainty; IPPC; UCT
【Paper Link】 【Pages】:3548-3555
【Authors】: Hyeoneun Kim ; Woosang Lim ; Kanghoon Lee ; Yung-Kyun Noh ; Kee-Eung Kim
【Abstract】: Bayesian reinforcement learning (BRL) provides a formal framework for optimal exploration-exploitation tradeoff in reinforcement learning. Unfortunately, it is generally intractable to find the Bayes-optimal behavior except for restricted cases. As a consequence, many BRL algorithms, model-based approaches in particular, rely on approximated models or real-time search methods. In this paper, we present potential-based shaping for improving the learning performance in model-based BRL. We propose a number of potential functions that are particularly well suited for BRL, and are domain-independent in the sense that they do not require any prior knowledge about the actual environment. By incorporating the potential function into real-time heuristic search, we show that we can significantly improve the learning performance in standard benchmark domains.
【Keywords】:
【Paper Link】 【Pages】:3556-3563
【Authors】: Kanghoon Lee ; Kee-Eung Kim
【Abstract】: Bayesian reinforcement learning (BRL) provides a principled framework for optimal exploration-exploitation tradeoff in reinforcement learning. We focus on model based BRL, which involves a compact formulation of the optimal tradeoff from the Bayesian perspective. However, it still remains a computational challenge to compute the Bayes-optimal policy. In this paper, we propose a novel approach to compute tighter value function bounds of the Bayes-optimal value function, which is crucial for improving the performance of many model-based BRL algorithms. We then present how our bounds can be integrated into real-time AO* heuristic search, and provide a theoretical analysis on the impact of improved bounds on the search efficiency. We also provide empirical results on standard BRL domains that demonstrate the effectiveness of our approach.
【Keywords】:
【Paper Link】 【Pages】:3564-3570
【Authors】: Phillip Odom ; Tushar Khot ; Reid Porter ; Sriraam Natarajan
【Abstract】: Advice giving has been long explored in artificial intelligence to build robust learning algorithms. We consider advice giving in relational domains where the noise is systematic. The advice is provided as logical statements that are then explicitly considered by the learning algorithm at every update. Our empirical results show that human advice can effectively accelerate learning in noisy structured domains where so far humans have been merely used as labelers or as designers of the initial structure of the model.
【Keywords】: Relational Probabilistic Models; Uncertainty in AI; Knowledge-Intensive Learning
【Paper Link】 【Pages】:3571-3577
【Authors】: Aaditya Ramdas ; Sashank Jakkam Reddi ; Barnabás Póczos ; Aarti Singh ; Larry A. Wasserman
【Abstract】: This paper is about two related decision theoretic problems, nonparametric two-sample testing and independence testing. There is a belief that two recently proposed solutions, based on kernels and distances between pairs of points, behave well in high-dimensional settings. We identify different sources of misconception that give rise to the above belief. Specifically, we differentiate the hardness of estimation of test statistics from the hardness of testing whether these statistics are zero or not, and explicitly discuss a notion of "fair" alternative hypotheses for these problems as dimension increases. We then demonstrate that the power of these tests actually drops polynomially with increasing dimension against fair alternatives. We end with some theoretical insights and shed light on the median heuristic for kernel bandwidth selection. Our work advances the current understanding of the power of modern nonparametric hypothesis tests in high dimensions.
【Keywords】:
【Paper Link】 【Pages】:3578-3584
【Authors】: Sherry Shanshan Ruan ; Gheorghe Comanici ; Prakash Panangaden ; Doina Precup
【Abstract】: We provide a novel, flexible, iterative refinement algorithm to automatically construct an approximate state-space representation for Markov Decision Processes (MDPs). Our approach leverages bisimulation metrics, which have been used in prior work to generate features to represent the state space of MDPs. We address a drawback of this approach, which is the expensive computation of the bisimulation metrics. We propose an algorithm to generate an iteratively improving sequence of state space partitions. Partial metric computations guide the representation search and provide much lower space and computational complexity, while maintaining strong convergence properties. We provide theoretical results guaranteeing convergence as well as experimental illustrations of the accuracy and savings (in time and memory usage) of the new algorithm, compared to traditional bisimulation metric computation.
【Keywords】:
【Paper Link】 【Pages】:3585-3591
【Authors】: Michael John Schofield ; Michael Thielscher
【Abstract】: General Game Playing is the design of AI systems able to understand the rules of new games and to use such descriptions to play those games effectively. Games with incomplete information have recently been added as a new challenge for general game-playing systems. The only published solutions to this challenge are based on sampling complete information models. In doing so they ground all of the unknown information, thereby making information gathering moves of no value; a well-known criticism of such sampling based systems. We present and analyse a method for escalating reasoning from complete information models to incomplete information models and show how this enables a general game player to correctly value information in incomplete information games. Experimental results demonstrate the success of this technique over standard model sampling.
【Keywords】: Imperfect Information Games; Reasoning with Incomplete Information; Hyperplay
【Paper Link】 【Pages】:3592-3598
【Authors】: Alexander Shleyfman ; Antonín Komenda ; Carmel Domshlak
【Abstract】: Interruptible pure exploration in multi-armed bandits (MABs) is a key component of Monte-Carlo tree search algorithms for sequential decision problems. We introduce Discriminative Bucketing (DB), a novel family of strategies for pure exploration in MABs, which allows for adapting recent advances in non-interruptible strategies to the interruptible setting, while guaranteeing exponential-rate performance improvement over time. Our experimental evaluation demonstrates that the corresponding instances of DB favorably compete both with the currently popular strategies UCB1 and Epsilon-Greedy, as well as with the conservative uniform sampling.
【Keywords】: Multi-Armed Bandit; Monte-Carlo; Search
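For reference, a minimal interruptible pure-exploration loop using the conservative uniform-sampling baseline the abstract compares against; Discriminative Bucketing's own bucketing rule is not specified in the abstract, so only the anytime interface is sketched here.

```python
def anytime_best_arm(pull, n_arms, time_left):
    """Interruptible pure exploration, uniform baseline: sample arms
    round-robin and, whenever interrupted, recommend the empirically
    best arm.

    pull(a) returns a stochastic reward; time_left() says whether the
    (unknown) interruption deadline has arrived.
    """
    counts = [0] * n_arms
    totals = [0.0] * n_arms
    t = 0
    while time_left():
        arm = t % n_arms                      # round-robin sampling
        totals[arm] += pull(arm)
        counts[arm] += 1
        t += 1
    return max(range(n_arms),
               key=lambda a: totals[a] / counts[a] if counts[a] else 0.0)
```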
【Paper Link】 【Pages】:3599-3605
【Authors】: Guy Van den Broeck ; Mathias Niepert
【Abstract】: Lifted probabilistic inference algorithms have been successfully applied to a large number of symmetric graphical models. Unfortunately, the majority of real-world graphical models are asymmetric. This is even the case for relational representations when evidence is given. Therefore, more recent work in the community moved to making the models symmetric and then applying existing lifted inference algorithms. However, this approach has two shortcomings. First, all existing over-symmetric approximations require a relational representation such as Markov logic networks. Second, the induced symmetries often change the distribution significantly, making the computed probabilities highly biased. We present a framework for probabilistic sampling-based inference that only uses the induced approximate symmetries to propose steps in a Metropolis-Hastings style Markov chain. The framework, therefore, leads to improved probability estimates while remaining unbiased. Experiments demonstrate that the approach outperforms existing MCMC algorithms.
【Keywords】: lifted inference; probabilistic inference; symmetry-aware inference
【Paper Link】 【Pages】:3606-3612
【Authors】: Deepak Venugopal ; Somdeb Sarkhel ; Vibhav Gogate
【Abstract】: The main computational bottleneck in various sampling based and local-search based inference algorithms for Markov logic networks (e.g., Gibbs sampling, MC-SAT, MaxWalksat, etc.) is computing the number of groundings of a first-order formula that are true given a truth assignment to all of its ground atoms. We reduce this problem to the problem of counting the number of solutions of a constraint satisfaction problem (CSP) and show that during their execution, both sampling based and local-search based algorithms repeatedly solve dynamic versions of this counting problem. Deriving from the vast amount of literature on CSPs and graphical models, we propose an exact junction-tree based algorithm for computing the number of solutions of the dynamic CSP, analyze its properties, and show how it can be used to improve the computational complexity of Gibbs sampling and MaxWalksat. Empirical tests on a variety of benchmarks clearly show that our new approach is several orders of magnitude more scalable than existing approaches.
【Keywords】:
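The core counting subproblem can be stated concisely; the naive enumeration below (exponential in the number of logical variables) is exactly the computation the paper's junction-tree approach avoids repeating inside Gibbs sampling and MaxWalkSAT. The formula/assignment interface is an illustrative assumption.

```python
from itertools import product

def count_true_groundings(formula, logical_vars, domain, world):
    """Count the groundings of a first-order formula that are true in a
    given world (a truth assignment to all ground atoms).

    formula(binding, world) -> bool evaluates one grounding; this naive
    enumeration costs |domain| ** len(logical_vars) evaluations.
    """
    return sum(1 for values in product(domain, repeat=len(logical_vars))
               if formula(dict(zip(logical_vars, values)), world))
```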
【Paper Link】 【Pages】:3613-3619
【Authors】: Ngo Anh Vien ; Marc Toussaint
【Abstract】: Monte-Carlo Tree Search, especially UCT and its POMDP version POMCP, has demonstrated excellent performance on many problems. However, to efficiently scale to large domains one should also exploit hierarchical structure if present. In such hierarchical domains, finding rewarded states typically requires searching deeply; covering enough such informative states very far from the root becomes computationally expensive in flat, non-hierarchical search approaches. We propose novel, scalable MCTS methods which integrate a task hierarchy into the MCTS framework, specifically leading to hierarchical versions of both UCT and POMCP. The new method does not need to estimate probabilistic models of each subtask; it instead computes subtask policies in a purely sample-based manner. We evaluate the hierarchical MCTS methods on various settings such as a hierarchical MDP, a Bayesian model-based hierarchical RL problem, and a large hierarchical POMDP.
【Keywords】: Monte-Carlo Tree Search, Hierarchical Monte-Carlo Planning, Bayesian hierarchical RL, POSMDP
【Paper Link】 【Pages】:3620-3627
【Authors】: Andrew J. Wang ; Brian C. Williams
【Abstract】: Temporal uncertainty in large-scale logistics forces one to trade off between lost efficiency through built-in slack and costly replanning when deadlines are missed. Due to the difficulty of reasoning about such likelihoods and consequences, a computational framework is needed to quantify and bound the risk of violating scheduling requirements. This work addresses the chance-constrained scheduling problem, where actions' durations are modeled probabilistically. Our solution method uses conflict-directed risk allocation to efficiently compute a scheduling policy. The key insight, compared to previous work in probabilistic scheduling, is to decouple the reasoning about temporal and risk constraints. This decomposes the problem into a separate master and subproblem, which can be iteratively solved much quicker. Through a set of simulated car-sharing scenarios, it is empirically shown that conflict-directed risk allocation computes solutions nearly an order of magnitude faster than prior art, which considers all constraints in a single lump-sum optimization.
【Keywords】: pstn; scheduling; chance constraint; risk allocation
【Paper Link】 【Pages】:3628-3634
【Authors】: Jeremy C. Weiss ; Sriraam Natarajan ; C. David Page Jr.
【Abstract】: Applications of graphical models often require the use of approximate inference, such as sequential importance sampling (SIS), for estimation of the model distribution given partial evidence, i.e., the target distribution. However, when SIS proposal and target distributions are dissimilar, such procedures lead to biased estimates or require a prohibitive number of samples. We introduce ReBaSIS, a method that better approximates the target distribution by sampling variable by variable from existing importance samplers and accepting or rejecting each proposed assignment in the sequence: a choice made based on anticipating upcoming evidence. We relate the per-variable proposal and model distributions by expected weight ratios of sequence completions and show that we can learn accurate models of optimal acceptance probabilities from local samples. In a continuous-time domain, our method improves upon previous importance samplers by transforming an SIS problem into a machine learning one.
【Keywords】: rejection-based importance sampling
【Paper Link】 【Pages】:3635-3641
【Authors】: Erik Peter Zawadzki ; Sébastien Lahaie
【Abstract】: A scoring rule is a device for eliciting and assessing probabilistic forecasts from an agent. When dealing with continuous outcome spaces, and absent any prior insights into the structure of the agent's beliefs, the rule should allow for a flexible reporting interface that can accurately represent complicated, multi-modal distributions. In this paper, we provide such a scoring rule based on a nonparametric approach of eliciting a set of samples from the agent and efficiently evaluating the score using kernel methods. We prove that sampled reports of increasing size converge rapidly to the true score, and that sampled reports are approximately optimal. We also demonstrate a connection between the scoring rule and the maximum mean discrepancy divergence. Experimental results are provided that confirm rapid convergence and that the expected score correlates well with standard notions of divergence, both important considerations for ensuring that agents are incentivized to report accurate information.
【Keywords】: Scoring rules, Nonparametric score, Kernel density estimation
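A small numpy sketch of a sample-based kernel score in the spirit of the MMD connection noted in the abstract: a report is rewarded for concentrating near the realized outcome and penalized for self-similarity. The RBF kernel, bandwidth, and one-dimensional outcome space are assumptions for illustration.

```python
import numpy as np

def kernel_score(report_samples, outcome, gamma=1.0):
    """Sample-based kernel score of a probabilistic report against the
    realized outcome: 2 * E[k(X, y)] - E[k(X, X')], estimated from the
    reported samples. Higher is better.
    """
    r = np.asarray(report_samples, dtype=float)
    cross = np.exp(-gamma * (r - outcome) ** 2).mean()
    self_sim = np.exp(-gamma * (r[:, None] - r[None, :]) ** 2).mean()
    return float(2.0 * cross - self_sim)
```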
【Paper Link】 【Pages】:3642-3648
【Authors】: Chongjie Zhang ; Julie A. Shah
【Abstract】: The utilitarian solution criterion, which has been extensively studied in multi-agent decision making under uncertainty, aims to maximize the sum of individual utilities. However, as the utilitarian solution often discriminates against some agents, it is not desirable for many practical applications where agents have their own interests and fairness is expected. To address this issue, this paper introduces egalitarian solution criteria for sequential decision-making under uncertainty, which are based on the maximin principle. Motivated by different application domains, we propose four maximin fairness criteria and develop corresponding algorithms for computing their optimal policies. Furthermore, we analyze the connections between these criteria and discuss and compare their characteristics.
【Keywords】: Fairness; decision-making under uncertainty; solution criterion
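A minimal Python sketch of the simplest of the maximin ideas described above: value iteration that backs up the action maximizing the worst-off agent's value. The model interface is assumed, and this illustrates only one of the four criteria the paper formalizes.

```python
def maximin_vi(states, actions, P, rewards, gamma=0.95, sweeps=200):
    """Egalitarian (maximin) value iteration sketch.

    rewards[i](s, a) -- reward of agent i; P[s][a] -- (prob, next_state)
    pairs. Each backup picks the action maximizing the minimum, over
    agents, of the one-step lookahead value.
    """
    agents = range(len(rewards))
    V = {s: [0.0] * len(rewards) for s in states}
    for _ in range(sweeps):
        new_V = {}
        for s in states:
            def q(a, i):
                return rewards[i](s, a) + gamma * sum(
                    p * V[t][i] for p, t in P[s][a])
            a_star = max(actions, key=lambda a: min(q(a, i) for i in agents))
            new_V[s] = [q(a_star, i) for i in agents]
        V = new_V
    return V
```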
【Paper Link】 【Pages】:3649-3656
【Authors】: Xiaoyuan Zhu ; Changhe Yuan
【Abstract】: Most Relevant Explanation (MRE) is a new inference task in Bayesian networks that finds the most relevant partial instantiation of target variables as an explanation for given evidence by maximizing the Generalized Bayes Factor (GBF). No exact algorithm has been developed for solving MRE previously. This paper fills the void and introduces a breadth-first branch-and-bound MRE algorithm based on a novel upper bound on GBF. The bound is calculated by decomposing the computation of the score to a set of Markov blankets of subsets of evidence variables. Our empirical evaluations show that the proposed algorithm scales up exact MRE inference significantly.
【Keywords】: Bayesian Networks; Most Relevant Explanation; Generalized Bayes Factor; Exact inference
【Paper Link】 【Pages】:3657-3663
【Authors】: José Bento ; Nate Derbinsky ; Charles Mathy ; Jonathan S. Yedidia
【Abstract】: We address the problem of planning collision-free paths for multiple agents using optimization methods known as proximal algorithms. Recently this approach was explored in Bento et al. (2013), which demonstrated its ease of parallelization and decentralization, the speed with which the algorithms generate good quality solutions, and its ability to incorporate different proximal operators, each ensuring that paths satisfy a desired property. Unfortunately, the operators derived only apply to paths in 2D and require that any intermediate waypoints we might want agents to follow be preassigned to specific agents, limiting their range of applicability. In this paper we resolve these limitations. We introduce new operators to deal with agents moving in arbitrary dimensions that are faster to compute than their 2D predecessors and we introduce landmarks, space-time positions that are automatically assigned to the set of agents under different optimality criteria. Finally, we report the performance of the new operators in several numerical experiments.
【Keywords】: Alternating direction method of multipliers; Proximal operators; Non-convex optimization; Distributed optimization; Message-passing algorithm; Semi-infinite programming; Multi-agent trajectory planning; Collision avoidance; Waypoint assignment
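To give the flavor of a per-pair proximal operator for collision avoidance, here is a sketch that projects two agent positions onto the minimum-separation constraint in any dimension; the paper's actual operators act on whole space-time trajectories and handle landmark assignment, which this toy version omits.

```python
import math

def separate_pair(p1, p2, radius):
    """Project two agent positions onto ||p1 - p2|| >= radius by moving
    each agent half of the remaining deficit along their joining line.

    Works in any dimension (requires Python 3.8+ for math.dist).
    """
    d = math.dist(p1, p2)
    if d >= radius:
        return list(p1), list(p2)             # constraint already satisfied
    if d == 0.0:                              # coincident: pick any direction
        direction = [1.0] + [0.0] * (len(p1) - 1)
    else:
        direction = [(a - b) / d for a, b in zip(p1, p2)]
    shift = (radius - d) / 2.0
    return ([a + shift * u for a, u in zip(p1, direction)],
            [b - shift * u for b, u in zip(p2, direction)])
```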
【Paper Link】 【Pages】:3664-3671
【Authors】: Morteza Lahijanian ; Shaull Almagor ; Dror Fried ; Lydia E. Kavraki ; Moshe Y. Vardi
【Abstract】: The specification of complex motion goals through temporal logics is increasingly favored in robotics to narrow the gap between task and motion planning. A major limiting factor of such logics, however, is their Boolean satisfaction condition. To relax this limitation, we introduce a method for quantifying the satisfaction of co-safe linear temporal logic specifications, and propose a planner that uses this method to synthesize robot trajectories with the optimal satisfaction value. The method assigns costs to violations of specifications from user-defined proposition costs. These violation costs define a distance to satisfaction and can be computed algorithmically using a weighted automaton. The planner utilizes this automaton and an abstraction of the robotic system to construct a product graph that captures all possible robot trajectories and their distances to satisfaction. Then, a plan with the minimum distance to satisfaction is generated by employing this graph as the high-level planner in a synergistic planning framework. The efficacy of the method is illustrated on a robot with unsatisfiable specifications in an office environment.
【Keywords】: LTL; planning; partial satisfaction; preference; robot; uncertainty; temporal logics
【Paper Link】 【Pages】:3672-3678
【Authors】: Mathew Monfort ; Anqi Liu ; Brian D. Ziebart
【Abstract】: To facilitate interaction with people, robots must not only recognize current actions, but also infer a person's intentions and future behavior. Recent advances in depth camera technology have significantly improved human motion tracking. However, the inherent high dimensionality of interacting with the physical world makes efficiently forecasting human intention and future behavior a challenging task. Predictive methods that estimate uncertainty are therefore critical for supporting appropriate robotic responses to the many ambiguities posed within the human-robot interaction setting. We address these two challenges, high dimensionality and uncertainty, by employing predictive inverse optimal control methods to estimate a probabilistic model of human motion trajectories. Our inverse optimal control formulation estimates quadratic cost functions that best rationalize observed trajectories framed as solutions to linear-quadratic regularization problems. The formulation calibrates its uncertainty from observed motion trajectories, and is efficient in high-dimensional state spaces with linear dynamics. We demonstrate its effectiveness on a task of anticipating the future trajectories, target locations and activity intentions of hand motions.
【Keywords】: Activity prediction; Target Prediction; Inverse Reinforcement Learning; Linear Quadratic Regulation; Human-Robot Interaction; Depth Camera Reasoning
【Paper Link】 【Pages】:3679-3685
【Authors】: David Ray Thompson ; David Wettergreen ; Greydon T. Foil ; P. Michael Furlong ; Anatha Ravi Kiran
【Abstract】: Adaptive exploration uses active learning principles to improve the efficiency of autonomous robotic surveys. This work considers an important and understudied aspect of autonomous exploration: in situ validation of remote sensing measurements. We focus on high- dimensional sensor data with a specific case study of spectroscopic mapping. A field robot refines an orbital image by measuring the surface at many wavelengths. We introduce a new objective function based on spectral unmixing that seeks pure spectral signatures to accurately model diluted remote signals. This objective reflects physical properties of the multi-wavelength data. The rover visits locations that jointly improve its model of the environment while satisfying time and energy constraints. We simulate exploration using alternative planning approaches, and show proof of concept results with the canonical spectroscopic map of a mining district in Cuprite, Nevada.
【Keywords】: Robotics; Adaptive Exploration; Autonomous Science; Remote Sensing; Planetary Exploration; Infrared Reflectance Spectroscopy
【Paper Link】 【Pages】:3686-3693
【Authors】: Yezhou Yang ; Yi Li ; Cornelia Fermüller ; Yiannis Aloimonos
【Abstract】: In order to advance action generation and creation in robots beyond simple learned schemas we need computational tools that allow us to automatically interpret and represent human actions. This paper presents a system that learns manipulation action plans by processing unconstrained videos from the World Wide Web. Its goal is to robustly generate the sequence of atomic actions of seen longer actions in video in order to acquire knowledge for robots. The lower level of the system consists of two convolutional neural network (CNN) based recognition modules, one for classifying the hand grasp type and the other for object recognition. The higher level is a probabilistic manipulation action grammar based parsing module that aims at generating visual sentences for robot manipulation. Experiments conducted on a publicly available unconstrained video dataset show that the system is able to learn manipulation actions by "watching" unconstrained videos with high accuracy.
【Keywords】: Manipulation Actions; Action Grammar; Grasp; Convolutional Neural Networks
【Paper Link】 【Pages】:3694-3701
【Authors】: Valeriy Balabanov ; Jie-Hong Roland Jiang ; Mikolas Janota ; Magdalena Widl
【Abstract】: Many computer science problems can be naturally and compactly expressed using quantified Boolean formulas (QBFs). Evaluating the truth or falsity of a QBF is an important task, and constructing the corresponding model or countermodel can be as important and sometimes even more useful in practice. Modern search and learning based QBF solvers rely fundamentally on resolution and can be instrumented to produce resolution proofs, from which in turn Skolem-function models and Herbrand-function countermodels can be extracted. These (counter)models are the key enabler of various applications. Only recently was the superiority of long-distance resolution (LQ-resolution) over short-distance resolution (Q-resolution) demonstrated. While a polynomial algorithm exists for (counter)model extraction from Q-resolution proofs, it remains open whether one exists for LQ-resolution proofs. This paper settles this open problem affirmatively by constructing a linear-time extraction procedure. Experimental results show the distinct benefits of the proposed method in extracting high quality certificates from some LQ-resolution proofs that are not obtainable from Q-resolution proofs.
【Keywords】: Quantified Boolean Formula; Long-Distance Resolution; Model; Countermodel; Skolem Function; Herbrand Function
【Paper Link】 【Pages】:3702-3709
【Authors】: Sam Bayless ; Noah Bayless ; Holger H. Hoos ; Alan J. Hu
【Abstract】: Boolean satisfiability (SAT) solvers have been successfully applied to a wide variety of difficult combinatorial problems. Many further problems can be solved by SAT Modulo Theory (SMT) solvers, which extend SAT solvers to handle additional types of constraints. However, building efficient SMT solvers is often very difficult. In this paper, we define the concept of a Boolean monotonic theory and show how to easily build efficient SMT solvers, including effective theory propagation and clause learning, for such theories. We present examples showing useful constraints that are monotonic, including many graph properties (e.g., shortest paths), and geometric properties (e.g., convex hulls). These constraints arise in problems that are otherwise difficult for SAT solvers to handle, such as procedural content generation. We have implemented several monotonic theory solvers using the techniques we present in this paper and applied these to content generation problems, demonstrating major speed-ups over SAT, SMT, and Answer Set Programming solvers, easily solving instances that were previously out of reach.
【Keywords】: SAT; SAT Modulo Theories; Answer Set Programming; Procedural Content Generation
【Paper Link】 【Pages】:3710-3716
【Authors】: Philippe Besnard ; Éric Grégoire ; Jean-Marie Lagniez
【Abstract】: An original method for the extraction of one maximal subset of a set of Boolean clauses that must be satisfiable with possibly mutually contradictory assumptive contexts is motivated and evaluated experimentally. Notably, it performs a direct computation and avoids the enumeration of all subsets that are satisfiable with at least one of the contexts. The method applies to subsets that are maximal with respect to inclusion or cardinality.
【Keywords】: Partial-max-SAT; SAT; MSS; Maximal Satisfiable Subset
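A brute-force Python sketch of the object being computed, one inclusion-maximal satisfiable subset grown under a single assumptive context; the paper's method instead computes such subsets directly with Partial-Max-SAT technology and handles several mutually contradictory contexts at once.

```python
from itertools import product

def satisfiable(clauses):
    """Brute-force SAT test; clauses are iterables of signed integers."""
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    return any(
        all(any(model[abs(lit)] == (lit > 0) for lit in clause)
            for clause in clauses)
        for model in (dict(zip(variables, bits))
                      for bits in product([False, True],
                                          repeat=len(variables))))

def grow_mss(clauses, context):
    """Grow one inclusion-maximal subset of `clauses` that is satisfiable
    together with the assumptive `context`.

    Each check is exponential here; a real implementation delegates to a
    SAT or Partial-Max-SAT engine.
    """
    kept, current = [], list(context)
    for clause in clauses:
        if satisfiable(current + [clause]):
            current.append(clause)
            kept.append(clause)
    return kept
```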
【Paper Link】 【Pages】:3717-3723
【Authors】: Christian Bessiere ; Anastasia Paparrizou ; Kostas Stergiou
【Abstract】: We propose two local consistencies that extend bounds consistency (BC) by simultaneously considering combinations of constraints as opposed to single constraints. We prove that these two local consistencies are both stronger than BC, but are NP-hard to enforce even when constraints are linear. Hence, we propose two polynomial-time techniques to enforce approximations of these two consistencies on linear constraints. One is a reformulation of the constraints on which we enforce BC whereas the other is a polynomial time algorithm. Both achieve stronger pruning than BC. Our experiments show large differences in favor of our approaches.
【Keywords】:
【Paper Link】 【Pages】:3724-3730
【Authors】: Marco Bozzano ; Alessandro Cimatti ; Marco Gario ; Andrea Micheli
【Abstract】: Timed Failure Propagation Graphs (TFPGs) are a formalism used in industry to describe failure propagation in a dynamic partially observable system. TFPGs are commonly used to perform model-based diagnosis. As in any model-based diagnosis approach, however, the quality of the diagnosis strongly depends on the quality of the model. Approaches to certify the quality of the TFPG are limited and mainly rely on testing. In this work we address this problem by leveraging efficient Satisfiability Modulo Theories (SMT) engines to perform exhaustive reasoning on TFPGs. We apply model-checking techniques to certify that a given TFPG satisfies (or not) a property of interest. Moreover, we discuss the problem of refinement and diagnosability testing and empirically show that our technique can be used to efficiently solve them.
【Keywords】: Satisfiability Modulo Theories; Timed Failure Propagation Graphs; Model Validation; Diagnosis
【Paper Link】 【Pages】:3731-3737
【Authors】: David A. Cohen ; Martin C. Cooper ; Peter G. Jeavons ; Stanislav Zivny
【Abstract】: Constraint programming is a natural paradigm for many combinatorial optimisation problems. The complexity of constraint satisfaction for various forms of constraints has been widely-studied, both to inform the choice of appropriate algorithms, and to understand better the boundary between polynomial-time complexity and NP-hardness. In constraint programming it is well-known that any constraint satisfaction problem can be converted to an equivalent binary problem using the so-called dual encoding. Using this standard approach any fixed collection of constraints, of arbitrary arity, can be converted to an equivalent set of constraints of arity at most two. Here we show that this transformation, although it changes the domain of the constraints, preserves all the relevant algebraic properties that determine the complexity. Moreover, we show that the dual encoding preserves many of the key algorithmic properties of the original instance. We also show that this remains true for more general valued constraint languages, where constraints may assign different cost values to different assignments. Hence, we obtain a simple proof of the fact that to classify the computational complexity of all valued constraint languages it suffices to classify only binary valued constraint languages.
【Keywords】:
【Paper Link】 【Pages】:3738-3745
【Authors】: Niklas Eén ; Alexander Legg ; Nina Narodytska ; Leonid Ryzhyk
【Abstract】: Reachability games are a useful formalism for the synthesis of reactive systems. Solving a reachability game involves (1) determining the winning player and (2) computing a winning strategy that determines the winning player's action in each state of the game. Recently, a new family of game solvers has been proposed, which rely on counterexample-guided search to compute winning sequences of actions represented as an abstract game tree. While these solvers have demonstrated promising performance in solving the winning determination problem, they currently do not support strategy extraction. We present the first strategy extraction algorithm for abstract game tree-based game solvers. Our algorithm performs SAT encoding of the game abstraction produced by the winner determination algorithm and uses interpolation to compute the strategy. Our experimental results show that our approach performs well on a number of software synthesis benchmarks.
【Keywords】: SAT; interpolants; reachability games
【Paper Link】 【Pages】:3746-3754
【Authors】: Philippe Jégou ; Cyril Terrioux
【Abstract】: Tractable classes constitute an important issue in Artificial Intelligence for defining new islands of tractability for reasoning or problem solving. In the area of constraint networks, numerous tractable classes have been defined, and recently the Broken Triangle Property (BTP) has been shown to be one of the most important of them, as this class includes several previously defined classes. In this paper, we propose a new class called ETP, for Extendable-Triple Property, which generalizes BTP by including it. Combined with the verification of Strong Path Consistency, ETP is shown to be a new tractable class. Moreover, this class inherits some desirable properties of BTP, including the fact that instances of this class can be solved by the usual algorithms (such as MAC or RFL) used in most solvers. We give the theoretical material about this new class and present an experimental study which shows that, from a practical viewpoint, it seems more usable in practice than BTP.
【Keywords】: Constraint Satisfaction; Tractable Classes; Complexity
【Paper Link】 【Pages】:3755-3761
【Authors】: Jayanta Kumar Dutta ; Bonny Banerjee
【Abstract】: We present an unsupervised approach for abnormal event detection in videos. We propose that, given a dictionary of features learned from local spatiotemporal cuboids using the sparse coding objective, the abnormality of an event depends jointly on two factors: the frequency of each feature in reconstructing all events (i.e., the rarity of a feature) and the strength with which it is used in reconstructing the current event (i.e., the absolute coefficient). The Incremental Coding Length (ICL) of a feature is a measure of its entropy gain. Given a dictionary, the ICL computation involves no parameters, is computationally efficient, and has been used for saliency detection in images with impressive results. In this paper, the rarity of a dictionary feature is learned online as its average energy, a function of its ICL. The proposed approach is applicable to real-world streaming videos. Experiments on three benchmark datasets and evaluations in comparison with a number of mainstream algorithms show that the approach is comparable to the state-of-the-art.
【Keywords】: abnormal event detection; online sparse coding; incremental coding length; rarity
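【Example】: A minimal sketch of the rarity-weighted abnormality score described above, assuming a random dictionary and random cuboid features; the average-energy rarity below is a simple stand-in for the paper's ICL-based measure, and scikit-learn's generic OMP sparse coder replaces the authors' learned dictionary and coding step.
```python
import numpy as np
from sklearn.decomposition import SparseCoder

rng = np.random.default_rng(0)
# Hypothetical stand-ins: a normalized dictionary (atoms x dims) and a
# stream of spatiotemporal cuboid features (events x dims).
D = rng.standard_normal((64, 100))
D /= np.linalg.norm(D, axis=1, keepdims=True)
events = rng.standard_normal((500, 100))

coder = SparseCoder(dictionary=D, transform_algorithm="omp",
                    transform_n_nonzero_coefs=5)
codes = coder.transform(events)                # sparse coefficients per event

# Rarity of each atom: atoms used rarely in reconstructing all events get a
# large weight (a stand-in for the ICL-based average energy).
usage = np.abs(codes).mean(axis=0) + 1e-12
rarity = -np.log(usage / usage.sum())

# Abnormality: rarity weighted by how strongly each atom is used right now.
scores = np.abs(codes) @ rarity
abnormal = scores > np.percentile(scores, 95)  # flag the top 5% as abnormal
```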
【Paper Link】 【Pages】:3762-3768
【Authors】: Tarek El-Gaaly ; Vicky Froyen ; Ahmed M. Elgammal ; Jacob Feldman ; Manish Singh
【Abstract】: We present a probabilistic approach to shape decomposition that creates a skeleton-based shape representation of a 3D object while simultaneously decomposing it into constituent parts. Our approach probabilistically combines two prominent threads from the shape literature: skeleton-based (medial axis) representations of shape, and part-based representations of shape, in which shapes are combinations of primitive parts. Our approach recasts skeleton-based shape representation as a mixture estimation problem, allowing us to apply probabilistic estimation techniques to the problem of 3D shape decomposition, extending earlier work on the 2D case. The estimated 3D shape decompositions approximate human shape decomposition judgments. We present a tractable implementation of the framework, which begins by over-segmenting objects at concavities, and then probabilistically merges them to create a distribution over possible decompositions. This results in a hierarchy of decompositions at different structural scales, again closely matching known properties of human shape representation. The probabilistic estimation procedures that arise naturally in the model allow effective prediction of missing parts. We present results on shapes from a standard database illustrating the effectiveness of the approach.
【Keywords】: part decomposition; 3D; recognition; parts; hierarchical; shape representation; probabilistic; Bayesian; medial axis; skeletons; segmentation; perceptual grouping; human visual perception; mixtures
【Paper Link】 【Pages】:3769-3775
【Authors】: Chuang Gan ; Ming Lin ; Yi Yang ; Yueting Zhuang ; Alexander G. Hauptmann
【Abstract】: Automatically recognizing a large number of action categories from videos is of significant importance for video understanding. Most existing work has focused on designing more discriminative feature representations, and has achieved promising results when positive samples are plentiful. However, very limited effort has been spent on recognizing a novel action without any positive exemplars, which is often the case in real settings due to the large number of action classes and the dramatic variation in users' queries. To address this issue, we propose to perform action recognition when no positive exemplars of a class are provided, a setting often known as zero-shot learning. Different from other zero-shot learning approaches, which exploit attributes as the intermediate layer for knowledge transfer, our main contribution is SIR, which directly leverages the semantic inter-class relationships between the known and unknown actions, followed by label transfer learning. The inter-class semantic relationships are automatically measured by continuous word vectors, which are learned by the skip-gram model on a large-scale text corpus. Extensive experiments on the UCF101 dataset validate the superiority of our method over fully-supervised approaches using few positive exemplars.
【Keywords】:
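【Example】: A toy illustration of zero-shot label transfer through word-vector similarity, in the spirit of the approach above (not the authors' SIR code); the random vectors and classifier scores are hypothetical stand-ins for skip-gram embeddings of class names and for known-class detector outputs.
```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical word vectors for known action classes and one novel class.
known = {c: rng.standard_normal(50) for c in ["basketball", "diving", "biking"]}
novel_vec = rng.standard_normal(50)            # e.g., the unseen class "surfing"

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Semantic inter-class relationships: similarity of the novel class to each
# known class, clipped and normalized into transfer weights.
sim = np.array([cos(novel_vec, v) for v in known.values()])
sim = np.maximum(sim, 0.0)
sim /= sim.sum() + 1e-12

# Label transfer: a test video's novel-class score is a similarity-weighted
# combination of its known-class classifier scores.
known_scores = np.array([0.9, 0.2, 0.4])       # hypothetical detector outputs
novel_score = sim @ known_scores
print(novel_score)
```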
【Paper Link】 【Pages】:3776-3782
【Authors】: Jun Guo ; Changhu Wang ; Hongyang Chao
【Abstract】: With the popularity of touch-screen devices, understanding a user's hand-drawn sketch has become an increasingly important research topic in artificial intelligence and computer vision. However, unlike natural images, hand-drawn sketches are often highly abstract, with sparse visual information and large intra-class variance, making the problem more challenging. In this work, we study how to build effective representations for sketch recognition. First, to capture saliency patterns of different scales and spatial arrangements, a Gabor-based low-level representation is proposed. Then, based on this representation, to discover more complex patterns in a sketch, a Hybrid Multilayer Sparse Coding (HMSC) model is proposed to learn mid-level representations. An improved dictionary learning algorithm is also leveraged in HMSC to reduce overfitting to common but trivial patterns. Extensive experiments show that the proposed representations are highly discriminative and lead to large improvements over the state of the art.
【Keywords】: sketch recognition; representation; hybrid multilayer sparse coding
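【Example】: A generic Gabor filter-bank descriptor with coarse spatial pooling, sketching the kind of low-level representation the abstract describes; the orientations, frequencies, and 4x4 pooling grid are illustrative assumptions, not the paper's configuration.
```python
import numpy as np
from skimage.filters import gabor

rng = np.random.default_rng(5)
sketch = rng.random((64, 64))          # stand-in for a rasterized sketch image

feats = []
for theta in np.linspace(0, np.pi, 6, endpoint=False):  # 6 orientations
    for freq in (0.1, 0.2, 0.4):                        # 3 scales
        real, imag = gabor(sketch, frequency=freq, theta=theta)
        mag = np.hypot(real, imag)                      # response magnitude
        # 4x4 average pooling keeps a coarse spatial arrangement of strokes.
        pooled = mag.reshape(4, 16, 4, 16).mean(axis=(1, 3))
        feats.append(pooled.ravel())
descriptor = np.concatenate(feats)     # 6 * 3 * 16 = 288-dimensional feature
```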
【Paper Link】 【Pages】:3783-3789
【Authors】: Yuchen Guo ; Guiguang Ding ; Xiaoming Jin ; Jianmin Wang
【Abstract】: Utilizing attributes for visual recognition has attracted increasing interest because attributes can effectively bridge the semantic gap between low-level visual features and high-level semantic labels. In this paper, we propose a novel method for learning predictable and discriminative attributes. Specifically, we require that the learned attributes be reliably predictable from visual features and that they capture the inherent discriminative structure of the data. In addition, we propose to exploit the intra-category locality of data to overcome the intra-category variance in visual data. We conduct extensive experiments on the Animals with Attributes (AwA) and Caltech256 datasets, and the results demonstrate that the proposed method achieves state-of-the-art performance.
【Keywords】:
【Paper Link】 【Pages】:3790-3796
【Authors】: Bo Jiang ; Jin Tang ; Chris H. Q. Ding ; Bin Luo
【Abstract】: The feature matching problem that incorporates pairwise constraints is usually formulated as a quadratic assignment problem (QAP). Since it is NP-hard, relaxation models are required. In this paper, we first formulate the QAP from the match selection point of view, and then propose a local sparse model for the matching problem. Our local sparse matching (LSM) method has the following advantages: (1) it is parameter-free; (2) it generates a local sparse solution which is closer to a discrete matrix than those of most other continuous relaxation methods for the matching problem; (3) the one-to-one matching constraints are better maintained in the LSM solution. Promising experimental results show the effectiveness of the proposed LSM method.
【Keywords】: feature matching; sparse model; match selection
【Paper Link】 【Pages】:3797-3803
【Authors】: Johannes Lederer ; Sergio Guadarrama
【Abstract】: Sparse Filtering is a popular feature learning algorithm for image classification pipelines. In this paper, we connect the performance of Sparse Filtering with spectral properties of the corresponding feature matrices. This connection provides new insights into Sparse Filtering; in particular, it suggests early stopping of Sparse Filtering. We therefore introduce the Optimal Roundness Criterion (ORC), a novel stopping criterion for Sparse Filtering. We show that this stopping criterion is related to pre-processing procedures such as Statistical Whitening and demonstrate that it can make image classification with Sparse Filtering considerably faster and more accurate.
【Keywords】: sparse filtering; unsupervised feature learning; image classification
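【Example】: For reference, a sketch of the Sparse Filtering objective the paper analyzes (following Ngiam et al., 2011); the ORC stopping rule itself, which monitors spectral properties of the feature matrix, is not reproduced here.
```python
import numpy as np

def sparse_filtering_objective(W, X, eps=1e-8):
    # Soft-absolute features, l2-normalized per feature (row) and then per
    # example (column); minimizing the sum encourages sparse features.
    F = np.sqrt((W @ X) ** 2 + eps)                   # soft absolute value
    F = F / np.linalg.norm(F, axis=1, keepdims=True)  # normalize each feature
    F = F / np.linalg.norm(F, axis=0, keepdims=True)  # normalize each example
    return F.sum()

rng = np.random.default_rng(3)
X = rng.standard_normal((20, 200))    # 20-dim inputs, 200 examples
W = rng.standard_normal((30, 20))     # 30 features to learn
print(sparse_filtering_objective(W, X))
```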
【Paper Link】 【Pages】:3804-3810
【Authors】: Jun Li ; Heyou Chang ; Jian Yang
【Abstract】: Sparse coding can learn representations that are robust to noise and can capture higher-order structure for image classification. However, the inference algorithm is computationally expensive even when supervised signals are used to learn compact and discriminative dictionaries. Fortunately, a simplified neural network module (SNNM) has been proposed to directly learn discriminative dictionaries while avoiding the expensive inference. But the SNNM module ignores sparse representations. Therefore, we propose a sparse SNNM module obtained by adding a mixed-norm regularization (the l1/l2 norm). The sparse SNNM modules are further stacked to build a sparse deep stacking network (S-DSN). In the experiments, we evaluate S-DSN on four databases: Extended YaleB, AR, 15-Scene and Caltech101. Experimental results show that our model outperforms related classification methods using only a linear classifier. It is worth noting that we reach 98.8% recognition accuracy on 15-Scene.
【Keywords】: Deep learning; stacking networks; sparse representations; image classification
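【Example】: A minimal sketch of the mixed-norm (l1/l2) regularizer mentioned above, applied to a hypothetical weight block; the grouping, data, and schematic single-module objective are illustrative assumptions, not the paper's full S-DSN training procedure.
```python
import numpy as np

def mixed_l1_l2(W, groups):
    # l1 sum of per-group l2 norms: drives whole groups of hidden units
    # toward zero together, yielding structured sparsity.
    return sum(np.linalg.norm(W[list(g)]) for g in groups)

rng = np.random.default_rng(2)
W = rng.standard_normal((12, 8))               # hypothetical module weights
x = rng.standard_normal(8)                     # one input
y = rng.standard_normal(12)                    # its target
groups = [range(i, i + 4) for i in (0, 4, 8)]  # three groups of four rows
lam = 0.1

loss = 0.5 * np.sum((W @ x - y) ** 2) + lam * mixed_l1_l2(W, groups)
print(loss)
```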
【Paper Link】 【Pages】:3811-3819
【Authors】: Chaochao Lu ; Xiaoou Tang
【Abstract】: Face verification remains a challenging problem in very complex conditions with large variations such as pose, illumination, expression, and occlusions. This problem is exacerbated when we rely unrealistically on a single training data source, which is often insufficient to cover the intrinsically complex face variations. This paper proposes a principled multi-task learning approach based on the Discriminative Gaussian Process Latent Variable Model (DGPLVM), named GaussianFace, for face verification. In contrast to relying unrealistically on a single training data source, our model exploits additional data from multiple source-domains to improve the generalization performance of face verification in an unknown target-domain. Importantly, our model can adapt automatically to complex data distributions, and therefore can well capture complex face variations inherent in multiple sources. To enhance discriminative power, we introduce a more efficient equivalent form of Kernel Fisher Discriminant Analysis into DGPLVM. To speed up inference and prediction, we exploit the low rank approximation method. Extensive experiments demonstrate the effectiveness of the proposed model in learning from diverse data sources and generalizing to unseen domains. Specifically, our algorithm achieves an impressive accuracy of 98.52% on the well-known and challenging Labeled Faces in the Wild (LFW) benchmark. For the first time, human-level performance in face verification (97.53%) on LFW is surpassed.
【Keywords】:
【Paper Link】 【Pages】:3820-3826
【Authors】: Wenhan Luo ; Björn Stenger ; Xiaowei Zhao ; Tae-Kyun Kim
【Abstract】: This paper proposes a new approach to multi-object tracking by semantic topic discovery. We dynamically cluster frame-by-frame detections and treat objects as topics, allowing the application of the Dirichlet Process Mixture Model (DPMM). The tracking problem is cast as a topic-discovery task where the video sequence is treated analogously to a document. This formulation addresses tracking issues such as object exclusivity constraints as well as cannot-link constraints which are integrated without the need for heuristic thresholds. The video is temporally segmented into epochs to model the dynamics of word (superpixel) co-occurrences and to model the temporal damping effect. In experiments on public data sets we demonstrate the effectiveness of the proposed algorithm.
【Keywords】: topic model; multi-object tracking
【Paper Link】 【Pages】:3827-3833
【Authors】: Xi Peng ; Zhang Yi ; Huajin Tang
【Abstract】: Given a data set from a union of multiple linear subspaces, a robust subspace clustering algorithm fits each group of data points with a low-dimensional subspace and then clusters these data even though they are grossly corrupted or sampled from the union of dependent subspaces. Under the framework of spectral clustering, recent works using sparse representation, low rank representation and their extensions achieve robust clustering results by formulating the errors (e.g., corruptions) into their objective functions so that the errors can be removed from the inputs. However, these approaches suffer from the limitation that the structure of the errors must be known as prior knowledge. In this paper, we present a new method for robust subspace clustering that eliminates the effect of the errors from the projection space (representation) rather than from the input space. We first prove that ℓ1-, ℓ2-, and ℓ∞-norm-based linear projection spaces share the property of intra-subspace projection dominance, i.e., the coefficients over intra-subspace data points are larger than those over inter-subspace data points. Based on this property, we propose a robust and efficient subspace clustering algorithm, called Thresholding Ridge Regression (TRR). TRR calculates the ℓ2-norm-based coefficients of a given data set and applies a hard thresholding operator; the coefficients are then used to build a similarity graph for clustering. Experimental studies show that TRR outperforms the state-of-the-art methods with respect to clustering quality, robustness, and efficiency.
【Keywords】: subspace clustering;sparse coding;low rank representation
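【Example】: A minimal sketch of the TRR pipeline as the abstract describes it (ridge self-expression, hard thresholding of small coefficients, spectral clustering on the resulting graph); the toy two-subspace data and parameter values are illustrative assumptions.
```python
import numpy as np
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(4)
# Toy data: points from two 2-D subspaces embedded in 10-D, as columns.
B1, B2 = rng.standard_normal((10, 2)), rng.standard_normal((10, 2))
X = np.hstack([B1 @ rng.standard_normal((2, 40)),
               B2 @ rng.standard_normal((2, 40))])

lam, keep = 0.1, 8
G = X.T @ X
C = np.linalg.solve(G + lam * np.eye(G.shape[0]), G)  # ridge self-expression
np.fill_diagonal(C, 0)                                # no self-representation

# Hard thresholding: keep only the largest coefficients per point, relying
# on intra-subspace projection dominance.
A = np.abs(C)
kth = -np.sort(-A, axis=0)[keep - 1]                  # per-column k-th largest
A[A < kth] = 0.0

W = A + A.T                                           # symmetric similarity graph
labels = SpectralClustering(n_clusters=2, affinity="precomputed",
                            random_state=0).fit_predict(W)
```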
【Paper Link】 【Pages】:3834-3840
【Authors】: Junchi Yan ; Jun Wang ; Hongyuan Zha ; Xiaokang Yang ; Stephen M. Chu
【Abstract】: Multi-view point registration is a relatively less studied problem compared with two-view point registration. Directly applying pairwise registration often leads to matching discrepancy, as the mapping between two point sets can be determined either by direct correspondences or by any intermediate point set. Moreover, local two-view registration tends to be sensitive to noise. We propose a novel multi-view registration method, where the optimal registration is achieved via an efficient and effective alternating concave minimization process. We further extend our solution to the general practical case of registration among point sets with different cardinalities. Extensive empirical evaluations against peer methods on both synthetic data and real images suggest that our method is robust to large disturbances. In particular, our method outperforms peer point matching methods and performs competitively against graph matching approaches. The latter utilize additional second-order information at the cost of exponentially increased run-time, and are thus usually less efficient.
【Keywords】:
【Paper Link】 【Pages】:3841-3847
【Authors】: Yan Yan ; Yi Yang ; Haoquan Shen ; Deyu Meng ; Gaowen Liu ; Alexander G. Hauptmann ; Nicu Sebe
【Abstract】: Complex event detection is a retrieval task with the goal of finding videos of a particular event in a large-scale unconstrained internet video archive, given example videos and text descriptions. Nowadays, different multimodal fusion schemes of low-level and high-level features are extensively investigated and evaluated for the complex event detection task. However, how to effectively select high-level, semantically meaningful concepts from a large pool to assist complex event detection is rarely studied in the literature. In this paper, we propose two novel strategies to automatically select semantically meaningful concepts for the event detection task, based on both the event-kit text descriptions and the concepts' high-level feature descriptions. Moreover, we introduce a novel event-oriented dictionary representation based on the selected semantic concepts. Towards this goal, we leverage training samples of selected concepts from the Semantic Indexing (SIN) dataset, with a pool of 346 concepts, in a novel supervised multi-task dictionary learning framework. Extensive experimental results on the TRECVID Multimedia Event Detection (MED) dataset demonstrate the efficacy of our proposed method.
【Keywords】:
【Paper Link】 【Pages】:3848-3854
【Authors】: Shuo Yang ; Ping Luo ; Chen Change Loy ; Kenneth W. Shum ; Xiaoou Tang
【Abstract】: We consider the problem of learning deep representations when target labels are available. In this paper, we show that there exists an intrinsic relationship between target coding and feature representation learning in deep networks. Specifically, we found that a distributed binary code with error-correcting capability is more effective at encouraging discriminative features, in comparison to the 1-of-K coding that is typically used in supervised deep learning. This new finding reveals an additional benefit of using error-correcting codes for deep model learning, apart from their well-known error-correcting property. Extensive experiments are conducted on popular visual benchmark datasets.
【Keywords】: Representation Learning; Visual Recognition; Image Classification
【Paper Link】 【Pages】:3855-3863
【Authors】: Haonan Yu ; Jeffrey Mark Siskind
【Abstract】: Most previous work on video description trains individual parts of speech independently. From a linguistic point of view, it is more appealing for word models for all parts of speech to be learned simultaneously from whole sentences, a hypothesis suggested by some linguists for child language acquisition. In this paper, we learn to describe video by discriminatively training positive sentential labels against negative ones in a weakly supervised fashion: the meaning representations (i.e., HMMs) of individual words in these labels are learned from whole sentences without any correspondence annotation of what those words denote in the video. Textual descriptions are then generated for new video using the trained word models.
【Keywords】: language acquisition; Hidden Markov Model; video description
【Paper Link】 【Pages】:3864-3870
【Authors】: Liming Zhao ; Xi Li ; Jun Xiao ; Fei Wu ; Yueting Zhuang
【Abstract】: As an important and challenging problem in computer vision and graphics, keypoint-based object tracking is typically formulated in a spatio-temporal statistical learning framework. However, most existing keypoint trackers are incapable of effectively modeling and balancing the following three aspects in a simultaneous manner: temporal model coherence across frames, spatial model consistency within frames, and discriminative feature construction. To address this issue, we propose a robust keypoint tracker based on spatio-temporal multi-task structured output optimization driven by discriminative metric learning. Consequently, temporal model coherence is characterized by multi-task structured keypoint model learning over several adjacent frames, while spatial model consistency is modeled by solving a geometric verification based structured learning problem. Discriminative feature construction is enabled by metric learning to ensure the intra-class compactness and inter-class separability. Finally, the above three modules are simultaneously optimized in a joint learning scheme. Experimental results have demonstrated the effectiveness of our tracker.
【Keywords】: computer vision; object tracking; metric learning; multi-task learning; structured output learning
【Paper Link】 【Pages】:3871-3877
【Authors】: Erjin Zhou ; Haoqiang Fan ; Zhimin Cao ; Yuning Jiang ; Qi Yin
【Abstract】: Face hallucination methods generate high-resolution face images from low-resolution ones for better visualization. However, conventional hallucination methods are often designed for controlled settings and cannot handle varying conditions of pose, resolution, and blur. In this paper, we present a new face hallucination method which can consistently improve the resolution of face images even under large appearance variations. Our method is based on a novel network architecture called the Bi-channel Convolutional Neural Network (Bi-channel CNN). It extracts robust face representations from the raw input using a deep convolutional network, then adaptively integrates two channels of information (the raw input image and the face representations) to predict the high-resolution image. Experimental results show our system outperforms the prior state-of-the-art methods.
【Keywords】: Face Hallucination; Convolutional Neural Network
【Paper Link】 【Pages】:3878-3886
【Authors】: Chao Zhu ; Yuxin Peng
【Abstract】: Pedestrian detection is a challenging problem in computer vision. In particular, a major bottleneck for current state-of-the-art methods is the significant performance decline with increasing occlusion. A common technique for occlusion handling is to train a set of occlusion-specific detectors and merge their results directly. These detectors are trained independently, and the relationships among them are ignored. In this paper, we consider pedestrian detection at different occlusion levels as different but related problems, and propose a multi-task model to jointly consider their relatedness and differences. The proposed model adopts a multi-task learning algorithm to map pedestrians at different occlusion levels to a common space, where all models corresponding to different occlusion levels are constrained to share a common set of features, and a boosted detector is then constructed to distinguish pedestrians from background. The proposed approach is evaluated on the challenging Caltech pedestrian detection benchmark, and achieves state-of-the-art results on different occlusion-specific test sets.
【Keywords】: Pedestrian detection; Occlusion handling; Multi-task learning
【Paper Link】 【Pages】:3887-3895
【Authors】: John L. Bresina
【Abstract】: This paper describes a challenging, real-world planning problem within the context of a NASA mission called LADEE (Lunar Atmospheric Dust Environment Explorer). We present the approach taken to reduce the complexity of the activity planning task in order to effectively perform it within the time pressures imposed by the mission requirements. One key aspect of this approach is the design of the activity planning process based on principles of problem decomposition and planning abstraction levels. The second key aspect is the mixed-initiative system developed for this task, called LASS (LADEE Activity Scheduling System). The primary challenge for LASS was representing and managing the science constraints that were tied to key points in the spacecraft’s orbit, given their dynamic nature due to the continually updated orbit determination solution.
【Keywords】: planning; NASA mission
【Paper Link】 【Pages】:3896-3903
【Authors】: Amit Dhurandhar ; Rajesh Kumar Ravi ; Bruce Graves ; Gopikrishnan Maniachari ; Markus Ettl
【Abstract】: An accredited biennial 2012 study by the Association of Certified Fraud Examiners claims that on average 5% of a company's revenue is lost to unchecked fraud every year. The reason for such heavy losses is that it takes around 18 months for a fraud to be caught, and audits catch only 3% of actual fraud. This underscores the need for better tools and processes to quickly and cheaply identify potential malefactors. In this paper, we describe a robust tool to identify procurement-related fraud and risk, though the general design and the analytical components could be adapted to detecting fraud in other domains. Besides analyzing standard transactional data, our solution analyzes multiple public and private data sources, leading to wider coverage of fraud types than generally exists in the marketplace. Moreover, our approach is more principled in the sense that the learning component, which is based on investigation feedback, has formal guarantees. Though such a tool is ever evolving, an initial deployment over the past 6 months has found many interesting cases from a compliance-risk and fraud point of view, increasing the number of true positives found by over 80% compared with other state-of-the-art tools that the domain experts were previously using.
【Keywords】: collusion; risk; Big Data
【Paper Link】 【Pages】:3904-3911
【Authors】: Leonard Kinnaird-Heether ; Chris Dorman
【Abstract】: We developed a tool to solve a problem of position assignment within the IT Ford College Graduate program. This position assignment tool was first developed in 2012 and has been used successfully since then. The tool has since evolved for use with several other position assignment and related tasks in other similar programs at Ford Motor Company. This paper describes the creation of the tool and how we have applied it, focusing on the need for such a tool and on how its continued development will benefit its users and the company.
【Keywords】: Artificial Intelligence; Combinatorial Optimization; Position Assignment; Enterprise Level
【Paper Link】 【Pages】:3912-3919
【Authors】: Juan Liu ; Eric Bier ; Aaron Wilson ; Tomo Honda ; Kumar Sricharan ; Leilani Gilpin ; John Alexis Guerra Gómez ; Daniel Davies
【Abstract】: Detection of fraud, waste, and abuse (FWA) is an important yet difficult problem. In this paper, we describe a system to detect suspicious activities in large healthcare claims datasets. Each healthcare dataset is viewed as a heterogeneous network of patients, doctors, pharmacies, and other entities. These networks can be large, with millions of patients, hundreds of thousands of doctors, and tens of thousands of pharmacies, for example. Graph analysis techniques are developed to find suspicious individuals, suspicious relationships between individuals, unusual changes over time, unusual geospatial dispersion, and anomalous networks within the overall graph structure. The system has been deployed on multiple sites and data sets, both government and commercial, to facilitate the work of FWA investigation analysts.
【Keywords】: Machine Learning; Healthcare
【Paper Link】 【Pages】:3920-3927
【Authors】: Sathappan Muthiah ; Bert Huang ; Jaime Arredondo ; David Mares ; Lise Getoor ; Graham Katz ; Naren Ramakrishnan
【Abstract】: Civil unrest (protests, strikes, and occupy events) is a common occurrence in both democracies and authoritarian regimes. The study of civil unrest is a key topic for political scientists as it helps capture an important mechanism by which citizenry express themselves. In countries where civil unrest is lawful, qualitative analysis has revealed that more than 75% of the protests are planned, organized, and/or announced in advance; therefore detecting future time mentions in relevant news and social media is a direct way to develop a protest forecasting system. We develop such a system in this paper, using a combination of key phrase learning to identify what to look for, probabilistic soft logic to reason about location occurrences in extracted results, and time normalization to resolve future tense mentions. We illustrate the application of our system to 10 countries in Latin America, viz. Argentina, Brazil, Chile, Colombia, Ecuador, El Salvador, Mexico, Paraguay, Uruguay, and Venezuela. Results demonstrate our successes in capturing significant societal unrest in these countries with an average lead time of 4.08 days. We also study the selective superiorities of news media versus social media (Twitter, Facebook) to identify relevant tradeoffs.
【Keywords】: Logic;Temporal Reasoning;Geo/Spatial Reasoning;
【Paper Link】 【Pages】:3928-3934
【Authors】: Edward D. Thompson ; Ethan Frolich ; James C. Bellows ; Benjamin E. Bassford ; Edward J. Skiko ; Mark S. Fox
【Abstract】: PDS (Process Diagnosis System) is an expert system shell developed in the early 1980's. It could handle thousands of sensor inputs and produce thousands of diagnostic messages with confidence factors based on complex logic designed to mimic the thinking of human experts. PDS went into commercial operation in 1985 to monitor seven power plant generators from a centralized diagnostic center at Westinghouse Power Generation headquarters. In the 1990’s the popularity of advanced technology gas turbines provided a renaissance in PDS utilization. The software has undergone rewrites and improvements since its inception, and the current PCPDS now supports the Siemens Power Diagnostics® Center with centralized rule based monitoring of over 1200 gas turbines, steam turbines, and generators.
【Keywords】: Power diagnosis expert rule shell
【Paper Link】 【Pages】:3935-3941
【Authors】: Brooke Cowan ; Sven Zethelius ; Brittany Luk ; Teodora Baras ; Prachi Ukarde ; Daodao Zhang
【Abstract】: This paper addresses the problem of named entity recognition (NER) in travel-related search queries. NER is an important step toward a richer understanding of user-generated inputs in information retrieval systems. NER in queries is challenging due to minimal context and few structural clues. NER in restricted-domain queries is useful in vertical search applications, for example following query classification in general search. This paper describes an efficient machine learning-based solution for the high-quality extraction of semantic entities from query inputs in a restricted-domain information retrieval setting. We apply a conditional random field (CRF) sequence model to travel-domain search queries and achieve high-accuracy results. Our approach yields an overall F1 score of 86.4% on a held-out test set, outperforming a baseline score of 82.0% on a CRF with standard features. The resulting NER classifier is currently in use in a real-life travel search engine.
【Keywords】: Named Entity Recognition; Search Query Semantics; Travel Domain
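【Example】: A minimal CRF tagger over query tokens in the spirit of the approach above, using the third-party sklearn-crfsuite package rather than the authors' system; the feature template, BIO tag set, and tiny training queries are illustrative assumptions.
```python
import sklearn_crfsuite

def word_features(tokens, i):
    w = tokens[i]
    return {
        "lower": w.lower(),
        "is_digit": w.isdigit(),
        "prefix3": w[:3].lower(),
        "prev": tokens[i - 1].lower() if i > 0 else "<s>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "</s>",
    }

# Hypothetical labeled queries with BIO tags over destination/date entities.
queries = [(["hotels", "in", "rome", "december"], ["O", "O", "B-DEST", "B-DATE"]),
           (["cheap", "flights", "to", "paris"], ["O", "O", "O", "B-DEST"])]

X = [[word_features(t, i) for i in range(len(t))] for t, _ in queries]
y = [tags for _, tags in queries]

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=50)
crf.fit(X, y)
test = ["flights", "to", "tokyo"]
print(crf.predict([[word_features(test, i) for i in range(len(test))]]))
```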
【Paper Link】 【Pages】:3942-3947
【Authors】: Murthy Devarakonda ; Ching-Huei Tsou
【Abstract】: Identifying a patient’s important medical problems requires broad and deep medical expertise, as well as significant time to gather all the relevant facts from the patient’s medical record and assess the clinical importance of the facts in reaching the final conclusion. A patient’s medical problem list is by far the most critical information that a physician uses in treatment and care of a patient. In spite of its critical role, its curation, manual or automated, has been an unmet need in clinical practice. We developed a machine learning technique in IBM Watson to automatically generate a patient’s medical problem list. The machine learning model uses lexical and medical features extracted from a patient’s record using NLP techniques. We show that the automated method achieves 70% recall and 67% precision based on the gold standard that medical experts created on a set of de-identified patient records from a major hospital system in the US. To the best of our knowledge this is the first successful machine learning/NLP method of extracting an open-ended patient’s medical problems from an Electronic Medical Record (EMR). This paper also contributes a methodology for assessing accuracy of a medical problem list generation technique.
【Keywords】: Medical Problem List; Machine Learning; Natural Language Processing; Electronic Medical Records; IBM Watson
【Paper Link】 【Pages】:3948-3953
【Authors】: Heshan Du ; Hai H. Nguyen ; Natasha Alechina ; Brian Logan ; Michael Jackson ; John Goodwin
【Abstract】: We describe a tool, MatchMaps, that generates sameAs and partOf matches between spatial objects (such as shops, shopping centres, etc.) in crowd-sourced and authoritative geospatial datasets. MatchMaps uses reasoning in qualitative spatial logic, description logic and truth maintenance techniques, to produce a consistent set of matches. We report the results of an initial evaluation of MatchMaps by experts from Ordnance Survey (Great Britain's National Mapping Authority). In both the case studies considered, MatchMaps was able to correctly match spatial objects (high precision and recall) with minimal human intervention.
【Keywords】: Spatial Logic; Geospatial Data; Ontology Matching
【Paper Link】 【Pages】:3954-3960
【Authors】: David John Gagne II ; Amy McGovern ; Jerald Brotzge ; Michael Coniglio ; James Correia Jr. ; Ming Xue
【Abstract】: Hail causes billions of dollars in losses by damaging buildings, vehicles, and crops. Improving the spatial and temporal accuracy of hail forecasts would allow people to mitigate hail damage. We have developed an approach to forecasting hail that identifies potential hail storms in storm-scale numerical weather prediction models and matches them with observed hailstorms. Machine learning models, including random forests, gradient boosting trees, and linear regression, are used to predict the expected hail size from each forecast storm. The individual hail size forecasts are merged with a spatial neighborhood ensemble probability technique to produce a consensus probability of hail at least 25.4 mm in diameter. The system was evaluated during the 2014 National Oceanic and Atmospheric Administration Hazardous Weather Testbed Experimental Forecast Program and compared with a physics-based hail size model. The machine-learning-based technique shows advantages in producing smaller size errors and more reliable probability forecasts. The machine learning approaches correctly predicted the location and extent of a significant hail event in eastern Nebraska and a marginal severe hail event in Colorado.
【Keywords】: meteorology; machine learning; hail; ensemble; storm-scale; gradient boosting; postprocessing;
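【Example】: A schematic of the two-stage idea above (regress the expected hail size, then derive an exceedance probability for 25.4 mm hail), assuming synthetic storm attributes; the per-tree spread used for the probability is a simple stand-in for the paper's spatial neighborhood ensemble technique.
```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(6)
# Hypothetical training set: storm attributes extracted from an NWP model
# (e.g., updraft strength, storm depth) paired with observed hail sizes (mm).
X = rng.standard_normal((300, 6))
y = 20 + 10 * X[:, 0] + rng.normal(0, 3, 300)

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

storms = rng.standard_normal((5, 6))   # forecast storms to score
size = rf.predict(storms)              # expected hail size per storm

# Exceedance probability from the spread of per-tree predictions.
per_tree = np.stack([t.predict(storms) for t in rf.estimators_])
p_severe = (per_tree >= 25.4).mean(axis=0)   # P(hail >= 25.4 mm diameter)
```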
【Paper Link】 【Pages】:3961-3966
【Authors】: Johnathan Gohde ; Mark S. Boddy ; Hazel Shackleton ; Steve Johnston
【Abstract】: In previous work, we described G2I2, a system that adjusts the cost function used by an off-road route planning system in order to more closely mimic the route choices made by humans. In this paper, we report on an extension to G2I2, called GUIDE, which adds significant new capabilities. GUIDE has the ability to induce a cost function starting with a set of historical tracks used as training input, with no requirement that these tracks be even close to cost-optimal. Given a cost function, either induced as above or provided from elsewhere, GUIDE can then compare planned routes with the actual tracks executed to adjust that cost function as either the environment or human preferences change over time. The features used by GUIDE in both the initial induction of the cost function and subsequent tuning include time-varying meta-data such as the temperature and precipitation at the time a given track was executed. We present results showing that, even when presented with tracks that are very far from cost-optimal, GUIDE can learn a set of preferences that closely mimics terrain choices made by humans.
【Keywords】: Route Planning; Data Mining; Human Computer Interaction; Machine Learning; Planning
【Paper Link】 【Pages】:3967-3974
【Authors】: Christophe Guettier ; Willy Lamal ; Israel Mayk ; Jacques Yelloz
【Abstract】: Complex operational environments require improved tactical mission command capabilities with a high level of interoperability among coalition command and control (C2) systems. This paper focuses on two areas of interest: decision support based on automated planning, and Service Oriented Architecture (SOA) for rapid service development. Previous experiments were performed bilaterally by the US, France and Germany to focus on collaborative mission planning using Web Services (WSs). The results reported herein were obtained from a unified experiment performed by the US, France and Germany involving a common scenario. The operational benefit of the experimentation has been to improve mutual understanding among allied forces, to dynamically plan for assistance among ground support troops (logistics, MEDEVAC, and other areas), and to improve their coordination. The effort addressed system design and integration within an experimental framework. It enabled the evolution of the CERDEC Mission Command Gateway (MCG) architecture as well as a constraint-based planner, "ORTAC", developed by the French DGA and Sagem, which takes into account near real-time multimodal situation awareness and readiness status from tactical edge units. The trilateral experiment, entitled From Data to Decision, included net-centric manned and unmanned assets from all three nations (France, Germany, US) operating as a cohesive coalition force while preserving command and support relationships as required through their respective chains of command.
【Keywords】: Mission Planning; Experimentations; Control / Command Applications; Service Oriented Architectures; Network Centric Warfare
【Paper Link】 【Pages】:3975-3980
【Authors】: Greg Hines ; Alexandra Swanson ; Margaret Kosmala ; Chris J. Lintott
【Abstract】: Camera traps (remote, automatic cameras) are revolutionizing large-scale studies in ecology. The Serengeti Lion Project has used camera traps to produce over 1.5 million pictures of animals in the Serengeti. To analyze these pictures, the Project created Snapshot Serengeti, a citizen science website where volunteers can help classify animals. To increase accuracy, each photo is shown to multiple users, and a critical step is aggregating the individual classifications. In this paper, we present a new aggregation algorithm which achieves an accuracy of 98.6%, better than many human experts. Our algorithm also requires fewer users per photo than existing methods. The algorithm is intuitive and designed so that non-experts can understand the end results.
【Keywords】:
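【Example】: A simplified iterative user-weighted voting baseline (a Dawid-Skene-style scheme, not the paper's algorithm) for aggregating volunteer classifications; the votes below are hypothetical.
```python
from collections import Counter, defaultdict

votes = {  # photo_id -> list of (user, species) classifications
    "p1": [("u1", "zebra"), ("u2", "zebra"), ("u3", "wildebeest")],
    "p2": [("u1", "lion"), ("u3", "lion")],
}

weight = defaultdict(lambda: 1.0)        # start with equal user weights
for _ in range(5):                       # alternate consensus and user weights
    tallies = {p: Counter() for p in votes}
    for p, vs in votes.items():
        for u, label in vs:
            tallies[p][label] += weight[u]
    answer = {p: c.most_common(1)[0][0] for p, c in tallies.items()}
    correct, total = defaultdict(float), defaultdict(float)
    for p, vs in votes.items():
        for u, label in vs:
            total[u] += 1.0
            correct[u] += float(label == answer[p])
    weight = defaultdict(lambda: 1.0,
                         {u: correct[u] / total[u] for u in total})
print(answer)   # consensus label per photo
```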
【Paper Link】 【Pages】:3981-3986
【Authors】: Sasin Janpuangtong ; Dylan A. Shell
【Abstract】: This paper describes an end-to-end learning framework that allows a novice to create a model from data easily by helping structure the model building process and capturing extended aspects of domain knowledge. By treating the whole modeling process interactively and exploiting high-level knowledge in the form of an ontology, the framework is able to aid the user in a number of ways, including in helping to avoid pitfalls such as data dredging. Prudence must be exercised to avoid these hazards: certain conclusions may be supported by extra knowledge if, for example, there are reasons to trust a particular narrower set of hypotheses. This paper adopts the solution of using higher-level knowledge in order to allow this sort of domain knowledge to be inferred automatically, thereby selecting only relevant input attributes and thence constraining the hypothesis space. We describe how the framework automatically exploits structured knowledge in an ontology to identify relevant concepts, and how a data extraction component can make use of online data sources to find measurements of those concepts so that their relevance can be evaluated. To validate our approach, models of four different problem domains were built using our implementation of the framework. Prediction error on unseen examples of these models shows that our framework, making use of the ontology, helps to improve model generalization.
【Keywords】: Ontology; Model Building Process; Supervised Learning; Information Extraction; HITS
【Paper Link】 【Pages】:3987-3992
【Authors】: Ugur Kuter ; Mark H. Burstein ; J. Benton ; Daniel Bryce ; Jordan Tyler Thayer ; Steve McCoy
【Abstract】: This paper describes a novel combination of Java program analysis and an automated learning and planning architecture for the domain of Java vulnerability analysis. The key feature of our "HACKAR: Helpful Advice for Code Knowledge and Attack Resilience" system is its ability to analyze Java programs at development-time, identifying vulnerabilities and ways to avoid them. HACKAR uses an improved version of NASA's Java PathFinder (JPF) to execute Java programs and identify vulnerabilities. The system features new Hierarchical Task Network (HTN) learning algorithms that (1) advance state-of-the-art HTN learners with reasoning about numeric constraints, failures, and more general cases of recursion, and (2) contribute to problem-solving by learning a hierarchical dataflow representation of the program from the program's inputs. Empirical evaluation demonstrates that HACKAR was able to suggest fixes for all of our test program suites. It also shows that HACKAR can analyze programs with string inputs that the original JPF implementation cannot.
【Keywords】:
【Paper Link】 【Pages】:3993-3998
【Authors】: Andres Quiroz ; Eric Huang ; Luca Ceriani
【Abstract】: Integrating heterogeneous data sets has been a significant barrier to many analytics tasks, due to the variety in structure and level of cleanliness of raw data sets requiring one-off ETL code. We propose HiperFuse, which significantly automates the data integration process by providing a declarative interface, robust type inference, extensible domain-specific data models, and a data integration planner which optimizes for plan completion time. The proposed tool is designed for schema-less data querying, code reuse within specific domains, and robustness in the face of messy unstructured data. To demonstrate the tool and its reference implementation, we show the requirements and execution steps for a use case in which IP addresses from a web clickstream log are joined with census data to obtain average income for particular site visitors (IPs), and offer preliminary performance results and qualitative comparisons to existing data integration and ETL tools.
【Keywords】: data integration; data fusion; declarative; ETL; DSL; dataflow optimization; planning; scheduling; automation
【Paper Link】 【Pages】:3999-4005
【Authors】: Paul Taele ; Laura Barreto ; Tracy Anne Hammond
【Abstract】: Learning music theory not only has practical benefits for musicians, who can write, perform, understand, and express music better, but also helps non-musicians improve critical thinking, analytical math skills, and music appreciation. However, current external tools applicable for learning music theory through writing, when human instruction is unavailable, are either limited in feedback, lacking a written modality, or assuming an already strong familiarity with music theory concepts. In this paper, we describe Maestoso, an educational tool that lets novice learners study music theory through sketching practice of quizzed music structures. Maestoso first captures students' sketched input of quizzed concepts, relies on existing sketch and gesture recognition techniques to automatically recognize the input, and finally generates instructor-emulated feedback. Our evaluations demonstrate that Maestoso performs reasonably well at recognizing music structure elements and that novice students can comfortably grasp introductory music theory in a single session.
【Keywords】: sketch recognition; intelligent user interfaces; music theory education
【Paper Link】 【Pages】:4006-4011
【Authors】: Amulya Yadav ; Leandro Soriano Marcolino ; Eric Rice ; Robin Petering ; Hailey Winetrobe ; Harmony Rhoades ; Milind Tambe ; Heather Carmichael
【Abstract】: Homeless youth are prone to HIV due to their engagement in high risk behavior. Many agencies conduct interventions to educate/train a select group of homeless youth about HIV prevention practices and rely on word-of-mouth spread of information through their social network. Previous work in strategic selection of intervention participants does not handle uncertainties in the social network's structure and in the evolving network state, potentially causing significant shortcomings in spread of information. Thus, we developed PSINET, a decision support system to aid the agencies in this task. PSINET includes the following key novelties: (i) it handles uncertainties in network structure and evolving network state; (ii) it addresses these uncertainties by using POMDPs in influence maximization; (iii) it provides algorithmic advances to allow high quality approximate solutions for such POMDPs. Simulations show that PSINET achieves around 60% more information spread over the current state-of-the-art. PSINET was developed in collaboration with My Friend's Place (a drop-in agency serving homeless youth in Los Angeles) and is currently being reviewed by their officials.
【Keywords】: Influence Maximization; POMDPs; Social Networks; HIV Prevention
【Paper Link】 【Pages】:4012-4018
【Authors】: Meng Zhao ; Faizan Javed ; Ferosh Jacob ; Matt McNair
【Abstract】: Named Entity Recognition (NER) and Named Entity Normalization (NEN) refer to the recognition and normalization of raw texts to known entities. From the perspective of recruitment innovation, professional skill characterization and normalization render human capital data more meaningful both commercially and socially. Accurate and detailed normalization of skills is the key to the predictive analysis of labor market dynamics. Such analytics help bridge the skills gap between employers and candidate workers by matching the right talent to the right job and identifying in-demand skills for workforce training programs. This can also work towards the social goal of providing more job opportunities to the community. In this paper we propose an automated approach for skill entity recognition and optimal normalization. The proposed system has two components: 1) skills taxonomy generation, which employs vocational-skill-related sections of resumes and Wikipedia categories to define and develop a taxonomy of professional skills; 2) skills tagging, which leverages properties of semantic word vectors to recognize and normalize relevant skills in input text. In a sampling-based end-user evaluation, the current system attains 91% accuracy on the taxonomy generation task and 82% accuracy on the skills tagging task. The beta version of the system is currently applied in various big data and business intelligence applications for workforce analytics and career track projections at CareerBuilder.
【Keywords】: Skill Normalization, Named Entity Recognition, Wikipedia, Word2vec
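【Example】: A toy sketch of skill normalization by word-vector similarity using gensim's Word2Vec; the corpus, the two-entry taxonomy, and the expectation that "j2ee" maps to "java" are illustrative assumptions, not CareerBuilder's data or models.
```python
from gensim.models import Word2Vec

corpus = [["developed", "software", "in", "java", "and", "j2ee"],
          ["built", "apps", "with", "python", "and", "django"],
          ["java", "j2ee", "spring", "backend", "services"],
          ["python", "django", "flask", "web", "services"]]
model = Word2Vec(corpus, vector_size=32, window=3, min_count=1, sg=1,
                 epochs=200, seed=0)

taxonomy = ["java", "python"]            # canonical skill entries

def normalize(raw_skill):
    # Map a raw mention to the nearest canonical skill by cosine similarity.
    sims = [(s, model.wv.similarity(raw_skill, s)) for s in taxonomy]
    return max(sims, key=lambda p: p[1])

print(normalize("j2ee"))                 # likely ("java", <similarity>)
```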
【Paper Link】 【Pages】:4019-4021
【Authors】: Peter Clark
【Abstract】: While there has been an explosion of impressive, data-driven AI applications in recent years, machines still largely lack a deeper understanding of the world to answer questions that go beyond information explicitly stated in text, and to explain and discuss those answers. To reach this next generation of AI applications, it is imperative to make faster progress in areas of knowledge, modeling, reasoning, and language. Standardized tests have often been proposed as a driver for such progress, with good reason: Many of the questions require sophisticated understanding of both language and the world, pushing the boundaries of AI, while other questions are easier, supporting incremental progress. In Project Aristo at the Allen Institute for AI, we are working on a specific version of this challenge, namely having the computer pass Elementary School Science and Math exams. Even at this level there is a rich variety of problems and question types, the most difficult requiring significant progress in AI. Here we propose this task as a challenge problem for the community, and are providing supporting datasets. Solutions to many of these problems would have a major impact on the field so we encourage you: Take the Aristo Challenge!
【Keywords】: problem solving; knowledge representation; reasoning
【Paper Link】 【Pages】:4022-4023
【Authors】: Jeremy Hyrkas ; Daniel Halperin ; Bill Howe
【Abstract】: Flow cytometers measure the optical properties of particles to classify microbes. Recent innovations have allowed oceanographers to collect flow cytometry data continuously during research cruises, leading to an explosion of data and new challenges for the classification task. The massive scale, time-varying underlying populations, and noisy measurements motivate the development of new classification methods. We describe the problem, the data, and some preliminary results demonstrating the difficulty with conventional methods.
【Keywords】: Cluster analysis; classification; bioinformatics
【Paper Link】 【Pages】:4024-4026
【Authors】: Leora Morgenstern ; Charles L. Ortiz Jr.
【Abstract】: This paper describes the Winograd Schema Challenge (WSC), which has been suggested as an alternative to the Turing Test and as a means of measuring progress in commonsense reasoning. A competition based on the WSC has been organized and announced to the AI research community. The WSC is of special interest to the AI applications community and we encourage its members to participate.
【Keywords】: Turing Test; Commonsense Reasoning;
【Paper Link】 【Pages】:4027-4031
【Authors】: Bonnie J. Dorr ; Lucian Galescu ; Ian E. Perera ; Kristy Hollingshead-Seitz ; David Atkinson ; Micah Clark ; William Clancey ; Yorick Wilks ; Eric Fosler-Lussier
【Abstract】: This Blue Sky presentation focuses on a major shift toward a notion of “ambient intelligence” that transcends general applications targeted at the general population. The focus is on highly personalized agents that accommodate individual differences and changes over time. This notion of Extended Ambient Intelligence (EAI) concerns adaptation to a person’s preferences and experiences, as well as changing capabilities, most notably in an environment where conversational engagement is central. An important step in moving this research forward is the accommodation of different degrees of cognitive capability (including speech processing) that may vary over time for a given user—whether through improvement or through deterioration. We suggest that the application of divergence detection to speech patterns may enable adaptation to a speaker’s increasing or decreasing level of speech impairment over time. Taking an adaptive approach toward technology development in this arena may be a first step toward empowering those with special needs so that they may live with a high quality of life. It also represents an important step toward a notion of ambient intelligence that is personalized beyond what can be achieved by mass-produced, one-size-fits-all software currently in use on mobile devices.
【Keywords】: amyotrophic lateral sclerosis; automatic speech recognition; speech recognition; physiological degeneration; ambient intelligence; speech adaptation
【Paper Link】 【Pages】:4032-4036
【Authors】: Sarit Kraus
【Abstract】: The number of people with disabilities is continuously increasing. Providing patients who have disabilities with the rehabilitation and care necessary to allow them a good quality of life creates overwhelming demands on health and rehabilitation services. We suggest that advancements in intelligent agent technology provide new opportunities for improving the services provided. We discuss the challenges of building an agent for the health care domain and present four capabilities that are required of such an agent: planning, monitoring, intervention and encouragement. We discuss the importance of personalizing all of them and the need to facilitate cooperation between the automated agent and the human caregivers. We review recent technology that can be used toward the development of agents with these capabilities and its promise in automating services such as physiotherapy, speech therapy and cognitive training.
【Keywords】: Automated Agents;Health Applications
【Paper Link】 【Pages】:4037-4041
【Authors】: Tie-Yan Liu ; Wei Chen ; Tao Qin
【Abstract】: Machine learning and game theory are two important directions of AI. The former usually assumes data is independent of the models to be learned; the latter usually assumes agents are fully rational. In many modern Internet applications, like sponsored search and crowdsourcing, these two basic assumptions are violated, and new challenges are posed to both machine learning and game theory. To better model and study such applications, we need to go beyond conventional machine learning and game theory (mechanism design) and adopt a new approach called mechanism learning with mechanism-induced data. Specifically, we propose to learn a behavior model from data to describe how real agents play the complicated game, instead of making the full-rationality assumption. We then propose to optimize the mechanism by using the learned behavior models to predict the future behaviors of agents in response to the new mechanism. Because this process couples mechanism learning and behavior learning in a loop, new algorithms and theories are needed to perform the task and guarantee asymptotic performance. As shown in this paper, there are many interesting research topics along this direction, many of which are still open problems waiting for researchers in our community to investigate deeply.
【Keywords】:
【Paper Link】 【Pages】:4042-4046
【Authors】: Michela Milano ; Pascal Van Hentenryck
【Abstract】: Our society is organized around a number of (interdependent) global systems. Logistic and supply chains, health services, energy networks, financial markets, computer networks, and cities are just a few examples of such global, complex systems. These global systems are socio-technical and involve interactions between complex infrastructures, man-made processes, natural phenomena, multiple stakeholders, and human behavior. For the first time in the history of mankind, we have access to data sets of unprecedented scale and accuracy about these infrastructures, processes, natural phenomena, and human behaviors. In addition, progress in high-performance computing, data mining, machine learning, and decision support opens the possibility of looking at these problems more holistically, capturing many of these aspects simultaneously. This paper addresses emerging architectures that enable control, prediction and reasoning for these systems.
【Keywords】: Global Systems, Decision support systems, Simulators, data analytics
【Paper Link】 【Pages】:4047-4051
【Authors】: Dana S. Nau ; Malik Ghallab ; Paolo Traverso
【Abstract】: In a recent position paper in Artificial Intelligence, we argued that the automated planning research literature has underestimated the importance and difficulty of deliberative acting, which is more than just interleaving planning and execution. We called for more research on the AI problems that emerge when attempting to integrate acting with planning. To provide a basis for such research, it will be important to have a formalization of acting that can be useful in practice. This is needed in the same way that a formal account of planning was necessary for research on planning. We describe some first steps toward developing such a formalization, and invite readers to carry out research along this line.
【Keywords】: Deliberative Acting; Automated Planning; Integrated Planning and Acting
【Paper Link】 【Pages】:4052-4056
【Authors】: Jussi Rintanen
【Abstract】: We propose revisions to the research agenda in Automated Planning. The proposal is based on a review of the role of the Planning Domain Definition Language (PDDL) in the activities of the AI planning community and the impact of PDDL on parts of its research agenda. We specifically show how particular properties of PDDL have impacted research on planning, by putting emphasis on certain research topics and complicating others. We argue that the development of more advanced modeling languages would be — analogously to the impact PDDL has had — a low-overhead and smooth route for the ICAPS community to shift its research focus to increasingly promising and relevant research topics.
【Keywords】:
【Paper Link】 【Pages】:4057-4061
【Authors】: Tuomas Sandholm
【Abstract】: Living organisms adapt to challenges through evolution. This has proven to be a key difficulty in developing therapies, since the organisms evolve resistance. I propose the wild idea of steering evolution strategically, using computational game theory for (typically incomplete-information) multistage games and opponent exploitation techniques. A sequential contingency plan for steering evolution is constructed computationally for the setting at hand. In the biological context, the opponent (e.g., a disease) has a systematic handicap because it evolves myopically. This can be exploited by computing trapping strategies that cause the opponent to evolve into states where it can be handled effectively. Potential application classes include therapeutics at the population, individual, and molecular levels (drug design), as well as cell repurposing and synthetic biology.
【Keywords】: game solving; opponent exploitation; treating disease; multistage plan; evolution; adaptation; imperfect-information game; incomplete-information game
【Paper Link】 【Pages】:4062-4066
【Authors】: Howard Elliot Shrobe ; Boris Katz ; Randall Davis
【Abstract】: Programmers are loath to interrupt their workflow to document their design rationale, leading to frequent errors when software is modified — often much later and by different programmers. A Programmer's Assistant could interact with the programmer to capture and preserve design rationale, in a natural way that would make rationale capture "cost less than it's worth", and could also detect common flaws in program design. Such a programmer's assistant was not practical when it was first proposed decades ago, but advances over the years make now the time to revisit the concept, as our prototype shows.
【Keywords】: programmer's assistant, programmer's apprentice, design rationale, design capture, rationale capture
【Paper Link】 【Pages】:4067-4072
【Authors】: Jeffrey Mark Siskind
【Abstract】: Study of the human brain through fMRI can potentially benefit the pursuit of artificial intelligence. Four examples are presented. First, fMRI decoding of the brain activity of subjects watching video clips yields higher accuracy than state-of-the-art computer-vision approaches to activity recognition. Second, novel methods are presented that decode aggregate representations of complex visual stimuli by decoding their independent constituents. Third, cross-modal studies demonstrate the ability to decode the brain activity induced in subjects watching video stimuli when trained on the brain activity induced in subjects seeing text or hearing speech stimuli and vice versa. Fourth, the time course of brain processing while watching video stimuli is probed with scanning that trades off the amount of the brain scanned for the frequency at which it is scanned. Techniques like these can be used to study how the human brain grounds language in visual perception and may motivate development of novel approaches in AI.
【Keywords】:
【Paper Link】 【Pages】:4073-4077
【Authors】: Toby Walsh
【Abstract】: Many models and mechanisms in resource and cost allocation have been developed that are simple and abstract. By means of two case studies, I argue that it is now timely to consider richer models for the fair division of resources and for the allocation of costs. Such models should have features like asynchronicity which reflect more of the true complexity of many fair division and cost allocation problems met in the real world. I suggest that computation can be used in such models to increase both efficiency and fairness of the allocations. As a result, we may be able to do more with fewer resources and greater fairness.
【Keywords】: Shapley value, indivisible goods, fair division, cost sharing
【Paper Link】 【Pages】:4078-4082
【Authors】: Wlodek Zadrozny ; Valeria de Paiva ; Lawrence S. Moss
【Abstract】: Our paper is actually two contributions in one. First, we argue that IBM's Jeopardy! playing machine needs a formal semantics. We present several arguments as we discuss the system. We also situate the work in the broader context of contemporary AI. Our second point is that the work in this area might well be done as a broad collaborative project. Hence our "Blue Sky" contribution is a proposal to organize a polymath-style effort aimed at developing formal tools for the study of state-of-the-art question-answering systems, and other large-scale NLP efforts whose architectures and algorithms lack a theoretical foundation.
【Keywords】: IBM Watson; formal semantics; NLP; polymath; question answering
【Paper Link】 【Pages】:4083-4087
【Authors】: Xiaojin Zhu
【Abstract】: I draw the reader's attention to machine teaching, the problem of finding an optimal training set given a machine learning algorithm and a target model. In addition to generating fascinating mathematical questions for computer scientists to ponder, machine teaching holds the promise of enhancing education and personnel training. The Socratic dialogue style aims to stimulate critical thinking.
【Keywords】:
【Paper Link】 【Pages】:4088-4092
【Authors】: Shlomo Zilberstein
【Abstract】: The vision of populating the world with autonomous systems that reduce human labor and improve safety is gradually becoming a reality. Autonomous systems have changed the way space exploration is conducted and are beginning to transform everyday life with a range of household products. In many areas, however, there are considerable barriers to the deployment of fully autonomous systems. We refer to systems that require some degree of human intervention in order to complete a task as semi-autonomous systems. We examine the broad rationale for semi-autonomy and define basic properties of such systems. Accounting for the human in the loop presents a considerable challenge for current planning techniques. We examine various design choices in the development of semi-autonomous systems and their implications on planning and execution. Finally, we discuss fruitful research directions for advancing the science of semi-autonomy.
【Keywords】: autonomy; planning; human-in-the-loop
【Paper Link】 【Pages】:4093-4099
【Authors】: Vinay K. Chaudhri
【Abstract】: In this paper, I summarize the results of a decade-plus of research and development driven by the vision that human knowledge can be grounded in a small number of prototypical components that can be extended through composition and analogy. These ideas have been embodied in a system called AURA, which has been used to engineer an expressive knowledge base for an intelligent biology textbook. The focus of the current paper is to abstract away from the specifics and instead describe the core ideas in such a manner that they can be transferred and applied in different contexts, and to relate those ideas to ongoing research by others.
【Keywords】: knowledge based systems, knowledge representation and reasoning, ontologies, analogical reasoning, prototypes, composition
【Paper Link】 【Pages】:4100-4106
【Authors】: Cristina Conati ; Giuseppe Carenini ; Dereck Toker ; Sébastien Lallé
【Abstract】: This paper summarizes an ongoing multi-year project aiming to uncover knowledge and techniques for devising intelligent environments for user-adaptive visualizations. We ran three studies designed to investigate the impact of user and task characteristics on user performance and satisfaction in different visualization contexts. Eye-tracking data collected in each study was analyzed to uncover possible interactions between user/task characteristics and gaze behavior during visualization processing. Finally, we investigated user models that can assess user characteristics relevant for adaptation from eye tracking data.
【Keywords】: Information Visualization; User modeling; Adaptive interface; Eye tracking
【Paper Link】 【Pages】:4107-4111
【Authors】: Luc De Raedt
【Abstract】: Applying machine learning and data mining to novel applications is cumbersome. This observation is the prime motivation for the interest in languages for learning and mining. This note provides a gentle introduction to three types of languages that support machine learning and data mining: inductive query languages, which extend database query languages with primitives for mining and learning; modelling languages, which allow one to declaratively specify and solve mining and learning problems; and programming languages, which support the learning of functions and subroutines. It uses an example of each type of language to introduce the underlying ideas and puts them into a common perspective. This then forms the basis for a short analysis of the state of the art.
【Keywords】: artificial intelligence, machine learning, data mining, languages
【Paper Link】 【Pages】:4112-4118
【Authors】: Pierre Marquis
【Abstract】: This paper is concerned with knowledge compilation (KC), a family of approaches developed in AI for more than twenty years. Knowledge compilation consists in pre-processing some pieces of the available information in order to improve the computational efficiency (especially, the time complexity) of some tasks. In this paper, the focus is on three KC topics that have given rise to many works: the development of knowledge compilation techniques for the clausal entailment problem in propositional logic, the concept of compilability, and the notion of a knowledge compilation map. The three topics, as well as an overview of the main results from the literature, are presented. Some recent research lines are also discussed.
【Keywords】: Knowledge compilation
【Paper Link】 【Pages】:4119-4126
【Authors】: Oliver Niggemann ; Volker Lohweg
【Abstract】: Cyber-Physical Production Systems (CPPSs) are in the focus of research, industry and politics: by applying new IT and new computer science solutions, production systems will become more adaptable, more resource-efficient and more user-friendly. The analysis and diagnosis of such systems is a major part of this trend: plants should automatically detect wear, faults and suboptimal configurations. This paper reflects the current state of the art in diagnosis against the requirements of CPPSs, identifies three main gaps and gives application scenarios to outline first ideas for potential solutions to close these gaps.
【Keywords】: Cyber-Physical Systems; Machine Learning; Diagnosis; Anomaly Detection
【Paper Link】 【Pages】:4127-4131
【Authors】: Tuomas Sandholm
【Abstract】: Most real-world games and many recreational games are games of incomplete information. Over the last dozen years, abstraction has emerged as a key enabler for solving large incomplete-information games. First, the game is abstracted to generate a smaller, abstract game that is strategically similar to the original game. Second, an approximate equilibrium is computed in the abstract game. Third, the strategy from the abstract game is mapped back to the original game. In this paper, I will review key developments in the field. I present reasons for abstracting games, and point out the issue of abstraction pathology. I then review the practical algorithms for information abstraction and action abstraction. I then cover recent theoretical breakthroughs that beget bounds on the quality of the strategy from the abstract game, when measured in the original game. I then discuss how to reverse map the opponent's action into the abstraction if the opponent makes a move that is not in the abstraction. Finally, I discuss other topics of current and future research.
【Keywords】: abstraction; game solving; game theory; equilibrium finding; imperfect-information game; incomplete-information game
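The entry above outlines a three-step workflow: abstract the game, compute an approximate equilibrium in the abstraction, and map the strategy back to the original game. The toy Python sketch below illustrates only the shape of that loop; the bucketing heuristic, the uniform stand-in "solver", and all function names are invented here and are not from the paper.

    # Hedged sketch of the abstract-solve-map-back loop; all names invented.
    from collections import defaultdict

    def bucket_by_strength(hands, n_buckets):
        """Toy information abstraction: group concrete hands into buckets
        by a coarse strength score."""
        scored = sorted(hands, key=lambda h: h["strength"])
        buckets = defaultdict(list)
        for i, hand in enumerate(scored):
            buckets[i * n_buckets // len(scored)].append(hand["name"])
        return dict(buckets)

    def solve_abstract_game(buckets):
        """Stand-in for equilibrium finding: a uniform strategy per bucket."""
        actions = ["fold", "call", "raise"]
        return {b: {a: 1.0 / len(actions) for a in actions} for b in buckets}

    def map_back(abstract_strategy, buckets):
        """Reverse mapping: each concrete hand plays its bucket's strategy."""
        return {h: abstract_strategy[b] for b, hs in buckets.items() for h in hs}

    hands = [{"name": f"h{i}", "strength": (i * 7) % 13} for i in range(26)]
    buckets = bucket_by_strength(hands, n_buckets=3)
    strategy = map_back(solve_abstract_game(buckets), buckets)
    print(strategy["h0"])

A real pipeline would replace the middle step with an equilibrium-finding algorithm such as counterfactual regret minimization; the quality bounds discussed above concern how much is lost in the two mapping steps.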
【Paper Link】 【Pages】:4132-4139
【Authors】: Lenhart K. Schubert
【Abstract】: In recent years, there has been renewed interest in the NLP community in genuine language understanding and dialogue. Thus the long-standing issue of how the semantic content of language should be represented is reentering the communal discussion. This paper provides a brief "opinionated survey" of broad-coverage semantic representation (SR). It suggests multiple desiderata for such representations, and then outlines more than a dozen approaches to SR — some long-standing, and some more recent, providing quick characterizations, pros, cons, and some comments on implementations.
【Keywords】: semantic representation, NLP, NLU
【Paper Link】 【Pages】:4140-4141
【Authors】: Saad Alqithami
【Abstract】: This paper concisely proposes a distinguishing paradigm for studying a very large collective group of agents, called a Network Organization. We formally define and substantially evaluate this paradigm for self-governing agents, in which the state value function changes dynamically, and describe its salient properties.
【Keywords】: Multi-Agent Systems; Organizational Theory; Formal Modelling; Network Organization
【Paper Link】 【Pages】:4142-4143
【Authors】: Julio César Bahamón ; Camille Barot ; R. Michael Young
【Abstract】: We present an approach to incorporating interesting and compelling characters in planning-based narrative generation. The approach is based on a computational model that uses character actions to portray characters as having distinct and well-defined personalities.
【Keywords】: Humans and AI; Narrative Generation; Planning; Character Personality; Applications of AI; Automated Reasoning
【Paper Link】 【Pages】:4144-4145
【Authors】: Matt Barnes ; Nick Gisolfi ; Madalina Fiterau ; Artur Dubrawski
【Abstract】: In many applications, training data is provided in the form of related datasets obtained from several sources, which typically affects the sample distribution. The learned classification models, which are expected to perform well on similar data coming from new sources, often suffer due to bias introduced by what we call 'spurious' samples: those due to source characteristics and not representative of any other part of the data. As standard outlier detection and robust classification usually fall short of determining groups of spurious samples, we propose a procedure which identifies the common structure across datasets by minimizing a multi-dataset divergence metric, increasing accuracy for new datasets.
【Keywords】: Outlier detection; density estimation
【Paper Link】 【Pages】:4146-4147
【Authors】: Li Chen ; Matthew Patton
【Abstract】: Online advertising is a huge and important industry. Knowledge of website attributes can contribute greatly to business strategies for ad targeting, content display, inventory purchase and revenue prediction. In this paper, we introduce a stochastic blockmodel for the website relations induced by the event of online user visitation. We propose two clustering algorithms to discover the intrinsic structures of websites, and compare their performance with a goodness-of-fit method and a deterministic graph partitioning method. We demonstrate the effectiveness of our algorithms on both simulated data and an AOL website dataset.
【Keywords】: Online Advertising, Graph Inference, Clustering
【Paper Link】 【Pages】:4148-4149
【Authors】: Mahsa Chitsaz ; Zhe Wang ; Kewen Wang
【Abstract】: With the current upward trend in semantically annotated data, ontology-based data access (OBDA) was formulated to tackle the problem of data integration and query answering, where an ontology is formalized as a description logic TBox. In order to meet usability requirements set by users, efforts have been made to equip OBDA systems with explanation facilities. One important explanation tool for DL ontologies, referred to as query abduction, can be formalised as abductive reasoning. In particular, given an ontology and an observation (i.e., a query with an answer), an explanation of the observation is a set of facts that together with the ontology entails the observation. In this paper, we develop a sound and complete algorithm of query abduction for general conjunctive queries in ELH ontologies. This is achieved through ontology approximation and query rewriting. We implemented a prototypical system using the highly optimized Prolog engine XSB. We evaluated our algorithm over a university benchmark ontology, and our experimental results show that the system is capable of handling query abduction problems for ontologies with approximately 10 million ABox assertions.
【Keywords】: Query abduction, description logic ontology, ontology approximation, query rewriting
【Paper Link】 【Pages】:4150-4151
【Authors】: Jennifer D'Souza
【Abstract】: We propose a simple multi-pass sieve framework that applies tiers of deterministic normalization modules one at a time, from highest to lowest precision, for the task of normalizing names. While a sieve-based architecture has been shown effective in coreference resolution, it has not yet been applied to the normalization task. We find that even in this task, the approach retains its characteristic features of being simple and highly modular. In addition, it proves robust when evaluated on two different kinds of data, clinical notes and biomedical text, demonstrating high accuracy in normalizing disorder names found in both datasets.
【Keywords】: normalization; data mining; information retrieval
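As a concrete illustration of the multi-pass sieve idea described above, here is a minimal Python sketch: deterministic tiers are applied from highest to lowest precision, and a mention is resolved by the first tier that fires. The tiers and the tiny lexicon are invented for illustration and are not the modules used in the paper.

    # Hedged sketch of a multi-pass sieve for name normalization: tiers run
    # from highest to lowest precision; the first tier that fires resolves
    # the mention. Tiers and lexicon here are invented examples.

    LEXICON = {"heart attack": "myocardial infarction",
               "mi": "myocardial infarction"}

    def exact_match(mention):
        return LEXICON.get(mention)

    def case_punct_normalized(mention):
        key = "".join(c for c in mention.lower() if c.isalnum() or c == " ")
        return LEXICON.get(" ".join(key.split()))

    def token_subset(mention):
        toks = set(mention.lower().split())
        for surface, concept in LEXICON.items():
            if toks and toks <= set(surface.split()):
                return concept
        return None

    SIEVE = [exact_match, case_punct_normalized, token_subset]  # precision order

    def normalize(mention):
        for tier in SIEVE:
            hit = tier(mention.lower())
            if hit:
                return hit
        return None  # pass the mention through unresolved

    print(normalize("Heart Attack!"))  # -> myocardial infarction

The modularity claimed in the abstract shows up directly: a tier is added, removed, or reordered by editing the SIEVE list.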
【Paper Link】 【Pages】:4152-4153
【Authors】: Jennifer D'Souza
【Abstract】: This paper describes a learning-based approach for the automatic derivation of word variant forms by the suffixation process. We employ the sequence labeling technique, which entails learning when to preserve, delete, substitute, or add a letter to form a new word from a given word. The features used by the learner are based on characters, phonetics, and hyphenation positions of the given word. To ensure that our system is robust to word variants that can arise from different forms of a root word, we generate multiple variant hypotheses for each word based on the sequence labeler's predictions. We then filter out ill-formed predictions, and create clusters of word variants by merging a word and its predicted variants with other words and their predicted variants, provided the groups share a word in common. Our results show that this learning-based approach is feasible for the task and warrants further exploration.
【Keywords】: derivational morphology; suffixation; sequence labeling
【Paper Link】 【Pages】:4154-4155
【Authors】: Priya Lekha Donti ; Jacob Rosenbloom ; Alex Gruver ; James C. Boerkoel Jr.
【Abstract】: College students often struggle to balance their work with personal wellness. In part, this occurs because students work when they are unable to focus. We hypothesize that we can adapt the Experience Sampling Method (ESM) to build a model of users’ efficacy and predict when they will be most likely to experience flow, a state of motivation and immersion. We also hypothesize that we can present this information effectively to users, allowing them to understand when they are most likely to achieve flow. In order to test these hypotheses, we introduce the Productivity and Wellness Pal (PaWPal), a smartphone-based application that seeks to make users aware of their efficacy at various tasks as well as which courses of action are likely to lead to immersive experiences.
【Keywords】: Applications of AI; Human Computer Interaction; Information Retrieval; Machine Learning; Qualitative Reasoning
【Paper Link】 【Pages】:4156-4157
【Authors】: Zhanwei Du ; Yongjian Yang ; Chuang Ma ; Yuan Bai
【Abstract】: Individual mood is important for physical and emotional well-being, creativity, and working memory. However, due to the lack of long-term daily tracking data at the individual level, most current work focuses on the population level or on small groups over short terms. An overlooked yet important task is to uncover the mechanism of sentiment spreading at the individual level from daily behavior data. This paper studies this task by raising a fundamental question that the literature has not yet sufficiently answered: given a social network, how does sentiment spread? Current individual-level network spreading models assume one can infect others only after having been infected. Considering the characteristics of negative emotion spreading at the individual level, we relax this assumption and give an individual-level negative emotion spreading model. In this paper, we propose a Graph-Coupled Hidden Markov Sentiment Model for modeling the propagation of infectious negative sentiment locally within a social network. Taking the MIT Social Evolution dataset as an example, the experimental results verify the efficacy of our techniques on real-world data.
【Keywords】:
【Paper Link】 【Pages】:4158-4159
【Authors】: Madalina Fiterau ; Artur Dubrawski
【Abstract】: We introduce an active learning framework designed to train classification models which use informative projections. Our approach uses the obtained low-dimensional models to select unlabeled data for annotation by experts. The advantage of our approach is that the labeling effort is spent mainly on samples which benefit models from the considered hypothesis class. This results in an improved learning rate over standard selection criteria for data from the clinical domain.
【Keywords】: active learning; feature selection; informative projection; clinical data
【Paper Link】 【Pages】:4160-4161
【Authors】: Katie Genter ; Peter Stone
【Abstract】: Flocking is an emergent behavior exhibited by many different animal species, including birds and fish. In our work we consider adding a small set of influencing agents, which are under our control, into a flock. Following ad hoc teamwork methodology, we assume that we are given knowledge of, but no direct control over, the rest of the flock. In the ongoing work highlighted in this abstract, we specifically consider the problem of where to initially place the influencing agents we add to such a flock. We use these influencing agents to influence the flock to behave in a particular way, for example, to fly in a particular orientation or in a particular pattern, such as to avoid an obstacle.
【Keywords】: ad hoc teamwork; flocking; coordination
【Paper Link】 【Pages】:4162-4163
【Authors】: Daniel J. Geschwender ; Robert J. Woodward ; Berthe Y. Choueiry
【Abstract】: In Constraint Processing, many algorithms for enforcing the same level of local consistency may exist, and the performance of those algorithms varies widely. In order to understand what problem features lead to better performance of one algorithm over another, we utilize an algorithm configurator to tune the parameters of a random problem generator and maximize the performance difference between two consistency algorithms for enforcing constraint minimality. Our approach allowed us to generate instances that one algorithm solves 1000 times faster than the other.
【Keywords】: Constraint Satisfaction; Optimization
【Paper Link】 【Pages】:4164-4165
【Authors】: Nick Gisolfi ; Madalina Fiterau ; Artur Dubrawski
【Abstract】: We consider the problem of identifying discrepancies between training and test data which are responsible for the reduced performance of a classification system. Intended for use when data acquisition is an iterative process controlled by domain experts, our method exposes insufficiencies of training data and presents them in a user-friendly manner. The system is capable of working with any classification system which admits diagnostics on test data. We illustrate the usefulness of our approach in recovering compact representations of the revealed gaps in training data and show that predictive accuracy of the resulting models is improved once the gaps are filled through collection of additional training samples.
【Keywords】: Dataset Shift; Decision Support Systems; Ensemble Methods; Feature Selection
【Paper Link】 【Pages】:4166-4167
【Authors】: Sviatlana Höhn
【Abstract】: Troubles in hearing, comprehension or speech production are common in human conversations, especially if participants of the conversation communicate in a foreign language that they have not yet fully mastered. Here I describe a data-driven model for simulation of dialogue sequences where the learner user does not understand the talk of a conversational agent in chat and asks for clarification.
【Keywords】: Linguistic Repair in Chat, Conversational Agents, Second Language Acquisition
【Paper Link】 【Pages】:4168-4169
【Authors】: Hadi Hosseini ; Kate Larson ; Robin Cohen
【Abstract】: We consider the problem of repeatedly matching a set of alternatives to a set of agents in the absence of monetary transfer. We propose a generic framework for evaluating sequential matching mechanisms with dynamic preferences, and show that unlike single-shot settings, the random serial dictatorship mechanism is manipulable.
【Keywords】: Matching; Random Assignment; Strategyproofness; Dynamic Preferences; Mechanism Design
【Paper Link】 【Pages】:4170-4171
【Authors】: Young-Seob Jeong ; Ho-Jin Choi
【Abstract】: We propose a new customizable tool, the Language Independent Feature Extractor (LIFE), which models the inherent patterns of any language and extracts relevant features of the language. There are two contributions of this work: (1) no labeled data is necessary to train LIFE (it works given a sufficient number of unlabeled documents), and (2) LIFE is designed to be applicable to any language. We demonstrate the usefulness of LIFE with experimental results on time information extraction.
【Keywords】: language independent feature; topic model; hidden markov model
【Paper Link】 【Pages】:4172-4173
【Authors】: Xinxin Jiang ; Wei Liu ; Longbing Cao ; Guodong Long
【Abstract】: Context-aware features have been widely recognized as important factors in recommender systems. However, as a major technique in recommender systems, traditional Collaborative Filtering (CF) does not provide a straightforward way of integrating context-aware information into personal recommendation. We propose a Coupled Collaborative Filtering (CCF) model to measure the contextual information and use it to improve recommendations. In the proposed approach, coupled similarity is computed from inter-item, intra-context and inter-context interactions among item, user and context-aware factors. Experiments based on different types of CF models demonstrate the effectiveness of our design.
【Keywords】:
【Paper Link】 【Pages】:4174-4175
【Authors】: Mayank Kejriwal
【Abstract】: Entity Resolution (ER) concerns identifying logically equivalent entity pairs across databases. To avoid quadratic pairwise comparisons of entities, blocking methods are used. Sorted Neighborhood is an established blocking method for relational databases. It has not been applied on graph-based data models such as the Resource Description Framework (RDF). This poster presents a modular workflow for applying Sorted Neighborhood to RDF. Real-world evaluations demonstrate the workflow's utility against a popular baseline.
【Keywords】: Data Matching; Linked Data; Semantic Web; Entity Resolution
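For readers unfamiliar with Sorted Neighborhood, the generic blocking procedure the entry above builds on can be sketched in a few lines of Python: compute a blocking key per entity, sort by key, and emit candidate pairs within a sliding window. The key function below is invented, and the paper's RDF-specific workflow (which must first derive such keys from triples) is not reproduced.

    # Hedged sketch of Sorted Neighborhood blocking (generic, not the
    # paper's RDF workflow): build a blocking key per entity, sort by key,
    # slide a window of size w, and emit candidate pairs inside the window.

    def blocking_key(entity):
        # Invented key: first 3 chars of name + first char of city.
        return (entity["name"][:3] + entity.get("city", " ")[:1]).lower()

    def sorted_neighborhood(entities, w=3):
        ordered = sorted(entities, key=blocking_key)
        pairs = set()
        for i, e in enumerate(ordered):
            for j in range(i + 1, min(i + w, len(ordered))):
                pairs.add((e["id"], ordered[j]["id"]))
        return pairs

    people = [
        {"id": 1, "name": "Jonathan Doe", "city": "Austin"},
        {"id": 2, "name": "Jon Doe", "city": "Austin"},
        {"id": 3, "name": "Jane Roe", "city": "Boston"},
    ]
    print(sorted_neighborhood(people, w=2))  # candidate pairs only

The point of the method is visible even at this scale: only pairs close in the sorted order are compared, avoiding the quadratic all-pairs comparison.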
【Paper Link】 【Pages】:4176-4177
【Authors】: Mackenzie Leake ; Liyu Xia ; Kamil Rocki ; Wayne Imaino
【Abstract】: In the Hierarchical Temporal Memory (HTM) paradigm the effect of overlap between inputs on the activation of columns in the spatial pooler is studied. Numerical results suggest that similar inputs are represented by similar sets of columns and dissimilar inputs are represented by dissimilar sets of columns. It is shown that the spatial pooler produces these results under certain conditions for the connectivity and proximal thresholds at initialization. Qualitative arguments about the learning dynamics of the spatial pooler are then discussed.
【Keywords】: Hierarchical Temporal Memory; HTM; Learning Algorithms; Machine Learning; Spatial Pooler
【Paper Link】 【Pages】:4178-4179
【Authors】: Chao Li ; Lei Ji ; Jun Yan
【Abstract】: According to the website AcronymFinder.com, one of the world's largest and most comprehensive dictionaries of acronyms, an average of 37 new human-edited acronym definitions are added every day. The site currently lists 379,918 acronyms with 4,766,899 definitions, i.e., 12.5 definitions per acronym on average. Identifying what exactly an acronym means in a given context is an important research topic for document comprehension as well as for document retrieval. In this paper, we propose two word-embedding-based models for acronym disambiguation. Word embedding represents words in a continuous, multidimensional vector space, so that the semantic similarity between words can easily be calculated from the vector distance. We evaluate the models on the MSH and ScienceWISE datasets, and both models outperform the state-of-the-art methods on accuracy. The experimental results show that word embedding helps to improve acronym disambiguation.
【Keywords】: Word Embedding; Acronym Disambiguation; Machine Learning
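The core scoring step stated in the abstract above, picking the definition whose vector is closest to the context, can be sketched as follows. Toy bag-of-words count vectors stand in for trained word embeddings, and the example definitions are invented; this is not the paper's model.

    # Hedged sketch of similarity-based acronym disambiguation: score each
    # candidate definition by cosine similarity to the mention's context.
    # Bag-of-words counts stand in for trained embeddings here.

    import math
    from collections import Counter

    def vec(text):
        return Counter(text.lower().split())

    def cosine(a, b):
        dot = sum(a[t] * b[t] for t in a)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    def disambiguate(context, definitions):
        c = vec(context)
        return max(definitions, key=lambda d: cosine(c, vec(d)))

    defs_mi = ["myocardial infarction heart muscle damage",
               "michigan us state great lakes"]
    print(disambiguate("patient chest pain heart attack history", defs_mi))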
【Paper Link】 【Pages】:4180-4181
【Authors】: Qun Luo ; Weiran Xu
【Abstract】: We propose improved word embedding models based on the vLBL and ivLBL models, sharing representations between context and target words and using document representations. Our proposed models are much simpler, having almost half as many parameters as the state-of-the-art methods. We achieve better results on the word analogy task than the best previously reported ones, using significantly less training data and computing time.
【Keywords】: Knowledge Representation; Machine Learning; Statistical Learning; Data Mining
【Paper Link】 【Pages】:4182-4183
【Authors】: Valentin Mayer-Eichberger
【Abstract】: Lazy Clause Generation (LCG) solvers dominate the current constraint programming competitions. These solvers successfully combine systematic propagation-based search, global constraints and conflict clause learning from SAT solving into a hybrid approach. My research project extends the LCG methodology by using a mix of eager and lazy encodings and a richer set of constraint decompositions. Global constraints exhibit a whole hierarchy of different decompositions into more basic constraints. In our work we want to take advantage of such hierarchies and identify criteria for how constraints should be decomposed before and during search.
【Keywords】: SAT and CSP: Modeling/Formulations
【Paper Link】 【Pages】:4184-4185
【Authors】: Vaishnavh Nagarajan ; Leandro Soriano Marcolino ; Milind Tambe
【Abstract】: We show that without using any domain knowledge, we can predict the final performance of a team of voting agents, at any step towards solving a complex problem.
【Keywords】: Multi-agent systems; Teamwork; Multi-agent Learning
【Paper Link】 【Pages】:4186-4187
【Authors】: Phillip Odom ; Sriraam Natarajan
【Abstract】: Intelligent systems that interact with humans typically require demonstrations and/or advice from the expert for optimal decision making. While the active learning formalism allows for these systems to incrementally acquire demonstrations from the human expert, most learning systems require all the advice about the domain in advance. We consider the problem of actively soliciting human advice in an inverse reinforcement learning setting where the utilities are learned from demonstrations. Our hypothesis is that such solicitation of advice reduces the burden on the human to provide advice about every scenario in advance.
【Keywords】:
【Paper Link】 【Pages】:4188-4189
【Authors】: Swetasudha Panda ; Yevgeniy Vorobeychik
【Abstract】: Drug and vaccination therapies are important tools in the battle against infectious diseases such as HIV and influenza. However, many viruses, including HIV, can rapidly escape the therapeutic effect through a sequence of mutations. We propose to design vaccines, or, equivalently, antibody sequences, that make such evasion difficult. We frame this as a bilevel combinatorial optimization problem of maximizing the escape cost, defined as the minimum number of virus mutations needed to evade binding an antibody. Binding strength can be evaluated by a protein modeling software package, Rosetta, which serves as an oracle and computes a binding score for an input virus-antibody pair. However, score calculation for each possible such pair is intractable, as the search space is of the order of 10^130. We propose a three-pronged approach to address this: first, local search, using a native antibody sequence as leverage; second, machine learning to predict binding for antibody-virus pairs; and third, Poisson regression to predict escape costs as a function of antibody sequence assignment. We demonstrate the effectiveness of the proposed methods, and exhibit an antibody with a far higher escape cost (7) than the native (1).
【Keywords】: Heuristic search, Machine learning, Optimization
【Paper Link】 【Pages】:4190-4191
【Authors】: Natalie Parde ; Michalis Papakostas ; Konstantinos Tsiakas ; Rodney D. Nielsen
【Abstract】: Training robots about the objects in their environment requires a multimodal correlation of features extracted from visual and linguistic sources. This work abstracts the task of collecting multimodal training data for object and feature learning by encapsulating it in an interactive game, I Spy, played between human players and robots. It introduces the concept of the game, briefly describes its methodology, and finally presents an evaluation of the game's performance and its appeal to human players.
【Keywords】: I Spy; 20 Questions; human-robot games; multimodal training data; games with a purpose
【Paper Link】 【Pages】:4192-4193
【Authors】: Mahboobeh Parsapoor ; John Brooke ; Bertil Svensson
【Abstract】: This paper briefly describes how the neural structure of fear conditioning has inspired the development of a computational intelligence model referred to as the brain emotional learning-inspired model (BELIM). The model is applied to make long-step-ahead predictions of solar activity and geomagnetic storms.
【Keywords】:
【Paper Link】 【Pages】:4194-4195
【Authors】: Charles Peabody ; Jennifer Seitzer
【Abstract】: Grammatical Evolution (GE) is that area of genetic algorithms that evolves computer programs in high-level languages possessing a BNF grammar. In this work, we present GEF (“Grammatical Evolution for the Finch”), a system that employs grammatical evolution to create a Finch robot controller program in Java. The system uses the traditional GE model as well as extensions and augmentations that push the boundaries of the goal-oriented contexts in which robots typically act, including a meta-level handler that fosters a level of self-awareness in the robot. To handle contingencies, the GEF system has been endowed with the ability to perform meta-level jumps. When confronted with unplanned events and dynamic changes in the environment, our robot will automatically transition to pursue another goal, changing fitness functions, and generate and invoke operating-system-level scripting to facilitate the change. The robot houses a Raspberry Pi controller that is capable of executing one (evolved) program while wirelessly receiving another over an asynchronous client. This work is part of an overall project that involves planning for contingencies. In this poster, we present the development framework and system architecture of GEF, including the newly added meta-level handler, as well as some other system successes, failures, and insights.
【Keywords】: grammatical evolution; evolutionary computation; meta-level jumping; contingency handling; robot controllers
【Paper Link】 【Pages】:4196-4197
【Authors】: Chiara Piacentini ; Maria Fox ; Derek Long
【Abstract】: Numeric Timed Initial Fluents are a new feature in PDDL that extends the concept of Timed Initial Literals to numeric fluents. They are particularly useful for modeling independent functions that change through time and influence the actions to be applied. Although they are very useful for modeling real-world problems, they are not systematically defined in the family of PDDL languages and are not implemented in any generic PDDL planner except POPF2 and UPMurphi. In this paper we present an extension of the planner POPF2, called POPF-TIF, to handle problems with numeric Timed Initial Fluents. We propose and evaluate two contributions: the first is based on improvements to the heuristic evaluation, while the second considers alternative search algorithms based on a mixture of Enforced Hill Climbing and Best First Search.
【Keywords】: Temporal Planning; Numeric Planning; Exogenous Events
【Paper Link】 【Pages】:4198-4199
【Authors】: Jedrzej Potoniec ; Agnieszka Lawrynowicz
【Abstract】: We present an idea of using mathematical modelling to guide the process of mining a set of patterns in an RDF graph and then exploiting these patterns to build expressive OWL class hierarchies.
【Keywords】: ontology learning;
【Paper Link】 【Pages】:4200-4201
【Authors】: Siting Ren ; Sheng Gao ; Jianxin Liao ; Jun Guo
【Abstract】: Cross-domain recommendation has been proposed to transfer user behavior patterns by pooling together the rating data from multiple domains to alleviate the sparsity problem appearing in single rating domains. However, previous models assume only that multiple domains share a latent common rating pattern based on user-item co-clustering. To capture the diversity among different domains, we propose a novel Probabilistic Cluster-level Latent Factor (PCLF) model to improve cross-domain recommendation performance. Experiments on several real-world datasets demonstrate that our proposed model outperforms the state-of-the-art methods on the cross-domain recommendation task.
【Keywords】:
【Paper Link】 【Pages】:4202-4203
【Authors】: Sherry Shanshan Ruan ; Gheorghe Comanici ; Prakash Panangaden ; Doina Precup
【Abstract】: We provide a novel, flexible, iterative refinement algorithm to automatically construct an approximate state-space representation for Markov Decision Processes (MDPs). Our approach leverages bisimulation metrics, which have been used in prior work to generate features to represent the state space of MDPs. We address a drawback of this approach, which is the expensive computation of the bisimulation metrics. We propose an algorithm to generate an iteratively improving sequence of state space partitions. Partial metric computations guide the representation search and provide much lower space and computational complexity, while maintaining strong convergence properties. We provide theoretical results guaranteeing convergence as well as experimental illustrations of the accuracy and savings (in time and memory usage) of the new algorithm, compared to traditional bisimulation metric computation.
【Keywords】:
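For intuition, state-space aggregation of the kind discussed above can be illustrated with classical signature-based partition refinement: repeatedly split blocks whose states disagree on reward or on transition mass into the current blocks. This hedged sketch is the textbook splitting loop, not the paper's bisimulation-metric-guided algorithm.

    # Hedged sketch of iterative partition refinement on an MDP; this is
    # the classical exact splitting loop, not the paper's metric-guided
    # approximation.
    from collections import defaultdict

    def refine(states, actions, R, P):
        """Partition states by iterated reward/transition signatures.
        R[s][a] is a reward; P[s][a] maps next states to probabilities."""
        block_of = {s: 0 for s in states}          # start from one block
        while True:
            def signature(s):
                sig = [block_of[s]]                # never merge: only split
                for a in actions:
                    mass = defaultdict(float)
                    for t, p in P[s][a].items():
                        mass[block_of[t]] += p
                    sig.append((a, R[s][a], tuple(sorted(mass.items()))))
                return tuple(sig)
            groups = defaultdict(list)
            for s in states:
                groups[signature(s)].append(s)
            new_block_of = {}
            for i, (_, members) in enumerate(sorted(groups.items())):
                for s in members:
                    new_block_of[s] = i
            if new_block_of == block_of:
                return list(groups.values())
            block_of = new_block_of

    S = ["s0", "s1", "s2"]
    A = ["go"]
    R = {"s0": {"go": 1.0}, "s1": {"go": 1.0}, "s2": {"go": 0.0}}
    P = {"s0": {"go": {"s2": 1.0}}, "s1": {"go": {"s2": 1.0}},
         "s2": {"go": {"s2": 1.0}}}
    print(refine(S, A, R, P))   # [['s0', 's1'], ['s2']]: s0, s1 aggregated

The expense the paper targets is visible here: every pass touches all states and transitions, which is what partial metric computations are meant to avoid.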
【Paper Link】 【Pages】:4204-4205
【Authors】: Claudia Schulz
【Abstract】: Since Assumption-Based Argumentation (ABA) was introduced in the nineties, the structure and semantics of an ABA framework have been studied exclusively in logical terms, without any graphical representation. Here, we show how an ABA framework and its complete semantics can be displayed in a graph, clarifying the structure of the ABA framework as well as the resulting complete assumption labellings. Furthermore, we show that such an ABA graph can be used to represent the structure and semantics of a logic program (LP), based on the correspondence between the semantics of an LP and an ABA framework encoding this LP.
【Keywords】: Assumption-Based Argumentation; Complete Semantics; Labelling; Logic Programming; 3-valued Stable Semantics
【Paper Link】 【Pages】:4206-4207
【Authors】: Jingyu Shao ; Junfu Yin ; Wei Liu ; Longbing Cao
【Abstract】: The itemsets discovered by traditional High Utility Itemsets Mining (HUIM) methods are more useful than frequent itemset mining outcomes; however, they are usually disordered, not actionable, and sometimes accidental, because utility is the only criterion and no relations among itemsets are considered. In this paper, we introduce the concept of combined mining to select combined itemsets that are not only high utility and high frequency, but also capture relations between itemsets. An effective method for mining such actionable combined high utility itemsets is proposed. The experimental results are promising compared to those from a traditional HUIM algorithm (UP-Growth).
【Keywords】:
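The selection criterion described above, itemsets that are simultaneously high utility and high frequency, can be made concrete with a brute-force toy sketch. Thresholds and data are invented, and this shows only the filtering idea, not the UP-Growth mining algorithm.

    # Hedged sketch of combined selection: keep itemsets that clear both a
    # minimum support and a minimum utility. Brute force over tiny data.
    from itertools import combinations

    transactions = [({"a", "b"}, {"a": 5, "b": 2}),
                    ({"a", "b", "c"}, {"a": 5, "b": 2, "c": 9}),
                    ({"b", "c"}, {"b": 4, "c": 9})]

    def support_and_utility(itemset):
        sup, util = 0, 0
        for items, utils in transactions:
            if itemset <= items:
                sup += 1
                util += sum(utils[i] for i in itemset)
        return sup, util

    items = set().union(*(t[0] for t in transactions))
    for r in (1, 2):
        for combo in combinations(sorted(items), r):
            sup, util = support_and_utility(set(combo))
            if sup >= 2 and util >= 10:      # min support and min utility
                print(combo, sup, util)

Note how the frequency threshold alone would keep {"b"} (it appears everywhere but has low utility), while the combined criterion drops it.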
【Paper Link】 【Pages】:4208-4209
【Authors】: Samta Shukla ; Aditya Telang ; Salil Joshi ; L. Venkata Subramaniam
【Abstract】: Much work has been done on understanding and predicting human mobility in time. In this work, we are interested in obtaining the set of users who are spatio-temporally most similar to a query user. We propose an efficient user data representation called Spatio-Temporal Signatures to keep a complete record of user movement. We define a measure called Spatio-Temporal similarity for comparing a given pair of users. Although computing exact pairwise Spatio-Temporal similarities between the query user and all users is inefficient, we show that with our hybrid pruning scheme the most similar users can be obtained in logarithmic time, within a (1+ε)-factor approximation of the optimal. We are developing a framework to test our models against a real dataset of urban users.
【Keywords】:
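One plausible reading of the Spatio-Temporal Signature idea is sketched below: discretize each user's trace into (time-slot, grid-cell) frequencies and compare users by histogram intersection. The discretization and the similarity choice are assumptions made here for illustration; the paper's exact representation and hybrid pruning scheme are not reproduced.

    # Hedged sketch of a spatio-temporal signature: bucket a movement trace
    # by hour-of-day and grid cell, then compare normalized histograms.
    from collections import Counter

    def signature(trace, slot_len=3600, cell=0.01):
        """trace: list of (unix_time, lat, lon) samples."""
        sig = Counter((int(t // slot_len) % 24,
                       round(lat / cell), round(lon / cell))
                      for t, lat, lon in trace)
        total = sum(sig.values())
        return {k: v / total for k, v in sig.items()}

    def st_similarity(sig_a, sig_b):
        """Histogram intersection, in [0, 1]."""
        return sum(min(sig_a[k], sig_b.get(k, 0.0)) for k in sig_a)

    u1 = signature([(0, 40.71, -74.00), (3600, 40.72, -74.00)])
    u2 = signature([(0, 40.71, -74.00), (3600, 40.80, -73.90)])
    print(st_similarity(u1, u2))  # 0.5: one shared (slot, cell) out of two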
【Paper Link】 【Pages】:4210-4211
【Authors】: Niket Tandon ; Shekhar Sharma ; Tanima Makkad
【Abstract】: The Web contains a large amount of information in the form of videos that remain inaccessible to visually impaired people. We identify a class of videos whose information content can be approximately encoded as audio, thereby increasing the amount of accessible videos. We propose a model to automatically identify such videos. Our model jointly relies on the textual metadata and visual content of the video. We use this model to re-rank YouTube video search results based on the accessibility of the video. We present preliminary results from a user study with visually impaired people measuring the effectiveness of our system.
【Keywords】: accessibility; videos
【Paper Link】 【Pages】:4212-4213
【Authors】: Wenting Tu ; David Wai-Lok Cheung ; Nikos Mamoulis
【Abstract】: A large-scale training corpus consisting of microblogs belonging to a desired category is important for high-accuracy microblog retrieval. Obtaining such a large-scale microblogging corpus manually is very time- and labor-consuming. Therefore, models for the automatic retrieval of microblogs from an external corpus have been proposed. However, these approaches may fail to consider microblog-specific features. To alleviate this issue, we propose a methodology that constructs a simulated microblogging corpus rather than directly building a model from the external corpus. The performance of our model is better since microblog-specific knowledge of the microblogging corpus is ultimately used by the retrieval model. Experimental results on real-world microblogs demonstrate the superiority of our technique compared to previous approaches.
【Keywords】: text classification; microblogging platform
【Paper Link】 【Pages】:4214-4215
【Authors】: Wenting Tu ; David Wai-Lok Cheung ; Nikos Mamoulis
【Abstract】: Users commonly use Web 2.0 platforms to post their opinions and their predictions about future events (e.g., the movement of a stock). Therefore, opinion mining can be used as a tool for predicting future events. Previous work on opinion mining extracts from the text only the polarity of opinions as sentiment indicators. We observe that a typical opinion post also contains temporal references which can improve prediction. This short paper presents our preliminary work on extracting reference time tags and integrating them into an opinion mining model, in order to improve the accuracy of future event prediction. We conduct an experimental evaluation using a collection of microblogs posted by investors to demonstrate the effectiveness of our approach.
【Keywords】: time-sensitive; opinion mining; prediction
【Paper Link】 【Pages】:4216-4217
【Authors】: Gabriele Valentini ; Heiko Hamann ; Marco Dorigo
【Abstract】: We study a self-organized collective decision-making strategy to solve a site-selection problem using a swarm of simple robots. Robots can only move forward or turn in place; sense the intensity of the ambient light; and exchange 3-byte messages with peers in a limited range. The goal of the swarm is to collectively decide which of the sites available in the environment is the best candidate site. We define a distributed and iterative decision-making strategy: robots explore the available options, determine the options' qualities, decide autonomously which option to take, and communicate their decision to neighboring robots. We study the effectiveness and robustness of the proposed strategy using a swarm of 100 Kilobots, focusing on the impact of the neighborhood size on the dynamics of the system.
【Keywords】: robot swarm; Kilobot; collective decision-making; majority rule
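The decision loop described above can be approximated in simulation. In the hedged sketch below, an agent advertising a better-quality option is heard proportionally more often, and each agent adopts the majority opinion among the messages it hears plus its own; the qualities, neighborhood size, and update rule are invented stand-ins for the Kilobot controller.

    # Hedged simulation sketch of quality-modulated majority rule: better
    # options are broadcast for longer, so heard messages are weighted by
    # the speaker's option quality. Parameters are invented.
    import random
    from collections import Counter

    QUALITY = {"A": 0.9, "B": 0.5}   # site qualities; A is the better site

    def step(opinions, k=5):
        weights = [QUALITY[op] for op in opinions]
        new = []
        for op in opinions:
            heard = random.choices(opinions, weights=weights, k=k)
            votes = Counter(heard)
            votes[op] += 1                   # include the agent's own opinion
            new.append(votes.most_common(1)[0][0])
        return new

    random.seed(0)
    swarm = ["A"] * 50 + ["B"] * 50
    for _ in range(200):
        swarm = step(swarm)
    print(Counter(swarm))   # typically converges to consensus on site A

Varying k in this toy model is the analogue of the neighborhood-size study mentioned in the abstract: larger neighborhoods speed up consensus but amplify early fluctuations.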
【Paper Link】 【Pages】:4218-4219
【Authors】: Yi Wang ; Joohyung Lee
【Abstract】: We present a probabilistic extension of logic programs under the stable model semantics, inspired by the concept of Markov Logic Networks. The proposed language takes advantage of both formalisms in a single framework, allowing us to represent commonsense reasoning problems that require both logical and probabilistic reasoning in an intuitive and elaboration tolerant way.
【Keywords】: answer set programming; stable model semantics; markov logic network
【Paper Link】 【Pages】:4220-4221
【Authors】: Heting Wu ; Hailong Sun ; Yili Fang ; Kefan Hu ; Yongqing Xie ; Yangqiu Song ; Xudong Liu
【Abstract】: In e-commerce systems, customer reviews are important information for understanding market feedback on certain commodities. However, accurately analyzing reviews is challenging due to the complexity of natural language processing and the informal descriptions in reviews. Existing methods mainly focus on efficient algorithms, which cannot guarantee accurate review analysis. Crowdsourcing can improve the accuracy of review analysis, but it is subject to extra costs and slow response times. In this work, we combine machine learning and crowdsourcing for a better understanding of customer reviews. First, we collectively use multiple machine learning algorithms to pre-process review classification. Second, we select the reviews on which the machine learning algorithms do not all agree and assign them to humans to process. Third, the results from machine learning and crowdsourcing are aggregated into the final analysis results. Finally, we perform experiments with real review data to confirm the effectiveness of our method.
【Keywords】:
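The routing rule in the abstract above, accept unanimous machine predictions and escalate disagreements to the crowd, is easy to sketch. Both the classifiers and the crowd below are stubs invented for illustration.

    # Hedged sketch of the machine-crowd pipeline: unanimous classifier
    # predictions are accepted; disagreements go to crowd workers and are
    # resolved by majority vote. Models and workers are toy stubs.
    from collections import Counter

    def clf_keywords(text):
        return "pos" if "good" in text or "great" in text else "neg"

    def clf_length(text):
        return "pos" if len(text.split()) < 8 else "neg"

    def ask_crowd(text, n_workers=3):
        # Stub: pretend workers read the text; here they all answer "pos".
        return Counter(["pos"] * n_workers).most_common(1)[0][0]

    def label(text, classifiers=(clf_keywords, clf_length)):
        preds = {c(text) for c in classifiers}
        if len(preds) == 1:                 # all models agree: accept
            return preds.pop(), "machine"
        return ask_crowd(text), "crowd"     # disagreement: escalate

    print(label("great phone"))                                # machine
    print(label("good value but the battery life is a real letdown"))  # crowd

The design choice is the usual cost/accuracy trade: the crowd is consulted only on the (hopefully small) disagreement set, bounding crowdsourcing cost and latency.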
【Paper Link】 【Pages】:4222-4223
【Authors】: Kun Xu ; Sheng Zhang ; Yansong Feng ; Songfang Huang ; Dongyan Zhao
【Abstract】: Answering natural language questions against structured knowledge bases (KBs) has been attracting increasing attention in both the IR and NLP communities. The task involves two main challenges: recognizing the question's meaning, and grounding it to a given KB. Targeting simple factoid questions, many existing open-domain semantic parsers jointly solve these two subtasks, but are usually expensive in complexity and resources. In this paper, we propose a simple pipeline framework to efficiently answer more complicated questions, especially those implying aggregation operations, e.g., argmax, argmin. We first develop a transition-based parsing model to recognize the KB-independent meaning representation of the user's intention inherent in the question. Secondly, we apply a probabilistic model to map the meaning representation, including aggregation functions, to a structured query. The experimental results show that our method better understands aggregation questions, outperforming the state-of-the-art methods on the Free917 dataset while still maintaining promising performance on a more challenging dataset, WebQuestions, without extra training.
【Keywords】:
【Paper Link】 【Pages】:4224-4225
【Authors】: Yaowei Yan ; Chris E. Gutierrez ; Jeriah Jn-Charles ; Forrest Sheng Bao ; Yuanlin Zhang
【Abstract】: Boolean SATisfiability (SAT) is an important problem in AI. SAT solvers have been effectively used in important industrial applications including automated planning and verification. In this paper, we present novel algorithms for fast SAT solving by employing two common subclause elimination (CSE) approaches. Our motivation is that modern SAT solving techniques can be more efficient on CSE-processed instances. Empirical study shows that CSE can significantly speed up SAT solving.
【Keywords】:
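The abstract above does not spell out its two CSE approaches, but the general idea of factoring a common subclause can be sketched as follows: find a literal pair shared by several clauses, introduce a fresh definition variable, and shrink the sharing clauses. This particular factoring is illustrative and may differ from the paper's formulations.

    # Hedged sketch of common subclause elimination as preprocessing: pick
    # the most frequent shared literal pair (l1, l2), add a fresh variable
    # x with the clause (-x | l1 | l2), and replace the pair by x in every
    # clause containing it. Literals are DIMACS-style ints (-v negates v).
    from collections import Counter
    from itertools import combinations

    def eliminate_common_pair(clauses, n_vars):
        pair_counts = Counter()
        for c in clauses:
            for pair in combinations(sorted(c), 2):
                pair_counts[pair] += 1
        (l1, l2), count = pair_counts.most_common(1)[0]
        if count < 2:
            return clauses, n_vars          # nothing worth factoring
        x = n_vars + 1                      # fresh variable
        out = []
        for c in clauses:
            if l1 in c and l2 in c:
                out.append(sorted((set(c) - {l1, l2}) | {x}))
            else:
                out.append(sorted(c))
        out.append([-x, l1, l2])            # x implies (l1 or l2)
        return out, x

    cnf = [[1, 2, 3], [1, 2, -4], [1, 2, 5], [3, 4]]
    print(eliminate_common_pair(cnf, n_vars=5))
    # ([[3, 6], [-4, 6], [5, 6], [3, 4], [-6, 1, 2]], 6)

The transformation is equisatisfiable: resolving each shortened clause with the definition clause recovers the original, and any model of the original extends by setting x to the value of (l1 or l2).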
【Paper Link】 【Pages】:4226-4227
【Authors】: Ruohan Zhang ; Zhao Song ; Dana H. Ballard
【Abstract】: We propose a modular reinforcement learning algorithm which decomposes a Markov decision process into independent modules. Each module is trained using Sarsa(λ). We introduce three algorithms for forming a global policy from the module policies, and demonstrate our results using a 2D grid world.
【Keywords】: Modular reinforcement learning
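A hedged sketch of the setup: each module runs tabular Sarsa(λ) on its own reward signal, and a global policy is formed by arbitration over module Q-values. Summing Q-values ("greatest mass") is one common combination rule, used here only as a placeholder; the paper's three combination algorithms are not reproduced.

    # Hedged sketch of modular RL: per-module tabular Sarsa(lambda) with
    # accumulating eligibility traces, plus a placeholder global policy
    # that sums module Q-values ("greatest mass" arbitration).
    import random
    from collections import defaultdict

    class SarsaLambdaModule:
        def __init__(self, actions, alpha=0.1, gamma=0.95, lam=0.9):
            self.Q = defaultdict(float)   # (state, action) -> value
            self.E = defaultdict(float)   # eligibility traces
            self.actions = actions
            self.alpha, self.gamma, self.lam = alpha, gamma, lam

        def update(self, s, a, r, s2, a2):
            delta = r + self.gamma * self.Q[(s2, a2)] - self.Q[(s, a)]
            self.E[(s, a)] += 1.0         # accumulating trace
            for key in list(self.E):
                self.Q[key] += self.alpha * delta * self.E[key]
                self.E[key] *= self.gamma * self.lam

    def global_action(modules, s, eps=0.1):
        acts = modules[0].actions
        if random.random() < eps:
            return random.choice(acts)
        return max(acts, key=lambda a: sum(m.Q[(s, a)] for m in modules))

    mods = [SarsaLambdaModule(["left", "right"]) for _ in range(2)]
    a = global_action(mods, s=(0, 0))
    for m, r in zip(mods, (1.0, -0.5)):   # each module has its own reward
        m.update((0, 0), a, r, (0, 1), a)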
【Paper Link】 【Pages】:4228-4230
【Authors】: Tian Zhou ; Maria Eugenia Cabrera ; Juan Pablo Wachs
【Abstract】: This paper presents a comprehensive evaluation among touchless, vision-based hand tracking interfaces (Kinect and Leap Motion) and the feasibility of their adoption into the surgical theater compared to traditional interfaces.
【Keywords】: Human Computer Interaction, Robotics
【Paper Link】 【Pages】:4231-4232
【Authors】: Maria Barrett ; Anders Søgaard
【Abstract】: This PhD project aims at a quantitative description of reading patterns from eye movements when reading tweets and the development of an eye movement relevance model.
【Keywords】: eye tracking; reading; microblogs
【Paper Link】 【Pages】:4233-
【Authors】: Ferdinando Fioretto
【Abstract】: In the proposed thesis, we study Distributed Constraint Optimization Problems (DCOPs), problems in which several agents coordinate with each other to optimize a global cost function. The use of DCOPs has gained momentum due to their ability to address complex, naturally distributed problems. A majority of the work in DCOP addresses the resolution problem by detaching the model from the resolution process, assuming that each agent controls exclusively one variable of the problem (Burke et al. 2006). This assumption often is not reflected in the model specification and may lead to inefficient communication requirements. Another limitation of current DCOP resolution methods is their inability to capitalize on the presence of structural information, which may allow incoherent/unnecessary data to reticulate among the agents (Yokoo 2001). The purpose of the proposed dissertation is to study how to adapt and integrate insights gained from centralized solving techniques in order to enhance DCOP performance and scalability, enabling their use for the resolution of real-world complex problems. To do so, we hypothesize that one can exploit the DCOP structure in both the problem modeling and the problem resolution phases.
【Keywords】: DCOP; CP; Smart Grids
【Paper Link】 【Pages】:4235-4236
【Authors】: Zack Fitzsimmons
【Abstract】: For the computational study of attacks on elections to give us insight into the hardness of such attacks in practice, we must properly model both the attacks and the preferences of the electorate. Theoretical and empirical analysis are equally important methods for understanding election attacks. I discuss my recent work on domain restrictions on partial preferences and on new election attacks. I propose further study into modeling realistic election attacks and the advancement of the current state of empirical analysis of their hardness by using more advanced statistical techniques.
【Keywords】: elections; manipulation; control; domain restrictions
【Paper Link】 【Pages】:4237-4238
【Authors】: Bradley Hayes
【Abstract】: My dissertation research focuses on the application of hierarchical learning and heuristics based on social signals to solve challenges inherent to enabling human-robot collaboration. I approach this problem through advancing the state of the art in building hierarchical task representations, multi-agent task-level planning, and learning assistive behaviors from demonstration.
【Keywords】:
【Paper Link】 【Pages】:4239-4240
【Authors】: Charmgil Hong ; Milos Hauskrecht
【Abstract】: This paper overviews the background, goals, past achievements and future directions of our research, which aims to build a multivariate conditional anomaly detection framework for clinical applications.
【Keywords】: Multivariate conditional anomaly detection; Multi-dimensional models
【Paper Link】 【Pages】:4241-4242
【Authors】: Ping Hou
【Abstract】: While probabilistic planning models have been extensively used by the AI and Decision Theoretic communities for planning under uncertainty, the objective of minimizing the expected cumulative cost is inappropriate for high-stakes planning problems. With this motivation in mind, we revisit the Risk-Sensitive criterion (RS-criterion), where the objective is to find a policy that maximizes the probability that the cumulative cost is within some user-defined cost threshold. The overall scope of this research is to develop efficient and scalable algorithms to optimize the RS-criterion in probabilistic planning problems. In our recent paper (Hou, Yeoh, and Varakantham 2014), we formally defined Risk-Sensitive MDPs (RS-MDPs) and introduced new algorithms for RS-MDPs with non-negative costs. Next, my plan is to develop algorithms for RS-MDPs with negative cost cycles and for Risk-Sensitive POMDPs (RS-POMDPs).
【Keywords】: Markov Decision Process; Utility Theory; Partially Observable Markov Decision Process
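The RS-criterion mentioned above has a crisp statement: maximize P(cumulative cost <= theta) for a user-given threshold theta. For non-negative integer costs it can be evaluated by dynamic programming over an augmented state (state, remaining budget), as in the standard augmentation sketched below; this illustrates the criterion itself, not necessarily the authors' algorithms.

    # Hedged sketch of the RS-criterion via the standard state augmentation:
    # value(s, budget) = max probability of reaching the goal with total
    # cost <= budget. Assumes non-negative integer costs and a toy MDP.
    from functools import lru_cache

    ACTIONS = {"s0": {"risky": [(0.8, "goal", 1), (0.2, "s0", 1)],
                      "safe":  [(1.0, "goal", 3)]}}

    @lru_cache(maxsize=None)
    def prob_within(state, budget):
        """Max probability of reaching 'goal' with cumulative cost <= budget."""
        if state == "goal":
            return 1.0
        if budget <= 0:
            return 0.0
        best = 0.0
        for outcomes in ACTIONS[state].values():
            p = sum(pr * prob_within(nxt, budget - cost)
                    for pr, nxt, cost in outcomes
                    if budget - cost >= 0)
            best = max(best, p)
        return best

    print(prob_within("s0", 2))  # risky twice: 0.8 + 0.2*0.8 = 0.96
    print(prob_within("s0", 3))  # the safe action now guarantees 1.0

The example also shows why the criterion differs from expected cost: which action is optimal flips with the remaining budget, so the policy must depend on the augmented state.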
【Paper Link】 【Pages】:4243-4244
【Authors】: Mayank Kejriwal
【Abstract】: Entity Resolution (ER) concerns identifying logically equivalent pairs of entities that may be syntactically disparate. Although ER is a long-standing problem in the artificial intelligence community, the growth of Linked Open Data, a collection of semi-structured datasets published and inter-connected on the Web, mandates a new approach. The thesis is that building a viable Entity Resolution solution for serving Big Data needs requires simultaneously resolving challenges of automation, heterogeneity, scalability and domain independence. The dissertation aims to build such a system and evaluate it on real-world datasets published already as Linked Open Data.
【Keywords】: Data Matching; Entity Resolution; Big Data; Linked Data; Semantic Web
【Paper Link】 【Pages】:4245-4246
【Authors】: Scott Kiesel
【Abstract】: For my dissertation I am focusing on non-classical planning for robotic applications. Much classical planning research relies on assumptions that do not hold in real-world robotics applications. In many cases the entire world state is not known in advance, and the events that occur in the future cannot be known with certainty. Robots operating in the real world also need to be responsive and react to dynamic obstacles and events.
【Keywords】:
【Paper Link】 【Pages】:4247-4248
【Authors】: Wei Kuang ; Laura E. Brown ; Zhenlin Wang
【Abstract】: Today’s data centers are built from multi-core CPUs, where multiple virtual machines (VMs) can be co-located on one physical machine, or multiple computing tasks can be distributed onto one physical machine. The result is co-tenancy, resource sharing and competition. Modeling and predicting such co-run interference becomes crucial for job scheduling and Quality of Service assurance. Co-locating interference can be characterized by two components, sensitivity and pressure, where sensitivity characterizes how an application’s own performance is affected by a co-running application, and pressure characterizes how much contentiousness an application exerts on the memory subsystem. Previous studies show that, with simple models, sensitivity and pressure can be accurately characterized for a single machine. We extend the models to consider cross-architecture sensitivity (across different machines).
【Keywords】: Transfer Learning; Regression Modeling
【Paper Link】 【Pages】:4249-4250
【Authors】: Boon-Ping Lim
【Abstract】: My research focuses on developing innovative ways to control Heating, Ventilation, and Air Conditioning (HVAC) and to schedule occupancy flows in smart buildings to reduce our ecological footprint (and energy bills). We look at the potential for integrating building operations with room booking and meeting scheduling. Specifically, we improve on the effectiveness of energy-aware room-booking and occupancy scheduling approaches by allowing the scheduling decisions to rely on an explicit model of the building's occupancy-based HVAC control. From a computational standpoint, this is a challenging topic, as HVAC models are inherently non-linear and non-convex, and occupancy scheduling models additionally introduce discrete variables capturing the time slot and location at which each activity is scheduled. The mechanism needs to trade off minimizing energy cost against occupancy thermal comfort and control feasibility in a highly dynamic and uncertain system.
【Keywords】: Smart buildings; Occupancy scheduling; Mixed integer programming; Large neighborhood search; HVAC control; Planning and scheduling;
【Paper Link】 【Pages】:4251-4252
【Authors】: Carrie Rebhuhn
【Abstract】: In a heterogeneous multiagent system it can be useful to have knowledge about the different types of agents in the system. Agent modeling develops agent models based on interactions between agents, then predicts agent actions. This approach is effective in small domains but does not scale well. We develop an approach where an agent can learn using an abstract model identification or stereotype rather than an explicit and unique model for each agent. We associate each agent with a stereotype and learn a policy incorporating this knowledge. The benefits of this approach are that it is simple, scalable, and degrades gracefully with misidentification.
【Keywords】: agent modeling; multiagent systems; stereotypes
【Paper Link】 【Pages】:4253-4254
【Authors】: Claudia Schulz
【Abstract】: Argumentation Theory and Answer Set Programming (ASP) are two prominent theories in the field of knowledge representation and non-monotonic reasoning, where Argumentation Theory stands for a variety of approaches following similar ideas. The main difference between Argumentation Theory and ASP is that the former focuses on representing knowledge and reasoning about it in a way that resembles human reasoning, neglecting the efficiency of the reasoning procedure, whereas the latter is concerned with the efficient computation of solutions to a reasoning problem, resulting in a less human-understandable process. In recent years, ASP has frequently been applied to the computation of reasoning problems represented in argumentation-theoretic terms and has been found to be an efficient method for determining solutions to problems in Argumentation Theory. My research is concerned with the opposite direction, i.e., with applying Argumentation Theory to ASP in order to explain the solutions to an ASP reasoning problem in a more human-understandable way. Developing such an explanation method also involves investigating the exact relationship between different approaches in Argumentation Theory, in order to find the most suitable one for explanations, and their connection with ASP, in particular with respect to their semantics.
【Keywords】: Argumentation; Answer Set Programming; Explanation
【Paper Link】 【Pages】:4255-4256
【Authors】: Guni Sharon
【Abstract】: The multi-agent path finding (MAPF) problem is a generalization of the single-agent path finding problem to k > 1 agents. It consists of a graph and a number of agents. For each agent, a unique start state and a unique goal state are given, and the task is to find paths for all agents from their start states to their goal states, under the constraint that agents cannot collide during their movements. In many cases there is an additional goal of minimizing a cumulative cost function such as the sum of the time steps required for every agent to reach its goal. The goal of my research is to provide new methods for solving MAPF optimally, along with theoretical understanding that helps choose the best solver for a given problem instance.
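The constraints are easy to state in code. Below is an illustrative checker (helper names invented, not Sharon's solver): it computes the sum-of-costs objective and rejects plans with vertex or swap conflicts.

def sum_of_costs(paths):
    # Each path is a list of vertices; cost = time steps to reach the goal.
    return sum(len(p) - 1 for p in paths)

def conflict_free(paths):
    horizon = max(len(p) for p in paths)
    # Agents wait at their goals after arriving.
    pos = [p + [p[-1]] * (horizon - len(p)) for p in paths]
    for t in range(horizon):
        at_t = [p[t] for p in pos]
        if len(set(at_t)) < len(at_t):            # vertex conflict
            return False
        for i in range(len(pos)):                 # swap (edge) conflict
            for j in range(i + 1, len(pos)):
                if t > 0 and pos[i][t] == pos[j][t - 1] \
                         and pos[j][t] == pos[i][t - 1]:
                    return False
    return True

paths = [["A", "B", "C"], ["D", "B", "E"]]        # both occupy B at t = 1
print(conflict_free(paths), sum_of_costs(paths))  # False 4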
【Keywords】: Multi-Agent Pathfinding; Multi-Robot Planning
【Paper Link】 【Pages】:4257-4258
【Authors】: Leandro Soriano Marcolino
【Abstract】: It is known that we can aggregate the opinions of different agents to find high-quality solutions to complex problems. However, choosing agents to form a team is still a great challenge. Moreover, it is essential to use a good aggregation methodology in order to unleash the potential of a given team in solving complex problems. In my thesis, I present two different novel models to aid in the team formation process. Moreover, I propose a new methodology for extracting rankings from existing agents. I show experimental results both in the Computer Go domain and in the building design domain.
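As a toy illustration of the aggregation step (simple plurality voting over agents' suggested actions; the thesis develops richer team-formation models and a ranking-extraction method):

from collections import Counter

def aggregate(opinions):
    # Return the action suggested by the most agents.
    return Counter(opinions).most_common(1)[0][0]

print(aggregate(["moveA", "moveB", "moveA", "moveC"]))  # moveA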
【Keywords】: Multi-agent systems; Distributed problem solving
【Paper Link】 【Pages】:4259-4260
【Authors】: Deepak Venugopal
【Abstract】: Markov logic networks (MLNs) combine the power of first-order logic and probabilistic graphical models, and as a result are ideally suited for solving large, complex problems in application domains that have both rich relational structure and a large amount of uncertainty. However, inference in these rich relational representations is quite challenging. The aim of this thesis is to advance the state of the art in MLN inference, enabling it to solve much harder and more complex tasks than is possible today. To this end, I will develop techniques that exploit logical structures and symmetries that are either explicitly or implicitly encoded in the MLN representation, and demonstrate their usefulness by using them to solve hard real-world problems in natural language understanding.
【Keywords】:
【Paper Link】 【Pages】:4261-4263
【Authors】: Andrew J. Wang
【Abstract】: The paper is an extended abstract for the doctoral consortium.
【Keywords】:
【Paper Link】 【Pages】:4264-4265
【Authors】: Pascal Bercher ; Felix Richter ; Thilo Hörnle ; Thomas Geier ; Daniel Höller ; Gregor Behnke ; Florian Nothdurft ; Frank Honold ; Wolfgang Minker ; Michael Weber ; Susanne Biundo
【Abstract】: Modern technical devices are often too complex for many users to use to their full extent. Based on planning technology, we are able to provide advanced user assistance for operating technical devices. We present a system that assists a human user in setting up a complex home theater consisting of several HiFi devices. For a human user, the task is rather challenging due to the large number of different ports on the devices and the variety of available cables. The system supports the user by giving detailed instructions on how to assemble the theater. Its performance is based on advanced user-centered planning capabilities, including the generation, repair, and explanation of plans.
【Keywords】: user-centered planning, user assistance, plan generation, plan repair, plan explanation
【Paper Link】 【Pages】:4266-4267
【Authors】: Abhijit Bhole ; Raghavendra Udupa
【Abstract】: We consider the problem of providing spelling corrections for misspelled queries in Email Search using the user's own mail data. A popular strategy for general query spelling correction is to generate corrections from query logs. However, this strategy is not effective in Email Search for two reasons: 1) the query log of any single user is typically not rich enough to provide potential corrections for a new query; 2) corrections generated using the query logs of other users are not particularly useful, since the mail data as well as the search intent are highly specific to the user. We address the challenge of designing an effective spelling correction algorithm for Email Search in the absence of query logs. We propose SpEQ, a machine learning based approach that generates corrections for misspelled queries directly from the user's own mail data.
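A hedged sketch of the general idea (not the SpEQ model; the vocabulary, frequencies, and scoring formula are invented): candidates come from the user's own mail vocabulary and are ranked by edit similarity weighted by how often the term occurs in that mailbox.

import difflib

mail_vocab = {"meeting": 42, "minutes": 17, "marketing": 9}  # term -> freq

def corrections(term, vocab, k=3):
    scored = []
    for cand, freq in vocab.items():
        sim = difflib.SequenceMatcher(None, term, cand).ratio()
        scored.append((sim * (1 + freq) ** 0.1, cand))  # toy ranking score
    return [c for _, c in sorted(scored, reverse=True)[:k]]

print(corrections("meetng", mail_vocab))  # 'meeting' should rank first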
【Keywords】: email, search, spelling correction, statistical ranking, machine learning, learning to rank
【Paper Link】 【Pages】:4268-4269
【Authors】: Alain Biem ; Maria Butrico ; Mark Feblowitz ; Tim Klinger ; Yuri Malitsky ; Kenney Ng ; Adam Perer ; Chandra Reddy ; Anton Riabov ; Horst Samulowitz ; Daby M. Sow ; Gerald Tesauro ; Deepak S. Turaga
【Abstract】: A Data Scientist typically performs a number of tedious and time-consuming steps to derive insight from a raw data set. The process usually starts with data ingestion, cleaning, and transformation (e.g. outlier removal, missing value imputation), then proceeds to model building, and finally a presentation of predictions that align with the end-users' objectives and preferences. It is a long, complex, and sometimes artful process requiring substantial time and effort, especially because of the combinatorial explosion in choices of algorithms (and platforms), their parameters, and their compositions. Tools that can help automate steps in this process have the potential to accelerate the time-to-delivery of useful results, expand the reach of data science to non-experts, and offer a more systematic exploration of the available options. This work presents a step towards this goal.
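The kind of pipeline being automated can be expressed as one composable object whose stages and parameters a system could search over; a minimal scikit-learn sketch (the dataset and stage choices are illustrative only):

from sklearn.datasets import load_breast_cancer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
pipe = Pipeline([
    ("impute", SimpleImputer(strategy="median")),  # missing-value handling
    ("scale", StandardScaler()),                   # transformation
    ("model", LogisticRegression(max_iter=1000)),  # model building
])
print(cross_val_score(pipe, X, y, cv=5).mean())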
【Keywords】: Data Science; Automation; Reasoning Under Uncertainty; Visualization; NLP; Text Analytics
【Paper Link】 【Pages】:4270-4271
【Authors】: Noam Brown ; Sam Ganzfried ; Tuomas Sandholm
【Abstract】: The leading approach for solving large imperfect-information games is automated abstraction followed by running an equilibrium-finding algorithm. We introduce a distributed version of the most commonly used equilibrium-finding algorithm, counterfactual regret minimization (CFR), which enables CFR to scale to dramatically larger abstractions and numbers of cores. The new algorithm begets constraints on the abstraction so as to make the pieces running on different computers disjoint. We introduce an algorithm for generating such abstractions while capitalizing on state-of-the-art abstraction ideas such as imperfect recall and the earth-mover's-distance similarity metric. Our techniques enabled an equilibrium computation of unprecedented size on a supercomputer with a high inter-blade memory latency. Prior approaches run slowly on this architecture. Our approach also leads to a significant improvement over using the prior best approach on a large shared-memory server with low memory latency. Finally, we introduce a family of post-processing techniques that outperform prior ones. We applied these techniques to generate an agent for two-player no-limit Texas Hold'em. It won the 2014 Annual Computer Poker Competition, beating each opponent with statistical significance.
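At the heart of CFR is regret matching at each information set; a toy sketch of that core update (not the authors' distributed, abstraction-aware variant):

import numpy as np

def regret_matching(cum_regret):
    # Play each action in proportion to its positive cumulative regret.
    pos = np.maximum(cum_regret, 0.0)
    total = pos.sum()
    if total > 0:
        return pos / total
    return np.full(len(cum_regret), 1.0 / len(cum_regret))  # uniform fallback

print(regret_matching(np.array([3.0, -1.0, 1.0])))  # [0.75 0.   0.25]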
【Keywords】: game theory; distributed AI; game solving; imperfect information
【Paper Link】 【Pages】:4272-4273
【Authors】: Jun Chen ; Chaokun Wang ; Yiyuan Bai
【Abstract】: Large-scale distributed computing has made available the resources necessary to solve "AI-hard" problems. As a result, it becomes feasible to automate the processing of such problems, but accuracy is not very high due to their conceptual difficulty. In this paper, we integrate crowdsourcing with MapReduce to provide a scalable, innovative human-machine solution to AI-hard problems, called CrowdMR. In CrowdMR, the majority of problem instances are automatically processed by machine, while the troublesome instances are redirected to humans via crowdsourcing. The results returned from crowdsourcing are validated in the form of CAPTCHAs (Completely Automated Public Turing test to tell Computers and Humans Apart) before being added to the output. An incremental scheduling method is put forward to combine the results from machine and human in a "pay-as-you-go" way.
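The machine-or-human routing can be summarized in a few lines (a sketch with invented names, not CrowdMR's implementation): confidently classified instances stay on the machine path, and the rest go to the crowd.

CONFIDENCE_THRESHOLD = 0.9  # assumed tunable parameter

def route(instance, classify, ask_crowd):
    label, confidence = classify(instance)
    if confidence >= CONFIDENCE_THRESHOLD:
        return label               # machine path
    return ask_crowd(instance)     # human path (answers validated upstream)

print(route("hard case", lambda x: ("cat", 0.6), lambda x: "crowd: dog"))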
【Keywords】: CrowdMR; AI-hard Problems; Crowdsourcing; MapReduce
【Paper Link】 【Pages】:4274-4275
【Authors】: Sheng Gao ; Dai Zhang ; Honggang Zhang ; Chao Huang ; Yongsheng Zhang ; Jianxin Liao ; Jun Guo
【Abstract】: We propose VecLP, a novel Internet video recommendation system for live TV programs. Given little information on the live TV programs, our proposed VecLP system can effectively collect the necessary information on both the programs and the subscribers, as well as a large volume of related online videos, and then recommend relevant Internet videos to the subscribers. To that end, key frames are first detected from the live TV programs, and visual and textual features are extracted from these frames to enhance the understanding of the TV broadcasts. Furthermore, by utilizing the subscribers' profiles and their social relationships, a user preference model is constructed, which greatly improves the diversity of the recommendations in our system. The subscriber's browsing history is also recorded and used to make further personalized recommendations. Finally, we deploy several new recommendation strategies in the system to meet the special needs of diverse live TV programs and discuss how to fuse these strategies.
【Keywords】:
【Paper Link】 【Pages】:4276-4277
【Authors】: Maxime Guériau ; Romain Billot ; Nour-Eddin El Faouzi ; Salima Hassas ; Frédéric Armetta
【Abstract】: Cooperative Intelligent Transportation Systems (C-ITS) are complex systems well-suited to multi-agent modeling. We propose a multi-agent based model of a C-ITS that couples three dynamics (physical, informational, and control) in order to ensure smooth cooperation between non-cooperative and cooperative vehicles, which communicate with each other (V2V communication) and with the infrastructure (I2V and V2I communication). We present our multi-agent model, tested through simulations using real traffic data and integrated into our extension of the Multi-model Open-source Vehicular-traffic SIMulator (MovSim).
【Keywords】:
【Paper Link】 【Pages】:4278-4279
【Authors】: Raghu Krishnapuram ; Luis A. Lastras ; Satya Nitta
【Abstract】: The “Cognitive Master Teacher” is the result of discussions with teachers, members of educational institutions, government bodies, and other thought leaders in the United States who have helped us shape its requirements. It is conceived as a cloud-based, mobile-accessible personal agent that is readily available for teachers to use at any time, assisting them with issues related to day-to-day teaching activities as well as professional development.
【Keywords】: cognitive computing, information retrieval, question answering, education applications
【Paper Link】 【Pages】:4280-4281
【Authors】: Henry Lieberman ; Joe Henke
【Abstract】: Graphical visualization has demonstrated enormous power in helping people to understand complexity in many branches of science. But, curiously, AI has been slow to pick up on the power of visualization. Alar is a visualization system intended to help people understand and control symbolic inference. Alar presents dynamically controllable node-and-arc graphs of concepts, and of assertions both supplied to the system and inferred. Alar is useful in quality assurance of knowledge bases (finding false, vague, or misleading statements; or missing assertions). It is also useful in tuning parameters of inference, especially how “liberal vs. conservative” the inference is (trading off the desire to maximize the power of inference versus the risk of making incorrect inferences). We present a typical scenario of using Alar to debug a knowledge base.
【Keywords】: Visualization; Inference
【Paper Link】 【Pages】:4282-4283
【Authors】: Tobias Linnenberg ; Alexander Fay ; Michael Kaisers
【Abstract】: We present an open-source low-budget hardware and software prototype of a smart plug, and the principles behind its capability to align power demand with a reference signal, e.g. from local renewable energy generation. We envision its use in conjunction with a platform that combines social-media and gamification elements with energy networks.
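The alignment principle reduces to a simple rule; a minimal sketch (invented names and units, not the prototype's firmware): switch a deferrable load on only while the reference signal leaves enough surplus to cover it.

def plug_on(reference_watts, demand_watts, load_watts):
    # True iff the current surplus from the reference signal covers the load.
    return reference_watts - demand_watts >= load_watts

print(plug_on(1200, 800, 300))  # True: a 400 W surplus covers a 300 W load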
【Keywords】: Social Energy Networking; Demand Response; Smart Plug; Local Energy Generation
【Paper Link】 【Pages】:4284-4285
【Authors】: Jaimie Murdock ; Colin Allen
【Abstract】: Topic models remain a black box both for modelers and for end users in many respects. From the modelers' perspective, many decisions must be made which lack clear rationales and whose interactions are unclear — for example, how many topics the algorithms should find (K), which words to ignore (aka the "stop list"), and whether it is adequate to run the modeling process once or multiple times, producing different results due to the algorithms that approximate the Bayesian priors. Furthermore, the results of different parameter settings are hard to analyze, summarize, and visualize, making model comparison difficult. From the end users' perspective, it is hard to understand why the models perform as they do, and information-theoretic similarity measures do not fully align with humanistic interpretation of the topics. We present the Topic Explorer, which advances the state-of-the-art in topic model visualization for document-document and topic-document relations. It brings topic models to life in a way that fosters deep understanding of both corpus and models, allowing users to generate interpretive hypotheses and to suggest further experiments. Such tools are an essential step toward assessing whether topic modeling is a suitable technique for AI and cognitive modeling applications.
【Keywords】: visualization; topic modeling; nlp; applications
【Paper Link】 【Pages】:4286-4287
【Authors】: Tam Van Nguyen
【Abstract】: Salient object detection has gradually become a popular topic in robotics and computer vision research. This paper presents a real-time system that detects salient objects by integrating objectness, foreground, and compactness measures. Our algorithm consists of four basic steps. First, our method generates the objectness map via object proposals. Based on the objectness map, we estimate the background margin and compute the corresponding foreground map, which prefers the foreground objects. From the objectness map and the foreground map, the compactness map is formed to favor compact objects. We then integrate these cues to form a pixel-accurate saliency map which covers the salient objects and consistently separates foreground from background.
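The final integration step might look like the following sketch (the fusion rule here, a product of normalized maps, is illustrative, not the paper's exact formula):

import numpy as np

def normalize(m):
    m = m.astype(float)
    span = m.max() - m.min()
    return (m - m.min()) / span if span > 0 else np.zeros_like(m)

def fuse(objectness, foreground, compactness):
    # Pixels scoring high on all three cues dominate the saliency map.
    return normalize(normalize(objectness) *
                     normalize(foreground) *
                     normalize(compactness))

maps = [np.random.rand(4, 4) for _ in range(3)]  # stand-in cue maps
print(fuse(*maps).shape)  # (4, 4) pixel-accurate saliency map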
【Keywords】:
【Paper Link】 【Pages】:4288-4289
【Authors】: Yamuna Prasad ; Kanad K. Biswas
【Abstract】: In this paper we propose a wrapper-based PSO method for gene selection in microarray datasets, where we gradually refine the feature (gene) space from a very coarse level to a fine-grained one by reducing the gene set at each step of the algorithm. We use the linear support vector machine weight vector to serve as the initial gene pool selection. In addition, we also examine integration of other filter-based ranking methods with our proposed approach. Experiments on publicly available datasets, Colon, Leukemia and T2D, show that our approach selects only a very small subset of genes while yielding substantial improvements in accuracy over state-of-the-art evolutionary methods.
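The initial gene-pool step can be sketched as follows (synthetic data standing in for a microarray matrix; not the paper's full wrapper loop): rank genes by the magnitude of a linear SVM's weight vector.

import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = rng.standard_normal((60, 500))           # 60 samples x 500 genes
y = (X[:, 3] + X[:, 7] > 0).astype(int)      # labels driven by genes 3 and 7

svm = LinearSVC(C=1.0, max_iter=10000).fit(X, y)
ranking = np.argsort(-np.abs(svm.coef_[0]))  # most informative genes first
print(ranking[:5])                           # genes 3 and 7 should appear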
【Keywords】: gene selection; PSO; linear svm weight vector
【Paper Link】 【Pages】:4290-4291
【Authors】: Hanumant Harichandra Redkar ; Sudha Baban Bhingardive ; Diptesh Kanojia ; Pushpak Bhattacharyya
【Abstract】: WordNet is an online lexical resource which expresses unique concepts in a language. The English WordNet, developed at Princeton University, was the first. Over time, many language WordNets have been developed by organizations all over the world. It has always been a challenge to store WordNet data: some WordNets are stored using file systems and some using different database models. In this paper, we present the World WordNet Database Structure, which can be used to efficiently store the WordNet information of all languages of the world. This design can be adopted by most language WordNets to store information such as synset data, semantic and lexical relations, ontology details, language-specific features, linguistic information, etc. An attempt is made to develop Application Programming Interfaces to manipulate the data in these databases. This database structure can help in various Natural Language Processing applications like Multilingual Information Retrieval, Word Sense Disambiguation, Machine Translation, etc.
【Keywords】: WordNet; World WordNet Database Structure; WWDS; Database Schema; Database Design; Database Structure; WordNet Databases; WordNet Storage; Natural Language Processing; Multilingual Information Retrieval; Global WordNet Structure
【Paper Link】 【Pages】:4292-4293
【Authors】: Ryan Rossi ; Nesreen Ahmed
【Abstract】: NetworkRepository (NR) is the first interactive data repository with a web-based platform for visual interactive analytics. Unlike other data repositories (e.g., the UCI ML Data Repository and SNAP), the network data repository (networkrepository.com) allows users not only to download data, but to interactively analyze and visualize it using our web-based interactive graph analytics platform. Users can in real time analyze, visualize, compare, and explore data along many different dimensions. The aim of NR is to make it easy to discover key insights into the data extremely fast with little effort, while also providing a medium for users to share data, visualizations, and insights. Other key factors that differentiate NR from current data repositories are the number of graph datasets, their size, and their variety. Other data repositories are static and lack a means for users to collaboratively discuss a particular dataset, corrections, or challenges with using the data for certain applications. In contrast, NR incorporates many social and collaborative aspects that facilitate scientific research, e.g., users can discuss each graph and post observations and visualizations.
【Keywords】: visual interactive analytics; data repository; network repository; data archive; graph analytics; graph visualization
【Paper Link】 【Pages】:4294-4295
【Authors】: Vasile Rus ; Nobal B. Niraula ; Rajendra Banjade
【Abstract】: We present in this paper an innovative solution to the challenge of building effective educational technologies that offer tailored instruction to each individual learner. The proposed solution, a conversational intelligent tutoring system called DeepTutor, has been developed as a web application that is accessible 24/7 through a browser from any device connected to the Internet. The success of several large-scale experiments with high-school students using DeepTutor is solid proof that conversational intelligent tutoring at scale over the web is possible.
【Keywords】: dialogue, intelligent tutoring
【Paper Link】 【Pages】:4296-4297
【Authors】: Svitlana Volkova ; Yoram Bachrach ; Michael Armstrong ; Vijay Sharma
【Abstract】: We demonstrate an approach to predict latent personal attributes including user demographics, online personality, emotions and sentiments from texts published on Twitter. We rely on machine learning and natural language processing techniques to learn models from user communications. We first examine individual tweets to detect emotions and opinions emanating from them, and then analyze all the tweets published by a user to infer latent traits of that individual. We consider various user properties including age, gender, income, education, relationship status, optimism and life satisfaction. We focus on Ekman’s six emotions: anger, joy, surprise, fear, disgust and sadness. Our work can help social network users to understand how others may perceive them based on how they communicate in social media, in addition to its evident applications in online sales and marketing, targeted advertising, large scale polling and healthcare analytics.
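A minimal sketch of the general approach (toy data and a single attribute; not the authors' trained models): bag-of-words features from a user's tweets feed a linear classifier.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

tweets = ["so happy about the game", "feeling sad and tired",
          "what a joyful day", "this is awful and sad"]
labels = ["joy", "sadness", "joy", "sadness"]

vec = TfidfVectorizer()
clf = LogisticRegression().fit(vec.fit_transform(tweets), labels)
print(clf.predict(vec.transform(["such a happy joyful game"])))  # likely joy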
【Keywords】:
【Paper Link】 【Pages】:4298-4299
【Authors】: Stefan J. Witwicki ; Francesco Mondada
【Abstract】: This paper overviews our application of state-of-the-art automated planning algorithms to real mobile robots performing an autonomous construction task, a domain in which robots are prone to faults. We describe how embracing these faults leads to better representations and smarter planning, allowing robots with limited precision to avoid catastrophic failures and succeed in intricate constructions.
【Keywords】: Planning Under Uncertainty; Fault Tolerance; Autonomous Construction; Mobile Robots
【Paper Link】 【Pages】:4300-4302
【Authors】: Xinfeng Zhang ; Su Yang ; Yuan Yan Tang ; Weishan Zhang
【Abstract】: Crowd motion in surveillance videos is comparable to the heat motion of basic particles. Inspired by this, we introduce Boltzmann entropy to measure crowd motion in the optical flow field so as to detect abnormal collective behaviors. As a result, the collective crowd moving pattern can be represented as a time series. We found that when most people behave anomalously, the entropy value increases drastically. Thus, a threshold can be applied to the time series to identify abnormal crowd commotion in a simple and efficient manner without machine learning. The experimental results show promising performance compared with state-of-the-art methods. The system works in real time with high precision.
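The measurement pipeline is simple enough to sketch (a Shannon-style entropy over flow directions stands in for the paper's Boltzmann entropy; the flow arrays here are random placeholders):

import numpy as np

def flow_entropy(flow, bins=16):
    # flow: (H, W, 2) array of per-pixel (dx, dy) optical-flow vectors.
    angles = np.arctan2(flow[..., 1], flow[..., 0]).ravel()
    hist, _ = np.histogram(angles, bins=bins, range=(-np.pi, np.pi))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

rng = np.random.default_rng(0)
series = [flow_entropy(rng.standard_normal((32, 32, 2))) for _ in range(5)]
alarms = [t for t, h in enumerate(series) if h > 2.5]  # threshold rule
print(series, alarms)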
【Keywords】: microscopic statistics, Boltzmann Entropy, crowd behavior, collective behavior, abnormal event detection, anomaly detection
【Paper Link】 【Pages】:4303-4304
【Authors】: Margot Lhommet ; Yuyu Xu ; Stacy Marsella
【Abstract】: Our method automatically generates realistic nonverbal performances for virtual characters to accompany spoken utterances. It analyses the acoustic, syntactic, semantic and rhetorical properties of the utterance text and audio signal to generate nonverbal behavior such as head movements, eye saccades, and novel gesture animations based on co-articulation.
【Keywords】: virtual human; gesture; embodied conversational agent
【Paper Link】 【Pages】:4305-4306
【Authors】: Boyang Li ; Mark O. Riedl
【Abstract】: Interactive narrative is a form of storytelling in which users affect a dramatic storyline through actions by assuming the role of characters in a virtual world. This extended abstract outlines the Scheherazade-IF system, which uses crowdsourcing and artificial intelligence to automatically construct text-based interactive narrative experiences.
【Keywords】: interactive narrative; crowdsourcing
【Paper Link】 【Pages】:4307-4308
【Authors】: Louis-Philippe Morency ; Giota Stratou ; David DeVault ; Arno Hartholt ; Margot Lhommet ; Gale M. Lucas ; Fabrizio Morbini ; Kallirroi Georgila ; Stefan Scherer ; Jonathan Gratch ; Stacy Marsella ; David R. Traum ; Albert A. Rizzo
【Abstract】: We present the SimSensei system, a fully automatic virtual agent that conducts interviews to assess indicators of psychological distress. We emphasize the perception part of the system, a multimodal framework which captures and analyzes user state for both behavioral understanding and interactional purposes.
【Keywords】: virtual humans; therapy; depression
【Paper Link】 【Pages】:4309-4310
【Authors】: Florian Pecune ; Beatrice Biancardi ; Yu Ding ; Catherine Pelachaud ; Maurizio Mancini ; Giovanna Varni ; Antonio Camurri ; Gualtiero Volpe
【Abstract】: In our demo, LoL, a user interacts with a virtual agent able to copy and adapt its laughing and expressive behaviors on the fly. Our aim is to study how copying capabilities participate in enhancing the user's experience in the interaction. The user listens to funny audio stimuli in the presence of a laughing agent: when the funniness of the audio increases, the agent laughs, and the quality of its body movement (direction and amplitude of laughter movements) is modulated on the fly by the user's body features.
【Keywords】:
【Paper Link】 【Pages】:4311-4312
【Authors】: Julie Porteous ; Fred Charles ; Marc Cavazza
【Abstract】: Narrative generation represents an application domain for AI planning where plan quality is related to properties such as shape of plan trajectory. In our work we have developed a plan-based approach to narrative generation that uses character relationships as a key determinant in controlling plan shape (relationships are key in genres such as serial dramas and soaps). Our approach is implemented in a demonstration Interactive Narrative, called NetworkING, set in the medical drama genre. The system features a user-friendly mechanism for specifying relationships between virtual characters, via a social network and real-time visualisation of generated narratives on a 3D stage.
【Keywords】:
【Paper Link】 【Pages】:4313-4315
【Authors】: Stephen G. Ware ; Robert Michael Young ; Christian Stith ; Phillip Wright
【Abstract】: The Best Laid Plans is an interactive narrative video game that uses cognitive-inspired fast planning techniques to generate stories with conflict during play. Players alternate between acting out a plan and seeing that plan thwarted by non-player characters. The Glaive narrative planner combines causal-link-based computational models of narrative with the speed of fast heuristic search techniques to adapt the story each time the player attempts a new plan.
【Keywords】: computational models of narrative; narrative planning; interactive narrative; intentional planning; conflict; narrative; planning
【Paper Link】 【Pages】:4316-4317
【Authors】: Chitta Baral ; Giuseppe De Giacomo
【Abstract】: This is an extended abstract about what is hot in the field of Knowledge Representation and Reasoning.
【Keywords】: Description Logic; Reasoning about Actions; Knowledge Representation; Reasoning
【Paper Link】 【Pages】:4318-4319
【Authors】: Jeffrey Bigham
【Abstract】: The focus of HCOMP 2014 was the crowd worker. While crowdsourcing is motivated by the promise of leveraging people's intelligence and diverse skillsets in computational processes, the human aspects of this workforce are all too often overlooked. Instead, workers are frequently viewed as interchangeable components that can be statistically managed to eke out reasonable outputs. We are quickly moving past and rejecting these notions, and beginning to understand that it is sometimes the very abstractions we introduce to make human computation feasible, e.g., hiding humans behind APIs or isolating workers from others to ensure independent input, that lead to the problems we then set about trying to solve, e.g., poor or inconsistent quality of work. Creating a brighter future for crowd work will require new socio-technical systems that not only decompose tasks, recruit and coordinate workers, and make sense of results, but also find interesting tasks for people to contribute to, structure tasks so that workers learn from them as they go, and eventually automate the mundane parts of work. Research in artificial intelligence will be vital for achieving this future.
【Keywords】:
【Paper Link】 【Pages】:4320-4321
【Authors】: Stefan Edelkamp ; Peter Kissmann ; Álvaro Torralba
【Abstract】: The cost-optimal track of the International Planning Competition in 2014 saw an unexpected outcome. Unlike the preceding competition in 2011, where explicit-state heuristic search planning scored best, advances in state-set exploration with BDDs gave a significant lead. In this paper we review the outcome of the competition, briefly looking into the internals of the competing systems.
【Keywords】: State-Space Search, Binary Decision Diagrams
【Paper Link】 【Pages】:4322-4323
【Authors】: Marijn Heule ; Torsten Schaub
【Abstract】: During the Vienna Summer of Logic, the first FLoC Olympic Games were organized, bringing together a dozen competitions related to logic. Here we present the highlights of the Satisfiability (SAT) and Answer Set Programming (ASP) competitions.
【Keywords】: SAT, ASP, Competitions
【Paper Link】 【Pages】:4324-4325
【Authors】: Wei Li
【Abstract】: As the premier international forum on human-computer interaction, the ACM Conference on Human Factors in Computing Systems (CHI) has continued to grow and broaden its range of topics and contributing disciplines. CHI 2014 received over 2000 submissions. The papers and notes came from diverse research domains: psychologists and computer scientists were joined by new perspectives from sociology, engineering and manufacturing, communication sciences, design and the arts, among others. Here, I would like to introduce progress in HCI research that will bring new opportunities and challenges to the AI community.
【Keywords】: CHI; Human-computer interaction
【Paper Link】 【Pages】:4326-4327
【Authors】: Jochen Renz
【Abstract】: The Angry Birds AI Competition (aibirds.org) has been held in conjunction with the AI 2012, IJCAI 2013 and ECAI 2014 conferences and will be held again at the IJCAI 2015 conference. The declared goal of the competition is to build an AI agent that can play Angry Birds as well as, or better than, the best human players. In this paper we describe why this is a very difficult problem, why it is a challenge for AI, and why it is an important step towards building AI that can successfully interact with the real world. We also summarise some highlights of past competitions, describe which methods were successful, and give an outlook on proposed variants of the competition.
【Keywords】: Computer Vision; Machine Learning; Knowledge Representation and Reasoning; Planning; Heuristic Search; Reasoning under Uncertainty; Game Theory; Qualitative Physics; Qualitative Reasoning; Spatial Reasoning; Angry Birds
【Paper Link】 【Pages】:4328-4329
【Authors】: Sven Wachsmuth ; Dirk Holz ; Maja Rudinac ; Javier Ruiz-del-Solar
【Abstract】: The RoboCup@Home league was founded in 2006 with the idea of driving research in AI and related fields towards autonomous and interactive robots that cope with real-life tasks in supporting humans in everyday life. The yearly competition format establishes benchmarking as a continuous process with yearly changes, instead of a single challenge. We discuss the current state and future perspectives of this endeavor.
【Keywords】: service robotics; human-robot interaction; benchmarking; competition
【Paper Link】 【Pages】:4330-
【Authors】: Wei Wang
【Abstract】: As the premier international forum for data science, data mining, knowledge discovery and big data, the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) brings together researchers and practitioners from academia, industry, and government to share their ideas, research results and experiences. Partnered with Bloomberg, it celebrated its 20th year in 2014 with the theme “Data Science for Social Good”. The breadth of topics covered in the 2014 research program is truly comprehensive and nicely balanced among social and information networks, data mining for social good, graph mining, statistical techniques for big data, topic modeling, recommender systems, data streams, scalable methods, Web mining, clustering, feature selection, applications to health care and medicine, public safety, advertising, social analytics, personalization, workforce analytics, health, and many more.
【Keywords】: Data mining