Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, IJCAI 2020. ijcai.org 【DBLP Link】
【Paper Link】 【Pages】:3-9
【Authors】: Saba Ahmadi ; Faez Ahmed ; John P. Dickerson ; Mark Fuge ; Samir Khuller
【Abstract】: Bipartite b-matching, where agents on one side of a market are matched to one or more agents or items on the other, is a classical model that is used in myriad application areas such as healthcare, advertising, education, and general resource allocation. Traditionally, the primary goal of such models is to maximize a linear function of the constituent matches (e.g., linear social welfare maximization) subject to some constraints. Recent work has studied a new goal of balancing whole-match diversity and economic efficiency, where the objective is instead a monotone submodular function over the matching. Basic versions of this problem are solvable in polynomial time. In this work, we prove that the problem of simultaneously maximizing diversity along several features (e.g., country of citizenship, gender, skills) is NP-hard. To address this problem, we develop the first combinatorial algorithm that constructs provably-optimal diverse b-matchings in pseudo-polynomial time. We also provide a Mixed-Integer Quadratic formulation for the same problem and show that our method guarantees optimal solutions and takes less computation time for a reviewer assignment application. The source code is made available at https://github.com/faezahmed/diverse_matching.
【Keywords】: Agent-based and Multi-agent Systems: Economic Paradigms, Auctions and Market-Based Systems; Heuristic Search and Game Playing: Combinatorial Search and Optimisation; Humans and AI: Human Computation and Crowdsourcing;
【Paper Link】 【Pages】:10-16
【Authors】: Yuan Yao ; Natasha Alechina ; Brian Logan ; John Thangarajah
【Abstract】: A key problem in Belief-Desire-Intention agents is how an agent progresses its intentions, i.e., which plans should be selected and how the execution of these plans should be interleaved so as to achieve the agent’s goals. Previous approaches to the intention progression problem assume the agent has perfect information about the state of the environment. However, in many real-world applications, an agent may be uncertain about whether an environment condition holds, and hence whether a particular plan is applicable or an action is executable. In this paper, we propose SAU, a Monte-Carlo Tree Search (MCTS)-based scheduler for intention progression problems where the agent’s beliefs are uncertain. We evaluate the performance of our approach experimentally by varying the degree of uncertainty in the agent’s beliefs. The results suggest that SAU is able to successfully achieve the agent’s goals even in settings where there is significant uncertainty in the agent’s beliefs.
【Keywords】: Agent-based and Multi-agent Systems: Agent Theories and Models; Agent-based and Multi-agent Systems: Engineering Methods, Platforms, Languages and Tools;
【Paper Link】 【Pages】:17-23
【Authors】: Tahar Allouche ; Bruno Escoffier ; Stefano Moretti ; Meltem Öztürk
【Abstract】: We investigate the issue of manipulability for social ranking rules, where the goal is to rank individuals given the ranking of coalitions formed by them and each individual prefers to reach the highest positions in the social ranking. This problem lies at the intersection of computational social choice and the algorithmic theory of power indices. Different social ranking rules have been recently proposed and studied from an axiomatic point of view. In this paper, we focus on rules representing three classical approaches in social choice theory: the marginal contribution approach, the lexicographic approach and the (ceteris paribus) majority one. We first consider some particular members of these families, analysing their resistance to malicious behaviour by individuals. Then, we analyze the computational complexity of manipulation, and complete our theoretical results with simulations in order to analyse manipulation frequencies and to assess the effects of manipulations.
【Keywords】: Agent-based and Multi-agent Systems: Computational Social Choice; Agent-based and Multi-agent Systems: Cooperative Games;
【Paper Link】 【Pages】:24-30
【Authors】: Georgios Amanatidis ; Georgios Birmpas ; Aris Filos-Ratsikas ; Alexandros Hollender ; Alexandros A. Voudouris
【Abstract】: We consider the classic problem of fairly allocating indivisible goods among agents with additive valuation functions and explore the connection between two prominent fairness notions: maximum Nash welfare (MNW) and envy-freeness up to any good (EFX). We establish that an MNW allocation is always EFX as long as there are at most two possible values for the goods, whereas this implication is no longer true for three or more distinct values. As a notable consequence, this proves the existence of EFX allocations for these restricted valuation functions. While the efficient computation of an MNW allocation for two possible values remains an open problem, we present a novel algorithm for directly constructing EFX allocations in this setting. Finally, we study the question of whether an MNW allocation implies any EFX guarantee for general additive valuation functions under a natural new interpretation of approximate EFX allocations.
【Keywords】: Agent-based and Multi-agent Systems: Computational Social Choice; Agent-based and Multi-agent Systems: Economic Paradigms, Auctions and Market-Based Systems;
【Paper Link】 【Pages】:31-38
【Authors】: Yanchen Deng ; Bo An
【Abstract】: Incomplete GDL-based algorithms, including Max-sum and its variants, are important methods for multi-agent optimization. However, they face a significant scalability challenge as the computational overhead grows exponentially with respect to the arity of each utility function. The Generic Domain Pruning (GDP) technique reduces the computational effort by performing a one-shot pruning to filter out suboptimal entries. Unfortunately, GDP could perform poorly when dealing with dense local utilities and ties, which widely exist in many domains. In this paper, we present several novel sorting-based acceleration algorithms by alleviating the effect of densely distributed local utilities. Specifically, instead of the one-shot pruning in GDP, we propose to integrate both search and pruning to iteratively reduce the search space. Besides, we cope with utility ties by organizing the search space of tied utilities into AND/OR trees to enable branch-and-bound. Finally, we propose a discretization mechanism to offer a tradeoff between the reconstruction overhead and the pruning efficiency. We demonstrate the superiority of our algorithms over the state of the art from both theoretical and experimental perspectives.
【Keywords】: Agent-based and Multi-agent Systems: Coordination and Cooperation; Constraints and SAT: Constraint Optimization; Constraints and SAT: Distributed Constraints;
【Paper Link】 【Pages】:39-45
【Authors】: Haris Aziz ; Simon Rey
【Abstract】: We consider a multi-agent resource allocation setting in which an agent's utility may decrease or increase when an item is allocated. We take the group envy-freeness concept that is well-established in the literature and present stronger and relaxed versions that are especially suitable for the allocation of indivisible items. Of particular interest is a concept called group envy-freeness up to one item (GEF1). We then present a clear taxonomy of the fairness concepts. We study which fairness concepts guarantee the existence of a fair allocation under which preference domain. For two natural classes of additive utilities, we design polynomial-time algorithms to compute a GEF1 allocation. We also prove that checking whether a given allocation satisfies GEF1 is coNP-complete when there are either only goods, only chores or both.
【Keywords】: Agent-based and Multi-agent Systems: Computational Social Choice; Agent-based and Multi-agent Systems: Resource Allocation;
【Paper Link】 【Pages】:46-52
【Authors】: Siddharth Barman ; Ranjani G. Sundaram
【Abstract】: We study the problem of allocating indivisible goods among agents that have an identical subadditive valuation over the goods. The extent of fairness and efficiency of allocations is measured by the generalized means of the values that the allocations generate among the agents. Parameterized by an exponent term p, generalized-mean welfares encompass multiple well-studied objectives, such as social welfare, Nash social welfare, and egalitarian welfare. We establish that, under identical subadditive valuations and in the demand oracle model, one can efficiently find a single allocation that approximates the optimal generalized-mean welfare—to within a factor of 40—uniformly for all p ∈ (−∞, 1]. Hence, by way of a constant-factor approximation algorithm, we obtain novel results for maximizing Nash social welfare and egalitarian welfare for identical subadditive valuations.
【Keywords】: Agent-based and Multi-agent Systems: Algorithmic Game Theory; Agent-based and Multi-agent Systems: Computational Social Choice;
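The generalized p-mean welfare parameterized in the abstract above can be made concrete with a small sketch. The function name and sample values here are illustrative, not from the paper; only the mathematical definition of the p-mean and its limit cases are assumed:

```python
import math

def p_mean_welfare(values, p):
    """Generalized p-mean of agents' values: (1/n * sum_i v_i^p)^(1/p).

    p = 1 gives the average (utilitarian welfare), p -> 0 gives the
    geometric mean (Nash social welfare), and p -> -inf approaches the
    minimum value (egalitarian welfare).
    """
    n = len(values)
    if p == 0:  # limit case: geometric mean, i.e., Nash social welfare
        return math.exp(sum(math.log(v) for v in values) / n)
    return (sum(v ** p for v in values) / n) ** (1 / p)

vals = [4.0, 1.0, 16.0]
print(p_mean_welfare(vals, 1))    # utilitarian mean: 7.0
print(p_mean_welfare(vals, 0))    # Nash (geometric mean): 4.0
print(p_mean_welfare(vals, -50))  # close to the minimum value 1.0
```

A single allocation approximating the optimum of this objective simultaneously for all p ∈ (−∞, 1], as the paper claims, therefore approximates all three classical welfares at once.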
【Paper Link】 【Pages】:53-59
【Authors】: Aris Anagnostopoulos ; Luca Becchetti ; Emilio Cruciani ; Francesco Pasquale ; Sara Rizzo
【Abstract】: We investigate opinion dynamics in multi-agent networks when there exists a bias toward one of two possible opinions; for example, reflecting a status quo vs a superior alternative. Starting with all agents sharing an initial opinion representing the status quo, the system evolves in steps. In each step, one agent selected uniformly at random adopts with some probability a the superior opinion, and with probability 1 - a it follows an underlying update rule to revise its opinion on the basis of those held by its neighbors. We analyze the convergence of the resulting process under two well-known update rules, namely majority and voter. The framework we propose exhibits a rich structure, with a nonobvious interplay between topology and underlying update rule. For example, for the voter rule we show that the speed of convergence bears no significant dependence on the underlying topology, whereas the picture changes completely under the majority rule, where network density negatively affects convergence. We believe that the model we propose is at the same time simple, rich, and modular, affording mathematical characterization of the interplay between bias, underlying opinion dynamics, and social structure in a unified setting.
【Keywords】: Agent-based and Multi-agent Systems: Agent Theories and Models; Agent-based and Multi-agent Systems: Agent-Based Simulation and Emergence; Agent-based and Multi-agent Systems: Agent Societies;
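The biased dynamics described in the abstract above can be sketched directly. This is a minimal simulation under the voter update rule only; the function, its parameters, and its return convention are assumptions for illustration, not the paper's code:

```python
import random

def biased_voter_dynamics(n, a, neighbors, max_steps=10**6, seed=0):
    """Simulate the biased opinion dynamics under the voter rule.

    All n agents start with the status-quo opinion 0. In each step one
    agent, chosen uniformly at random, adopts the superior opinion 1
    with probability a, and otherwise copies the opinion of a random
    neighbor. Returns the number of steps until all agents hold
    opinion 1 (an absorbing state since a > 0), or None on timeout.
    """
    rng = random.Random(seed)
    opinion = [0] * n
    count_ones = 0
    for step in range(1, max_steps + 1):
        i = rng.randrange(n)
        if rng.random() < a:
            new = 1  # bias toward the superior alternative
        else:
            new = opinion[rng.choice(neighbors[i])]  # voter rule
        count_ones += new - opinion[i]
        opinion[i] = new
        if count_ones == n:
            return step
    return None

# complete graph on 20 agents, bias a = 0.3
n = 20
nbrs = [[j for j in range(n) if j != i] for i in range(n)]
steps = biased_voter_dynamics(n, 0.3, nbrs)
print(steps is not None)  # consensus on the superior opinion is reached
```

Swapping the voter update for a majority-of-neighbors rule in the `else` branch would give the other dynamics the paper analyzes.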
【Paper Link】 【Pages】:60-66
【Authors】: Nagat Drawel ; Jamal Bentahar ; Amine Laarej ; Gaith Rjoub
【Abstract】: We present a formal framework that allows individual agents and groups of agents to reason about their trust toward other agents. In particular, we propose a branching-time temporal logic BT which includes operators that express concepts such as everyone trusts, distributed trust and propagated trust. We analyze the satisfiability and model checking problems of this logic using a reduction technique.
【Keywords】: Agent-based and Multi-agent Systems: Formal Verification, Validation and Synthesis; Agent-based and Multi-agent Systems: Trust and Reputation; Agent-based and Multi-agent Systems: Engineering Methods, Platforms, Languages and Tools;
【Paper Link】 【Pages】:67-73
【Authors】: Niclas Boehmer ; Robert Bredereck ; Dusan Knop ; Junjie Luo
【Abstract】: Given a set of individuals qualifying or disqualifying each other, group identification is the task of identifying a socially qualified subgroup of individuals. Social qualification depends on the specific rule used to aggregate individual qualifications. The bribery problem in this context asks how many agents need to change their qualifications in order to change the outcome.
Complementing previous results showing polynomial-time solvability or NP-hardness of bribery for various social rules in the constructive (aiming at making specific individuals socially qualified) or destructive (aiming at making specific individuals socially disqualified) setting, we provide a comprehensive picture of the parameterized computational complexity landscape. Conceptually, we also consider a more fine-grained concept of bribery cost, where we ask how many single qualifications need to be changed, and a more general bribery goal that combines the constructive and destructive setting.
【Keywords】: Agent-based and Multi-agent Systems: Computational Social Choice; Agent-based and Multi-agent Systems: Voting;
【Paper Link】 【Pages】:74-80
【Authors】: Ioannis Caragiannis ; Christos Kaklamanis ; Nikos Karanikolas ; George A. Krimpas
【Abstract】: Approval-based multiwinner voting rules have recently received much attention in the Computational Social Choice literature. Such rules aggregate approval ballots and determine a winning committee of alternatives. To assess effectiveness, we propose to employ new noise models that are specifically tailored for approval votes and committees. These models take as input a ground truth committee and return random approval votes to be thought of as noisy estimates of the ground truth. A minimum robustness requirement for an approval-based multiwinner voting rule is to return the ground truth when applied to profiles with sufficiently many noisy votes. Our results indicate that approval-based multiwinner voting can indeed be robust to reasonable noise. We further refine this finding by presenting a hierarchy of rules in terms of how robust to noise they are.
【Keywords】: Agent-based and Multi-agent Systems: Computational Social Choice; Agent-based and Multi-agent Systems: Voting;
【Paper Link】 【Pages】:81-88
【Authors】: Aleksander Czechowski ; Frans A. Oliehoek
【Abstract】: Decentralized online planning can be an attractive paradigm for cooperative multi-agent systems, due to improved scalability and robustness. A key difficulty of such an approach lies in making accurate predictions about the decisions of other agents. In this paper, we present a trainable online decentralized planning algorithm based on decentralized Monte Carlo Tree Search, combined with models of teammates learned from previous episodic runs. By only allowing one agent to adapt its models at a time, under the assumption of ideal policy approximation, successive iterations of our method are guaranteed to improve joint policies, and eventually lead to convergence to a Nash equilibrium. We test the efficiency of the algorithm by performing experiments in several scenarios of the spatial task allocation environment introduced in [Claes et al., 2015]. We show that deep learning and convolutional neural networks can be employed to produce accurate policy approximators which exploit the spatial features of the problem, and that the proposed algorithm improves over the baseline planning performance for particularly challenging domain configurations.
【Keywords】: Agent-based and Multi-agent Systems: Coordination and Cooperation; Agent-based and Multi-agent Systems: Multi-agent Learning; Agent-based and Multi-agent Systems: Multi-agent Planning;
【Paper Link】 【Pages】:89-95
【Authors】: Chinmay Sonar ; Palash Dey ; Neeldhara Misra
【Abstract】: The Chamberlin-Courant and Monroe rules are fundamental and well-studied rules in the literature of multi-winner elections. The problem of determining if there exists a committee of size k that has a Chamberlin-Courant (respectively, Monroe) dissatisfaction score of at most r is known to be NP-complete. We consider the following natural problems in this setting: a) given a committee S of size k as input, is it an optimal k-sized committee?, and b) given a candidate c and a committee size k, does there exist an optimal k-sized committee that contains c? In this work, we resolve the complexity of both problems for the Chamberlin-Courant and Monroe voting rules in the settings of rankings as well as approval ballots. We show that verifying if a given committee is optimal is coNP-complete whilst the latter problem is complete for Theta_2^P. Our contribution fills an essential gap in the literature for these important multi-winner rules.
【Keywords】: Agent-based and Multi-agent Systems: Computational Social Choice; Agent-based and Multi-agent Systems: Voting;
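For intuition on the committees whose optimality the paper above asks about, here is a brute-force sketch of the Chamberlin-Courant rule with Borda-style misrepresentation for rankings. It is exponential in the number of candidates, so purely a toy illustration consistent with the problem's hardness; the function names and the example profile are made up:

```python
from itertools import combinations

def cc_score(profile, committee):
    """Chamberlin-Courant dissatisfaction of a committee: each voter is
    represented by their highest-ranked committee member and contributes
    that member's (0-indexed) position in the voter's ranking."""
    return sum(min(ranking.index(c) for c in committee) for ranking in profile)

def optimal_cc_committees(profile, k):
    """Enumerate all k-sized committees and return those with minimum
    dissatisfaction (feasible only for tiny instances)."""
    candidates = profile[0]
    scores = {S: cc_score(profile, S) for S in combinations(candidates, k)}
    best = min(scores.values())
    return [set(S) for S, s in scores.items() if s == best]

# three voters ranking candidates a, b, c, d from best to worst
profile = [("a", "b", "c", "d"),
           ("b", "a", "c", "d"),
           ("d", "c", "b", "a")]
print(optimal_cc_committees(profile, 2))
```

Verifying that a given committee achieves the minimum score over all k-sized committees is exactly the first problem the paper shows coNP-complete; the enumeration above sidesteps that hardness only because the instance is tiny.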
【Paper Link】 【Pages】:96-102
【Authors】: Niclas Boehmer ; Edith Elkind
【Abstract】: In the multidimensional stable roommate problem, agents have to be allocated to rooms and have preferences over sets of potential roommates. We study the complexity of finding good allocations of agents to rooms under the assumption that agents have diversity preferences (Bredereck, Elkind, Igarashi, AAMAS'19): each agent belongs to one of the two types (e.g., juniors and seniors, artists and engineers), and agents’ preferences over rooms depend solely on the fraction of agents of their own type among their potential roommates. We consider various solution concepts for this setting, such as core and exchange stability, Pareto optimality and envy-freeness. On the negative side, we prove that envy-free, core stable or (strongly) exchange stable outcomes may fail to exist and that the associated decision problems are NP-complete. On the positive side, we show that these problems are in FPT with respect to the room size, which is not the case for the general stable roommate problem.
【Keywords】: Agent-based and Multi-agent Systems: Computational Social Choice; Agent-based and Multi-agent Systems: Cooperative Games;
【Paper Link】 【Pages】:103-109
【Authors】: Robert Bredereck ; Piotr Faliszewski ; Michal Furdyna ; Andrzej Kaczmarczyk ; Martin Lackner
【Abstract】: In parliamentary elections, parties compete for a limited, typically fixed number of seats. We study the complexity of the following bribery-style problem: given the distribution of votes among the parties, what is the smallest number of voters that need to be convinced to vote for our party so that it gets a desired number of seats? We also run extensive experiments on real-world election data and measure the effectiveness of our method.
【Keywords】: Agent-based and Multi-agent Systems: Voting; Agent-based and Multi-agent Systems: Computational Social Choice; Agent-based and Multi-agent Systems: Algorithmic Game Theory;
【Paper Link】 【Pages】:110-116
【Authors】: Brandon Fain ; William Fan ; Kamesh Munagala
【Abstract】: We study higher statistical moments of Distortion for randomized social choice in a metric implicit utilitarian model. The Distortion of a social choice mechanism is the expected approximation factor with respect to the optimal utilitarian social cost (OPT). The k'th moment of Distortion is the expected approximation factor with respect to the k'th power of OPT. We consider mechanisms that elicit alternatives by randomly sampling voters for their favorite alternative. We design two families of mechanisms that provide constant (with respect to the number of voters and alternatives) k'th moment of Distortion using just k samples if all voters can then participate in a vote among the proposed alternatives, or 2k-1 samples if only the sampled voters can participate. We also show that these numbers of samples are tight. Such mechanisms deviate from a constant approximation to OPT with probability that drops exponentially in the number of samples, independent of the total number of voters and alternatives. We conclude with simulations on real-world Participatory Budgeting data to qualitatively complement our theoretical insights.
【Keywords】: Agent-based and Multi-agent Systems: Computational Social Choice; Agent-based and Multi-agent Systems: Voting;
【Paper Link】 【Pages】:117-123
【Authors】: Nathanaël Fijalkow ; Bastien Maubert ; Aniello Murano ; Moshe Y. Vardi
【Abstract】: Prompt-LTL extends Linear Temporal Logic with a bounded version of the "eventually" operator to express temporal requirements such as bounding waiting times. We study assume-guarantee synthesis for prompt-LTL: the goal is to construct a system such that for all environments satisfying a first prompt-LTL formula (the assumption) the system composed with this environment satisfies a second prompt-LTL formula (the guarantee). This problem has been open for a decade. We construct an algorithm for solving it and show that, like classical LTL synthesis, it is 2-EXPTIME-complete.
【Keywords】: Agent-based and Multi-agent Systems: Formal Verification, Validation and Synthesis; Agent-based and Multi-agent Systems: Algorithmic Game Theory;
【Paper Link】 【Pages】:124-131
【Authors】: Naman Goel ; Aris Filos-Ratsikas ; Boi Faltings
【Abstract】: We derive conditions under which a peer-consistency mechanism can be used to elicit truthful data from non-trusted rational agents when an aggregate statistic of the collected data affects the amount of their incentives to lie. Furthermore, we discuss the relative saving that can be achieved by the mechanism, compared to the rational outcome, if no such mechanism was implemented. Our work is motivated by distributed platforms, where decentralized data oracles collect information about real-world events, based on the aggregate information provided by often self-interested participants. We compare our theoretical observations with numerical simulations on two public real datasets.
【Keywords】: Agent-based and Multi-agent Systems: Economic Paradigms, Auctions and Market-Based Systems; Agent-based and Multi-agent Systems: Trust and Reputation; Humans and AI: Human Computation and Crowdsourcing; Trust, Fairness, Bias: General;
【Paper Link】 【Pages】:132-138
【Authors】: Rupert Freeman ; Anson Kahng ; David M. Pennock
【Abstract】: We study proportionality in approval-based multiwinner elections with a variable number of winners, where both the size and identity of the winning committee are informed by voters' opinions. While proportionality has been studied in multiwinner elections with a fixed number of winners, it has not been considered in the variable number of winners setting. The measure of proportionality we consider is average satisfaction (AS), which intuitively measures the number of agreements on average between sufficiently large and cohesive groups of voters and the output of the voting rule. First, we show an upper bound on AS that any deterministic rule can provide, and that straightforward adaptations of deterministic rules from the fixed number of winners setting do not achieve better than a 1/2 approximation to AS even for large numbers of candidates. We then prove that a natural randomized rule achieves a 29/32 approximation to AS.
【Keywords】: Agent-based and Multi-agent Systems: Algorithmic Game Theory; Agent-based and Multi-agent Systems: Computational Social Choice; Agent-based and Multi-agent Systems: Voting;
【Paper Link】 【Pages】:139-145
【Authors】: Hagen Echzell ; Tobias Friedrich ; Pascal Lenzner ; Anna Melnichenko
【Abstract】: Network Creation Games (NCGs) model the creation of decentralized communication networks like the Internet. In such games, strategic agents corresponding to network nodes selfishly decide with whom to connect to optimize some objective function. Past research intensively analyzed models where the agents strive for a central position in the network. This models agents optimizing the network for low-latency applications like VoIP. However, with today's abundance of streaming services, it is important to ensure that the created network can satisfy the increased bandwidth demand. To the best of our knowledge, this natural problem of the decentralized strategic creation of networks with sufficient bandwidth has not yet been studied.
We introduce Flow-Based NCGs where the selfish agents focus on bandwidth instead of latency. In essence, budget-constrained agents create network links to maximize their minimum or average network flow value to all other network nodes. Equivalently, this can also be understood as agents who create links to increase their connectivity and thus also the robustness of the network.
For this novel type of NCG we prove that pure Nash equilibria exist, we give a simple algorithm for computing optimal networks, we show that the Price of Stability is 1 and we prove an (almost) tight bound of 2 on the Price of Anarchy. Last but not least, we show that our models do not admit a potential function.
【Keywords】: Agent-based and Multi-agent Systems: Algorithmic Game Theory; Agent-based and Multi-agent Systems: Noncooperative Games;
【Paper Link】 【Pages】:146-152
【Authors】: Jiehua Chen ; Robert Ganian ; Thekla Hamm
【Abstract】: We investigate the following many-to-one stable matching problem with diversity constraints (SMTI-DIVERSE): Given a set of students and a set of colleges which have preferences over each other, where the students have overlapping types, and the colleges each have a total capacity as well as quotas for individual types (the diversity constraints), is there a matching satisfying all diversity constraints such that no unmatched student-college pair has an incentive to deviate?
SMTI-DIVERSE is known to be NP-hard. However, as opposed to the NP-membership claims in the literature [Aziz et al., AAMAS 2019; Huang, SODA 2010], we prove that it is beyond NP: it is complete for the complexity class Σ^P_2. In addition, we provide a comprehensive analysis of the problem’s complexity from the viewpoint of natural restrictions to inputs and obtain new algorithms for the problem.
【Keywords】: Agent-based and Multi-agent Systems: Algorithmic Game Theory; Agent-based and Multi-agent Systems: Computational Social Choice; Agent-based and Multi-agent Systems: Economic Paradigms, Auctions and Market-Based Systems;
【Paper Link】 【Pages】:153-159
【Authors】: Haris Aziz ; Serge Gaspers ; Zhaohong Sun
【Abstract】: We study the controlled school choice problem where students may belong to overlapping types and schools have soft target quotas for each type. We formalize fairness concepts for the setting that extend fairness concepts considered for restricted settings without overlapping types. Our central contribution is presenting a new class of algorithms that takes into account the representations of combinations of student types. The algorithms return matchings that are non-wasteful and satisfy fairness for same types. We further prove that the algorithms are strategyproof for the students and yield a fair outcome with respect to the induced quotas for type combinations. We experimentally compare our algorithms with two existing approaches in terms of achieving diversity goals and satisfying fairness.
【Keywords】: Agent-based and Multi-agent Systems: Economic Paradigms, Auctions and Market-Based Systems; AI Ethics: Fairness;
【Paper Link】 【Pages】:160-166
【Authors】: Nika Haghtalab ; Nicole Immorlica ; Brendan Lucier ; Jack Z. Wang
【Abstract】: Motivated by applications such as college admission and insurance rate determination, we study a classification problem where the inputs are controlled by strategic individuals who can modify their features at a cost. A learner can only partially observe the features, and aims to classify individuals with respect to a quality score. The goal is to design a classification mechanism that maximizes the overall quality score in the population, taking any strategic updating into account.
When scores are linear and mechanisms can assign their own scores to agents, we show that the optimal classifier is an appropriate projection of the quality score. For the more restrictive task of binary classification via linear thresholds, we construct a (1/4)-approximation to the optimal classifier when the underlying feature distribution is sufficiently smooth and admits an oracle for finding dense regions. We extend our results to settings where the prior distribution is unknown and must be learned from samples.
【Keywords】: Agent-based and Multi-agent Systems: Algorithmic Game Theory; Machine Learning: Learning Theory; AI Ethics: Moral Decision Making;
【Paper Link】 【Pages】:167-174
【Authors】: Aditya Hegde ; Vibhav Agarwal ; Shrisha Rao
【Abstract】: Modelling ethics is critical to understanding and analysing social phenomena. However, prior literature either incorporates ethics into agent strategies or uses it for evaluation of agent behaviour. This work proposes a framework that models both ethical decision making and evaluation, using virtue ethics and utilitarianism. In an iteration, agents can use either the classical Continuous Prisoner's Dilemma or a new type of interaction called a moral interaction, where agents donate to or steal from other agents. We introduce moral interactions to model ethical decision making. We also propose a novel agent type, called a virtue agent, parametrised by the agent's level of ethics. Virtue agents' decisions are based on moral evaluations of past interactions. Our simulations show that unethical agents make short-term gains but are less prosperous in the long run. We find that in societies with positivity bias, unethical agents have a high incentive to become ethical. The opposite is true of societies with negativity bias. We also evaluate the ethicality of existing strategies and compare them with those of virtue agents.
【Keywords】: Agent-based and Multi-agent Systems: Agent-Based Simulation and Emergence; AI Ethics: Moral Decision Making; Multidisciplinary Topics and Applications: Social Sciences;
【Paper Link】 【Pages】:175-181
【Authors】: Niklas Hahn ; Martin Hoefer ; Rann Smorodinsky
【Abstract】: We study an information-structure design problem (i.e., a Bayesian persuasion problem) in an online scenario. Inspired by the classic gambler's problem, consider a set of candidates who arrive sequentially and are evaluated by one agent (the sender). This agent learns the value from hiring the candidate to herself as well as the value to another agent, the receiver. The sender provides a signal to the receiver who, in turn, makes an irrevocable decision on whether or not to hire the candidate. A priori, for each agent the distribution of valuations is independent across candidates but may not be identical. We design good online signaling schemes for the sender. To assess the performance, we compare the expected utility to that of an optimal offline scheme by a prophet sender who knows all candidate realizations in advance. We show an optimal prophet inequality for online Bayesian persuasion, with a 1/2-approximation when the instance satisfies a "satisfactory-status-quo" assumption. Without this assumption, there are instances without any finite approximation factor. We extend the results to combinatorial domains and obtain prophet inequalities for matching with multiple hires and multiple receivers.
【Keywords】: Agent-based and Multi-agent Systems: Algorithmic Game Theory; Agent-based and Multi-agent Systems: Agent Communication;
【Paper Link】 【Pages】:182-188
【Authors】: Hadi Hosseini ; Ayumi Igarashi ; Andrew Searns
【Abstract】: We initiate the study of multi-layered cake cutting with the goal of fairly allocating multiple divisible resources (layers of a cake) among a set of agents. The key requirement is that each agent can only utilize a single resource at each time interval. Several real-life applications exhibit such restrictions on overlapping pieces, for example, assigning time intervals over multiple facilities and resources or assigning shifts to medical professionals. We investigate the existence and computation of envy-free and proportional allocations. We show that envy-free allocations that are both feasible and contiguous are guaranteed to exist for up to three agents with two types of preferences, when the number of layers is two. We further devise an algorithm for computing proportional allocations for any number of agents when the number of layers is factorable to three and/or some power of two.
【Keywords】: Agent-based and Multi-agent Systems: Algorithmic Game Theory; Agent-based and Multi-agent Systems: Computational Social Choice;
【Paper Link】 【Pages】:189-195
【Authors】: Sushmita Gupta ; Pallavi Jain ; Saket Saurabh
【Abstract】: In the standard model of committee selection, we are given a set of ordinal votes over a set of candidates and a desired committee size, and the task is to select a committee that relates to the given votes. Motivated by possible interactions and dependencies between candidates, we study a generalization of committee selection in which the candidates are connected via a network and the task is to select a committee that relates to the given votes while also satisfying certain properties with respect to this candidate network. To accommodate certain correspondences to the voter preferences, we consider three standard voting rules (in particular, $k$-Borda, Chamberlin-Courant, and Gehrlein stability); to model different aspects of interactions and dependencies between candidates, we consider two graph properties (in particular, Independent Set and Connectivity). We study the parameterized complexity of the corresponding combinatorial problems and discuss certain implications of our algorithmic results.
【Keywords】: Agent-based and Multi-agent Systems: Computational Social Choice;
【Paper Link】 【Pages】:196-202
【Authors】: Michal Jaworski ; Piotr Skowron
【Abstract】: We study a model where a group of representatives is elected to make a series of decisions on behalf of voters. The quality of such a representative committee is judged based on the extent to which the decisions it makes are consistent with the voters' preferences. We assume the set of issues on which the committee will make the decisions is unknown---a committee is elected based on the preferences of the voters over the candidates, which reflect only how similar the voters' and candidates' preferences are regarding the issues. In this model we theoretically and experimentally assess the qualities of various multiwinner election rules.
【Keywords】: Agent-based and Multi-agent Systems: Computational Social Choice; Agent-based and Multi-agent Systems: Voting;
【Paper Link】 【Pages】:203-209
【Authors】: Piotr Faliszewski ; Alexander Karpov ; Svetlana Obraztsova
【Abstract】: We analyze the complexity of several NP-hard election-related problems under the assumptions that the voters have group-separable preferences. We show that under this assumption our problems typically remain NP-hard, but we provide more efficient algorithms if additionally the clone decomposition tree is of moderate height.
【Keywords】: Agent-based and Multi-agent Systems: Voting; Agent-based and Multi-agent Systems: Computational Social Choice; Agent-based and Multi-agent Systems: Algorithmic Game Theory;
【Paper Link】 【Pages】:210-216
【Authors】: Wiebe van der Hoek ; Louwe B. Kuijer ; Yì N. Wáng
【Abstract】: We combine social balance theory with temporal logic to obtain a Logic of Allies and Enemies (LAE), which formally describes the likely changes to a social network due to social pressure. We demonstrate how the rich language of LAE can be used to describe various interesting concepts, and show that both model checking and validity checking are PSPACE-complete.
【Keywords】: Agent-based and Multi-agent Systems: Agent Societies; Multidisciplinary Topics and Applications: Social Sciences;
【Paper Link】 【Pages】:217-223
【Authors】: Abhishek Ninad Kulkarni ; Jie Fu
【Abstract】: We consider a class of two-player turn-based zero-sum games on graphs with reachability objectives, known as reachability games, where the objective of Player 1 (P1) is to reach a set of goal states, and that of Player 2 (P2) is to prevent this. In particular, we consider the case where the players have asymmetric information about each other's action capabilities: P2 starts with incomplete information (a misperception) about P1's action set, and updates its misperception when P1 uses an action previously unknown to P2. When P1 is made aware of P2's misperception, the key question is whether P1 can control P2's perception so as to deceive P2 into selecting actions to P1's advantage. To answer this question, we introduce a dynamic hypergame model to capture the reachability game with evolving misperception of P2. Then, we present a fixed-point algorithm to compute the deceptive winning region and strategy for P1 under an almost-sure winning condition. Finally, we show that the synthesized deceptive winning strategy is at least as powerful as the (non-deceptive) winning strategy in the game in which P1 does not account for P2's misperception. We illustrate our algorithm on a robot motion planning problem in an adversarial environment.
【Keywords】: Agent-based and Multi-agent Systems: Algorithmic Game Theory; Agent-based and Multi-agent Systems: Formal Verification, Validation and Synthesis; Knowledge Representation and Reasoning: Reasoning about Knowledge and Belief;
【Paper Link】 【Pages】:224-230
【Authors】: Zack Fitzsimmons ; Omer Lev
【Abstract】: While manipulative attacks on elections have been well-studied, only recently has attention turned to attacks that account for geographic information, which are extremely common in the real world. The best known in the media is gerrymandering, in which district boundaries are redrawn to increase a party's chance to win, but a different geographical manipulation involves influencing the election by selecting the locations of polling places, as many people are unwilling to travel far to vote. In this paper we initiate the study of this manipulation. We find that while it is easy to manipulate the selection of polling places on the line, it becomes difficult already on the plane or with more than two candidates. Moreover, we show that for more than two candidates the problem is inapproximable. However, we find a few restricted cases on the plane where some algorithms perform well. Finally, we discuss how existing results for standard control actions hold in the geographic setting, consider additional control actions in the geographic setting, and suggest directions for future study.
【Keywords】: Agent-based and Multi-agent Systems: Computational Social Choice; Agent-based and Multi-agent Systems: Voting;
【Paper Link】 【Pages】:231-237
【Authors】: Bin Li ; Dong Hao ; Dengji Zhao
【Abstract】: Diffusion auctions are a new model in auction design. They can incentivize buyers who have already joined the auction to further diffuse the sale information to others via social relations, whereby both the seller's revenue and the social welfare can be improved. Diffusion auctions are essentially non-typical multidimensional mechanism design problems in which agents' social relations are intricately intertwined with their bids. In such auctions, incentive-compatibility (IC) means it is best for every agent to honestly report her valuation and fully diffuse the sale information to all her neighbors. Existing work has identified some specific mechanisms for diffusion auctions, while a general theory characterizing all incentive-compatible diffusion auctions is still missing. In this work, we identify a sufficient and necessary condition for all dominant-strategy incentive-compatible (DSIC) diffusion auctions. We formulate the monotonic allocation policies in such multidimensional problems and show that any monotonic allocation policy can be implemented in a DSIC diffusion auction mechanism. Moreover, given any monotonic allocation policy, we obtain the optimal payment policy to maximize the seller's revenue.
【Keywords】: Agent-based and Multi-agent Systems: Economic Paradigms, Auctions and Market-Based Systems; Agent-based and Multi-agent Systems: Algorithmic Game Theory; Agent-based and Multi-agent Systems: Noncooperative Games; Agent-based and Multi-agent Systems: Resource Allocation;
【Paper Link】 【Pages】:238-245
【Authors】: Minming Li ; Pinyan Lu ; Yuhao Yao ; Jialin Zhang
【Abstract】: In this paper, we study the two-facility location game with optional preference, where the acceptable set of facilities for each agent could be different and an agent's cost is his distance to the closest facility within his acceptable set. The objective is to minimize the total cost of all agents while achieving strategyproofness. For general metrics, we design a deterministic strategyproof mechanism for the problem with an approximation ratio of 1+2α, where α is the approximation ratio of the optimization version. In particular, for the setting on a line, we improve the earlier best ratio of n/2+1 to a ratio of 2.75.
【Keywords】: Agent-based and Multi-agent Systems: Algorithmic Game Theory; Agent-based and Multi-agent Systems: Computational Social Choice;
【Paper Link】 【Pages】:246-253
【Authors】: Jakub Cerný ; Viliam Lisý ; Branislav Bosanský ; Bo An
【Abstract】: Stackelberg security games (SSGs) have been deployed in many real-world situations to optimally allocate scarce resources to protect targets against attackers. However, actual human attackers are not perfectly rational, and there are several behavioral models that attempt to predict subrational behavior. Quantal response is among the most commonly used such models, and Quantal Stackelberg Equilibrium (QSE) describes the optimal strategy to commit to when facing a subrational opponent. Non-concavity makes computing QSE computationally challenging, and while there exist algorithms for computing QSE for SSGs, they cannot be directly used for solving an arbitrary game in normal form. We (1) present a transformation of the primal problem for computing QSE using Dinkelbach's method for any general-sum normal-form game, (2) provide a gradient-based and a MILP-based algorithm, give the convergence criteria, and bound their error, and finally (3) experimentally demonstrate that using our novel transformation, a QSE can be closely approximated several orders of magnitude faster.
【Keywords】: Agent-based and Multi-agent Systems: Algorithmic Game Theory; Agent-based and Multi-agent Systems: Noncooperative Games; Humans and AI: Cognitive Modeling;
【Paper Link】 【Pages】:254-260
【Authors】: Reuth Mirsky ; William Macke ; Andy Wang ; Harel Yedidsion ; Peter Stone
【Abstract】: In ad hoc teamwork, multiple agents need to collaborate without prior knowledge about their teammates or their plans. A common assumption in this research area is that the agents cannot communicate. However, just as two random people may speak the same language, autonomous teammates may also happen to share a communication protocol. This paper considers how such a shared protocol can be leveraged, introducing a means to reason about Communication in Ad Hoc Teamwork (CAT). The goal of this work is to enable improved ad hoc teamwork by judiciously leveraging the ability of the team to communicate. We situate our study within a novel CAT scenario, involving tasks with multiple steps, where teammates' plans are unveiled over time. In this context, the paper proposes methods to reason about the timing and value of communication and introduces an algorithm for an ad hoc agent to leverage these methods. Finally, we introduce a new multiagent domain, the tool-fetching domain, and study how varying this domain's properties affects the usefulness of communication. Empirical results show the benefits of explicit reasoning about communication content and timing in ad hoc teamwork.
【Keywords】: Agent-based and Multi-agent Systems: Agent Communication; Agent-based and Multi-agent Systems: Coordination and Cooperation;
【Paper Link】 【Pages】:261-267
【Authors】: Soh Kumabe ; Takanori Maehara
【Abstract】: The b-matching game is a cooperative game defined on a graph. The game generalizes the matching game to allow each individual to have more than one partner. The game has several applications, such as the roommate assignment, the multi-item version of the seller-buyer assignment, and the international kidney exchange.
Compared with the standard matching game, the b-matching game is computationally hard. In particular, the core non-emptiness problem and the core membership problem are co-NP-hard. Therefore, we focus on the convexity of the game, which is a sufficient condition for core non-emptiness and is often a more tractable concept; it also has several additional benefits.
In this study, we give a necessary and sufficient condition for the convexity of the b-matching game. This condition also yields an O(n log n + m α(n))-time algorithm to determine whether a given game is convex, where n and m are the numbers of vertices and edges of the given graph, respectively, and α(·) is the inverse Ackermann function. Using our characterization, we also give a polynomial-time algorithm to compute the Shapley value of a convex b-matching game.
【Keywords】: Agent-based and Multi-agent Systems: Algorithmic Game Theory; Agent-based and Multi-agent Systems: Cooperative Games;
【Paper Link】 【Pages】:268-275
【Authors】: Saaduddin Mahmud ; Md. Mosaddek Khan ; Moumita Choudhury ; Long Tran-Thanh ; Nicholas R. Jennings
【Abstract】: Distributed Constraint Optimization Problems (DCOPs) are an important framework for modeling coordinated decision-making problems in multi-agent systems with a set of discrete variables. Later works have extended DCOPs to model problems with a set of continuous variables, named Functional DCOPs (F-DCOPs). In this paper, we combine both of these frameworks into the Mixed Integer Functional DCOP (MIF-DCOP) framework that can deal with problems regardless of their variables' type. We then propose a novel algorithm - Distributed Parallel Simulated Annealing (DPSA), where agents cooperatively learn the optimal parameter configuration for the algorithm while also solving the given problem using the learned knowledge. Finally, we empirically evaluate our approach in DCOP, F-DCOP, and MIF-DCOP settings and show that DPSA produces solutions of significantly better quality than the state-of-the-art non-exact algorithms in their corresponding settings.
【Keywords】: Agent-based and Multi-agent Systems: Coordination and Cooperation; Constraints and SAT: Constraint Optimization; Constraints and SAT: Distributed Constraints; Agent-based and Multi-agent Systems: Multi-agent Learning;
【Paper Link】 【Pages】:276-282
【Authors】: Szymon Dudycz ; Pasin Manurangsi ; Jan Marcinkowski ; Krzysztof Sornat
【Abstract】: In approval-based multiwinner elections, we are given a set of voters, a set of candidates, and, for each voter, a set of candidates approved by the voter. The goal is to find a committee of size k that maximizes the total utility of the voters. In this paper, we study approximability of Thiele rules, which are known to be NP-hard to solve exactly. We provide a tight polynomial time approximation algorithm for a natural class of geometrically dominant weights that includes such voting rules as Proportional Approval Voting or p-Geometric. The algorithm is relatively simple: first we solve a linear program and then we round the solution using the pipage rounding framework of Ageev and Sviridenko (2004) and Calinescu et al. (2011). We provide a matching lower bound via a reduction from the Label Cover problem. Moreover, assuming a conjecture called Gap-ETH, we show that a better approximation ratio cannot be obtained even in time f(k)·n^{o(k)}.
【Keywords】: Agent-based and Multi-agent Systems: Computational Social Choice; Agent-based and Multi-agent Systems: Voting;
【Paper Link】 【Pages】:283-289
【Authors】: Thanh Hong Nguyen ; Arunesh Sinha ; He He
【Abstract】: Learning attacker behavior is an important research topic in security games as security agencies are often uncertain about attackers' decision making. Previous work has focused on developing various behavioral models of attackers based on historical attack data. However, a clever attacker can manipulate its attacks to fail such attack-driven learning, leading to ineffective defense strategies. We study attacker behavior deception with three main contributions. First, we propose a new model, named partial behavior deception model, in which there is a deceptive attacker (among multiple attackers) who controls a portion of attacks. Our model captures real-world security scenarios such as wildlife protection in which multiple poachers are present. Second, we introduce a new scalable algorithm, GAMBO, to compute an optimal deception strategy of the deceptive attacker. Our algorithm employs the projected gradient descent and uses the implicit function theorem for the computation of gradient. Third, we conduct a comprehensive set of experiments, showing a significant benefit for the attacker and loss for the defender due to attacker deception.
【Keywords】: Agent-based and Multi-agent Systems: Algorithmic Game Theory; Agent-based and Multi-agent Systems: Noncooperative Games; Machine Learning: Adversarial Machine Learning;
【Paper Link】 【Pages】:290-296
【Authors】: Trung Thanh Nguyen ; Jörg Rothe
【Abstract】: In fair division of indivisible goods, finding an allocation that satisfies fairness and efficiency simultaneously is highly desired but computationally hard. We solve this problem approximately in polynomial time by modeling it as a bi-criteria optimization problem that can be solved efficiently by determining an approximate Pareto set of bounded size. We focus on two criteria: max-min fairness and utilitarian efficiency, and study this problem for settings with only a few item types or a few agent types. We show in both cases that one can construct an approximate Pareto set in time polynomial in the input size, by designing either a dynamic programming scheme or a linear programming algorithm. Our techniques strengthen known methods and can potentially be applied to other notions of fairness and efficiency as well.
【Keywords】: Agent-based and Multi-agent Systems: Computational Social Choice; Agent-based and Multi-agent Systems: Resource Allocation; Agent-based and Multi-agent Systems: Economic Paradigms, Auctions and Market-Based Systems;
【Paper Link】 【Pages】:297-303
【Authors】: Pallavi Bagga ; Nicola Paoletti ; Bedour Alrayes ; Kostas Stathis
【Abstract】: We present a novel negotiation model that allows an agent to learn how to negotiate during concurrent bilateral negotiations in unknown and dynamic e-markets. The agent uses an actor-critic architecture with model-free reinforcement learning to learn a strategy expressed as a deep neural network. We pre-train the strategy by supervision from synthetic market data, thereby decreasing the exploration time required for learning during negotiation. As a result, we can build automated agents for concurrent negotiations that can adapt to different e-market settings without the need to be pre-programmed. Our experimental evaluation shows that our deep reinforcement learning based agents outperform two existing well-known negotiation strategies in one-to-many concurrent bilateral negotiations for a range of e-market settings.
【Keywords】: Agent-based and Multi-agent Systems: Agreement Technologies: Negotiation and Contract-Based Systems; Machine Learning Applications: Applications of Reinforcement Learning; Machine Learning Applications: Applications of Supervised Learning;
【Paper Link】 【Pages】:304-310
【Authors】: Evangelos Markakis ; Georgios Papasotiropoulos
【Abstract】: Approval voting provides a simple, practical framework for multi-issue elections, and the most representative example among such election rules is the classic Minisum approval voting rule. We consider a generalization of Minisum, introduced by the work of Barrot and Lang [2016], referred to as Conditional Minisum, where voters are also allowed to express dependencies between issues. The price we have to pay when we move to this higher level of expressiveness is that we end up with a computationally hard rule. Motivated by this, we focus on the computational aspects of Conditional Minisum, where progress has been rather scarce so far. We identify restrictions on every voter's dependencies, under which we provide the first multiplicative approximation algorithms for the problem. The restrictions involve upper bounds on the number of dependencies an issue can have on the others. At the same time, by additionally requiring certain structural properties for the union of dependencies cast by the whole electorate, we obtain optimal efficient algorithms for well-motivated special cases. Overall, our work provides a better understanding of the complexity implications introduced by conditional voting.
【Keywords】: Agent-based and Multi-agent Systems: Computational Social Choice; Agent-based and Multi-agent Systems: Voting; Agent-based and Multi-agent Systems: Algorithmic Game Theory;
【Paper Link】 【Pages】:311-317
【Authors】: Elham Parhizkar ; Mohammad Hossein Nikravan ; Robert C. Holte ; Sandra Zilles
【Abstract】: To assess the trustworthiness of an agent in a multi-agent system, one often combines two types of trust information: direct trust information derived from one's own interactions with that agent, and indirect trust information based on advice from other agents. This paper provides the first systematic study on when it is beneficial to combine these two types of trust as opposed to relying on only one of them. Our large-scale experimental study shows that strong methods for computing indirect trust make direct trust redundant in a surprisingly wide variety of scenarios. Further, a new method for the combination of the two trust types is proposed that, in the remaining scenarios, outperforms the ones known from the literature.
【Keywords】: Agent-based and Multi-agent Systems: Trust and Reputation; Agent-based and Multi-agent Systems: Other;
【Paper Link】 【Pages】:318-324
【Authors】: Edith Elkind ; Neel Patel ; Alan Tsang ; Yair Zick
【Abstract】: We examine the problem of assigning plots of land to prospective buyers who prefer living next to their friends. In this setting, each agent's utility depends on the plot she receives and the identities of the agents who receive the adjacent plots. We are interested in mechanisms without money that guarantee truthful reporting of both land values and friendships, as well as Pareto optimality and computational efficiency. We explore several modifications of the Random Serial Dictatorship (RSD) mechanism, and identify one that performs well according to these criteria. We also study the expected social welfare of the assignments produced by our mechanisms when agents' values for the land plots are binary; it turns out that we can achieve good approximations to the optimal social welfare, but only if the agents value the friendships highly.
【Keywords】: Agent-based and Multi-agent Systems: Computational Social Choice; Agent-based and Multi-agent Systems: Algorithmic Game Theory;
【Paper Link】 【Pages】:325-331
【Authors】: Alessio Lomuscio ; Edoardo Pirovano
【Abstract】: We present a method for reasoning about fault-tolerance in unbounded robotic swarms. We introduce a novel semantics that accounts for the probabilistic nature of both the swarm and possible malfunctions, as well as the unbounded nature of swarm systems. We define and interpret a variant of probabilistic linear-time temporal logic on the resulting executions, including those arising from faulty behaviour by some of the agents in the swarm. We specify the decision problem of parameterised fault-tolerance, which concerns determining whether a probabilistic specification holds under possibly faulty behaviour. We outline a verification procedure that we implement and use to study a foraging protocol from swarm robotics, and report the experimental results obtained.
【Keywords】: Agent-based and Multi-agent Systems: Agent Theories and Models; Agent-based and Multi-agent Systems: Formal Verification, Validation and Synthesis; Robotics: Multi-Robot Systems;
【Paper Link】 【Pages】:332-338
【Authors】: Maria-Florina Balcan ; Siddharth Prasad ; Tuomas Sandholm
【Abstract】: A two-part tariff is a pricing scheme that consists of an up-front lump sum fee and a per unit fee. Various products in the real world are sold via a menu, or list, of two-part tariffs---for example gym memberships, cell phone data plans, etc. We study learning high-revenue menus of two-part tariffs from buyer valuation data, in the setting where the mechanism designer has access to samples from the distribution over buyers' values rather than an explicit description thereof. Our algorithms have clear direct uses, and provide the missing piece for the recent generalization theory of two-part tariffs. We present a polynomial time algorithm for optimizing one two-part tariff. We also present an algorithm for optimizing a length-L menu of two-part tariffs with run time exponential in L but polynomial in all other problem parameters. We then generalize the problem to multiple markets. We prove how many samples suffice to guarantee that a two-part tariff scheme that is feasible on the samples is also feasible on a new problem instance with high probability. We then show that computing revenue-maximizing feasible prices is hard even for buyers with additive valuations. Then, for buyers with identical valuation distributions, we present a condition that is sufficient for the two-part tariff scheme from the unsegmented setting to be optimal for the market-segmented setting. Finally, we prove a generalization result that states how many samples suffice so that we can compute the unsegmented solution on the samples and still be guaranteed that we get a near-optimal solution for the market-segmented setting with high probability.
【Keywords】: Agent-based and Multi-agent Systems: Economic Paradigms, Auctions and Market-Based Systems;
【Paper Link】 【Pages】:339-346
【Authors】: Yujian Ye ; Dawei Qiu ; Jonathan Ward ; Marcin Abram
【Abstract】: The problem of real-time autonomous energy management is an application area that is receiving unprecedented attention from consumers, governments, academia, and industry. This paper showcases the first application of deep reinforcement learning (DRL) to real-time autonomous energy management for a multi-carrier energy system. The proposed approach is tailored to align with the nature of the energy management problem by posing it in multi-dimensional continuous state and action spaces, in order to coordinate power flows between different energy devices, and to adequately capture the synergistic effect of couplings between different energy carriers. This fundamental contribution is a significant step forward from earlier approaches that only sought to control the power output of a single device and neglected the demand-supply coupling of different energy carriers. Case studies on a real-world scenario demonstrate that the proposed method significantly outperforms existing DRL methods as well as model-based control approaches in achieving the lowest energy cost and yielding a representation of energy management policies that adapt to system uncertainties.
【Keywords】: Agent-based and Multi-agent Systems: Agent-Based Simulation and Emergence; Multidisciplinary Topics and Applications: Real-Time Systems; Machine Learning Applications: Applications of Reinforcement Learning;
【Paper Link】 【Pages】:347-353
【Authors】: Anna Maria Kerkmann ; Jörg Rothe
【Abstract】: Nguyen et al. [2016] introduced altruistic hedonic games in which agents’ utilities depend not only on their own preferences but also on those of their friends in the same coalition. We propose to extend their model to coalition formation games in general, considering also the friends in other coalitions. Comparing the two models, we argue that excluding some friends from the altruistic behavior of an agent is a major disadvantage that comes with the restriction to hedonic games. After introducing our model, we additionally study some common stability notions and provide a computational analysis of the associated verification and existence problems.
【Keywords】: Agent-based and Multi-agent Systems: Algorithmic Game Theory; Agent-based and Multi-agent Systems: Computational Social Choice; Agent-based and Multi-agent Systems: Cooperative Games;
【Paper Link】 【Pages】:354-361
【Authors】: Sandhya Saisubramanian ; Ece Kamar ; Shlomo Zilberstein
【Abstract】: Agents operating in unstructured environments often create negative side effects (NSE) that may not be easy to identify at design time. We examine how various forms of human feedback or autonomous exploration can be used to learn a penalty function associated with NSE during system deployment. We formulate the problem of mitigating the impact of NSE as a multi-objective Markov decision process with lexicographic reward preferences and slack. The slack denotes the maximum deviation from an optimal policy with respect to the agent's primary objective allowed in order to mitigate NSE as a secondary objective. Empirical evaluation of our approach shows that the proposed framework can successfully mitigate NSE and that different feedback mechanisms introduce different biases, which influence the identification of NSE.
【Keywords】: Agent-based and Multi-agent Systems: Human-Agent Interaction; Humans and AI: Human-AI Collaboration; Planning and Scheduling: Markov Decisions Processes;
【Paper Link】 【Pages】:362-370
【Authors】: Ludwig Dierks ; Sven Seuken
【Abstract】: In many markets, like electricity or cloud computing markets, providers incur large costs for keeping sufficient capacity in reserve to accommodate demand fluctuations of a mostly fixed user base. These costs are significantly affected by the unpredictability of the users' demand. Nevertheless, standard mechanisms charge fixed per-unit prices that do not depend on the variability of the users' demand. In this paper, we study a variance-based pricing rule in a two-provider market setting and perform a game-theoretic analysis of the resulting competitive effects. We show that an innovative provider who employs variance-based pricing can choose a pricing strategy that guarantees himself a higher profit than using fixed per-unit prices for any individually rational response of a provider playing a fixed pricing strategy. We then characterize all equilibria for the setting where both providers use variance-based pricing strategies. We show that, in equilibrium, the providers' profits may increase or decrease, depending on their cost functions. However, social welfare always weakly increases.
【Keywords】: Agent-based and Multi-agent Systems: Economic Paradigms, Auctions and Market-Based Systems;
【Paper Link】 【Pages】:371-377
【Authors】: Weiran Shen ; Weizhe Chen ; Taoan Huang ; Rohit Singh ; Fei Fang
【Abstract】: Although security games have attracted intensive research attention over the past years, few existing works consider how information from local communities would affect the game. In this paper, we introduce a new player -- a strategic informant, who can observe and report upcoming attacks -- to the defender-attacker security game setting. Characterized by a private type, the informant has his own utility structure, which leads to his strategic behavior. We model the game as a 3-player extensive-form game and propose a novel solution concept of Strong Stackelberg-perfect Bayesian equilibrium. To compute the optimal defender strategy, we first show that although the informant can have infinitely many types in general, the optimal defense plan can only include a finite (exponential) number of different patrol strategies. We then prove that there exists a defense plan with only a linear number of patrol strategies that achieves the optimal defender utility, which significantly reduces the computational burden and allows us to solve the game in polynomial time using linear programming. Finally, we conduct extensive experiments to show the effect of the strategic informant and demonstrate the effectiveness of our algorithm.
【Keywords】: Agent-based and Multi-agent Systems: Algorithmic Game Theory; Agent-based and Multi-agent Systems: Noncooperative Games;
【Paper Link】 【Pages】:378-385
【Authors】: Ron Lavi ; Omer Shiran-Shvarzbard
【Abstract】: We study a competition between two contests, where each contest designer aims to attract as much effort as possible. Such competition arises in practice, e.g., on crowdsourcing websites. Our results are phrased in terms of the ``relative prize power'' of a contest, which is the ratio of the total prize offered by this contest designer relative to the sum of total prizes of the two contests. When contestants have a quasi-linear utility function that captures both a risk-aversion effect and a cost of effort, we show that a simple contest attracts a total effort which approaches the relative prize power of the contest designer, assuming a large number of contestants. This holds regardless of the contest policy of the opponent, hence providing a ``safety level'' which is a robust notion similar in spirit to the max-min solution concept.
【Keywords】: Agent-based and Multi-agent Systems: Agent Theories and Models; Agent-based and Multi-agent Systems: Agent-Based Simulation and Emergence; Agent-based and Multi-agent Systems: Multi-agent Planning;
【Paper Link】 【Pages】:386-392
【Authors】: Pallavi Jain ; Krzysztof Sornat ; Nimrod Talmon
【Abstract】: Participatory budgeting systems allow city residents to jointly decide on projects they wish to fund using public money, by letting residents vote on such projects. While participatory budgeting is gaining popularity, existing aggregation methods do not take into account the natural possibility of project interactions, such as substitution and complementarity effects. Here we take a step towards fixing this issue: First, we augment the standard model of participatory budgeting by introducing a partition over the projects and model the type and extent of project interactions within each part using certain functions. We study the computational complexity of finding bundles that maximize voter utility, as defined with respect to such functions. Motivated by the desire to incorporate project interactions in real-world participatory budgeting systems, we identify certain cases that admit efficient aggregation in the presence of such project interactions.
【Keywords】: Agent-based and Multi-agent Systems: Computational Social Choice;
【Paper Link】 【Pages】:393-399
【Authors】: Nicholas Mattei ; Paolo Turrini ; Stanislav Zhydkov
【Abstract】: In peer selection, agents must choose a subset of themselves for an award or a prize. As agents are self-interested, we want to design algorithms that are impartial, so that an individual agent cannot affect their own chance of being selected. This problem has broad application in resource allocation and mechanism design and has received substantial attention in the artificial intelligence literature. Here, we present a novel algorithm for impartial peer selection, PeerNomination, and provide a theoretical analysis of its accuracy. Our algorithm possesses various desirable features. In particular, it does not require an explicit partitioning of the agents, as previous algorithms in the literature do. We show empirically that it achieves higher accuracy than the existing algorithms over several metrics.
【Keywords】: Agent-based and Multi-agent Systems: Algorithmic Game Theory; Agent-based and Multi-agent Systems: Computational Social Choice; Agent-based and Multi-agent Systems: Noncooperative Games;
【Paper Link】 【Pages】:400-406
【Authors】: Minming Li ; Chenhao Wang ; Mengqi Zhang
【Abstract】: This paper studies facility location games with payments, where facilities are strategic players. In the game, customers and facilities are located at publicly known locations on a line segment. Each selfish facility has an opening cost as her private information, and she may strategically report it. Upon receiving the reports, the government uses a mechanism to select some facilities to open and pay them. The cost/utility of each customer depends on the distance to the nearest opened facility. Under a given budget B, which constrains the total payment, we derive upper and lower bounds on the approximation ratios of truthful budget feasible mechanisms for four utilitarian and egalitarian objectives, and study the case when an augmented budget is allowed.
【Keywords】: Agent-based and Multi-agent Systems: Algorithmic Game Theory; Agent-based and Multi-agent Systems: Computational Social Choice;
【Paper Link】 【Pages】:407-413
【Authors】: Feng Wu ; Sarvapali D. Ramchurn
【Abstract】: We propose a novel algorithm based on Monte-Carlo tree search for the problem of coalition structure generation (CSG). Specifically, we find the optimal solution by sampling the coalition structure graph and incrementally expanding a search tree, which represents the partial space that has been searched. We prove that our algorithm is complete and converges to the optimal solution given a sufficient number of iterations. Moreover, it is anytime and can scale to large CSG problems with many agents. Experimental results on six common CSG benchmark problems and a disaster response domain confirm the advantages of our approach compared to the state-of-the-art methods.
【Keywords】: Agent-based and Multi-agent Systems: Cooperative Games; Agent-based and Multi-agent Systems: Coordination and Cooperation;
【Paper Link】 【Pages】:414-421
【Authors】: Ying Wen ; Yaodong Yang ; Jun Wang
【Abstract】: Though of limited validity in real-world decision making, most multi-agent reinforcement learning (MARL) models assume perfectly rational agents -- a property hardly met due to individuals' cognitive limitations and/or the tractability of the decision problem. In this paper, we introduce generalized recursive reasoning (GR2) as a novel framework to model agents with different \emph{hierarchical} levels of rationality; our framework enables agents to exhibit varying levels of ``thinking'' ability, thereby allowing higher-level agents to best respond to various less sophisticated learners. We contribute both theoretically and empirically. On the theory side, we devise the hierarchical framework of GR2 through probabilistic graphical models and prove the existence of a perfect Bayesian equilibrium. Within GR2, we propose a practical actor-critic solver, and demonstrate its convergence to a stationary point in two-player games through Lyapunov analysis. On the empirical side, we validate our findings on a variety of MARL benchmarks. Precisely, we first illustrate the hierarchical thinking process on the Keynes Beauty Contest, and then demonstrate significant improvements compared to state-of-the-art opponent modeling baselines on normal-form games and the cooperative navigation benchmark.
【Keywords】: Agent-based and Multi-agent Systems: Agent Theories and Models; Agent-based and Multi-agent Systems: Multi-agent Learning;
【Paper Link】 【Pages】:422-428
【Authors】: Yao Zhang ; Xiuzhen Zhang ; Dengji Zhao
【Abstract】: We study a question answering problem on a social network, where a requester is seeking an answer from the agents on the network. The goal is to design reward mechanisms to incentivize the agents to propagate the requester's query to their neighbours if they don't have the answer. Existing mechanisms are vulnerable to Sybil attacks, i.e., an agent may get more reward by creating fake identities. Hence, we combat this problem by first proving some impossibility results to resolve Sybil attacks and then characterizing a class of mechanisms which satisfy Sybil-proofness (prevents Sybil attacks) as well as other desirable properties. Besides Sybil-proofness, we also consider cost minimization for the requester and agents' collusion.
【Keywords】: Agent-based and Multi-agent Systems: Algorithmic Game Theory; Agent-based and Multi-agent Systems: Economic Paradigms, Auctions and Market-Based Systems; Agent-based and Multi-agent Systems: Noncooperative Games;
【Paper Link】 【Pages】:430-436
【Authors】: Pablo Badilla ; Felipe Bravo-Marquez ; Jorge Pérez
【Abstract】: Word embeddings are known to exhibit stereotypical biases towards gender, race, religion, among other criteria. Several fairness metrics have been proposed in order to automatically quantify these biases. Although all metrics have a similar objective, the relationship between them is by no means clear. Two issues that prevent a clean comparison are that they operate with different inputs, and that their outputs are incompatible with each other. In this paper we propose WEFE, the word embeddings fairness evaluation framework, to encapsulate, evaluate and compare fairness metrics. Our framework takes as input a list of pre-trained embeddings and a set of fairness criteria, and it is based on checking correlations between fairness rankings induced by these criteria. We conduct a case study showing that rankings produced by existing fairness methods tend to correlate when measuring gender bias. This correlation is considerably weaker for other biases like race or religion. We also compare the fairness rankings with an embedding benchmark, showing that there is no clear correlation between fairness and good performance in downstream tasks.
【Keywords】: AI Ethics: Fairness; Natural Language Processing: Embeddings; Trust, Fairness, Bias: General; Natural Language Processing: NLP Applications and Tools;
【Paper Link】 【Pages】:437-443
【Authors】: Samuel Yeom ; Matt Fredrikson
【Abstract】: We turn the definition of individual fairness on its head - rather than ascertaining the fairness of a model given a predetermined metric, we find a metric for a given model that satisfies individual fairness. This can facilitate the discussion on the fairness of a model, addressing the issue that it may be difficult to specify a priori a suitable metric. Our contributions are twofold: First, we introduce the definition of a minimal metric and characterize the behavior of models in terms of minimal metrics. Second, for more complicated models, we apply the mechanism of randomized smoothing from adversarial robustness to make them individually fair under a given weighted Lp metric. Our experiments show that adapting the minimal metrics of linear models to more complicated neural networks can lead to meaningful and interpretable fairness guarantees at little cost to utility.
【Keywords】: AI Ethics: Fairness; Machine Learning: Adversarial Machine Learning;
【Paper Link】 【Pages】:444-450
【Authors】: Boli Fang ; Miao Jiang ; Pei-yi Cheng ; Jerry Shen ; Yi Fang
【Abstract】: As effective complements to human judgment, artificial intelligence techniques have started to aid human decisions in complicated social decision problems across the world. Automated machine learning/deep learning (ML/DL) classification models, through quantitative modeling, have the potential to improve upon human decisions in a wide range of decision problems on social resource allocation such as Medicaid and the Supplemental Nutrition Assistance Program (SNAP, commonly referred to as Food Stamps). However, given the limitations in ML/DL model design, these algorithms may fail to leverage various factors for decision making, resulting in improper decisions that allocate resources to individuals who may not be most in need of such resources. In view of such an issue, we propose in this paper the strategy of fairgroups, based on the legal doctrine of disparate impact, to improve fairness in prediction outcomes. Experiments on various datasets demonstrate that our fairgroup construction method effectively boosts the fairness in automated decision making, while maintaining high prediction accuracy.
【Keywords】: AI Ethics: Fairness; AI Ethics: Societal Impact of AI; AI Ethics: Moral Decision Making; Machine Learning: Classification;
【Paper Link】 【Pages】:451-457
【Authors】: Emanuele Albini ; Antonio Rago ; Pietro Baroni ; Francesca Toni
【Abstract】: We propose a general method for generating counterfactual explanations (CFXs) for a range of Bayesian Network Classifiers (BCs), e.g. single- or multi-label, binary or multidimensional. We focus on explanations built from relations of (critical and potential) influence between variables, indicating the reasons for classifications, rather than any probabilistic information. We show by means of a theoretical analysis of CFXs’ properties that they serve the purpose of indicating (potentially) pivotal factors in the classification process, whose absence would give rise to different classifications. We then prove empirically for various BCs that CFXs provide useful information in real world settings, e.g. when race plays a part in parole violation prediction, and show that they have inherent advantages over existing explanation methods in the literature.
【Keywords】: AI Ethics: Explainability; Uncertainty in AI: Bayesian Networks;
【Paper Link】 【Pages】:458-465
【Authors】: Pingchuan Ma ; Shuai Wang ; Jin Liu
【Abstract】: Natural language processing (NLP) models have been increasingly used in sensitive application domains including credit scoring, insurance, and loan assessment. Hence, it is critical to know that the decisions made by NLP models are free of unfair bias toward certain subpopulation groups. In this paper, we propose a novel framework employing metamorphic testing, a well-established software testing scheme, to test NLP models and find discriminatory inputs that provoke fairness violations. Furthermore, inspired by recent breakthroughs in the certified robustness of machine learning, we formulate NLP model fairness in a practical setting as (ε, k)-fairness and accordingly smooth the model predictions to mitigate fairness violations. We demonstrate our technique using popular (commercial) NLP models, and successfully flag thousands of discriminatory inputs that can cause fairness violations. We further enhance the evaluated models by adding a certified fairness guarantee at a modest cost.
【Keywords】: AI Ethics: Fairness; Natural Language Processing: NLP Applications and Tools;
【Paper Link】 【Pages】:467-473
【Authors】: Zhisheng Zhong ; Hiroaki Akutsu ; Kiyoharu Aizawa
【Abstract】: Deep image compression systems mainly contain four components: encoder, quantizer, entropy model, and decoder. To optimize these four components, a joint rate-distortion framework was proposed, and many deep neural network-based methods achieved great success in image compression. However, almost all convolutional neural network-based methods treat channel-wise feature maps equally, reducing the flexibility in handling different types of information. In this paper, we propose a channel-level variable quantization network to dynamically allocate more bitrates for significant channels and withdraw bitrates for negligible channels. Specifically, we propose a variable quantization controller. It consists of two key components: the channel importance module, which can dynamically learn the importance of channels during training, and the splitting-merging module, which can allocate different bitrates for different channels. We also formulate the quantizer in a Gaussian mixture model manner. Quantitative and qualitative experiments verify the effectiveness of the proposed model and demonstrate that our method achieves superior performance and can produce much better visual reconstructions.
【Keywords】: Computer Vision: Other; Machine Learning: Deep Learning; Machine Learning: Deep Learning: Convolutional networks;
【Paper Link】 【Pages】:474-480
【Authors】: Yan Bai ; Yihang Lou ; Yongxing Dai ; Jun Liu ; Ziqian Chen ; Ling-Yu Duan
【Abstract】: Vehicle Re-Identification (ReID) has attracted lots of research efforts due to its great significance to public security. In vehicle ReID, we aim to learn features that are powerful in discriminating subtle differences between vehicles which are visually similar, and also robust against different orientations of the same vehicle. However, these two characteristics are hard to encapsulate into a single feature representation simultaneously with unified supervision. Here we propose a Disentangled Feature Learning Network (DFLNet) to learn orientation-specific and common features concurrently, which are discriminative at details and invariant to orientations, respectively. Moreover, to effectively use these two types of features for ReID, we further design a feature metric alignment scheme to ensure the consistency of the metric scales. The experiments show the effectiveness of our method, which achieves state-of-the-art performance on three challenging datasets.
【Keywords】: Computer Vision: Perception; Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation;
【Paper Link】 【Pages】:481-487
【Authors】: Agostina Calabrese ; Michele Bevilacqua ; Roberto Navigli
【Abstract】: The problem of grounding language in vision is increasingly attracting scholarly efforts. As of now, however, most of the approaches have been limited to word embeddings, which are not capable of handling polysemous words. This is mainly due to the limited coverage of the available semantically-annotated datasets, hence forcing research to rely on alternative technologies (i.e., image search engines). To address this issue, we introduce EViLBERT, an approach which is able to perform image classification over an open set of concepts, both concrete and non-concrete. Our approach is based on the recently introduced Vision-Language Pretraining (VLP) model, and builds upon a manually-annotated dataset of concept-image pairs. We use our technique to clean up the image-to-concept mapping that is provided within a multilingual knowledge base, resulting in over 258,000 images associated with 42,500 concepts. We show that our VLP-based model can be used to create multimodal sense embeddings starting from our automatically-created dataset. In turn, we also show that these multimodal embeddings improve the performance of a Word Sense Disambiguation architecture over a strong unimodal baseline. We release code, dataset and embeddings at http://babelpic.org.
【Keywords】: Computer Vision: Language and Vision; Natural Language Processing: Embeddings; Natural Language Processing: Natural Language Semantics;
【Paper Link】 【Pages】:488-494
【Authors】: Haimei Zhao ; Wei Bian ; Bo Yuan ; Dacheng Tao
【Abstract】: Scene perceiving and understanding tasks including depth estimation, visual odometry (VO) and camera relocalization are fundamental for applications such as autonomous driving, robots and drones. Driven by the power of deep learning, significant progress has been achieved on individual tasks but the rich correlations among the three tasks are largely neglected. In previous studies, VO is generally accurate in local scope yet suffers from drift in long distances. By contrast, camera relocalization performs well in the global sense but lacks local precision. We argue that these two tasks should be strategically combined to leverage the complementary advantages, and be further improved by exploiting the 3D geometric information from depth data, which is also beneficial for depth estimation in turn. Therefore, we present a collaborative learning framework, consisting of DepthNet, LocalPoseNet and GlobalPoseNet with a joint optimization loss to estimate depth, VO and camera localization unitedly. Moreover, the Geometric Attention Guidance Model is introduced to exploit the geometric relevance among three branches during learning. Extensive experiments demonstrate that the joint learning scheme is useful for all tasks and our method outperforms current state-of-the-art techniques in depth estimation and camera relocalization with highly competitive performance in VO.
【Keywords】: Computer Vision: 2D and 3D Computer Vision; Robotics: Localization, Mapping, State Estimation;
【Paper Link】 【Pages】:495-501
【Authors】: Elena Burceanu ; Marius Leordeanu
【Abstract】: We formulate object segmentation in video as a spectral graph clustering problem in space and time, in which nodes are pixels and their relations form local neighbourhoods. We claim that the strongest cluster in this pixel-level graph represents the salient object segmentation. We compute the main cluster using a novel and fast 3D filtering technique that finds the spectral clustering solution, namely the principal eigenvector of the graph's adjacency matrix, without building the matrix explicitly - which would be intractable. Our method is based on the power iteration, which we prove is equivalent to performing a specific set of 3D convolutions in the space-time feature volume. This allows us to avoid creating the matrix and have a fast parallel implementation on GPU. We show that our method is much faster than classical power iteration applied directly on the adjacency matrix. Different from other works, ours is dedicated to preserving object consistency in space and time at the level of pixels. In experiments, we obtain consistent improvement over the top state-of-the-art methods on the DAVIS-2016 dataset. We also achieve top results on the well-known SegTrackv2 dataset.
【Keywords】: Computer Vision: Structural and Model-Based Approaches, Knowledge Representation and Reasoning; Data Mining: Clustering, Unsupervised Learning; Data Mining: Theoretical Foundations; Computer Vision: Motion and Tracking;
【Paper Link】 【Pages】:502-508
【Authors】: Qiyao Deng ; Jie Cao ; Yunfan Liu ; Zhenhua Chai ; Qi Li ; Zhenan Sun
【Abstract】: Face portrait editing has achieved great progress in recent years. However, previous methods either 1) operate on pre-defined face attributes, lacking the flexibility of controlling shapes of high-level semantic facial components (e.g., eyes, nose, mouth), or 2) take a manually edited mask or sketch as an intermediate representation for observable changes, but such additional input usually requires extra effort to obtain. To break the limitations (e.g. shape, mask or sketch) of the existing methods, we propose a novel framework termed r-FACE (Reference-guided FAce Component Editing) for diverse and controllable face component editing with geometric changes. Specifically, r-FACE takes an image inpainting model as the backbone, utilizing reference images as conditions for controlling the shape of face components. In order to encourage the framework to concentrate on the target face components, an example-guided attention module is designed to fuse attention features and the target face component features extracted from the reference image. Through extensive experimental validation and comparisons, we verify the effectiveness of the proposed framework.
【Keywords】: Computer Vision: 2D and 3D Computer Vision; Computer Vision: Biometrics, Face and Gesture Recognition;
【Paper Link】 【Pages】:509-515
【Authors】: Xiao Wang ; Jun Chen ; Zheng Wang ; Wu Liu ; Shin'ichi Satoh ; Chao Liang ; Chia-Wen Lin
【Abstract】: Pedestrian detection at nighttime is a crucial and frontier problem in surveillance, but has not been well explored by the computer vision and artificial intelligence communities. Most existing methods detect pedestrians under favorable lighting conditions (e.g. daytime) and achieve promising performances. In contrast, they often fail under unstable lighting conditions (e.g. nighttime). Night is a critical time for criminal suspects to act in the field of security. The existing nighttime pedestrian detection dataset is captured by a car camera, specially designed for autonomous driving scenarios. A dataset for the nighttime surveillance scenario is still vacant. There are vast differences between autonomous driving and surveillance, including viewpoint and illumination. In this paper, we build a novel pedestrian detection dataset from the nighttime surveillance aspect: NightSurveillance. As a benchmark dataset for pedestrian detection at nighttime, we compare the performances of state-of-the-art pedestrian detectors, and the results reveal that these methods cannot solve all the challenging problems of NightSurveillance. We believe that NightSurveillance can further advance the research of pedestrian detection, especially in the field of surveillance security at nighttime.
【Keywords】: Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation; Computer Vision: Video: Events, Activities and Surveillance; Computer Vision: Other;
【Paper Link】 【Pages】:516-522
【Authors】: Jian Ye ; Zhe Chen ; Juhua Liu ; Bo Du
【Abstract】: Arbitrary shape text detection in natural scenes is an extremely challenging task. Unlike existing text detection approaches that only perceive texts based on limited feature representations, we propose a novel framework, namely TextFuseNet, to exploit the use of richer features fused for text detection. More specifically, we propose to perceive texts from three levels of feature representations, i.e., character-, word- and global-level, and then introduce a novel text representation fusion technique to help achieve robust arbitrary text detection. The multi-level feature representation can adequately describe texts by dissecting them into individual characters while still maintaining their general semantics. TextFuseNet then collects and merges the texts’ features from different levels using a multi-path fusion architecture which can effectively align and fuse different representations. In practice, our proposed TextFuseNet can learn a more adequate description of arbitrary shapes texts, suppressing false positives and producing more accurate detection results. Our proposed framework can also be trained with weak supervision for those datasets that lack character-level annotations. Experiments on several datasets show that the proposed TextFuseNet achieves state-of-the-art performance. Specifically, we achieve an F-measure of 94.3% on ICDAR2013, 92.1% on ICDAR2015, 87.1% on Total-Text and 86.6% on CTW-1500, respectively.
【Keywords】: Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation; Machine Learning: Deep Learning: Convolutional networks; Machine Learning: Semi-Supervised Learning;
【Paper Link】 【Pages】:523-529
【Authors】: Zhengsu Chen ; Jianwei Niu ; Xuefeng Liu ; Shaojie Tang
【Abstract】: Convolutional neural networks (CNNs) have achieved remarkable success in image recognition. Although the internal patterns of the input images are effectively learned by the CNNs, these patterns only constitute a small proportion of the useful patterns contained in the input images. This can be attributed to the fact that the CNNs will stop learning if the learned patterns are enough to make a correct classification. Network regularization methods like dropout and SpatialDropout can ease this problem. During training, they randomly drop the features. These dropout methods, in essence, change the patterns learned by the networks and, in turn, force the networks to learn other patterns to make the correct classification. However, the above methods have an important drawback. Randomly dropping features is generally inefficient and can introduce unnecessary noise. To tackle this problem, we propose SelectScale. Instead of randomly dropping units, SelectScale selects the important features in networks and adjusts them during training. Using SelectScale, we improve the performance of CNNs on CIFAR and ImageNet.
【Keywords】: Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation;
【Paper Link】 【Pages】:530-536
【Authors】: Qiankun Liu ; Qi Chu ; Bin Liu ; Nenghai Yu
【Abstract】: The popular tracking-by-detection paradigm for multi-object tracking (MOT) focuses on solving the data association problem, at the heart of which lies a robust similarity model. Most previous works devote effort to improving feature representations for individual objects while leaving the relations among objects less explored, which may be problematic in some complex scenarios. In this paper, we focus on leveraging the relations among objects to improve the robustness of the similarity model. To this end, we propose a novel graph representation that takes both the feature of the individual object and the relations among objects into consideration. Besides, a graph matching module is specially designed for the proposed graph representation to alleviate the impact of unreliable relations. With the help of the graph representation and the graph matching module, the proposed graph similarity model, named GSM, is more robust to occlusion and to targets sharing similar appearance. We conduct extensive experiments on challenging MOT benchmarks and the experimental results demonstrate the effectiveness of the proposed method.
【Keywords】: Computer Vision: Motion and Tracking;
【Paper Link】 【Pages】:537-543
【Authors】: Feng Li ; Runming Cong ; Huihui Bai ; Yifan He
【Abstract】: Recently, Convolutional Neural Network (CNN) based image super-resolution (SR) has shown significant success in the literature. However, these methods are implemented as a single-path stream to enrich feature maps from the input for the final prediction, which fails to fully incorporate former low-level features into later high-level features. In this paper, to tackle this problem, we propose a deep interleaved network (DIN) to learn how information at different states should be combined for image SR, where shallow information guides deep representative feature prediction. Our DIN follows a multi-branch pattern allowing multiple interconnected branches to interleave and fuse at different states. Besides, the asymmetric co-attention (AsyCA) is proposed and attached to the interleaved nodes to adaptively emphasize informative features from different states and improve the discriminative ability of networks. Extensive experiments demonstrate the superiority of our proposed DIN in comparison with the state-of-the-art SR methods.
【Keywords】: Computer Vision: 2D and 3D Computer Vision;
【Paper Link】 【Pages】:544-550
【Authors】: Tong Wu ; Bicheng Dai ; Shuxin Chen ; Yanyun Qu ; Yuan Xie
【Abstract】: Despite recent great progress on semantic segmentation, there still exist huge challenges in medical ultra-resolution image segmentation. Methods based on a multi-branch structure can strike a good balance between computational burden and segmentation accuracy. However, the fusion structure in these methods requires elaborate design to achieve desirable results, which leads to model redundancy. In this paper, we propose the Meta Segmentation Network (MSN) to solve this challenging problem. With the help of meta-learning, the fusion module of MSN is quite simple but effective. MSN can quickly generate the weights of fusion layers through a simple meta-learner, requiring only a few training samples and epochs to converge. In addition, to avoid learning all branches from scratch, we further introduce a particular weight-sharing mechanism to realize fast knowledge adaptation and share the weights among multiple branches, resulting in performance improvement and significant parameter reduction. The experimental results on two challenging ultra-resolution medical datasets, BACH and ISIC, show that MSN achieves the best performance compared with the state-of-the-art approaches.
【Keywords】: Computer Vision: 2D and 3D Computer Vision; Computer Vision: Biomedical Image Understanding;
【Paper Link】 【Pages】:551-557
【Authors】: Kun Wei ; Cheng Deng ; Xu Yang
【Abstract】: Zero-Shot Learning (ZSL) handles the problem that some testing classes never appear in the training set. Existing ZSL methods are designed for learning from a fixed training set and do not have the ability to capture and accumulate the knowledge of multiple training sets, making them infeasible for many real-world applications. In this paper, we propose a new ZSL setting, named Lifelong Zero-Shot Learning (LZSL), which aims to accumulate knowledge while learning from multiple datasets and recognize unseen classes of all trained datasets. Besides, a novel method is proposed to realize LZSL, which effectively alleviates catastrophic forgetting in the continuous training process. Specifically, considering that those datasets contain different semantic embeddings, we utilize a Variational Auto-Encoder to obtain unified semantic representations. Then, we leverage a selective retraining strategy to preserve the trained weights of previous tasks and avoid negative transfer when fine-tuning the entire model. Finally, knowledge distillation is employed to transfer knowledge from previous training stages to the current stage. We also design the LZSL evaluation protocol and the challenging benchmarks. Extensive experiments on these benchmarks indicate that our method tackles the LZSL problem effectively, while existing ZSL methods fail.
【Keywords】: Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation; Machine Learning: Deep Generative Models; Machine Learning: Transfer, Adaptation, Multi-task Learning;
【Paper Link】 【Pages】:558-565
【Authors】: Haytham M. Fayek ; Anurag Kumar
【Abstract】: Recognizing sounds is a key aspect of computational audio scene analysis and machine perception. In this paper, we advocate that sound recognition is inherently a multi-modal audiovisual task, in that it is easier to differentiate sounds using both the audio and visual modalities than using one or the other alone. We present an audiovisual fusion model that learns to recognize sounds from weakly labeled video recordings. The proposed fusion model utilizes an attention mechanism to dynamically combine the outputs of the individual audio and visual models. Experiments on the large-scale sound events dataset AudioSet demonstrate the efficacy of the proposed model, which outperforms the single-modal models as well as state-of-the-art fusion and multi-modal models. We achieve a mean Average Precision (mAP) of 46.16 on AudioSet, outperforming the prior state of the art by approximately +4.35 mAP (relative: 10.4%).
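As a rough illustration of the attention-based fusion described in this abstract (a sketch, not the paper's actual architecture), the two modality outputs can be combined with weights predicted from the modalities themselves; the projection matrix `w_att`, the function names, and all shapes here are illustrative assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_fusion(audio_logits, visual_logits, w_att):
    """Dynamically combine per-modality class scores with learned attention.

    audio_logits, visual_logits: (batch, classes) outputs of the two models.
    w_att: (2 * classes, 2) hypothetical projection producing one attention
    score per modality; softmax turns the scores into convex weights.
    """
    feats = np.concatenate([audio_logits, visual_logits], axis=1)  # (batch, 2C)
    att = softmax(feats @ w_att, axis=1)                           # (batch, 2)
    stacked = np.stack([audio_logits, visual_logits], axis=1)      # (batch, 2, C)
    return (att[:, :, None] * stacked).sum(axis=1)                 # (batch, C)
```

With a zero projection the attention is uniform and the fusion reduces to averaging the two modality scores; a learned projection lets the weights vary per example.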
【Keywords】: Computer Vision: Video: Events, Activities and Surveillance; Machine Learning: Multi-instance;Multi-label;Multi-view learning; Natural Language Processing: Speech;
【Paper Link】 【Pages】:566-572
【Authors】: Menglu Wang ; Xueyang Fu ; Zepei Sun ; Zheng-Jun Zha
【Abstract】: Existing deep learning-based image de-blocking methods use only pixel-level loss functions to guide network training. The JPEG compression factor, which reflects the degree of degradation, has not been fully utilized. However, due to its non-differentiability, the compression factor cannot be directly used to train deep networks. To solve this problem, we propose compression quality ranker-guided networks specifically for JPEG artifact removal. We first design a quality ranker to measure the compression degree, which is highly correlated with the JPEG quality. Based on this differentiable ranker, we then propose a quality-related loss and a feature matching loss to guide de-blocking and perceptual quality optimization. In addition, we utilize dilated convolutions to extract multi-scale features, which enables our single model to handle multiple compression quality factors. Our method can implicitly use the information contained in the compression factors to produce better results. Experiments demonstrate that our model achieves comparable or even better performance in both quantitative and qualitative measurements.
【Keywords】: Computer Vision: Perception; Computer Vision: Computational Photography, Photometry, Shape from X; Machine Learning: Deep Learning: Convolutional networks;
【Paper Link】 【Pages】:573-579
【Authors】: Siddhartha Gairola ; Mayur Hemani ; Ayush Chopra ; Balaji Krishnamurthy
【Abstract】: Few-shot segmentation (FSS) methods perform image segmentation for a particular object class in a target (query) image, using a small set of (support) image-mask pairs. Recent deep neural network-based FSS methods leverage high-dimensional feature similarity between the foreground features of the support images and the query image features. In this work, we demonstrate gaps in the utilization of this similarity information in existing methods, and present a framework, SimPropNet, to bridge those gaps. We propose to jointly predict the support and query masks to force the support features to share characteristics with the query features. We also propose to utilize similarities in the background regions of the query and support images using a novel foreground-background attentive fusion mechanism. Our method achieves state-of-the-art results for one-shot and five-shot segmentation on the PASCAL-5i dataset. The paper includes detailed analysis and ablation studies for the proposed improvements and quantitative comparisons with contemporary methods.
【Keywords】: Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation; Machine Learning: Deep Learning: Convolutional networks; Machine Learning: Semi-Supervised Learning;
【Paper Link】 【Pages】:580-586
【Authors】: Jiangzhang Gan ; Xiaofeng Zhu ; Rongyao Hu ; Yonghua Zhu ; Junbo Ma ; Zi-Wen Peng ; Guorong Wu
【Abstract】: Brain functional connectivity analysis on fMRI data can improve the understanding of human brain function. However, due to inter-subject variability and heterogeneity across subjects, previous methods of functional connectivity analysis are often insufficient in capturing disease-related representations, thereby decreasing disease diagnosis performance. In this paper, we first propose a new multi-graph fusion framework to fine-tune the original representation derived from Pearson correlation analysis, and then employ an L1-SVM on the fine-tuned representations to conduct joint brain region selection and disease diagnosis, avoiding the curse of dimensionality on high-dimensional data. The multi-graph fusion framework automatically learns the connectivity number for every node (i.e., brain region) and integrates all subjects in a unified framework to output homogeneous and discriminative representations for all subjects. Experimental results on two real datasets, i.e., fronto-temporal dementia (FTD) and obsessive-compulsive disorder (OCD), verified the effectiveness of our proposed framework compared to state-of-the-art methods.
【Keywords】: Computer Vision: Biomedical Image Understanding; Machine Learning: Classification; Data Mining: Classification, Semi-Supervised Learning;
【Paper Link】 【Pages】:587-593
【Authors】: Tao He ; Lianli Gao ; Jingkuan Song ; Jianfei Cai ; Yuan-Fang Li
【Abstract】: Despite the huge progress in scene graph generation in recent years, the long-tail distribution of object relationships remains a challenging and persistent issue. Existing methods largely rely on either external knowledge or statistical bias information to alleviate this problem. In this paper, we tackle the issue from two other aspects: (1) scene-object interaction, which aims at learning scene-specific knowledge via an additive attention mechanism; and (2) long-tail knowledge transfer, which tries to transfer the rich knowledge learned from the head of the distribution to its tail. Extensive experiments on three tasks on the benchmark dataset Visual Genome demonstrate that our method outperforms current state-of-the-art competitors. Our source code is available at https://github.com/htlsn/issg.
【Keywords】: Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation; Machine Learning: Deep Learning;
【Paper Link】 【Pages】:594-600
【Authors】: Lianli Gao ; Zhilong Zhou ; Heng Tao Shen ; Jingkuan Song
【Abstract】: Image edge detection is considered a cornerstone task in computer vision. Due to the hierarchical nature of the representations learned by CNNs, it is intuitive to design side networks that exploit the richer convolutional features to improve edge detection. However, there is no consensus on how to integrate the hierarchical information. In this paper, we propose an effective end-to-end framework, named Bidirectional Additive Net (BAN), for image edge detection. In the proposed framework, we focus on two main problems: 1) how to design a universal network that incorporates hierarchical information sufficiently; and 2) how to achieve effective information flow between different stages and gradually improve the edge map stage by stage. To tackle these problems, we design a consecutive bottom-up and top-down architecture, where the bottom-up branch gradually removes detailed or sharp boundaries to enable accurate edge detection, and the top-down branch offers a chance for error correction by revisiting the low-level features that contain rich textural and spatial information. An attended additive module (AAM) is designed to cumulatively refine edges by selecting pivotal features in each stage. Experimental results show that our proposed methods set new records for edge detection and achieve state-of-the-art results on two public benchmarks: BSDS500 and NYUDv2.
【Keywords】: Computer Vision: Perception; Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation;
【Paper Link】 【Pages】:601-607
【Authors】: Jiachen Xu ; Jingyu Gong ; Jie Zhou ; Xin Tan ; Yuan Xie ; Lizhuang Ma
【Abstract】: Besides local features, global information plays an essential role in semantic segmentation, yet recent works usually fail to explicitly extract meaningful global information and make full use of it. In this paper, we propose a SceneEncoder module that imposes scene-aware guidance to enhance the effect of global information. The module predicts a scene descriptor, which learns to represent the categories of objects existing in the scene and directly guides point-level semantic segmentation by filtering out categories not belonging to this scene. Additionally, to alleviate segmentation noise in local regions, we design a region similarity loss that propagates distinguishing features to neighboring points with the same label, enhancing the distinguishing ability of point-wise features. We integrate our methods into several prevailing networks and conduct extensive experiments on the benchmark datasets ScanNet and ShapeNet. Results show that our methods greatly improve the performance of the baselines and achieve state-of-the-art performance.
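The scene-descriptor guidance in this abstract can be sketched as gating point-wise class probabilities with a scene-level class-presence vector; this is a hedged illustration of the filtering idea only, with all names and shapes assumed, not the SceneEncoder's actual implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x, axis=-1):
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scene_guided_scores(point_logits, scene_logits):
    """Filter per-point class probabilities with a scene descriptor.

    point_logits: (num_points, classes) raw segmentation scores.
    scene_logits: (classes,) scene-level scores; after a sigmoid, values
    near zero suppress classes the descriptor deems absent from the scene.
    """
    probs = softmax(point_logits, axis=-1)   # per-point class probabilities
    presence = sigmoid(scene_logits)         # scene-level class presence
    return probs * presence[None, :]         # absent classes are filtered out
```

The point of the gating is that even a confidently mis-scored class at one point is suppressed when the scene-level descriptor says the class cannot occur in this scene.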
【Keywords】: Computer Vision: 2D and 3D Computer Vision; Computer Vision: Perception; Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation;
【Paper Link】 【Pages】:608-614
【Authors】: Vikram Ravindra ; Ananth Grama
【Abstract】: The problem of characterizing brain functions such as memory, perception, and processing of stimuli has received significant attention in the neuroscience literature. These experiments rely on carefully calibrated, albeit complex, inputs to record brain response to signals. A major problem in analyzing brain response to common stimuli, such as audio-visual input from videos (e.g., movies) or story narration through audio books, is that observed neuronal responses are due to combinations of "pure" factors, many of which may be latent. In this paper, we present a novel methodological framework for deconvolving the brain's response to mixed stimuli into its constituent responses to underlying pure factors. This framework, based on archetypal analysis, is applied to the analysis of imaging data from an adult cohort watching the BBC show Sherlock. Focusing on the visual stimulus, we show strong correlation between our observed deconvolved response and third-party textual video annotations, demonstrating the significant power of our analysis techniques. Building on these results, we show that our techniques can be used to predict neuronal responses in new subjects (how other individuals react to Sherlock), as well as to new visual content (how individuals react to other videos with known annotations). This paper reports on the first study that relates video features to neuronal responses in a rigorous algorithmic and statistical framework based on deconvolution of observed mixed imaging signals using archetypal analysis.
【Keywords】: Computer Vision: Biomedical Image Understanding; Machine Learning: Feature Selection; Learning Sparse Models; Humans and AI: Brain Sciences;
【Paper Link】 【Pages】:615-622
【Authors】: Han-Yi Lin ; Pi-Cheng Hsiu ; Tei-Wei Kuo ; Yen-Yu Lin
【Abstract】: Spatiotemporal super-resolution (SR) aims to upscale both the spatial and temporal dimensions of input videos, and produces videos with higher frame resolutions and rates. It involves two essential sub-tasks: spatial SR and temporal SR. We design a two-stream network for spatiotemporal SR in this work. One stream contains a temporal SR module followed by a spatial SR module, while the other stream has the same two modules in the reverse order. Based on the interchangeability of performing the two sub-tasks, the two network streams are supposed to produce consistent spatiotemporal SR results. Thus, we present a cross-stream consistency to enforce the similarity between the outputs of the two streams. In this way, the training of the two streams is correlated, which allows the two SR modules to share their supervisory signals and improve each other. In addition, the proposed cross-stream consistency does not consume labeled training data and can guide network training in an unsupervised manner. We leverage this property to carry out semi-supervised spatiotemporal SR. It turns out that our method makes the most of training data, and can derive an effective model with few high-resolution and high-frame-rate videos, achieving the state-of-the-art performance. The source code of this work is available at https://hankweb.github.io/STSRwithCrossTask/.
【Keywords】: Computer Vision: Other; Machine Learning: Deep Learning: Convolutional networks; Machine Learning: Semi-Supervised Learning;
【Paper Link】 【Pages】:623-629
【Authors】: Siyu Huang ; Haoyi Xiong ; Zhi-Qi Cheng ; Qingzhong Wang ; Xingran Zhou ; Bihan Wen ; Jun Huan ; Dejing Dou
【Abstract】: Generation of high-quality person images is challenging due to the sophisticated entanglements among image factors, e.g., appearance, pose, foreground, background, local details, and global structures. In this paper, we present a novel end-to-end framework to generate realistic person images based on given person poses and appearances. The core of our framework is a novel generator called Appearance-aware Pose Stylizer (APS), which generates human images by progressively coupling the target pose with the conditioned person appearance. The framework is highly flexible and controllable, effectively decoupling various complex person image factors in the encoding phase and re-coupling them in the decoding phase. In addition, we present a new normalization method named adaptive patch normalization, which enables region-specific normalization and performs well when adopted in person image generation models. Experiments on two benchmark datasets show that our method is capable of generating visually appealing and realistic-looking results using arbitrary image and pose inputs.
【Keywords】: Computer Vision: Other; Machine Learning: Deep Generative Models;
【Paper Link】 【Pages】:630-636
【Authors】: Tao Jin ; Siyu Huang ; Ming Chen ; Yingming Li ; Zhongfei Zhang
【Abstract】: In this paper, we focus on the problem of effectively applying the transformer structure to video captioning. The vanilla transformer was proposed for uni-modal language generation tasks such as machine translation. However, video captioning is a multimodal learning problem, and video features exhibit much redundancy between different time steps. Based on these concerns, we propose a novel method called sparse boundary-aware transformer (SBAT) to reduce the redundancy in video representations. SBAT employs a boundary-aware pooling operation on the scores from multi-head attention and selects diverse features from different scenarios. SBAT also includes a local correlation scheme to compensate for the local information loss brought by the sparse operation. Based on SBAT, we further propose an aligned cross-modal encoding scheme to boost the multimodal interaction. Experimental results on two benchmark datasets show that SBAT outperforms the state-of-the-art methods under most of the metrics.
【Keywords】: Computer Vision: Language and Vision; Computer Vision: Video: Events, Activities and Surveillance;
【Paper Link】 【Pages】:637-644
【Authors】: Xinjian Huang ; Bo Du ; Weiwei Liu
【Abstract】: The R, G, and B channels of a color image generally have different noise statistical properties or noise strengths. It is thus problematic to apply grayscale image denoising algorithms to color image denoising. In this paper, based on the non-local self-similarity of an image and the different noise strength of each channel, we propose a MultiChannel Weighted Schatten p-Norm Minimization (MCWSNM) model for RGB color image denoising. More specifically, considering a small local RGB patch in a noisy image, we first find its non-local similar cubic patches in a search window of appropriate size. These similar cubic patches are then vectorized and grouped to construct a noisy low-rank matrix, which can be recovered using the Schatten p-norm minimization framework. Moreover, a weight matrix is introduced to balance each channel's contribution to the final denoising result. The proposed MCWSNM can be solved via the alternating direction method of multipliers. The convergence property of the proposed method is also theoretically analyzed. Experiments conducted on both synthetic and real noisy color image datasets demonstrate highly competitive denoising performance, outperforming comparison algorithms, including several methods based on neural networks.
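For intuition on the low-rank recovery step: in the special case p = 1, the weighted Schatten p-norm proximal step reduces to weighted singular value thresholding, which a few lines of numpy can sketch. This is only that special case under assumed uniform weighting; the full MCWSNM, with its per-channel weight matrix, general p, and ADMM solver, is more involved:

```python
import numpy as np

def weighted_svt(Y, weights, tau):
    """Weighted singular value thresholding (p = 1 special case of
    weighted Schatten p-norm minimization).

    Y: noisy matrix of grouped, vectorized similar patches.
    weights: per-singular-value weights (larger weight -> stronger shrinkage).
    tau: global threshold controlling the amount of denoising.
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    # Soft-threshold each singular value by its own weighted amount.
    s_shrunk = np.maximum(s - tau * weights[: len(s)], 0.0)
    return U @ np.diag(s_shrunk) @ Vt
```

Shrinking small singular values toward zero suppresses noise while the dominant singular directions, which carry the shared patch structure, survive.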
【Keywords】: Computer Vision: 2D and 3D Computer Vision; Computer Vision: Other; Machine Learning Applications: Applications of Unsupervised Learning;
【Paper Link】 【Pages】:645-651
【Authors】: Yawen Huang ; Feng Zheng ; Danyang Wang ; Junyu Jiang ; Xiaoqian Wang ; Ling Shao
【Abstract】: Image super-resolution (SR) and image inpainting are two topical problems in medical image processing. Existing methods for solving these problems are either tailored to recovering a high-resolution version of a low-resolution image or focused on filling missing values, thus inevitably giving rise to poor performance when the acquisitions suffer from multiple degradations. In this paper, we explore the possibility of super-resolving and inpainting images to handle multiple degradations and therefore improve their usability. We construct a unified and scalable framework to overcome the drawback of propagated errors caused by independent learning. We additionally improve over previously proposed super-resolution approaches by modeling image degradation directly from data observations rather than by bicubic downsampling. To this end, we propose HLH-GAN, which includes a high-to-low (H-L) GAN together with a low-to-high (L-H) GAN in a cyclic pipeline for solving the medical image degradation problem. Our comparative evaluation demonstrates the effectiveness of the proposed method on different brain MRI datasets. In addition, our method outperforms many existing super-resolution and inpainting approaches.
【Keywords】: Computer Vision: Biomedical Image Understanding; Computer Vision: Other; Machine Learning: Deep Learning;
【Paper Link】 【Pages】:652-658
【Authors】: Zhikun Huang ; Zhedong Zheng ; Chenggang Yan ; Hongtao Xie ; Yaoqi Sun ; Jianzhong Wang ; Jiyong Zhang
【Abstract】: This paper focuses on the real-world automatic makeup problem. Given one non-makeup target image and one reference image, automatic makeup is to generate one face image that maintains the original identity while having the makeup style of the reference image. In the real-world scenario, the face makeup task demands a system robust to environmental variations. The two main challenges in real-world face makeup can be summarized as follows: first, the background in real-world images is complicated, and previous methods are prone to changing the style of the background as well; second, the foreground faces are also easily affected; for instance, "heavy" makeup may lose the discriminative information of the original identity. To address these two challenges, we introduce a new makeup model, called Identity Preservation Makeup Net (IPM-Net), which preserves not only the background but also the critical patterns of the original identity. Specifically, we disentangle face images into two different information codes, i.e., an identity content code and a makeup style code. At inference time, we only need to change the makeup style code to generate various makeup images of the target person. In the experiments, we show the proposed method not only achieves better accuracy in both realism (FID) and diversity (LPIPS) on the test set, but also works well on real-world images collected from the Internet.
【Keywords】: Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation;
【Paper Link】 【Pages】:659-665
【Authors】: Ziwei Wang ; Zi Huang ; Yadan Luo
【Abstract】: Image captioning aims to describe an image with a concise, accurate, and interesting sentence. To build such an automatic neural captioner, traditional models align the generated words with a number of human-annotated sentences to mimic human-like captions. However, the crowd-sourced annotations inevitably come with data quality issues such as grammatical errors, wrong identification of visual objects, and sub-optimal sentence focus. During model training, existing methods treat all annotations equally regardless of their quality. In this work, we explicitly engage human consensus to measure the quality of ground-truth captions in advance, and directly encourage the model to learn high-quality captions with high priority. The proposed consensus-oriented method can therefore accelerate the training process and achieve superior performance with only a supervised objective, without time-consuming reinforcement learning. The novel consensus loss can be implemented in most of the existing state-of-the-art methods, boosting the BLEU-4 performance by a maximum relative 12.47% compared to the conventional cross-entropy loss. Extensive experiments conducted on the MS-COCO image captioning dataset demonstrate that the proposed human consensus-oriented training method can significantly improve training efficiency and model effectiveness.
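The consensus-oriented idea of weighting each reference caption's loss by a precomputed quality score can be sketched as follows. The function name, shapes, and the assumption that consensus scores are given and sum to one are illustrative, not the paper's exact formulation:

```python
import numpy as np

def consensus_weighted_nll(log_probs, targets, consensus_scores):
    """Cross-entropy over reference captions, each weighted by a consensus
    score measuring its agreement with the other references.

    log_probs: (num_refs, seq_len, vocab) model log-probabilities.
    targets: (num_refs, seq_len) integer token ids of the references.
    consensus_scores: (num_refs,) precomputed quality weights, summing to 1,
    so high-consensus captions dominate the training signal.
    """
    n, t = targets.shape
    # Negative log-likelihood of each ground-truth token.
    tok_nll = -log_probs[np.arange(n)[:, None], np.arange(t)[None, :], targets]
    per_caption = tok_nll.mean(axis=1)            # (num_refs,)
    return float((consensus_scores * per_caption).sum())
```

With uniform scores this reduces to the conventional averaged cross-entropy; skewed scores push the model toward the captions humans agree on.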
【Keywords】: Computer Vision: Language and Vision; Machine Learning: Deep Learning; Natural Language Processing: Machine Translation;
【Paper Link】 【Pages】:666-672
【Authors】: Changbin Shao ; Jing Huo ; Lei Qi ; Zhen-Hua Feng ; Wenbin Li ; Chuanqi Dong ; Yang Gao
【Abstract】: To address the challenges posed by unknown occlusions, we propose a Biased Feature Learning (BFL) framework for occlusion-invariant face recognition. We first construct an extended dataset using a multi-scale data augmentation method. For model training, we modify the label loss to adjust the impact of normal and occluded samples. Further, we propose a biased guidance strategy to manipulate the optimization of a network so that the feature embedding space is dominated by non-occluded faces. BFL not only enhances the robustness of a network to unknown occlusions but also maintains or even improves its performance for normal faces. Experimental results demonstrate its superiority as well as the generalization capability with different network architectures and loss functions.
【Keywords】: Computer Vision: Biometrics, Face and Gesture Recognition;
【Paper Link】 【Pages】:673-679
【Authors】: Mingbao Lin ; Rongrong Ji ; Yuxin Zhang ; Baochang Zhang ; Yongjian Wu ; Yonghong Tian
【Abstract】: Channel pruning is among the predominant approaches to compressing deep neural networks. To this end, most existing pruning methods focus on selecting channels (filters) by importance/optimization or by regularization based on rule-of-thumb designs, which results in sub-optimal pruning. In this paper, we propose a new channel pruning method based on the artificial bee colony (ABC) algorithm, dubbed ABCPruner, which aims to efficiently find the optimal pruned structure, i.e., the channel number in each layer, rather than selecting "important" channels as previous works did. To cope with the intractably huge number of combinations of pruned structures for deep networks, we first propose to shrink the combinations by limiting the preserved channels to a specific space, so that the number of combinations can be significantly reduced. We then formulate the search for the optimal pruned structure as an optimization problem and integrate the ABC algorithm to solve it automatically, lessening human interference. ABCPruner has been demonstrated to be more effective, and it also enables fine-tuning to be conducted efficiently in an end-to-end manner. The source code is available at https://github.com/lmbxmu/ABCPruner.
【Keywords】: Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation; Machine Learning: Deep Learning: Convolutional networks;
【Paper Link】 【Pages】:680-686
【Authors】: Rui Zheng ; Fei Jiang ; Ruimin Shen
【Abstract】: Students’ gestures, such as hand-raising, standing up, and sleeping, indicate the engagement of students in classrooms and partially reflect teaching quality. Therefore, recognizing these gestures quickly and automatically is of great importance. Due to the limited computational resources in primary and secondary schools, we propose a real-time student behavior detector based on the light-weight MobileNetV2-SSD to reduce the dependency on GPUs. Firstly, we build a large-scale corpus from real schools to capture various behavior gestures, and based on this corpus we cast gesture recognition as object detection. Secondly, we design a multi-dimensional attention-based detector, named GestureDet, for real-time and accurate gesture analysis. The multi-dimensional attention mechanisms simultaneously consider all dimensions of the training set, paying more attention to discriminative features and samples that are important for the final performance. Specifically, the spatial attention is constructed with stacked dilated convolution layers to generate a soft, learnable mask for re-weighting foreground and background features; the channel attention introduces context modeling and a squeeze-and-excitation module to focus on discriminative features; and the batch attention discriminates important samples with a newly designed re-weighting strategy. Experimental results demonstrate the effectiveness and versatility of GestureDet, which achieves 75.2% mAP on a real student behavior dataset and 74.5% mAP on the public PASCAL VOC dataset at 20 fps on the embedded device Nvidia Jetson TX2. Code will be made publicly available.
【Keywords】: Computer Vision: Biometrics, Face and Gesture Recognition; Multidisciplinary Topics and Applications: Real-Time Systems; Humans and AI: Computer-Aided Education;
【Paper Link】 【Pages】:687-693
【Authors】: Xiaoze Jiang ; Jing Yu ; Yajing Sun ; Zengchang Qin ; Zihao Zhu ; Yue Hu ; Qi Wu
【Abstract】: The Visual Dialogue task requires an agent to engage in a conversation with a human about an image. The ability to generate detailed and non-repetitive responses is crucial for the agent to achieve human-like conversation. In this paper, we propose a novel generative decoding architecture to generate high-quality responses, which moves away from decoding the whole encoded semantics towards a design that advocates both transparency and flexibility. In this architecture, word generation is decomposed into a series of attention-based information selection steps, performed by the novel recurrent Deliberation, Abandon and Memory (DAM) module. Each DAM module performs an adaptive combination of the response-level semantics captured from the encoder and the word-level semantics specifically selected for generating each word. Therefore, the responses contain more detailed and non-repetitive descriptions while maintaining semantic accuracy. Furthermore, DAM is flexible enough to cooperate with existing visual dialogue encoders and adapts to the encoder structures by constraining its information selection mode. We apply DAM to three typical encoders and verify the performance on the VisDial v1.0 dataset. Experimental results show that the proposed models achieve new state-of-the-art performance with high-quality responses. The code is available at https://github.com/JXZe/DAM.
【Keywords】: Computer Vision: Language and Vision; Natural Language Processing: Dialogue;
【Paper Link】 【Pages】:694-700
【Authors】: Yakun Ju ; Kin-Man Lam ; Yang Chen ; Lin Qi ; Junyu Dong
【Abstract】: We present an attention-weighted loss in a photometric stereo neural network to improve 3D surface recovery accuracy in complex-structured areas, such as edges and crinkles, where existing learning-based methods often fail. Instead of using a uniform penalty for all pixels, our method employs an attention-weighted loss learned in a self-supervised manner for each pixel, avoiding blurry reconstruction results in such difficult regions. The network first estimates a surface normal map and an adaptive attention map, and the latter is then used to calculate a pixel-wise attention-weighted loss that focuses on complex regions. In these regions, the attention-weighted loss applies higher weights to the detail-preserving gradient loss to produce clear surface reconstructions. Experiments on real datasets show that our approach significantly outperforms traditional photometric stereo algorithms and state-of-the-art learning-based methods.
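A minimal numpy sketch of such a per-pixel blend between a uniform L1 term and a detail-preserving gradient term, weighted by an attention map, is below; the specific loss terms and the linear blending scheme are assumptions for illustration rather than the paper's exact loss:

```python
import numpy as np

def attention_weighted_loss(pred_normals, gt_normals, attention):
    """Blend a pixel-wise L1 term with a gradient (detail) term, weighted
    per pixel by an attention map in [0, 1]: high attention (complex
    regions such as edges and crinkles) emphasizes the gradient loss.

    pred_normals, gt_normals: (H, W, 3) surface normal maps.
    attention: (H, W) attention map.
    """
    l1 = np.abs(pred_normals - gt_normals).mean(axis=-1)           # (H, W)
    # Finite-difference gradients; mismatch in gradients penalizes
    # blurred or over-smoothed detail.
    gx = np.diff(pred_normals, axis=1) - np.diff(gt_normals, axis=1)
    gy = np.diff(pred_normals, axis=0) - np.diff(gt_normals, axis=0)
    grad = np.zeros_like(l1)
    grad[:, :-1] += np.abs(gx).mean(axis=-1)
    grad[:-1, :] += np.abs(gy).mean(axis=-1)
    return float(((1 - attention) * l1 + attention * grad).mean())
```

With zero attention everywhere, the loss collapses to plain mean absolute error; the attention map shifts weight toward gradient agreement exactly where fine structure matters.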
【Keywords】: Computer Vision: 2D and 3D Computer Vision; Computer Vision: Computational Photography, Photometry, Shape from X; Machine Learning Applications: Applications of Supervised Learning;
【Paper Link】 【Pages】:701-707
【Authors】: Meng Lan ; Yipeng Zhang ; Qinning Xu ; Lefei Zhang
【Abstract】: In the semi-supervised video object segmentation (VOS) field, SiamMask has achieved competitive accuracy and the fastest running speed. However, its two-stage training procedure requires additional manual intervention, and using only single-level features does not fully exploit the rich hierarchical feature information. This paper proposes an efficient end-to-end Siamese network for VOS. In particular, a supervised sampling strategy is designed to optimize the training procedure, which facilitates training the entire model in an end-to-end manner. Moreover, a multi-level feature aggregation module is developed to enhance feature representational ability and improve segmentation accuracy. Experimental results on the DAVIS2016 and DAVIS2017 datasets show that the proposed approach outperforms SiamMask in accuracy at a similar FPS, and also achieves a good accuracy-speed trade-off compared with other state-of-the-art VOS algorithms.
【Keywords】: Computer Vision: Motion and Tracking; Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation;
【Paper Link】 【Pages】:708-715
【Authors】: Siyuan Li ; Zhi Zhang ; Ziyu Liu ; Anna Wang ; Linglong Qiu ; Feng Du
【Abstract】: Target localization and proposal generation are two essential subtasks in generic visual tracking, and it is challenging to address both efficiently. In this paper, we propose an efficient two-stage architecture that makes full use of the complementarity of the two subtasks to jointly achieve robust localization and high-quality proposal generation for the target. Specifically, our model performs a novel deformable central correlation operation via an online learning model in both stages to locate new target centers while generating target proposals in the vicinity of these centers. The proposals are refined in the refinement stage to further improve accuracy and robustness. Moreover, the model benefits from multi-level feature aggregation in a neck module and a feature enhancement module. We conduct extensive ablation studies to demonstrate the effectiveness of our proposed methods. Our tracker runs at over 30 FPS and sets a new state of the art on five tracking benchmarks: LaSOT, VOT2018, TrackingNet, GOT10k, and OTB2015.
【Keywords】: Computer Vision: Motion and Tracking; Machine Learning: Deep Learning: Convolutional networks; Computer Vision: Video: Events, Activities and Surveillance; Machine Learning: Deep Learning;
【Paper Link】 【Pages】:716-722
【Authors】: Chuanqi Dong ; Wenbin Li ; Jing Huo ; Zheng Gu ; Yang Gao
【Abstract】: Few-shot learning for visual recognition aims to adapt to novel unseen classes with only a few images. Recent work, especially that based on low-level information, has made great progress. Such work typically employs local representations (LRs), because LRs are more consistent between the seen and unseen classes. However, most of it is limited to an individual image-to-image or image-to-class measure, which cannot fully exploit the capabilities of LRs, especially in the context of a given task. This paper proposes an Adaptive Task-aware Local Representations Network (ATL-Net) to address this limitation by introducing episodic attention, which can adaptively select the important local patches across the entire task, mirroring the process of human recognition. We achieve markedly superior results on multiple benchmarks. On miniImageNet, ATL-Net gains 0.93% and 0.88% improvements over the compared methods under the 5-way 1-shot and 5-shot settings. Moreover, ATL-Net can naturally tackle the problem of how to adaptively identify and weight the importance of different key local parts, which is the major concern of fine-grained recognition. Specifically, on the fine-grained dataset Stanford Dogs, ATL-Net outperforms the second-best method by 5.39% and 9.69% under the 5-way 1-shot and 5-shot settings.
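The image-to-class measure that ATL-Net builds on can be illustrated with a minimal sketch: a query image is represented by a set of local descriptors, and its similarity to a class is the sum of each descriptor's best cosine match among that class's pooled local descriptors. The function names and the best-match (nearest-neighbour) choice here are illustrative assumptions, not the paper's exact formulation, which additionally weights patches with learned episodic attention.

```python
import numpy as np

def image_to_class_score(query_lrs, class_lrs):
    """Similarity of a query image to a class, from local representations (LRs).

    query_lrs: (m, d) array of m local descriptors of the query image.
    class_lrs: (n, d) array of local descriptors pooled from the class's
               support images.
    Each query descriptor contributes its best cosine similarity to any
    class descriptor; the image-to-class score sums these contributions.
    (Illustrative baseline; ATL-Net adds learned episodic attention.)
    """
    q = query_lrs / np.linalg.norm(query_lrs, axis=1, keepdims=True)
    c = class_lrs / np.linalg.norm(class_lrs, axis=1, keepdims=True)
    sims = q @ c.T                      # (m, n) cosine similarities
    return float(sims.max(axis=1).sum())

def classify(query_lrs, class_lr_list):
    """N-way classification: pick the class with the highest score."""
    scores = [image_to_class_score(query_lrs, c) for c in class_lr_list]
    return int(np.argmax(scores))
```

Because the score is assembled from local patches rather than one global embedding, it stays meaningful even when only one support image per class is available.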
【Keywords】: Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation;
【Paper Link】 【Pages】:723-730
【Authors】: Risheng Liu ; Zi Li ; Yuxi Zhang ; Xin Fan ; Zhongxuan Luo
【Abstract】: We address the challenging problem of deformable registration, which robustly and efficiently builds dense correspondences between images. Traditional approaches based on iterative energy optimization typically incur an expensive computational load. Recent learning-based methods can efficiently predict deformation maps by incorporating learnable deep networks. Unfortunately, these deep networks are designed to learn deterministic features for classification tasks, which are not necessarily optimal for registration. In this paper, we propose a novel bi-level optimization model that jointly learns deformation maps and features for image registration. The bi-level model takes the energy for deformation computation as the upper-level optimization while formulating the maximum a posteriori (MAP) estimation of the features as the lower-level optimization. Further, we design learnable deep networks to simultaneously optimize the cooperative bi-level model, yielding robust and efficient registration. These deep networks, derived from our bi-level optimization, constitute an unsupervised end-to-end framework for learning both features and deformations. Extensive experiments on image-to-atlas and image-to-image deformable registration with 3D brain MR datasets demonstrate that we achieve state-of-the-art performance in terms of accuracy, efficiency, and robustness.
【Keywords】: Computer Vision: Biomedical Image Understanding; Machine Learning: Deep Learning: Convolutional networks; Computer Vision: Statistical Methods and Machine Learning; Machine Learning: Unsupervised Learning;
【Paper Link】 【Pages】:731-737
【Authors】: Yingruo Fan ; Zhaojiang Lin
【Abstract】: Facial action unit (AU) intensity estimation aims to measure the intensity of different facial muscle movements. External knowledge, such as AU co-occurrence relationships, is typically leveraged to improve performance. However, AU characteristics may vary among individuals due to the different physiological structures of human faces. To this end, we propose a novel geometry-guided representation learning (G2RL) method for facial AU intensity estimation. Specifically, our backbone model is based on a heatmap regression framework, where the produced heatmaps reflect rich information associated with AU intensities and their spatial distributions. Besides, we incorporate external geometric knowledge into the backbone model to guide the training process via a learned projection matrix. Experimental results on two benchmark datasets demonstrate that our method is comparable with state-of-the-art approaches and validate the effectiveness of incorporating external geometric knowledge for facial AU intensity estimation.
【Keywords】: Computer Vision: Biometrics, Face and Gesture Recognition; Humans and AI: Human-Computer Interaction;
【Paper Link】 【Pages】:738-744
【Authors】: Chao Li ; Baolin Liu ; Jianguo Wei
【Abstract】: Using a convolutional neural network to build visual encoding and decoding models of the human brain is a good starting point for studying the relationship between deep learning and the human visual cognitive mechanism. However, related studies have not fully considered their differences. In this paper, we assume that only a portion of the neural network's features is directly related to human brain signals; we call these shared features. In the encoding process, we extract shared features from the lower and higher layers of the neural network, and then build a non-negative sparse map to predict brain activity. In the decoding process, we use back-propagation to reconstruct the visual stimuli, and use dictionary learning and a deep image prior to improve the robustness and accuracy of the algorithm. Experiments on a public fMRI dataset confirm the soundness of the encoding models, and compared with a recently proposed method, our reconstruction results achieve significantly higher accuracy.
【Keywords】: Computer Vision: Biomedical Image Understanding; Humans and AI: Cognitive Modeling; Humans and AI: Human-Computer Interaction; Machine Learning: Deep Learning: Convolutional networks;
【Paper Link】 【Pages】:745-752
【Authors】: Ganchao Tan ; Daqing Liu ; Meng Wang ; Zheng-Jun Zha
【Abstract】: Generating natural language descriptions for videos, i.e., video captioning, essentially requires step-by-step reasoning along the generation process. For example, to generate the sentence “a man is shooting a basketball”, we need to first locate and describe the subject “man”, next reason that the man is “shooting”, and then describe the object of the shooting, “basketball”. However, existing visual reasoning methods designed for visual question answering are not appropriate for video captioning, which requires more complex visual reasoning on videos over both space and time, as well as dynamic module composition along the generation process. In this paper, we propose a novel visual reasoning approach for video captioning, named Reasoning Module Networks (RMN), to equip the existing encoder-decoder framework with this reasoning capacity. Specifically, our RMN employs 1) three sophisticated spatio-temporal reasoning modules, and 2) a dynamic and discrete module selector trained by a linguistic loss with a Gumbel approximation. Extensive experiments on the MSVD and MSR-VTT datasets demonstrate that the proposed RMN outperforms state-of-the-art methods while providing an explicit and explainable generation process. Our code is available at https://github.com/tgc1997/RMN.
【Keywords】: Computer Vision: Language and Vision;
【Paper Link】 【Pages】:753-759
【Authors】: Haoze Wu ; Jiawei Liu ; Xierong Zhu ; Meng Wang ; Zheng-Jun Zha
【Abstract】: Applying multi-scale representations leads to consistent performance improvements on a wide range of image recognition tasks. However, with the addition of the temporal dimension in the video domain, directly obtaining layer-wise multi-scale spatial-temporal features adds considerable extra computational cost. In this work, we propose a novel and efficient Multi-Scale Spatial-Temporal Integration Convolutional Tube (MSTI) aimed at accurate action recognition with lower computational cost. It first extracts multi-scale spatial and temporal features through a multi-scale convolution block. Considering the interaction of different-scale representations and the interaction of spatial appearance and temporal motion, we employ cross-scale attention weighted blocks to perform feature recalibration by integrating the multi-scale spatial and temporal features. An end-to-end deep network, MSTI-Net, is also presented based on the proposed MSTI tube for human action recognition. Extensive experimental results show that our MSTI-Net significantly boosts the performance of existing convolution networks and achieves state-of-the-art accuracy on three challenging benchmarks, i.e., UCF-101, HMDB-51, and Kinetics-400, with far fewer parameters and FLOPs.
【Keywords】: Computer Vision: 2D and 3D Computer Vision; Computer Vision: Action Recognition;
【Paper Link】 【Pages】:760-766
【Authors】: Chengwei Chen ; Jing Liu ; Yuan Xie ; Yin Xiao Ban ; Chunyun Wu ; Yiqing Tao ; Haichuan Song
【Abstract】: With the development of adversarial attacks in deep learning, it is critical for an anomaly detector not only to discover out-of-distribution samples but also to provide a defence against adversarial attackers. Since few previous universal detectors are known to work well on both tasks, we address both scenarios by constructing a robust and effective technique in which a sample is regarded as abnormal if it exhibits a higher image reconstruction error. Due to the training instability issues of previous generative adversarial network (GAN) based methods, in this paper we propose a dual auxiliary autoencoder that makes a trade-off between the capabilities of the generator and the discriminator, leading to a more stable training process and high-quality image reconstruction. Moreover, to generate discriminative and robust latent representations, a mutual information estimator serving as a latent regularizer is adopted to extract the most distinctive information of the target class. Overall, our generative dual adversarial network simultaneously optimizes the image reconstruction space and the latent space to improve performance. Experiments show that our model clearly outperforms cutting-edge semi-supervised anomaly detectors and achieves state-of-the-art results on the datasets.
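The reconstruction-error criterion the abstract relies on can be sketched in a few lines: a sample is flagged abnormal when its reconstruction error exceeds a threshold calibrated on normal data. The toy "autoencoder" and the percentile-based threshold below are stand-ins for illustration; the paper's dual auxiliary autoencoder and mutual-information regularizer are not reproduced here.

```python
import numpy as np

def reconstruction_errors(x, reconstruct):
    """Per-sample squared reconstruction error ||x - AE(x)||^2."""
    r = reconstruct(x)
    return ((x - r) ** 2).sum(axis=1)

def fit_threshold(normal_x, reconstruct, percentile=95.0):
    """Calibrate a decision threshold on held-out normal samples."""
    return np.percentile(reconstruction_errors(normal_x, reconstruct), percentile)

def is_abnormal(x, reconstruct, threshold):
    """Flag samples whose reconstruction error exceeds the threshold."""
    return reconstruction_errors(x, reconstruct) > threshold

# Toy stand-in for a trained autoencoder: it reconstructs points near the
# origin (the 'normal' class) well by merely shrinking inputs slightly.
toy_ae = lambda x: 0.9 * x
```

The criterion works because a generator trained only on the target class reconstructs in-distribution inputs far more faithfully than out-of-distribution or adversarially perturbed ones.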
【Keywords】: Computer Vision: 2D and 3D Computer Vision; Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation;
【Paper Link】 【Pages】:767-773
【Authors】: Longteng Guo ; Jing Liu ; Xinxin Zhu ; Xingjian He ; Jie Jiang ; Hanqing Lu
【Abstract】: Most image captioning models are autoregressive, i.e., they generate each word by conditioning on previously generated words, which leads to heavy latency during inference. Recently, non-autoregressive decoding has been proposed in machine translation to speed up inference by generating all words in parallel. Typically, these models use the word-level cross-entropy loss to optimize each word independently. However, such a learning process fails to consider sentence-level consistency, thus resulting in inferior generation quality from these non-autoregressive models. In this paper, we propose a Non-Autoregressive Image Captioning (NAIC) model with a novel training paradigm: Counterfactuals-critical Multi-Agent Learning (CMAL). CMAL formulates NAIC as a multi-agent reinforcement learning system where positions in the target sequence are viewed as agents that learn to cooperatively maximize a sentence-level reward. Besides, we propose to utilize massive unlabeled images to boost captioning performance. Extensive experiments on the MSCOCO image captioning benchmark show that our NAIC model achieves performance comparable to state-of-the-art autoregressive models while bringing a 13.9x decoding speedup.
【Keywords】: Computer Vision: Language and Vision; Natural Language Processing: Natural Language Generation;
【Paper Link】 【Pages】:774-781
【Authors】: Jiping Zheng ; Ganfeng Lu
【Abstract】: With the explosive growth of video data, video summarization, which converts long videos to key frame sequences, has become an important task in information retrieval and machine learning. Determinantal point processes (DPPs), which are elegant probabilistic models, have been successfully applied to video summarization. However, existing DPP-based video summarization methods either suffer from poor efficiency when outputting a summary of a specified size or neglect the inherent sequential nature of videos. In this paper, we propose a new model in the DPP lineage named k-SDPP, in the vein of sequential determinantal point processes but with a fixed, user-specified size k. Our k-SDPP partitions the sampled frames of a video into segments, each containing a constant number of frames. Moreover, an efficient branch and bound (BB) method that respects the sequential nature of the frames is provided to optimally select, from the divided segments, the k frames constituting the summary. Experimental results show that our proposed BB method outperforms not only k-DPP and sequential DPP (seqDPP) but also the partition- and Markovian-assumption-based methods.
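The DPP machinery referenced here has a compact core: under an L-ensemble DPP with kernel L, the probability of selecting a subset S is proportional to det(L_S), the determinant of the kernel submatrix indexed by S, so diverse (near-orthogonal) frame sets score higher than redundant ones. A minimal sketch, with an RBF kernel chosen purely for illustration:

```python
import numpy as np

def dpp_score(features, subset, gamma=1.0):
    """Unnormalized DPP probability det(L_S) of choosing `subset`.

    features: (n, d) array of frame features.
    subset:   list of frame indices S.
    L is an RBF similarity kernel; near-duplicate frames produce nearly
    linearly dependent rows of L_S, driving det(L_S) toward zero, which
    is how a DPP penalizes redundant selections.
    """
    d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    L = np.exp(-gamma * d2)
    return float(np.linalg.det(L[np.ix_(subset, subset)]))
```

A k-SDPP additionally fixes |S| = k and constrains the selection to respect the video's segment order; the branch and bound search in the paper optimizes exactly this kind of determinant objective.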
【Keywords】: Computer Vision: Big Data and Large Scale Methods; Computer Vision: Other;
【Paper Link】 【Pages】:782-788
【Authors】: Shengyuan Liu ; Pei Lv ; Yuzhen Zhang ; Jie Fu ; Junjin Cheng ; Wanqing Li ; Bing Zhou ; Mingliang Xu
【Abstract】: This paper proposes a novel Semi-Dynamic Hypergraph Neural Network (SD-HNN) to estimate 3D human pose from a single image. SD-HNN adopts a hypergraph to represent the human body in order to effectively exploit the kinematic constraints among adjacent and non-adjacent joints. Specifically, a pose hypergraph in SD-HNN has two components. One is a static hypergraph constructed according to the conventional tree body structure. The other is a semi-dynamic hypergraph representing the dynamic kinematic constraints among different joints. These two hypergraphs are combined and trained together in an end-to-end fashion. Unlike traditional Graph Convolutional Networks (GCNs) that are based on a fixed tree structure, SD-HNN can deal with ambiguity in human pose estimation. Experimental results demonstrate that the proposed method achieves state-of-the-art performance on both the Human3.6M and MPI-INF-3DHP datasets.
【Keywords】: Computer Vision: Biometrics, Face and Gesture Recognition; Machine Learning: Deep Learning: Convolutional networks; Computer Vision: 2D and 3D Computer Vision;
【Paper Link】 【Pages】:789-796
【Authors】: Gege Zhang ; Qinghua Ma ; Licheng Jiao ; Fang Liu ; Qigong Sun
【Abstract】: 3D point cloud semantic segmentation has attracted wide attention given its extensive applications in autonomous driving, AR/VR, and robot sensing. However, in existing methods, each point in the segmentation results is predicted independently of the others. This property causes non-contiguity of label sets in three-dimensional space and produces many noisy label points, which hinders the improvement of segmentation accuracy. To address this problem, we first extend adversarial learning to this task and propose a novel framework, Attention Adversarial Networks (AttAN). With high-order correlations in label sets learned through adversarial learning, the segmentation network can predict labels closer to the real ones and correct noisy results. Moreover, we design an additive attention block for the segmentation network, which is used to automatically focus on regions critical to the segmentation task by learning the correlation between multi-scale features. Adversarial learning, which explores the underlying relationship between labels in high-dimensional space, opens up a new direction in 3D point cloud semantic segmentation. Experimental results on the ScanNet and S3DIS datasets show that this framework effectively improves segmentation quality and outperforms other state-of-the-art methods.
【Keywords】: Computer Vision: 2D and 3D Computer Vision; Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation; Machine Learning: Adversarial Machine Learning; Machine Learning: Deep Learning;
【Paper Link】 【Pages】:797-803
【Abstract】: We present a new method to improve the representational power of the features in Convolutional Neural Networks (CNNs). By studying traditional image processing methods and recent CNN architectures, we propose to use positional information in CNNs for effective exploration of feature dependencies. Rather than considering feature semantics alone, we incorporate spatial positions as an augmentation of feature semantics in our design. From this vantage point, we present a Position-Aware Recalibration Module (PRM for short) which recalibrates features leveraging both feature semantics and position. Furthermore, inspired by multi-head attention, our module is capable of performing multiple recalibrations whose results are concatenated as the output. As PRM is efficient and easy to implement, it can be seamlessly integrated into various base networks and applied to many position-aware visual tasks. Compared to original CNNs, our PRM introduces a negligible number of parameters and FLOPs while yielding better performance. Experimental results on the ImageNet and MS COCO benchmarks show that our approach surpasses related methods by a clear margin with less computational overhead. For example, we improve ResNet50 by an absolute 1.75% (77.65% vs. 75.90%) on the ImageNet 2012 validation dataset, and by 1.5%~1.9% mAP on the MS COCO validation dataset, with almost no computational overhead. Code is made publicly available.
【Keywords】: Computer Vision: 2D and 3D Computer Vision; Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation; Machine Learning: Deep Learning: Convolutional networks;
【Paper Link】 【Pages】:804-810
【Authors】: Yuqing Ma ; Shihao Bai ; Shan An ; Wei Liu ; Aishan Liu ; Xiantong Zhen ; Xianglong Liu
【Abstract】: Few-shot learning, which aims to learn novel concepts from a few labeled examples, is an interesting and very challenging problem with many practical applications. To accomplish this task, one should concentrate on revealing the accurate relations of the support-query pairs. We propose a transductive relation-propagation graph neural network (TRPN) to explicitly model and propagate such relations across support-query pairs. Our TRPN treats the relation of each support-query pair as a graph node, named a relational node, and resorts to the known relations between support samples, including both intra-class commonality and inter-class uniqueness, to guide relation propagation in the graph, generating discriminative relation embeddings for support-query pairs. A pseudo relational node is further introduced to propagate the query characteristics, and a fast yet effective transductive learning strategy is devised to fully exploit the relation information among different queries. To the best of our knowledge, this is the first work that explicitly takes the relations of support-query pairs into consideration in few-shot learning, which may offer a new way to solve the few-shot learning problem. Extensive experiments conducted on several benchmark datasets demonstrate that our method can significantly outperform a variety of state-of-the-art few-shot learning methods.
【Keywords】: Computer Vision: 2D and 3D Computer Vision; Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation;
【Paper Link】 【Pages】:811-817
【Authors】: Yuqing Ma ; Wei Liu ; Shihao Bai ; Qingyu Zhang ; Aishan Liu ; Weimin Chen ; Xianglong Liu
【Abstract】: Few-shot learning aims to learn a model that can be readily adapted to new unseen classes (concepts) from one or a few examples. Despite successful progress, most few-shot learning approaches, concentrating on either the global or the local characteristics of examples, still suffer from weak generalization. Inspired by the inverted pyramid theory, we address this problem with an inverted pyramid network (IPN) that imitates the human coarse-to-fine cognition paradigm. The proposed IPN consists of two consecutive stages, namely a global stage and a local stage. At the global stage, a class-sensitive contextual memory network (CCMNet) is introduced to learn discriminative support-query relation embeddings and to predict the query-to-class similarity based on the contextual memory. At the local stage, a fine-grained calibration is then appended to complement the coarse relation embeddings, targeting more precise query-to-class similarity evaluation. To the best of our knowledge, IPN is the first work that simultaneously integrates both global and local characteristics in few-shot learning, approximately imitating the human cognition mechanism. Our extensive experiments on multiple benchmark datasets demonstrate the superiority of IPN compared to a number of state-of-the-art approaches.
【Keywords】: Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation; Computer Vision: Structural and Model-Based Approaches, Knowledge Representation and Reasoning;
【Paper Link】 【Pages】:818-824
【Authors】: Thao Minh Le ; Vuong Le ; Svetha Venkatesh ; Truyen Tran
【Abstract】: We present the Language-binding Object Graph Network, the first neural reasoning method with dynamic relational structures across both the visual and textual domains, with applications to visual question answering. Relaxing the common assumption made by current models that object predicates pre-exist and remain static, passive to the reasoning process, we propose that these dynamic predicates expand across domain borders to include pair-wise visual-linguistic object bindings. In our method, these contextualized object links are actively found within each recurrent reasoning step without relying on external predicative priors. The dynamic structures reflect the conditional dual-domain object dependency given the evolving context of the reasoning, through co-attention. Such discovered dynamic graphs facilitate multi-step knowledge combination and refinement that iteratively deduce a compact representation of the final answer. The model's effectiveness is demonstrated on image question answering, achieving favorable performance on major VQA datasets. Our method outperforms other methods on sophisticated question-answering tasks in which multiple object relations are involved. The graph structure effectively assists the progress of training, so the network learns efficiently compared to other reasoning models.
【Keywords】: Computer Vision: Language and Vision; Machine Learning: Deep Learning;
【Paper Link】 【Pages】:825-831
【Authors】: Lixin Fan ; KamWoh Ng ; Ce Ju ; Tianyu Zhang ; Chee Seng Chan
【Abstract】: This paper proposes a novel deep polarized network (DPN) for learning to hash, in which each channel of the network outputs is pushed far away from zero by employing a differentiable bit-wise hinge-like loss, dubbed the polarization loss. Reformulated within a generic Hamming Distance Metric Learning framework [Norouzi et al., 2012], the proposed polarization loss bypasses the requirement to prepare pairwise labels for (dis-)similar items and yet strictly bounds from above the pairwise Hamming-distance-based losses. The intrinsic connection between pairwise and pointwise label information, as disclosed in this paper, brings about the following methodological improvements: (a) we may directly employ the proposed differentiable polarization loss with no large deviation incurred from the target Hamming-distance-based loss; and (b) the subtask of assigning binary codes becomes extremely simple: even random codes assigned to each class suffice for state-of-the-art performance, as demonstrated on the CIFAR10, NUS-WIDE, and ImageNet100 datasets.
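The bit-wise hinge-like polarization loss lends itself to a short sketch: each real-valued channel output h_i is pushed to have magnitude at least a margin m on the side of zero given by its target bit b_i ∈ {-1, +1}, so that binarizing by sign becomes reliable. The margin value and mean reduction below are assumptions for illustration; see the paper for the precise form.

```python
import numpy as np

def polarization_loss(h, b, margin=1.0):
    """Hinge-like per-bit loss max(0, m - h_i * b_i), averaged over bits.

    h: (n_bits,) real-valued network outputs (one per hash bit/channel).
    b: (n_bits,) target bits in {-1, +1} (e.g. a random code per class).
    The loss is zero exactly when every output lies on the correct side
    of zero with magnitude >= margin, i.e. the outputs are 'polarized'.
    """
    return float(np.maximum(0.0, margin - h * b).mean())

def to_hash_code(h):
    """Binarize polarized outputs by sign to obtain the hash code."""
    return np.where(h >= 0, 1, -1)
```

Because the loss is defined pointwise against a fixed target code per class, no pairwise (dis-)similarity labels need to be constructed during training.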
【Keywords】: Computer Vision: Big Data and Large Scale Methods; Machine Learning: Clustering; Data Mining: Big Data, Large-Scale Systems;
【Paper Link】 【Pages】:832-838
【Authors】: Zhen-Liang Ni ; Gui-Bin Bian ; Guan'an Wang ; Xiao-Hu Zhou ; Zeng-Guang Hou ; Xiao-Liang Xie ; Zhen Li ; Yu-Han Wang
【Abstract】: Surgical instrument segmentation is crucial for computer-assisted surgery. Unlike common object segmentation, it is more challenging due to the large illumination variation and scale variation in surgical scenes. In this paper, we propose a bilinear attention network with adaptive receptive fields to address these two issues. To deal with the illumination variation, the bilinear attention module models global contexts and semantic dependencies between pixels by capturing second-order statistics. With them, semantic features in challenging areas can be inferred from their neighbors, and the distinction between various semantics can be boosted. To adapt to the scale variation, our adaptive receptive field module aggregates multi-scale features and selects receptive fields adaptively. Specifically, it models the semantic relationships between channels to choose feature maps with appropriate scales, changing the receptive field of subsequent convolutions. The proposed network achieves the best performance, 97.47% mean IoU, on Cata7. It also takes first place on EndoVis 2017, exceeding the second place by 10.10% mean IoU.
【Keywords】: Computer Vision: Biomedical Image Understanding; Machine Learning Applications: Bio/Medicine; Machine Learning: Deep Learning; Robotics: Robotics and Vision;
【Paper Link】 【Pages】:839-845
【Authors】: Heyu Zhou ; Weizhi Nie ; Wenhui Li ; Dan Song ; An-An Liu
【Abstract】: 2D image-based 3D shape retrieval has become a hot research topic owing to its wide industrial applications and academic significance. However, existing view-based 3D shape retrieval methods are restricted by two limitations: 1) they learn common-class features while neglecting instance-level visual characteristics, and 2) they narrow the global domain variations while ignoring the local semantic variations within each category. To overcome these problems, we propose a novel hierarchical instance feature alignment (HIFA) method for this task. HIFA consists of two modules: cross-modal instance feature learning and hierarchical instance feature alignment. Specifically, we first use a CNN to extract both 2D image and multi-view features. Then, we maximize the mutual information between the input data and the high-level features to preserve as many visual characteristics of an individual instance as possible. To mix up the features in the two domains, we enforce feature alignment at both the global domain and local semantic levels. To narrow the global domain variations, we impose an identical large norm restriction on both the 2D and 3D feature-norm expectations, facilitating transferability. To narrow the local variations, we propose to minimize the distance between the two centroids of the same class from different domains to obtain semantic consistency. Extensive experiments on two popular and novel datasets, MI3DOR and MI3DOR-2, validate the superiority of HIFA for the 2D image-based 3D shape retrieval task.
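The two alignment levels described here can be sketched as simple penalties: a global term pushing each domain's expected feature L2 norm toward one shared large radius, and a local term pulling together the per-class centroids of the two domains. The squared-difference form, the radius value, and the function names are illustrative assumptions, not the paper's exact losses.

```python
import numpy as np

def norm_alignment_loss(feat_2d, feat_3d, radius=10.0):
    """Global-level term: penalize deviation of each domain's mean
    feature norm from a shared large `radius`.

    feat_2d: (n, d) image-domain features; feat_3d: (m, d) view-domain
    features. Driving both norm expectations to the same large value
    makes the two domains' features more comparable (more transferable)
    without matching them sample by sample.
    """
    mean_2d = np.linalg.norm(feat_2d, axis=1).mean()
    mean_3d = np.linalg.norm(feat_3d, axis=1).mean()
    return float((mean_2d - radius) ** 2 + (mean_3d - radius) ** 2)

def class_centroid_distance(feat_a, feat_b):
    """Local-level term: distance between two domains' centroids of the
    same class, the quantity minimized for semantic consistency."""
    return float(np.linalg.norm(feat_a.mean(axis=0) - feat_b.mean(axis=0)))
```

In training, both terms would be added to the task loss and back-propagated through the feature extractors of the two domains.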
【Keywords】: Computer Vision: 2D and 3D Computer Vision; Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation; Machine Learning: Transfer, Adaptation, Multi-task Learning;
【Paper Link】 【Pages】:846-852
【Authors】: Chuanqi Zang ; Mingtao Pei ; Yu Kong
【Abstract】: Human motion prediction is the task of anticipating future motion from past observations. Previous approaches rely on access to large datasets of skeleton data, and thus are difficult to generalize to novel motion dynamics with limited training data. In this work, we propose a novel approach named Motion Prediction Network (MoPredNet) for few-shot human motion prediction. MoPredNet can be adapted to predict new motion dynamics using limited data, and it elegantly captures long-term dependencies in motion dynamics. Specifically, MoPredNet dynamically selects the most informative poses in the streaming motion data as masked poses. In addition, MoPredNet improves its encoding of motion dynamics by adaptively learning spatio-temporal structure from the observed and masked poses. We also propose to adapt MoPredNet to novel motion dynamics based on accumulated motion experience and limited novel motion data. Experimental results show that our method outperforms state-of-the-art methods in motion prediction.
【Keywords】: Computer Vision: Motion and Tracking; Computer Vision: Video: Events, Activities and Surveillance; Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation;
【Paper Link】 【Pages】:853-859
【Authors】: Weiwei Wang ; Yuming Shen ; Haofeng Zhang ; Yazhou Yao ; Li Liu
【Abstract】: The label-free nature of unsupervised cross-modal hashing prevents models from exploiting exact semantic data similarity. Existing research typically simulates the semantics with a heuristic geometric prior in the original feature space. However, this introduces heavy bias into the model, as the original features do not fully represent the underlying multi-view data relations. To address this problem, in this paper we propose a novel unsupervised hashing method called Semantic-Rebased Cross-modal Hashing (SRCH). A novel ‘Set-and-Rebase’ process is defined to initialize and update the cross-modal similarity graph of the training data. In particular, we set the graph according to the intra-modal feature geometric basis and then alternately rebase it, updating its edges according to the hashing results. We develop an alternating optimization routine to rebase the graph and train the hashing auto-encoders with closed-form solutions, so that the overall framework is trained efficiently. Our experimental results on benchmark datasets demonstrate the superiority of our model over state-of-the-art algorithms.
【Keywords】: Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation;
【Paper Link】 【Pages】:860-867
【Authors】: Mennatullah Siam ; Naren Doraiswamy ; Boris N. Oreshkin ; Hengshuai Yao ; Martin Jägersand
【Abstract】: Significant progress has recently been made in developing few-shot object segmentation methods. Learning has been shown to be successful in few-shot segmentation settings using pixel-level, scribble, and bounding box supervision. This paper takes another approach, requiring only image-level labels for few-shot object segmentation. We propose a novel multi-modal interaction module for few-shot object segmentation that utilizes a co-attention mechanism over both visual and word embeddings. Using image-level labels, our model achieves a 4.8% improvement over previously proposed image-level few-shot object segmentation. It also outperforms state-of-the-art methods that use weak bounding box supervision on PASCAL-5^i. Our results show that few-shot segmentation benefits from utilizing word embeddings, and that we are able to perform few-shot segmentation using stacked joint visual-semantic processing with weak image-level labels. We further propose a novel setup for videos, Temporal Object Segmentation for Few-shot Learning (TOSFL). TOSFL can be used on a variety of public video data, such as Youtube-VOS, as demonstrated in both instance-level and category-level TOSFL experiments.
【Keywords】: Computer Vision: Language and Vision; Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation; Machine Learning: Deep Learning;
【Paper Link】 【Pages】:868-875
【Authors】: Hongwei Xie ; Yafei Song ; Ling Cai ; Mingyang Li
【Abstract】: The inherently heavy computation of deep neural networks prevents their widespread application. A widely used method for accelerating model inference is quantization: replacing the input operands of a network with fixed-point values. The majority of the computation cost then lies in integer matrix multiply-accumulate operations. In practice, a high-bit accumulator leads to partially wasted computation, while a low-bit one typically suffers from numerical overflow. To address this problem, we propose an overflow-aware quantization method that designs a trainable, adaptive fixed-point representation to optimize the number of bits for each input tensor while prohibiting numerical overflow during the computation. With the proposed method, we are able to fully utilize the computing power to minimize the quantization loss and obtain optimized inference performance. To verify the effectiveness of our method, we conduct image classification, object detection, and semantic segmentation tasks on the ImageNet, Pascal VOC, and COCO datasets, respectively. Experimental results demonstrate that the proposed method can achieve performance comparable to state-of-the-art quantization methods while accelerating inference by about 2x.
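The tension the abstract describes, where wide accumulators waste computation and narrow ones overflow, can be made concrete with a toy fixed-point dot product that reports whether a given accumulator bit-width would overflow. The bit-widths and the symmetric quantizer below are illustrative assumptions, not the paper's trainable representation.

```python
import numpy as np

def quantize(x, n_bits=8):
    """Symmetric linear quantization of a float tensor to signed integers."""
    qmax = 2 ** (n_bits - 1) - 1
    max_abs = np.abs(x).max()
    scale = max_abs / qmax if max_abs > 0 else 1.0
    return np.round(x / scale).astype(np.int64), scale

def int_dot_with_overflow_check(a_q, b_q, acc_bits=16):
    """Integer multiply-accumulate that flags accumulator overflow.

    Returns (accumulated sum, overflowed?) where overflow means the
    running sum left the signed `acc_bits` range at some step. An
    overflow-aware scheme responds by lowering the operand bit-widths
    so that this can never happen at inference time.
    """
    lo, hi = -(2 ** (acc_bits - 1)), 2 ** (acc_bits - 1) - 1
    acc, overflowed = 0, False
    for p in (a_q.astype(np.int64) * b_q.astype(np.int64)):
        acc += int(p)
        if acc < lo or acc > hi:
            overflowed = True
    return acc, overflowed
```

For example, accumulating three 8-bit-by-8-bit products of maximal magnitude (127 * 127 each) already exceeds a signed 16-bit accumulator, while a 32-bit accumulator absorbs it with many bits to spare.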
【Keywords】: Computer Vision: Perception; Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation; Machine Learning: Deep Learning: Convolutional networks;
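The bit-width trade-off the abstract describes can be sketched as follows. This is an illustrative simplification, not the paper's trainable scheme: shrink a per-tensor bit-width until a worst-case dot product of a given length fits in the target accumulator, then quantize symmetrically to that width (all names and the worst-case criterion are assumptions for illustration).

```python
import numpy as np

def quantize_overflow_aware(x, weight_bits=8, acc_bits=16, dot_len=64):
    """Illustrative overflow-aware quantization sketch: reduce the
    bit-width until dot_len worst-case products fit in a signed
    acc_bits accumulator, then quantize symmetrically."""
    bits = weight_bits
    while bits > 2:
        qmax = 2 ** (bits - 1) - 1
        if dot_len * qmax * qmax <= 2 ** (acc_bits - 1) - 1:
            break  # no overflow possible at this width
        bits -= 1
    qmax = 2 ** (bits - 1) - 1
    amax = float(np.max(np.abs(x)))
    scale = amax / qmax if amax > 0 else 1.0
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int64)
    return q, scale, bits
```

With a 16-bit accumulator and 64-term dot products, the sketch settles on 5-bit operands, illustrating why a fixed high bit-width wastes accumulator range while too low a width loses precision.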
【Paper Link】 【Pages】:876-882
【Authors】: Celia Cintas ; Skyler Speakman ; Victor Akinwande ; William Ogallo ; Komminist Weldemariam ; Srihari Sridharan ; Edward McFowland
【Abstract】: Reliably detecting attacks in a given set of inputs is of high practical relevance because of the vulnerability of neural networks to adversarial examples. These altered inputs create a security risk in applications with real-world consequences, such as self-driving cars, robotics and financial services. We propose an unsupervised method for detecting adversarial attacks in the inner layers of autoencoder (AE) networks by maximizing a non-parametric measure of anomalous node activations. Previous work in this space has shown that AE networks can detect anomalous images by thresholding the reconstruction error produced by the final layer, while other detection methods rely on data augmentation or specialized training techniques that must be applied before training. In contrast, we use subset scanning methods from the anomalous pattern detection domain to enhance detection power without labeled examples of the noise, retraining, or data augmentation. In addition to an anomaly score, our proposed method also returns the subset of nodes within the AE network that contributed to that score, which will allow future work to pivot from detection to visualisation and explainability. Our scanning approach shows consistently higher detection power than existing detection methods across several adversarial noise models and a wide range of perturbation strengths.
【Keywords】: Computer Vision: Statistical Methods and Machine Learning; Machine Learning: Deep Learning; Data Mining: Big Data, Large-Scale Systems;
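The reconstruction-error baseline that the abstract contrasts with can be sketched on a toy linear autoencoder (the linear encoder/decoder, all names, and the toy data are assumptions for illustration, not the authors' code): clean inputs lying on the autoencoder's manifold reconstruct almost perfectly, while perturbed inputs do not, so thresholding the error flags them.

```python
import numpy as np

rng = np.random.default_rng(0)

# toy "trained" linear autoencoder: decode(encode(x)) projects x
# onto a 2-D subspace, so P plays the role of the whole AE
W = rng.normal(size=(8, 2))
P = W @ np.linalg.pinv(W)  # symmetric projector onto col(W)

def recon_error(x):
    """Per-sample squared reconstruction error (the final-layer
    thresholding baseline, not the subset-scanning method)."""
    return np.mean((x @ P - x) ** 2, axis=1)

clean = rng.normal(size=(100, 2)) @ W.T            # on the AE manifold
adv = clean + 0.5 * rng.normal(size=clean.shape)   # perturbed off-manifold
threshold = recon_error(clean).max()
detected = recon_error(adv) > threshold
```

The subset-scanning method of the paper instead looks inside the network at which node activations are jointly anomalous, rather than at this single scalar per input.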
【Paper Link】 【Pages】:883-889
【Authors】: Yuting Su ; Yuqian Li ; Dan Song ; Weizhi Nie ; Wenhui Li ; An-An Liu
【Abstract】: 2D-image-based 3D object retrieval is a new topic in 3D object retrieval that can be used to manage 3D data with 2D images: the goal is to retrieve related 3D objects given a 2D image. The task is challenging due to the large domain gap between 2D images and 3D objects, so it is essential to consider domain adaptation to reduce the domain discrepancy. However, most existing domain adaptation methods only utilize semantic information from the source domain to predict labels in the target domain and neglect the intrinsic structure of the target domain. In this paper, we propose a domain alignment framework with consistent domain structure learning to reduce the large gap between 2D images and 3D objects. The domain structure learning module makes use of both the semantic information from the source domain and the intrinsic structure of the target domain, providing more reliable predicted labels to the domain alignment module to better align the conditional distribution. We conducted experiments on two public datasets, MI3DOR and MI3DOR-2, and the results demonstrate that the proposed method outperforms the state-of-the-art methods.
【Keywords】: Computer Vision: 2D and 3D Computer Vision; Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation; Machine Learning: Transfer, Adaptation, Multi-task Learning;
【Paper Link】 【Pages】:890-897
【Authors】: Yubo Zhang ; Hao Tan ; Mohit Bansal
【Abstract】: Vision-and-Language Navigation (VLN) requires an agent to follow natural-language instructions, explore the given environments, and reach the desired target locations. These step-by-step navigational instructions are crucial when the agent is navigating new environments about which it has no prior knowledge. Most recent works studying VLN observe a significant performance drop when tested on unseen environments (i.e., environments not used in training), indicating that the neural agent models are highly biased towards training environments. Although this issue is considered one of the major challenges in VLN research, it is still under-studied and needs a clearer explanation. In this work, we design novel diagnosis experiments via environment re-splitting and feature replacement, looking into possible reasons for this environment bias. We observe that it is neither the language nor the underlying navigational graph, but the low-level visual appearance conveyed by ResNet features, that directly affects the agent model and contributes to this environment bias. Based on this observation, we explore several kinds of semantic representations that contain less low-level visual information, so that an agent trained with these features generalizes better to unseen testing environments. Without modifying the baseline agent model or its training method, our explored semantic features significantly decrease the performance gap between seen and unseen environments on multiple datasets (i.e., R2R, R4R, and CVDN) and achieve unseen results competitive with previous state-of-the-art models.
【Keywords】: Computer Vision: Language and Vision;
【Paper Link】 【Pages】:898-905
【Authors】: Haocong Rao ; Siqi Wang ; Xiping Hu ; Mingkui Tan ; Huang Da ; Jun Cheng ; Bin Hu
【Abstract】: Gait-based person re-identification (Re-ID) is valuable for safety-critical applications, and using only 3D skeleton data to extract discriminative gait features for person Re-ID is an emerging open topic. Existing methods either adopt hand-crafted features or learn gait features by traditional supervised learning paradigms. Unlike previous methods, we propose, for the first time, a generic gait encoding approach that can utilize unlabeled skeleton data to learn gait representations in a self-supervised manner. Specifically, we first introduce self-supervision by learning to reconstruct input skeleton sequences in reverse order, which facilitates learning richer high-level semantics and better gait representations. Second, inspired by the fact that motion's continuity endows temporally adjacent skeletons with higher correlations (“locality”), we propose a locality-aware attention mechanism that encourages larger attention weights for temporally adjacent skeletons when reconstructing the current skeleton, so as to encode locality into the gait representation. Finally, we propose Attention-based Gait Encodings (AGEs), built from the context vectors learned by locality-aware attention, as the final gait representations. AGEs are directly utilized to realize effective person Re-ID. Our approach typically improves existing skeleton-based methods by 10-20% Rank-1 accuracy, and it achieves comparable or even superior performance to multi-modal methods that use extra RGB or depth information.
【Keywords】: Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation; Machine Learning: Deep Learning; Machine Learning: Unsupervised Learning;
【Paper Link】 【Pages】:906-912
【Authors】: Licheng Zhang ; Xianzhi Wang ; Lina Yao ; Lin Wu ; Feng Zheng
【Abstract】: Zero-shot object detection (ZSD) has received considerable attention from the computer vision community in recent years. It aims to simultaneously locate and categorize previously unseen objects during inference. A crucial problem in ZSD is how to accurately predict the label of each object proposal, i.e., categorize object proposals, for unseen categories. Previous ZSD models generally relied on learning an embedding from visual space to semantic space, or a joint embedding between semantic descriptions and visual representations. However, features in the learned semantic space or joint projected space tend to suffer from the hubness problem: feature vectors are likely to be embedded near incorrect labels, which lowers detection precision. In this paper, we instead propose to learn a deep embedding from the semantic space to the visual space, which alleviates the hubness problem because the distribution in visual space has smaller variance than in the semantic or joint embedding space. After learning the deep embedding model, we perform k-nearest-neighbor search in the visual space of unseen categories to determine the category of each semantic description. Extensive experiments on two public datasets show that our approach significantly outperforms existing methods.
【Keywords】: Computer Vision: Language and Vision; Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation; Machine Learning: Deep Learning;
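The inference step described above can be sketched minimally. Assuming each unseen class's semantic vector has already been mapped into visual space by the learned embedding (the `prototypes` below and the 1-NN simplification are illustrative assumptions, not the paper's code), a proposal takes the label of its nearest prototype:

```python
import numpy as np

def nearest_prototype_label(visual_feat, prototypes, labels):
    """Label a proposal's visual feature by its nearest class
    prototype in visual space (1-NN sketch of the k-NN search)."""
    dists = np.linalg.norm(prototypes - visual_feat, axis=1)
    return labels[int(np.argmin(dists))]
```

Because the search happens in visual space, where features are spread with larger variance, fewer prototypes act as universal "hubs" that attract unrelated queries.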
【Paper Link】 【Pages】:913-919
【Authors】: Jinjia Peng ; Yang Wang ; Huibing Wang ; Zhao Zhang ; Xianping Fu ; Meng Wang
【Abstract】: Vehicle re-identification (reID) aims at identifying vehicles across different non-overlapping camera views. Existing methods rely heavily on well-labeled datasets for ideal performance, which inevitably causes a drastic performance drop due to the severe domain bias between the training domain and real-world scenes; worse still, these approaches require full annotations, which is labor-consuming. To tackle these challenges, we propose a novel Progressive Adaptation Learning method for vehicle reID, named PAL, which learns from abundant data without annotations. In PAL, a data adaptation module is employed for the source domain, generating images with a data distribution similar to the unlabeled target domain as “pseudo target samples”. These pseudo samples are combined with unlabeled samples selected by a dynamic sampling strategy to make training faster. We further propose a weighted label smoothing (WLS) loss, which considers the similarity between samples and different clusters to balance the confidence of pseudo labels. Comprehensive experimental results validate the advantages of PAL on both the VehicleID and VeRi-776 datasets.
【Keywords】: Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation; Machine Learning Applications: Applications of Unsupervised Learning;
【Paper Link】 【Pages】:920-926
【Authors】: Dan Guo ; Yang Wang ; Peipei Song ; Meng Wang
【Abstract】: Unsupervised image captioning with no annotations is an emerging challenge in computer vision, where existing approaches usually adopt GAN (Generative Adversarial Network) models. In this paper, we propose a novel memory-based network instead of a GAN, named the Recurrent Relational Memory Network (R2M). Unlike complicated and sensitive adversarial learning, which performs poorly on long sentence generation, R2M implements a concepts-to-sentence memory translator through two-stage memory mechanisms, fusion and recurrent memories, correlating the relational reasoning between common visual concepts and the generated words over long periods. R2M encodes visual context through unsupervised training on images, while enabling the memory to learn from an irrelevant textual corpus in a supervised fashion. Our solution has fewer learnable parameters and higher computational efficiency than GAN-based methods, which suffer heavily from parameter sensitivity. We experimentally validate the superiority of R2M over the state of the art on all benchmark datasets.
【Keywords】: Computer Vision: Language and Vision; Machine Learning: Deep Learning;
【Paper Link】 【Pages】:927-933
【Authors】: Ruixin Liu ; Zhenyu Weng ; Yuesheng Zhu ; Bairong Li
【Abstract】: Video inpainting aims to synthesize visually pleasing and temporally consistent content in the missing regions of a video. Due to the variety of motions across different frames, it is highly challenging to exploit effective temporal information to recover videos. Existing deep-learning-based methods usually estimate optical flow to align frames and thereby exploit useful information between frames. However, these methods tend to generate artifacts once the estimated optical flow is inaccurate. To alleviate this problem, we propose a novel end-to-end Temporal Adaptive Alignment Network (TAAN) for video inpainting. TAAN aligns reference frames with the target frame via implicit motion estimation at the feature level and then reconstructs the target frame by taking the aggregated aligned reference-frame features as input. In the proposed network, a Temporal Adaptive Alignment (TAA) module based on deformable convolutions is designed to perform temporal alignment in a local, dense and adaptive manner. Both quantitative and qualitative evaluation results show that our method significantly outperforms existing deep-learning-based methods.
【Keywords】: Computer Vision: 2D and 3D Computer Vision; Machine Learning: Deep Learning: Convolutional networks; Machine Learning Applications: Applications of Unsupervised Learning;
【Paper Link】 【Pages】:934-940
【Authors】: Pin Jiang ; Aming Wu ; Yahong Han ; Yunfeng Shao ; Meiyu Qi ; Bingshuai Li
【Abstract】: Semi-supervised domain adaptation (SSDA) is a branch of machine learning in which, unlike unsupervised domain adaptation, a few labeled target examples are available. To make effective use of these additional data and bridge the domain gap, one possible way is to generate adversarial examples, i.e., images with additional perturbations, between the two domains to fill the domain gap. Adversarial training has been proven to be a powerful method for this purpose. However, traditional adversarial training either adds noise in arbitrary directions, which is inefficient for migrating between domains, or generates directional noise only from the source to the target domain and the reverse. In this work, we devise a general bidirectional adversarial training method that employs gradients to guide adversarial examples across the domain gap, i.e., Adaptive Adversarial Training (AAT) from the source to the target domain and Entropy-penalized Virtual Adversarial Training (E-VAT) from the target to the source domain. In particular, we devise a Bidirectional Adversarial Training (BiAT) network to perform these adversarial trainings jointly. We evaluate the effectiveness of BiAT on three benchmark datasets, and the experimental results demonstrate that the proposed method achieves state-of-the-art performance.
【Keywords】: Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation; Machine Learning: Transfer, Adaptation, Multi-task Learning;
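The gradient-guided perturbation idea above can be sketched with a generic FGSM-style step (a standard technique, shown here only as a sketch of "directional noise along the gradient"; BiAT's actual AAT and E-VAT losses are more involved):

```python
import numpy as np

def gradient_sign_step(x, grad, eps=0.1):
    """Move the input a small step eps along the sign of the loss
    gradient, producing a directional adversarial perturbation."""
    return x + eps * np.sign(grad)
```

In the bidirectional setting, such steps would be taken with gradients computed against different objectives for the source-to-target and target-to-source directions.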
【Paper Link】 【Pages】:941-947
【Authors】: Kai Shen ; Lingfei Wu ; Fangli Xu ; Siliang Tang ; Jun Xiao ; Yueting Zhuang
【Abstract】: The task of Grounded Video Description (GVD) is to generate sentences whose objects can be grounded in bounding boxes in the video frames. Existing works often fail to exploit structural information, both in modeling the relationships among region proposals and in attending to them for text generation. To address these issues, we cast the GVD task as a spatial-temporal graph-to-sequence learning problem, modeling video frames as a spatial-temporal sequence graph to better capture implicit structural relationships. In particular, we exploit two ways to construct a sequence graph that captures spatial-temporal correlations among different objects in each frame, and further present a novel graph topology refinement technique to discover the optimal underlying graph structure. In addition, we present a hierarchical attention mechanism that attends to the sequence graph at different resolution levels for better sentence generation. Our extensive experiments demonstrate the effectiveness of our proposed method compared to state-of-the-art methods.
【Keywords】: Computer Vision: Language and Vision; Computer Vision: Video: Events, Activities and Surveillance;
【Paper Link】 【Pages】:948-954
【Authors】: Ke Ning ; Lingxi Xie ; Fei Wu ; Qi Tian
【Abstract】: In this paper, we tackle a challenging task named video-language segmentation. Given a video and a sentence in natural language, the goal is to segment the object or actor described by the sentence in the video frames. To accurately denote a target object, the given sentence usually refers to multiple attributes, such as nearby objects with spatial relations. We propose a novel Polar Relative Positional Encoding (PRPE) mechanism that represents spatial relations in a “linguistic” way, i.e., in terms of direction and range. Sentence features can interact with positional embeddings in a more direct way to extract the implied relative positional relations. We also propose parameterized functions for these positional embeddings to adapt to real-valued directions and ranges. With PRPE, we design a Polar Attention Module (PAM) as the basic module for vision-language fusion. Our method outperforms the previous best method by a large margin of 11.4% absolute improvement in terms of mAP on the challenging A2D Sentences dataset. Our method also achieves competitive performance on the J-HMDB Sentences dataset.
【Keywords】: Computer Vision: Language and Vision; Computer Vision: Action Recognition; Computer Vision: Video: Events, Activities and Surveillance;
【Paper Link】 【Pages】:955-962
【Authors】: Yanzhao Xie ; Yu Liu ; Yangtao Wang ; Lianli Gao ; Peng Wang ; Ke Zhou
【Abstract】: For multi-label image retrieval, existing hashing algorithms neglect the dependencies between objects and thus fail to capture attention information during feature extraction, which affects the precision of the hash codes. To address this problem, we explore the inter-dependency between objects through their co-occurrence correlation in the label set and adopt a Multi-modal Factorized Bilinear (MFB) pooling component so that image representation learning can capture this attention information. We propose a Label-Attended Hashing (LAH) algorithm that enables an end-to-end hash model with inter-dependency feature extraction. LAH first combines a Convolutional Neural Network (CNN) and a Graph Convolution Network (GCN) to separately generate the image representation and label co-occurrence embeddings, then adopts MFB to fuse these two modal vectors, and finally learns the hash function with a Cauchy-distribution-based loss via back-propagation. Extensive experiments on public multi-label datasets demonstrate that (1) LAH achieves state-of-the-art retrieval results and (2) the use of the co-occurrence relationship and MFB not only improves the precision of the hash codes but also accelerates hash learning. GitHub address: https://github.com/IDSM-AI/LAH.
【Keywords】: Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation;
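The MFB fusion step named above can be sketched as follows (a minimal reading of standard MFB pooling, with `U` and `V` standing in for learned projection weights; the dimensions and names are assumptions, not LAH's implementation):

```python
import numpy as np

def mfb_fuse(x, y, U, V, o=4):
    """Multi-modal Factorized Bilinear pooling sketch: project both
    modalities, multiply elementwise, sum-pool every o factors, then
    apply signed square-root and L2 normalization."""
    joint = (U @ x) * (V @ y)                  # (o * out_dim,)
    pooled = joint.reshape(-1, o).sum(axis=1)  # sum-pool over factors
    z = np.sign(pooled) * np.sqrt(np.abs(pooled))
    norm = np.linalg.norm(z)
    return z / norm if norm > 0 else z
```

In LAH's pipeline, `x` would play the role of the CNN image vector and `y` the GCN label co-occurrence embedding, producing a fused vector for hash learning.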
【Paper Link】 【Pages】:963-969
【Authors】: Mingkang Xiong ; Zhenghong Zhang ; Weilin Zhong ; Jinsheng Ji ; Jiyuan Liu ; Huilin Xiong
【Abstract】: Self-supervised depth and visual odometry (VO) estimators trained on monocular videos without ground truth have drawn significant attention recently. Prior works use photometric consistency as supervision, which is fragile in complex realistic environments due to illumination variations; more importantly, it suffers from scale inconsistency in the depth and pose estimates. In this paper, robust geometric losses are proposed to deal with this problem. Specifically, we first align the scales of the two reconstructed depth maps estimated from adjacent image frames, and then enforce forward-backward relative pose consistency to formulate scale-consistent geometric constraints. Finally, a novel training framework is constructed to implement the proposed losses. Extensive evaluations on the KITTI and Make3D datasets demonstrate that (i) by incorporating the proposed constraints as supervision, the depth estimation model achieves state-of-the-art (SOTA) performance among self-supervised methods, and (ii) the proposed training framework is effective for obtaining a VO model with a uniform global scale.
【Keywords】: Computer Vision: 2D and 3D Computer Vision; Robotics: Robotics and Vision;
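The forward-backward pose consistency constraint can be sketched directly: if the relative pose from frame t to t+1 and the one from t+1 back to t are both correct, their composition is the identity transform, and any residual is penalized (the Frobenius-norm penalty below is an assumption read off the abstract, not the paper's exact loss).

```python
import numpy as np

def pose_consistency_loss(T_fwd, T_bwd):
    """Penalize deviation of the composed forward/backward 4x4
    homogeneous transforms from the identity."""
    return np.linalg.norm(T_fwd @ T_bwd - np.eye(4))

# a pure translation and its exact inverse incur zero loss
T = np.eye(4); T[:3, 3] = [1.0, 2.0, 3.0]
T_inv = np.eye(4); T_inv[:3, 3] = [-1.0, -2.0, -3.0]
```

Because the constraint couples the two directions, a network cannot satisfy it while estimating the forward and backward motions at inconsistent scales.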
【Paper Link】 【Pages】:970-976
【Authors】: Zixiang Zhao ; Shuang Xu ; Chunxia Zhang ; Junmin Liu ; Jiangshe Zhang ; Pengfei Li
【Abstract】: Infrared and visible image fusion, a hot topic in the field of image processing, aims at obtaining fused images that keep the advantages of the source images. This paper proposes a novel auto-encoder (AE) based fusion network. The core idea is that the encoder decomposes an image into background and detail feature maps carrying low- and high-frequency information, respectively, and the decoder recovers the original image. To this end, the loss function makes the background feature maps of the source images similar and their detail feature maps dissimilar. In the test phase, the background and detail feature maps are respectively merged via a fusion module, and the fused image is recovered by the decoder. Qualitative and quantitative results illustrate that our method can generate fused images containing highlighted targets and abundant detail texture with strong reproducibility, while surpassing state-of-the-art (SOTA) approaches.
【Keywords】: Computer Vision: Computational Photography, Photometry, Shape from X; Computer Vision: 2D and 3D Computer Vision; Machine Learning: Deep Learning: Convolutional networks;
【Paper Link】 【Pages】:977-983
【Authors】: Yuhui Xu ; Yuxi Li ; Shuai Zhang ; Wei Wen ; Botao Wang ; Yingyong Qi ; Yiran Chen ; Weiyao Lin ; Hongkai Xiong
【Abstract】: To enable DNNs on edge devices such as mobile phones, low-rank approximation has been widely adopted because of its solid theoretical rationale and efficient implementations. Several previous works attempted to directly approximate a pre-trained model by low-rank decomposition; however, small approximation errors in the parameters can ripple into a large prediction loss. As a result, performance usually drops significantly, and a sophisticated fine-tuning effort is required to recover accuracy. Clearly, it is not optimal to separate low-rank approximation from training. Unlike previous works, this paper integrates low-rank approximation and regularization into the training process. We propose Trained Rank Pruning (TRP), which alternates between low-rank approximation and training. TRP maintains the capacity of the original network while imposing low-rank constraints during training. A nuclear-norm regularizer optimized by stochastic sub-gradient descent is utilized to further promote low rank in TRP. The TRP-trained network inherently has a low-rank structure and can be approximated with negligible performance loss, eliminating the fine-tuning process after low-rank decomposition. The proposed method is comprehensively evaluated on CIFAR-10 and ImageNet, outperforming previous compression methods based on low-rank approximation.
【Keywords】: Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation; Machine Learning: Deep Learning; Machine Learning: Deep Learning: Convolutional networks;
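The low-rank projection half of the alternation above can be sketched with a truncated SVD. The energy-based rank selection below is an illustrative assumption for the sketch; TRP's actual rank selection and training schedule differ.

```python
import numpy as np

def truncate_rank(W, energy=0.99):
    """Project a weight matrix to the smallest rank whose singular
    values retain `energy` of the squared spectral energy."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    cum = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = int(np.searchsorted(cum, energy)) + 1
    return (U[:, :k] * s[:k]) @ Vt[:k], k
```

In an alternating scheme, a gradient step on the task loss (plus a nuclear-norm subgradient) would be followed by this projection, so the weights the network learns stay close to a low-rank manifold and the final decomposition loses little accuracy.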
【Paper Link】 【Pages】:984-990
【Authors】: Xinxun Xu ; Muli Yang ; Yanhua Yang ; Hao Wang
【Abstract】: Zero-Shot Sketch-Based Image Retrieval (ZS-SBIR) is a specific cross-modal retrieval task for searching natural images given free-hand sketches under the zero-shot scenario. Most existing methods solve this problem by simultaneously projecting visual features and semantic supervision into a low-dimensional common space for efficient retrieval. However, such low-dimensional projection destroys the completeness of the semantic knowledge in the original semantic space, so useful knowledge cannot be transferred well when learning semantic features from different modalities. Moreover, domain information and semantic information are entangled in the visual features, which is not conducive to cross-modal matching since it hinders the reduction of the domain gap between sketches and images. In this paper, we propose a Progressive Domain-independent Feature Decomposition (PDFD) network for ZS-SBIR. Specifically, under the supervision of the original semantic knowledge, PDFD decomposes visual features into domain features and semantic features, and the semantic features are then projected into the common space as retrieval features for ZS-SBIR. The progressive projection strategy maintains strong semantic supervision. Besides, to guarantee that the retrieval features capture clean and complete semantic information, a cross-reconstruction loss is introduced to encourage any combination of retrieval features and domain features to reconstruct the visual features. Extensive experiments demonstrate the superiority of PDFD over state-of-the-art competitors.
【Keywords】: Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation; Machine Learning: Deep Generative Models; Machine Learning: Deep Learning;
【Paper Link】 【Pages】:991-997
【Authors】: Zhihui Lin ; Chun Yuan ; Maomao Li
【Abstract】: Stochastic video generation methods predict diverse videos based on observed frames, where the main challenge lies in modeling the complex future uncertainty and generating realistic frames. Numerous Recurrent-VAE-based methods have achieved state-of-the-art results. However, on the one hand, the independence assumption on the variables of the approximate posterior limits inference performance; on the other hand, although these methods adopt skip connections between encoder and decoder to utilize multi-level features, they still produce blurry generations due to the spatial misalignment between encoder and decoder features at different time steps. In this paper, we propose a hierarchical recurrent VAE with a feature aligner, which not only relaxes the independence assumption of the typical VAE but also uses a feature aligner to enable the decoder to obtain aligned spatial information from the last observed frames. The proposed model is named the Hierarchical Stochastic Video Generation network with Aligned Features, referred to as HAF-SVG. Experiments on the Moving-MNIST, BAIR, and KTH datasets demonstrate that the hierarchical structure helps model future uncertainty more accurately, and the feature aligner is beneficial for generating realistic frames. Moreover, HAF-SVG exceeds SVG in both prediction accuracy and the quality of generated frames.
【Keywords】: Computer Vision: Other; Machine Learning: Deep Generative Models; Computer Vision: Video: Events, Activities and Surveillance; Machine Learning: Learning Generative Models;
【Paper Link】 【Pages】:998-1004
【Authors】: Jiayin Cai ; Chun Yuan ; Cheng Shi ; Lei Li ; Yangyang Cheng ; Ying Shan
【Abstract】: Recently, Recurrent Neural Network (RNN) based methods and Self-Attention (SA) based methods have achieved promising performance in Video Question Answering (VideoQA). Despite their success, RNN-based methods tend to forget global semantic content due to the inherent drawbacks of the recurrent units themselves, while SA-based methods cannot precisely capture the dependencies of the local neighborhood, leading to insufficient modeling of temporal order. To tackle these problems, we propose a novel VideoQA framework that progressively refines the representations of videos and questions from fine to coarse grain in a sequence-sensitive manner. Specifically, our model improves the feature representations in two steps: (1) introducing two fine-grained feature-augmented memories to strengthen the information augmentation of video and text, which improves memory capacity by memorizing more relevant and targeted information; (2) appending self-attention and co-attention modules to the memory output, so that the model can capture global interactions between high-level semantic information. Experimental results show that our approach achieves state-of-the-art performance on VideoQA benchmark datasets.
【Keywords】: Computer Vision: Video: Events, Activities and Surveillance; Computer Vision: Language and Vision; Machine Learning: Deep Learning;
【Paper Link】 【Pages】:1005-1011
【Authors】: Zerun Feng ; Zhimin Zeng ; Caili Guo ; Zheng Li
【Abstract】: Video retrieval is a challenging research topic bridging the vision and language areas, and it has attracted broad attention in recent years. Previous works have been devoted to representing videos by encoding directly from frame-level features. In fact, videos contain various and abundant semantic relations to which existing methods pay less attention. To address this issue, we propose a Visual Semantic Enhanced Reasoning Network (ViSERN) that exploits reasoning between frame regions. Specifically, we consider frame regions as vertices and construct a fully-connected semantic correlation graph. Then, we perform reasoning with a novel random-walk-rule-based graph convolutional network to generate region features imbued with semantic relations. With the benefit of this reasoning, semantic interactions between regions are considered while the impact of redundancy is suppressed. Finally, the region features are aggregated into frame-level features for further encoding to measure video-text similarity. Extensive experiments on two public benchmark datasets validate the effectiveness of our method, which achieves state-of-the-art performance thanks to its powerful semantic reasoning.
【Keywords】: Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation; Computer Vision: Language and Vision;
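A random-walk propagation rule of the kind named above can be sketched in one layer: normalize the adjacency matrix into a row-stochastic transition matrix so each vertex averages its neighborhood before the linear map. The self-loop and ReLU below are common conventions assumed for the sketch, not necessarily ViSERN's exact rule.

```python
import numpy as np

def random_walk_conv(X, A, W):
    """One graph-convolution layer with a random-walk propagation
    matrix: H = relu(D^-1 (A + I) X W)."""
    A_hat = A + np.eye(A.shape[0])                # add self-loops
    T = A_hat / A_hat.sum(axis=1, keepdims=True)  # row-stochastic D^-1 (A+I)
    return np.maximum(T @ X @ W, 0.0)             # ReLU
```

Row-stochastic normalization keeps each propagated feature a convex combination of region features, which damps the influence of redundant high-degree regions.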
【Paper Link】 【Pages】:1012-1018
【Authors】: Jiawei Liu ; Zheng-Jun Zha ; Xierong Zhu ; Na Jiang
【Abstract】: Person re-identification aims at identifying a certain pedestrian across non-overlapping camera networks. Video-based person re-identification approaches have gained significant attention recently, extending image-based approaches by learning features from multiple frames. In this work, we propose a novel Co-Saliency Spatio-Temporal Interaction Network (CSTNet) for person re-identification in videos. It captures the common salient foreground regions among video frames and explores the spatial-temporal long-range context interdependency of such regions, towards learning discriminative pedestrian representations. Specifically, multiple co-saliency learning modules within CSTNet are designed to utilize correlated information across video frames to extract salient features from task-relevant regions and suppress background interference. Moreover, multiple spatial-temporal interaction modules within CSTNet are proposed to exploit the spatial and temporal long-range context interdependencies of such features and the spatial-temporal information correlation, enhancing the feature representation. Extensive experiments on two benchmarks have demonstrated the effectiveness of the proposed method.
【Keywords】: Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation; Computer Vision: Video: Events, Activities and Surveillance;
【Paper Link】 【Pages】:1019-1025
【Authors】: Kai Zhu ; Wei Zhai ; Yang Cao
【Abstract】: Few-shot segmentation aims at assigning a category label to each image pixel with few annotated samples. It is a challenging task since the dense prediction can only be achieved under the guidance of latent features defined by sparse annotations. Existing meta-learning-based methods tend to fail to generate category-specific discriminative descriptors when the visual features extracted from the support images are marginalized in the embedding space. To address this issue, this paper presents an adaptive tuning framework in which the distribution of latent features across different episodes is dynamically adjusted based on a self-segmentation scheme, augmenting category-specific descriptors for label prediction. Specifically, a novel self-supervised inner loop is first devised as the base learner to extract the underlying semantic features from the support image. Then, gradient maps are calculated by back-propagating the self-supervised loss through the obtained features and leveraged as guidance for augmenting the corresponding elements in the embedding space. Finally, with the ability to continuously learn from different episodes, an optimization-based meta-learner is adopted as the outer loop of our proposed framework to gradually refine the segmentation results. Extensive experiments on the benchmark PASCAL-5^i and COCO-20^i datasets demonstrate the superiority of our proposed method over the state of the art.
【Keywords】: Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation; Machine Learning: Deep Learning; Machine Learning: Other;
【Paper Link】 【Pages】:1026-1032
【Authors】: Mengxi Jia ; Yunpeng Zhai ; Shijian Lu ; Siwei Ma ; Jian Zhang
【Abstract】: RGB-Infrared (IR) cross-modality person re-identification (re-ID), which aims to match an IR image against an RGB gallery or vice versa, is a challenging task due to the large discrepancy between the IR and RGB modalities. Existing methods typically address this challenge by aligning feature distributions or image styles across modalities, whereas the very useful similarities among gallery samples of the same modality (i.e., intra-modality sample similarities) are largely neglected. This paper presents a novel similarity inference metric (SIM) that exploits intra-modality sample similarities to circumvent the cross-modality discrepancy and achieve optimal cross-modality image matching. SIM works by successive similarity graph reasoning and mutual nearest-neighbor reasoning, which mine cross-modality sample similarities by leveraging intra-modality sample similarities from two different perspectives. Extensive experiments on two cross-modality re-ID datasets (SYSU-MM01 and RegDB) show that SIM achieves significant accuracy improvement with little extra training compared with the state-of-the-art.
【Keywords】: Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation;
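The mutual nearest-neighbor reasoning mentioned in the SIM abstract above can be sketched in miniature: keep a cross-modality match only when each side ranks the other among its own nearest neighbors. The toy features, the value of k, and the plain Euclidean metric here are illustrative assumptions, not the paper's actual pipeline.

```python
# Toy sketch of mutual (k-reciprocal) nearest-neighbor reasoning across
# two modalities. All features and parameters here are invented.

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def top_k(query, items, k):
    """Indices of the k items closest to `query`."""
    order = sorted(range(len(items)), key=lambda i: euclidean(query, items[i]))
    return set(order[:k])

def mutual_neighbors(queries, gallery, k):
    """For each query, keep only gallery items that 'agree': the gallery
    item must rank the query among its own k nearest queries as well."""
    result = []
    for qi, q in enumerate(queries):
        candidates = top_k(q, gallery, k)
        mutual = {gi for gi in candidates
                  if qi in top_k(gallery[gi], queries, k)}
        result.append(mutual)
    return result

# Two IR query features and three RGB gallery features (2-D toy vectors).
queries = [[0.0, 0.0], [5.0, 5.0]]
gallery = [[0.1, 0.1], [4.9, 5.1], [10.0, 10.0]]
print(mutual_neighbors(queries, gallery, k=1))  # each query keeps its true match
```

The filtering effect is what matters: a gallery sample close to the query but not reciprocally close to it is dropped, which is one way intra-modality structure can sharpen cross-modality matching.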
【Paper Link】 【Pages】:1033-1039
【Authors】: Li'an Zhuo ; Baochang Zhang ; Hanlin Chen ; Linlin Yang ; Chen Chen ; Yanjun Zhu ; David S. Doermann
【Abstract】: Neural architecture search (NAS) proves to be among the best approaches for many tasks by generating application-adaptive neural architectures, but it is still challenged by high computational cost and memory consumption. At the same time, 1-bit convolutional neural networks (CNNs) with binarized weights and activations show their potential for resource-limited embedded devices. One natural approach is to use 1-bit CNNs to reduce the computation and memory cost of NAS, taking advantage of the strengths of each in a unified framework. To this end, a Child-Parent model is introduced into differentiable NAS to search for a binarized architecture (Child) under the supervision of a full-precision model (Parent). In the search stage, the Child-Parent model uses an indicator generated from the parent and child model accuracies to evaluate performance and abandon operations with less potential. In the training stage, a kernel-level CP loss is introduced to optimize the binarized network. Extensive experiments demonstrate that the proposed CP-NAS achieves accuracy comparable with traditional NAS on both the CIFAR and ImageNet databases. It achieves an accuracy of 95.27% on CIFAR-10 and 64.3% on ImageNet with binarized weights and activations, and a 30% faster search than prior art.
【Keywords】: Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation; Machine Learning: Deep Learning: Convolutional networks;
【Paper Link】 【Pages】:1040-1046
【Authors】: Haifeng Zhang ; Wen Su ; Jun Yu ; Zengfu Wang
【Abstract】: To extract crucial local features and enhance the complementary relation between local and global features, this paper proposes a Weakly Supervised Local-Global Relation Network (WS-LGRN), which uses the attention mechanism to deal with part location and feature fusion problems. Firstly, the Attention Map Generator quickly finds the local regions of interest under the supervision of image-level labels. Secondly, bilinear attention pooling is employed to generate and refine local features. Thirdly, a Relational Reasoning Unit is designed to model the relations among all features before classification. The weighted fusion mechanism in the Relational Reasoning Unit lets the model benefit from the complementary advantages of different features. In addition, contrastive losses are introduced for local and global features to increase inter-class dispersion and intra-class compactness at different granularities. Experiments on lab-controlled and real-world facial expression datasets show that WS-LGRN achieves state-of-the-art performance, which demonstrates its superiority in facial expression recognition (FER).
【Keywords】: Computer Vision: Biometrics, Face and Gesture Recognition; Machine Learning: Classification; Machine Learning: Deep Learning: Convolutional networks;
【Paper Link】 【Pages】:1047-1053
【Authors】: Qinming Zhang ; Luyan Liu ; Kai Ma ; Cheng Zhuo ; Yefeng Zheng
【Abstract】: Deep convolutional neural networks (DCNNs) have contributed many breakthroughs in segmentation tasks, especially in the field of medical imaging. However, domain shift and corrupted annotations, two common problems in medical imaging, dramatically degrade the performance of DCNNs in practice. In this paper, we propose a novel robust cross-denoising framework that uses two peer networks to address the domain shift and corrupted label problems with a peer-review strategy. Specifically, each network acts as a mentor, mutually supervised to learn from reliable samples selected by its peer network to combat corrupted labels. In addition, a noise-tolerant loss is proposed to encourage the networks to capture the key locations and filter out the discrepancy under various noise-contaminated labels. To further reduce the accumulated error, we introduce class-imbalanced cross learning using the most confident predictions at the class level. Experimental results on the REFUGE and Drishti-GS datasets for optic disc (OD) and optic cup (OC) segmentation demonstrate the superior performance of our proposed approach over state-of-the-art methods.
【Keywords】: Computer Vision: 2D and 3D Computer Vision; Machine Learning Applications: Applications of Unsupervised Learning; Machine Learning Applications: Bio/Medicine;
【Paper Link】 【Pages】:1054-1060
【Authors】: Hongrui Zhao ; Jin Yu ; Yanan Li ; Donghui Wang ; Jie Liu ; Hongxia Yang ; Fei Wu
【Abstract】: Nowadays, both online shopping and video sharing have grown exponentially. Although internet celebrities in videos are an ideal showcase for fashion corporations to sell their products, audiences do not always know where to buy the fashion products shown in videos, a cross-domain problem called video-to-shop. In this paper, we propose a novel deep neural network, called Detect, Pick, and Retrieval Network (DPRNet), to bridge the gap between fashion products in videos and audiences. On the video side, we modify the traditional object detector, which automatically picks out the best object proposals for every commodity in videos without duplication, to improve performance on the video-to-shop task. On the fashion retrieval side, a simple but effective multi-task loss network obtains new state-of-the-art results on DeepFashion. Extensive experiments conducted on a new large-scale cross-domain video-to-shop dataset show that DPRNet is efficient and outperforms state-of-the-art methods on the video-to-shop task.
【Keywords】: Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation; Multidisciplinary Topics and Applications: Recommender Systems; Machine Learning: Recommender Systems; Data Mining: Mining Text, Web, Social Media;
【Paper Link】 【Pages】:1061-1068
【Authors】: Wentian Zhao ; Seokhwan Kim ; Ning Xu ; Hailin Jin
【Abstract】: This paper presents a new video question answering task on screencast tutorials. We introduce a dataset of question, answer, and context triples from tutorial videos for a software product. Unlike other video question answering work, all the answers in our dataset are grounded in a domain knowledge base. A one-shot recognition algorithm is designed to extract visual cues, which helps enhance the performance of video question answering. We also propose several baseline neural network architectures based on various aspects of video context from the dataset. The experimental results demonstrate that our proposed models significantly improve question answering performance by incorporating multi-modal contexts and domain knowledge.
【Keywords】: Computer Vision: Language and Vision; Natural Language Processing: Question Answering; Machine Learning: Deep Learning;
【Paper Link】 【Pages】:1069-1075
【Authors】: Zhu Zhang ; Zhou Zhao ; Zhijie Lin ; Baoxing Huai ; Jing Yuan
【Abstract】: Spatio-temporal video grounding aims to retrieve the spatio-temporal tube of a queried object according to a given sentence. Currently, most existing grounding methods are restricted to well-aligned segment-sentence pairs. In this paper, we explore spatio-temporal video grounding on unaligned data and multi-form sentences. This challenging task requires capturing critical object relations to identify the queried target. However, existing approaches cannot distinguish notable objects and are mired in ineffective relation modeling among unnecessary objects. Thus, we propose a novel object-aware multi-branch relation network for object-aware relation discovery. Concretely, we first devise multiple branches to develop object-aware region modeling, where each branch focuses on a crucial object mentioned in the sentence. We then propose multi-branch relation reasoning to capture critical object relationships between the main branch and auxiliary branches. Moreover, we apply a diversity loss to make each branch attend only to its corresponding object and to boost multi-branch learning. Extensive experiments show the effectiveness of our proposed method.
【Keywords】: Computer Vision: Language and Vision;
【Paper Link】 【Pages】:1076-1082
【Authors】: Zhedong Zheng ; Yi Yang
【Abstract】: This work focuses on the unsupervised scene adaptation problem of learning from both labeled source data and unlabeled target data. Existing approaches focus on minimizing the inter-domain gap between the source and target domains. However, the intra-domain knowledge and inherent uncertainty learned by the network are under-explored. In this paper, we propose an orthogonal method, called memory regularization in vivo, to exploit the intra-domain knowledge and regularize the model training. Specifically, we refer to the segmentation model itself as the memory module, and minimize the discrepancy between its two classifiers, i.e., the primary classifier and the auxiliary classifier, to reduce prediction inconsistency. Without extra parameters, the proposed method is complementary to most existing domain adaptation methods and can generally improve their performance. Albeit simple, we verify the effectiveness of memory regularization on two synthetic-to-real benchmarks, GTA5 → Cityscapes and SYNTHIA → Cityscapes, yielding +11.1% and +11.3% mIoU improvement over the baseline model, respectively. Besides, a similar +12.0% mIoU improvement is observed on the cross-city benchmark Cityscapes → Oxford RobotCar.
【Keywords】: Computer Vision: Big Data and Large Scale Methods; Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation;
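The core mechanism in the abstract above, minimizing the discrepancy between a primary and an auxiliary classifier, can be sketched as a prediction-consistency penalty on the two heads' class distributions. The symmetric-KL form below is an assumption for illustration; the paper's exact loss may differ.

```python
import math

# Sketch of a prediction-consistency term between two classifier heads,
# in the spirit of "memory regularization in vivo": penalize the KL
# divergence between the primary and auxiliary softmax outputs. The
# symmetric form is an illustrative assumption.

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kl(p, q, eps=1e-12):
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def memory_consistency(primary_logits, auxiliary_logits):
    """Symmetric KL between the two heads' class distributions."""
    p = softmax(primary_logits)
    q = softmax(auxiliary_logits)
    return 0.5 * (kl(p, q) + kl(q, p))

identical = memory_consistency([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
disagree = memory_consistency([2.0, 0.5, -1.0], [-1.0, 0.5, 2.0])
print(identical, disagree)  # 0 for identical heads, positive when they disagree
```

Because the term involves no new parameters, minimizing it alongside the segmentation loss is the kind of "free" regularizer the abstract describes as complementary to existing adaptation methods.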
【Paper Link】 【Pages】:1083-1089
【Authors】: Xi Zhu ; Zhendong Mao ; Chunxiao Liu ; Peng Zhang ; Bin Wang ; Yongdong Zhang
【Abstract】: Most Visual Question Answering (VQA) models suffer from the language prior problem, which is caused by inherent data biases. Specifically, VQA models tend to answer questions (e.g., "what color is the banana?") based on high-frequency answers (e.g., "yellow") while ignoring image content. Existing approaches tackle this problem by creating delicate models or introducing additional visual annotations to reduce question dependency and strengthen image dependency. However, they are still subject to the language prior problem since the data biases have not been fundamentally addressed. In this paper, we introduce a self-supervised learning framework to solve this problem. Concretely, we first automatically generate labeled data to balance the biased data, and then propose a self-supervised auxiliary task that utilizes the balanced data to help the VQA model overcome language priors. Our method can compensate for the data biases by generating balanced data without introducing external annotations. Experimental results show that our method achieves state-of-the-art performance, improving the overall accuracy from 49.50% to 57.59% on the most commonly used benchmark, VQA-CP v2. In other words, we improve on annotation-based methods by 16% without using external annotations. Our code is available on GitHub.
【Keywords】: Computer Vision: Language and Vision; Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation;
【Paper Link】 【Pages】:1090-1096
【Authors】: Yaohui Zhu ; Chenlong Liu ; Shuqiang Jiang
【Abstract】: The goal of few-shot image recognition is to distinguish different categories with only one or a few training samples. Previous work on few-shot learning mainly targets general object images, and current solutions usually learn a global image representation from training tasks to adapt to novel tasks. However, fine-grained categories are distinguished by subtle and local parts, which global representations cannot capture effectively. This may hinder existing few-shot learning approaches from dealing with fine-grained categories well. In this work, we propose a multi-attention meta-learning (MattML) method for few-shot fine-grained image recognition (FSFGIR). Instead of using only a base learner for general feature learning, the proposed meta-learning method uses the attention mechanisms of both a base learner and a task learner to capture the discriminative parts of images. The base learner is equipped with two convolutional block attention modules (CBAM) and a classifier. The two CBAMs learn diverse and informative parts, and the initial weights of the classifier are attended by the task learner, which gives the classifier a task-related sensitive initialization. For adaptation, a gradient-based meta-learning approach is employed to update the parameters of the two CBAMs and the attended classifier, which helps the updated base learner adaptively focus on discriminative parts. We experimentally analyze the different components of our method, and experimental results on four benchmark datasets demonstrate its effectiveness and superiority.
【Keywords】: Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation;
【Paper Link】 【Pages】:1097-1103
【Authors】: Zihao Zhu ; Jing Yu ; Yujing Wang ; Yajing Sun ; Yue Hu ; Qi Wu
【Abstract】: Fact-based Visual Question Answering (FVQA) requires external knowledge beyond the visible content to answer questions about an image. This ability is challenging but indispensable for achieving general VQA. One limitation of existing FVQA solutions is that they jointly embed all kinds of information without fine-grained selection, which introduces unexpected noise into reasoning about the final answer. How to capture the question-oriented and information-complementary evidence remains a key challenge. In this paper, we depict an image as a multi-modal heterogeneous graph, which contains multiple layers of information corresponding to visual, semantic and factual features. On top of the multi-layer graph representations, we propose a modality-aware heterogeneous graph convolutional network to capture the evidence from different layers that is most relevant to the given question. Specifically, the intra-modal graph convolution selects evidence from each modality and the cross-modal graph convolution aggregates relevant information across different graph layers. By stacking this process multiple times, our model performs iterative reasoning across the three modalities and predicts the optimal answer by analyzing all question-oriented evidence. We achieve a new state-of-the-art performance on the FVQA task and demonstrate the effectiveness and interpretability of our model with extensive experiments.
【Keywords】: Computer Vision: Language and Vision; Machine Learning: Knowledge-based Learning;
【Paper Link】 【Pages】:1104-1110
【Authors】: Xue Lin ; Qi Zou ; Xixia Xu
【Abstract】: Human-object interaction (HOI) detection is important for understanding human-centric scenes and is challenging due to the subtle differences between fine-grained actions and the presence of multiple co-occurring interactions. Most approaches tackle these problems by considering multi-stream information and even introducing extra knowledge, and they suffer from a huge combination space and the non-interactive pair domination problem. In this paper, we propose an Action-Guided attention mining and Relation Reasoning (AGRR) network to solve these problems. Relation reasoning on human-object pairs is performed by exploiting contextual compatibility consistency among pairs to filter out non-interactive combinations. To better discriminate the subtle differences between fine-grained actions, an action-aware attention mechanism based on class activation maps is proposed to mine the features most relevant to recognizing HOIs. Extensive experiments on the V-COCO and HICO-DET datasets demonstrate the effectiveness of the proposed model compared with state-of-the-art approaches.
【Keywords】: Computer Vision: Action Recognition; Computer Vision: Structural and Model-Based Approaches, Knowledge Representation and Reasoning; Machine Learning: Relational Learning; Machine Learning: Deep Learning: Convolutional networks;
【Paper Link】 【Pages】:1111-1117
【Authors】: Dongming Yang ; Yuexian Zou
【Abstract】: Human-Object Interaction (HOI) detection is devoted to learning how humans interact with surrounding objects by inferring triplets of <human, verb, object>. However, recent HOI detection methods mostly rely on additional annotations (e.g., human pose) and neglect powerful interactive reasoning beyond convolutions. In this paper, we present a novel graph-based interactive reasoning model called Interactive Graph (abbr. in-Graph) to infer HOIs, in which the interactive semantics implied among visual targets are efficiently exploited. The proposed model consists of a project function that maps related targets from convolution space to a graph-based semantic space, a message passing process that propagates semantics among all nodes, and an update function that transforms the reasoned nodes back to convolution space. Furthermore, we construct a new framework, namely in-GraphNet, that assembles in-Graph models for detecting HOIs. Beyond inferring HOIs from instance features alone, the framework dynamically parses pairwise interactive semantics among visual targets by integrating two levels of in-Graphs, i.e., scene-wide and instance-wide in-Graphs. Our framework is end-to-end trainable and free from costly annotations such as human pose. Extensive experiments show that our proposed framework outperforms existing HOI detection methods on both the V-COCO and HICO-DET benchmarks, improving over the baseline by about 9.4% and 15% relative, respectively, validating its efficacy in detecting HOIs.
【Keywords】: Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation; Computer Vision: Structural and Model-Based Approaches, Knowledge Representation and Reasoning; Computer Vision: Action Recognition; Computer Vision: 2D and 3D Computer Vision;
【Paper Link】 【Pages】:1119-1125
【Authors】: Julien Baste ; Michael R. Fellows ; Lars Jaffke ; Tomás Masarík ; Mateus de Oliveira Oliveira ; Geevarghese Philip ; Frances A. Rosamond
【Abstract】: When modeling an application of practical relevance as an instance of a combinatorial problem X, we are often interested not merely in finding one optimal solution for that instance, but in finding a sufficiently diverse collection of good solutions. In this work we initiate a systematic study of diversity from the point of view of fixed-parameter tractability theory. We consider an intuitive notion of diversity of a collection of solutions which suits a large variety of combinatorial problems of practical interest. Our main contribution is an algorithmic framework which automatically converts a tree-decomposition-based dynamic programming algorithm for a given combinatorial problem X into a dynamic programming algorithm for the diverse version of X. Surprisingly, our algorithm has only a polynomial dependence on the diversity parameter.
【Keywords】: Constraints and SAT: Constraint Optimization; Constraints and SAT: Constraint Satisfaction;
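A common instantiation of the "intuitive notion of diversity" mentioned in the abstract above is the sum of pairwise Hamming distances between solutions, each viewed as a set of chosen elements. Whether the paper uses exactly this measure is an assumption; the sketch only illustrates the kind of quantity being maximized.

```python
from itertools import combinations

# Sketch of a sum-of-pairwise-Hamming-distances diversity measure over a
# collection of solutions, each represented as a set of chosen elements
# (e.g. the vertices of a vertex cover). Illustrative only.

def hamming(a, b):
    """Size of the symmetric difference of two solution sets."""
    return len(a ^ b)

def diversity(solutions):
    return sum(hamming(a, b) for a, b in combinations(solutions, 2))

sols = [{1, 2}, {2, 3}, {4, 5}]
print(diversity(sols))  # 2 + 4 + 4 = 10
```

The framework's surprising property is that the dynamic program over a tree decomposition can track this aggregate with only polynomial overhead in the diversity parameter.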
【Paper Link】 【Pages】:1126-1133
【Authors】: Xizhe Zhang ; Jian Gao ; Yizhi Lv ; Weixiong Zhang
【Abstract】: Constraint propagation and backtracking are two basic techniques for solving constraint satisfaction problems (CSPs). During the search for a solution, variable-value pairs that do not belong to any solution can be discarded by constraint propagation to ensure generalized arc consistency and avoid fruitless search. However, constraint propagation is frequently invoked, often with little effect, on many CSPs. Much effort has been devoted to predicting when to invoke constraint propagation for solving a CSP; however, no effective approach has been developed for the alldifferent constraint. Here we present a novel theorem for identifying the edges in the value graph of an alldifferent constraint whose removal can significantly reduce useless constraint propagation. We prove that if an alternating cycle exists for a prospectively removable edge that represents a variable-value assignment, the edge (and the assignment) can be discarded without constraint propagation. Based on this theorem, we developed a novel optimization technique for early detection of useless constraint propagation that can be incorporated into any existing algorithm for the alldifferent constraint. Our implementation of the new method achieved speedups by a factor of 1-5 over state-of-the-art approaches on 93 benchmark problem instances from 8 domains. Furthermore, the new algorithm scales well and runs increasingly faster than existing methods on larger problems.
【Keywords】: Constraints and SAT: Constraint Satisfaction; Constraints and SAT: Constraints: Modeling, Solvers, Applications; Constraints and SAT: Global Constraints;
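For readers unfamiliar with the value graph mentioned in the abstract above: an edge x=v survives generalized arc consistency exactly when it occurs in some solution where all variables take distinct values. The brute-force sketch below shows only this baseline filtering that the paper's alternating-cycle theorem accelerates; the exponential enumeration is deliberate and is what practical matching-based propagators avoid.

```python
# Brute-force sketch of the value-graph view of the alldifferent
# constraint: an assignment x=v is supported iff it appears in at least
# one all-different solution. Toy sizes only; real propagators use
# matching algorithms instead of enumeration.

def supported_values(domains):
    """domains: list of value sets, one per variable. Returns, for each
    variable, the values appearing in at least one all-different solution."""
    n = len(domains)
    support = [set() for _ in range(n)]

    def extend(i, used, partial):
        if i == n:
            for var, val in enumerate(partial):
                support[var].add(val)
            return
        for v in domains[i]:
            if v not in used:
                extend(i + 1, used | {v}, partial + [v])

    extend(0, set(), [])
    return support

# Classic example: x0, x1 in {1,2} force x2 = 3, so the edges x2=1 and
# x2=2 can be pruned from the value graph.
print(supported_values([{1, 2}, {1, 2}, {1, 2, 3}]))  # [{1, 2}, {1, 2}, {3}]
```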
【Paper Link】 【Pages】:1134-1140
【Authors】: Stephan Gocht ; Ciaran McCreesh ; Jakob Nordström
【Abstract】: Modern subgraph isomorphism solvers carry out sophisticated reasoning using graph invariants such as degree sequences and path counts. We show that all of this reasoning can be justified compactly using the cutting planes proofs studied in complexity theory. This allows us to extend a state-of-the-art subgraph isomorphism enumeration solver with proof logging support, so that the solutions it outputs may be audited and verified for correctness and completeness by a simple third-party tool which knows nothing about graph theory.
【Keywords】: Constraints and SAT: Constraint Satisfaction; Constraints and SAT: SAT: Algorithms and Techniques;
【Paper Link】 【Pages】:1141-1147
【Authors】: Zhendong Lei ; Shaowei Cai ; Chuan Luo
【Abstract】: Satisfiability (SAT) and Maximum Satisfiability (MaxSAT) are two basic and important constraint problems with many important applications. SAT and MaxSAT are expressed in CNF, in which cardinality constraints are difficult to deal with. In this paper, we introduce Extended Conjunctive Normal Form (ECNF), which expresses cardinality constraints straightforwardly and needs no auxiliary variables or clauses. We then develop a simple and efficient local search solver, LS-ECNF, with a well-designed scoring function under ECNF. We also develop a generalized Unit Propagation (UP) based algorithm to generate the initial solution for local search. We encode instances from the Nurse Rostering and Discrete Tomography problems into CNF with three different cardinality constraint encodings and into ECNF, respectively. Experimental results show that LS-ECNF performs much better than state-of-the-art MaxSAT, SAT, Pseudo-Boolean and ILP solvers, which indicates that solving cardinality constraints with ECNF is promising.
【Keywords】: Constraints and SAT: Constraint Satisfaction; Constraints and SAT: SAT: Solvers and Applications; Constraints and SAT: MaxSAT, MinSAT;
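The point of the ECNF abstract above, that a cardinality constraint needs no auxiliary variables when stated directly, can be shown with a minimal evaluator for "at least k of these literals are true" constraints. The concrete syntax (positive literal i for x_i, negative -i for its negation) is an assumption; the paper's ECNF format may differ.

```python
# Sketch of evaluating an ECNF-style cardinality constraint directly,
# with no auxiliary variables or clauses: the constraint holds when at
# least k of its literals are true. The literal encoding is assumed.

def lit_true(lit, assignment):
    value = assignment[abs(lit)]
    return value if lit > 0 else not value

def cardinality_satisfied(lits, k, assignment):
    return sum(lit_true(l, assignment) for l in lits) >= k

# "At least 2 of {x1, x2, NOT x3}" under x1=T, x2=F, x3=F.
assignment = {1: True, 2: False, 3: False}
print(cardinality_satisfied([1, 2, -3], 2, assignment))  # True: x1 and NOT x3 hold
```

An ordinary clause is the special case k=1, which is why a local search scoring function over such constraints generalizes the CNF one naturally.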
【Paper Link】 【Pages】:1148-1154
【Authors】: Daniel Le Berre ; Pierre Marquis ; Stefan Mengel ; Romain Wallon
【Abstract】: Learning pseudo-Boolean (PB) constraints in PB solvers exploiting cutting-planes-based inference is not as well understood as clause learning in conflict-driven clause learning solvers. In this paper, we show that PB constraints derived using cutting planes may contain irrelevant literals, i.e., literals whose assigned values (whatever they are) never change the truth value of the constraint. Such literals may lead to inferring constraints that are weaker than they should be, increasing the size of the proof built by the solver and thus degrading its performance. This suggests that current implementations of PB solvers based on cutting planes should be reconsidered to prevent the generation of irrelevant literals. Indeed, detecting and removing irrelevant literals is too expensive in practice to be considered an option (the associated problem is NP-hard).
【Keywords】: Constraints and SAT: SAT: Algorithms and Techniques; Constraints and SAT: SAT: Solvers and Applications;
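The definition of irrelevance in the abstract above is concrete enough to check by brute force: in a constraint sum_i a_i·x_i >= d, a literal is irrelevant when flipping it never changes the constraint's truth value under any assignment of the other variables. The exponential enumeration below matches the abstract's point that exact detection is NP-hard and hence too expensive inside a solver.

```python
from itertools import product

# Brute-force check of literal irrelevance in a pseudo-Boolean constraint
# sum_i a_i * x_i >= d: literal j is irrelevant if flipping x_j never
# changes the constraint's truth value. Exponential by design.

def pb_holds(coeffs, degree, assignment):
    return sum(a * x for a, x in zip(coeffs, assignment)) >= degree

def irrelevant(coeffs, degree, j):
    n = len(coeffs)
    for bits in product((0, 1), repeat=n):
        flipped = list(bits)
        flipped[j] = 1 - flipped[j]
        if pb_holds(coeffs, degree, bits) != pb_holds(coeffs, degree, flipped):
            return False
    return True

# In 3*x0 + 3*x1 + 1*x2 >= 3, x2 alone can never tip the sum across 3.
print([irrelevant([3, 3, 1], 3, j) for j in range(3)])  # [False, False, True]
```

The x2 example shows why such literals weaken learned constraints: dropping x2 yields 3·x0 + 3·x1 >= 3, which has exactly the same models over the relevant variables but is syntactically stronger per literal.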
【Paper Link】 【Pages】:1155-1162
【Authors】: Georg Gottlob ; Cem Okulmus ; Reinhard Pichler
【Abstract】: Constraint Satisfaction Problems (CSP) are notoriously hard. Consequently, powerful decomposition methods have been developed to overcome this complexity. However, this poses the challenge of actually computing such a decomposition for a given CSP instance, and previous algorithms have shown their limitations in doing so. In this paper, we present a number of key algorithmic improvements and parallelisation techniques to compute so-called Generalized Hypertree Decompositions (GHDs) faster. We thus advance the ability to compute optimal (i.e., minimal-width) GHDs for a significantly wider range of CSP instances on modern machines. This lays the foundation for more systems and applications in evaluating CSPs and related problems (such as Conjunctive Query answering) based on their structural properties.
【Keywords】: Constraints and SAT: Constraint Satisfaction; Multidisciplinary Topics and Applications: Databases;
【Paper Link】 【Pages】:1163-1169
【Authors】: Marc-André Ménard ; Claude-Guy Quimper ; Jonathan Gaudreault
【Abstract】: Solving a problem is an important part of optimization; an equally important part is the analysis of the solution, where several questions can arise. For a scheduling problem, is it possible to obtain a better solution by increasing the capacity of a resource? What happens to the objective value if we start a specific task earlier? Answering such questions is important for providing explanations and increasing the acceptability of a solution. A lot of research has been done on sensitivity analysis, but few techniques can be applied to constraint programming. We present a new method for sensitivity analysis applied to constraint programming. It collects information during the search about the propagation of the CUMULATIVE constraint, the filtering of the variables, and the solution returned by the solver. Using machine learning algorithms, we predict whether increasing or decreasing the capacity of the cumulative resource allows a better solution. We also predict the impact on the objective value of forcing a task to finish earlier. We experimentally validate our method on the resource-constrained project scheduling problem (RCPSP).
【Keywords】: Constraints and SAT: Constraints and Data Mining; Constraints and Machine Learning; Planning and Scheduling: Scheduling; Constraints and SAT: Constraint Optimization;
【Paper Link】 【Pages】:1170-1176
【Authors】: Hao Hu ; Mohamed Siala ; Emmanuel Hebrard ; Marie-José Huguet
【Abstract】: Recently, several exact methods to compute decision trees have been introduced. On the one hand, these approaches can find optimal trees for various objective functions including total size, depth, or accuracy on the training set. On the other hand, these methods are not yet widely used in practice, and classic heuristics are often still the methods of choice.
In this paper we show how the SAT model proposed by [Narodytska et al., 2018] can be lifted to a MaxSAT approach, making it much more practically relevant. In particular, it scales to much larger data sets; the objective function can easily be adapted to take into account combinations of size, depth and accuracy on the training set; and the fine-grained control of the objective function it offers makes it particularly well suited for boosting.
Our experiments show promising results. In particular, we show that the prediction quality of our approach often exceeds state-of-the-art heuristics. We also show that the MaxSAT formulation is well adapted for boosting using the well-known AdaBoost algorithm.
【Keywords】: Constraints and SAT: MaxSAT, MinSAT; Constraints and SAT: Constraint Optimization; Heuristic Search and Game Playing: Combinatorial Search and Optimisation;
【Paper Link】 【Pages】:1177-1183
【Authors】: Wenjie Zhang ; Zeyu Sun ; Qihao Zhu ; Ge Li ; Shaowei Cai ; Yingfei Xiong ; Lu Zhang
【Abstract】: The Boolean satisfiability problem (SAT) is a famous NP-complete problem in computer science. An effective way of solving a satisfiable SAT problem is stochastic local search (SLS). However, in this method, the initial assignment is generated at random, which impacts the effectiveness of SLS solvers. To address this problem, we propose NLocalSAT, which combines SLS with a solution prediction model, boosting SLS by changing initialization assignments with a neural network. We evaluated NLocalSAT on five SLS solvers (CCAnr, Sparrow, CPSparrow, YalSAT, and probSAT) with instances from the random track of SAT Competition 2018. The experimental results show that solvers with NLocalSAT achieve 27%~62% improvement over the original SLS solvers.
【Keywords】: Constraints and SAT: SAT: Solvers and Applications; Constraints and SAT: SAT: Algorithms and Techniques;
【Paper Link】 【Pages】:1184-1191
【Authors】: Ruiwei Wang ; Roland H. C. Yap
【Abstract】: Constraint Satisfaction Problems (CSPs) are typically solved with Generalized Arc Consistency (GAC). A general CSP can also be encoded into a binary CSP and solved with Arc Consistency (AC). The well-known Hidden Variable Encoding (HVE) is still a state-of-the-art binary encoding for solving CSPs. We propose a new binary encoding, called Bipartite Encoding (BE), which uses the idea of partitioning constraints. A BE-encoded CSP can achieve a higher level of consistency than GAC on the original CSP. We give an algorithm for creating compact bipartite encodings of non-binary CSPs. We present an AC propagator for the binary constraints from BE that exploits their special structure. Experiments on a large set of non-binary CSP benchmarks with table constraints using the Wdeg, Activity and Impact heuristics show that BE with our AC propagator can outperform existing state-of-the-art GAC algorithms (CT, STRbit) and binary encodings (HVE with HTAC).
【Keywords】: Constraints and SAT: Constraint Satisfaction; Constraints and SAT: Constraints: Modeling, Solvers, Applications;
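As background for the abstract above, the Hidden Variable Encoding that BE is compared against can be sketched in a few lines: a non-binary table constraint over x0..xk becomes one hidden variable h whose domain is the set of allowed tuples, plus one binary constraint per position stating that the i-th component of h equals x_i. This simplified form is an illustration of HVE, not the paper's BE construction.

```python
# Minimal sketch of the Hidden Variable Encoding (HVE) of a non-binary
# table constraint into binary constraints via a hidden tuple variable.

def hve(table):
    """table: list of allowed tuples. Returns the hidden variable's
    domain and the binary constraints as (position, check) pairs."""
    hidden_domain = list(table)

    def make_check(i):
        # Binary constraint between h (a tuple) and original variable x_i.
        return lambda tup, v: tup[i] == v

    binary = [(i, make_check(i)) for i in range(len(table[0]))]
    return hidden_domain, binary

table = [(1, 2, 3), (2, 3, 1)]
hidden_domain, binary = hve(table)
pos, check = binary[1]             # the constraint linking h to x1
print(check(hidden_domain[0], 2))  # True: tuple (1, 2, 3) supports x1 = 2
```

Enforcing AC on these binary constraints recovers GAC on the original table constraint, which is the baseline level of consistency that BE is shown to exceed.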
【Paper Link】 【Pages】:1192-1200
【Authors】: Jimmy Ho-Man Lee ; Allen Z. Zhong
【Abstract】: Exploiting dominance relations in many constraint optimization problems can drastically speed up the solving process in practice. Identifying and utilizing dominance relations, however, usually requires human expertise. We present a theoretical framework for a useful class of constraint optimization problems that detects dominance automatically and formulates the generation of the associated dominance breaking nogoods as constraint satisfaction. By controlling the length and quantity of the nogoods, our method can generate dominance breaking nogoods of varying strengths. Experimentation confirms runtime improvements of up to three orders of magnitude over manual methods.
【Keywords】: Constraints and SAT: Constraint Optimization; Constraints and SAT: Constraint Satisfaction; Constraints and SAT: Constraints: Modeling, Solvers, Applications;
【Paper Link】 【Pages】:1202-1208
【Authors】: Debarun Bhattacharjya ; Tian Gao ; Nicholas Mattei ; Dharmashankar Subramanian
【Abstract】: Causal discovery from observational data has been intensely studied across fields of study. In this paper, we consider datasets involving irregular occurrences of various types of events over the timeline. We propose a suite of scores and related algorithms for estimating the cause-effect association between pairs of events from such large event datasets. In particular, we introduce a general framework and the use of conditional intensity rates to characterize pairwise associations between events. Discovering such potential causal relationships is critical in several domains, including health, politics and financial analysis. We conduct an experimental investigation with synthetic data and two real-world event datasets, where we evaluate and compare our proposed scores using assessments from human raters as ground truth. For a political event dataset involving interaction between actors, we show how performance could be enhanced by enforcing additional knowledge pertaining to actor identities.
【Keywords】: Data Mining: Mining Spatial, Temporal Data; Machine Learning: Time-series;Data Streams; Knowledge Representation and Reasoning: Action, Change and Causality;
【Paper Link】 【Pages】:1209-1215
【Authors】: Yu Hao ; Xin Cao ; Yixiang Fang ; Xike Xie ; Sibo Wang
【Abstract】: Predicting the link between two nodes is a fundamental problem for graph data analytics. In attributed graphs, both the structure and attribute information can be utilized for link prediction. Most existing studies focus on transductive link prediction where both nodes are already in the graph. However, many real-world applications require inductive prediction for new nodes having only attribute information. It is more challenging since the new nodes do not have structure information and cannot be seen during the model training. To solve this problem, we propose a model called DEAL, which consists of three components: two node embedding encoders and one alignment mechanism. The two encoders aim to output the attribute-oriented node embedding and the structure-oriented node embedding, and the alignment mechanism aligns the two types of embeddings to build the connections between the attributes and links. Our model DEAL is versatile in the sense that it works for both inductive and transductive link prediction. Extensive experiments on several benchmark datasets show that our proposed model significantly outperforms existing inductive link prediction methods, and also outperforms the state-of-the-art methods on transductive link prediction.
【Keywords】: Data Mining: Mining Graphs, Semi Structured Data, Complex Data;
【Paper Link】 【Pages】:1216-1222
【Authors】: Xuefeng Chen ; Xin Cao ; Yifeng Zeng ; Yixiang Fang ; Bin Yao
【Abstract】: Region search is an important problem in location-based services due to its wide applications. In this paper, we study the problem of optimal region search with submodular maximization (ORS-SM). This problem considers a region as a connected subgraph. We compute an objective value over the locations in the region using a submodular function, and a budget value by summing up the costs of the edges in the region, and aim to find the region with the largest objective score under a budget constraint. ORS-SM supports many applications such as the most diversified region search. We prove that the problem is NP-hard and develop two approximation algorithms with guaranteed error bounds. We conduct experiments on two applications using three real-world datasets. The results demonstrate that our algorithms can achieve high-quality solutions and are faster than a state-of-the-art method by orders of magnitude.
【Keywords】: Data Mining: Mining Spatial, Temporal Data; Heuristic Search and Game Playing: Combinatorial Search and Optimisation;
【Paper Link】 【Pages】:1223-1229
【Authors】: Hong Yang ; Ling Chen ; Minglong Lei ; Lingfeng Niu ; Chuan Zhou ; Peng Zhang
【Abstract】: Discrete network embedding emerged recently as a new direction of network representation learning. Compared with traditional network embedding models, discrete network embedding aims to compress model size and accelerate model inference by learning a set of short binary codes for network vertices. However, existing discrete network embedding methods usually assume that the network structures (e.g., edge weights) are readily available. In real-world scenarios such as social networks, sometimes it is impossible to collect explicit network structure information, and it usually needs to be inferred from implicit data such as information cascades in the networks. To address this issue, we present an end-to-end discrete network embedding model for latent networks, DELN, that can learn binary representations from underlying information cascades. The essential idea is to infer a latent Weisfeiler-Lehman proximity matrix that captures node dependence based on information cascades and then to factorize the latent Weisfeiler-Lehman matrix under the binary node representation constraint. Since the learning problem is a mixed integer optimization problem, an efficient maximum likelihood estimation based cyclic coordinate descent (MLE-CCD) algorithm is used as the solution. Experiments on real-world datasets show that the proposed model outperforms the state-of-the-art network embedding methods.
【Keywords】: Data Mining: Mining Graphs, Semi Structured Data, Complex Data; Machine Learning Applications: Networks;
【Paper Link】 【Pages】:1230-1236
【Authors】: Yu Chen ; Lingfei Wu ; Mohammed J. Zaki
【Abstract】: Conversational machine comprehension (MC) has proven significantly more challenging than traditional MC since it requires better utilization of conversation history. However, most existing approaches do not effectively capture conversation history and thus have trouble handling questions involving coreference or ellipsis. Moreover, when reasoning over passage text, most of them simply treat it as a word sequence without exploring rich semantic relationships among words. In this paper, we first propose a simple yet effective graph structure learning technique to dynamically construct a question and conversation history aware context graph at each conversation turn. Then we propose a novel Recurrent Graph Neural Network, and based on that, we introduce a flow mechanism to model the temporal dependencies in a sequence of context graphs. The proposed GraphFlow model can effectively capture conversational flow in a dialog, and shows competitive performance compared to existing state-of-the-art methods on the CoQA, QuAC and DoQA benchmarks. In addition, visualization experiments show that our proposed model can offer good interpretability for the reasoning process.
【Keywords】: Data Mining: Mining Graphs, Semi Structured Data, Complex Data; Natural Language Processing: Dialogue; Natural Language Processing: Question Answering;
【Paper Link】 【Pages】:1237-1243
【Authors】: Hong Huang ; Zixuan Fang ; Xiao Wang ; Youshan Miao ; Hai Jin
【Abstract】: Network embedding, mapping nodes in a network to a low-dimensional space, achieves powerful performance. An increasing number of works focus on static network embedding; however, little attention has been paid to temporal network embedding, and in particular to the effect of mesoscopic dynamics as the network evolves. In light of this, we concentrate on a particular motif --- the triad --- and its temporal dynamics, to study temporal network embedding. Specifically, we propose MTNE, a novel embedding model for temporal networks. MTNE not only integrates the Hawkes process to simulate the triad evolution process, which preserves motif-aware high-order proximities, but also combines an attention mechanism to better distinguish the importance of different types of triads. Experiments on various real-world temporal networks demonstrate that, compared with several state-of-the-art methods, our model achieves the best performance in both static and dynamic tasks, including node classification, link prediction, and link recommendation.
【Keywords】: Data Mining: Mining Graphs, Semi Structured Data, Complex Data; Data Mining: Mining Spatial, Temporal Data;
【Paper Link】 【Pages】:1244-1250
【Authors】: Adam Goodge ; Bryan Hooi ; See-Kiong Ng ; Wee Siong Ng
【Abstract】: Detecting anomalies is an important task in a wide variety of applications and domains. Deep learning methods have achieved state-of-the-art performance in anomaly detection in recent years, with unsupervised methods being particularly popular. However, deep learning methods can be fragile to small perturbations in the input data. This can be exploited by an adversary to deliberately hinder model performance: an adversarial attack. This phenomenon has been widely studied in the context of supervised image classification since its discovery; however, such studies are sorely lacking in the anomaly detection setting. Moreover, the plethora of defense mechanisms that have been proposed are often not applicable to unsupervised anomaly detection models. In this work, we study the effect of adversarial attacks on the performance of anomaly-detecting autoencoders using real data from a cyber-physical system (CPS) testbed with intervals of controlled, physical attacks as anomalies. An adversary would attempt to disguise these points as normal through adversarial perturbations. To combat this, we propose the Approximate Projection Autoencoder (APAE), which incorporates two defenses against such attacks into a general autoencoder. One of these involves a novel technique to improve robustness under adversarial impact by optimising latent representations for better reconstruction outputs.
【Keywords】: Data Mining: Mining Spatial, Temporal Data; Machine Learning Applications: Applications of Unsupervised Learning;
【Paper Link】 【Pages】:1251-1257
【Authors】: Pinghua Xu ; Wenbin Hu ; Jia Wu ; Weiwei Liu
【Abstract】: Social media sites are now becoming very important platforms for product promotion and marketing campaigns. Therefore, there is broad interest in determining ways to guide a site to react more positively to a product with a limited budget. However, the practical significance of the existing studies on this subject is limited for two reasons. First, most studies have investigated the issue in oversimplified networks in which several important network characteristics are ignored. Second, the opinions of individuals are modeled as bipartite states (e.g., support or not) in numerous studies; however, this setting is too strict for many real scenarios. In this study, we focus on social trust networks (STNs), which have the significant characteristics ignored in previous studies. We generalize a well-known continuous-valued opinion dynamics model for STNs, which is more consistent with real scenarios. We subsequently formalize two novel problems for solving the issue in STNs. In addition, we develop two matrix-based methods for these two problems and conduct experiments on real-world datasets to demonstrate the practical utility of our methods.
【Keywords】: Data Mining: Mining Graphs, Semi Structured Data, Complex Data;
【Paper Link】 【Pages】:1258-1264
【Authors】: Zhichao Huang ; Xutao Li ; Yunming Ye ; Michael K. Ng
【Abstract】: Graph Convolutional Networks (GCNs) have been extensively studied in recent years. Most existing GCN approaches are designed for homogeneous graphs with a single type of relation. However, heterogeneous graphs with multiple types of relations are also ubiquitous, and there is a lack of methodologies to tackle such graphs. Some previous studies address the issue by performing conventional GCN on each single relation and then blending the results. However, as the convolutional kernels neglect the correlations across relations, this strategy is sub-optimal. In this paper, we propose the Multi-Relational Graph Convolutional Network (MR-GCN) framework by developing a novel convolution operator on multi-relational graphs. In particular, our multi-dimensional convolution operator extends graph spectral analysis to the eigen-decomposition of a Laplacian tensor. The eigen-decomposition is formulated with a generalized tensor product, which can correspond to any unitary transform rather than being limited to the Fourier transform. We conduct comprehensive experiments on four real-world multi-relational graphs to solve the semi-supervised node classification task, and the results show the superiority of MR-GCN against state-of-the-art competitors.
【Keywords】: Data Mining: Mining Graphs, Semi Structured Data, Complex Data; Machine Learning: Tensor and Matrix Methods; Machine Learning: Deep Learning: Convolutional networks; Data Mining: Classification, Semi-Supervised Learning;
【Paper Link】 【Pages】:1265-1271
【Authors】: Yacine Izza ; Saïd Jabbour ; Badran Raddaoui ; Abdelhamid Boudane
【Abstract】: While traditional data mining techniques have been used extensively for finding patterns in databases, they are not always suitable for incorporating user-specified constraints. To overcome this issue, CP and SAT based frameworks for modeling and solving pattern mining tasks have gained a considerable audience in recent years. However, a bottleneck for all these CP and SAT-based approaches is the encoding size, which makes these algorithms inefficient for large databases. This paper introduces a practical SAT-based approach to efficiently discover (minimal non-redundant) association rules. First, we present a decomposition-based paradigm that splits the original transaction database into smaller and independent subsets. Then, we show that, without producing overly large formulas, our decomposition method allows independent mining evaluation on a multi-core machine, improving performance. Finally, an experimental evaluation shows that our method is fast and scales well compared with the existing CP approach even in the sequential case, while significantly reducing the gap with the best state-of-the-art specialized algorithm.
【Keywords】: Data Mining: Frequent Pattern Mining; Constraints and SAT: Constraints and Data Mining; Constraints and Machine Learning; Constraints and SAT: Constraints: Modeling, Solvers, Applications;
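For context, the association rules this paper mines are defined by the classical support and confidence measures. The brute-force sketch below illustrates those definitions on an invented toy transaction database; it is not the paper's SAT-based or decomposition method, and the thresholds are arbitrary.

```python
# Classical support/confidence association-rule mining, brute force.
# Illustrates the definitions only -- NOT the paper's SAT-based approach.
from itertools import combinations

def support(db, itemset):
    """Fraction of transactions containing every item in `itemset`."""
    return sum(itemset <= t for t in db) / len(db)

def rules(db, min_sup=0.4, min_conf=0.7):
    """Enumerate rules antecedent -> consequent over frequent itemsets."""
    items = set().union(*db)
    out = []
    for k in range(2, len(items) + 1):
        for iset in map(frozenset, combinations(sorted(items), k)):
            if support(db, iset) < min_sup:
                continue                          # not frequent, skip
            for r in range(1, k):
                for ante in map(frozenset, combinations(sorted(iset), r)):
                    conf = support(db, iset) / support(db, ante)
                    if conf >= min_conf:
                        out.append((set(ante), set(iset - ante), conf))
    return out

# Toy transaction database (invented for illustration).
db = [{"bread", "milk"}, {"bread", "butter"},
      {"bread", "milk", "butter"}, {"milk"}, {"bread", "milk"}]
for ante, cons, conf in rules(db):
    print(ante, "->", cons, f"conf={conf:.2f}")
```

On this toy database, "butter -> bread" holds with confidence 1.0, since every transaction containing butter also contains bread.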
【Paper Link】 【Pages】:1272-1278
【Authors】: Jian Sun ; Hongyu Jia ; Bo Hu ; Xiao Huang ; Hao Zhang ; Hai Wan ; Xibin Zhao
【Abstract】: Very Fast Decision Tree (VFDT) is one of the most widely used online decision tree induction algorithms, and it provides high classification accuracy with theoretical guarantees. In VFDT, the split-attempt operation is essential for leaf splitting. It is computation-intensive since it computes the heuristic measure of all attributes of a leaf. To reduce split-attempts, VFDT tries to split at constant intervals (for example, every 200 examples). However, this mechanism introduces split-delay, since splits can only happen at fixed intervals, which slows down the growth of VFDT and ultimately lowers accuracy. To address this problem, we first devise an online incremental algorithm that computes the heuristic measure of an attribute at a much lower computational cost. Then a subset of attributes is carefully selected to find a potential split timing using this algorithm. A split-attempt is carried out once the timing is verified. Through this process, computational cost and split-delay are lowered significantly. Comprehensive experiments are conducted using multiple synthetic and real datasets. Compared with state-of-the-art algorithms, our method reduces split-attempts by about 5 to 10 times on average with much lower split-delay, which makes our algorithm both faster and more accurate.
【Keywords】: Data Mining: Mining Data Streams; Machine Learning: Online Learning; Machine Learning: Classification;
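For background, VFDT's split decisions rest on the Hoeffding bound. The sketch below shows only the classical split test, not the incremental heuristic-measure algorithm this paper contributes; the value range, confidence parameter, and gain values are illustrative assumptions.

```python
import math

# Classical Hoeffding-bound split test used by VFDT-style learners:
# split when the best attribute's heuristic gain exceeds the runner-up's
# by more than epsilon (or when epsilon falls below a tie threshold).
# This is the textbook test, NOT the paper's incremental algorithm.

def hoeffding_bound(value_range, delta, n):
    """epsilon such that the true mean lies within epsilon of the
    empirical mean with probability 1 - delta, after n examples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))

def should_split(best_gain, second_gain, value_range, delta, n,
                 tie_threshold=0.05):
    eps = hoeffding_bound(value_range, delta, n)
    return (best_gain - second_gain > eps) or (eps < tie_threshold)

# With 200 examples a gap of 0.12 is not yet statistically reliable,
# but after 2000 examples epsilon has shrunk enough to allow the split.
print(should_split(0.30, 0.18, 1.0, 1e-7, 200))
print(should_split(0.30, 0.18, 1.0, 1e-7, 2000))
```

This also shows why fixed split-attempt intervals cause split-delay: between checks, epsilon may already have dropped below the observed gap without the tree noticing.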
【Paper Link】 【Pages】:1279-1287
【Authors】: Boyang Li ; Yurong Cheng ; Ye Yuan ; Guoren Wang ; Lei Chen
【Abstract】: In recent years, 3D spatial crowdsourcing platforms such as InterestingSport and Nanguache have become popular; on these platforms, users and workers travel together to their assigned workplaces for services. A typical problem over 3D spatial crowdsourcing platforms is to match users with suitable workers and workplaces. Existing studies all ignore the requirement that the workers and users assigned to the same workplace should arrive at almost the same time, which is very practical in the real world. Thus, in this paper, we propose a new Simultaneous Arrival Matching (SAM) problem, which enables workers and users to arrive at their assigned workplace within a given tolerance time. We find that the newly considered arrival-time constraint breaks the monotonic additivity of the result set, which brings a large challenge in designing effective and efficient algorithms for SAM. We design a Sliding Window algorithm and a Threshold Scanning algorithm to solve SAM. We conduct experiments on real and synthetic datasets; the results show the effectiveness and efficiency of our algorithms.
【Keywords】: Data Mining: Mining Spatial, Temporal Data; Data Mining: Other;
【Paper Link】 【Pages】:1288-1294
【Authors】: Kaize Ding ; Jundong Li ; Nitin Agarwal ; Huan Liu
【Abstract】: Anomaly detection on attributed networks has attracted a surge of research attention due to its broad applications in various high-impact domains, such as security, finance, and healthcare. Nonetheless, most of the existing efforts do not naturally generalize to unseen nodes, leading to the fact that people have to retrain the detection model from scratch when dealing with newly observed data. In this study, we propose to tackle the problem of inductive anomaly detection on attributed networks with a novel unsupervised framework: Aegis (adversarial graph differentiation networks). Specifically, we design a new graph neural layer to learn anomaly-aware node representations and further employ generative adversarial learning to detect anomalies among new data. Extensive experiments on various attributed networks demonstrate the efficacy of the proposed approach.
【Keywords】: Data Mining: Mining Graphs, Semi Structured Data, Complex Data; Data Mining: Applications; Machine Learning Applications: Applications of Unsupervised Learning;
【Paper Link】 【Pages】:1295-1302
【Authors】: Fan Zhou ; Liang Li ; Ting Zhong ; Goce Trajcevski ; Kunpeng Zhang ; Jiahao Wang
【Abstract】: Flow super-resolution (FSR) enables inferring fine-grained urban flows from coarse-grained observations and plays an important role in traffic monitoring and prediction. Existing FSR solutions rely on deep CNN models (e.g., ResNet) for learning spatial correlation, incurring excessive memory cost and numerous parameter updates. We propose to tackle urban flow inference using a dynamical systems paradigm and present a new method, FODE -- FSR with Ordinary Differential Equations (ODEs). FODE extends neural ODEs by introducing an affine coupling layer to overcome the problem of numerically unstable gradient computation, which allows more accurate and efficient spatial correlation estimation without extra memory cost. In addition, FODE provides a flexible balance between flow inference accuracy and computational efficiency. A FODE-based augmented normalization mechanism is further introduced to constrain the flow distribution under the influence of external factors. Experimental evaluations on two real-world datasets demonstrate that FODE significantly outperforms several baseline approaches.
【Keywords】: Data Mining: Feature Extraction, Selection and Dimensionality Reduction; Data Mining: Mining Spatial, Temporal Data; Machine Learning Applications: Applications of Supervised Learning;
【Paper Link】 【Pages】:1303-1309
【Authors】: Yiqing Xie ; Sha Li ; Carl Yang ; Raymond Chi-Wing Wong ; Jiawei Han
【Abstract】: Graph Neural Networks (GNNs) have been shown to be powerful in a wide range of graph-related tasks. While various GNN models exist, a critical common ingredient is neighborhood aggregation, where the embedding of each node is updated by referring to the embedding of its neighbors. This paper aims to provide a better understanding of this mechanism by asking the following question: is neighborhood aggregation always necessary and beneficial? In short, the answer is no. We carve out two conditions under which neighborhood aggregation is not helpful: (1) when a node's neighbors are highly dissimilar, and (2) when a node's embedding is already similar to that of its neighbors. We propose novel metrics that quantitatively measure these two circumstances and integrate them into an Adaptive-layer module. Our experiments show that allowing for node-specific aggregation degrees has a significant advantage over current GNNs.
【Keywords】: Data Mining: Mining Graphs, Semi Structured Data, Complex Data; Machine Learning: Semi-Supervised Learning;
【Paper Link】 【Pages】:1310-1316
【Authors】: Yongshun Gong ; Zhibin Li ; Jian Zhang ; Wei Liu ; Bei Chen ; Xiangjun Dong
【Abstract】: Large volumes of urban statistical data with multiple views imply rich knowledge about the development degree of cities. These data present crucial statistics which play an irreplaceable role in regional analysis and urban computing. In reality, however, statistical data divided into fine-grained regions usually suffer from missing data problems. Those missing values hide useful information, which may result in a distorted data analysis. Thus, in this paper, we propose a spatial missing data imputation method for multi-view urban statistical data. To address this problem, we exploit an improved spatial multi-kernel clustering method to guide the imputation process, cooperating with an adaptive-weight non-negative matrix factorization strategy. Intensive experiments are conducted against other state-of-the-art approaches on six real-world urban statistical datasets. The results not only show the superiority of our method over comparative methods on different datasets, but also demonstrate the strong generalizability of our model.
【Keywords】: Data Mining: Applications; Data Mining: Big Data, Large-Scale Systems; Data Mining: Mining Spatial, Temporal Data;
【Paper Link】 【Pages】:1317-1323
【Authors】: Hanchen Wang ; Defu Lian ; Ying Zhang ; Lu Qin ; Xuemin Lin
【Abstract】: Entity interaction prediction is essential in many important applications such as chemistry, biology, material science, and medical science. The problem becomes quite challenging when each entity is represented by a complex structure, namely structured entity, because two types of graphs are involved: local graphs for structured entities and a global graph to capture the interactions between structured entities. We observe that existing works on structured entity interaction prediction cannot properly exploit the unique graph of graphs model. In this paper, we propose a Graph of Graphs Neural Network, namely GoGNN, which extracts the features in both structured entity graphs and the entity interaction graph in a hierarchical way. We also propose the dual-attention mechanism that enables the model to preserve the neighbor importance in both levels of graphs. Extensive experiments on real-world datasets show that GoGNN outperforms the state-of-the-art methods on two representative structured entity interaction prediction tasks: chemical-chemical interaction prediction and drug-drug interaction prediction. Our code is available at Github.
【Keywords】: Data Mining: Mining Graphs, Semi Structured Data, Complex Data; Multidisciplinary Topics and Applications: Biology and Medicine;
【Paper Link】 【Pages】:1324-1330
【Authors】: Ziyu Jia ; Youfang Lin ; Jing Wang ; Ronghao Zhou ; Xiaojun Ning ; Yuanlai He ; Yaoshuai Zhao
【Abstract】: Sleep stage classification is essential for sleep assessment and disease diagnosis. However, how to effectively utilize brain spatial features and transition information among sleep stages continues to be challenging. In particular, owing to the limited knowledge of the human brain, predefining a suitable spatial brain connection structure for sleep stage classification remains an open question. In this paper, we propose a novel deep graph neural network, named GraphSleepNet, for automatic sleep stage classification. The main advantage of the GraphSleepNet is to adaptively learn the intrinsic connection among different electroencephalogram (EEG) channels, represented by an adjacency matrix, thereby best serving the spatial-temporal graph convolution network (ST-GCN) for sleep stage classification. Meanwhile, the ST-GCN consists of graph convolutions for extracting spatial features and temporal convolutions for capturing the transition rules among sleep stages. Experiments on the Montreal Archive of Sleep Studies (MASS) dataset demonstrate that the GraphSleepNet outperforms the state-of-the-art baselines.
【Keywords】: Data Mining: Applications; Machine Learning: Time-series;Data Streams; Machine Learning Applications: Bio/Medicine; Data Mining: Mining Spatial, Temporal Data;
【Paper Link】 【Pages】:1331-1337
【Authors】: Jie Feng ; Ziqian Lin ; Tong Xia ; Funing Sun ; Diansheng Guo ; Yong Li
【Abstract】: Population flow prediction is one of the most fundamental components in many applications, from urban management to transportation scheduling. It is challenging due to the complicated spatial-temporal correlation. While many studies have been done in recent years, they fail to simultaneously and effectively model the spatial correlation and temporal variations among population flows. In this paper, we propose the Convolution based Sequential and Cross Network (CSCNet) to solve both. On the one hand, we design a CNN-based sequential structure that progressively merges flow features from different times in different CNN layers to model spatial-temporal information simultaneously. On the other hand, we make use of the transition flow as a proxy to efficiently and explicitly capture the dynamic correlation between different types of population flows. Extensive experiments on 4 datasets demonstrate that CSCNet outperforms the state-of-the-art baselines, reducing the prediction error by around 7.7%∼10.4%.
【Keywords】: Data Mining: Mining Spatial, Temporal Data; Data Mining: Applications; Machine Learning Applications: Environmental; Multidisciplinary Topics and Applications: Transportation;
【Paper Link】 【Pages】:1338-1344
【Authors】: Man Luo ; Wenzhe Zhang ; Tianyou Song ; Kun Li ; Hongming Zhu ; Bowen Du ; Hongkai Wen
【Abstract】: Electric Vehicle (EV) sharing systems have recently experienced unprecedented growth across the world. One of the key challenges in their operation is vehicle rebalancing, i.e., repositioning the EVs across stations to better satisfy future user demand. This is particularly challenging in the shared EV context, because i) the range of EVs is limited while charging time is substantial, which constrains the rebalancing options; and ii) as a new mobility trend, most of the current EV sharing systems are still continuously expanding their station networks, i.e., the targets for rebalancing can change over time. To tackle these challenges, in this paper we model the rebalancing task as a Multi-Agent Reinforcement Learning (MARL) problem, which directly takes the range and charging properties of the EVs into account. We propose a novel approach of policy optimization with action cascading, which isolates the non-stationarity locally, and use two connected networks to solve the formulated MARL. We evaluate the proposed approach using a simulator calibrated with 1-year operation data from a real EV sharing system. Results show that our approach significantly outperforms the state-of-the-art, offering up to 14% gain in order satisfied rate and 12% increase in net revenue.
【Keywords】: Data Mining: Mining Spatial, Temporal Data; Machine Learning Applications: Applications of Reinforcement Learning;
【Paper Link】 【Pages】:1345-1351
【Authors】: Avirup Saha ; Shreyas Sheshadri ; Samik Datta ; Niloy Ganguly ; Disha Makhija ; Priyank Patel
【Abstract】: With the proliferation of learning scenarios with an abundance of instances, but limited amount of high-quality labels, semi-supervised learning algorithms came to prominence. Graph-based semi-supervised learning (G-SSL) algorithms, of which Label Propagation (LP) is a prominent example, are particularly well-suited for these problems. The premise of LP is the existence of homophily in the graph, but beyond that nothing is known about the efficacy of LP. In particular, there is no characterisation that connects the structural constraints, volume and quality of the labels to the accuracy of LP. In this work, we draw upon the notion of recovery from the literature on community detection, and provide guarantees on accuracy for partially-labelled graphs generated from the Partially-Labelled Stochastic Block Model (PLSBM). Extensive experiments performed on synthetic data verify the theoretical findings.
【Keywords】: Data Mining: Classification, Semi-Supervised Learning; Data Mining: Mining Graphs, Semi Structured Data, Complex Data;
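The Label Propagation algorithm this paper analyzes can be sketched in a few lines: labels diffuse from labeled seeds to their neighbors under the homophily premise. The code below is the generic iterative LP baseline on an invented toy graph, not the paper's PLSBM recovery analysis.

```python
# Generic iterative Label Propagation (G-SSL baseline).
# The two-cluster toy graph is an illustrative assumption; this sketch
# does not reproduce the paper's PLSBM accuracy guarantees.

def label_propagation(adj, seeds, n_classes, iters=50):
    """adj: list of neighbor lists; seeds: {node: class}.
    Each round, every node takes the average of its neighbors'
    label distributions; seed nodes stay clamped to their labels."""
    n = len(adj)
    F = [[0.0] * n_classes for _ in range(n)]
    for node, c in seeds.items():
        F[node][c] = 1.0
    for _ in range(iters):
        new = [[sum(F[nb][c] for nb in adj[i]) / len(adj[i])
                for c in range(n_classes)] for i in range(n)]
        for node, c in seeds.items():          # clamp known labels
            new[node] = [0.0] * n_classes
            new[node][c] = 1.0
        F = new
    return [max(range(n_classes), key=lambda c: F[i][c]) for i in range(n)]

# Two homophilous clusters {0,1,2} and {3,4,5}, one seed label in each.
adj = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]
print(label_propagation(adj, {0: 0, 5: 1}, n_classes=2))
```

Because the single inter-cluster edge 2-3 is outweighed by the within-cluster edges, the diffusion assigns each unlabeled node its cluster's seed label.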
【Paper Link】 【Pages】:1352-1358
【Authors】: Kaixiong Zhou ; Qingquan Song ; Xiao Huang ; Daochen Zha ; Na Zou ; Xia Hu
【Abstract】: The classification of graph-structured data has become increasingly crucial in many disciplines. It has been observed that the implicit or explicit hierarchical community structures preserved in real-world graphs could be useful for downstream classification applications. A straightforward way to leverage the hierarchical structure is to use pooling algorithms to cluster nodes into fixed groups, and to shrink the input graph layer by layer to learn the pooled graphs. However, this shrinking discards graph details, making it hard to distinguish two non-isomorphic graphs, and the fixed clustering ignores the inherent multiple characteristics of nodes. To compensate for the shrinking loss and learn the various characteristics of nodes, we propose multi-channel graph neural networks (MuchGNN). Motivated by the underlying mechanisms developed in convolutional neural networks, we define tailored graph convolutions to learn a series of graph channels at each layer, and shrink the graphs hierarchically to encode the pooled structures. Experimental results on real-world datasets demonstrate the superiority of MuchGNN over the state-of-the-art methods.
【Keywords】: Data Mining: Mining Graphs, Semi Structured Data, Complex Data; Data Mining: Classification, Semi-Supervised Learning; Data Mining: Applications;
【Paper Link】 【Pages】:1359-1365
【Authors】: Peiyan Li ; Honglian Wang ; Christian Böhm ; Junming Shao
【Abstract】: Online semi-supervised multi-label classification is a practical yet challenging task since only a small number of labeled instances are available in real streaming environments. However, the mainstream of existing online classification techniques is focused on the single-label case; only a few multi-label stream classification algorithms exist, and they are mainly trained on labeled instances. In this paper, we present a novel Online Semi-supervised Multi-Label learning algorithm (OnSeML) based on label compression and local smooth regression, which allows real-time multi-label predictions in a semi-supervised setting and is robust to evolving label distributions. Specifically, to capture the high-order label relationship and to build a compact target space for regression, OnSeML compresses the label set into a low-dimensional space by a fixed orthogonal label encoder. Then a locally defined regression function for each incoming instance is obtained with a closed-form solution. Targeting the evolving label distribution problem, we propose an adaptive decoding scheme to adequately integrate newly arriving labeled data. Extensive experiments provide empirical evidence for the effectiveness of our approach.
【Keywords】: Data Mining: Classification, Semi-Supervised Learning; Data Mining: Mining Data Streams;
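The label-compression step described in the OnSeML abstract above can be illustrated with a small sketch. The dimensions and the QR-based construction of the fixed orthogonal encoder are illustrative assumptions, not details taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 100 labels compressed to a 10-d regression target space.
n_labels, k = 100, 10

# Fixed orthogonal label encoder P (k x n_labels) with P @ P.T = I_k,
# obtained here from the QR factorization of a random Gaussian matrix.
Q, _ = np.linalg.qr(rng.standard_normal((n_labels, k)))
P = Q.T  # rows are orthonormal

y = rng.integers(0, 2, size=n_labels).astype(float)  # one multi-label vector
z = P @ y       # compressed target: regression is done in this 10-d space
y_hat = P.T @ z  # decode back to label space; scores would then be thresholded

# P.T @ P is only a projection, so decoding recovers y approximately;
# the compression is what makes the closed-form local regression compact.
```

The orthogonality of `P` is what keeps distances in the compressed space comparable to distances in the original label space.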
【Paper Link】 【Pages】:1366-1372
【Authors】: Jianan Zhao ; Xiao Wang ; Chuan Shi ; Zekuan Liu ; Yanfang Ye
【Abstract】: As heterogeneous networks have become increasingly ubiquitous, Heterogeneous Information Network (HIN) embedding, aiming to project nodes into a low-dimensional space while preserving the heterogeneous structure, has drawn increasing attention in recent years. Many of the existing HIN embedding methods adopt meta-path guided random walk to retain both the semantics and structural correlations between different types of nodes. However, the selection of meta-paths is still an open problem, which either depends on domain knowledge or is learned from label information. As a uniform blueprint of HIN, the network schema comprehensively embraces the high-order structure and contains rich semantics. In this paper, we make the first attempt to study network schema preserving HIN embedding, and propose a novel model named NSHE. In NSHE, a network schema sampling method is first proposed to generate sub-graphs (i.e., schema instances), and then a multi-task learning framework is built to preserve the heterogeneous structure of each schema instance. Besides preserving pairwise structure information, NSHE is able to retain high-order structure (i.e., network schema). Extensive experiments on three real-world datasets demonstrate that our proposed model NSHE significantly outperforms the state-of-the-art methods.
【Keywords】: Data Mining: Mining Graphs, Semi Structured Data, Complex Data; Data Mining: Feature Extraction, Selection and Dimensionality Reduction; Data Mining: Clustering, Unsupervised Learning; Machine Learning: Deep Learning: Convolutional networks;
【Paper Link】 【Pages】:1373-1380
【Authors】: Ximing Li ; Yang Wang
【Abstract】: Partial Multi-label Learning (PML) aims to induce the multi-label predictor from datasets with noisy supervision, where each training instance is associated with several candidate labels but only some of them are valid. To address the label noise, existing PML methods basically recover the ground-truth labels by leveraging the ground-truth confidence of the candidate label, i.e., the likelihood of a candidate label being a ground-truth one. However, they neglect the information from non-candidate labels, which potentially contributes to the ground-truth label recovery. In this paper, we propose to recover the ground-truth labels, i.e., estimating the ground-truth confidences, from the label enrichment, composed of the relevance degrees of candidate labels and irrelevance degrees of non-candidate labels. Upon this observation, we further develop a novel two-stage PML method, namely Partial Multi-Label Learning with Label Enrichment-Recovery (PML3ER), where in the first stage, it estimates the label enrichment with unconstrained label propagation, then jointly learns the ground-truth confidence and multi-label predictor given the label enrichment. Experimental results validate that PML3ER outperforms the state-of-the-art PML methods.
【Keywords】: Data Mining: Classification, Semi-Supervised Learning;
【Paper Link】 【Pages】:1381-1387
【Authors】: Liang Yang ; Yuexue Wang ; Junhua Gu ; Chuan Wang ; Xiaochun Cao ; Yuanfang Guo
【Abstract】: Motivated by the capability of Generative Adversarial Network on exploring the latent semantic space and capturing semantic variations in the data distribution, adversarial learning has been adopted in network embedding to improve the robustness. However, this important ability is lost in existing adversarially regularized network embedding methods, because their embedding results are directly compared to the samples drawn from perturbation (Gaussian) distribution without any rectification from real data. To overcome this vital issue, a novel Joint Adversarial Network Embedding (JANE) framework is proposed to jointly distinguish the real and fake combinations of the embeddings, topology information and node features. JANE contains three pluggable components, Embedding module, Generator module and Discriminator module. The overall objective function of JANE is defined in a min-max form, which can be optimized via alternating stochastic gradient. Extensive experiments demonstrate the remarkable superiority of the proposed JANE on link prediction (3% gains in both AUC and AP) and node clustering (5% gain in F1 score).
【Keywords】: Data Mining: Mining Graphs, Semi Structured Data, Complex Data;
【Paper Link】 【Pages】:1388-1394
【Authors】: Lianwei Wu ; Yuan Rao ; Xiong Yang ; Wanzhen Wang ; Ambreen Nazir
【Abstract】: Exploring evidence from relevant articles to confirm the veracity of claims is a trend towards explainable claim verification. However, most strategies capture the top-k check-worthy articles or salient words as evidence, and such evidence rarely focuses on the questionable parts of unverified claims. Besides, they utilize relevant articles indiscriminately, ignoring the source credibility of these articles, which may allow quite a few unreliable articles to interfere with the assessment results. In this paper, we propose Evidence-aware Hierarchical Interactive Attention Networks (EHIAN), which consider both the capture of evidence fragments and the fusion of source credibility to explore more credible evidence semantics discussing the questionable parts of claims for explainable claim verification. EHIAN first designs an internal interaction layer (IIL) to strengthen deep interaction and matching between claims and relevant articles for obtaining key evidence fragments, and then proposes a global inference layer (GIL) that fuses source features of articles, interacts globally with the average semantics of all articles, and finally yields more credible evidence semantics discussing the questionable parts of claims. Experiments on two datasets demonstrate that EHIAN not only achieves state-of-the-art performance but also secures effective evidence to explain the results.
【Keywords】: Data Mining: Mining Text, Web, Social Media; Natural Language Processing: Sentiment Analysis and Text Mining; Natural Language Processing: Text Classification;
【Paper Link】 【Pages】:1395-1402
【Authors】: Shuo Zhang ; Lei Xie
【Abstract】: Graph Neural Networks (GNNs) are powerful for the representation learning of graph-structured data. Most of the GNNs use a message-passing scheme, where the embedding of a node is iteratively updated by aggregating the information from its neighbors. To achieve a better expressive capability of node influences, attention mechanism has grown to be popular to assign trainable weights to the nodes in aggregation. Though the attention-based GNNs have achieved remarkable results in various tasks, a clear understanding of their discriminative capacities is missing. In this work, we present a theoretical analysis of the representational properties of the GNN that adopts the attention mechanism as an aggregator. Our analysis determines all cases when those attention-based GNNs can always fail to distinguish certain distinct structures. Those cases appear due to the ignorance of cardinality information in attention-based aggregation. To improve the performance of attention-based GNNs, we propose cardinality preserved attention (CPA) models that can be applied to any kind of attention mechanisms. Our experiments on node and graph classification confirm our theoretical analysis and show the competitive performance of our CPA models. The code is available online: https://github.com/zetayue/CPA.
【Keywords】: Data Mining: Mining Graphs, Semi Structured Data, Complex Data; Machine Learning Applications: Networks; Machine Learning: Deep Learning; Data Mining: Classification, Semi-Supervised Learning;
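The failure mode identified in the CPA abstract above can be demonstrated in a few lines: softmax attention weights always sum to 1, so the aggregate is blind to how many neighbors contributed. Scaling the aggregate by the neighborhood size is one simple cardinality-preserving variant; the paper's CPA models may differ in detail:

```python
import numpy as np

def softmax(s):
    e = np.exp(s - s.max())
    return e / e.sum()

def attn_aggregate(h_neighbors, scores, preserve_cardinality=False):
    """Aggregate neighbor features with attention weights.

    Plain softmax attention produces weights summing to 1, so two
    neighborhoods repeating the same features different numbers of
    times aggregate identically. Multiplying by the neighborhood
    size restores the lost cardinality information."""
    w = softmax(scores)
    out = w @ h_neighbors
    if preserve_cardinality:
        out = out * len(scores)
    return out

# Two distinct multisets of neighbor features:
A = np.array([[1.0, 0.0]])                # one neighbor
B = np.array([[1.0, 0.0], [1.0, 0.0]])    # the same feature, twice

plain_A = attn_aggregate(A, np.zeros(1))
plain_B = attn_aggregate(B, np.zeros(2))
# plain attention cannot tell them apart: both equal [1., 0.]
```

With `preserve_cardinality=True`, `A` aggregates to `[1., 0.]` and `B` to `[2., 0.]`, so the two neighborhoods become distinguishable.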
【Paper Link】 【Pages】:1403-1409
【Authors】: Yang Gao ; Hong Yang ; Peng Zhang ; Chuan Zhou ; Yue Hu
【Abstract】: Graph neural networks (GNNs) emerged recently as a powerful tool for analyzing non-Euclidean data such as social network data. Despite their success, the design of graph neural networks requires heavy manual work and domain knowledge. In this paper, we present a graph neural architecture search method (GraphNAS) that enables automatic design of the best graph neural architecture based on reinforcement learning. Specifically, GraphNAS uses a recurrent network to generate variable-length strings that describe the architectures of graph neural networks, and trains the recurrent network with policy gradient to maximize the expected accuracy of the generated architectures on a validation data set. Furthermore, to improve the search efficiency of GraphNAS on big networks, GraphNAS restricts the search space from an entire architecture space to a sequential concatenation of the best search results built on each single architecture layer. Experiments on real-world datasets demonstrate that GraphNAS can design a novel network architecture that rivals the best human-invented architecture in terms of validation set accuracy. Moreover, in a transfer learning task we observe that graph neural architectures designed by GraphNAS, when transferred to new datasets, still gain improvement in terms of prediction accuracy.
【Keywords】: Data Mining: Mining Graphs, Semi Structured Data, Complex Data; Machine Learning Applications: Networks; Machine Learning Applications: Applications of Reinforcement Learning;
【Paper Link】 【Pages】:1410-1416
【Authors】: Shuwen Yang ; Guojie Song ; Yilun Jin ; Lun Du
【Abstract】: Heterogeneous Information Networks (HINs) are ubiquitous structures that can depict complex relational data. Due to their complexity, it is hard to obtain sufficient labeled data on HINs, which hampers classification on HINs. While domain adaptation (DA) techniques have been widely utilized for images and texts, the heterogeneity and complex semantics pose specific challenges for domain adaptive classification on HINs. On one hand, HINs involve multiple levels of semantics, making it demanding to perform domain alignment among them. On the other hand, the trade-off between domain similarity and distinguishability must be chosen carefully, since domain-invariant features have been shown to be homogeneous and uninformative for classification. In this paper, we propose Multi-space Domain Adaptive Classification (MuSDAC) to handle the problem of DA on HINs. Specifically, we utilize multi-channel shared-weight GCNs, projecting nodes in HINs to multiple spaces where pairwise alignment is carried out. In addition, we propose a heuristic sampling algorithm that efficiently chooses the combination of channels featuring distinguishability, and a moving-averaged weighted voting scheme to fuse the selected channels, minimizing both transfer and classification loss. Extensive experiments on pairwise datasets demonstrate not only our model's performance on domain adaptive classification on HINs but also the contributions of its individual components.
【Keywords】: Data Mining: Classification, Semi-Supervised Learning; Data Mining: Mining Graphs, Semi Structured Data, Complex Data; Machine Learning Applications: Networks;
【Paper Link】 【Pages】:1417-1423
【Authors】: Xiaoyu Yang ; Yuefei Lyu ; Tian Tian ; Yifei Liu ; Yudong Liu ; Xi Zhang
【Abstract】: The wide spread of rumors on social media has caused tremendous effects in both the online and offline world. In addition to text information, recent detection methods began to exploit the graph structure in the propagation network. However, without a rigorous design, rumors may evade such graph models using various camouflage strategies by perturbing the structured data. Our focus in this work is to develop a robust graph-based detector to identify rumors on social media from an adversarial perspective. We first build a heterogeneous information network to model the rich information among users, posts, and user comments for detection. We then propose a graph adversarial learning framework, where the attacker tries to dynamically add intentional perturbations on the graph structure to fool the detector, while the detector would learn more distinctive structure features to resist such perturbations. In this way, our model would be enhanced in both robustness and generalization. Experiments on real-world datasets demonstrate that our model achieves better results than the state-of-the-art methods.
【Keywords】: Data Mining: Applications; Data Mining: Classification, Semi-Supervised Learning; Data Mining: Mining Graphs, Semi Structured Data, Complex Data;
【Paper Link】 【Pages】:1424-1430
【Authors】: Yuqiao Yang ; Xiaoqiang Lin ; Geng Lin ; Zengfeng Huang ; Changjian Jiang ; Zhongyu Wei
【Abstract】: In this paper, we explore to learn representations of legislation and legislator for the prediction of roll call results. The most popular approach for this topic is named the ideal point model that relies on historical voting information for representation learning of legislators. It largely ignores the context information of the legislative data. We, therefore, propose to incorporate context information to learn dense representations for both legislators and legislation. For legislators, we incorporate relations among them via graph convolutional neural networks (GCN) for their representation learning. For legislation, we utilize its narrative description via recurrent neural networks (RNN) for representation learning. In order to align two kinds of representations in the same vector space, we introduce a triplet loss for the joint training. Experimental results on a self-constructed dataset show the effectiveness of our model for roll call results prediction compared to some state-of-the-art baselines.
【Keywords】: Data Mining: Mining Text, Web, Social Media; Natural Language Processing: NLP Applications and Tools;
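The triplet loss used above to align legislation and legislator embeddings in one vector space has a standard hinge form, sketched here. The Euclidean distance and the margin value are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: pull the anchor (e.g. a bill embedding
    from the RNN) toward a matching legislator embedding (from the GCN)
    and push it away from a non-matching one by at least `margin`."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)
```

The loss is zero once the negative is at least `margin` farther from the anchor than the positive, which is what places the two kinds of representations in a shared, comparable space.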
【Paper Link】 【Pages】:1431-1437
【Authors】: Chenwei Zhang ; Yaliang Li ; Nan Du ; Wei Fan ; Philip S. Yu
【Abstract】: Being able to automatically discover synonymous entities in an open-world setting benefits various tasks such as entity disambiguation or knowledge graph canonicalization. Existing works either only utilize entity features, or rely on structured annotations from a single piece of context where the entity is mentioned. To leverage diverse contexts where entities are mentioned, in this paper, we generalize the distributional hypothesis to a multi-context setting and propose a synonym discovery framework that detects entity synonyms from free-text corpora with considerations on effectiveness and robustness. As one of the key components in synonym discovery, we introduce a neural network model SynonymNet to determine whether or not two given entities are synonyms of each other. Instead of using entity features, SynonymNet makes use of multiple pieces of contexts in which the entity is mentioned, and compares the context-level similarity via a bilateral matching schema. Experimental results demonstrate that the proposed model is able to detect synonym sets that are not observed during training on both generic and domain-specific datasets: Wiki+Freebase, PubMed+UMLS, and MedBook+MKG, with up to 4.16% improvement in terms of Area Under the Curve and 3.19% in terms of Mean Average Precision compared to the best baseline method.
【Keywords】: Data Mining: Mining Text, Web, Social Media; Natural Language Processing: Named Entities; Natural Language Processing: Natural Language Processing; Natural Language Processing: NLP Applications and Tools;
【Paper Link】 【Pages】:1438-1444
【Authors】: Fuxin Ren ; Zhongbao Zhang ; Jiawei Zhang ; Sen Su ; Li Sun ; Guozhen Zhu ; Congying Guo
【Abstract】: Recently, aligning users among different social networks has received significant attention. However, most of the existing studies do not consider users’ behavior information during the aligning procedure and thus still suffer from poor learning performance. In fact, we observe that social network alignment and behavior analysis can benefit from each other. Motivated by such an observation, we propose to jointly study the social network alignment problem and the user behavior analysis problem. We design a novel end-to-end framework named BANANA. In this framework, to leverage behavior analysis for social network alignment at the distribution level, we design an earth mover’s distance based alignment model to fuse users’ behavior information for more comprehensive user representations. To further leverage social network alignment for behavior analysis, in turn, we design a temporal graph neural network model to fuse behavior information in different social networks based on the alignment result. The two models can work together in an end-to-end manner. Through extensive experiments on real-world datasets, we demonstrate that our proposed approach outperforms the state-of-the-art methods in the social network alignment task and the user behavior analysis task, respectively.
【Keywords】: Data Mining: Applications; Data Mining: Mining Text, Web, Social Media; Humans and AI: Personalization and User Modeling;
【Paper Link】 【Pages】:1445-1451
【Authors】: Yi Zhou ; Zhe Wang ; Kaiyi Ji ; Yingbin Liang ; Vahid Tarokh
【Abstract】: Various types of parameter restart schemes have been proposed for proximal gradient algorithm with momentum to facilitate their convergence in convex optimization. However, under parameter restart, the convergence of proximal gradient algorithm with momentum remains obscure in nonconvex optimization. In this paper, we propose a novel proximal gradient algorithm with momentum and parameter restart for solving nonconvex and nonsmooth problems. Our algorithm is designed to 1) allow for adopting flexible parameter restart schemes that cover many existing ones; 2) have a global sub-linear convergence rate in nonconvex and nonsmooth optimization; and 3) have guaranteed convergence to a critical point and have various types of asymptotic convergence rates depending on the parameterization of local geometry in nonconvex and nonsmooth optimization. Numerical experiments demonstrate the convergence and effectiveness of our proposed algorithm.
【Keywords】: Data Mining: Theoretical Foundations; Constraints and SAT: Constraint Optimization;
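A minimal instance of the algorithm family studied above, proximal gradient with Nesterov momentum plus a function-value restart heuristic, applied to a lasso problem. This is a generic FISTA-with-restart sketch under assumed step-size and restart choices, not the paper's specific scheme:

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of t*||.||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def fista_restart(A, b, lam, step, iters=500):
    """Proximal gradient with momentum and a function-value restart,
    on the lasso problem  min_x 0.5*||Ax - b||^2 + lam*||x||_1."""
    n = A.shape[1]
    x, y, t = np.zeros(n), np.zeros(n), 1.0
    f = lambda v: 0.5 * np.sum((A @ v - b) ** 2) + lam * np.sum(np.abs(v))
    f_prev = f(x)
    for _ in range(iters):
        grad = A.T @ (A @ y - b)                      # smooth-part gradient
        x_new = soft_threshold(y - step * grad, step * lam)  # prox step
        t_new = 0.5 * (1 + np.sqrt(1 + 4 * t * t))    # momentum schedule
        y = x_new + (t - 1) / t_new * (x_new - x)     # extrapolation
        if f(x_new) > f_prev:                         # restart heuristic:
            t_new, y = 1.0, x_new                     # reset momentum
        f_prev, x, t = f(x_new), x_new, t_new
    return x
```

On a well-conditioned problem the restart rarely triggers; on ill-conditioned nonconvex objectives it is exactly this kind of reset whose convergence behavior the paper analyzes.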
【Paper Link】 【Pages】:1452-1458
【Authors】: Hongmin Zhu ; Fuli Feng ; Xiangnan He ; Xiang Wang ; Yan Li ; Kai Zheng ; Yongdong Zhang
【Abstract】: Graph Neural Network (GNN) is a powerful model to learn representations and make predictions on graph data. Existing efforts on GNN have largely defined the graph convolution as a weighted sum of the features of the connected nodes to form the representation of the target node. Nevertheless, the operation of weighted sum assumes the neighbor nodes are independent of each other, and ignores the possible interactions between them. When such interactions exist, such as when the co-occurrence of two neighbor nodes is a strong signal of the target node's characteristics, existing GNN models may fail to capture the signal. In this work, we argue the importance of modeling the interactions between neighbor nodes in GNN. We propose a new graph convolution operator, which augments the weighted sum with pairwise interactions of the representations of neighbor nodes. We term this framework Bilinear Graph Neural Network (BGNN), which improves GNN representation ability with bilinear interactions between neighbor nodes. In particular, we specify two BGNN models named BGCN and BGAT, based on the well-known GCN and GAT, respectively. Empirical results on three public benchmarks of semi-supervised node classification verify the effectiveness of BGNN --- BGCN (BGAT) outperforms GCN (GAT) by 1.6% (1.5%) in classification accuracy. Codes are available at: https://github.com/zhuhm1996/bgnn.
【Keywords】: Data Mining: Classification, Semi-Supervised Learning; Data Mining: Mining Graphs, Semi Structured Data, Complex Data;
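The core idea of the bilinear aggregator above, augmenting the weighted sum with pairwise interactions of neighbor representations, can be sketched as follows. Using an unweighted sum and elementwise products is a simplifying assumption; see the paper and its linked code for the exact BGNN layer:

```python
import numpy as np

def bilinear_aggregate(h_neighbors):
    """Sum aggregation augmented with pairwise (elementwise-product)
    interactions between neighbor representations.

    Uses the identity  sum_{i<j} x_i * x_j = 0.5 * ((sum x)^2 - sum x^2)
    per feature dimension, so the cost stays linear in the number of
    neighbors despite summing over all neighbor pairs."""
    s = h_neighbors.sum(axis=0)            # linear (weighted-sum) part
    sq = (h_neighbors ** 2).sum(axis=0)
    pairwise = 0.5 * (s * s - sq)          # interaction part, all pairs
    return s + pairwise
```

For two neighbors with features `[1, 2]` and `[3, 4]`, the interaction part is `[1*3, 2*4] = [3, 8]`, which a plain weighted sum cannot express.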
【Paper Link】 【Pages】:1460-1466
【Authors】: Curtis Bright ; Kevin K. H. Cheung ; Brett Stevens ; Ilias S. Kotsireas ; Vijay Ganesh
【Abstract】: In the 1970s and 1980s, searches performed by L. Carter, C. Lam, L. Thiel, and S. Swiercz showed that projective planes of order ten with weight 16 codewords do not exist. These searches required highly specialized and optimized computer programs and required about 2,000 hours of computing time on mainframe and supermini computers. In 2010, these searches were verified by D. Roy using an optimized C program and 16,000 hours on a cluster of desktop machines. We performed a verification of these searches by reducing the problem to the Boolean satisfiability problem (SAT). Our verification uses the cube-and-conquer SAT solving paradigm, symmetry breaking techniques using the computer algebra system Maple, and a result of Carter that there are ten nonisomorphic cases to check. Our searches completed in about 30 hours on a desktop machine and produced nonexistence proofs of about 1 terabyte in the DRAT (deletion resolution asymmetric tautology) format.
【Keywords】: Heuristic Search and Game Playing: Combinatorial Search and Optimisation; Constraints and SAT: SAT: : Solvers and Applications; Constraints and SAT: SAT: Algorithms and Techniques; Constraints and SAT: Constraint Satisfaction;
【Paper Link】 【Pages】:1467-1473
【Authors】: Shaowei Cai ; Wenying Hou ; Yiyuan Wang ; Chuan Luo ; Qingwei Lin
【Abstract】: Minimum dominating set (MinDS) is a canonical NP-hard combinatorial optimization problem with applications. For large and hard instances one must resort to heuristic approaches to obtain good solutions within reasonable time. This paper develops an efficient local search algorithm for MinDS, which has two main ideas. The first one is a novel local search framework, while the second is a construction procedure with inference rules. Our algorithm named FastDS is evaluated on 4 standard benchmarks and 3 massive graphs benchmarks. FastDS obtains the best performance for almost all benchmarks, and obtains better solutions than state-of-the-art algorithms on massive graphs.
【Keywords】: Heuristic Search and Game Playing: Combinatorial Search and Optimisation; Heuristic Search and Game Playing: Heuristic Search;
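For context on the problem FastDS targets, here is the classical greedy construction for a dominating set: the kind of initial solution a local search would build and then improve. It is not the paper's algorithm, whose framework and inference rules are more involved:

```python
def greedy_dominating_set(adj):
    """Greedy heuristic for minimum dominating set: repeatedly pick the
    vertex that dominates the most still-uncovered vertices.

    `adj` maps each vertex 0..n-1 to the set of its neighbors; a vertex
    dominates itself and all its neighbors."""
    n = len(adj)
    covered, ds = set(), []
    while len(covered) < n:
        best = max(range(n),
                   key=lambda v: len(({v} | adj[v]) - covered))
        ds.append(best)
        covered |= {best} | adj[best]
    return ds
```

On a star graph the greedy immediately finds the optimal single-vertex solution; on harder instances its output is only an upper bound, which is where local search earns its keep.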
【Paper Link】 【Pages】:1474-1480
【Authors】: Fei-Yu Liu ; Zi-Niu Li ; Chao Qian
【Abstract】: Evolution Strategies (ES) are a class of black-box optimization algorithms and have been widely applied to solve problems, e.g., in reinforcement learning (RL), where the true gradient is unavailable. ES estimate the gradient of an objective function with respect to the parameters by randomly sampling search directions and evaluating parameter perturbations in these directions. However, the gradient estimator of ES tends to have a high variance for high-dimensional optimization, thus requiring a large number of samples and making ES inefficient. In this paper, we propose a new ES algorithm SGES, which utilizes historical estimated gradients to construct a low-dimensional subspace for sampling search directions, and adjusts the importance of this subspace adaptively. We prove that the variance of the gradient estimator of SGES can be much smaller than that of Vanilla ES; meanwhile, its bias can be well bounded. Empirical results on benchmark black-box functions and a set of popular RL tasks exhibit the superior performance of SGES over state-of-the-art ES algorithms.
【Keywords】: Heuristic Search and Game Playing: Heuristic Search and Machine Learning; Heuristic Search and Game Playing: Heuristic Search; Machine Learning: Reinforcement Learning;
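The sampling idea behind SGES, drawing search directions from a low-dimensional subspace spanned by historical gradient estimates, can be sketched with an antithetic ES estimator. The normalization and the fixed subspace here are illustrative assumptions; SGES additionally adapts the importance of the subspace over time:

```python
import numpy as np

def es_gradient(f, x, sigma=0.1, n_pairs=10, basis=None, rng=None):
    """Antithetic ES gradient estimate of f at x.

    If `basis` is given (columns spanning a subspace, e.g. built from
    recent gradient estimates), search directions are drawn from that
    low-dimensional subspace instead of the full space, which is the
    variance-reduction idea behind subspace/guided ES methods."""
    rng = rng or np.random.default_rng(0)
    d = len(x)
    grad = np.zeros(d)
    for _ in range(n_pairs):
        if basis is None:
            u = rng.standard_normal(d)                      # full space
        else:
            u = basis @ rng.standard_normal(basis.shape[1])  # subspace
        u /= np.linalg.norm(u)
        # finite-difference estimate along +/- u (antithetic pair)
        grad += (f(x + sigma * u) - f(x - sigma * u)) / (2 * sigma) * u
    return grad / n_pairs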
【Paper Link】 【Pages】:1481-1487
【Authors】: Qingyun Zhang ; Zhipeng Lü ; Zhouxing Su ; Chumin Li ; Yuan Fang ; Fuda Ma
【Abstract】: The p-center problem consists of choosing p centers from a set of candidates to minimize the maximum cost between any client and its assigned facility. In this paper, we transform the p-center problem into a series of set covering subproblems, and propose a vertex weighting-based tabu search (VWTS) algorithm to solve them. The proposed VWTS algorithm integrates distinguishing features such as a vertex weighting technique and a tabu search strategy to help the search to jump out of the local optima. Computational experiments on 138 most commonly used benchmark instances show that VWTS is highly competitive comparing to the state-of-the-art methods in spite of its simplicity. As a well-known NP-hard problem which has already been studied for over half a century, it is a challenging task to break the records on these classic datasets. Yet VWTS improves the best known results for 14 out of 54 large instances, and matches the optimal results for all remaining 84 ones. In addition, the computational time taken by VWTS is much shorter than other algorithms in the literature.
【Keywords】: Heuristic Search and Game Playing: Combinatorial Search and Optimisation; Heuristic Search and Game Playing: Heuristic Search; Heuristic Search and Game Playing: Meta-Reasoning and Meta-heuristics; Planning and Scheduling: Planning Algorithms;
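The transformation described above, reducing p-center to a series of set-covering subproblems over candidate radii, can be sketched as follows. The greedy feasibility check is a stand-in for the paper's vertex-weighting tabu search, so the returned radius is only an upper bound in general:

```python
def p_center_feasible(dist, p, r):
    """Set-covering subproblem of p-center: can p centers cover every
    client within radius r? Answered with a greedy set-cover heuristic."""
    n = len(dist)
    cover = [{j for j in range(n) if dist[i][j] <= r} for i in range(n)]
    uncovered, centers = set(range(n)), []
    while uncovered and len(centers) < p:
        best = max(range(n), key=lambda i: len(cover[i] & uncovered))
        centers.append(best)
        uncovered -= cover[best]
    return not uncovered

def p_center(dist, p):
    """Scan candidate radii (the pairwise distances) in increasing
    order; the first feasible one is the answer for this heuristic."""
    radii = sorted({d for row in dist for d in row})
    for r in radii:
        if p_center_feasible(dist, p, r):
            return r
```

Only pairwise distances need to be considered as candidate radii, since the optimal maximum client-to-center cost is always one of them.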
【Paper Link】 【Pages】:1488-1494
【Authors】: Andrea Madotto ; Mahdi Namazifar ; Joost Huizinga ; Piero Molino ; Adrien Ecoffet ; Huaixiu Zheng ; Alexandros Papangelis ; Dian Yu ; Chandra Khatri ; Gökhan Tür
【Abstract】: This work presents an exploration and imitation-learning-based agent capable of state-of-the-art performance in playing text-based computer games. These games are of interest as they can be seen as a testbed for language understanding, problem-solving, and language generation by artificial agents. Moreover, they provide a learning setting in which these skills can be acquired through interactions with an environment rather than using fixed corpora. One aspect that makes these games particularly challenging for learning agents is the combinatorially large action space. Existing methods for solving text-based games are limited to games that are either very simple or have an action space restricted to a predetermined set of admissible actions. In this work, we propose to use the exploration approach of Go-Explore for solving text-based games. More specifically, in an initial exploration phase, we first extract trajectories with high rewards, after which we train a policy to solve the game by imitating these trajectories. Our experiments show that this approach outperforms existing solutions in solving text-based games, and it is more sample efficient in terms of the number of interactions with the environment. Moreover, we show that the learned policy can generalize better than existing solutions to unseen games without using any restriction on the action space.
【Keywords】: Heuristic Search and Game Playing: Game Playing and Machine Learning; Natural Language Processing: NLP Applications and Tools; Natural Language Processing: Other;
【Paper Link】 【Pages】:1495-1502
【Authors】: Chuan Luo ; Bo Qiao ; Xin Chen ; Pu Zhao ; Randolph Yao ; Hongyu Zhang ; Wei Wu ; Andrew Zhou ; Qingwei Lin
【Abstract】: Virtual machine (VM) provisioning is a common and critical problem in cloud computing. In industrial cloud platforms, there are a huge number of VMs provisioned per day. Due to the complexity and resource constraints, it needs to be carefully optimized to make cloud platforms effectively utilize the resources. Moreover, in practice, provisioning a VM from scratch requires fairly long time, which would degrade the customer experience. Hence, it is advisable to provision VMs ahead for upcoming demands. In this work, we formulate the practical scenario as the predictive VM provisioning (PreVMP) problem, where upcoming demands are unknown and need to be predicted in advance, and then the VM provisioning plan is optimized based on the predicted demands. Further, we propose Uncertainty-Aware Heuristic Search (UAHS) for solving the PreVMP problem. UAHS first models the prediction uncertainty, and then utilizes the prediction uncertainty in optimization. Moreover, UAHS leverages Bayesian optimization to make prediction and optimization interact, improving its practical performance. Extensive experiments show that UAHS performs much better than state-of-the-art competitors on two public datasets and an industrial dataset. UAHS has been successfully applied in Microsoft Azure and brought practical benefits in real-world applications.
【Keywords】: Heuristic Search and Game Playing: Heuristic Search; Heuristic Search and Game Playing: Heuristic Search and Machine Learning; Heuristic Search and Game Playing: Combinatorial Search and Optimisation; Multidisciplinary Topics and Applications: Autonomic Computing;
【Paper Link】 【Pages】:1503-1510
【Authors】: Bohan Li ; Xindi Zhang ; Shaowei Cai ; Jinkun Lin ; Yiyuan Wang ; Christian Blum
【Abstract】: The minimum connected dominating set (MCDS) problem is an important extension of the minimum dominating set problem, with wide applications, especially in wireless networks. Despite its practical importance, there are few works on solving MCDS for massive graphs, mainly due to the complexity of maintaining connectivity. In this paper, we propose two novel ideas, and develop a new local search algorithm for MCDS called NuCDS. First, a hybrid dynamic connectivity maintenance method is designed to switch alternately between a novel fast connectivity maintenance method based on spanning tree and its previous counterpart. Second, we define a new vertex property called "safety" to make the algorithm more considerate when selecting vertices. Experiments show that NuCDS significantly outperforms the state-of-the-art MCDS algorithms on both massive graphs and classic benchmarks.
【Keywords】: Heuristic Search and Game Playing: Combinatorial Search and Optimisation; Heuristic Search and Game Playing: Heuristic Search;
【Paper Link】 【Pages】:1512-1518
【Authors】: Zhijun Chen ; Huimin Wang ; Hailong Sun ; Pengpeng Chen ; Tao Han ; Xudong Liu ; Jie Yang
【Abstract】: End-to-end learning from crowds has recently been introduced as an EM-free approach to training deep neural networks directly from noisy crowdsourced annotations. It models the relationship between true labels and annotations with a specific type of neural layer, termed the crowd layer, which can be trained using pure backpropagation. Parameters of the crowd layer, however, can hardly be interpreted as annotator reliability, as compared with the more principled probabilistic approach. The lack of probabilistic interpretation further prevents extensions of the approach to account for important factors of annotation processes, e.g., instance difficulty. This paper presents SpeeLFC, a structured probabilistic model that incorporates the constraints of probability axioms for parameters of the crowd layer, which makes it possible to explicitly model annotator reliability while benefiting from the end-to-end training of neural networks. Moreover, we propose SpeeLFC-D, which further takes into account instance difficulty. Extensive validation on real-world datasets shows that our methods improve the state-of-the-art.
【Keywords】: Humans and AI: Human Computation and Crowdsourcing; Humans and AI: Human-AI Collaboration; Machine Learning: Probabilistic Machine Learning;
【Paper Link】 【Pages】:1519-1525
【Authors】: Xiang Cheng ; Yunzhe Hao ; Jiaming Xu ; Bo Xu
【Abstract】: Spiking Neural Network (SNN) is considered more biologically plausible and energy-efficient on emerging neuromorphic hardware. Recently the backpropagation algorithm has been utilized for training SNN, which allows SNN to go deeper and achieve higher performance. However, most existing SNN models for object recognition are mainly convolutional structures or fully-connected structures, which only have inter-layer connections, but no intra-layer connections. Inspired by lateral interactions in neuroscience, we propose a high-performance and noise-robust Spiking Neural Network (dubbed LISNN). Based on the convolutional SNN, we model the lateral interactions between spatially adjacent neurons and integrate them into the spiking neuron membrane potential formula, then build a multi-layer SNN on a popular deep learning framework, i.e., PyTorch. We utilize the pseudo-derivative method to solve the non-differentiable problem when applying backpropagation to train LISNN and test LISNN on multiple standard datasets. Experimental results demonstrate that the proposed model can achieve competitive or better performance compared to current state-of-the-art spiking neural networks on MNIST, Fashion-MNIST, and N-MNIST datasets. Besides, thanks to lateral interactions, our model possesses stronger noise robustness than other SNNs. Our work brings a biologically plausible mechanism into SNN, hoping that it can help us understand the visual information processing in the brain.
【Keywords】: Humans and AI: Cognitive Modeling;
【Paper Link】 【Pages】:1526-1533
【Authors】: Bryan Wilder ; Eric Horvitz ; Ece Kamar
【Abstract】: A rising vision for AI in the open world centers on the development of systems that can complement humans for perceptual, diagnostic, and reasoning tasks. To date, systems aimed at complementing the skills of people have employed models trained to be as accurate as possible in isolation. We demonstrate how an end-to-end learning strategy can be harnessed to optimize the combined performance of human-machine teams by considering the distinct abilities of people and machines. The goal is to focus machine learning on problem instances that are difficult for humans, while recognizing instances that are difficult for the machine and seeking human input on them. We demonstrate in two real-world domains (scientific discovery and medical diagnosis) that human-machine teams built via these methods outperform the individual performance of machines and people. We then analyze conditions under which this complementarity is strongest, and which training methods amplify it. Taken together, our work provides the first systematic investigation of how machine learning systems can be trained to complement human reasoning.
【Keywords】: Humans and AI: Human-AI Collaboration;
【Paper Link】 【Pages】:1534-1541
【Authors】: Jiyi Li ; Yasushi Kawase ; Yukino Baba ; Hisashi Kashima
【Abstract】: Quality assurance is one of the most important problems in crowdsourcing and human computation, and it has been extensively studied from various aspects. Typical approaches for quality assurance include unsupervised approaches such as introducing task redundancy (i.e., asking the same question to multiple workers and aggregating their answers) and supervised approaches such as using worker performance on past tasks or injecting qualification questions into tasks in order to estimate the worker performance. In this paper, we propose to utilize the worker performance as a global constraint for inferring the true answers. The existing semi-supervised approaches do not consider such use of qualification questions. We also propose to utilize the constraint as a regularizer combined with existing statistical aggregation methods. The experiments using heterogeneous multiple-choice questions demonstrate that the performance constraint not only has the power to estimate the ground truths when used by itself, but also boosts the existing aggregation methods when used as a regularizer.
【Keywords】: Humans and AI: Human Computation and Crowdsourcing;
【Paper Link】 【Pages】:1542-1548
【Authors】: Li'ang Yin ; Yunfei Liu ; Weinan Zhang ; Yong Yu
【Abstract】: Aggregating crowd wisdom infers true labels for objects from multiple noisy labels provided by various sources. Besides labels from sources, side information such as object features is also introduced to achieve higher inference accuracy. Usually, the learning-from-crowds framework is adopted. However, the framework considers each object in isolation and does not make full use of object features to overcome label noise. In this paper, we propose a clustering-based label-aware autoencoder (CLA) to alleviate label noise. CLA utilizes clusters to gather objects with similar features and exploits clustering to infer true labels, by constructing a novel deep generative process to simultaneously generate object features and source labels from clusters. For model inference, CLA extends the framework of variational autoencoders and utilizes maximum a posteriori (MAP) estimation, which prevents the model from overfitting and trivial solutions. Experiments on real-world tasks demonstrate the significant improvement of CLA compared with state-of-the-art aggregation algorithms.
【Keywords】: Humans and AI: Human Computation and Crowdsourcing; Machine Learning: Deep Generative Models; Machine Learning: Clustering; Machine Learning: Unsupervised Learning;
【Paper Link】 【Pages】:1549-1555
【Authors】: David Sarne ; Chen Rozenshtein
【Abstract】: This paper suggests a new paradigm for the design of collaborative autonomous agents engaged in executing a joint task alongside a human user. In particular, we focus on the way an agent's failures should affect its decision making, as far as user satisfaction measures are concerned. Unlike the common practice that considers agent (and more broadly, system) failures solely in the prism of their influence over the agent's contribution to the execution of the joint task, we argue that there is an additional, direct, influence which cannot be fully captured by the above measure. Through two series of large-scale controlled experiments with 450 human subjects, recruited through Amazon Mechanical Turk, we show that, indeed, such direct influence holds. Furthermore, we show that the use of a simple agent design that takes into account the direct influence of failures in its decision making yields considerably better user satisfaction, compared to an agent that focuses exclusively on maximizing its absolute contribution to the joint task.
【Keywords】: Humans and AI: Human-Computer Interaction; Humans and AI: Human-AI Collaboration; Robotics: Human Robot Interaction;
【Paper Link】 【Pages】:1556-1562
【Authors】: David Sarne ; Chen Rozenshtein
【Abstract】:
【Keywords】:
【Paper Link】 【Pages】:1563-1569
【Authors】: Feilong Tang
【Abstract】: Existing schemes cannot assign complex tasks to the most suitable workers because they either cannot measure skills quantitatively or do not consider assigning tasks to workers who are the most suitable but temporarily unavailable. In this paper, we investigate how to realize optimal complex task assignment. Firstly, we formulate the multiple-skill-based task assignment problem in service crowdsourcing. We then propose a weighted multi-skill tree (WMST) to model multiple skills and their correlations. Next, we propose the acceptance expectation to uniformly measure the probabilities that different categories of workers will accept and complete specified tasks. Finally, we propose an acceptance-expectation-based task assignment (AE-TA) algorithm, which reserves tasks for the most suitable workers even if they are temporarily unavailable. Comprehensive experimental results demonstrate that our WMST model and AE-TA algorithm significantly outperform related proposals.
【Keywords】: Humans and AI: Human Computation and Crowdsourcing; Humans and AI: Human-AI Collaboration;
【Paper Link】 【Pages】:1570-1576
【Authors】: Zhijie Fang ; Weiqun Wang ; Shixin Ren ; Jiaxing Wang ; Weiguo Shi ; Xu Liang ; Chen-Chen Fan ; Zeng-Guang Hou
【Abstract】: Recent deep learning-based Brain-Computer Interface (BCI) decoding algorithms mainly focus on spatial-temporal features, while failing to explicitly explore spectral information, which is one of the most important cues for BCI. In this paper, we propose a novel regional attention convolutional neural network (RACNN) to take full advantage of spectral-spatial-temporal features for EEG motion intention recognition. Time-frequency based analysis is adopted to reveal spectral-temporal features in terms of neural oscillations of the primary sensorimotor cortex. The basic idea of RACNN is to identify the activated area of the primary sensorimotor cortex adaptively. The RACNN aggregates a varied number of spectral-temporal features produced by a backbone convolutional neural network into a compact fixed-length representation. Inspired by the neuroscience finding of functional asymmetry between the cerebral hemispheres, we propose a region-biased loss to encourage high attention weights for the most critical regions. Extensive evaluations on two benchmark datasets and a real-world BCI dataset show that our approach significantly outperforms previous methods.
【Keywords】: Humans and AI: Brain Sciences; Multidisciplinary Topics and Applications: AI for Life Science; Robotics: Human Robot Interaction;
【Paper Link】 【Pages】:1577-1583
【Authors】: Yunfei Liu ; Yang Yang ; Xianyu Chen ; Jian Shen ; Haifeng Zhang ; Yong Yu
【Abstract】: Knowledge tracing (KT) is the task of predicting whether students can correctly answer questions based on their historical responses. Although much research has been devoted to exploiting the question information, a wealth of additional information about questions and skills has not been well extracted, making it challenging for previous work to perform adequately. In this paper, we demonstrate that large gains on KT can be realized by pre-training embeddings for each question on abundant side information, followed by training deep KT models on the obtained embeddings. To be specific, the side information includes question difficulty and three kinds of relations contained in a bipartite graph between questions and skills. To pre-train the question embeddings, we propose to use product-based neural networks to recover the side information. As a result, adopting the pre-trained embeddings in existing deep KT models significantly outperforms state-of-the-art baselines on three common KT datasets.
【Keywords】: Humans and AI: Cognitive Modeling; Humans and AI: Computer-Aided Education;
【Paper Link】 【Pages】:1585-1591
【Authors】: Abhijin Adiga ; Sarit Kraus ; Oleg Maksimov ; S. S. Ravi
【Abstract】: In Boolean games, each agent controls a set of Boolean variables and has a goal represented by a propositional formula. We study inference problems in Boolean games assuming the presence of a PRINCIPAL who has the ability to control the agents and impose taxation schemes. Previous work used taxation schemes to guide a game towards certain equilibria. We present algorithms that show how taxation schemes can also be used to infer agents' goals. We present experimental results to demonstrate the efficacy of our algorithms. We also consider goal inference when only limited information is available in response to a query.
【Keywords】: Knowledge Representation and Reasoning: Knowledge Representation and Game Theory; Social Choice; Agent-based and Multi-agent Systems: Agent Theories and Models; Agent-based and Multi-agent Systems: Noncooperative Games;
【Paper Link】 【Pages】:1592-1600
【Authors】: Stuart Armstrong ; Jan Leike ; Laurent Orseau ; Shane Legg
【Abstract】: In some agent designs, such as inverse reinforcement learning, an agent needs to learn its own reward function. Learning the reward function and optimising for it are typically two different processes, usually performed at different stages. We consider a continual ("one life") learning approach where the agent both learns the reward function and optimises for it at the same time. We show that this comes with a number of pitfalls, such as deliberately manipulating the learning process in one direction, refusing to learn, "learning" facts already known to the agent, and making decisions that are strictly dominated (for all relevant reward functions). We formally introduce two desirable properties: the first is "unriggability", which prevents the agent from steering the learning process in the direction of a reward function that is easier to optimise. The second is "uninfluenceability", whereby the reward-function learning process operates by learning facts about the environment. We show that an uninfluenceable process is automatically unriggable, and if the set of possible environments is sufficiently large, the converse is true too.
【Keywords】: Knowledge Representation and Reasoning: Belief Change, Belief Merging; Machine Learning: Reinforcement Learning; Agent-based and Multi-agent Systems: Human-Agent Interaction;
【Paper Link】 【Pages】:1601-1607
【Authors】: Fadi Badra
【Abstract】: Analogical transfer consists in leveraging a measure of similarity between two situations to predict the amount of similarity between their outcomes. Acquiring a suitable similarity measure for analogical transfer may be difficult, especially when the data is sparse or when the domain knowledge is incomplete. To alleviate this problem, this paper presents a dataset complexity measure that can be used either to select an optimal similarity measure, or if the similarity measure is given, to perform analogical transfer: among the potential outcomes of a new situation, the most plausible is the one which minimizes the dataset complexity.
【Keywords】: Knowledge Representation and Reasoning: Case-based Reasoning; Knowledge Representation and Reasoning: Qualitative, Geometric, Spatial, Temporal Reasoning;
【Paper Link】 【Pages】:1608-1614
【Authors】: Meghyn Bienvenu ; Quentin Manière ; Michaël Thomazo
【Abstract】: Ontology-mediated query answering (OMQA) is a promising approach to data access and integration that has been actively studied in the knowledge representation and database communities for more than a decade. The vast majority of work on OMQA focuses on conjunctive queries, whereas more expressive queries that feature counting or other forms of aggregation remain largely unexplored. In this paper, we introduce a general form of counting query, relate it to previous proposals, and study the complexity of answering such queries in the presence of DL-Lite ontologies. As it follows from existing work that query answering is intractable and often of high complexity, we consider some practically relevant restrictions, for which we establish improved complexity bounds.
【Keywords】: Knowledge Representation and Reasoning: Description Logics and Ontologies; Knowledge Representation and Reasoning: Computational Complexity of Reasoning;
【Paper Link】 【Pages】:1615-1621
【Authors】: Lasse Dissing ; Thomas Bolander
【Abstract】: Previous research has claimed dynamic epistemic logic (DEL) to be a suitable formalism for representing essential aspects of a Theory of Mind (ToM) for an autonomous agent. This includes the ability of the formalism to represent the reasoning involved in false-belief tasks of arbitrary order, and hence for autonomous agents based on the formalism to become able to pass such tests. This paper provides evidence for the claims by documenting the implementation of a DEL-based reasoning system on a humanoid robot. Our implementation allows the robot to perform cognitive perspective-taking, in particular to reason about the first- and higher-order beliefs of other agents. We demonstrate how this allows the robot to pass a quite general class of false-belief tasks involving human agents. Additionally, as is briefly illustrated, it allows the robot to proactively provide human agents with relevant information in situations where a system without ToM-abilities would fail. The symbolic grounding problem of turning robotic sensor input into logical action descriptions in DEL is achieved via a perception system based on deep neural networks.
【Keywords】: Knowledge Representation and Reasoning: Reasoning about Knowledge and Belief; Robotics: Human Robot Interaction; Robotics: Social Robots;
【Paper Link】 【Pages】:1622-1628
【Authors】: Robert Bredereck ; Lilian Jacobs ; Leon Kellerhals
【Abstract】: We consider the setting of asynchronous opinion diffusion with majority threshold: given a social network with each agent assigned to one opinion, an agent will update its opinion if more than half of its neighbors agree on a different opinion. The stabilized final outcome highly depends on the sequence in which agents update their opinion. We are interested in optimistic sequences---sequences that maximize the spread of a chosen opinion. We complement known results for two opinions where optimistic sequences can be computed in time and length linear in the number of agents. We analyze upper and lower bounds on the length of optimistic sequences, showing quadratic bounds in the general and linear bounds in the acyclic case. Moreover, we show that in networks with more than two opinions determining a spread-maximizing sequence becomes intractable; surprisingly, already with three opinions the intractability results hold in highly restricted cases, e.g., when each agent has at most three neighbors, when looking for a short sequence, or when we aim for approximate solutions.
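The majority-threshold update rule stated in the abstract is concrete enough to illustrate. Below is a toy sketch of my own (invented names, not the authors' code): an agent adopts a different opinion exactly when strictly more than half of its neighbours currently hold that opinion.

```python
# Toy sketch of asynchronous opinion diffusion with a strict-majority
# threshold (an illustration, not the paper's implementation).

def update(opinions, neighbours, agent):
    """Apply one asynchronous update step to `agent`; return True if its opinion changed."""
    nbrs = neighbours[agent]
    if not nbrs:
        return False
    counts = {}
    for n in nbrs:
        counts[opinions[n]] = counts.get(opinions[n], 0) + 1
    for op, c in counts.items():
        # adopt a *different* opinion held by more than half of the neighbours
        if op != opinions[agent] and c > len(nbrs) / 2:
            opinions[agent] = op
            return True
    return False

# A path a-b-c where both of b's neighbours hold opinion 1, so b flips:
opinions = {"a": 1, "b": 0, "c": 1}
neighbours = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
update(opinions, neighbours, "b")  # opinions["b"] becomes 1
```

Which agent is picked at each step is exactly the choice an optimistic sequence optimizes over.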
【Keywords】: Knowledge Representation and Reasoning: Knowledge Representation and Game Theory; Social Choice; Agent-based and Multi-agent Systems: Algorithmic Game Theory; Agent-based and Multi-agent Systems: Computational Social Choice;
【Paper Link】 【Pages】:1629-1635
【Authors】: Prantik Chatterjee ; Abhijit Chatterjee ; José Campos ; Rui Abreu ; Subhajit Roy
【Abstract】: Spectrum-based Fault Localization (SFL) approaches aim to efficiently localize faulty components by examining program behavior. This is done by collecting the execution patterns of various combinations of components and the corresponding outcomes into a spectrum. Efficient fault localization depends heavily on the quality of the spectra. Previous approaches, including the current state-of-the-art Density-Diversity-Uniqueness (DDU) approach, attempt to generate “good” test-suites by improving certain structural properties of the spectra. In this work, we propose a different approach, Multiverse Analysis, that considers multiple hypothetical universes, each corresponding to a scenario where one of the components is assumed to be faulty, to generate a spectrum that attempts to reduce the expected worst-case wasted effort over all the universes. Our experiments show that Multiverse Analysis not only improves the efficiency of fault localization but also achieves better coverage and generates smaller test-suites than DDU, the current state-of-the-art technique. On average, our approach reduces the developer effort over DDU by over 16% for more than 92% of the instances. Further, the improvements over DDU are statistically significant on the paired Wilcoxon signed-rank test.
【Keywords】: Knowledge Representation and Reasoning: Diagnosis and Abductive Reasoning; Multidisciplinary Topics and Applications: Knowledge-based Software Engineering;
【Paper Link】 【Pages】:1636-1643
【Authors】: Patrick Koopmann ; Jieying Chen
【Abstract】: In deductive module extraction, we determine a small subset of an ontology for a given vocabulary that preserves all logical entailments that can be expressed in that vocabulary. While in the literature stronger module notions have been discussed, we argue that for applications in ontology analysis and ontology reuse, deductive modules, which are decidable and potentially smaller, are often sufficient. We present methods based on uniform interpolation for extracting different variants of deductive modules, satisfying properties such as completeness, minimality and robustness under replacements, the latter being particularly relevant for ontology reuse. An evaluation of our implementation shows that the modules computed by our method are often significantly smaller than those computed by existing methods.
【Keywords】: Knowledge Representation and Reasoning: Description Logics and Ontologies;
【Paper Link】 【Pages】:1644-1650
【Authors】: Junyou Li ; Gong Cheng ; Qingxia Liu ; Wen Zhang ; Evgeny Kharlamov ; Kalpa Gunaratna ; Huajun Chen
【Abstract】: In a large-scale knowledge graph (KG), an entity is often described by a large number of triple-structured facts. Many applications require abridged versions of entity descriptions, called entity summaries. Existing solutions to entity summarization are mainly unsupervised. In this paper, we present a supervised approach NEST that is based on our novel neural model to jointly encode graph structure and text in KGs and generate high-quality diversified summaries. Since it is costly to obtain manually labeled summaries for training, our supervision is weak as we train with programmatically labeled data which may contain noise but is free of manual work. Evaluation results show that our approach significantly outperforms the state of the art on two public benchmarks.
【Keywords】: Knowledge Representation and Reasoning: Semantic Web; Machine Learning: Deep Learning; Machine Learning: Learning Preferences or Rankings;
【Paper Link】 【Pages】:1651-1657
【Authors】: Ondrej Cepek ; Milos Chromý
【Abstract】: In this paper we focus on a less usual way to represent Boolean functions, namely representations by switch-lists. Given a truth table representation of a Boolean function f, the switch-list representation (SLR) of f is a list of Boolean vectors from the truth table which have a different function value than the preceding Boolean vector in the truth table. The main aim of this paper is to include the language SL of all SLRs in the Knowledge Compilation Map [Darwiche and Marquis, 2002] and to argue that SL may in certain situations constitute a reasonable choice for a target language in knowledge compilation. First we compare SL with a number of standard representation languages (such as CNF, DNF, and OBDD) with respect to their relative succinctness. As a by-product of this analysis we also give a short proof of a long-standing open question from [Darwiche and Marquis, 2002], namely the incomparability of the MODS (models) and PI (prime implicates) languages. Next we analyze which standard transformations and queries (those considered in [Darwiche and Marquis, 2002]) can be performed in poly-time with respect to the size of the input SLR. We show that this collection is quite broad and the combination of supported poly-time transformations and queries is quite unique.
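The SLR construction follows directly from the definition quoted above. The sketch below is my own illustration; it assumes the truth table is ordered lexicographically and that an implicit value of 0 precedes the first row (an assumption not stated in the abstract).

```python
# Illustrative construction of a switch-list representation (SLR):
# list the truth-table vectors whose function value differs from that
# of the preceding vector. (My sketch; ordering and initial-value
# conventions are assumptions.)
from itertools import product

def switch_list(f, n):
    """Return the SLR of an n-variable Boolean function f over the
    lexicographically ordered truth table."""
    slr = []
    prev = 0  # assumed implicit value before the first row
    for vec in product([0, 1], repeat=n):
        val = f(vec)
        if val != prev:
            slr.append(vec)
            prev = val
    return slr

# Two-variable XOR switches from 0 to 1 at 01 and back to 0 at 11:
print(switch_list(lambda v: v[0] ^ v[1], 2))  # → [(0, 1), (1, 1)]
```

Any function with few switches in this ordering thus gets a compact SLR, regardless of its CNF or DNF size.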
【Keywords】: Knowledge Representation and Reasoning: Knowledge Representation Languages; Knowledge Representation and Reasoning: Automated Reasoning; Tractable Languages and Knowledge compilation; Knowledge Representation and Reasoning: Computational Complexity of Reasoning;
【Paper Link】 【Pages】:1658-1666
【Authors】: Diego Calvanese ; Julien Corman ; Davide Lanti ; Simon Razniewski
【Abstract】: Counting answers to a query is an operation supported by virtually all database management systems. In this paper we focus on counting answers over a Knowledge Base (KB), which may be viewed as a database enriched with background knowledge about the domain under consideration. In particular, we place our work in the context of Ontology-Mediated Query Answering/Ontology-based Data Access (OMQA/OBDA), where the language used for the ontology is a member of the DL-Lite family and the data is a (usually virtual) set of assertions. We study the data complexity of query answering, for different members of the DL-Lite family that include number restrictions, and for variants of conjunctive queries with counting that differ with respect to their shape (connected, branching, rooted). We improve upon existing results by providing PTIME and coNP lower bounds, and upper bounds in PTIME and LOGSPACE. For the LOGSPACE case, we have devised a novel query rewriting technique into first-order logic with counting.
【Keywords】: Knowledge Representation and Reasoning: Computational Complexity of Reasoning; Knowledge Representation and Reasoning: Description Logics and Ontologies; Multidisciplinary Topics and Applications: Databases;
【Paper Link】 【Pages】:1667-1673
【Authors】: Dennis Craandijk ; Floris Bex
【Abstract】: In this paper, we present a learning-based approach to determining acceptance of arguments under several abstract argumentation semantics. More specifically, we propose an argumentation graph neural network (AGNN) that learns a message-passing algorithm to predict the likelihood of an argument being accepted. The experimental results demonstrate that the AGNN can almost perfectly predict the acceptability under different semantics and scales well for larger argumentation frameworks. Furthermore, analysing the behaviour of the message-passing algorithm shows that the AGNN learns to adhere to basic principles of argument semantics as identified in the literature, and can thus be trained to predict extensions under the different semantics – we show how the latter can be done for multi-extension semantics by using AGNNs to guide a basic search. We publish our code at https://github.com/DennisCraandijk/DL-Abstract-Argumentation.
【Keywords】: Knowledge Representation and Reasoning: Computational Models of Argument; Machine Learning: Neuro-Symbolic Methods;
【Paper Link】 【Pages】:1674-1680
【Authors】: Benjamin Aminof ; Giuseppe De Giacomo ; Alessio Lomuscio ; Aniello Murano ; Sasha Rubin
【Abstract】: We consider an agent that operates with two models of the environment: one that captures expected behaviors and one that captures additional exceptional behaviors. We study the problem of synthesizing agent strategies that enforce a goal against environments operating as expected while also making a best effort against exceptional environment behaviors. We formalize these concepts in the context of linear-temporal logic, and give an algorithm for solving this problem. We also show that there is no trade-off between enforcing the goal under the expected environment specification and making a best-effort for it under the exceptional one.
【Keywords】: Knowledge Representation and Reasoning: Action, Change and Causality; Planning and Scheduling: Theoretical Foundations of Planning; Agent-based and Multi-agent Systems: Formal Verification, Validation and Synthesis;
【Paper Link】 【Pages】:1681-1687
【Authors】: Bartosz Bednarczyk ; Stéphane Demri ; Alessio Mansutti
【Abstract】: Description logics are well-known logical formalisms for knowledge representation. We propose to enrich knowledge bases (KBs) with dynamic axioms that specify how the satisfaction of statements from the KBs evolves when the interpretation is decomposed or recomposed, providing a natural means to predict the evolution of interpretations. Our dynamic axioms borrow logical connectives from separation logics, well-known specification languages to verify programs with dynamic data structures. In the paper, we focus on ALC and EL augmented with dynamic axioms, or to their subclass of positive dynamic axioms. The knowledge base consistency problem in the presence of dynamic axioms is investigated, leading to interesting complexity results, among which the problem for EL with positive dynamic axioms is tractable, whereas EL with dynamic axioms is undecidable.
【Keywords】: Knowledge Representation and Reasoning: Description Logics and Ontologies; Knowledge Representation and Reasoning: Logics for Knowledge Representation;
【Paper Link】 【Pages】:1688-1694
【Authors】: Bernardo Cuteri ; Carmine Dodaro ; Francesco Ricca ; Peter Schüller
【Abstract】: Answer Set Programming (ASP) is a well-known formalism for Knowledge Representation and Reasoning, successfully employed to solve many AI problems, also thanks to the availability of efficient implementations. Traditionally, ASP systems are based on the ground-and-solve approach, where the grounding transforms a general input program into its propositional counterpart, whose stable models are then computed by the solver using the CDCL algorithm. This approach suffers from an intrinsic limitation: the grounding of one or a few constraints may be unaffordable from a computational point of view, a problem known as the grounding bottleneck. In this paper, we develop an innovative approach for evaluating ASP programs, where some of the constraints of the input program are not grounded but automatically translated into propagators of the CDCL algorithm that work on partial interpretations. We implemented the new approach on top of the solver WASP and carried out an experimental analysis on different benchmarks. Results show that our approach consistently outperforms state-of-the-art ASP systems by overcoming the grounding bottleneck.
【Keywords】: Knowledge Representation and Reasoning: Knowledge Representation Languages; Knowledge Representation and Reasoning: Non-monotonic Reasoning, Common-Sense Reasoning; Knowledge Representation and Reasoning: Other;
【Paper Link】 【Pages】:1695-1702
【Authors】: Heshan Du ; Natasha Alechina ; Anthony G. Cohn
【Abstract】: We propose a logic of directions for points (LD) over 2D Euclidean space, which formalises primary direction relations east (E), west (W), and indeterminate east/west (Iew), north (N), south (S) and indeterminate north/south (Ins). We provide a sound and complete axiomatisation of it, and prove that its satisfiability problem is NP-complete.
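On one plausible reading of the six primary relations (my assumption, not the paper's formal semantics): E/W compare x-coordinates, N/S compare y-coordinates, and the indeterminate relations Iew/Ins hold when the respective coordinates coincide.

```python
# Hypothetical sketch of the six primary direction relations between
# two points in the 2D Euclidean plane. The tie-breaking reading of
# Iew/Ins is my assumption.

def direction_relations(p, q):
    """Return the (east/west, north/south) relations of p w.r.t. q."""
    (px, py), (qx, qy) = p, q
    ew = "E" if px > qx else "W" if px < qx else "Iew"
    ns = "N" if py > qy else "S" if py < qy else "Ins"
    return ew, ns

print(direction_relations((2, 0), (1, 0)))  # p east of q, north/south indeterminate
```

Under this reading every pair of points satisfies exactly one east/west and one north/south relation, which is the kind of jointly exhaustive, pairwise disjoint structure a complete axiomatisation would formalise.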
【Keywords】: Knowledge Representation and Reasoning: Qualitative, Geometric, Spatial, Temporal Reasoning; Knowledge Representation and Reasoning: Logics for Knowledge Representation; Knowledge Representation and Reasoning: Knowledge Representation Languages;
【Paper Link】 【Pages】:1703-1711
【Authors】: Kaisheng Wu ; Liangda Fang ; Liping Xiong ; Zhao-Rong Lai ; Yong Qiao ; Kaidong Chen ; Fei Rong
【Abstract】: Strategy representation and reasoning has recently received much attention in artificial intelligence. Impartial combinatorial games (ICGs) are a type of elementary and fundamental game in game theory. One of the challenging problems of ICGs is to construct winning strategies, particularly generalized winning strategies for possibly infinitely many instances of ICGs. In this paper, we investigate synthesizing generalized winning strategies for ICGs. To this end, we first propose a logical framework to formalize ICGs based on the linear integer arithmetic fragment of the numeric part of PDDL. We then propose an approach to generating the winning formula that exactly captures the states in which the player can force a win. Furthermore, we compute winning strategies for ICGs based on the winning formula. Experimental results on several games demonstrate the effectiveness of our approach.
【Keywords】: Knowledge Representation and Reasoning: Action, Change and Causality; Knowledge Representation and Reasoning: Knowledge Representation and Game Theory; Social Choice; Agent-based and Multi-agent Systems: Formal Verification, Validation and Synthesis; Planning and Scheduling: Distributed;Multi-agent Planning;
【Paper Link】 【Pages】:1712-1718
【Authors】: Bettina Fazzinga ; Sergio Flesca ; Filippo Furfaro
【Abstract】: We revisit the notion of i-extension, i.e., the adaptation of the fundamental notion of extension to the case of incomplete Abstract Argumentation Frameworks. We show that the definition of i-extension raises some concerns in the "possible" variant, e.g., it allows even conflicting arguments to be collectively considered as members of an (i-)extension. Thus, we introduce the alternative notion of i*-extension overcoming the highlighted problems, and provide a thorough complexity characterization of the corresponding verification problem. Interestingly, we show that the revisitation not only has beneficial effects for the semantics, but also for the complexity: under various semantics, the verification problem under the possible perspective moves from NP-complete to P.
【Keywords】: Knowledge Representation and Reasoning: Computational Complexity of Reasoning; Knowledge Representation and Reasoning: Computational Models of Argument;
【Paper Link】 【Pages】:1719-1725
【Authors】: Bartosz Bednarczyk ; Robert Ferens ; Piotr Ostropolski-Nalewaja
【Abstract】: The chase is a famous algorithmic procedure in database theory with numerous applications in ontology-mediated query answering. We consider static analysis of the chase termination problem, which asks, given a set of TGDs, whether the chase terminates on all input databases. The problem was recently shown to be undecidable by Gogacz et al. for sets of rules containing only ternary predicates. In this work, we show that undecidability occurs already for sets of single-head TGDs over binary vocabularies. This question is relevant since many real-world ontologies, e.g., those from the Horn fragment of the popular OWL, are of this shape.
【Keywords】: Knowledge Representation and Reasoning: Logics for Knowledge Representation; Multidisciplinary Topics and Applications: Databases;
【Paper Link】 【Pages】:1726-1733
【Authors】: Hubie Chen ; Georg Gottlob ; Matthias Lanzinger ; Reinhard Pichler
【Abstract】: Constraint satisfaction problems (CSPs) are an important formal framework for the uniform treatment of various prominent AI tasks, e.g., coloring or scheduling problems. Solving CSPs is, in general, known to be NP-complete and fixed-parameter intractable when parameterized by their constraint scopes. We give a characterization of those classes of CSPs for which the problem becomes fixed-parameter tractable. Our characterization significantly increases the utility of the CSP framework by making it possible to decide the fixed-parameter tractability of problems via their CSP formulations. We further extend our characterization to the evaluation of unions of conjunctive queries, a fundamental problem in databases. Furthermore, we provide some new insight on the frontier of PTIME solvability of CSPs. In particular, we observe that bounded fractional hypertree width is more general than bounded hypertree width only for classes that exhibit a certain type of exponential growth. The presented work resolves a long-standing open problem and yields powerful new tools for complexity research in AI and database theory.
【Keywords】: Knowledge Representation and Reasoning: Computational Complexity of Reasoning; Constraints and SAT: Constraint Satisfaction; Multidisciplinary Topics and Applications: Databases;
【Paper Link】 【Pages】:1734-1740
【Authors】: Rachael Colley ; Umberto Grandi ; Arianna Novaro
【Abstract】: We propose a generalisation of liquid democracy in which a voter can either vote directly on the issues at stake, delegate her vote to another voter, or express complex delegations to a set of trusted voters. By requiring a ranking of desirable delegations and a backup vote from each voter, we are able to put forward and compare four algorithms to solve delegation cycles and obtain a final collective decision.
【Keywords】: Knowledge Representation and Reasoning: Knowledge Representation and Game Theory; Social Choice; Agent-based and Multi-agent Systems: Computational Social Choice; Agent-based and Multi-agent Systems: Voting;
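The delegation-resolution setting in the abstract above can be made concrete with a toy unravelling procedure. This is a hedged sketch of one plausible way to combine ranked delegations with backup votes; the data layout, function names, and the specific cycle-breaking rule are illustrative assumptions, not the paper's four algorithms.

```python
# Hypothetical sketch: resolve ranked delegations with backup votes.
# A voter either votes directly, or tries trusted delegates in rank order;
# if every delegation fails (e.g., all would close a cycle), the backup
# vote is used. Layout and cycle-breaking rule are illustrative only.

def resolve(profile):
    """profile maps each voter to a dict with:
         'vote'      : direct vote on the issue, or None,
         'delegates' : ranked list of trusted voters (may be empty),
         'backup'    : vote used if every delegation fails.
       Returns a dict voter -> final vote."""
    result = {}

    def trace(voter, visiting):
        if voter in result:
            return result[voter]
        entry = profile[voter]
        if entry['vote'] is not None:          # direct vote
            result[voter] = entry['vote']
            return result[voter]
        visiting.add(voter)
        for d in entry['delegates']:           # try delegations in rank order
            if d not in visiting:              # skip would-be cycles
                v = trace(d, visiting)
                if v is not None:
                    visiting.discard(voter)
                    result[voter] = v
                    return v
        visiting.discard(voter)
        result[voter] = entry['backup']        # all delegations failed
        return result[voter]

    for voter in profile:
        trace(voter, set())
    return result
```

For instance, two voters who delegate to each other both fall back to a backup vote under this rule, while a voter delegating to a direct voter inherits that vote.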
【Paper Link】 【Pages】:1741-1747
【Authors】: Robert Ganian ; Thekla Hamm ; Guillaume Mescoff
【Abstract】: The Resource-Constrained Project Scheduling Problem (RCPSP) and its extension via activity modes (MRCPSP) are well-established scheduling frameworks that have found numerous applications in a broad range of settings related to artificial intelligence. Unsurprisingly, the problem of finding a suitable schedule in these frameworks is known to be NP-complete; however, aside from a few results for special cases, we have lacked an in-depth and comprehensive understanding of the complexity of the problems from the viewpoint of natural restrictions of the considered instances.
In the first part of our paper, we develop new algorithms and give hardness proofs in order to obtain a detailed complexity map of (M)RCPSP that settles the complexity of all 1024 considered variants of the problem defined in terms of explicit restrictions of natural parameters of instances. In the second part, we turn to implicit structural restrictions defined in terms of the complexity of interactions between individual activities. In particular, we show that if the treewidth of a graph which captures such interactions is bounded by a constant, then we can solve MRCPSP in polynomial time.
【Keywords】: Knowledge Representation and Reasoning: Computational Complexity of Reasoning; Planning and Scheduling: Scheduling; Planning and Scheduling: Theoretical Foundations of Planning;
【Paper Link】 【Pages】:1748-1754
【Authors】: Shuxin Li ; Zixian Huang ; Gong Cheng ; Evgeny Kharlamov ; Kalpa Gunaratna
【Abstract】: A prominent application of knowledge graphs (KGs) is document enrichment. Existing methods identify mentions of entities in a background KG and enrich documents with entity types and direct relations. We compute an entity relation subgraph (ERG) that can more expressively represent indirect relations among a set of mentioned entities. To find compact, representative, and relevant ERGs for effective enrichment, we propose an efficient best-first search algorithm to solve a new combinatorial optimization problem that achieves a trade-off between representativeness and compactness, and then we exploit ontological knowledge to rank ERGs by entity-based document-KG and intra-KG relevance. Extensive experiments and user studies show the promising performance of our approach.
【Keywords】: Knowledge Representation and Reasoning: Semantic Web; Data Mining: Mining Graphs, Semi Structured Data, Complex Data;
【Paper Link】 【Pages】:1755-1762
【Authors】: Zhun Yang ; Adam Ishay ; Joohyung Lee
【Abstract】: We present NeurASP, a simple extension of answer set programs by embracing neural networks. By treating the neural network output as the probability distribution over atomic facts in answer set programs, NeurASP provides a simple and effective way to integrate sub-symbolic and symbolic computation. We demonstrate how NeurASP can make use of a pre-trained neural network in symbolic computation and how it can improve the neural network's perception result by applying symbolic reasoning in answer set programming. Also, NeurASP can make use of ASP rules to train a neural network better, so that the neural network learns not only from implicit correlations in the data but also from the explicit complex semantic constraints expressed by the rules.
【Keywords】: Knowledge Representation and Reasoning: Knowledge Representation Languages; Machine Learning: Neuro-Symbolic Methods;
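The core idea above — treating neural outputs as a distribution over atomic facts and letting symbolic rules re-rank joint assignments — can be illustrated with a minimal toy, which is not the NeurASP system itself: the probabilities, the two-image digit setup, and the constraint are all assumed for the example.

```python
# Minimal illustration (not NeurASP itself): neural network outputs act as
# probability distributions over atomic facts, and a symbolic constraint
# re-ranks the joint assignments. The softmax values below are made up.
from itertools import product

# Assumed softmax outputs of a digit classifier on two images.
p_img1 = {0: 0.1, 1: 0.6, 2: 0.3}
p_img2 = {0: 0.2, 1: 0.2, 2: 0.6}

def most_likely(constraint):
    """Most probable joint assignment (d1, d2) satisfying the constraint."""
    best, best_p = None, 0.0
    for d1, d2 in product(p_img1, p_img2):
        p = p_img1[d1] * p_img2[d2]   # atomic facts treated as independent
        if constraint(d1, d2) and p > best_p:
            best, best_p = (d1, d2), p
    return best, best_p

# Without reasoning, the network alone would predict (1, 2); a symbolic
# rule such as "the two digits sum to 2" changes the answer to (1, 1):
print(most_likely(lambda a, b: a + b == 2))
```

The point of the sketch is that symbolic knowledge can correct a perception result without retraining the network, which mirrors the "improve the neural network's perception result" claim in the abstract.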
【Paper Link】 【Pages】:1763-1769
【Authors】: Patricia Everaere ; Sébastien Konieczny ; Pierre Marquis
【Abstract】: We study how belief merging operators can be considered as maximum likelihood estimators, i.e., we assume that there exists an (unknown) true state of the world and that each agent participating in the merging process receives a noisy signal of it, characterized by a noise model. The objective is then to aggregate the agents' belief bases to make the best possible guess about the true state of the world. In this paper, some logical connections between the rationality postulates for belief merging (IC postulates) and simple conditions over the noise model under consideration are exhibited. These results provide a new justification for IC merging postulates. We also provide results for two specific natural noise models: the world swap noise and the atom swap noise, by identifying distance-based merging operators that are maximum likelihood estimators for these two noise models.
【Keywords】: Knowledge Representation and Reasoning: Belief Change, Belief Merging;
【Paper Link】 【Pages】:1770-1776
【Authors】: Nicolas Schwind ; Sébastien Konieczny ; Jean-Marie Lagniez ; Pierre Marquis
【Abstract】: Iterated belief change aims to determine how the belief state of a rational agent evolves given a sequence of change formulae. Several families of iterated belief change operators (revision operators, improvement operators) have been pointed out so far, and characterized from an axiomatic point of view. This paper focuses on the inference problem for iterated belief change, when belief states are represented as a special kind of stratified belief bases. The computational complexity of the inference problem is identified and shown to be identical for all revision operators satisfying Darwiche and Pearl's (R1-R6) postulates. In addition, some complexity bounds for the inference problem are provided for the family of soft improvement operators. We also show that a revised belief state can be computed in a reasonable time for large-sized instances using SAT-based algorithms, and we report empirical results showing the feasibility of iterated belief change for bases of significant sizes.
【Keywords】: Knowledge Representation and Reasoning: Belief Change, Belief Merging; Knowledge Representation and Reasoning: Non-monotonic Reasoning, Common-Sense Reasoning;
【Paper Link】 【Pages】:1777-1783
【Authors】: David Carral ; Markus Krötzsch
【Abstract】: Especially in data-intensive settings, a promising reasoning approach for description logics (DLs) is to rewrite DL theories into sets of rules. Although many such approaches have been considered in the literature, there are still various relevant DLs for which no small rewriting (of polynomial size) is known. We therefore develop small rewritings for the DL ALCHIQ -- featuring disjunction, number restrictions, and inverse roles -- to disjunctive Datalog. By admitting existential quantifiers in rule heads, we can improve this result to yield only rules of bounded size, a property that is common to all rewritings that were implemented in practice so far.
【Keywords】: Knowledge Representation and Reasoning: Description Logics and Ontologies; Knowledge Representation and Reasoning: Knowledge Representation Languages; Knowledge Representation and Reasoning: Logics for Knowledge Representation;
【Paper Link】 【Pages】:1784-1790
【Authors】: Peter Jonsson ; Victor Lagerkvist
【Abstract】: We study the fine-grained complexity of NP-complete, infinite-domain constraint satisfaction problems (CSPs) parameterised by a set of first-order definable relations (with equality). Such CSPs are of central importance since they form a subclass of any infinite-domain CSP parameterised by a set of first-order definable relations. We prove that under the randomised exponential-time hypothesis it is not possible to find c > 1 such that a CSP over an arbitrary finite equality language is solvable in O(c^n) time (n is the number of variables). Stronger lower bounds are possible for infinite equality languages where we rule out the existence of 2^o(n log n) time algorithms; a lower bound which also extends to satisfiability modulo theories solving for an arbitrary background theory. Despite these lower bounds we prove that for each c > 1 there exists an NP-hard equality CSP solvable in O(c^n) time. Lower bounds like these immediately ask for closely matching upper bounds, and we prove that a CSP over a finite equality language is always solvable in O(c^n) time for a fixed c.
【Keywords】: Knowledge Representation and Reasoning: Computational Complexity of Reasoning; Constraints and SAT: Constraint Satisfaction; Constraints and SAT: Satisfiability Modulo Theories;
【Paper Link】 【Pages】:1791-1797
【Authors】: Gianluca Cima ; Domenico Lembo ; Riccardo Rosati ; Domenico Fabio Savo
【Abstract】: We study privacy-preserving query answering in Description Logics (DLs). Specifically, we consider the approach of controlled query evaluation (CQE) based on the notion of instance indistinguishability. We derive data complexity results for query answering over DL-LiteR ontologies, through a comparison with an alternative, existing confidentiality-preserving approach to CQE. Finally, we identify a semantically well-founded notion of approximated query answering for CQE, and prove that, for DL-LiteR ontologies, this form of CQE is tractable with respect to data complexity and is first-order rewritable, i.e., it is always reducible to the evaluation of a first-order query over the data instance.
【Keywords】: Knowledge Representation and Reasoning: Description Logics and Ontologies;
【Paper Link】 【Pages】:1798-1804
【Authors】: Fei Liang ; Zhe Lin
【Abstract】: Implicative semi-lattices (also known as Brouwerian semi-lattices) are a generalization of Heyting algebras, and have been already well studied both from a logical and an algebraic perspective. In this paper, we consider the variety ISt of the expansions of implicative semi-lattices with tense modal operators, which are algebraic models of the disjunction-free fragment of intuitionistic tense logic. Using methods from algebraic proof theory, we show that the logic of tense implicative semi-lattices has the finite model property. Combining with the finite axiomatizability of the logic, it follows that the logic is decidable.
【Keywords】: Knowledge Representation and Reasoning: Qualitative, Geometric, Spatial, Temporal Reasoning; Knowledge Representation and Reasoning: Non-monotonic Reasoning, Common-Sense Reasoning; Knowledge Representation and Reasoning: Other;
【Paper Link】 【Pages】:1805-1812
【Authors】: Zhaoshuai Liu ; Liping Xiong ; Yongmei Liu ; Yves Lespérance ; Ronghai Xu ; Hongyi Shi
【Abstract】: Representing and reasoning about strategic abilities is an active research area in AI and multi-agent systems. Many variations and extensions of alternating-time temporal logic ATL have been proposed. However, most of the logical frameworks ignore the issue of coordination within a coalition, and are unable to specify the internal structure of strategies. In this paper, we propose JAADL, a modal logic for joint abilities under strategy commitments, which is an extension of ATL. Firstly, we introduce an operator of elimination of (strictly) dominated strategies, with which we can represent joint abilities of coalitions. Secondly, our logic is based on linear dynamic logic (LDL), an extension of linear temporal logic (LTL), so that we can use regular expressions to represent commitments to structured strategies. We analyze valid formulas in JAADL, give sufficient/necessary conditions for joint abilities, and show that model checking memoryless JAADL is in EXPTIME.
【Keywords】: Knowledge Representation and Reasoning: Logics for Knowledge Representation;
【Paper Link】 【Pages】:1813-1819
【Authors】: Michael Sioutis ; Zhiguo Long ; Tomi Janhunen
【Abstract】: We introduce and study a notion of robustness in Qualitative Constraint Networks (QCNs), which are typically used to represent and reason about abstract spatial and temporal information. In particular, given a QCN, we are interested in obtaining a robust qualitative solution, or robust scenario: a satisfiable scenario with a higher perturbation tolerance than any other, i.e., one with the best chance of remaining valid after it is altered. This challenging problem requires considering the entire set of satisfiable scenarios of a QCN, whose size is usually exponential in the number of constraints; nevertheless, we present a first algorithm that computes a robust scenario of a QCN using space linear in the number of constraints. Preliminary results on a dataset from the job-shop scheduling domain, and on a standard one, show the interest of our approach and highlight the fact that not all solutions are created equal.
【Keywords】: Knowledge Representation and Reasoning: Qualitative, Geometric, Spatial, Temporal Reasoning;
【Paper Link】 【Pages】:1820-1826
【Authors】: Özgür Lütfü Özçep ; Mena Leemhuis ; Diedrich Wolter
【Abstract】: This paper presents an embedding of ontologies expressed in the ALC description logic into a real-valued vector space, comprising restricted existential and universal quantifiers, as well as concept negation and concept disjunction. Our main result states that an ALC ontology is satisfiable in the classical sense iff it is satisfiable by a partial faithful geometric model based on cones. The line of work to which we contribute aims to integrate knowledge representation techniques and machine learning. The new cone-model of ALC proposed in this work gives rise to conic optimization techniques for machine learning, extending previous approaches by its ability to model full ALC.
【Keywords】: Knowledge Representation and Reasoning: Other; Machine Learning: Knowledge-based Learning; Knowledge Representation and Reasoning: Description Logics and Ontologies; Knowledge Representation and Reasoning: Qualitative, Geometric, Spatial, Temporal Reasoning;
【Paper Link】 【Pages】:1827-1833
【Authors】: Anneke Haga ; Carsten Lutz ; Johannes Marti ; Frank Wolter
【Abstract】: We study complete approximations of an ontology formulated in a non-Horn description logic (DL) such as ALC in a Horn DL such as EL. We provide concrete approximation schemes that are necessarily infinite and observe that in the ELU-to-EL case finite approximations tend to exist in practice and are guaranteed to exist when the source ontology is acyclic. In contrast, neither of these is the case for ELU_bot-to-EL_bot and for ALC-to-EL_bot approximations. We also define a notion of approximation tailored towards ontology-mediated querying, connect it to subsumption-based approximations, and identify a case where finite approximations are guaranteed to exist.
【Keywords】: Knowledge Representation and Reasoning: Description Logics and Ontologies;
【Paper Link】 【Pages】:1834-1840
【Authors】: Alexis de Colnet ; Stefan Mengel
【Abstract】: Knowledge compilation studies the trade-off between succinctness and efficiency of different representation languages. For many languages, there are known strong lower bounds on the representation size, but recent work shows that, for some languages, one can bypass these bounds using approximate compilation. The idea is to compile an approximation of the knowledge for which the number of errors can be controlled. We focus on circuits in deterministic decomposable negation normal form (d-DNNF), a compilation language suitable in contexts such as probabilistic reasoning, as it supports efficient model counting and probabilistic inference. Moreover, there are known size lower bounds for d-DNNF which by relaxing to approximation one might be able to avoid. In this paper we formalize two notions of approximation: weak approximation which has been studied before in the decision diagram literature and strong approximation which has been used in recent algorithmic results. We then show lower bounds for approximation by d-DNNF, complementing the positive results from the literature.
【Keywords】: Knowledge Representation and Reasoning: Automated Reasoning; Tractable Languages and Knowledge compilation;
【Paper Link】 【Pages】:1841-1847
【Authors】: Marcello D'Agostino ; Sanjay Modgil
【Abstract】: ASPIC+ is an established general framework for argumentation and non-monotonic reasoning. However, ASPIC+ does not satisfy the non-contamination rationality postulates, and moreover, tacitly assumes unbounded resources when demonstrating satisfaction of the consistency postulates. In this paper we present a new version of ASPIC+ – Dialectical ASPIC+ – that is fully rational under resource bounds.
【Keywords】: Knowledge Representation and Reasoning: Non-monotonic Reasoning, Common-Sense Reasoning; Agent-based and Multi-agent Systems: Agreement Technologies: Argumentation;
【Paper Link】 【Pages】:1848-1854
【Authors】: Pierre-Alexandre Murena ; Marie Al-Ghossein ; Jean-Louis Dessalles ; Antoine Cornuéjols
【Abstract】: Analogies are 4-ary relations of the form "A is to B as C is to D". When A, B and C are fixed, we call the problem of finding the correct D an analogical equation. A direct applicative domain is Natural Language Processing, in which this approach has proven successful on word inflections, such as conjugation or declension. While most approaches rely on the axioms of proportional analogy to solve these equations, these axioms are known to have limitations, in particular regarding the nature of the inflections considered. In this paper, we propose an alternative approach, based on the assumption that optimal word inflections are transformations of minimal complexity. We propose a rough estimation of complexity for word analogies and an algorithm to find the optimal transformations. We illustrate our method on a large-scale benchmark dataset and compare with state-of-the-art approaches to demonstrate the interest of using complexity to solve analogies on words.
【Keywords】: Knowledge Representation and Reasoning: Case-based Reasoning; Natural Language Processing: Phonology, Morphology, and word segmentation; Natural Language Processing: Natural Language Processing;
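To fix ideas on what an analogical equation over words looks like, here is a toy baseline in the classic prefix/suffix spirit of proportional analogy. It illustrates the problem setting only, not the paper's complexity-based method; the function name and the simple suffix-substitution rule are assumptions of this sketch.

```python
# Toy solver for analogical equations on words: A : B :: C : ?
# Assumption (illustrative, not the paper's method): the change from A to B
# is a suffix substitution after their longest common prefix, and the same
# substitution applies to C.

def solve_analogy(a, b, c):
    """Return D such that a : b :: c : D, or None if the rule doesn't apply."""
    # Longest common prefix of a and b.
    i = 0
    while i < min(len(a), len(b)) and a[i] == b[i]:
        i += 1
    suffix_a, suffix_b = a[i:], b[i:]   # what changes between a and b
    if not c.endswith(suffix_a):        # rule inapplicable to c
        return None
    return c[:len(c) - len(suffix_a)] + suffix_b
```

For regular inflection ("walk : walked :: jump : jumped") and even some vowel alternations ("sing : sang :: ring : rang") this suffices; the limitations of such axiom-based rules are exactly what motivates the complexity-based alternative above.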
【Paper Link】 【Pages】:1855-1861
【Authors】: Andreas Niskanen ; Daniel Neugebauer ; Matti Järvisalo
【Abstract】: Control argumentation frameworks (CAFs) allow for modeling uncertainties inherent in various argumentative settings. We establish a complete computational complexity map of the central computational problem of controllability in CAFs for five key semantics. We also develop Boolean satisfiability based counterexample-guided abstraction refinement algorithms and direct encodings of controllability as quantified Boolean formulas, and empirically evaluate their scalability on a range of NP-hard variants of controllability.
【Keywords】: Knowledge Representation and Reasoning: Computational Models of Argument; Knowledge Representation and Reasoning: Computational Complexity of Reasoning; Constraints and SAT: SAT: Solvers and Applications;
【Paper Link】 【Pages】:1862-1869
【Authors】: Camille Bourgaux ; Ana Ozaki ; Rafael Peñaloza ; Livia Predoiu
【Abstract】: We address the problem of handling provenance information in ELHr ontologies. We consider a setting recently introduced for ontology-based data access, based on semirings and extending classical data provenance, in which ontology axioms are annotated with provenance tokens. A consequence inherits the provenance of the axioms involved in deriving it, yielding a provenance polynomial as an annotation. We analyse the semantics for the ELHr case and show that the presence of conjunctions poses various difficulties for handling provenance, some of which are mitigated by assuming multiplicative idempotency of the semiring. Under this assumption, we study three problems: ontology completion with provenance, computing the set of relevant axioms for a consequence, and query answering.
【Keywords】: Knowledge Representation and Reasoning: Description Logics and Ontologies; Knowledge Representation and Reasoning: Logics for Knowledge Representation; Knowledge Representation and Reasoning: Computational Complexity of Reasoning;
【Paper Link】 【Pages】:1870-1876
【Authors】: Cosimo Persia ; Ana Ozaki
【Abstract】: We investigate learnability of possibilistic theories from entailments in light of Angluin’s exact learning model. We consider cases in which only membership, only equivalence, and both kinds of queries can be posed by the learner. We then show that, for a large class of problems, polynomial time learnability results for classical logic can be transferred to the respective possibilistic extension. In particular, it follows from our results that the possibilistic extension of propositional Horn theories is exactly learnable in polynomial time. As polynomial time learnability in the exact model is transferable to the classical probably approximately correct (PAC) model extended with membership queries, our work also establishes such results in this model.
【Keywords】: Knowledge Representation and Reasoning: Logics for Knowledge Representation; Machine Learning: Learning Theory; Uncertainty in AI: Uncertainty Representations;
【Paper Link】 【Pages】:1877-1883
【Authors】: Bastien Maubert ; Sophie Pinchinat ; François Schwarzentruber ; Silvia Stranieri
【Abstract】: Action models of Dynamic Epistemic Logic (DEL) represent precisely how actions are perceived by agents. DEL has recently been used to define infinite multi-player games, and it was shown that they can be solved in some cases. However, since the dynamics are defined by the classic DEL update product for individual actions, only turn-based games have been considered so far. In this work we define a concurrent DEL product, propose a mechanism to resolve conflicts between actions, and define concurrent DEL games. As in the turn-based case, the obtained concurrent infinite game arenas can be finitely represented when all actions are public, or all are propositional. We thus identify cases where the strategic epistemic logic ATL*K can be model checked on such games.
【Keywords】: Knowledge Representation and Reasoning: Reasoning about Knowledge and Belief; Planning and Scheduling: Theoretical Foundations of Planning; Agent-based and Multi-agent Systems: Formal Verification, Validation and Synthesis;
【Paper Link】 【Pages】:1884-1890
【Authors】: Stéphanie Roussel ; Xavier Pucel ; Valentin Bouziat ; Louise Travé-Massuyès
【Abstract】: State tracking, i.e., estimating the state of a system over time, is a key problem in autonomous dynamic systems. Run-time requirements advocate for incremental estimation, and memory limitations lead us to consider an estimation strategy that retains only one state out of the set of candidate estimates at each time step. This avoids the ambiguity of a high number of candidate estimates and feeds the decision system with a clear input. However, this strategy may lead to dead-ends later in the execution. In this paper, we show that single-state trackability can be expressed in terms of the simulation relation between automata. This allows us to provide a complexity bound and a way to build estimators endowed with this property that are, moreover, customizable along some correctness criteria. Our implementation relies on the Satisfiability Modulo Theories solver MonoSAT, and experiments show that our encoding scales up and applies to real-world scenarios.
【Keywords】: Knowledge Representation and Reasoning: Diagnosis and Abductive Reasoning; Constraints and SAT: Satisfiability Modulo Theories;
【Paper Link】 【Pages】:1891-1897
【Authors】: Yakoub Salhi
【Abstract】: Formal logic can be used as a tool for representing complex and heterogeneous data such as beliefs, knowledge and preferences. This study proposes an approach for defining clustering methods that deal with bases of propositional formulas in classical logic, i.e., methods for dividing formula bases into meaningful groups. We first use a postulate-based approach for introducing an intuitive framework for formula clustering. Then, in order to characterize interesting clustering forms, we introduce additional properties that take into consideration different notions, such as logical consequence, overlapping, and consistent partition. Finally, we describe our approach that shows how inconsistency measures can be involved in improving the task of formula clustering. The main idea consists in using the measures for quantifying the quality of the inconsistent clusters. In this context, we propose further properties that allow characterizing interesting aspects related to the amount of inconsistency.
【Keywords】: Knowledge Representation and Reasoning: Logics for Knowledge Representation; Knowledge Representation and Reasoning: Reasoning about Knowledge and Belief; Machine Learning: Clustering;
【Paper Link】 【Pages】:1898-1904
【Authors】: Robert Ganian ; André Schidler ; Manuel Sorge ; Stefan Szeider
【Abstract】: Treewidth and hypertree width have proven to be highly successful structural parameters in the context of the Constraint Satisfaction Problem (CSP). When either of these parameters is bounded by a constant, then CSP becomes solvable in polynomial time. However, here the order of the polynomial in the running time depends on the width, and this is known to be unavoidable; therefore, the problem is not fixed-parameter tractable parameterized by either of these width measures. Here we introduce an enhancement of tree and hypertree width through a novel notion of thresholds, allowing the associated decompositions to take into account information about the computational costs associated with solving the given CSP instance. Aside from introducing these notions, we obtain efficient theoretical as well as empirical algorithms for computing threshold treewidth and hypertree width and show that these parameters give rise to fixed-parameter algorithms for CSP as well as other, more general problems. We complement our theoretical results with experimental evaluations in terms of heuristics as well as exact methods based on SAT/SMT encodings.
【Keywords】: Knowledge Representation and Reasoning: Computational Complexity of Reasoning; Constraints and SAT: Constraint Satisfaction; Constraints and SAT: Satisfiability Modulo Theories;
【Paper Link】 【Pages】:1905-1911
【Authors】: Pengyu Zhao ; Tianxiao Shui ; Yuanxing Zhang ; Kecheng Xiao ; Kaigui Bian
【Abstract】: Recently, sequential recommendation has become an important requirement in many real-world applications, where the recommended items are displayed to users one after another and the order of the displays influences the satisfaction of users. A large number of models have been developed for sequential recommendation that recommend the next items with the highest scores based on user histories, while few efforts have been made to identify the transition dependency and behavior continuity in the recommended sequences. In this paper, we introduce Adversarial Oracular Seq2seq learning for sequential Recommendation (AOS4Rec), which formulates sequential recommendation as a seq2seq learning problem to portray time-varying interactions in the recommendation, and exploits oracular learning and adversarial learning to enhance recommendation quality. We examine the performance of AOS4Rec over RNN-based and Transformer-based recommender systems on two large datasets from real-world applications and make comparisons with state-of-the-art methods. Results indicate the accuracy and efficiency of AOS4Rec, and further analysis verifies that AOS4Rec has both robustness and practicability for real-world scenarios.
【Keywords】: Knowledge Representation and Reasoning: Preference Modelling and Preference-Based Reasoning; Machine Learning: Recommender Systems; Multidisciplinary Topics and Applications: Recommender Systems;
【Paper Link】 【Pages】:1912-1918
【Authors】: Dragan Doder ; Srdjan Vesic ; Madalina Croitoru
【Abstract】: Bipolar argumentation studies argumentation graphs where attacks are combined with another relation between arguments. Many kinds of relations (e.g., deductive support, evidential support, necessities) have been defined and investigated from a Dung semantics perspective. We place ourselves in the context of argumentation systems with necessities and provide the first study of ranking semantics in this setting. To this end, we (1) provide a set of postulates specifically designed for necessities and (2) propose the first ranking-based semantics in the literature shown to respect these postulates.
【Keywords】: Knowledge Representation and Reasoning: Computational Models of Argument;
【Paper Link】 【Pages】:1919-1925
【Authors】: Przemyslaw Andrzej Walega ; Bernardo Cuenca Grau ; Mark Kaminski ; Egor V. Kostylev
【Abstract】: We study the data complexity of reasoning for several fragments of MTL - an extension of Datalog with metric temporal operators over the rational numbers. Reasoning in the full MTL language is PSPACE-complete, which handicaps its application in practice. To achieve tractability we first study the core fragment, which disallows conjunction in rule bodies, and show that reasoning remains PSPACE-hard. Intractability prompts us to also limit the kinds of temporal operators allowed in rules, and we propose a practical core fragment for which reasoning becomes TC0-complete. Finally, we show that this fragment can be extended by allowing linear conjunctions in rule bodies, where at most one atom can be intensional (IDB); we show that the resulting fragment is NL-complete, and hence no harder than plain linear Datalog.
【Keywords】: Knowledge Representation and Reasoning: Qualitative, Geometric, Spatial, Temporal Reasoning; Knowledge Representation and Reasoning: Computational Complexity of Reasoning; Knowledge Representation and Reasoning: Logics for Knowledge Representation;
【Paper Link】 【Pages】:1926-1932
【Authors】: Guojia Wan ; Shirui Pan ; Chen Gong ; Chuan Zhou ; Gholamreza Haffari
【Abstract】: Knowledge Graphs typically suffer from incompleteness. A popular approach to knowledge graph completion is to infer missing knowledge by multi-hop reasoning over the information found along other paths connecting a pair of entities. However, multi-hop reasoning remains challenging because the reasoning process often runs into the multiple-semantics issue, where a relation or an entity has several meanings. To deal with this situation, we propose a novel Hierarchical Reinforcement Learning framework to learn chains of reasoning from a Knowledge Graph automatically. Our framework is inspired by the hierarchical way in which humans handle cognitively ambiguous cases. The whole reasoning process is decomposed into a hierarchy of two-level Reinforcement Learning policies for encoding historical information and learning a structured action space. As a consequence, dealing with the multiple-semantics issue becomes more feasible and natural. Experimental results show that our proposed model achieves substantial improvements on ambiguous relation tasks.
【Keywords】: Knowledge Representation and Reasoning: Reasoning about Knowledge and Belief; Data Mining: Mining Graphs, Semi Structured Data, Complex Data; Machine Learning Applications: Applications of Reinforcement Learning;
【Paper Link】 【Pages】:1933-1939
【Authors】: Zhe Wang ; Peng Xiao ; Kewen Wang ; Zhiqiang Zhuang ; Hai Wan
【Abstract】: Existential rules are an expressive ontology formalism for ontology-mediated query answering; query answering over them is therefore of high complexity, although several tractable fragments have been identified. Existing systems based on first-order rewriting methods can produce queries too large for a DBMS to handle. It has been shown that datalog rewriting can result in more compact queries, yet previously proposed datalog rewriting methods are mostly inefficient in practice. In this paper, we fill the gap by proposing an efficient datalog rewriting approach for answering conjunctive queries over existential rules, and we identify and combine existing fragments of existential rules for which our rewriting method terminates. We implemented a prototype system, Drewer, and experiments show that it is able to handle a wide range of benchmarks from the literature. Moreover, Drewer shows superior or comparable performance over state-of-the-art systems on both the compactness of rewriting and the efficiency of query answering.
【Keywords】: Knowledge Representation and Reasoning: Description Logics and Ontologies;
【Paper Link】 【Pages】:1940-1946
【Authors】: Heng Zhang ; Yan Zhang ; Guifei Jiang
【Abstract】: Existential rules, known as dependencies in databases and more recently as Datalog+/- in knowledge representation and reasoning, are a family of important logical languages widely used in computer science and artificial intelligence. Towards a deep understanding of these languages in model theory, we establish model-theoretic characterizations for a number of existential rule languages, including (disjunctive) embedded dependencies, tuple-generating dependencies (TGDs), (frontier-)guarded TGDs and linear TGDs. All these characterizations hold for the class of arbitrary structures, and most of them also hold for the class of finite structures. As a natural application of these results, complexity bounds for the rewritability of the above languages are also identified.
【Keywords】: Knowledge Representation and Reasoning: Description Logics and Ontologies; Knowledge Representation and Reasoning: Knowledge Representation Languages; Knowledge Representation and Reasoning: Logics for Knowledge Representation;
【Paper Link】 【Pages】:1948-1954
【Authors】: Eden Abadi ; Ronen I. Brafman
【Abstract】: Regular Decision Processes (RDPs) are a recently introduced model that extends MDPs with non-Markovian dynamics and rewards. The non-Markovian behavior is restricted to depend on regular properties of the history. These can be specified using regular expressions or formulas in linear dynamic logic over finite traces. Fully specified RDPs can be solved by compiling them into an appropriate MDP. Learning RDPs from data is a challenging problem that has yet to be addressed, on which we focus in this paper. Our approach rests on a new representation for RDPs using Mealy Machines that emit a distribution and an expected reward for each state-action pair. Building on this representation, we combine automata learning techniques with history clustering to learn such a Mealy machine and solve it by adapting MCTS to it. We empirically evaluate this approach, demonstrating its feasibility.
【Keywords】: Machine Learning: Reinforcement Learning; Planning and Scheduling: Markov Decisions Processes; Knowledge Representation and Reasoning: Action, Change and Causality;
【Paper Link】 【Pages】:1955-1962
【Authors】: Yusuke Iwasawa ; Kei Akuzawa ; Yutaka Matsuo
【Abstract】: Adversarial invariance induction (AII) is a generic and powerful framework for enforcing invariance to nuisance attributes in neural network representations. However, its optimization is often unstable and little is known about its practical behavior. This paper presents an analysis of the reasons for the optimization difficulties and provides a better optimization procedure by rethinking AII from a divergence minimization perspective. Interestingly, this perspective reveals a cause of the optimization difficulties: AII does not ensure proper divergence minimization, which is a requirement of the invariant representations. We then propose a simple variant of AII, called invariance induction by discriminator matching, which takes into account the divergence minimization interpretation of the invariant representations. Our method consistently achieves near-optimal invariance on toy datasets with various configurations in which the original AII is catastrophically unstable. Extensive experiments on four real-world datasets also support the superior performance of the proposed method, leading to improved user anonymization and domain generalization.
【Keywords】: Machine Learning: Deep Learning; Machine Learning: Transfer, Adaptation, Multi-task Learning;
【Paper Link】 【Pages】:1963-1969
【Authors】: Raman Sankaran ; Francis Bach ; Chiranjib Bhattacharyya
【Abstract】: Subquadratic norms have been studied recently in the context of structured sparsity, which has been shown to be more beneficial than conventional regularizers in applications such as image denoising, compressed sensing, and banded covariance estimation. While existing works have been successful in learning structured sparse models such as trees and graphs, their associated optimization procedures have been inefficient because of the hard-to-evaluate proximal operators of the norms. In this paper, we study the computational aspects of learning with subquadratic norms in a general setup. Our main contributions are two proximal-operator-based algorithms, ADMM-η and CP-η, which generically apply to these learning problems with convex loss functions and achieve a proven convergence rate of O(1/T) after T iterations. These algorithms are derived in a primal-dual framework, which has not previously been examined for subquadratic norms. We illustrate the efficiency of the developed algorithms in the context of tree-structured sparsity, where they comprehensively outperform relevant baselines.
【Keywords】: Machine Learning: Feature Selection; Learning Sparse Models; Data Mining: Feature Extraction, Selection and Dimensionality Reduction;
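The abstract's complaint is that the proximal operators of subquadratic norms are hard to evaluate. For contrast, the proximal operator of the plain l1 norm has a well-known closed form (soft-thresholding), sketched below as a reference point; this is our illustration and is unrelated to the ADMM-η/CP-η algorithms themselves:

```python
import numpy as np

def prox_l1(v, lam):
    """Proximal operator of lam * ||.||_1, i.e. the soft-thresholding map
    prox(v) = argmin_x 0.5 * ||x - v||^2 + lam * ||x||_1,
    which separates coordinate-wise into sign(v_i) * max(|v_i| - lam, 0).
    """
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)
```

Subquadratic norms generally admit no comparably simple closed form, which is the inefficiency the paper targets.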
【Paper Link】 【Pages】:1970-1976
【Authors】: Ege Beyazit ; Doruk Tuncel ; Xu Yuan ; Nian-Feng Tzeng ; Xindong Wu
【Abstract】: Learning interpretable representations in an unsupervised setting is an important yet challenging task. Existing unsupervised interpretable methods focus on extracting independent salient features from data. However, they miss the fact that the entanglement of salient features may itself be informative. Acknowledging these entanglements can improve interpretability, resulting in the extraction of higher-quality and a wider variety of salient features. In this paper, we propose a new method that enables Generative Adversarial Networks (GANs) to discover salient features that may be entangled in an informative manner, instead of extracting only disentangled features. Specifically, we propose a regularizer that penalizes, during training, the disagreement between the extracted feature interactions and a given dependency structure. We model these interactions with a Bayesian network, estimate the maximum likelihood parameters, and calculate a negative likelihood score to measure the disagreement. Evaluating the proposed method qualitatively and quantitatively on both synthetic and real-world datasets, we show that our regularizer guides GANs to learn representations with disentanglement scores competitive with the state of the art, while extracting a wider variety of salient features.
【Keywords】: Machine Learning: Interpretability; Machine Learning: Deep Generative Models; Machine Learning: Deep Learning;
【Paper Link】 【Pages】:1977-1983
【Authors】: Debarun Bhattacharjya ; Tian Gao ; Dharmashankar Subramanian
【Abstract】: In multivariate event data, the instantaneous rate of an event's occurrence may be sensitive to the temporal sequence in which other influencing events have occurred in the history. For example, an agent’s actions are typically driven by preceding actions taken by the agent as well as those of other relevant agents in some order. We introduce a novel statistical/causal model for capturing such an order-sensitive historical dependence, where an event’s arrival rate is determined by the order in which its underlying causal events have occurred in the recent past. We propose an algorithm to discover these causal events and learn the most influential orders using time-stamped event occurrence data. We show that the proposed model fits various event datasets involving single as well as multiple agents better than baseline models. We also illustrate potentially useful insights from our proposed model for an analyst during the discovery process through analysis on a real-world political event dataset.
【Keywords】: Machine Learning: Learning Graphical Models; Data Mining: Mining Spatial, Temporal Data; Agent-based and Multi-agent Systems: Multi-agent Learning;
【Paper Link】 【Pages】:1984-1991
【Authors】: Roman Bresson ; Johanne Cohen ; Eyke Hüllermeier ; Christophe Labreuche ; Michèle Sebag
【Abstract】: Multi-Criteria Decision Making (MCDM) aims at modelling expert preferences and assisting decision makers in identifying options that best accommodate expert criteria. One instance of an MCDM model, the Choquet integral, is widely used in real-world applications due to its ability to capture interactions between criteria while retaining interpretability. Aimed at better scalability and modularity, hierarchical Choquet integrals involve intermediate aggregations of the interacting criteria, at the cost of a more complex elicitation. This paper presents a machine learning-based approach for the automatic identification of hierarchical MCDM models, composed of 2-additive Choquet integral aggregators and of marginal utility functions on the raw features, from data reflecting expert preferences. The proposed NEUR-HCI framework relies on a specific neural architecture that enforces the Choquet model constraints by design and supports end-to-end training. The empirical validation of NEUR-HCI on real-world and artificial benchmarks demonstrates the merits of the approach compared to state-of-the-art baselines.
【Keywords】: Machine Learning: Learning Preferences or Rankings; Knowledge Representation and Reasoning: Preference Modelling and Preference-Based Reasoning; Knowledge Representation and Reasoning: Utility Theory; Machine Learning: Knowledge-based Learning;
【Paper Link】 【Pages】:1992-1998
【Authors】: Ling Pan ; Qingpeng Cai ; Qi Meng ; Wei Chen ; Longbo Huang
【Abstract】: Value function estimation is an important task in reinforcement learning, i.e., prediction. The Boltzmann softmax operator is a natural value estimator and can provide several benefits. However, it does not satisfy the non-expansion property, and its direct use may fail to converge even in value iteration. In this paper, we propose to update the value function with the dynamic Boltzmann softmax (DBS) operator, which has good convergence properties in the settings of planning and learning. Experimental results on GridWorld show that the DBS operator enables better estimation of the value function and rectifies the convergence issue of the softmax operator. Finally, we propose the DBS-DQN algorithm by applying the DBS operator, which outperforms DQN substantially in 40 out of 49 Atari games.
【Keywords】: Machine Learning: Deep Reinforcement Learning; Machine Learning: Reinforcement Learning;
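The abstract presupposes the (static) Boltzmann softmax operator; as a point of reference, a minimal Python sketch of that operator follows. It is our illustration, not the authors' implementation, and the dynamic β schedule of DBS is not reproduced here:

```python
import numpy as np

def boltzmann_softmax(q_values, beta):
    """Boltzmann softmax value estimator: a softmax(beta * Q)-weighted
    average of the action values. Unlike the max operator it is smooth,
    but for a fixed beta it violates the non-expansion property, which is
    the convergence issue the DBS operator addresses by growing beta.
    """
    z = beta * (q_values - np.max(q_values))  # shift for numerical stability
    w = np.exp(z)
    w /= w.sum()
    return float(np.dot(w, q_values))
```

At beta = 0 the operator reduces to the mean of the action values; as beta grows it approaches the max operator of standard value iteration.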
【Paper Link】 【Pages】:1999-2005
【Authors】: Yifan Hao ; Huiping Cao
【Abstract】: Classifying multivariate time series (MTS), which record the values of multiple variables over a continuous period of time, has gained a lot of attention. However, existing techniques suffer from two major issues. First, the long-range dependencies of the time-series sequences are not well captured. Second, the interactions of multiple variables are generally not represented in features. To address these aforementioned issues, we propose a novel Cross Attention Stabilized Fully Convolutional Neural Network (CA-SFCN) to classify MTS data. First, we introduce a temporal attention mechanism to extract long- and short-term memories across all time steps. Second, variable attention is designed to select relevant variables at each time step. CA-SFCN is compared with 16 approaches using 14 different MTS datasets. The extensive experimental results show that the CA-SFCN outperforms state-of-the-art classification methods, and the cross attention mechanism achieves better performance than other attention mechanisms.
【Keywords】: Machine Learning: Classification; Machine Learning: Deep Learning: Convolutional networks; Data Mining: Classification, Semi-Supervised Learning;
【Paper Link】 【Pages】:2006-2013
【Authors】: Biagio La Rosa ; Roberto Capobianco ; Daniele Nardi
【Abstract】: In this paper we present a novel mechanism to produce explanations that allow a better understanding of network predictions when dealing with sequential data. Specifically, we adopt memory-based networks — Differential Neural Computers — to exploit their capability of storing data in memory and reusing it for inference. By tracking both the memory access at prediction time and the information stored by the network at each step of the input sequence, we can retrieve the input steps most relevant to each prediction. We validate our approach (1) on a modified T-maze, a non-Markovian discrete control task that evaluates an algorithm's ability to correlate events far apart in history, and (2) on the Story Cloze Test, a commonsense reasoning framework for evaluating story understanding that requires a system to choose the correct ending to a four-sentence story. Our results show that we are able to explain the agent's decisions in (1) and to reconstruct the sentences most relevant to the network's selection of the story ending in (2). Additionally, we show not only that removing those sentences changes the network prediction, but also that they alone are sufficient to reproduce the inference.
【Keywords】: Machine Learning: Explainable Machine Learning; Machine Learning: Interpretability; Machine Learning: Deep Learning: Sequence Modeling; Machine Learning: Deep Learning;
【Paper Link】 【Pages】:2014-2021
【Authors】: Shizhen Chang ; Bo Du ; Liangpei Zhang
【Abstract】: Positive-unlabeled (PU) learning aims to train a binary classifier from a set of positively labeled samples and other unlabeled samples. Much research has been done on this special branch of weakly supervised classification problems. Since only part of the positive class is labeled, the classical PU model trains the classifier assuming the class prior is known. However, the true class prior is usually difficult to obtain and must be learned from the given data, where traditional methods may not work. In this paper, we propose a convex formulation that jointly solves the unknown-class-prior problem and trains an accurate classifier, with no need for class-prior assumptions or additional negative samples. The class prior is estimated by pursuing the optimal solution of gradient thresholding, and the classifier is simultaneously trained by minimizing an unbiased empirical risk. The detailed derivation and theoretical analysis of the proposed model are outlined, and experimental comparisons with other representative methods demonstrate the superiority of our method.
【Keywords】: Machine Learning: Semi-Supervised Learning; Data Mining: Classification, Semi-Supervised Learning;
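As context for the unbiased-risk idea in the abstract, here is an illustrative sketch (ours, not the paper's joint convex formulation) of the standard non-negative PU risk estimator, which assumes the class prior is already known, precisely the assumption this paper removes:

```python
import numpy as np

def nn_pu_risk(scores_pos, scores_unl, prior,
               loss=lambda m: np.maximum(0.0, 1.0 - m)):
    """Non-negative PU risk estimate with a *known* class prior.

    scores_* are real-valued classifier outputs (positive margin means
    the positive class); `loss` is a margin loss, hinge by default.
    """
    r_pos = loss(scores_pos).mean()       # positives classified as positive
    r_pos_neg = loss(-scores_pos).mean()  # positives classified as negative
    r_unl_neg = loss(-scores_unl).mean()  # unlabeled classified as negative
    # the max(0, .) clamp keeps the negative-class risk estimate non-negative
    return prior * r_pos + max(0.0, r_unl_neg - prior * r_pos_neg)
```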
【Paper Link】 【Pages】:2022-2028
【Authors】: Nirbhay Modhe ; Prithvijit Chattopadhyay ; Mohit Sharma ; Abhishek Das ; Devi Parikh ; Dhruv Batra ; Ramakrishna Vedantam
【Abstract】: We propose a novel framework to identify sub-goals useful for exploration in sequential decision making tasks under partial observability. We utilize the variational intrinsic control framework (Gregor et al., 2016), which maximizes empowerment -- the ability to reliably reach a diverse set of states -- and show how to identify sub-goals as states with high necessary option information through an information-theoretic regularizer. Despite being discovered without explicit goal supervision, our sub-goals provide better exploration and sample complexity on challenging grid-world navigation tasks compared to supervised counterparts in prior work.
【Keywords】: Machine Learning: Deep Reinforcement Learning; Machine Learning: Reinforcement Learning; Machine Learning: Unsupervised Learning;
【Paper Link】 【Pages】:2029-2036
【Authors】: Wenchao Chen ; Bo Chen ; Yicheng Liu ; Qianru Zhao ; Mingyuan Zhou
【Abstract】: We propose Switching Poisson gamma dynamical systems (SPGDS) to model sequentially observed multivariate count data. Different from previous models, SPGDS assigns its latent variables to a mixture of gamma-distributed parameters to model complex sequences and describe nonlinear dynamics, while capturing various temporal dependencies. For efficient inference, we develop a hybrid of stochastic gradient MCMC and switching recurrent autoencoding variational inference, which scales to large sequences and is fast in out-of-sample prediction. Experiments on both unsupervised and supervised tasks demonstrate that the proposed model not only has excellent fitting and prediction performance on complex dynamic sequences, but also separates the different dynamical patterns within them.
【Keywords】: Machine Learning: Probabilistic Machine Learning; Machine Learning: Learning Generative Models; Machine Learning: Deep Learning: Sequence Modeling;
【Paper Link】 【Pages】:2037-2043
【Authors】: Gehui Shen ; Xi Chen ; Zhi-Hong Deng
【Abstract】: Bayesian neural networks (BNNs) have received increasing attention because they can model epistemic uncertainty, which is hard for conventional neural networks. Markov chain Monte Carlo (MCMC) methods and variational inference (VI) are the two mainstream approaches to Bayesian deep learning. The former is effective, but its storage cost is prohibitive since it has to save many samples of the neural network parameters. The latter is more time- and space-efficient, but the approximate variational posterior limits its performance. In this paper, we aim to combine the advantages of the two methods by distilling MCMC samples into an approximate variational posterior. On the basis of an existing distillation technique, we first propose the variational Bayesian dark knowledge method. We further propose Bayesian dark prior knowledge, a novel distillation method that treats the MCMC posterior as the prior of a variational BNN. Both proposed methods not only reduce the space overhead of the teacher model, making them scalable, but also maintain a distilled posterior distribution capable of modeling epistemic uncertainty. Experimental results show that our methods outperform the existing distillation method in terms of predictive accuracy and uncertainty modeling.
【Keywords】: Machine Learning: Probabilistic Machine Learning; Uncertainty in AI: Approximate Probabilistic Inference; Machine Learning: Deep Learning;
【Paper Link】 【Pages】:2044-2050
【Authors】: Xiaozhou Wang ; Xi Chen ; Qihang Lin ; Weidong Liu
【Abstract】: The performance of clustering depends on an appropriately defined similarity between two items. When similarity is measured based on human perception, human workers are often employed to estimate similarity scores between items in order to support clustering, a procedure called crowdsourced clustering. Assuming a monetary reward is paid to a worker for each similarity score, and that both the pairwise similarities and the workers' reliabilities vary widely, it is critical under a limited budget to wisely assign pairs of items to different workers so as to optimize the clustering result. We model this budget allocation problem as a Markov decision process in which item pairs are dynamically assigned to workers based on the similarity scores they have provided so far. We propose an optimistic knowledge gradient policy in which the assignment of items at each stage is based on the minimum-weight K-cut of a similarity graph. We provide simulation studies and real data analysis to demonstrate the performance of the proposed method.
【Keywords】: Machine Learning: Unsupervised Learning; Machine Learning Applications: Applications of Unsupervised Learning;
【Paper Link】 【Pages】:2051-2057
【Authors】: Niklas Åkerblom ; Yuxin Chen ; Morteza Haghir Chehreghani
【Abstract】: Energy-efficient navigation constitutes an important challenge for electric vehicles due to their limited battery capacity. We employ a Bayesian approach to model the energy consumption of road segments for efficient navigation. To learn the model parameters, we develop an online learning framework and investigate several exploration strategies, such as Thompson Sampling and Upper Confidence Bound. We then extend our online learning framework to the multi-agent setting, where multiple vehicles adaptively navigate and learn the parameters of the energy model. We analyze Thompson Sampling and establish rigorous regret bounds on its performance. Finally, we demonstrate the performance of our methods via several real-world experiments on the Luxembourg SUMO Traffic dataset.
【Keywords】: Machine Learning: Online Learning; Multidisciplinary Topics and Applications: Transportation; Machine Learning Applications: Networks;
【Paper Link】 【Pages】:2058-2064
【Authors】: Ziye Chen ; Mingming Gong ; Lingjuan Ge ; Bo Du
【Abstract】: In this paper, we apply the self-attention (SA) mechanism to boost the performance of deep metric learning. However, due to the pairwise similarity measurement, the cost of storing and manipulating the complete attention maps is infeasible for large inputs. To solve this problem, we propose a compressed self-attention with low-rank approximation (CSALR) module, which significantly reduces the computation and memory costs without sacrificing accuracy. In CSALR, the original attention map is decomposed into a landmark attention map and a combination coefficient map, using a small number of landmark feature vectors sampled from the input feature map by average pooling. Thanks to the efficiency of CSALR, we can apply it to high-resolution shallow convolutional layers and implement a multi-head form of CSALR, which further boosts performance. We evaluate the proposed CSALR on person re-identification, a typical metric learning task. Extensive experiments show the effectiveness and efficiency of CSALR in deep metric learning and its superiority over the baselines.
【Keywords】: Machine Learning: Classification; Machine Learning: Deep Learning; Machine Learning: Deep Learning: Convolutional networks;
【Paper Link】 【Pages】:2065-2072
【Authors】: Hsuan-Kung Yang ; Po-Han Chiang ; Min-Fong Hong ; Chun-Yi Lee
【Abstract】: In this paper, we focus on a prediction-based novelty estimation strategy within the deep reinforcement learning (DRL) framework, and present a flow-based intrinsic curiosity module (FICM) that exploits the prediction errors of optical flow estimation as exploration bonuses. We propose the concept of leveraging motion features captured between consecutive observations to evaluate the novelty of observations in an environment. FICM encourages a DRL agent to explore observations with unfamiliar motion features, and requires only two consecutive frames to obtain sufficient information when estimating novelty. We evaluate our method and compare it with a number of existing methods on multiple benchmark environments, including Atari games, Super Mario Bros., and ViZDoom. We demonstrate that FICM is favorable for tasks or environments featuring moving objects, which allow FICM to utilize the motion features between consecutive observations. We further analyze the encoding efficiency of FICM ablatively and discuss its applicable domains comprehensively. See here for our code and demo videos.
【Keywords】: Machine Learning: Deep Reinforcement Learning; Machine Learning Applications: Game Playing; Machine Learning Applications: Applications of Reinforcement Learning;
【Paper Link】 【Pages】:2073-2079
【Authors】: Andrew Cropper ; Sebastijan Dumancic
【Abstract】: A major challenge in inductive logic programming (ILP) is learning large programs. We argue that a key limitation of existing systems is that they use entailment to guide the hypothesis search. This approach is limited because entailment is a binary decision: a hypothesis either entails an example or does not, and there is no intermediate position. To address this limitation, we go beyond entailment and use 'example-dependent' loss functions to guide the search, where a hypothesis can partially cover an example. We implement our idea in Brute, a new ILP system which uses best-first search, guided by an example-dependent loss function, to incrementally build programs. Our experiments on three diverse program synthesis domains (robot planning, string transformations, and ASCII art), show that Brute can substantially outperform existing ILP systems, both in terms of predictive accuracies and learning times, and can learn programs 20 times larger than state-of-the-art systems.
【Keywords】: Machine Learning: Relational Learning;
【Paper Link】 【Pages】:2080-2087
【Authors】: Yufei Cui ; Ziquan Liu ; Wuguannan Yao ; Qiao Li ; Antoni B. Chan ; Tei-Wei Kuo ; Chun Jason Xue
【Abstract】: Neural network compression and quantization are important tasks for fitting state-of-the-art models into the computational, memory and power constraints of mobile devices and embedded hardware. Recent approaches to model compression/quantization are based on reinforcement learning or search methods that quantize the neural network for a specific hardware platform. However, these methods require multiple runs to compress/quantize the same base neural network for different hardware setups. In this work, we propose a fully nested neural network (FN3) that runs only once to build a nested set of compressed/quantized models, which is optimal for different resource constraints. Specifically, we exploit the additive characteristic of different levels of building blocks in a neural network and propose an ordered dropout (ODO) operation that ranks the building blocks. Given a trained FN3, a fast heuristic search algorithm is run offline to find the optimal removal of components that maximizes accuracy under each constraint. Compared with related work on adaptive neural networks designed only for channels or bits, the proposed approach is applicable to different levels of building blocks (bits, neurons, channels, residual paths and layers). Empirical results validate the strong practical performance of the proposed approach.
【Keywords】: Machine Learning: Deep Learning; Machine Learning: Deep Learning: Convolutional networks;
【Paper Link】 【Pages】:2088-2095
【Authors】: Yaoming Wang ; Wenrui Dai ; Chenglin Li ; Junni Zou ; Hongkai Xiong
【Abstract】: Bayesian methods have improved the interpretability and stability of neural architecture search (NAS). In this paper, we propose a novel probabilistic approach, namely Semi-Implicit Variational Dropout one-shot Neural Architecture Search (SI-VDNAS), that leverages semi-implicit variational dropout to support architecture search with variable operations and edges. SI-VDNAS achieves stable training that is not affected by the over-selection of the skip-connect operation. Experimental results demonstrate that SI-VDNAS finds a convergent architecture with only 2.7 MB of parameters within 0.8 GPU-days and can achieve a 2.60% top-1 error rate on CIFAR-10. The convergent architecture obtains top-1 error rates of 16.20% and 25.6% when transferred to CIFAR-100 and ImageNet (mobile setting), respectively.
【Keywords】: Machine Learning: Deep Learning: Convolutional networks; Machine Learning: Probabilistic Machine Learning;
【Paper Link】 【Pages】:2096-2102
【Authors】: Enmin Zhao ; Shihong Deng ; Yifan Zang ; Yongxin Kang ; Kai Li ; Junliang Xing
【Abstract】: Experience replay plays a crucial role in Reinforcement Learning (RL), enabling the agent to remember and reuse experience from the past. Most previous methods sample experience transitions using simple heuristics like uniformly sampling or prioritizing those good ones. Since humans can learn from both good and bad experiences, more sophisticated experience replay algorithms need to be developed. Inspired by the potential energy in physics, this work introduces the artificial potential field into experience replay and develops Potentialized Experience Replay (PotER) as a new and effective sampling algorithm for RL in hard exploration tasks with sparse rewards. PotER defines a potential energy function for each state in experience replay and helps the agent to learn from both good and bad experiences using intrinsic state supervision. PotER can be combined with different RL algorithms as well as the self-imitation learning algorithm. Experimental analyses and comparisons on multiple challenging hard exploration environments have verified its effectiveness and efficiency.
【Keywords】: Machine Learning: Reinforcement Learning; Machine Learning: Deep Reinforcement Learning; Heuristic Search and Game Playing: Combinatorial Search and Optimisation;
【Paper Link】 【Pages】:2103-2110
【Authors】: Sayak Dey ; Swagatam Das ; Rammohan Mallipeddi
【Abstract】: Classical clustering methods usually face tough challenges when the number of features is larger than the number of items to be partitioned. We propose a Sparse MinMax k-Means clustering approach by reformulating the objective of the MinMax k-Means algorithm (a variation of classical k-Means that minimizes the maximum intra-cluster variance instead of the sum of intra-cluster variances) into a new weighted between-cluster sum of squares (BCSS) form. We impose sparse regularization on these weights to make the approach suitable for high-dimensional clustering. We seek to exploit the advantages of the MinMax k-Means algorithm in high-dimensional spaces to generate good-quality clusters. The efficacy of the proposal is showcased through comparison against several representative clustering methods over several real-world datasets.
【Keywords】: Machine Learning: Clustering; Machine Learning: Feature Selection; Learning Sparse Models; Data Mining: Clustering, Unsupervised Learning;
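To make the parenthetical definition concrete, here is a tiny sketch (ours, illustrative only, not the authors' sparse reformulation) of the MinMax k-Means objective, i.e. the maximum intra-cluster sum of squares that the algorithm minimizes in place of the classical sum:

```python
import numpy as np

def minmax_kmeans_objective(X, labels, centers):
    """MinMax k-Means objective: the *maximum* intra-cluster sum of
    squared distances, versus classical k-Means, which minimizes the
    *sum* of these per-cluster costs.
    """
    costs = [((X[labels == k] - c) ** 2).sum() for k, c in enumerate(centers)]
    return max(costs)
```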
【Paper Link】 【Pages】:2111-2118
【Authors】: Tuan Hoang ; Thanh-Toan Do ; Tam V. Nguyen ; Ngai-Man Cheung
【Abstract】: This paper proposes two novel techniques for training deep convolutional neural networks with low bit-width weights and activations. First, to obtain low bit-width weights, most existing methods derive the quantized weights by performing quantization on the full-precision network weights. However, this approach results in a mismatch: gradient descent updates the full-precision weights, but not the quantized weights. To address this issue, we propose a novel method that enables direct updating of quantized weights, with learnable quantization levels, to minimize the cost function using gradient descent. Second, to obtain low bit-width activations, existing works consider all channels equally. However, the activation quantizers could be biased toward a few channels with high variance. To address this issue, we propose a method that takes into account the quantization errors of individual channels, learning activation quantizers that minimize the quantization errors in the majority of channels. Experimental results demonstrate that our proposed method achieves state-of-the-art performance on the image classification task, using AlexNet, ResNet and MobileNetV2 architectures on the CIFAR-100 and ImageNet datasets.
【Keywords】: Machine Learning: Deep Learning;
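As a frame of reference for the first technique, the sketch below (our illustration; the paper's learnable levels and channel-wise activation quantizers are not reproduced) shows the basic projection of full-precision weights onto a fixed set of quantization levels:

```python
import numpy as np

def quantize_to_levels(w, levels):
    """Map each full-precision weight to its nearest quantization level.

    In the paper the levels themselves are learnable and the quantized
    weights are updated directly by gradient descent; here the levels are
    a fixed input, so this is only the forward projection step.
    """
    levels = np.asarray(levels)
    idx = np.argmin(np.abs(w[..., None] - levels[None, :]), axis=-1)
    return levels[idx]
```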
【Paper Link】 【Pages】:2119-2125
【Authors】: Michele Donini ; Luca Franceschi ; Orchid Majumder ; Massimiliano Pontil ; Paolo Frasconi
【Abstract】: We study the problem of fitting task-specific learning rate schedules from the perspective of hyperparameter optimization, aiming at good generalization. We describe the structure of the gradient of a validation error w.r.t. the learning rate schedule -- the hypergradient. Based on this, we introduce MARTHE, a novel online algorithm guided by cheap approximations of the hypergradient that uses past information from the optimization trajectory to simulate future behaviour. It interpolates between two recent techniques, RTHO (Franceschi et al., 2017) and HD (Baydin et al., 2018), and is able to produce learning rate schedules that are more stable, leading to models that generalize better.
【Keywords】: Machine Learning: Deep Learning; Machine Learning: Online Learning;
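One of the two techniques MARTHE interpolates between, HD (Baydin et al., 2018), admits a compact sketch; the version below is our illustration of that baseline rule, not of MARTHE itself, and assumes a deterministic gradient oracle:

```python
import numpy as np

def hd_sgd(grad_fn, x, alpha=0.1, beta=0.001, steps=100):
    """Gradient descent with a hypergradient-adapted learning rate.

    HD updates the step size alpha by ascending the dot product of the
    current and previous gradients: a positive dot product suggests alpha
    is too small, a negative one that the last step overshot.
    """
    g_prev = np.zeros_like(x)
    for _ in range(steps):
        g = grad_fn(x)
        alpha += beta * float(np.dot(g, g_prev))  # hypergradient step
        x = x - alpha * g
        g_prev = g
    return x, alpha
```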
【Paper Link】 【Pages】:2126-2132
【Authors】: George Dasoulas ; Ludovic Dos Santos ; Kevin Scaman ; Aladin Virmaux
【Abstract】: In this paper, we show that a simple coloring scheme can improve, both theoretically and empirically, the expressive power of Message Passing Neural Networks (MPNNs). More specifically, we introduce a graph neural network called Colored Local Iterative Procedure (CLIP) that uses colors to disambiguate identical node attributes, and show that this representation is a universal approximator of continuous functions on graphs with node attributes. Our method relies on separability, a key topological characteristic that allows extending well-chosen neural networks into universal representations. Finally, we show experimentally that CLIP is capable of capturing structural characteristics that traditional MPNNs fail to distinguish, while remaining state-of-the-art on benchmark graph classification datasets.
【Keywords】: Machine Learning: Deep Learning; Machine Learning: Relational Learning; Data Mining: Mining Graphs, Semi Structured Data, Complex Data;
【Paper Link】 【Pages】:2133-2139
【Authors】: Peng Zhou ; Liang Du ; Xuejun Li
【Abstract】: Consensus clustering provides a framework to ensemble multiple clustering results to obtain a consensus and robust result. Most existing consensus clustering methods apply all data to ensemble learning, while ignoring the side effects caused by difficult or unreliable instances. To tackle this problem, we propose a novel self-paced consensus clustering method to gradually involve instances, from more reliable to less reliable ones, in the ensemble learning. We first construct an initial bipartite graph from the multiple base clustering results, where the nodes represent the instances and clusters and the edges indicate that an instance belongs to a cluster. Then, we learn a structured bipartite graph from the initial one by self-paced learning, i.e., we automatically decide the reliability of each edge and involve the edges in graph learning in order of their reliability. Finally, we obtain the consensus clustering result from the learned bipartite graph. Extensive experimental results demonstrate the effectiveness and superiority of the proposed method.
【Keywords】: Machine Learning: Clustering; Machine Learning: Ensemble Methods;
【Paper Link】 【Pages】:2140-2147
【Authors】: Maxime Wabartha ; Audrey Durand ; Vincent François-Lavet ; Joelle Pineau
【Abstract】: By virtue of their expressive power, neural networks (NNs) are well suited to fitting large, complex datasets, yet they are also known to produce similar predictions for points outside the training distribution. As such, they are, like humans, under the influence of the Black Swan theory: models tend to be extremely "surprised" by rare events, leading to potentially disastrous consequences, while justifying these same events in hindsight. To avoid this pitfall, we introduce DENN, an ensemble approach building a set of Diversely Extrapolated Neural Networks that fits the training data and is able to generalize more diversely when extrapolating to novel data points. This leads DENN to output highly uncertain predictions for unexpected inputs. We achieve this by adding a diversity term in the loss function used to train the model, computed at specific inputs. We first illustrate the usefulness of the method on a low-dimensional regression problem. Then, we show how the loss can be adapted to tackle anomaly detection during classification, as well as safe imitation learning problems.
【Keywords】: Machine Learning: Deep Learning; Machine Learning: Ensemble Methods; Uncertainty in AI: Uncertainty Representations;
【Paper Link】 【Pages】:2148-2154
【Authors】: Christoph Dürr ; Nguyen Kim Thang ; Abhinav Srivastav ; Léo Tible
【Abstract】: Many real-world problems can be cast as the optimization of DR-submodular functions defined over a convex domain. These functions play an important role, with applications in many areas of applied mathematics, such as machine learning, computer vision, operations research, communication systems and economics. In addition, they capture a subclass of non-convex optimization that provides both practical and theoretical guarantees. In this paper, we show that for maximizing non-monotone DR-submodular functions over a general convex set (such as up-closed convex sets, conic convex sets, etc.) the Frank-Wolfe algorithm achieves an approximation guarantee which depends on the convex set. To the best of our knowledge, this is the first such approximation guarantee. Finally, we benchmark our algorithm on problems arising in the machine learning domain using real-world datasets.
【Keywords】: Machine Learning: Learning Theory; Machine Learning: Online Learning;
【Paper Link】 【Pages】:2155-2161
【Authors】: Rémi Viola ; Rémi Emonet ; Amaury Habrard ; Guillaume Metzler ; Marc Sebban
【Abstract】: Learning from imbalanced data, where the positive examples are very scarce, remains a challenging task from both a theoretical and algorithmic perspective. In this paper, we address this problem using a metric learning strategy. Unlike the state-of-the-art methods, our algorithm MLFP, for Metric Learning from Few Positives, learns a new representation that is used only when a test query is compared to a minority training example. From a geometric perspective, it artificially brings positive examples closer to the query without changing the distances to the negative (majority class) data. This strategy allows us to expand the decision boundaries around the positives, yielding a better F-Measure, a criterion which is suited to deal with imbalanced scenarios. Beyond the algorithmic contribution provided by MLFP, our paper presents generalization guarantees on the false positive and false negative rates. Extensive experiments conducted on several imbalanced datasets show the effectiveness of our method.
【Keywords】: Machine Learning: Classification; Machine Learning Applications: Applications of Supervised Learning;
【Paper Link】 【Pages】:2162-2168
【Authors】: Tanguy Kerdoncuff ; Rémi Emonet ; Marc Sebban
【Abstract】: Domain Adaptation aims at benefiting from a labeled dataset drawn from a source distribution to learn a model from examples generated from a different but related target distribution. Creating a domain-invariant representation between the source and target domains is the most widely used technique. A simple and robust way to perform this task consists in (i) representing the two domains by subspaces described by their respective eigenvectors and (ii) seeking a mapping function which aligns them. In this paper, we propose to use Optimal Transport (OT) and its associated Wasserstein distance to perform this alignment. While the idea of using OT in domain adaptation is not new, the original contribution of this paper is two-fold: (i) we derive a generalization bound on the target error involving several Wasserstein distances. This prompts us to optimize the ground metric of OT to reduce the target risk; (ii) from this theoretical analysis, we design an algorithm (MLOT) which optimizes a Mahalanobis distance, leading to a transportation plan that adapts better. Extensive experiments demonstrate the effectiveness of this original approach.
【Keywords】: Machine Learning: Transfer, Adaptation, Multi-task Learning; Machine Learning: Unsupervised Learning;
【Paper Link】 【Pages】:2169-2176
【Authors】: Fei Mi ; Boi Faltings
【Abstract】: Increasing concerns with privacy have stimulated interest in Session-based Recommendation (SR) using no personal data other than what is observed in the current browser session. Existing methods are evaluated in static settings which rarely occur in real-world applications. To better address the dynamic nature of SR tasks, we study an incremental SR scenario, where new items and preferences appear continuously. We show that existing neural recommenders can be used in incremental SR scenarios with small incremental updates to alleviate computation overhead and catastrophic forgetting. More importantly, we propose a general framework called Memory Augmented Neural model (MAN). MAN augments a base neural recommender with a continuously queried and updated nonparametric memory, and the predictions from the neural and the memory components are combined through another lightweight gating network. We empirically show that MAN is well-suited for the incremental SR task, and it consistently outperforms state-of-the-art neural and nonparametric methods. We analyze the results and demonstrate that it is particularly good at incrementally learning preferences on new and infrequent items.
【Keywords】: Machine Learning: Recommender Systems; Machine Learning: Online Learning; Machine Learning: Time-series;Data Streams;
【Paper Link】 【Pages】:2177-2183
【Authors】: Xiao Wang ; Shaohua Fan ; Kun Kuang ; Chuan Shi ; Jiawei Liu ; Bai Wang
【Abstract】: Most existing clustering algorithms are proposed without considering selection bias in the data. In many real applications, however, one cannot guarantee the data is unbiased. Selection bias might introduce unexpected correlations between features, and ignoring those unexpected correlations will hurt the performance of clustering algorithms. Therefore, how to remove those unexpected correlations induced by selection bias is extremely important yet largely unexplored for clustering. In this paper, we propose a novel Decorrelation regularized K-Means algorithm (DCKM) for clustering with data selection bias. Specifically, the decorrelation regularizer aims to learn global sample weights which are capable of balancing the sample distribution, so as to remove unexpected correlations among features. Meanwhile, the learned weights are combined with k-means, which makes the reweighted k-means cluster on the inherent data distribution without unexpected correlation influence. Moreover, we derive the updating rules to effectively infer the parameters in DCKM. Extensive experimental results on real-world datasets demonstrate that our DCKM algorithm achieves significant performance gains, indicating the necessity of removing unexpected feature correlations induced by selection bias when clustering.
【Keywords】: Machine Learning: Clustering; Machine Learning: Unsupervised Learning; Data Mining: Clustering, Unsupervised Learning;
【Paper Link】 【Pages】:2184-2190
【Authors】: Dingqi Yang ; Benjamin Fankhauser ; Paolo Rosso ; Philippe Cudré-Mauroux
【Abstract】: Location prediction is a key problem in human mobility modeling, which predicts a user's next location based on historical user mobility traces. As a sequential prediction problem by nature, it has been recently studied using Recurrent Neural Networks (RNNs). Due to the sparsity of user mobility traces, existing techniques strive to improve RNNs by considering spatiotemporal contexts. The most adopted scheme is to incorporate spatiotemporal factors into the recurrent hidden state passing process of RNNs using context-parameterized transition matrices or gates. However, such a scheme oversimplifies the temporal periodicity and spatial regularity of user mobility, and thus cannot fully benefit from rich historical spatiotemporal contexts encoded in user mobility traces. Against this background, we propose Flashback, a general RNN architecture designed for modeling sparse user mobility traces by doing flashbacks on hidden states in RNNs. Specifically, Flashback explicitly uses spatiotemporal contexts to search past hidden states with high predictive power (i.e., historical hidden states sharing similar contexts as the current one) for location prediction, which can then directly benefit from rich spatiotemporal contexts. Our extensive evaluation compares Flashback against a sizable collection of state-of-the-art techniques on two real-world LBSN datasets. Results show that Flashback consistently and significantly outperforms state-of-the-art RNNs involving spatiotemporal factors by 15.9% to 27.6% in the next location prediction task.
【Keywords】: Machine Learning: Deep Learning: Sequence Modeling; Data Mining: Mining Spatial, Temporal Data;
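The flashback mechanism described above can be sketched as a context-weighted aggregation of past hidden states. The following is a minimal illustration only: the weight form (temporal decay with a periodic factor, spatial decay) follows the abstract's general idea, but the exact function and all constants here are assumptions, not the paper's formulation.

```python
import numpy as np

def flashback_aggregate(hidden_states, delta_t, delta_d, alpha=0.1, beta=100.0):
    """Flashback-style sketch: aggregate past RNN hidden states with
    weights that decay in the time gap delta_t (with a periodic factor
    capturing e.g. daily rhythms) and in the distance gap delta_d.
    Weight form and constants are illustrative assumptions."""
    w_time = np.exp(-alpha * delta_t) * (1.0 + np.cos(2.0 * np.pi * delta_t)) / 2.0
    w_space = np.exp(-beta * delta_d)
    w = w_time * w_space + 1e-10      # avoid an all-zero weight vector
    w = w / w.sum()                   # normalize to a convex combination
    return w @ hidden_states

# three past hidden states (dim 2), time gaps (in days) and distance gaps
H = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
context = flashback_aggregate(H, delta_t=np.array([0.02, 1.01, 3.5]),
                              delta_d=np.array([0.001, 0.02, 0.3]))
```

Because the weights are normalized, the result is a convex combination of the past hidden states; a state observed a whole number of days ago at a nearby location (small periodic phase, small distance gap) dominates the aggregate.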
【Paper Link】 【Pages】:2191-2197
【Authors】: Bahare Fatemi ; Perouz Taslakian ; David Vázquez ; David Poole
【Abstract】: Knowledge graphs store facts using relations between two entities. In this work, we address the question of link prediction in knowledge hypergraphs where relations are defined on any number of entities. While techniques exist (such as reification) that convert non-binary relations into binary ones, we show that current embedding-based methods for knowledge graph completion do not work well out of the box for knowledge graphs obtained through these techniques. To overcome this, we introduce HSimplE and HypE, two embedding-based methods that work directly with knowledge hypergraphs. In both models, the prediction is a function of the relation embedding, the entity embeddings and their corresponding positions in the relation. We also develop public datasets, benchmarks and baselines for hypergraph prediction and show experimentally that the proposed models are more effective than the baselines.
【Keywords】: Machine Learning: Relational Learning; Uncertainty in AI: Statistical Relational AI;
【Paper Link】 【Pages】:2198-2205
【Authors】: Dieqiao Feng ; Carla P. Gomes ; Bart Selman
【Abstract】: Despite significant progress in general AI planning, certain domains remain out of reach of current AI planning systems. Sokoban is a PSPACE-complete planning task and represents one of the hardest domains for current AI planners. Even domain-specific specialized search methods fail quickly due to the exponential search complexity on hard instances. Our approach based on deep reinforcement learning augmented with a curriculum-driven method is the first one to solve hard instances within one day of training while other modern solvers cannot solve these instances within any reasonable time limit. In contrast to prior efforts, which use carefully handcrafted pruning techniques, our approach automatically uncovers domain structure. Our results reveal that deep RL provides a promising framework for solving previously unsolved AI planning problems, provided a proper training curriculum can be devised.
【Keywords】: Machine Learning: Deep Reinforcement Learning; Planning and Scheduling: Planning Algorithms; Heuristic Search and Game Playing: Heuristic Search and Machine Learning;
【Paper Link】 【Pages】:2206-2212
【Authors】: Lei Feng ; Senlin Shu ; Zhuoyi Lin ; Fengmao Lv ; Li Li ; Bo An
【Abstract】: Trained with the standard cross entropy loss, deep neural networks can achieve great performance on correctly labeled data. However, if the training data is corrupted with label noise, deep models tend to overfit the noisy labels, thereby achieving poor generalization performance. To remedy this issue, several loss functions have been proposed and demonstrated to be robust to label noise. Although most of the robust loss functions stem from the Categorical Cross Entropy (CCE) loss, they fail to embody the intrinsic relationships between CCE and other loss functions. In this paper, we propose a general framework dubbed Taylor cross entropy loss to train deep models in the presence of label noise. Specifically, our framework enables weighting the extent to which the training labels are fit by controlling the order of the Taylor series for CCE, and hence it can be robust to label noise. In addition, our framework clearly reveals the intrinsic relationships between CCE and other loss functions, such as Mean Absolute Error (MAE) and Mean Squared Error (MSE). Moreover, we present a detailed theoretical analysis to certify the robustness of this framework. Extensive experimental results on benchmark datasets demonstrate that our proposed approach significantly outperforms the state-of-the-art counterparts.
【Keywords】: Machine Learning: Classification; Data Mining: Classification, Semi-Supervised Learning;
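The Taylor-series view of cross entropy can be sketched as follows. This is a minimal illustration, not the authors' exact formulation: truncating the expansion -log(p) = Σ_{k≥1} (1-p)^k / k at order t yields a family of losses, with t = 1 recovering an MAE-like loss and large t approaching the full cross entropy.

```python
import math

def taylor_cross_entropy(p_true, t):
    """Truncated Taylor expansion of -log(p) around p = 1:
    -log(p) = sum_{k>=1} (1 - p)^k / k.
    A low order t fits the training labels less aggressively,
    which is the intuition behind its robustness to label noise."""
    return sum((1.0 - p_true) ** k / k for k in range(1, t + 1))

p = 0.8                                    # predicted probability of the (possibly noisy) label
loss_mae_like = taylor_cross_entropy(p, 1)    # order 1: just (1 - p), MAE-like
loss_near_cce = taylor_cross_entropy(p, 200)  # high order: approaches -log(p)
```

The order t thus interpolates between a bounded, noise-tolerant loss and the unbounded CCE that fits every label exactly.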
【Paper Link】 【Pages】:2213-2219
【Authors】: Rajarshi Roy ; Dana Fisman ; Daniel Neider
【Abstract】: We address the problem of learning human-interpretable descriptions of a complex system from a finite set of positive and negative examples of its behavior. In contrast to most of the recent work in this area, which focuses on descriptions expressed in Linear Temporal Logic (LTL), we develop a learning algorithm for formulas in the IEEE standard temporal logic PSL (Property Specification Language). Our work is motivated by the fact that many natural properties, such as an event happening at every n-th point in time, cannot be expressed in LTL, whereas it is easy to express such properties in PSL. Moreover, formulas in PSL can be more succinct and easier to interpret (due to the use of regular expressions in PSL formulas) than formulas in LTL. The learning algorithm we designed builds on top of an existing algorithm for learning LTL formulas. Roughly speaking, our algorithm reduces the learning task to a constraint satisfaction problem in propositional logic and then uses a SAT solver to search for a solution in an incremental fashion. We have implemented our algorithm and performed a comparative study between the proposed method and the existing LTL learning algorithm. Our results illustrate the effectiveness of the proposed approach to provide succinct human-interpretable descriptions from examples.
【Keywords】: Machine Learning: Classification; Machine Learning: Explainable Machine Learning; Constraints and SAT: SAT: : Solvers and Applications;
【Paper Link】 【Pages】:2220-2226
【Authors】: Tomás Kocák ; Aurélien Garivier
【Abstract】: We study best-arm identification with fixed confidence in bandit models with graph smoothness constraint. We provide and analyze an efficient gradient ascent algorithm to compute the sample complexity of this problem as a solution of a non-smooth max-min problem (providing in passing a simplified analysis for the unconstrained case). Building on this algorithm, we propose an asymptotically optimal strategy. We furthermore illustrate by numerical experiments both the strategy's efficiency and the impact of the smoothness constraint on the sample complexity. Best Arm Identification (BAI) is an important challenge in many applications ranging from parameter tuning to clinical trials. It is now very well understood in vanilla bandit models, but real-world problems typically involve some dependency between arms that requires more involved models. Assuming a graph structure on the arms is an elegant practical way to encompass this phenomenon, but this had been done so far only for regret minimization. Addressing BAI with graph constraints involves delicate optimization problems for which the present paper offers a solution.
【Keywords】: Machine Learning: Learning Theory; Machine Learning: Online Learning;
【Paper Link】 【Pages】:2227-2233
【Authors】: Tao Shen ; Xiubo Geng ; Guodong Long ; Jing Jiang ; Chengqi Zhang ; Daxin Jiang
【Abstract】: Many algorithms for Knowledge-Based Question Answering (KBQA) depend on semantic parsing, which translates a question to its logical form. When only weak supervision is provided, it is usually necessary to search valid logical forms for model training. However, a complex question typically involves a huge search space, which creates two main problems: 1) the solutions limited by computation time and memory usually reduce the success rate of the search, and 2) spurious logical forms in the search results degrade the quality of training data. These two problems lead to a poorly-trained semantic parsing model. In this work, we propose an effective search method for weakly supervised KBQA based on operator prediction for questions. With search space constrained by predicted operators, sufficient search paths can be explored, more valid logical forms can be derived, and operators possibly causing spurious logical forms can be avoided. As a result, a larger proportion of questions in a weakly supervised training set are equipped with logical forms, and fewer spurious logical forms are generated. Such high-quality training data directly contributes to a better semantic parsing model. Experimental results on one of the largest KBQA datasets (i.e., CSQA) verify the effectiveness of our approach and deliver a new state-of-the-art performance.
【Keywords】: Machine Learning: Knowledge-based Learning; Data Mining: Mining Graphs, Semi Structured Data, Complex Data; Natural Language Processing: Natural Language Semantics;
【Paper Link】 【Pages】:2234-2240
【Authors】: Gabriele Ciravegna ; Francesco Giannini ; Marco Gori ; Marco Maggini ; Stefano Melacci
【Abstract】: Deep neural networks are usually considered black boxes due to their complex internal architecture, which cannot straightforwardly provide human-understandable explanations of how they behave. Indeed, Deep Learning is still viewed with skepticism in those real-world domains in which incorrect predictions may produce critical effects. This is one of the reasons why in the last few years Explainable Artificial Intelligence (XAI) techniques have gained a lot of attention in the scientific community. In this paper, we focus on the case of multi-label classification, proposing a neural network that learns the relationships among the predictors associated with each class, yielding First-Order Logic (FOL)-based descriptions. Both the explanation-related network and the classification-related network are jointly learned, thus implicitly introducing a latent dependency between the development of the explanation mechanism and the development of the classifiers. Our model can integrate human-driven preferences that guide the learning-to-explain process, and it is presented in a unified framework. Different typologies of explanations are evaluated in distinct experiments, showing that the proposed approach discovers new knowledge and can improve the classifier performance.
【Keywords】: Machine Learning: Explainable Machine Learning; Machine Learning: Interpretability; Machine Learning: Neuro-Symbolic Methods;
【Paper Link】 【Pages】:2241-2247
【Authors】: Frank Nussbaum ; Joachim Giesen
【Abstract】: Measurement is at the core of scientific discovery. However, some quantities, such as economic behavior or intelligence, do not allow for direct measurement. They represent latent constructs that require surrogate measurements. In other scenarios, non-observed quantities can influence the variables of interest. In either case, models with latent variables are needed. Here, we investigate fused latent and graphical models that exhibit continuous latent variables and discrete observed variables. These models are characterized by a decomposition of the pairwise interaction parameter matrix into a group-sparse component of direct interactions and a low-rank component of indirect interactions due to the latent variables. We first investigate when such a decomposition is identifiable. Then, we show that fused latent and graphical models can be recovered consistently from data in the high-dimensional setting. We support our theoretical findings with experiments on synthetic and real-world data from polytomous item response theory studies.
【Keywords】: Machine Learning: Feature Selection; Learning Sparse Models; Machine Learning: Learning Graphical Models; Multidisciplinary Topics and Applications: Social Sciences;
【Paper Link】 【Pages】:2248-2254
【Authors】: Chuang Zhang ; Chen Gong ; Tengfei Liu ; Xun Lu ; Weiqiang Wang ; Jian Yang
【Abstract】: Positive and Unlabeled learning (PU learning) aims to build a binary classifier where only positive and unlabeled data are available for classifier training. However, existing PU learning methods all work in a batch learning mode, which cannot deal with online learning scenarios involving sequential data. Therefore, this paper proposes a novel positive and unlabeled learning algorithm in an online training mode, which trains a classifier solely on the positive and unlabeled data arriving in sequential order. Specifically, we adopt an unbiased estimate of the loss induced by the arriving positive or unlabeled examples at each time step. We then show that, for any new datum, the model can be updated independently and incrementally by a gradient-based online learning method. Furthermore, we extend our method to handle the case when more than one example is received at each time step. Theoretically, we show that the proposed online PU learning method achieves low regret even though it receives sequential positive and unlabeled data. Empirically, we conduct extensive experiments on both benchmark and real-world datasets, and the results clearly demonstrate the effectiveness of the proposed method.
【Keywords】: Machine Learning: Classification; Machine Learning: Online Learning;
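The unbiased-estimate idea mentioned above can be sketched with the standard unbiased PU risk, which re-weights positives so the expectation matches the fully supervised risk. This is a generic illustration under the usual assumption that the class prior pi of positives among unlabeled data is known; it is not the paper's exact algorithm.

```python
import numpy as np

def logistic_grad(w, x, y):
    """Gradient of log(1 + exp(-y * <w, x>)) with respect to w."""
    return -y * x / (1.0 + np.exp(y * w.dot(x)))

def online_pu_step(w, x, is_positive, pi, lr=0.05):
    """One SGD step on the standard unbiased PU risk:
    a labeled positive contributes pi * (loss(x, +1) - loss(x, -1)),
    an unlabeled example contributes loss(x, -1), so that in expectation
    the update matches the fully supervised gradient."""
    if is_positive:
        g = pi * (logistic_grad(w, x, +1) - logistic_grad(w, x, -1))
    else:
        g = logistic_grad(w, x, -1)
    return w - lr * g

# tiny simulated stream: positives around (+2, +2), negatives around (-2, -2)
rng = np.random.default_rng(0)
w = np.zeros(2)
pi = 0.5  # assumed known class prior of positives among unlabeled data
for _ in range(500):
    if rng.random() < 0.5:                 # a labeled positive arrives
        w = online_pu_step(w, rng.normal([2.0, 2.0], 1.0), True, pi)
    else:                                  # an unlabeled example arrives
        center = [2.0, 2.0] if rng.random() < pi else [-2.0, -2.0]
        w = online_pu_step(w, rng.normal(center, 1.0), False, pi)
```

After the stream, w separates the two clusters even though no negative label was ever observed; each datum is used once and discarded, which is what makes the mode online.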
【Paper Link】 【Pages】:2255-2261
【Authors】: Priyadarshini Kumari ; Ritesh Goru ; Siddhartha Chaudhuri ; Subhasis Chaudhuri
【Abstract】: We present an active learning strategy for training parametric models of distance metrics, given triplet-based similarity assessments: object $x_i$ is more similar to object $x_j$ than to $x_k$. In contrast to prior work on class-based learning, where the fundamental goal is classification and any implicit or explicit metric is binary, we focus on perceptual metrics that express the degree of (dis)similarity between objects. We find that standard active learning approaches degrade when annotations are requested for batches of triplets at a time: our studies suggest that correlation among triplets is responsible. In this work, we propose a novel method to decorrelate batches of triplets, that jointly balances informativeness and diversity while decoupling the choice of heuristic for each criterion. Experiments indicate our method is general, adaptable, and outperforms the state-of-the-art.
【Keywords】: Machine Learning: Active Learning; Machine Learning: Dimensionality Reduction and Manifold Learning; Machine Learning: Learning Preferences or Rankings;
【Paper Link】 【Pages】:2262-2268
【Authors】: Vincent Grari ; Sylvain Lamprier ; Marcin Detyniecki
【Abstract】: The past few years have seen a dramatic rise of academic and societal interest in fair machine learning. While plenty of fair algorithms have been proposed recently to tackle this challenge for discrete variables, only a few ideas exist for continuous ones. The objective in this paper is to ensure some independence level between the outputs of regression models and any given continuous sensitive variables. For this purpose, we use the Hirschfeld-Gebelein-Rényi (HGR) maximal correlation coefficient as a fairness metric. We propose to minimize the HGR coefficient directly with an adversarial neural network architecture. The idea is to predict the output Y while minimizing the ability of an adversarial neural network to find the estimated transformations which are required to predict the HGR coefficient. We empirically assess and compare our approach, and demonstrate significant improvements over previously presented work in the field.
【Keywords】: Machine Learning: Adversarial Machine Learning; Machine Learning: Deep Learning; Trust, Fairness, Bias: General;
【Paper Link】 【Pages】:2269-2276
【Authors】: Zhongyi Han ; Xian-Jin Gui ; Chaoran Cui ; Yilong Yin
【Abstract】: In non-stationary environments, learning machines usually confront the domain adaptation scenario, where the data distribution changes over time. Previous domain adaptation works have achieved great success in theory and practice. However, they always lose robustness in noisy environments where the labels and features of examples from the source domain become corrupted. In this paper, we report our attempt towards achieving accurate noise-robust domain adaptation. We first give a theoretical analysis that reveals how harmful noise influences unsupervised domain adaptation. To eliminate the effect of label noise, we propose an offline curriculum learning scheme that minimizes a newly-defined empirical source risk. To reduce the impact of feature noise, we propose a proxy-distribution-based margin discrepancy. We seamlessly transform our methods into an adversarial network that performs efficient joint optimization of both, successfully mitigating the negative influence of both data corruption and distribution shift. A series of empirical studies show that our algorithm remarkably outperforms the state of the art, with over 10% accuracy improvement in some domain adaptation tasks under noisy environments.
【Keywords】: Machine Learning: Transfer, Adaptation, Multi-task Learning; Machine Learning: Adversarial Machine Learning; Machine Learning: Learning Theory;
【Paper Link】 【Pages】:2277-2283
【Authors】: Padala Manisha ; Sujit Gujar
【Abstract】: In classification models, fairness can be ensured by solving a constrained optimization problem. We focus on fairness constraints like Disparate Impact, Demographic Parity, and Equalized Odds, which are non-decomposable and non-convex. Researchers define convex surrogates of the constraints and then apply convex optimization frameworks to obtain fair classifiers. Surrogates serve as an upper bound to the actual constraints, and convexifying fairness constraints is challenging.
We propose a neural network-based framework, FNNC, to achieve fairness while maintaining high accuracy in classification. The above fairness constraints are included in the loss using Lagrangian multipliers. We prove bounds on generalization errors for the constrained losses, which asymptotically go to zero. The network is optimized using two-step mini-batch stochastic gradient descent. Our experiments show that FNNC performs as well as the state of the art, if not better. The experimental evidence supplements our theoretical guarantees. In summary, we have an automated solution to achieve fairness in classification, which is easily extendable to many fairness constraints.
【Keywords】: Machine Learning: Classification; Machine Learning: Deep Learning; Trust, Fairness, Bias: General;
【Paper Link】 【Pages】:2284-2290
【Authors】: Julian Berk ; Sunil Gupta ; Santu Rana ; Svetha Venkatesh
【Abstract】: In order to improve the performance of Bayesian optimisation, we develop a modified Gaussian process upper confidence bound (GP-UCB) acquisition function. This is done by sampling the exploration-exploitation trade-off parameter from a distribution. We prove that this allows the expected trade-off parameter to be altered to better suit the problem without compromising a bound on the function's Bayesian regret. We also provide results showing that our method achieves better performance than GP-UCB in a range of real-world and synthetic problems.
【Keywords】: Machine Learning: Bayesian Optimization; Machine Learning: Cost-Sensitive Learning;
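The modification described above can be sketched as follows. This is a minimal illustration only: standard GP-UCB scores a candidate by mu(x) + sqrt(beta_t) * sigma(x) with a fixed schedule for beta_t, and the sketch replaces beta_t with a random draw; the exponential distribution and its scale are illustrative assumptions, not the paper's derived choice.

```python
import numpy as np

def sampled_gp_ucb(mu, sigma, rng, scale=2.0):
    """GP-UCB acquisition with a sampled exploration-exploitation
    trade-off parameter: beta is drawn from a distribution whose
    expectation can be tuned to the problem, rather than following
    a fixed schedule."""
    beta = rng.exponential(scale)        # assumed distribution, for illustration
    return mu + np.sqrt(beta) * sigma

rng = np.random.default_rng(0)
mu = np.array([0.2, 0.5, 0.1])        # GP posterior means at 3 candidate points
sigma = np.array([0.05, 0.01, 0.6])   # GP posterior standard deviations
scores = sampled_gp_ucb(mu, sigma, rng)
next_candidate = int(np.argmax(scores))  # point to evaluate next
```

A small sampled beta picks the highest posterior mean (exploitation), while an occasional large draw favors the high-uncertainty candidate (exploration).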
【Paper Link】 【Pages】:2291-2297
【Authors】: Peng Zhang ; Jianye Hao ; Weixun Wang ; Hongyao Tang ; Yi Ma ; Yihai Duan ; Yan Zheng
【Abstract】: Reinforcement learning agents usually learn from scratch, which requires a large number of interactions with the environment. This is quite different from the learning process of humans. When faced with a new task, humans naturally draw on common sense and use prior knowledge to derive an initial policy and guide the learning process afterwards. Although the prior knowledge may not be fully applicable to the new task, the learning process is significantly sped up, since the initial policy ensures a quick start of learning and intermediate guidance helps avoid unnecessary exploration. Taking this inspiration, we propose knowledge guided policy network (KoGuN), a novel framework that combines human prior suboptimal knowledge with reinforcement learning. Our framework consists of a fuzzy rule controller to represent human knowledge and a refine module to finetune suboptimal prior knowledge. The proposed framework is end-to-end and can be combined with existing policy-based reinforcement learning algorithms. We conduct experiments on several control tasks. The empirical results show that our approach, which combines suboptimal human knowledge and RL, achieves significant improvement in the learning efficiency of flat RL algorithms, even with very low-performance human prior knowledge.
【Keywords】: Machine Learning: Reinforcement Learning;
【Paper Link】 【Pages】:2298-2304
【Authors】: Weixiang Xu ; Xiangyu He ; Tianli Zhao ; Qinghao Hu ; Peisong Wang ; Jian Cheng
【Abstract】: Large neural networks are difficult to deploy on mobile devices because of intensive computation and storage. To alleviate this, we study ternarization, a balance between efficiency and accuracy that quantizes both weights and activations into ternary values. In previous ternarized neural networks, a hard threshold Δ is introduced to determine quantization intervals. Although the selection of Δ greatly affects the training results, previous works estimate Δ via an approximation or treat it as a hyper-parameter, which is suboptimal. In this paper, we present Soft Threshold Ternary Networks (STTN), which enable the model to automatically determine quantization intervals instead of depending on a hard threshold. Concretely, we replace the original ternary kernel with the addition of two binary kernels at training time, where ternary values are determined by the combination of the two corresponding binary values. At inference time, we add up the two binary kernels to obtain a single ternary kernel. Our method dramatically outperforms current state-of-the-art methods, lowering the performance gap between full-precision networks and extreme low-bit networks. Experiments on ImageNet with AlexNet (Top-1 55.6%) and ResNet-18 (Top-1 66.2%) achieve new state-of-the-art results.
【Keywords】: Machine Learning: Classification; Machine Learning: Deep Learning: Convolutional networks;
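The core trick in the STTN abstract, representing a ternary kernel as the sum of two binary kernels, can be illustrated in a few lines. This is an illustrative sketch, not the authors' implementation: it assumes binary values in {-1, +1} and a 1/2 scaling so the sum lands in {-1, 0, +1}; the paper's actual training procedure learns the two binary kernels.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two binary kernels with values in {-1, +1} (illustrative stand-ins
# for the two binary kernels trained in place of the ternary one).
b1 = rng.choice([-1, 1], size=(3, 3))
b2 = rng.choice([-1, 1], size=(3, 3))

# Adding the two binary kernels (with a 1/2 scale, an assumption of this
# sketch) yields a single ternary kernel: agreement gives +-1,
# disagreement gives 0 -- no hard threshold Delta is ever applied.
ternary = (b1 + b2) // 2
```

At inference time only the single `ternary` kernel needs to be stored, which is how the method keeps the deployment cost of a ternary network.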
【Paper Link】 【Pages】:2305-2311
【Authors】: Alexander Schulz ; Fabian Hinder ; Barbara Hammer
【Abstract】: Machine learning algorithms using deep architectures have been able to implement increasingly powerful and successful models. However, they also become increasingly more complex, more difficult to comprehend and easier to fool. So far, most methods in the literature investigate the decision of the model for a single given input datum. In this paper, we propose to visualize a part of the decision function of a deep neural network together with a part of the data set in two dimensions with discriminative dimensionality reduction. This enables us to inspect how different properties of the data are treated by the model, such as outliers, adversaries or poisoned data. Further, the presented approach is complementary to the mentioned interpretation methods from the literature and hence might be even more useful in combination with those. Code is available at https://github.com/LucaHermes/DeepView
【Keywords】: Machine Learning: Explainable Machine Learning; Machine Learning: Dimensionality Reduction and Manifold Learning; Machine Learning: Deep Learning; Machine Learning: Interpretability;
【Paper Link】 【Pages】:2312-2318
【Authors】: Céline Hocquette ; Stephen H. Muggleton
【Abstract】: Predicate Invention in Meta-Interpretive Learning (MIL) is generally based on a top-down approach, and the search for a consistent hypothesis is carried out starting from the positive examples as goals. We consider augmenting top-down MIL systems with a bottom-up step during which the background knowledge is generalised with an extension of the immediate consequence operator for second-order logic programs. This new method provides a way to perform extensive predicate invention useful for feature discovery. We demonstrate this method is complete with respect to a fragment of dyadic datalog. We theoretically prove this method reduces the number of clauses to be learned for the top-down learner, which in turn can reduce the sample complexity. We formalise an equivalence relation for predicates which is used to eliminate redundant predicates. Our experimental results suggest pairing the state-of-the-art MIL system Metagol with an initial bottom-up step can significantly improve learning performance.
【Keywords】: Machine Learning: Relational Learning;
【Paper Link】 【Pages】:2319-2325
【Authors】: Nicholas Hoernle ; Kobi Gal ; Barbara J. Grosz ; Leilah Lyons ; Ada Ren ; Andee Rubin
【Abstract】: This paper describes methods for comparative evaluation of the interpretability of models of high dimensional time series data inferred by unsupervised machine learning algorithms. The time series data used in this investigation were logs from an immersive simulation like those commonly used in education and healthcare training. The structures learnt by the models provide representations of participants' activities in the simulation which are intended to be meaningful to people's interpretation. To choose the model that induces the best representation, we designed two interpretability tests, each of which evaluates the extent to which a model’s output aligns with people’s expectations or intuitions of what has occurred in the simulation. We compared the performance of the models on these interpretability tests to their performance on statistical information criteria. We show that the models that optimize interpretability quality differ from those that optimize (statistical) information theoretic criteria. Furthermore, we found that a model using a fully Bayesian approach performed well on both the statistical and human-interpretability measures. The Bayesian approach is a good candidate for fully automated model selection, i.e., when direct empirical investigations of interpretability are costly or infeasible.
【Keywords】: Machine Learning: Interpretability; AI Ethics: Explainability; Humans and AI: Human-AI Collaboration;
【Paper Link】 【Pages】:2326-2332
【Authors】: Weijun Hong ; Guilin Li ; Weinan Zhang ; Ruiming Tang ; Yunhe Wang ; Zhenguo Li ; Yong Yu
【Abstract】: Neural architecture search (NAS) has shown encouraging results in automating architecture design. Recently, DARTS relaxed the search process with a differentiable formulation that leverages weight-sharing and SGD to reduce the cost of NAS. In DARTS, all candidate operations are trained simultaneously during the network weight training step. Our empirical results show that this training procedure leads to a co-adaptation problem and the Matthew Effect: operations with fewer parameters are trained to maturity earlier. This causes two problems: first, the operations with more parameters may never get the chance to express the desired function, since those with fewer parameters have already done the job; second, the system punishes the underperforming operations by lowering their architecture parameters and back-propagating smaller loss gradients, which causes the Matthew Effect. In this paper, we systematically study these problems and propose a novel grouped operation dropout algorithm named DropNAS to fix them in DARTS. Extensive experiments demonstrate that DropNAS solves the above issues and achieves promising performance. Specifically, DropNAS achieves 2.26% test error on CIFAR-10, 16.39% on CIFAR-100 and 23.4% on ImageNet (with the same training hyperparameters as DARTS for a fair comparison). We also observe that DropNAS is robust across variants of the DARTS search space. Code is available at https://github.com/huawei-noah.
【Keywords】: Machine Learning: Deep Learning; Computer Vision: Other;
【Paper Link】 【Pages】:2333-2339
【Authors】: Shoujin Wang ; Liang Hu ; Yan Wang ; Quan Z. Sheng ; Mehmet A. Orgun ; Longbing Cao
【Abstract】: User purchase behaviours are complex and dynamic, and are usually observed as multiple choice actions across a sequence of shopping baskets. Most existing next-basket prediction approaches model user actions as homogeneous sequence data without considering complex and heterogeneous user intentions, impeding a deep understanding of user behaviours from the perspective of humans' inner drivers and thus reducing prediction performance. Psychological theories have indicated that user actions are essentially driven by certain underlying intentions (e.g., diet and entertainment). Moreover, different intentions may influence each other, while different choices usually have different utilities for accomplishing an intention. Inspired by such psychological insights, we formalize next-basket prediction as an Intention Recognition, Modelling and Accomplishing problem and further design the Intention2Basket (Int2Ba for short) model. In Int2Ba, an Intention Recognizer, a Coupled Intention Chain Net, and a Dynamic Basket Planner are specifically designed to respectively recognize, model and accomplish the heterogeneous intentions behind a sequence of baskets, so as to better plan the next basket. Extensive experiments on real-world datasets show the superiority of Int2Ba over state-of-the-art approaches.
【Keywords】: Machine Learning: Recommender Systems; Multidisciplinary Topics and Applications: Recommender Systems; Humans and AI: Personalization and User Modeling; Data Mining: Applications;
【Paper Link】 【Pages】:2340-2346
【Authors】: Qian Li ; Qingyuan Hu ; Yong Qi ; Saiyu Qi ; Jie Ma ; Jian Zhang
【Abstract】: Data augmentation has been intensively used in training deep neural networks to improve generalization, whether in the original space (e.g., image space) or the representation space. Despite its success, the connection between the synthesized data and the original data is largely ignored in training: the fact that the synthesized samples are distributed around the original sample is not taken into account, so the behavior of the network is not optimized for it. However, that behavior is crucially important for generalization, even in the adversarial setting, for the safety of the deep learning system. In this work, we propose a framework called Stochastic Batch Augmentation (SBA) to address these problems. SBA stochastically decides whether to augment at each iteration, controlled by a batch scheduler, and introduces a ''distilled'' dynamic soft-label regularization that incorporates the similarity in the vicinity distribution with respect to the raw samples. The proposed regularization provides direct supervision via the KL-divergence between the output softmax distributions of the original and virtual data. Our experiments on CIFAR-10, CIFAR-100, and ImageNet show that SBA can improve the generalization of neural networks and speed up the convergence of network training.
【Keywords】: Machine Learning: Deep Learning: Convolutional networks; Machine Learning: Deep-learning Theory; Machine Learning: Deep Learning; Machine Learning: Classification;
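The regularizer described in the SBA abstract is a KL divergence between the softmax outputs for an original sample and its augmented ("virtual") counterpart. A minimal numeric sketch of that term follows; the function names and example logits are this sketch's own, not the paper's.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) = sum_i p_i * (log p_i - log q_i)
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)

# Logits for an original sample and an augmented "virtual" sample
# (illustrative values only).
logits_orig = np.array([2.0, 0.5, -1.0])
logits_virt = np.array([1.8, 0.7, -0.9])

# The regularization term supervises the virtual sample's output
# distribution with the original sample's distribution.
reg = kl_divergence(softmax(logits_orig), softmax(logits_virt))
```

In training, a term like `reg` would be added to the usual cross-entropy loss, pulling the network's outputs on augmented samples toward those on the raw samples.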
【Paper Link】 【Pages】:2347-2354
【Authors】: Peng Wei ; Guoliang Hua ; Weibo Huang ; Fanyang Meng ; Hong Liu
【Abstract】: Recently, unsupervised methods for monocular visual odometry (VO), which need no quantities of expensive labeled ground truth, have attracted much attention. However, these methods are inadequate for long-term odometry tasks, due to the inherent limitation of using only monocular visual data and the inability to handle error accumulation. By utilizing supplemental low-cost inertial measurements, and exploiting the multi-view geometric constraint and the sequential constraint, an unsupervised visual-inertial odometry framework (UnVIO) is proposed in this paper. Our method is able to predict the per-frame depth map, as well as to extract and self-adaptively fuse visual-inertial motion features from the image-IMU stream, to achieve long-term odometry tasks. A novel sliding-window optimization strategy, consisting of an intra-window and an inter-window optimization, is introduced to overcome the error accumulation and scale ambiguity problems. The intra-window optimization restrains the geometric inferences within the window by checking photometric consistency, while the inter-window optimization checks the 3D geometric consistency and trajectory consistency among the predictions of separate windows. Extensive experiments have been conducted on the KITTI and Malaga datasets to demonstrate the superiority of UnVIO over other state-of-the-art VO/VIO methods. The code is open-source.
【Keywords】: Machine Learning: Unsupervised Learning; Robotics: Localization, Mapping, State Estimation; Robotics: Vision and Perception;
【Paper Link】 【Pages】:2355-2361
【Authors】: Rongzhou Huang ; Chuyin Huang ; Yubao Liu ; Genan Dai ; Weiyang Kong
【Abstract】: Traffic prediction is a classical spatial-temporal prediction problem with many real-world applications, such as intelligent route planning, dynamic traffic management, and smart location-based services. Due to the high nonlinearity and complexity of traffic data, deep learning approaches have attracted much interest in recent years. However, few methods perform well on both long- and short-term prediction tasks. Targeting the shortcomings of existing studies, in this paper we propose a novel deep learning framework called Long Short-term Graph Convolutional Networks (LSGCN) to tackle both traffic prediction tasks. In our framework, we propose a new graph attention network called cosAtt, and integrate both cosAtt and graph convolutional networks (GCN) into a spatial gated block. Through the spatial gated block and gated linear unit convolutions (GLU), LSGCN can efficiently capture complex spatial-temporal features and obtain stable prediction results. Experiments with three real-world traffic datasets verify the effectiveness of LSGCN.
【Keywords】: Machine Learning: Deep Learning; Data Mining: Mining Spatial, Temporal Data;
【Paper Link】 【Pages】:2362-2368
【Authors】: Hao Zhu ; Huaibo Huang ; Yi Li ; Aihua Zheng ; Ran He
【Abstract】: Talking face generation aims to synthesize a face video with precise lip synchronization and a smooth transition of facial motion over the entire video, given a speech clip and a facial image. Most existing methods focus on either disentangling the information in a single image or learning temporal information between frames; however, cross-modality coherence between audio and video information has not been well addressed during synthesis. In this paper, we propose a novel arbitrary talking face generation framework that discovers audio-visual coherence via the proposed Asymmetric Mutual Information Estimator (AMIE). In addition, we propose a Dynamic Attention (DA) block that selectively focuses on the lip area of the input image during the training stage, further enhancing lip synchronization. Experimental results on the benchmark LRW and GRID datasets surpass state-of-the-art methods on prevalent metrics, with robust high-resolution synthesis across gender and pose variations.
【Keywords】: Machine Learning: Deep Generative Models; Humans and AI: Human-Computer Interaction;
【Paper Link】 【Pages】:2369-2375
【Authors】: Fangwen Zhang ; Xiuyi Jia ; Weiwei Li
【Abstract】: Label enhancement (LE) is a procedure for recovering label distributions from the logical labels in multi-label data, the purpose of which is to better represent and mine label ambiguity through the form of label distributions. Existing LE work mainly concentrates on how to leverage the topological information of the feature space and the correlations among the labels, and is all based on single-view data. Given that many real-world applications involve multi-view data, which can provide richer semantic information from different perspectives, this paper first presents the multi-view label enhancement problem and proposes a tensor-based multi-view label enhancement method, named TMV-LE. First, we introduce tensor factorization to obtain a common subspace that contains the high-order relationships among the different views. Second, we use the common representation together with the multiple views to jointly mine a more comprehensive topological structure in the dataset. Finally, the topological structure of the feature space is migrated to the label space to obtain the label distributions. Extensive comparative studies validate that the performance of multi-view multi-label learning can be improved significantly with TMV-LE.
【Keywords】: Machine Learning: Multi-instance;Multi-label;Multi-view learning; Machine Learning: Tensor and Matrix Methods;
【Paper Link】 【Pages】:2376-2382
【Authors】: Wenfang Zhu ; Xiuyi Jia ; Weiwei Li
【Abstract】: Label distribution learning has attracted increasing attention owing to its more general ability to express label ambiguity. However, it is much more expensive to obtain label distribution information than logical labels. Thus, label enhancement has been proposed to recover label distributions from logical labels. In this paper, we propose a novel label enhancement method that uses privileged information. We first apply a multi-label learning model to implicitly capture the complex structural information between instances and generate the privileged information. Second, we adopt the LUPI (learning using privileged information) paradigm to exploit the privileged information, employing RSVM+ as the prediction model. Finally, comparison experiments on 12 datasets demonstrate that our proposal better fits the ground-truth label distributions.
【Keywords】: Machine Learning: Multi-instance;Multi-label;Multi-view learning; Machine Learning: Structured Prediction;
【Paper Link】 【Pages】:2383-2389
【Authors】: Teng Zhang ; Hai Jin
【Abstract】: Multi-instance learning (MIL) is a celebrated learning framework in which each example is represented as a bag of instances. An example is negative if it contains no positive instances, and positive if it contains at least one. Over the past decades, various MIL algorithms have been proposed, among which large-margin methods form a very popular class. Recently, studies on margin theory have disclosed that the margin distribution is more important to generalization ability than the minimal margin. Inspired by this observation, we propose the multi-instance optimal margin distribution machine, which can identify the key instances by explicitly optimizing the margin distribution. We also extend a stochastic accelerated mirror-prox method to solve the formulated minimax problem. Extensive experiments show the superiority of the proposed method.
【Keywords】: Machine Learning: Multi-instance;Multi-label;Multi-view learning; Machine Learning: Classification;
【Paper Link】 【Pages】:2390-2396
【Authors】: Naoya Takeishi ; Yoshinobu Kawahara
【Abstract】: Prior domain knowledge can greatly help in learning generative models. However, it is often too costly to hard-code prior knowledge as a specific model architecture, so we often have to use general-purpose models. In this paper, we propose a method to incorporate prior knowledge of feature relations into the learning of general-purpose generative models. To this end, we formulate a regularizer that makes the marginals of a generative model follow a prescribed relative dependence of features. It can be incorporated into off-the-shelf learning methods for many generative models, including variational autoencoders and generative adversarial networks, as its gradients can be computed using standard backpropagation techniques. We show the effectiveness of the proposed method with experiments on multiple types of datasets and generative models.
【Keywords】: Machine Learning: Knowledge-based Learning; Machine Learning: Learning Generative Models; Machine Learning: Deep Generative Models;
【Paper Link】 【Pages】:2397-2404
【Authors】: Tuan Dam ; Pascal Klink ; Carlo D'Eramo ; Jan Peters ; Joni Pajarinen
【Abstract】: We consider Monte-Carlo Tree Search (MCTS) applied to Markov Decision Processes (MDPs) and Partially Observable MDPs (POMDPs), and the well-known Upper Confidence bound for Trees (UCT) algorithm. In UCT, a tree with nodes (states) and edges (actions) is incrementally built by the expansion of nodes, and the values of nodes are updated through a backup strategy based on the average value of child nodes. However, it has been shown that with enough samples the maximum operator yields more accurate node value estimates than averaging. Instead of settling for one of these value estimates, we go a step further and propose a novel backup strategy which uses the power mean operator, computing a value between the average and the maximum. We call our new approach Power-UCT, and argue how the use of the power mean operator helps to speed up learning in MCTS. We theoretically analyze our method, providing guarantees of convergence to the optimum. Finally, we empirically demonstrate the effectiveness of our method on well-known MDP and POMDP benchmarks, showing significant improvements in performance and convergence speed over state-of-the-art algorithms.
【Keywords】: Machine Learning: Reinforcement Learning; Uncertainty in AI: Markov Decision Processes; Uncertainty in AI: Sequential Decision Making;
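The power mean backup described in the Power-UCT abstract interpolates between the average (p = 1) and the maximum (p → ∞) of the child values. A small sketch, assuming non-negative value estimates (the fractional power is ill-defined for negative inputs):

```python
import numpy as np

def power_mean(values, p):
    """Power mean: (mean(v^p))^(1/p).

    p = 1 gives the arithmetic mean; as p grows it approaches the
    maximum of the (non-negative) values, so intermediate p values
    back up something between the average and the max.
    """
    v = np.asarray(values, dtype=float)
    return float(np.mean(v ** p)) ** (1.0 / p)

# Hypothetical child-node value estimates.
vals = [0.2, 0.5, 0.9]
backup_avg = power_mean(vals, 1.0)   # plain UCT-style average backup
backup_mid = power_mean(vals, 4.0)   # between average and maximum
```

The choice of `p` thus controls how greedy the backup is toward the best-looking child, which is the knob the paper analyzes.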
【Paper Link】 【Pages】:2405-2411
【Authors】: Subhadeep Maji ; Rohan Kumar ; Manish Bansal ; Kalyani Roy ; Pawan Goyal
【Abstract】: Systematically discovering semantic relationships in text is an important and extensively studied area in Natural Language Processing, with various tasks such as entailment, semantic similarity, etc. Decomposability of sentence-level scores via subsequence alignments has been proposed as a way to make models more interpretable. We study the problem of aligning components of sentences leading to an interpretable model for semantic textual similarity. In this paper, we introduce a novel pointer network based model with a sentinel gating function to align constituent chunks, which are represented using BERT. We improve this base model with a loss function to equally penalize misalignments in both sentences, ensuring the alignments are bidirectional. Finally, to guide the network with structured external knowledge, we introduce first-order logic constraints based on ConceptNet and syntactic knowledge. The model achieves an F1 score of 97.73 and 96.32 on the benchmark SemEval datasets for the chunk alignment task, showing large improvements over the existing solutions. Source code is available at https://github.com/manishb89/interpretable_sentence_similarity
【Keywords】: Machine Learning: Deep Learning: Sequence Modeling; Natural Language Processing: Natural Language Processing; Natural Language Processing: Natural Language Semantics;
【Paper Link】 【Pages】:2412-2419
【Authors】: Masahiro Kohjima ; Takeshi Kurashima ; Hiroyuki Toda
【Abstract】: Due to the difficulty of comprehensive data collection, created by factors such as privacy protection and sensor device limitations, we often need to analyze incomplete transition data where some information is missing from the ideal (complete) transition data. In this paper, we propose a new method that can estimate, in a unified manner, Markov chain parameters from incomplete transition data that consist of hidden transition data (data from which visited state information is partially hidden) and dropped transition data (data from which some state visits are dropped). A key to developing the method is regarding the hidden and dropped transition data as labeled and unlabeled multi-step transition data, where the labels represent the number of steps required for each transition. This allows us to describe the generative process of multi-step transition data, and thus develop a new probabilistic model. We confirm the effectiveness of the proposal by experiments on synthetic and real data.
【Keywords】: Machine Learning: Learning Generative Models; Machine Learning: Semi-Supervised Learning; Machine Learning: Unsupervised Learning;
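The "multi-step transition" view in the abstract above rests on a standard fact (Chapman-Kolmogorov): if P is the one-step transition matrix of a Markov chain, the k-step transition probabilities are the entries of P^k. A dropped intermediate state visit therefore corresponds to an observation generated from P^2, and so on, which is what makes the labeled/unlabeled multi-step framing possible. A small sketch with an illustrative 3-state chain:

```python
import numpy as np

# A one-step transition matrix over 3 states (each row sums to 1).
# The values are illustrative, not from the paper.
P = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1],
              [0.3, 0.3, 0.4]])

# Probability of being in state j two steps after state i, with the
# intermediate visit "dropped" from the data, is (P @ P)[i, j].
P2 = np.linalg.matrix_power(P, 2)
```

A dropped-transition observation (i, j) with one missing visit is then a draw from row i of `P2`, while a fully observed transition is a draw from row i of `P`.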
【Paper Link】 【Pages】:2420-2426
【Authors】: Zhen Wang ; Chao Lan
【Abstract】: Traditional anomaly detectors examine a single view of instances and cannot discover multi-view anomalies, i.e., instances that exhibit inconsistent behaviors across different views. To tackle this problem, several multi-view anomaly detectors have been developed recently, but they are all transductive and unsupervised and thus face several challenges. In this paper, we propose a novel inductive semi-supervised Bayesian multi-view anomaly detector. Specifically, we first present a generative model for normal data. Then, we build a hierarchical Bayesian model by first assigning priors to all parameters and latent variables, and then assigning priors over the priors. Finally, we employ variational inference to approximate the posterior of the model and evaluate the anomaly scores of multi-view instances. Empirically, we show the proposed Bayesian detector consistently outperforms state-of-the-art counterparts across several public data sets and three well-known types of multi-view anomalies. In theory, we prove the inferred Bayesian estimator is consistent and derive an approximate sample complexity for the proposed anomaly detector.
【Keywords】: Machine Learning: Multi-instance;Multi-label;Multi-view learning; Data Mining: Other;
【Paper Link】 【Pages】:2427-2434
【Authors】: Trung-Hoang Le ; Hady W. Lauw
【Abstract】: Explanations help users make sense of recommendations, increasing the likelihood of adoption. Existing approaches to explainable recommendations tend to rely on rigidly standardized templates, only allowing fill-in-the-blank aspect-level sentiments. For more flexible, literate, and varied explanations that cover various aspects of interest, we propose to synthesize an explanation by selecting snippets from reviews to optimize representativeness and coherence. To fit the target user's aspect preferences, we contextualize the opinions based on a compatible explainable recommendation model. Experiments on datasets of varying product categories showcase the efficacies of our method as compared to baselines based on templates, review summarization, selection, and text generation.
【Keywords】: Machine Learning: Interpretability; Data Mining: Mining Text, Web, Social Media;
【Paper Link】 【Pages】:2435-2441
【Authors】: Sehun Yu ; Dongha Lee ; Hwanjo Yu
【Abstract】: To reliably detect out-of-distribution images with already deployed convolutional neural networks, several recent studies on out-of-distribution detection have tried to define effective confidence scores without retraining the model. Although they have shown promising results, most of them need to find optimal hyperparameter values using a few out-of-distribution images, which eventually assumes a specific test distribution and makes them less practical for real-world applications. In this work, we propose a novel out-of-distribution detection method termed MALCOM, which neither uses any out-of-distribution sample nor retrains the model. Inspired by the observation that global average pooling cannot capture the spatial information of feature maps in convolutional neural networks, our method aims to extract informative sequential patterns from the feature maps. To this end, we introduce a similarity metric that focuses on shared patterns between two sequences, based on the normalized compression distance. In short, MALCOM uses both the global average and the spatial patterns of feature maps to identify out-of-distribution images accurately.
【Keywords】: Machine Learning: Deep Learning: Convolutional networks; Uncertainty in AI: Other;
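The similarity metric in the MALCOM abstract builds on the normalized compression distance, NCD(x, y) = (C(xy) − min(C(x), C(y))) / max(C(x), C(y)), where C(·) is the compressed length of a sequence. A sketch using zlib as the compressor; the paper's exact compressor and feature-map encoding are not specified here, so the inputs below are illustrative byte strings.

```python
import zlib

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte sequences.

    Near 0 when the sequences share most of their patterns (compressing
    the concatenation costs little more than compressing one alone);
    near 1 when they share almost nothing.
    """
    cx = len(zlib.compress(x))
    cy = len(zlib.compress(y))
    cxy = len(zlib.compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)

a = b"abcabcabc" * 40        # a highly repetitive "pattern"
b = bytes(range(256)) * 2    # a very different pattern
```

Two feature-map sequences with shared spatial patterns would score a low `ncd`, which is the signal MALCOM exploits alongside the global average.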
【Paper Link】 【Pages】:2442-2448
【Authors】: Devendra Singh Chaplot ; Lisa Lee ; Ruslan Salakhutdinov ; Devi Parikh ; Dhruv Batra
【Abstract】: Visually-grounded embodied language learning models have recently been shown to be effective at learning multiple multimodal tasks, such as following navigational instructions and answering questions. In this paper, we address two key limitations of these models: (a) the inability to transfer the grounded knowledge across different tasks, and (b) the inability to transfer to new words and concepts not seen during training using only a few examples. We propose a multitask model which facilitates knowledge transfer across tasks by disentangling the knowledge of words and visual attributes in the intermediate representations. We create scenarios and datasets to quantify cross-task knowledge transfer and show that the proposed model outperforms a range of baselines in simulated 3D environments. We also show that this disentanglement of representations makes our model modular and interpretable, which allows for transfer to instructions containing new concepts.
【Keywords】: Machine Learning: Deep Reinforcement Learning; Machine Learning: Transfer, Adaptation, Multi-task Learning; Machine Learning Applications: Applications of Reinforcement Learning;
【Paper Link】 【Pages】:2449-2455
【Authors】: Huiyuan Chen ; Jing Li
【Abstract】: Recommender systems often involve multi-aspect factors. For example, when shopping for shoes online, consumers usually look through their images, ratings, and product reviews before making their decisions. To learn multi-aspect factors, many context-aware models have been developed based on tensor factorizations. However, existing models assume multilinear structures in the tensor data, thus failing to capture nonlinear feature interactions. To fill this gap, we propose a novel nonlinear tensor machine, which combines deep neural networks and tensor algebra to capture nonlinear interactions among multi-aspect factors. We further employ adversarial learning to assist the training of our model. Extensive experiments demonstrate the effectiveness of the proposed model.
【Keywords】: Machine Learning: Recommender Systems; Machine Learning: Tensor and Matrix Methods; Multidisciplinary Topics and Applications: Recommender Systems; Machine Learning: Deep Learning;
【Paper Link】 【Pages】:2456-2462
【Authors】: Shibo Li ; Wei Xing ; Robert M. Kirby ; Shandian Zhe
【Abstract】: Gaussian process regression networks (GPRN) are powerful Bayesian models for multi-output regression, but their inference is intractable. To address this issue, existing methods use a fully factorized structure (or a mixture of such structures) over all the outputs and latent functions for posterior approximation, which, however, can miss the strong posterior dependencies among the latent variables and hurt the inference quality. In addition, the updates of the variational parameters are inefficient and can be prohibitively expensive for a large number of outputs. To overcome these limitations, we propose a scalable variational inference algorithm for GPRN, which not only captures the abundant posterior dependencies but also is much more efficient for massive outputs. We tensorize the output space and introduce tensor/matrix-normal variational posteriors to capture the posterior correlations and to reduce the parameters. We jointly optimize all the parameters and exploit the inherent Kronecker product structure in the variational model evidence lower bound to accelerate the computation. We demonstrate the advantages of our method in several real-world applications.
【Keywords】: Machine Learning: Probabilistic Machine Learning; Uncertainty in AI: Approximate Probabilistic Inference; Uncertainty in AI: Graphical Models;
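The computational saving from the Kronecker structure mentioned in the GPRN abstract follows the standard identity (A ⊗ B) vec(X) = vec(B X Aᵀ), with column-stacking vec: the large Kronecker product never has to be formed. A quick numeric check of that identity (shapes chosen arbitrarily for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))
B = rng.standard_normal((5, 2))
X = rng.standard_normal((2, 3))   # rows match B's columns, cols match A's columns

def vec(M):
    # Column-stacking vectorization.
    return M.reshape(-1, order="F")

lhs = np.kron(A, B) @ vec(X)      # explicitly forms the 20 x 6 Kronecker matrix
rhs = vec(B @ X @ A.T)            # same result without forming it
```

For the tensor/matrix-normal posteriors in the paper, working with the right-hand side reduces both memory and time from the product of the full dimensions to the sum of the factor dimensions, which is what makes massive outputs tractable.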
【Paper Link】 【Pages】:2463-2469
【Authors】: Daoyuan Chen ; Yaliang Li ; Minghui Qiu ; Zhen Wang ; Bofang Li ; Bolin Ding ; Hongbo Deng ; Jun Huang ; Wei Lin ; Jingren Zhou
【Abstract】: Large pre-trained language models such as BERT have shown their effectiveness in various natural language processing tasks. However, the huge parameter size makes them difficult to be deployed in real-time applications that require quick inference with limited resources. Existing methods compress BERT into small models while such compression is task-independent, i.e., the same compressed BERT for all different downstream tasks. Motivated by the necessity and benefits of task-oriented BERT compression, we propose a novel compression method, AdaBERT, that leverages differentiable Neural Architecture Search to automatically compress BERT into task-adaptive small models for specific tasks. We incorporate a task-oriented knowledge distillation loss to provide search hints and an efficiency-aware loss as search constraints, which enables a good trade-off between efficiency and effectiveness for task-adaptive BERT compression. We evaluate AdaBERT on several NLP tasks, and the results demonstrate that those task-adaptive compressed models are 12.7x to 29.3x faster than BERT in inference time and 11.5x to 17.0x smaller in terms of parameter size, while comparable performance is maintained.
【Keywords】: Machine Learning: Deep Learning; Natural Language Processing: Natural Language Processing;
【Paper Link】 【Pages】:2470-2476
【Authors】: Yaqiong Li ; Xuhui Fan ; Ling Chen ; Bin Li ; Zheng Yu ; Scott A. Sisson
【Abstract】: The Dirichlet Belief Network (DirBN) has recently been proposed as a promising approach to learning interpretable deep latent representations for objects. In this work, we leverage its interpretable modelling architecture and propose a deep dynamic probabilistic framework -- the Recurrent Dirichlet Belief Network (Recurrent-DBN) -- to study interpretable hidden structures in dynamic relational data. The proposed Recurrent-DBN has the following merits: (1) it infers interpretable and organised hierarchical latent structures for objects within and across time steps; (2) it enables recurrent long-term temporal dependence modelling, which outperforms the first-order Markov descriptions used in most dynamic probabilistic frameworks; (3) the computational cost scales with the number of positive links only. In addition, we develop a new inference strategy, which first upward-and-backward propagates latent counts and then downward-and-forward samples variables, to enable efficient Gibbs sampling for the Recurrent-DBN. We apply the Recurrent-DBN to dynamic relational data problems. Extensive experimental results on real-world data validate the advantages of the Recurrent-DBN over state-of-the-art models in interpretable latent structure discovery and improved link prediction performance.
【Keywords】: Machine Learning: Probabilistic Machine Learning; Machine Learning: Deep Generative Models;
【Paper Link】 【Pages】:2477-2483
【Authors】: Haobo Wang ; Zhao Li ; Jiaming Huang ; Pengrui Hui ; Weiwei Liu ; Tianlei Hu ; Gang Chen
【Abstract】: Detecting fraud users, who fraudulently promote certain target items, is a challenging issue faced by e-commerce platforms. Generally, many fraud users exhibit several spam behaviors simultaneously, e.g., spam transactions, clicks, and reviews. Existing solutions have two main limitations: 1) the correlations among multiple spam behaviors are neglected; 2) large-scale computations are intractable when dealing with an enormous user set. To remedy these problems, this work proposes a collaboration-based multi-label propagation (CMLP) algorithm. We first introduce a general-purpose version that uses a collaboration technique to exploit label correlations. Specifically, it breaks the final prediction into two parts: 1) its own prediction part; 2) the prediction of others, i.e., the collaborative part. Then, to accelerate it on large-scale e-commerce data, we propose a heterogeneous-graph-based variant that detects communities on the user-item graph directly. Both theoretical analysis and empirical results clearly validate the effectiveness and scalability of our proposals.
【Keywords】: Machine Learning: Multi-instance;Multi-label;Multi-view learning;
【Paper Link】 【Pages】:2484-2490
【Authors】: Peng Han ; Zhongxiao Li ; Yong Liu ; Peilin Zhao ; Jing Li ; Hao Wang ; Shuo Shang
【Abstract】: Point-of-interest (POI) recommendation has become an increasingly important sub-field of recommendation system research. Previous methods employ various assumptions to exploit contextual information for improving recommendation accuracy. The property they share is that similar users are more likely to visit similar POIs, and similar POIs are likely to be visited by the same user. However, none of the existing methods utilizes similarity explicitly to make recommendations. In this paper, we propose a new framework for POI recommendation that explicitly utilizes similarity together with contextual information. Specifically, we categorize the contextual information into two groups, i.e., global and local context, and develop different regularization terms to incorporate them into the recommendation. A graph Laplacian regularization term is utilized to exploit the global context information. Moreover, we cluster users into different groups and let the objective function constrain users in the same group to have similar predicted POI ratings. An alternating optimization method is developed to optimize our model and obtain the final rating matrix. Our experimental results show that our algorithm outperforms all the state-of-the-art methods.
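As background, a graph Laplacian regularization term of the kind mentioned above has the standard closed form tr(X^T L X) with L = D - W, which is small exactly when rows of X connected by large weights in W are similar. A minimal sketch (the variable names are illustrative, not taken from the paper):

```python
import numpy as np

def laplacian_regularizer(X, W):
    """tr(X^T L X) with L = D - W.

    X : (n, d) matrix of predictions/embeddings, one row per node.
    W : (n, n) symmetric non-negative affinity matrix.
    Equals 0.5 * sum_ij W_ij * ||x_i - x_j||^2, so connected rows
    are pushed toward each other when this term is minimized.
    """
    D = np.diag(W.sum(axis=1))   # degree matrix
    L = D - W                    # unnormalized graph Laplacian
    return float(np.trace(X.T @ L @ X))
```

Two connected nodes with identical rows contribute nothing; disagreement between them is penalized quadratically.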
【Keywords】: Machine Learning: Recommender Systems; Multidisciplinary Topics and Applications: Transportation; Data Mining: Mining Spatial, Temporal Data;
【Paper Link】 【Pages】:2491-2497
【Authors】: Zhengxu Yu ; Shuxian Liang ; Long Wei ; Zhongming Jin ; Jianqiang Huang ; Deng Cai ; Xiaofei He ; Xian-Sheng Hua
【Abstract】: Urban traffic light control is an important and challenging real-world problem. By regarding intersections as agents, most Reinforcement Learning (RL) based methods generate the agents' actions independently, which can cause action conflicts and result in overflow or wasted road resources at adjacent intersections. Recently, some collaborative methods have alleviated these problems by extending the observable surroundings of agents, which can be considered inactive cross-agent communication methods. However, when agents act synchronously in these works, the perceived action value is biased and the information exchanged is insufficient. In this work, we propose a novel Multi-agent Communication and Action Rectification (MaCAR) framework. It enables active communication between agents by considering the impact of agents' synchronous actions. MaCAR consists of two parts: (1) an active Communication Agent Network (CAN) involving a Message Propagation Graph Neural Network (MPGNN); (2) a Traffic Forecasting Network (TFN) which learns to predict the traffic after agents' synchronous actions and the corresponding action values. By using the predicted information, we mitigate the action value bias during training to help rectify agents' future actions. In experiments, we show that our proposal can outperform state-of-the-art methods on both synthetic and real-world datasets.
【Keywords】: Machine Learning: Deep Reinforcement Learning; Agent-based and Multi-agent Systems: Coordination and Cooperation; Multidisciplinary Topics and Applications: Transportation; Machine Learning Applications: Applications of Reinforcement Learning;
【Paper Link】 【Pages】:2498-2504
【Authors】: Xiao Zhang ; Shizhong Liao
【Abstract】: Online kernel selection in a continuous kernel space is more complex than in a discrete kernel set. Existing online kernel selection approaches for continuous kernel spaces have per-round computational complexities that are linear in the current number of rounds, and they lack sublinear regret guarantees because of the continuum of candidate kernels. To address these issues, we propose a novel hypothesis sketching approach to online kernel selection in continuous kernel space, which has constant computational complexity at each round and enjoys a sublinear regret bound. The main idea of the proposed hypothesis sketching approach is to maintain the orthogonality of the basis functions and the prediction accuracy of the hypothesis sketches in a time-varying reproducing kernel Hilbert space. We first present an efficient dependency condition to maintain the basis functions of the hypothesis sketches under a computational budget. Then we update the weights and the optimal kernels by minimizing the instantaneous loss of the hypothesis sketches using online gradient descent with a compensation strategy. We prove that the proposed hypothesis sketching approach enjoys a regret bound of order O(√T) for online kernel selection in continuous kernel space, which is optimal for convex loss functions, where T is the number of rounds, and it reduces the per-round computational complexity from linear to constant with respect to the number of rounds. Experimental results demonstrate that the proposed hypothesis sketching approach significantly improves the efficiency of online kernel selection in continuous kernel space while retaining comparable predictive accuracy.
【Keywords】: Machine Learning: Kernel Methods; Machine Learning: Learning Theory; Machine Learning: Online Learning;
【Paper Link】 【Pages】:2505-2511
【Authors】: Ishan Durugkar ; Elad Liebman ; Peter Stone
【Abstract】: In multiagent reinforcement learning scenarios, it is often the case that independent agents must jointly learn to perform a cooperative task. This paper focuses on such a scenario in which agents have individual preferences regarding how to accomplish the shared task. We consider a framework for this setting that balances individual preferences against task rewards using a linear mixing scheme. In our theoretical analysis, we establish that agents can reach an equilibrium that leads to optimal shared-task reward even when they consider individual preferences that are not fully aligned with the task. We then show empirically, somewhat counter-intuitively, that there exist mixing schemes that outperform a purely task-oriented baseline. We further consider empirically how to optimize the mixing scheme.
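A linear mixing scheme of the kind described is a one-line convex combination; the coefficient name `lam` below is a hypothetical choice for exposition, not notation from the paper:

```python
def mixed_reward(task_reward, preference_reward, lam):
    """Convex combination of shared-task reward and individual preference.

    lam in [0, 1]: 0 recovers the purely task-oriented baseline,
    1 would follow individual preferences exclusively.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return (1.0 - lam) * task_reward + lam * preference_reward
```

Each agent would optimize its own mixed reward; the paper's analysis concerns when this still yields optimal shared-task reward.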
【Keywords】: Machine Learning: Reinforcement Learning; Agent-based and Multi-agent Systems: Coordination and Cooperation; Agent-based and Multi-agent Systems: Multi-agent Learning;
【Paper Link】 【Pages】:2512-2518
【Authors】: Jia Zhang ; Yidong Lin ; Min Jiang ; Shaozi Li ; Yong Tang ; Kay Chen Tan
【Abstract】: Information-theoretic methods have attracted great attention in recent years and achieved promising results in dealing with high-dimensional multi-label data. However, most of the existing methods are either directly transformed from heuristic single-label feature selection methods or inefficient in exploiting labeling information. Thus, they may not be able to obtain an optimal feature selection result shared by multiple labels. In this paper, we propose a general global optimization framework in which feature relevance, label relevance (i.e., label correlation), and feature redundancy are taken into account, thus facilitating multi-label feature selection. Moreover, the proposed method has an excellent mechanism for utilizing the inherent properties of multi-label learning. Specifically, we provide a formulation to extend the proposed method with label-specific features. Empirical studies on twenty multi-label data sets reveal the effectiveness and efficiency of the proposed method. Our implementation of the proposed method is available online at: https://jiazhang-ml.pub/GRRO-master.zip.
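The relevance/redundancy trade-off underlying such frameworks is commonly realized greedily: at each step, pick the feature whose relevance minus average redundancy with the already-selected features is largest. The sketch below assumes precomputed relevance and redundancy scores and illustrates this generic scheme, not the paper's global optimization formulation:

```python
import numpy as np

def greedy_select(relevance, redundancy, k):
    """Greedy relevance-minus-redundancy feature selection.

    relevance  : list of per-feature relevance scores (e.g. mutual information with labels).
    redundancy : symmetric matrix of pairwise feature redundancy scores.
    k          : number of features to select.
    """
    n = len(relevance)
    selected, remaining = [], list(range(n))
    for _ in range(k):
        best, best_score = None, -np.inf
        for j in remaining:
            # average redundancy with what we already picked (0 on the first step)
            red = np.mean([redundancy[j][s] for s in selected]) if selected else 0.0
            score = relevance[j] - red
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
        remaining.remove(best)
    return selected
```

A highly relevant feature that duplicates an already-selected one loses to a weaker but novel feature.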
【Keywords】: Machine Learning: Feature Selection; Learning Sparse Models; Data Mining: Feature Extraction, Selection and Dimensionality Reduction;
【Paper Link】 【Pages】:2519-2525
【Authors】: Ruobing Xie ; Cheng Ling ; Yalong Wang ; Rui Wang ; Feng Xia ; Leyu Lin
【Abstract】: Both explicit and implicit feedback can reflect user opinions on items, which is essential for learning user preferences in recommendation. However, most current recommendation algorithms merely focus on implicit positive feedback (e.g., clicks), ignoring other informative user behaviors. In this paper, we aim to jointly consider explicit/implicit and positive/negative feedback to learn users' unbiased preferences for recommendation. Specifically, we propose a novel Deep Feedback Network (DFN) that models click, unclick and dislike behaviors. DFN has an internal feedback interaction component that captures fine-grained interactions between individual behaviors, and an external feedback interaction component that uses precise but relatively rare feedback (click/dislike) to extract useful information from rich but noisy feedback (unclick). In experiments, we conduct both offline and online evaluations on a real-world recommendation system, WeChat Top Stories, used by millions of users. The significant improvements verify the effectiveness and robustness of DFN. The source code is available at https://github.com/qqxiaochongqq/DFN.
【Keywords】: Machine Learning: Recommender Systems; Multidisciplinary Topics and Applications: Recommender Systems;
【Paper Link】 【Pages】:2526-2532
【Authors】: Yiyang Zhang ; Feng Liu ; Zhen Fang ; Bo Yuan ; Guangquan Zhang ; Jie Lu
【Abstract】: In unsupervised domain adaptation (UDA), classifiers for the target domain are trained with massive true-label data from the source domain and unlabeled data from the target domain. However, it may be difficult to collect fully-true-label data in a source domain given a limited budget. To mitigate this problem, we consider a novel problem setting, budget-friendly UDA (BFUDA), where the classifier for the target domain has to be trained with complementary-label data from the source domain and unlabeled data from the target domain. The key benefit is that collecting complementary-label source data (required by BFUDA) is much less costly than collecting the true-label source data (required by ordinary UDA). To this end, the complementary label adversarial network (CLARINET) is proposed to solve the BFUDA problem. CLARINET maintains two deep networks simultaneously, where one focuses on classifying complementary-label source data and the other takes care of source-to-target distributional adaptation. Experiments show that CLARINET significantly outperforms a series of competent baselines.
【Keywords】: Machine Learning: Transfer, Adaptation, Multi-task Learning; Machine Learning: Semi-Supervised Learning; Machine Learning: Deep Learning;
【Paper Link】 【Pages】:2533-2539
【Authors】: Zihan Zhang ; Mingxuan Liu ; Chao Zhang ; Yiming Zhang ; Zhou Li ; Qi Li ; Haixin Duan ; Donghong Sun
【Abstract】: Natural language processing (NLP) models are known to be vulnerable to adversarial examples, similar to image processing models. Studying adversarial texts is an essential step to improve the robustness of NLP models. However, existing studies mainly focus on analyzing English texts and generating adversarial examples for English texts. There is no work studying the possibility and effect of transferring these methods to another language, e.g., Chinese. In this paper, we analyze the differences between Chinese and English and explore a methodology to transfer the existing English adversarial generation method to Chinese. We propose Argot, a novel black-box adversarial Chinese text generation solution, built on the method for adversarial English samples together with several novel methods developed for Chinese characteristics. Argot can effectively and efficiently generate adversarial Chinese texts with good readability. Furthermore, Argot can also automatically generate targeted Chinese adversarial texts, achieving a high success rate while ensuring the readability of the generated text.
【Keywords】: Machine Learning: Adversarial Machine Learning; Natural Language Processing: Natural Language Processing;
【Paper Link】 【Pages】:2540-2546
【Authors】: Renjun Xu ; Pelen Liu ; Yin Zhang ; Fang Cai ; Jindong Wang ; Shuoying Liang ; Heting Ying ; Jianwei Yin
【Abstract】: Domain adaptation (DA) has achieved resounding success in learning a good classifier by leveraging labeled data from a source domain to adapt to an unlabeled target domain. However, in the more general setting where the target domain contains classes never observed in the source domain, namely Open Set Domain Adaptation (OSDA), existing DA methods fail because of the interference of the extra unknown classes. This is a much more challenging problem, since it can easily result in negative transfer due to the mismatch between the unknown and known classes. Existing approaches are susceptible to misclassification when unknown target-domain samples are distributed near the decision boundary learned from the labeled source domain. To overcome this, we propose Joint Partial Optimal Transport (JPOT), which fully utilizes information from not only the labeled source domain but also the discriminative representation of the unknown class in the target domain. The proposed joint discriminative prototypical compactness loss can not only achieve intra-class compactness and inter-class separability, but also estimate the mean and variance of the unknown class through backpropagation, which remains intractable for previous methods due to their blindness to the structure of the unknown classes. To the best of our knowledge, this is the first optimal transport model for OSDA. Extensive experiments demonstrate that our proposed model can significantly boost the performance of open set domain adaptation on standard DA datasets.
【Keywords】: Machine Learning: Transfer, Adaptation, Multi-task Learning; Machine Learning: Deep Learning; Machine Learning: Unsupervised Learning;
【Paper Link】 【Pages】:2547-2553
【Authors】: Yongqiao Wang ; Xudong Liu
【Abstract】: Multivariate probability calibration is the problem of predicting class membership probabilities from the classification scores of multiple classifiers. To achieve better performance, the calibrating function is often required to be coordinate-wise non-decreasing; that is, for every classifier, the higher the score, the higher the probability of the class label being positive. To this end, we propose a multivariate regression method based on shape-restricted Bernstein polynomials. This method is universally flexible: it can approximate any continuous calibrating function to any specified error as the polynomial degree increases to infinity. Moreover, it is universally consistent: the estimated calibrating function converges to any continuous calibrating function as the training size increases to infinity. Our empirical study shows that the proposed method achieves better calibrating performance than benchmark methods.
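A univariate special case illustrates the shape restriction: a Bernstein polynomial whose coefficients are non-decreasing and lie in [0, 1] is itself non-decreasing, so it is a valid monotone calibrating map. A minimal sketch under that assumption (function and variable names are illustrative):

```python
import numpy as np
from math import comb

def bernstein_calibrate(score, coef):
    """Evaluate a degree-n Bernstein polynomial at a score in [0, 1].

    coef : n+1 coefficients; if non-decreasing and in [0, 1],
    the resulting calibrating map is non-decreasing on [0, 1].
    """
    n = len(coef) - 1
    s = np.clip(score, 0.0, 1.0)
    # Bernstein basis: C(n, k) * s^k * (1 - s)^(n - k)
    basis = np.array([comb(n, k) * s**k * (1 - s)**(n - k) for k in range(n + 1)])
    return float(np.dot(coef, basis))
```

With coefficients k/n the map is the identity; bending the coefficients reshapes the calibration curve while preserving monotonicity.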
【Keywords】: Machine Learning: Ensemble Methods;
【Paper Link】 【Pages】:2554-2560
【Authors】: Huanle Xu ; Yang Liu ; Wing Cheong Lau ; Rui Li
【Abstract】: The problem of the multi-armed bandit (MAB) with fairness constraints has emerged as an important research topic recently. For such problems, one common objective is to maximize the total reward within a fixed number of pulls, while satisfying the fairness requirement of a minimum selection fraction for each individual arm in the long run. Previous works have made substantial advancements in designing efficient online selection solutions; however, they fail to achieve a sublinear regret bound when incorporating such fairness constraints. In this paper, we study a combinatorial MAB problem with a concave objective and fairness constraints. In particular, we adopt a new approach that combines online convex optimization with bandit methods to design selection algorithms. Our algorithm is computationally efficient and, more importantly, manages to achieve a sublinear regret bound with probability guarantees. Finally, we evaluate the performance of our algorithm via extensive simulations and demonstrate that it outperforms the baselines substantially.
【Keywords】: Machine Learning: Online Learning; Uncertainty in AI: Other;
【Paper Link】 【Pages】:2561-2567
【Authors】: Yinan Zhang ; Yong Liu ; Peng Han ; Chunyan Miao ; Lizhen Cui ; Baoli Li ; Haihong Tang
【Abstract】: Cross-domain recommendation methods usually transfer knowledge across different domains implicitly, by sharing model parameters or learning parameter mappings in the latent space. Differing from previous studies, this paper focuses on learning explicit mapping between a user's behaviors (i.e. interaction itemsets) in different domains during the same temporal period. In this paper, we propose a novel deep cross-domain recommendation model, called Cycle Generation Networks (CGN). Specifically, CGN employs two generators to construct the dual-direction personalized itemset mapping between a user's behaviors in two different domains over time. The generators are learned by optimizing the distance between the generated itemset and the real interacted itemset, as well as the cycle-consistent loss defined based on the dual-direction generation procedure. We have performed extensive experiments on real datasets to demonstrate the effectiveness of the proposed model, comparing with existing single-domain and cross-domain recommendation methods.
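The cycle-consistent loss mentioned above follows the usual pattern: mapping an itemset representation to the other domain and back should recover the original. A generic sketch with hypothetical generator callables `G_ab` and `G_ba` (this is the standard cycle-consistency term, not CGN's full objective):

```python
import numpy as np

def cycle_consistency_loss(x_a, G_ab, G_ba):
    """Mean absolute reconstruction error ||G_ba(G_ab(x)) - x||_1 / dim.

    x_a  : representation in domain A (e.g. an itemset embedding).
    G_ab : generator mapping domain A -> domain B.
    G_ba : generator mapping domain B -> domain A.
    """
    return float(np.abs(G_ba(G_ab(x_a)) - x_a).mean())
```

A pair of generators that exactly invert each other drives this term to zero; training both directions jointly regularizes the learned mapping.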
【Keywords】: Machine Learning: Recommender Systems; Multidisciplinary Topics and Applications: Recommender Systems;
【Paper Link】 【Pages】:2568-2574
【Authors】: Yun-Peng Liu ; Ning Xu ; Yu Zhang ; Xin Geng
【Abstract】: The performance of deep neural networks (DNNs) crucially relies on the quality of labeling. In some situations, labels are easily corrupted and therefore become noisy. Designing algorithms that deal with noisy labels is thus of great importance for learning robust DNNs. However, it is difficult to distinguish between clean labels and noisy labels, which becomes the bottleneck of many methods. To address this problem, this paper proposes a novel method named Label Distribution based Confidence Estimation (LDCE). LDCE estimates the confidence of the observed labels based on label distributions. The boundary between clean labels and noisy labels then becomes clear according to the confidence scores. To verify the effectiveness of the method, LDCE is combined with an existing learning algorithm to train robust DNNs. Experiments on both synthetic and real-world datasets substantiate the superiority of the proposed algorithm against state-of-the-art methods.
【Keywords】: Machine Learning: Classification; Machine Learning: Deep Learning;
【Paper Link】 【Pages】:2575-2582
【Authors】: Zhaoyang Liu ; Haokun Chen ; Fei Sun ; Xu Xie ; Jinyang Gao ; Bolin Ding ; Yanyan Shen
【Abstract】: Accurately characterizing the user's current interest is the core of recommender systems. However, users' interests are dynamic and affected by both intent factors and preference factors. The intent factors imply users' current needs and change across visits. The preference factors are relatively stable and learned continuously over time. Existing works either resort to sequential recommendation to model the current browsing intent and historical preference separately, or simply mix up these two factors during online learning. In this paper, we propose a novel learning strategy named FLIP to decouple the learning of intent and preference in the online setting. The learning of the intent is treated as a meta-learning task and adapts quickly to the current browsing session; the learning of the preference is based on the calibrated user intent and constantly updated over time. We conducted experiments on two public datasets and a real-world recommender system. When equipped with modern recommendation methods, FLIP demonstrates significant improvements over strong baselines.
【Keywords】: Machine Learning: Recommender Systems; Multidisciplinary Topics and Applications: Recommender Systems;
【Paper Link】 【Pages】:2583-2590
【Authors】: Xiaotong Lu ; Han Huang ; Weisheng Dong ; Xin Li ; Guangming Shi
【Abstract】: Network pruning has been proposed as a remedy for alleviating the over-parameterization problem of deep neural networks. However, its value has recently been challenged, especially from the perspective of neural architecture search (NAS). We challenge the conventional wisdom of pruning-after-training by proposing a joint search-and-training approach that directly learns a compact network from scratch. By treating pruning as a search strategy, we present two new insights in this paper: 1) it is possible to expand the search space of network pruning by associating each filter with a learnable weight; 2) joint search-and-training can be conducted iteratively to maximize learning efficiency. More specifically, we propose a coarse-to-fine tuning strategy to iteratively sample and update compact sub-networks to approximate the target network. The weights associated with network filters are accordingly updated by joint search-and-training to reflect learned knowledge in the NAS space. Moreover, we introduce strategies of random perturbation (inspired by Monte Carlo methods) and flexible thresholding (inspired by Reinforcement Learning) to adjust the weight and size of each layer. Extensive experiments on ResNet and VGGNet demonstrate the superior performance of our proposed method on popular datasets including CIFAR10, CIFAR100 and ImageNet.
【Keywords】: Machine Learning: Deep Learning; Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation; Machine Learning: Deep Learning: Convolutional networks;
【Paper Link】 【Pages】:2591-2597
【Authors】: Anjing Luo ; Pengpeng Zhao ; Yanchi Liu ; Fuzhen Zhuang ; Deqing Wang ; Jiajie Xu ; Junhua Fang ; Victor S. Sheng
【Abstract】: Session-based recommendation has become a research hotspot for its ability to make recommendations for anonymous users. However, existing session-based methods have the following limitations: (1) they either lack the capability to learn complex dependencies or focus mostly on the current session without explicitly considering collaborative information; (2) they assume that the representation of an item is static and fixed for all users at each time step. We argue that even the same item can be represented differently for different users at the same time step. To this end, we propose a novel solution, the Collaborative Self-Attention Network (CoSAN), for session-based recommendation, which learns the session representation and predicts the intent of the current session by investigating neighborhood sessions. Specifically, we first devise a collaborative item representation by aggregating the embeddings of neighborhood sessions retrieved according to each item in the current session. Then, we apply self-attention to learn long-range dependencies between collaborative items and generate a collaborative session representation. Finally, each session is represented by concatenating the collaborative session representation and the embedding of the current session. Extensive experiments on two real-world datasets show that CoSAN consistently outperforms state-of-the-art methods.
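The self-attention step applied over collaborative items is, at its core, standard scaled dot-product attention. A minimal single-head numpy sketch under that assumption (the projection matrices `Wq`, `Wk`, `Wv` are illustrative, not CoSAN's actual parameterization):

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence of item embeddings.

    X : (T, d) sequence of item embeddings.
    Returns (T, d_v): each position is a weighted mix of all positions' values.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])            # pairwise similarity, scaled
    scores = scores - scores.max(axis=1, keepdims=True)  # stabilize the softmax
    A = np.exp(scores)
    A = A / A.sum(axis=1, keepdims=True)              # attention weights, rows sum to 1
    return A @ V
```

Because every position attends to every other, long-range dependencies within the session are captured in one step.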
【Keywords】: Machine Learning: Recommender Systems; Machine Learning: Deep Learning: Sequence Modeling;
【Paper Link】 【Pages】:2598-2604
【Authors】: Lei Luo ; Jian Pei ; Heng Huang
【Abstract】: This paper introduces a novel Robust Regression (RR) model, named Sinkhorn regression, which imposes Sinkhorn distances on both the loss function and the regularization. Traditional RR methods search for an element-wise loss function (e.g., an Lp-norm) to characterize the errors such that outlying data have a relatively smaller influence on the regression estimator. Because they neglect geometric information, they often lead to suboptimal results in practical applications. To address this problem, we use a cross-bin distance function, the Sinkhorn distance, to capture the geometric knowledge of real data. Sinkhorn distances are invariant to translation, rotation and scaling; thus, our method is more robust to variations of data than traditional regression models. Meanwhile, we leverage the Kullback-Leibler divergence to relax the proposed model with marginal constraints into its unbalanced formulation, to accommodate more types of features. In addition, we propose an efficient algorithm to solve the relaxed model and establish its complete statistical guarantees under mild conditions. Experiments on five publicly available microarray data sets and one mass spectrometry data set demonstrate the effectiveness and robustness of our method.
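Sinkhorn distances themselves are computed by alternately rescaling the rows and columns of a Gibbs kernel until the transport plan matches both marginals. A minimal sketch of that classic iteration (the parameters `eps` and `n_iter` are illustrative; this is the generic entropy-regularized optimal transport computation, not the paper's full regression model):

```python
import numpy as np

def sinkhorn_distance(a, b, C, eps=0.1, n_iter=200):
    """Entropy-regularized OT cost between histograms a and b under cost matrix C.

    a, b : non-negative weight vectors summing to the same mass.
    C    : (len(a), len(b)) ground-cost matrix.
    """
    K = np.exp(-C / eps)              # Gibbs kernel
    u = np.ones_like(a, dtype=float)
    for _ in range(n_iter):           # alternate marginal-matching rescalings
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]   # transport plan with (approx.) marginals a, b
    return float((P * C).sum())
```

Identical histograms under a zero-diagonal cost incur (nearly) no cost; fully disjoint mass must be transported across bins and pays the full ground cost.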
【Keywords】: Machine Learning: Classification; Machine Learning Applications: Other;
【Paper Link】 【Pages】:2605-2611
【Authors】: Yijing Luo ; Bo Han ; Chen Gong
【Abstract】: In practice, we often face the dilemma that some of the examples used to train a classifier are incorrectly labeled due to various subjective and objective factors. Although intensive efforts have been devoted to designing classifiers that are robust to label noise, most previous methods have not fully utilized data distribution information. To address this issue, this paper introduces a bi-level learning paradigm termed "Spectral Cluster Discovery" (SCD) for combating noisy labels. Namely, we simultaneously learn a robust classifier (Learning stage) by discovering the low-rank approximation to the ground-truth label matrix, and learn an ideal affinity graph (Clustering stage). Specifically, we use the learned classifier to assign examples with similar labels to a common cluster, and based on the cluster membership, we utilize the learned affinity graph to identify the noisy examples. Both stages reinforce each other iteratively. Experimental results on typical benchmark and real-world datasets verify the superiority of SCD over other label noise learning methods.
【Keywords】: Machine Learning: Classification; Machine Learning: Clustering;
【Paper Link】 【Pages】:2612-2618
【Authors】: Ziwei Li ; Gengyu Lyu ; Songhe Feng
【Abstract】: Partial Multi-Label Learning (PML) aims to learn from the training data where each instance is associated with a set of candidate labels, among which only a part of them are relevant. Existing PML methods mainly focus on label disambiguation, while they lack the consideration of noise in the feature space. To tackle the problem, we propose a novel framework named partial multi-label learning via MUlti-SubspacE Representation (MUSER), where the redundant labels together with noisy features are jointly taken into consideration during the training process. Specifically, we first decompose the original label space into a latent label subspace and a label correlation matrix to reduce the negative effects of redundant labels, then we utilize the correlations among features to project the original noisy feature space to a feature subspace to resist the noisy feature information. Afterwards, we introduce a graph Laplacian regularization to constrain the label subspace to keep intrinsic structure among features and impose an orthogonality constraint on the correlations among features to guarantee discriminability of the feature subspace. Extensive experiments conducted on various datasets demonstrate the superiority of our proposed method.
【Keywords】: Machine Learning: Multi-instance;Multi-label;Multi-view learning; Data Mining: Feature Extraction, Selection and Dimensionality Reduction;
【Paper Link】 【Pages】:2619-2625
【Authors】: Hang Li ; Chen Ma ; Wei Xu ; Xue Liu
【Abstract】: Building compact convolutional neural networks (CNNs) with reliable performance is a critical but challenging task, especially when deploying them in real-world applications. As a common approach to reducing the size of CNNs, pruning methods delete some of the CNN filters according to a metric such as the l1-norm. However, previous methods hardly leverage the information variance within a single feature map and the similarity characteristics among feature maps. In this paper, we propose a novel filter pruning method which incorporates two kinds of feature map selection: diversity-aware feature selection (DFS) and similarity-aware feature selection (SFS). DFS aims to discover features with low information diversity, while SFS removes features that have high similarities with others. We conduct extensive empirical experiments with various CNN architectures on publicly available datasets. The experimental results demonstrate that our model obtains up to a 91.6% parameter decrease and an 83.7% FLOPs reduction with almost no accuracy loss.
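The l1-norm baseline criterion the abstract contrasts against can be sketched in a few lines: rank each filter by the sum of the absolute values of its weights and keep the top fraction. The names below are illustrative:

```python
import numpy as np

def prune_filters_l1(filters, keep_ratio):
    """Baseline l1-norm filter pruning criterion.

    filters    : (n_filters, in_ch, kh, kw) convolution weight tensor.
    keep_ratio : fraction of filters to keep, in (0, 1].
    Returns the sorted indices of the filters that survive.
    """
    norms = np.abs(filters.reshape(filters.shape[0], -1)).sum(axis=1)  # l1-norm per filter
    n_keep = max(1, int(round(keep_ratio * filters.shape[0])))
    keep = np.argsort(-norms)[:n_keep]   # largest-norm filters survive
    return np.sort(keep)
```

The paper's point is that such magnitude ranking ignores feature-map diversity and similarity, which DFS and SFS exploit instead.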
【Keywords】: Machine Learning: Deep Learning: Convolutional networks; Machine Learning: Feature Selection; Learning Sparse Models; Machine Learning: Deep Learning; Data Mining: Feature Extraction, Selection and Dimensionality Reduction;
【Paper Link】 【Pages】:2626-2632
【Authors】: Changsheng Li ; Handong Ma ; Zhao Kang ; Ye Yuan ; Xiao-Yu Zhang ; Guoren Wang
【Abstract】: Unsupervised active learning has attracted increasing attention in recent years; its goal is to select representative samples in an unsupervised setting for human annotation. Most existing works are based on shallow linear models, assuming that each sample can be well approximated by the span (i.e., the set of all linear combinations) of certain selected samples, which are then taken as representative ones to label. In practice, however, the data do not necessarily conform to linear models, and modeling the nonlinearity of data often becomes the key to success. In this paper, we present DUAL, a novel Deep neural network framework for Unsupervised Active Learning. DUAL explicitly learns a nonlinear embedding that maps each input into a latent space through an encoder-decoder architecture, and introduces a selection block to select representative samples in the learnt latent space. In the selection block, DUAL simultaneously preserves the whole input patterns and the cluster structure of the data. Extensive experiments are performed on six publicly available datasets, and the experimental results clearly demonstrate the efficacy of our method compared with state-of-the-art methods.
【Keywords】: Machine Learning: Active Learning;
【Paper Link】 【Pages】:2633-2639
【Authors】: Erik A. Daxberger ; Anastasia Makarova ; Matteo Turchetta ; Andreas Krause
【Abstract】: The optimization of expensive-to-evaluate, black-box, mixed-variable functions, i.e., functions that have both continuous and discrete inputs, is a difficult and yet pervasive problem in science and engineering. In Bayesian optimization (BO), special cases of this problem that consider fully continuous or fully discrete domains have been widely studied. However, few methods exist for mixed-variable domains, and none of them can handle the discrete constraints that arise in many real-world applications. In this paper, we introduce MiVaBo, a novel BO algorithm for the efficient optimization of mixed-variable functions that combines a linear surrogate model based on expressive feature representations with Thompson sampling. We propose an effective method to optimize its acquisition function, a challenging problem for mixed-variable domains, making MiVaBo the first BO method that can handle complex constraints over the discrete variables. Moreover, we provide the first convergence analysis of a mixed-variable BO algorithm. Finally, we show that MiVaBo is significantly more sample-efficient than state-of-the-art mixed-variable BO algorithms on several hyperparameter tuning tasks, including the tuning of deep generative models.
【Keywords】: Machine Learning: Bayesian Optimization; Machine Learning: Active Learning; Machine Learning: Probabilistic Machine Learning;
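The surrogate-plus-Thompson-sampling loop described in the abstract can be sketched with a minimal Bayesian linear model. This is a generic illustration only: MiVaBo's feature design, acquisition optimizer, and constraint handling are not reproduced here, and all names and hyperparameters are hypothetical.

```python
import numpy as np

def thompson_step(Phi, y, candidates, noise=0.1, prior=1.0, seed=0):
    """One Thompson-sampling step with a Bayesian linear surrogate:
    sample weights from the Gaussian posterior over the linear model,
    then pick the candidate maximizing the sampled model."""
    rng = np.random.default_rng(seed)
    d = Phi.shape[1]
    A = Phi.T @ Phi / noise**2 + np.eye(d) / prior**2   # posterior precision
    cov = np.linalg.inv(A)
    mean = cov @ Phi.T @ y / noise**2                   # posterior mean
    w = rng.multivariate_normal(mean, cov)              # Thompson sample
    return int(np.argmax(candidates @ w))               # next point to evaluate

# Toy data: feature 1 correlates with high objective values, feature 2 does not.
Phi = np.array([[1.0, 0.0]] * 20 + [[0.0, 1.0]] * 20)
y = np.array([1.0] * 20 + [0.0] * 20)
candidates = np.array([[1.0, 0.0], [0.0, 1.0]])
print(thompson_step(Phi, y, candidates))  # 0: the promising candidate
```

In a mixed-variable setting the candidate set would range over feasible discrete configurations, which is where constraint-aware acquisition optimization becomes the hard part.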
【Paper Link】 【Pages】:2640-2646
【Authors】: André Gustavo Maletzke ; Waqar Hassan ; Denis Moreira dos Reis ; Gustavo E. A. P. A. Batista
【Abstract】: Quantification is a task similar to classification in the sense that it learns from a labeled training set. However, quantification is not interested in predicting the class of each observation, but rather in measuring the class distribution in the test set. The community has developed performance measures and experimental setups tailored to quantification tasks. Nonetheless, we argue that a critical variable, the size of the test sets, remains ignored. Such disregard has three main detrimental effects. First, it implicitly assumes that quantifiers will perform equally well for different test set sizes. Second, it increases the risk of cherry-picking by selecting a test set size for which a particular proposal performs best. Finally, it disregards the importance of designing methods that are suitable for different test set sizes. We discuss these issues with the support of one of the broadest experimental evaluations ever performed, with three main outcomes. (i) We empirically demonstrate the importance of the test set size when assessing quantifiers. (ii) We show that current quantifiers generally have mediocre performance on the smallest test sets. (iii) We propose a metalearning scheme to select the best quantifier based on the test set size, which can outperform the best single quantification method.
【Keywords】: Machine Learning: Other; Data Mining: Other;
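For context, the quantification task itself is easy to state in code. Below are two classical baselines, classify-and-count and its adjusted variant; these are standard textbook quantifiers, not the paper's metalearning scheme, and the example numbers are made up.

```python
def classify_and_count(predictions):
    """Estimate positive-class prevalence as the fraction predicted positive."""
    return sum(predictions) / len(predictions)

def adjusted_count(predictions, tpr, fpr):
    """Correct the raw count using the classifier's known true/false
    positive rates, inverting p_obs = tpr*p + fpr*(1-p)."""
    p = (classify_and_count(predictions) - fpr) / (tpr - fpr)
    return min(1.0, max(0.0, p))

preds = [1, 1, 0, 0, 1, 0, 0, 0]  # hypothetical binary predictions on a test set
print(classify_and_count(preds))                # 0.375
print(adjusted_count(preds, tpr=0.9, fpr=0.1))  # ≈ 0.34375
```

On a small test set both estimates are computed from very few predictions and are therefore highly variable, which is exactly the regime the paper shows current quantifiers handle poorly.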
【Paper Link】 【Pages】:2647-2654
【Authors】: Rati Devidze ; Farnam Mansouri ; Luis Haug ; Yuxin Chen ; Adish Singla
【Abstract】: Machine teaching studies the interaction between a teacher and a student/learner where the teacher selects training examples for the learner to learn a specific task. The typical assumption is that the teacher has perfect knowledge of the task---this knowledge comprises knowing the desired learning target, having the exact task representation used by the learner, and knowing the parameters capturing the learning dynamics of the learner. Inspired by real-world applications of machine teaching in education, we consider the setting where the teacher's knowledge is limited and noisy, and the key research question we study is the following: when does a teacher succeed or fail in effectively teaching a learner using its imperfect knowledge? We answer this question by showing how imperfect knowledge affects the teacher's solution of the corresponding machine teaching problem when constructing optimal teaching sets. Our results have important implications for designing robust teaching algorithms for real-world applications.
【Keywords】: Machine Learning: Other; Humans and AI: Computer-Aided Education;
【Paper Link】 【Pages】:2655-2661
【Authors】: Johan Ferret ; Raphaël Marinier ; Matthieu Geist ; Olivier Pietquin
【Abstract】: The ability to transfer knowledge to novel environments and tasks is a sensible desideratum for general learning agents. Despite its apparent promise, transfer in RL remains an open and little-explored research area. In this paper, we take a brand-new perspective on transfer: we suggest that the ability to assign credit unveils structural invariants in tasks that can be transferred to make RL more sample-efficient. Our main contribution is SECRET, a novel approach to transfer learning for RL that uses a backward-view credit assignment mechanism based on a self-attentive architecture. Two aspects are key to its generality: it learns to assign credit as a separate offline supervised process, and it exclusively modifies the reward function. Consequently, it can be supplemented by transfer methods that do not modify the reward function, and it can be plugged on top of any RL algorithm.
【Keywords】: Machine Learning: Reinforcement Learning; Machine Learning: Transfer, Adaptation, Multi-task Learning;
【Paper Link】 【Pages】:2662-2668
【Authors】: Jonathan Zarecki ; Shaul Markovitch
【Abstract】: Human labeling of data can be very time-consuming and expensive, yet in many cases it is critical for the success of the learning process. In order to minimize human labeling efforts, we propose a novel active learning solution that does not rely on existing sources of unlabeled data. It uses a small amount of labeled data as the core set for the synthesis of useful membership queries (MQs) — unlabeled instances generated by an algorithm for human labeling.
Our solution uses modification operators, functions that modify instances to some extent. We apply the operators to a small set of instances (the core set), creating a set of new membership queries. Using this framework, we view the instance space as a search space and apply search algorithms to generate new examples highly relevant to the learner. We implement this framework in the textual domain, test it on several text classification tasks, and show improved classifier performance as more MQs are labeled and incorporated into the training set. To the best of our knowledge, this is the first work on membership queries in the textual domain.
【Keywords】: Machine Learning: Active Learning; Heuristic Search and Game Playing: Heuristic Search and Machine Learning; Natural Language Processing: Natural Language Processing;
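A toy modification operator in the spirit of this framework might look as follows; the operator, word lists, and example sentence are purely illustrative, not the paper's actual operators.

```python
def apply_operator(tokens, replacements):
    """A toy modification operator: substitute listed alternatives for
    tokens in a core-set instance, yielding one new membership query
    (an unlabeled synthetic instance) per substitution."""
    queries = []
    for i, tok in enumerate(tokens):
        for alt in replacements.get(tok, []):
            queries.append(tokens[:i] + [alt] + tokens[i + 1:])
    return queries

core_instance = ["the", "movie", "was", "great"]
mqs = apply_operator(core_instance,
                     {"great": ["terrible", "fine"], "movie": ["film"]})
print(len(mqs))  # 3 new instances to send for human labeling
```

A search algorithm would then score and expand these candidates, keeping only the queries most informative to the learner.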
【Paper Link】 【Pages】:2669-2675
【Authors】: Xufang Luo ; Qi Meng ; Di He ; Wei Chen ; Yunhong Wang
【Abstract】: Learning expressive representations is always crucial for well-performing policies in deep reinforcement learning (DRL). Unlike in supervised learning, accurate targets are not always available in DRL, and some inputs with different actions have only tiny differences, which heightens the demand for expressive representations. In this paper, we first empirically compare the representations of DRL models with different performances. We observe that, when visualized, the representations of a better state extractor (SE) are more scattered than those of a worse one. We therefore investigate the singular values of the representation matrix and find that better SEs always correspond to smaller differences among these singular values. Based on these observations, we define an indicator of the representations of a DRL model: the Number of Significant Singular Values (NSSV) of the representation matrix. We then propose the I4R algorithm, which improves DRL algorithms by adding a regularization term that enhances the NSSV. Finally, we apply I4R to both policy-gradient and value-based algorithms on Atari games, and the results show the superiority of our proposed method.
【Keywords】: Machine Learning: Deep Reinforcement Learning; Machine Learning Applications: Game Playing;
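The NSSV indicator itself is straightforward to compute; below is a sketch, where the relative significance threshold is an illustrative choice rather than the paper's exact criterion.

```python
import numpy as np

def nssv(rep_matrix, rel_threshold=0.01):
    """Number of Significant Singular Values: count singular values of the
    representation matrix above a fraction of the largest one."""
    s = np.linalg.svd(rep_matrix, compute_uv=False)
    return int(np.sum(s > rel_threshold * s[0]))

# A rank-1 representation matrix has a single significant singular value.
low_rank = np.outer(np.ones(8), np.arange(1.0, 5.0))     # shape (8, 4), rank 1
full_rank = np.eye(8)[:, :4] + 0.5 * np.ones((8, 4))     # shape (8, 4), rank 4
print(nssv(low_rank))   # 1
print(nssv(full_rank))  # 4
```

A collapsed representation yields a small NSSV; I4R's regularizer pushes the singular values closer together, increasing it.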
【Paper Link】 【Pages】:2676-2682
【Authors】: Masataro Asai ; Christian Muise
【Abstract】: We achieved a new milestone in the difficult task of enabling agents to learn about their environment autonomously. Our neuro-symbolic architecture is trained end-to-end to produce a succinct and effective discrete state transition model from images alone. Our target representation (the Planning Domain Definition Language) is already in a form that off-the-shelf solvers can consume, and opens the door to the rich array of modern heuristic search capabilities. We demonstrate how the sophisticated innate prior we place on the learning process significantly reduces the complexity of the learned representation, and reveals a connection to the graph-theoretic notion of "cube-like graphs", thus opening the door to a deeper understanding of the ideal properties for learned symbolic representations. We show that the powerful domain-independent heuristics allow our system to solve visual 15-Puzzle instances which are beyond the reach of blind search, without resorting to the Reinforcement Learning approach that requires a huge amount of training on the domain-dependent reward information.
【Keywords】: Machine Learning: Neuro-Symbolic Methods; Planning and Scheduling: Planning and Scheduling;
【Paper Link】 【Pages】:2683-2689
【Authors】: Hien H. Nguyen ; Hua Zhong ; Mingzhou Song
【Abstract】: Functional dependency can lead to discoveries of new mechanisms not possible via symmetric association. Most asymmetric methods for causal direction inference are not driven by the function-versus-independence question. A recent exact functional test (EFT) was designed to detect functionally dependent patterns in a model-free manner, with an exact null distribution. However, the EFT lacked a theoretical justification, had not been compared with other asymmetric methods, and was practically slow. Here, we prove the functional optimality of the EFT statistic, demonstrate its advantage in functional inference accuracy over five other methods, and develop a branch-and-bound algorithm with dynamic and quadratic programming that runs orders of magnitude faster than the previous implementation. Our results make it practical to answer the exact functional dependency question arising in discovery-driven artificial intelligence applications. Software implementing the EFT is freely available in the R package 'FunChisq' (≥2.5.0) at https://cran.r-project.org/package=FunChisq
【Keywords】: Machine Learning: Unsupervised Learning; Uncertainty in AI: Exact Probabilistic Inference; Data Mining: Theoretical Foundations; Machine Learning: Relational Learning;
【Paper Link】 【Pages】:2690-2696
【Authors】: Deng Pan ; Xiangrui Li ; Xin Li ; Dongxiao Zhu
【Abstract】: Latent factor collaborative filtering (CF) has been a widely used technique for recommender systems, learning semantic representations of users and items. Recently, explainable recommendation has attracted much attention from the research community. However, a trade-off exists between the explainability and the performance of recommendation, and metadata is often needed to alleviate the dilemma. We present a novel feature mapping approach that maps uninterpretable general features onto interpretable aspect features, achieving both satisfactory accuracy and explainability by simultaneously minimizing the rating prediction loss and the interpretation loss. To evaluate the explainability, we propose two new evaluation metrics specifically designed for aspect-level explanation using surrogate ground truth. Experimental results demonstrate strong performance in both recommendation and explanation, eliminating the need for metadata. Code is available from https://github.com/pd90506/AMCF.
【Keywords】: Machine Learning: Explainable Machine Learning; Machine Learning: Interpretability; Machine Learning: Recommender Systems;
【Paper Link】 【Pages】:2697-2703
【Authors】: Zilun Peng ; Ahmed Touati ; Pascal Vincent ; Doina Precup
【Abstract】: Stochastic variance-reduced gradient (SVRG) is an optimization method originally designed for tackling machine learning problems with a finite sum structure. SVRG was later shown to work for policy evaluation, a problem in reinforcement learning in which one aims to estimate the value function of a given policy. SVRG makes use of gradient estimates at two scales. At the slower scale, SVRG computes a full gradient over the whole dataset, which could lead to prohibitive computation costs. In this work, we show that two variants of SVRG for policy evaluation could significantly diminish the number of gradient calculations while preserving a linear convergence speed. More importantly, our theoretical result implies that one does not need to use the entire dataset in every epoch of SVRG when it is applied to policy evaluation with linear function approximation. Our experiments demonstrate large computational savings provided by the proposed methods.
【Keywords】: Machine Learning: Reinforcement Learning; Machine Learning Applications: Applications of Reinforcement Learning;
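SVRG's two-scale structure, an occasional full-gradient pass plus cheap variance-reduced stochastic steps, can be sketched on a toy finite-sum problem. This is the generic finite-sum SVRG loop, not the paper's policy-evaluation variants, and all hyperparameters are illustrative.

```python
import numpy as np

def svrg(grad_i, w0, n, lr=0.1, epochs=100, inner_steps=20, seed=0):
    """Minimal SVRG: at the slow scale, compute a full gradient at a snapshot;
    at the fast scale, take variance-reduced stochastic steps."""
    rng = np.random.default_rng(seed)
    w = w0.copy()
    for _ in range(epochs):
        w_snap = w.copy()
        full_grad = sum(grad_i(w_snap, i) for i in range(n)) / n  # expensive pass
        for _ in range(inner_steps):
            i = rng.integers(n)
            g = grad_i(w, i) - grad_i(w_snap, i) + full_grad      # unbiased, low variance
            w -= lr * g
    return w

# Toy finite-sum least squares: f(w) = (1/2n) sum_i (x_i.w - y_i)^2, minimized at w = (1, 2).
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])
w = svrg(lambda w, i: (X[i] @ w - y[i]) * X[i], np.zeros(2), n=3)
```

The paper's point concerns the `full_grad` pass over all n samples: for policy evaluation it can be computed on a subsample without losing linear convergence.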
【Paper Link】 【Pages】:2704-2710
【Authors】: Luis A. Pérez Rey ; Vlado Menkovski ; Jim Portegies
【Abstract】: A standard Variational Autoencoder, with a Euclidean latent space, is structurally incapable of capturing topological properties of certain datasets. To remove topological obstructions, we introduce Diffusion Variational Autoencoders (DeltaVAE) with arbitrary (closed) manifolds as a latent space. A Diffusion Variational Autoencoder uses transition kernels of Brownian motion on the manifold. In particular, it uses properties of the Brownian motion to implement the reparametrization trick and fast approximations to the KL divergence. We show that the DeltaVAE is indeed capable of capturing topological properties for datasets with a known underlying latent structure derived from generative processes such as rotations and translations.
【Keywords】: Machine Learning: Bayesian Optimization; Machine Learning: Deep Generative Models; Machine Learning: Dimensionality Reduction and Manifold Learning;
【Paper Link】 【Pages】:2711-2717
【Authors】: Yannis Flet-Berliac ; Philippe Preux
【Abstract】: In reinforcement learning, policy gradient algorithms optimize the policy directly and rely on efficiently sampling the environment. Nevertheless, while most sampling procedures are based on direct policy sampling, self-performance measures could be used to improve such sampling prior to each policy update. Following this line of thought, we introduce SAUNA, a method in which non-informative transitions are rejected from the gradient update. The level of information is estimated according to the fraction of variance explained by the value function: a measure of the discrepancy between V and the empirical returns. In this work, we use this criterion to select the samples that are useful to learn from, and we demonstrate that this selection can significantly improve the performance of policy gradient methods. In this paper: (a) we introduce the SAUNA method to filter transitions; (b) we conduct experiments on a set of benchmark continuous control problems, on which SAUNA significantly improves performance; (c) we investigate how SAUNA reliably selects the samples with the most positive impact on learning and study its improvement in both performance and sample efficiency.
【Keywords】: Machine Learning: Deep Reinforcement Learning; Machine Learning: Reinforcement Learning;
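The selection criterion, the fraction of variance explained by the value function, is simple to compute. A minimal sketch with made-up numbers (the filtering rule SAUNA applies on top of it is not shown):

```python
import numpy as np

def fraction_variance_explained(values, returns):
    """FVE = 1 - Var(returns - V) / Var(returns): how much of the variance in
    the empirical returns the value predictions V account for."""
    return 1.0 - np.var(returns - values) / np.var(returns)

returns = np.array([1.0, 2.0, 3.0, 4.0])
good_v = np.array([1.1, 1.9, 3.2, 3.8])   # value predictions close to the returns
bad_v = np.array([4.0, 1.0, 0.0, 9.0])    # poor value predictions
print(fraction_variance_explained(good_v, returns))  # ≈ 0.98
print(fraction_variance_explained(bad_v, returns))   # -7.0: worse than predicting the mean
```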
【Paper Link】 【Pages】:2718-2724
【Authors】: Pinzhuo Tian ; Lei Qi ; Shaokang Dong ; Yinghuan Shi ; Yang Gao
【Abstract】: In the few-shot learning scenario, the data-distribution discrepancy between training data and test data in a task usually exists due to the limited data. However, most existing meta-learning approaches seldom consider this intra-task discrepancy in the meta-training phase which might deteriorate the performance. To overcome this limitation, we develop a new consistent meta-regularization method to reduce the intra-task data-distribution discrepancy. Moreover, the proposed meta-regularization method could be readily inserted into existing optimization-based meta-learning models to learn better meta-knowledge. Particularly, we provide the theoretical analysis to prove that using the proposed meta-regularization, the conventional gradient-based meta-learning method can reach the lower regret bound. The extensive experiments also demonstrate the effectiveness of our method, which indeed improves the performances of the state-of-the-art gradient-based meta-learning models in the few-shot classification task.
【Keywords】: Machine Learning: Transfer, Adaptation, Multi-task Learning;
【Paper Link】 【Pages】:2725-2731
【Authors】: Weijie Liu ; Hui Qian ; Chao Zhang ; Zebang Shen ; Jiahao Xie ; Nenggan Zheng
【Abstract】: In this paper, a novel stratified sampling strategy is designed to accelerate mini-batch SGD. We derive a new iteration-dependent surrogate that bounds the stochastic variance from above. To keep the strata minimizing this surrogate with high probability, a stochastic stratifying algorithm is adopted in an adaptive manner: in each iteration, strata are reconstructed only if an easily verifiable condition is met. Based on this novel sampling strategy, we propose an accelerated mini-batch SGD algorithm named SGD-RS. Our theoretical analysis shows that the convergence rate of SGD-RS is superior to the state-of-the-art. Numerical experiments corroborate our theory and demonstrate that SGD-RS achieves speed-ups of at least 3.48x compared to vanilla mini-batch SGD.
【Keywords】: Machine Learning: Deep-learning Theory; Machine Learning: Deep Learning; Machine Learning: Clustering; Machine Learning: Deep Learning: Convolutional networks;
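A minimal sketch of stratified mini-batch sampling with proportional allocation; SGD-RS's variance-driven strata construction and its restratification condition are more involved, and everything here is illustrative.

```python
import numpy as np

def stratified_minibatch(strata, batch_size, rng):
    """Draw a mini-batch with per-stratum allocation proportional to
    stratum size, sampling without replacement within each stratum."""
    n = sum(len(s) for s in strata)
    batch = []
    for s in strata:
        k = max(1, round(batch_size * len(s) / n))
        batch.extend(rng.choice(s, size=k, replace=False).tolist())
    return batch

rng = np.random.default_rng(0)
strata = [list(range(0, 50)), list(range(50, 80)), list(range(80, 100))]
batch = stratified_minibatch(strata, 10, rng)
print(len(batch))  # 10: five, three, and two samples from the three strata
```

When per-stratum gradients are more homogeneous than the full dataset, such a batch has lower variance than uniform sampling, which is the source of the speed-up.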
【Paper Link】 【Pages】:2732-2738
【Authors】: Ruobing Xie ; Zhijie Qiu ; Jun Rao ; Yi Liu ; Bo Zhang ; Leyu Lin
【Abstract】: Real-world integrated personalized recommendation systems usually deal with millions of heterogeneous items. It is extremely challenging to conduct full-corpus retrieval with complicated models due to the tremendous computation costs. Hence, most large-scale recommendation systems consist of two modules: a multi-channel matching module to efficiently retrieve a small subset of candidates, and a ranking module for precise personalized recommendation. However, multi-channel matching usually suffers from cold-start problems when new channels or new data sources are added. To solve this issue, we propose a novel Internal and contextual attention network (ICAN), which highlights channel-specific contextual information and feature field interactions between multiple channels. In experiments, we conduct both offline and online evaluations with case studies on a real-world integrated recommendation system. The significant improvements confirm the effectiveness and robustness of ICAN, especially for cold-start channels. Currently, ICAN is deployed on WeChat Top Stories, which is used by millions of users. The source code can be obtained from https://github.com/zhijieqiu/ICAN.
【Keywords】: Machine Learning: Recommender Systems;
【Paper Link】 【Pages】:2739-2745
【Authors】: Xuan Lin ; Zhe Quan ; Zhi-Jie Wang ; Tengfei Ma ; Xiangxiang Zeng
【Abstract】: Drug-drug interaction (DDI) prediction is a challenging problem in pharmacology and clinical application, and effectively identifying potential DDIs during clinical trials is critical for patients and society. Most existing computational models with AI techniques concentrate on integrating multiple data sources and combining popular embedding methods. Yet researchers pay less attention to the potential correlations between drugs and other entities such as targets and genes. Recent studies have also adopted knowledge graphs (KGs) for DDI prediction; this line of methods learns latent node embeddings directly, but is limited in obtaining the rich neighborhood information of each entity in the KG. To address the above limitations, we propose an end-to-end framework, called Knowledge Graph Neural Network (KGNN), for DDI prediction. Our framework can effectively capture a drug and its potential neighborhood by mining their associated relations in the KG. To extract both high-order structures and semantic relations of the KG, we treat the neighborhood of each entity in the KG as its local receptive field, and then integrate neighborhood information with a bias from the representation of the current entity. This way, the receptive field can be naturally extended to multiple hops away, modeling high-order topological information and capturing drugs' potential long-distance correlations. We have implemented our method and conducted experiments on several widely used datasets. Empirical results show that KGNN outperforms classic and state-of-the-art models.
【Keywords】: Machine Learning: Deep Learning; Machine Learning Applications: Bio/Medicine;
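One hop of the neighborhood aggregation described above can be sketched with a generic GNN-style operator; this is not KGNN's exact relation-biased aggregator, and all weights and embeddings here are random placeholders.

```python
import numpy as np

def aggregate(entity_emb, neighbor_embs, w_self, w_neigh):
    """One hop of neighborhood aggregation: combine an entity's embedding
    with the mean of its neighbors', then apply a nonlinearity."""
    neigh_mean = neighbor_embs.mean(axis=0)
    return np.tanh(w_self @ entity_emb + w_neigh @ neigh_mean)

rng = np.random.default_rng(0)
d = 4
drug = rng.standard_normal(d)
neighbors = rng.standard_normal((3, d))  # e.g. targets/genes linked to the drug in the KG
out = aggregate(drug, neighbors, np.eye(d), np.eye(d))
print(out.shape)  # (4,)
```

Stacking this operator extends the receptive field by one hop per layer, which is how multi-hop, long-distance drug correlations enter the representation.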
【Paper Link】 【Pages】:2746-2753
【Authors】: Payel Das ; Brian Quanz ; Pin-Yu Chen ; Jae-wook Ahn ; Dhruv Shah
【Abstract】: Creativity, a process that generates novel and meaningful ideas, involves increased association between task-positive (control) and task-negative (default) networks in the human brain. Inspired by this seminal finding, in this study we propose a creative decoder within a deep generative framework, which involves direct modulation of the neuronal activation pattern after sampling from the learned latent space. The proposed approach is fully unsupervised and can be used off-the-shelf. Several novelty metrics and human evaluation were used to evaluate the creative capacity of the deep decoder. Our experiments on different image datasets (MNIST, FMNIST, MNIST+FMNIST, WikiArt and CelebA) reveal that atypical co-activation of highly activated and weakly activated neurons in a deep decoder promotes generation of novel and meaningful artifacts.
【Keywords】: Machine Learning: Deep Learning; Multidisciplinary Topics and Applications: Art and Music;
【Paper Link】 【Pages】:2754-2761
【Authors】: Quan Guo ; Hossein Rajaby Faghihi ; Yue Zhang ; Andrzej Uszok ; Parisa Kordjamshidi
【Abstract】: Structured learning algorithms usually involve an inference phase that selects the best global assignment of output variables based on the local scores of all possible assignments. We extend deep neural networks with structured learning to combine the power of learned representations with domain knowledge, in the form of output constraints, during training. Introducing a non-differentiable inference module into gradient-based training is a critical challenge. Compared to conventional loss functions that penalize every local error independently, we propose an inference-masked loss that takes the effect of inference into account and does not penalize local errors that can be corrected by the inference. We empirically show that the inference-masked loss, combined with the negative log-likelihood loss, improves performance on different tasks, namely entity relation recognition on the CoNLL04 and ACE2005 corpora, and spatial role labeling on the CLEF 2017 mSpRL dataset. We show the proposed approach helps to achieve better generalizability, particularly in the low-data regime.
【Keywords】: Machine Learning: Deep Learning; Machine Learning: Relational Learning; Machine Learning: Structured Prediction;
【Paper Link】 【Pages】:2762-2768
【Authors】: Zhao Zhang ; Jiahuan Ren ; Zheng Zhang ; Guangcan Liu
【Abstract】: Low-rank representation is powerful for recovering and clustering subspace structures, but it cannot obtain deep hierarchical information due to its single-layer mode. In this paper, we present a new and effective strategy to extend single-layer latent low-rank models into multiple layers, and propose a new and progressive Deep Latent Low-Rank Fusion Network (DLRF-Net) to uncover deep features and structures embedded in input data. The basic idea of DLRF-Net is to refine features progressively from the previous layers by fusing the subspaces in each layer, which can potentially obtain accurate features and subspaces for representation. To learn deep information, DLRF-Net inputs shallow features of the last layers into subsequent layers. Then, it recovers the deeper features and hierarchical information by congregating the projective subspaces and clustering subspaces respectively in each layer. Thus, one can learn hierarchical subspaces, remove noise and discover the underlying clean subspaces. Note that most existing latent low-rank coding models can be extended to multiple layers using DLRF-Net. Extensive results show that our network can deliver enhanced performance over other related frameworks.
【Keywords】: Machine Learning: Clustering; Machine Learning: Deep Learning; Machine Learning: Unsupervised Learning;
【Paper Link】 【Pages】:2769-2776
【Authors】: Hao Xiong ; Nicholas Ruozzi
【Abstract】: Maximum likelihood learning is a well-studied approach for fitting discrete Markov random fields (MRFs) to data. However, general-purpose maximum likelihood estimation for fitting MRFs with continuous variables has only been studied in much more limited settings. In this work, we propose a generic MLE procedure for MRFs whose potential functions are modeled by neural networks. To make learning effective in practice, we show how to leverage a highly parallelizable variational inference method that fits easily into popular machine learning frameworks like TensorFlow. We demonstrate experimentally that our approach is capable of effectively modeling the data distributions of a variety of real data sets and that it competes effectively with other common methods on multilabel classification and generative modeling tasks.
【Keywords】: Machine Learning: Learning Graphical Models; Uncertainty in AI: Approximate Probabilistic Inference; Uncertainty in AI: Graphical Models;
【Paper Link】 【Pages】:2777-2784
【Authors】: Shujian Yu ; Ammar Shaker ; Francesco Alesiani ; José C. Príncipe
【Abstract】: We propose a simple yet powerful test statistic to quantify the discrepancy between two conditional distributions. The new statistic avoids the explicit estimation of the underlying distributions in high-dimensional space, and it operates on the cone of symmetric positive semidefinite (SPS) matrices using the Bregman matrix divergence. Moreover, it inherits the merits of the correntropy function to explicitly incorporate high-order statistics of the data. We present the properties of our new statistic and illustrate its connections to prior art. Finally, we show applications of our new statistic to three different machine learning problems, namely multi-task learning over graphs, concept drift detection, and information-theoretic feature selection, to demonstrate its utility and advantage. Code for our statistic is available at https://bit.ly/BregmanCorrentropy.
【Keywords】: Machine Learning: Time-series;Data Streams; Machine Learning: Transfer, Adaptation, Multi-task Learning; Data Mining: Theoretical Foundations;
【Paper Link】 【Pages】:2785-2791
【Authors】: Sankalp Garg ; Navodita Sharma ; Woojeong Jin ; Xiang Ren
【Abstract】: Time series prediction is an important problem in machine learning. Previous methods for time series prediction did not incorporate additional information. With many dynamic knowledge graphs now available, we can use this additional information to predict the time series better. Recently, there has been a focus on applying deep representation learning to dynamic graphs. These methods predict the structure of the graph by reasoning over the interactions in the graph at previous time steps. In this paper, we propose a new framework to incorporate information from dynamic knowledge graphs for time series prediction. We show that if the information contained in the graph and the time series data are closely related, then this inter-dependence can be used to predict the time series with improved accuracy. Our framework, DArtNet, learns a static embedding for every node in the graph as well as a dynamic embedding that depends on the dynamic attribute value (the time series). It then captures information from the neighborhood by taking a relation-specific mean and encodes the history information using an RNN. We jointly train the model on link prediction and attribute prediction. We evaluate our method on five specially curated datasets for this problem and show a consistent improvement in time series prediction results. We release the data and code of DArtNet for future research.
【Keywords】: Machine Learning: Time-series;Data Streams; Data Mining: Mining Graphs, Semi Structured Data, Complex Data; Data Mining: Mining Spatial, Temporal Data;
【Paper Link】 【Pages】:2792-2798
【Authors】: Lifu Wang ; Bo Shen ; Ning Zhao ; Zhiyuan Zhang
【Abstract】: The residual network is now one of the most effective structures in deep learning, which utilizes skip connections to "guarantee" that the performance will not get worse. However, the non-convexity of the neural network makes it unclear whether skip connections provably improve the learning ability, since the nonlinearity may create many local minima. In some previous works [Freeman and Bruna, 2016], it is shown that despite the non-convexity, the loss landscape of the two-layer ReLU network has good properties when the number m of hidden nodes is very large. In this paper, we follow this line to study the topology (sub-level sets) of the loss landscape of deep ReLU neural networks with a skip connection, and theoretically prove that the skip-connection network inherits the good properties of the two-layer network: skip connections can help to control the connectedness of the sub-level sets, such that any local minimum worse than the global minimum of some two-layer ReLU network will be very "shallow". The "depth" of these local minima is at most O(m^(η-1)/n), where n is the input dimension and η<1. This provides a theoretical explanation for the effectiveness of skip connections in deep learning.
【Keywords】: Machine Learning: Deep-learning Theory; Machine Learning: Learning Theory; Machine Learning: Deep Learning;
【Paper Link】 【Pages】:2799-2806
【Authors】: Haowen Fang ; Amar Shrestha ; Ziyi Zhao ; Qinru Qiu
【Abstract】: The recently discovered spatial-temporal information processing capability of bio-inspired spiking neural networks (SNNs) has enabled some interesting models and applications. However, designing large-scale, high-performance models remains a challenge due to the lack of robust training algorithms. A bio-plausible SNN model with spatial-temporal properties is a complex dynamic system. Synapses and neurons behave as filters capable of preserving temporal information. Because such neuron dynamics and filter effects are ignored in existing training algorithms, the SNN degrades into a memoryless system and loses the ability to process temporal signals. Furthermore, spike timing plays an important role in information representation, but conventional rate-based spike coding models only consider spike trains statistically and discard the information carried by their temporal structures. To address the above issues and exploit the temporal dynamics of SNNs, we formulate the SNN as a network of infinite impulse response (IIR) filters with neuron nonlinearity. We propose a training algorithm that learns spatial-temporal patterns by searching for the optimal synapse filter kernels and weights. The proposed model and training algorithm are applied to construct associative memories and classifiers for synthetic and public datasets including MNIST, NMNIST, and DVS 128. Their accuracy outperforms state-of-the-art approaches.
【Keywords】: Machine Learning: Deep Learning; Machine Learning: Deep-learning Theory; Machine Learning: Other;
【Paper Link】 【Pages】:2807-2815
【Authors】: Ryan Spring ; Anshumali Shrivastava
【Abstract】: Learning representations in an unsupervised or self-supervised manner is a growing area of research. Current approaches in representation learning seek to maximize the mutual information between the learned representation and original data. One of the most popular ways to estimate mutual information (MI) is based on Noise Contrastive Estimation (NCE). This MI estimate exhibits low variance, but it is upper-bounded by log(N), where N is the number of samples.
In an ideal scenario, we would use the entire dataset to get the most accurate estimate. However, using such a large number of samples is computationally prohibitive. Our proposed solution is to decouple the upper-bound for the MI estimate from the sample size. Instead, we estimate the partition function of the NCE loss function for the entire dataset using importance sampling (IS). In this paper, we use locality-sensitive hashing (LSH) as an adaptive sampler and propose an unbiased estimator that accurately approximates the partition function in sub-linear (near-constant) time. The samples are correlated and non-normalized, but the derived estimator is unbiased without any assumptions. We show that our LSH sampling estimate provides a superior bias-variance trade-off when compared to other state-of-the-art approaches.
【Keywords】: Machine Learning: Big data; Scalability; Data Mining: Big Data, Large-Scale Systems; Data Mining: Clustering, Unsupervised Learning; Machine Learning: Bayesian Optimization;
【Paper Link】 【Pages】:2816-2823
【Authors】: Riley Simmons-Edler ; Ben Eisner ; Daniel Yang ; Anthony Bisulco ; Eric Mitchell ; H. Sebastian Seung ; Daniel D. Lee
【Abstract】: A major challenge in reinforcement learning is exploration, when local dithering methods such as epsilon-greedy sampling are insufficient to solve a given task. Many recent methods have proposed to intrinsically motivate an agent to seek novel states, driving the agent to discover improved reward. However, while state-novelty exploration methods are suitable for tasks where novel observations correlate well with improved reward, they may not explore more efficiently than epsilon-greedy approaches in environments where the two are not well-correlated. In this paper, we distinguish between exploration tasks in which seeking novel states aids in finding new reward, and those where it does not, such as goal-conditioned tasks and escaping local reward maxima. We propose a new exploration objective, maximizing the reward prediction error (RPE) of a value function trained to predict extrinsic reward. We then propose a deep reinforcement learning method, QXplore, which exploits the temporal difference error of a Q-function to solve hard exploration tasks in high-dimensional MDPs. We demonstrate the exploration behavior of QXplore on several OpenAI Gym MuJoCo tasks and Atari games and observe that QXplore is comparable to or better than a baseline state-novelty method in all cases, outperforming the baseline on tasks where state novelty is not well-correlated with improved reward.
【Keywords】: Machine Learning: Deep Reinforcement Learning; Machine Learning: Reinforcement Learning;
【Paper Link】 【Pages】:2824-2830
【Authors】: Sungryull Sohn ; Yinlam Chow ; Jayden Ooi ; Ofir Nachum ; Honglak Lee ; Ed Chi ; Craig Boutilier
【Abstract】: In batch reinforcement learning (RL), one often constrains a learned policy to be close to the behavior (data-generating) policy, e.g., by constraining the learned action distribution to differ from the behavior policy by some maximum degree that is the same at each state. This can cause batch RL to be overly conservative, unable to exploit large policy changes at frequently-visited, high-confidence states without risking poor performance at sparsely-visited states. To remedy this, we propose residual policies, where the allowable deviation of the learned policy is state-action-dependent. We derive a new RL method, BRPO, which learns both the policy and allowable deviation that jointly maximize a lower bound on policy performance. We show that BRPO achieves state-of-the-art performance in a number of tasks.
【Keywords】: Machine Learning: Deep Reinforcement Learning; Machine Learning: Reinforcement Learning;
【Paper Link】 【Pages】:2831-2838
【Authors】: Ying Song ; Shuangjia Zheng ; Zhangming Niu ; Zhang-Hua Fu ; Yutong Lu ; Yuedong Yang
【Abstract】: Constructing proper representations of molecules lies at the core of numerous tasks such as molecular property prediction and drug design. Graph neural networks, especially the message passing neural network (MPNN) and its variants, have recently made remarkable achievements in molecular graph modeling. Albeit powerful, the one-sided focus of existing MPNN methods on atom (node) or bond (edge) information leads to insufficient representations of attributed molecular graphs. Herein, we propose a Communicative Message Passing Neural Network (CMPNN) to improve the molecular embedding by strengthening the message interactions between nodes and edges through a communicative kernel. In addition, the message generation process is enriched by introducing a new message booster module. Extensive experiments demonstrate that the proposed model obtains superior performance against state-of-the-art baselines on six chemical property datasets. Further visualization also shows the better representation capacity of our model.
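For context, a single generic message-passing step over a toy molecular graph can be sketched as below. This is an illustrative baseline, not CMPNN's communicative kernel or message booster; the chain graph and feature sizes are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy molecular graph: 4 atoms in a chain, undirected bonds as (u, v) pairs.
edges = [(0, 1), (1, 2), (2, 3)]
node_h = rng.normal(size=(4, 8))                 # atom (node) features
edge_h = {e: rng.normal(size=8) for e in edges}  # bond (edge) features
edge_h.update({(v, u): edge_h[(u, v)] for (u, v) in edges})

def mp_step(node_h, edge_h, edges):
    """One generic message-passing step: every atom aggregates its
    neighbours' features together with the connecting bond's features."""
    new_h = node_h.copy()
    for (u, v) in edges + [(v, u) for (u, v) in edges]:
        new_h[v] += node_h[u] + edge_h[(u, v)]   # message from u along bond u-v
    return np.tanh(new_h)

node_h = mp_step(node_h, edge_h, edges)
print(node_h.shape)  # (4, 8)
```

CMPNN's contribution is precisely in replacing this one-sided aggregation with node-edge interaction, which the sketch above does not capture.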
【Keywords】: Machine Learning: Deep Learning; Machine Learning Applications: Bio/Medicine; Machine Learning Applications: Networks;
【Paper Link】 【Pages】:2839-2846
【Authors】: Taiji Suzuki ; Hiroshi Abe ; Tomoya Murata ; Shingo Horiuchi ; Kotaro Ito ; Tokuma Wachi ; So Hirai ; Masatoshi Yukishima ; Tomoaki Nishimura
【Abstract】: Compression techniques for deep neural network models are becoming very important for the efficient execution of high-performance deep learning systems on edge-computing devices. The concept of model compression is also important for analyzing the generalization error of deep learning, known as the compression-based error bound. However, there is still a huge gap between practically effective compression methods and their rigorous grounding in statistical learning theory. To resolve this issue, we develop a new theoretical framework for model compression and propose a new pruning method called spectral pruning based on this framework. We define the "degrees of freedom" to quantify the intrinsic dimensionality of a model by using the eigenvalue distribution of the covariance matrix across the internal nodes and show that the compression ability is essentially controlled by this quantity. Moreover, we present a sharp generalization error bound of the compressed model and characterize the bias--variance tradeoff induced by the compression procedure. We apply our method to several datasets to justify our theoretical analyses and show the superiority of the proposed method.
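The "degrees of freedom" idea can be illustrated with the standard ridge-style quantity sum_i mu_i / (mu_i + lambda) over the eigenvalues of the activation covariance; this is a hedged stand-in for the paper's exact definition, and the layer activations below are synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic hidden activations of one layer: 500 samples x 64 internal nodes.
H = rng.normal(size=(500, 64)) @ rng.normal(size=(64, 64)) * 0.1
cov = np.cov(H, rowvar=False)
# Clip tiny negative eigenvalues caused by floating-point noise.
eig = np.clip(np.linalg.eigvalsh(cov), 0.0, None)

def degrees_of_freedom(eigvals, lam):
    """Effective dimensionality: eigenvalues well above lam contribute ~1,
    those well below lam contribute ~0."""
    return float(np.sum(eigvals / (eigvals + lam)))

# Heavier regularisation -> fewer effective degrees of freedom -> more
# aggressive pruning is justified.
print(degrees_of_freedom(eig, 1e-3), degrees_of_freedom(eig, 1.0))
```

When the eigenvalue spectrum decays quickly, this quantity is far below the nominal width of the layer, which is the intuition behind compressibility.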
【Keywords】: Machine Learning: Deep Learning; Machine Learning: Deep-learning Theory; Machine Learning: Learning Theory; Machine Learning: Feature Selection; Learning Sparse Models;
【Paper Link】 【Pages】:2847-2854
【Authors】: Dmitrii Krylov ; Remi Tachet des Combes ; Romain Laroche ; Michael Rosenblum ; Dmitry V. Dylov
【Abstract】: Malfunctioning neurons in the brain sometimes operate synchronously, reportedly causing many neurological diseases, e.g. Parkinson’s. Suppression and control of this collective synchronous activity are therefore of great importance for neuroscience, and can only rely on limited engineering trials due to the need to experiment with live human brains. We present the first Reinforcement Learning (RL) gym framework that emulates this collective behavior of neurons and allows us to find suppression parameters for the environment of synthetic degenerate models of neurons. We successfully suppress synchrony via RL for three pathological signaling regimes, characterize the framework’s stability to noise, and further remove the unwanted oscillations by engaging multiple PPO agents.
【Keywords】: Machine Learning: Reinforcement Learning; Multidisciplinary Topics and Applications: Biology and Medicine; Machine Learning Applications: Applications of Reinforcement Learning;
【Paper Link】 【Pages】:2855-2862
【Authors】: Kentaro Kanamori ; Takuya Takagi ; Ken Kobayashi ; Hiroki Arimura
【Abstract】: Counterfactual Explanation (CE) is one of the post-hoc explanation methods that provides a perturbation vector so as to alter the prediction result obtained from a classifier. Users can directly interpret the perturbation as an "action" for obtaining their desired decision results. However, an action extracted by existing methods often becomes unrealistic for users because these methods do not adequately account for characteristics of the empirical data distribution, such as feature correlations and outlier risk. To suggest an executable action for users, we propose a new CE framework that extracts an action by evaluating its reality on the empirical data distribution. The key idea of our proposed method is to define a new cost function based on the Mahalanobis distance and the local outlier factor. We then propose a mixed-integer linear optimization approach to extracting an optimal action by minimizing our cost function. Through experiments on real datasets, we confirm the effectiveness of our method in comparison with existing CE methods.
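A minimal sketch of the distance term only (the local-outlier-factor term and the mixed-integer optimization are omitted): under a Mahalanobis cost, an action that follows the data's correlation structure is cheaper than one of the same Euclidean length that fights it. The two-feature distribution is invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data with two strongly positively correlated features.
X = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=1000)
mu = X.mean(axis=0)
inv_cov = np.linalg.inv(np.cov(X, rowvar=False))

def mahalanobis_cost(x, action):
    """Cost of perturbation `action`: Mahalanobis distance of x + action
    from the data mean, so moves along the empirical correlation are
    cheaper than moves that break it."""
    d = x + action - mu
    return float(np.sqrt(d @ inv_cov @ d))

x = np.array([0.0, 0.0])
along = mahalanobis_cost(x, np.array([1.0, 1.0]))     # follows the correlation
against = mahalanobis_cost(x, np.array([1.0, -1.0]))  # fights the correlation
print(along < against)  # True
```

This is the sense in which the cost function steers CE toward "realistic" actions.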
【Keywords】: Machine Learning: Explainable Machine Learning; Machine Learning: Interpretability; Machine Learning: Classification;
【Paper Link】 【Pages】:2863-2871
【Authors】: Elad Sarafian ; Aviv Tamar ; Sarit Kraus
【Abstract】: We propose a policy improvement algorithm for Reinforcement Learning (RL) termed Rerouted Behavior Improvement (RBI). RBI is designed to take into account the evaluation errors of the Q-function. Such errors are common in RL when learning the Q-value from finite experience data. Greedy policies or even constrained policy optimization algorithms that ignore these errors may suffer from an improvement penalty (i.e., a policy impairment). To reduce the penalty, the idea of RBI is to attenuate rapid policy changes to actions that were rarely sampled. This approach is shown to avoid catastrophic performance degradation and reduce regret when learning from a batch of transition samples. Through a two-armed bandit example, we show that it also increases data efficiency when the optimal action has a high variance. We evaluate RBI in two tasks in the Atari Learning Environment: (1) learning from observations of multiple behavior policies and (2) iterative RL. Our results demonstrate the advantage of RBI over greedy policies and other constrained policy optimization algorithms both in learning from observations and in RL tasks.
【Keywords】: Machine Learning: Reinforcement Learning; Machine Learning: Deep Reinforcement Learning;
【Paper Link】 【Pages】:2872-2878
【Authors】: Xuemiao Zhang ; Zhouxing Tan ; Xiaoning Zhang ; Yang Cao ; Rui Yan
【Abstract】: Naive neural dialogue generation models tend to produce repetitive and dull utterances. Promising adversarial models train the generator against a well-designed discriminator to push it to improve in the expected direction. However, assessing dialogues requires consideration of many aspects of linguistics, which are difficult to cover fully with a single discriminator. To address this, we reframe the dialogue generation task as a multi-objective optimization problem and propose a novel adversarial dialogue generation framework, called AMPGAN, with multiple discriminators that excel at different objectives covering multiple linguistic aspects; its feasibility is proved by theoretical derivations. Moreover, we design an adaptively adjusted sampling distribution to balance the discriminators and promote the overall improvement of the generator by continuing to focus on the objectives on which the generator performs relatively poorly. Experimental results on two real-world datasets show a significant improvement over the baselines.
【Keywords】: Machine Learning: Reinforcement Learning; Natural Language Processing: Dialogue; Natural Language Processing: Natural Language Generation; Machine Learning: Adversarial Machine Learning;
【Paper Link】 【Pages】:2879-2885
【Authors】: Min Shi ; Yufei Tang ; Xingquan Zhu ; David A. Wilson ; Jianxun Liu
【Abstract】: Networked data often demonstrate the Pareto principle (i.e., 80/20 rule) with skewed class distributions, where most vertices belong to a few majority classes and minority classes only contain a handful of instances. When presented with imbalanced class distributions, existing graph embedding learning tends to be biased toward nodes from majority classes, leaving nodes from minority classes under-trained. In this paper, we propose Dual-Regularized Graph Convolutional Networks (DR-GCN) to handle multi-class imbalanced graphs, where two types of regularization are imposed to tackle class-imbalanced representation learning. To ensure that all classes are equally represented, we propose a class-conditioned adversarial training process to facilitate the separation of labeled nodes. Meanwhile, to maintain training equilibrium (i.e., retaining quality of fit across all classes), we force unlabeled nodes to follow a latent distribution similar to that of the labeled nodes by minimizing their difference in the embedding space. Experiments on real-world imbalanced graphs demonstrate that DR-GCN outperforms the state-of-the-art methods in node classification, graph clustering, and visualization.
【Keywords】: Machine Learning: Deep Learning: Convolutional networks; Data Mining: Classification, Semi-Supervised Learning; Data Mining: Mining Text, Web, Social Media;
【Paper Link】 【Pages】:2886-2892
【Authors】: Andrea Bontempelli ; Stefano Teso ; Fausto Giunchiglia ; Andrea Passerini
【Abstract】: The ability to learn from human supervision is fundamental for personal assistants and other interactive applications of AI. Two central challenges for deploying interactive learners in the wild are the unreliable nature of the supervision and the varying complexity of the prediction task. We address a simple but representative setting, incremental classification in the wild, where the supervision is noisy and the number of classes grows over time. In order to tackle this task, we propose a redesign of skeptical learning centered around Gaussian Processes (GPs). Skeptical learning is a recent interactive strategy in which, if the machine is sufficiently confident that an example is mislabeled, it asks the annotator to reconsider her feedback. This is often enough to obtain clean supervision. Our redesign, dubbed ISGP, leverages the uncertainty estimates supplied by GPs to better allocate labeling and contradiction queries, especially in the presence of noise. Our experiments on synthetic and real-world data show that, as a result, while the original formulation of skeptical learning produces over-confident models that can fail completely in the wild, ISGP works well at varying levels of noise and as new classes are observed.
【Keywords】: Machine Learning: Active Learning; Machine Learning: Classification; Machine Learning: Kernel Methods;
【Paper Link】 【Pages】:2893-2900
【Authors】: Daniel Stoller ; Mi Tian ; Sebastian Ewert ; Simon Dixon
【Abstract】: Convolutional neural networks (CNNs) with dilated filters such as the Wavenet or the Temporal Convolutional Network (TCN) have shown good results in a variety of sequence modelling tasks. While their receptive field grows exponentially with the number of layers, computing the convolutions over very long sequences of features in each layer is time and memory-intensive, and prohibits the use of longer receptive fields in practice. To increase efficiency, we make use of the "slow feature" hypothesis stating that many features of interest are slowly varying over time. For this, we use a U-Net architecture that computes features at multiple time-scales and adapt it to our auto-regressive scenario by making convolutions causal. We apply our model ("Seq-U-Net") to a variety of tasks including language and audio generation. In comparison to TCN and Wavenet, our network consistently saves memory and computation time, with speed-ups for training and inference of over 4x in the audio generation experiment in particular, while achieving a comparable performance on real-world tasks.
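The exponential receptive-field growth mentioned above follows a simple closed form for a stack of causal convolutions with dilation doubling per layer (one convolution per layer assumed; real TCN/Wavenet blocks may stack more):

```python
def tcn_receptive_field(kernel_size, num_layers):
    """Receptive field of stacked causal convolutions where layer l uses
    dilation 2^l: each layer adds (kernel_size - 1) * 2^l timesteps, so
    the total is 1 + (k - 1) * (2^L - 1)."""
    return 1 + (kernel_size - 1) * (2 ** num_layers - 1)

# Depth buys an exponentially long context window...
print(tcn_receptive_field(kernel_size=2, num_layers=10))  # 1024
# ...but every layer still convolves over the full-rate sequence, which is
# the cost that Seq-U-Net's multi-scale (downsampled) computation avoids.
```

This makes concrete why long receptive fields are cheap in depth but expensive in per-layer compute and memory.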
【Keywords】: Machine Learning: Deep Learning: Convolutional networks; Machine Learning: Deep Learning: Sequence Modeling;
【Paper Link】 【Pages】:2901-2907
【Authors】: Qiangxing Tian ; Guanchu Wang ; Jinxin Liu ; Donglin Wang ; Yachen Kang
【Abstract】: Recently, diverse primitive skills have been learned by adopting the entropy as an intrinsic reward, which further shows that new practical skills can be produced by combining a variety of primitive skills. This is essentially skill transfer, which is very useful for learning high-level skills but quite challenging due to the low efficiency of transferring primitive skills. In this paper, we propose a novel efficient skill transfer method, where we learn independent skills and transfer only the independent components of skills instead of the whole set of skills. More concretely, independent components of skills are obtained through independent component analysis (ICA), and they are always fewer in number (i.e., of lower dimension) than their mixtures. With a lower dimension, independent skill transfer (IST) exhibits higher efficiency on learning a given task. Extensive experiments including three robotic tasks demonstrate the effectiveness and high efficiency of our proposed IST method in comparison to direct primitive-skill transfer and conventional reinforcement learning.
【Keywords】: Machine Learning: Deep Reinforcement Learning;
【Paper Link】 【Pages】:2908-2914
【Authors】: Guangyang Han ; Jinzheng Tu ; Guoxian Yu ; Jun Wang ; Carlotta Domeniconi
【Abstract】: Crowdsourcing is a new computing paradigm that harnesses human effort to solve computer-hard problems. Budget and quality are two fundamental factors in crowdsourcing, but they are antagonistic and their balance is crucially important. Induction and inference are principled ways for humans to acquire knowledge. Transfer learning can also enable induction and inference processes. When a new task comes, we may not know how to approach it, but we may have easy access to relevant knowledge that can help us with it. As such, via appropriate knowledge transfer, improved annotation can be achieved for the task at a small cost. To make this idea concrete, we introduce the Crowdsourcing with Multiple-source Knowledge Transfer (CrowdMKT) approach to transfer knowledge from multiple similar, but different, domains for a new task, and to reduce the negative impact of irrelevant sources. CrowdMKT first learns a set of concentrated high-level feature vectors of tasks using knowledge transfer from multiple sources, and then introduces a probabilistic graphical model to jointly model the tasks with high-level features, workers, and their annotations. Finally, it adopts an EM algorithm to estimate the workers' strengths and consensus. Experimental results on real-world image and text datasets prove the effectiveness of CrowdMKT in improving quality and reducing the budget.
【Keywords】: Machine Learning: Transfer, Adaptation, Multi-task Learning; Humans and AI: Human Computation and Crowdsourcing;
【Paper Link】 【Pages】:2915-2921
【Authors】: Lorenzo Perini ; Vincent Vercruyssen ; Jesse Davis
【Abstract】: Estimating the proportion of positive examples (i.e., the class prior) from positive and unlabeled (PU) data is an important task that facilitates learning a classifier from such data. In this paper, we explore how to tackle this problem when the observed labels were acquired via active learning. This introduces the challenge that the observed labels were not selected completely at random, which is the primary assumption underpinning existing approaches to estimating the class prior from PU data. We analyze this new setting and design an algorithm that is able to estimate the class prior for a given active learning strategy. Empirically, we show that our approach accurately recovers the true class prior on a benchmark of anomaly detection datasets and that it does so more accurately than existing methods.
【Keywords】: Machine Learning: Semi-Supervised Learning; Data Mining: Classification, Semi-Supervised Learning; Machine Learning: Active Learning;
【Paper Link】 【Pages】:2922-2928
【Authors】: Ruizhe Zhao ; Brian Vogel ; Tanvir Ahmed ; Wayne Luk
【Abstract】: By leveraging the half-precision floating-point format (FP16) well supported by recent GPUs, mixed precision training (MPT) enables us to train larger models under the same or even smaller budget. However, due to the limited representation range of FP16, gradients can often experience severe underflow problems that hinder backpropagation and degrade model accuracy. MPT adopts loss scaling, which scales up the loss value just before backpropagation starts, to mitigate underflow by enlarging the magnitude of gradients. Unfortunately, scaling once is insufficient: gradients from distinct layers can each have different data distributions and require non-uniform scaling. Heuristics and hyperparameter tuning are needed to minimize these side-effects of loss scaling. We propose gradient scaling, a novel method that analytically calculates the appropriate scale for each gradient on-the-fly. It addresses underflow effectively without numerical problems like overflow and the need for tedious hyperparameter tuning. Experiments on a variety of networks and tasks show that gradient scaling can improve accuracy and reduce overall training effort compared with the state-of-the-art MPT.
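The underflow problem and the loss-scaling remedy can be demonstrated numerically with numpy's float16 (a sketch of the mechanism, not of the paper's per-gradient analytical scaling):

```python
import numpy as np

grad_fp32 = np.float32(1e-8)            # a tiny but meaningful gradient
underflowed = np.float16(grad_fp32)     # below fp16's subnormal range -> 0.0

# Loss scaling: multiply the loss (and hence every gradient in the chain
# rule) by a large constant before the fp16 backward pass, then divide it
# back out in the fp32 master-weight update.
scale = np.float32(2.0 ** 16)
scaled = np.float16(grad_fp32 * scale)  # 6.5536e-4, representable in fp16
recovered = np.float32(scaled) / scale

print(underflowed, recovered)           # 0.0 and ~1e-8: the gradient survives
```

A single global scale can still be too small for some layers and overflow others, which is the non-uniformity the abstract points to.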
【Keywords】: Machine Learning: Deep Learning; Machine Learning: Deep Learning: Convolutional networks; Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation;
【Paper Link】 【Pages】:2929-2935
【Authors】: Cong Fei ; Bin Wang ; Yuzheng Zhuang ; Zongzhang Zhang ; Jianye Hao ; Hongbo Zhang ; Xuewu Ji ; Wulong Liu
【Abstract】: Generative adversarial imitation learning (GAIL) has shown promising results by taking advantage of generative adversarial nets, especially in the field of robot learning. However, the requirement of isolated single-modal demonstrations limits the scalability of the approach to real-world scenarios such as autonomous vehicles' demand for a proper understanding of human drivers' behavior. In this paper, we propose a novel multi-modal GAIL framework, named Triple-GAIL, that is able to learn skill selection and imitation jointly from both expert demonstrations and continuously generated experiences, for data augmentation purposes, by introducing an auxiliary selector. We provide theoretical guarantees on the convergence to optima for both the generator and the selector. Experiments on real driver trajectories and real-time strategy game datasets demonstrate that Triple-GAIL can better fit multi-modal behaviors close to the demonstrators and outperforms state-of-the-art methods.
【Keywords】: Machine Learning: Adversarial Machine Learning; Machine Learning: Reinforcement Learning; Machine Learning Applications: Applications of Reinforcement Learning;
【Paper Link】 【Pages】:2936-2942
【Authors】: Bo Xue ; Guanghui Wang ; Yimu Wang ; Lijun Zhang
【Abstract】: In this paper, we study the problem of stochastic linear bandits with finite action sets. Most existing work assumes the payoffs are bounded or sub-Gaussian, which may be violated in some scenarios such as financial markets. To settle this issue, we analyze linear bandits with heavy-tailed payoffs, where the payoffs admit finite (1+ε)-th moments for some ε ∈ (0,1]. Through median of means and dynamic truncation, we propose two novel algorithms which enjoy a sublinear regret bound of Õ(d^(1/2) T^(1/(1+ε))), where d is the dimension of contextual information and T is the time horizon. Meanwhile, we provide an Ω(d^(ε/(1+ε)) T^(1/(1+ε))) lower bound, which implies that our upper bound matches the lower bound up to polylogarithmic factors in d and T when ε = 1. Finally, we conduct numerical experiments to demonstrate the effectiveness of our algorithms, and the empirical results strongly support our theoretical guarantees.
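The median-of-means technique named above can be sketched in a few lines: split the observations into groups, average each group, and take the median of the group means, so a few extreme payoffs corrupt only their own group. The Pareto payoff distribution is an illustrative choice (finite mean, infinite variance):

```python
import numpy as np

def median_of_means(x, k):
    """Split x into k groups, average each group, return the median of
    the group means -- a mean estimator robust to heavy tails."""
    groups = np.array_split(np.asarray(x), k)
    return float(np.median([g.mean() for g in groups]))

rng = np.random.default_rng(0)
# Heavy-tailed payoffs: numpy's pareto(1.5) (Lomax) has mean 2 and
# infinite variance, so it violates the sub-Gaussian assumption.
payoffs = rng.pareto(1.5, size=10_000)
mom = median_of_means(payoffs, k=30)
print(mom, payoffs.mean())
```

Unlike the empirical mean, the median-of-means estimate concentrates around the true mean even under such tails, which is what makes it usable inside a bandit algorithm.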
【Keywords】: Machine Learning: Online Learning; Machine Learning: Probabilistic Machine Learning;
【Paper Link】 【Pages】:2943-2949
【Authors】: Haobo Wang ; Weiwei Liu ; Yang Zhao ; Tianlei Hu ; Ke Chen ; Gang Chen
【Abstract】: Multi-dimensional classification has attracted huge attention from the community. Though most studies consider fully annotated data, in real practice obtaining fully labeled data in MDC tasks is usually intractable. In this paper, we propose a novel learning paradigm: MultiDimensional Partial Label Learning (MDPL) where the ground-truth labels of each instance are concealed in multiple candidate label sets. We first introduce the partial hamming loss for MDPL that incurs a large loss if the predicted labels are not in candidate label sets, and provide an empirical risk minimization (ERM) framework. Theoretically, we rigorously prove the conditions for ERM learnability of MDPL in both independent and dependent cases. Furthermore, we present two MDPL algorithms under our proposed ERM framework. Comprehensive experiments on both synthetic and real-world datasets validate the effectiveness of our proposals.
【Keywords】: Machine Learning: Multi-instance;Multi-label;Multi-view learning;
【Paper Link】 【Pages】:2950-2956
【Authors】: Hu Wang ; Guansong Pang ; Chunhua Shen ; Congbo Ma
【Abstract】: Deep neural networks have gained great success in a broad range of tasks due to their remarkable capability to learn semantically rich features from high-dimensional data. However, they often require large-scale labelled data to successfully learn such features, which significantly hinders their adoption in unsupervised learning tasks, such as anomaly detection and clustering, and limits their applications to critical domains where obtaining massive labelled data is prohibitively expensive. To enable unsupervised learning on those domains, in this work we propose to learn features without using any labelled data by training neural networks to predict data distances in a randomly projected space. Random mapping is a theoretically proven approach to obtain approximately preserved distances. To predict these distances well, the representation learner is optimised to learn genuine class structures that are implicitly embedded in the randomly projected space. Empirical results on 19 real-world datasets show that our learned representations substantially outperform a few state-of-the-art methods for both anomaly detection and clustering tasks. Code is available at: \url{https://git.io/RDP}
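The "approximately preserved distances" property of random mapping (the Johnson-Lindenstrauss phenomenon) is easy to verify directly; the sketch below checks one pairwise distance under a Gaussian random projection, with synthetic data standing in for real features:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 100, 1000, 128
X = rng.normal(size=(n, d))          # synthetic high-dimensional data

# Gaussian random projection scaled by 1/sqrt(k) approximately preserves
# pairwise Euclidean distances; the paper uses such projected-space
# distances as a label-free supervisory signal.
R = rng.normal(size=(d, k)) / np.sqrt(k)
Z = X @ R

i, j = 0, 1
orig = np.linalg.norm(X[i] - X[j])
proj = np.linalg.norm(Z[i] - Z[j])
ratio = proj / orig
print(ratio)  # close to 1
```

Because the targets are cheap to compute yet distance-faithful, a network trained to predict them must recover the data's intrinsic structure.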
【Keywords】: Machine Learning: Deep Learning; Machine Learning: Unsupervised Learning; Data Mining: Feature Extraction, Selection and Dimensionality Reduction;
【Paper Link】 【Pages】:2957-2963
【Authors】: Wenbin Li ; Lei Wang ; Jing Huo ; Yinghuan Shi ; Yang Gao ; Jiebo Luo
【Abstract】: The core idea of metric-based few-shot image classification is to directly measure the relations between query images and support classes to learn transferable feature embeddings. Previous work mainly focuses on image-level feature representations, which cannot effectively estimate a class's distribution due to the scarcity of samples. Some recent work shows that local descriptor based representations can achieve richer representations than image-level based representations. However, such works are still based on a less effective instance-level metric, especially a symmetric metric, to measure the relation between a query image and a support class. Given the natural asymmetric relation between a query image and a support class, we argue that an asymmetric measure is more suitable for metric-based few-shot learning. To that end, we propose a novel Asymmetric Distribution Measure (ADM) network for few-shot learning by calculating a joint local and global asymmetric measure between two multivariate local distributions of a query and a class. Moreover, a task-aware Contrastive Measure Strategy (CMS) is proposed to further enhance the measure function. On the popular miniImageNet and tieredImageNet benchmarks, ADM achieves state-of-the-art results, validating our innovative design of asymmetric distribution measures for few-shot learning. The source code can be downloaded from https://github.com/WenbinLee/ADM.git.
【Keywords】: Machine Learning: Transfer, Adaptation, Multi-task Learning; Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation;
【Paper Link】 【Pages】:2964-2972
【Authors】: Jiaqi Zhang ; Meng Wang ; Qinchi Li ; Sen Wang ; Xiaojun Chang ; Beilun Wang
【Abstract】: We consider the problem of estimating a sparse Gaussian Graphical Model with a special graph topological structure and more than a million variables. Most previous scalable estimators still contain expensive calculation steps (e.g., matrix inversion or Hessian matrix calculation) and become infeasible in high-dimensional scenarios, where p (number of variables) is larger than n (number of samples). To overcome this challenge, we propose a novel method, called Fast and Scalable Inverse Covariance Estimator by Thresholding (FST). FST first obtains a graph structure by applying a generalized threshold to the sample covariance matrix. Then, it solves multiple block-wise subproblems via element-wise thresholding. By using matrix thresholding instead of matrix inversion as the computational bottleneck, FST reduces its computational complexity to a much lower order of magnitude (O(p^2)). We show that FST obtains the same sharp convergence rate O(√(log max{p, n}/n)) as other state-of-the-art methods. We validate the method empirically, on multiple simulated datasets and one real-world dataset, and show that FST is two times faster than the four baselines while achieving a lower error rate under both Frobenius-norm and max-norm.
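The O(p^2) element-wise step can be sketched with soft thresholding as the generalized threshold (one concrete choice; the full FST pipeline with block decomposition is not reproduced here, and the data are synthetic):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.normal(size=(n, p))          # synthetic samples
S = np.cov(X, rowvar=False)          # sample covariance

def soft_threshold(S, lam):
    """Generalized (soft) thresholding of the sample covariance: shrink
    off-diagonal entries toward zero, keep the diagonal intact. This is
    O(p^2) element-wise work -- no matrix inversion required."""
    T = np.sign(S) * np.maximum(np.abs(S) - lam, 0.0)
    np.fill_diagonal(T, np.diag(S))
    return T

T = soft_threshold(S, lam=0.2)
# Small spurious correlations are zeroed out, revealing a sparse structure.
print(np.count_nonzero(T), np.count_nonzero(S))
```

Zeroed entries then define the graph structure whose blocks can be estimated independently, which is where the method's scalability comes from.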
【Keywords】: Machine Learning: Big data; Scalability; Machine Learning: Learning Graphical Models; Data Mining: Mining Graphs, Semi Structured Data, Complex Data;
【Paper Link】 【Pages】:2973-2979
【Authors】: Jiafeng Cheng ; Qianqian Wang ; Zhiqiang Tao ; De-Yan Xie ; Quanxue Gao
【Abstract】: Graph neural networks (GNNs) have made considerable achievements in processing graph-structured data. However, existing methods cannot allocate learnable weights to different nodes in the neighborhood and lack robustness because they neglect both node attributes and graph reconstruction. Moreover, most multi-view GNNs mainly focus on the case of multiple graphs, while designing GNNs for graph-structured data with multi-view attributes remains under-explored. In this paper, we propose a novel Multi-View Attribute Graph Convolution Networks (MAGCN) model for the clustering task. MAGCN is designed with two-pathway encoders that map graph embedding features and learn the view-consistency information. Specifically, the first pathway develops multi-view attribute graph attention networks to reduce the noise/redundancy and learn the graph embedding features for each multi-view graph dataset. The second pathway develops consistent embedding encoders to capture the geometric relationship and probability distribution consistency among different views, which adaptively finds a consistent clustering embedding space for multi-view attributes. Experiments on three benchmark graph datasets show the superiority of our method compared with several state-of-the-art algorithms.
【Keywords】: Machine Learning: Clustering; Machine Learning: Deep Learning; Machine Learning: Multi-instance;Multi-label;Multi-view learning; Machine Learning: Unsupervised Learning;
【Paper Link】 【Pages】:2980-2986
【Authors】: Xin Wang ; Siu-Ming Yiu
【Abstract】: Deep Infomax (DIM) is an unsupervised representation learning framework that maximizes the mutual information between the inputs and the outputs of an encoder, while probabilistic constraints are imposed on the outputs. In this paper, we propose Supervised Deep InfoMax (SDIM), which introduces supervised probabilistic constraints on the encoder outputs. The supervised probabilistic constraints are equivalent to a generative classifier on high-level data representations, where class conditional log-likelihoods of samples can be evaluated. Unlike other works building generative classifiers with conditional generative models, SDIM scales to complex datasets and can achieve comparable performance with discriminative counterparts. With SDIM, we can perform classification with rejection: instead of always reporting a class label, SDIM only makes a prediction when a test sample's largest class conditional log-likelihood surpasses a pre-chosen threshold; otherwise, the sample is deemed out of the data distribution and rejected. Our experiments show that SDIM with this rejection policy can effectively reject illegal inputs, including adversarial examples and out-of-distribution samples.
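The rejection rule itself is a one-liner once per-class conditional log-likelihoods are available; the sketch below uses invented log-likelihood values and threshold, standing in for SDIM's learned scores:

```python
import numpy as np

def predict_with_rejection(class_loglik, threshold):
    """Report the argmax class only if its conditional log-likelihood
    clears the threshold; otherwise reject the input as out of the
    data distribution."""
    k = int(np.argmax(class_loglik))
    return k if class_loglik[k] >= threshold else None

# Confident in-distribution sample: class 1 clears the threshold.
print(predict_with_rejection(np.array([-2.0, -0.5, -4.0]), threshold=-1.0))
# All classes are implausible: the sample is rejected.
print(predict_with_rejection(np.array([-9.0, -8.5, -7.9]), threshold=-1.0))
```

The threshold trades off coverage against robustness: raising it rejects more adversarial and out-of-distribution inputs at the cost of abstaining on some legitimate ones.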
【Keywords】: Machine Learning: Classification; Multidisciplinary Topics and Applications: Security and Privacy; Machine Learning: Learning Generative Models;
【Paper Link】 【Pages】:2987-2993
【Authors】: Fuxiang Zhang ; Xin Wang ; Zhao Li ; Jianxin Li
【Abstract】: Representation learning of knowledge graphs aims to project both entities and relations as vectors in a continuous low-dimensional space. Relation Hierarchical Structure (RHS), which is constructed by a generalization relationship named subRelationOf between relations, can improve the overall performance of knowledge representation learning. However, most of the existing methods ignore this critical information, and a straightforward way of considering RHS may have a negative effect on the embeddings and thus reduce the model performance. In this paper, we propose a novel method named TransRHS, which is able to incorporate RHS seamlessly into the embeddings. More specifically, TransRHS encodes each relation as a vector together with a relation-specific sphere in the same space. Our TransRHS employs the relative positions among the vectors and spheres to model the subRelationOf, which embodies the inherent generalization relationships among relations. We evaluate our model on two typical tasks, i.e., link prediction and triple classification. The experimental results show that our TransRHS model significantly outperforms all baselines on both tasks, which verifies that the RHS information is significant to representation learning of knowledge graphs, and TransRHS can effectively and efficiently fuse RHS into knowledge graph embeddings.
【Keywords】: Machine Learning: Knowledge-based Learning; Natural Language Processing: Embeddings;
【Paper Link】 【Pages】:2994-3000
【Authors】: Zhongxia Chen ; Xiting Wang ; Xing Xie ; Mehul Parsana ; Akshay Soni ; Xiang Ao ; Enhong Chen
【Abstract】: Recent studies have shown that both accuracy and explainability are important for recommendation. In this paper, we introduce explainable conversational recommendation, which enables incremental improvement of both recommendation accuracy and explanation quality through multi-turn user-model conversation. We show how the problem can be formulated, and design an incremental multi-task learning framework that enables tight collaboration between recommendation prediction, explanation generation, and user feedback integration. We also propose a multi-view feedback integration method to enable effective incremental model updates. Empirical results demonstrate that our model not only consistently improves the recommendation accuracy but also generates explanations that fit the user interests reflected in their feedback.
【Keywords】: Machine Learning: Explainable Machine Learning; Machine Learning: Recommender Systems; Natural Language Processing: Natural Language Generation;
【Paper Link】 【Pages】:3001-3008
【Authors】: Feng Zhu ; Yan Wang ; Chaochao Chen ; Guanfeng Liu ; Xiaolin Zheng
【Abstract】: The conventional single-target Cross-Domain Recommendation (CDR) only improves the recommendation accuracy on a target domain with the help of a source domain (with relatively richer information). In contrast, the novel dual-target CDR has been proposed to improve the recommendation accuracies on both domains simultaneously. However, dual-target CDR faces two new challenges: (1) how to generate more representative user and item embeddings, and (2) how to effectively optimize the user/item embeddings on each domain. To address these challenges, in this paper, we propose a graphical and attentional framework, called GA-DTCDR. In GA-DTCDR, we first construct two separate heterogeneous graphs based on the rating and content information from two domains to generate more representative user and item embeddings. Then, we propose an element-wise attention mechanism to effectively combine the embeddings of common users learned from both domains. Both steps significantly enhance the quality of user and item embeddings and thus improve the recommendation accuracy on each domain. Extensive experiments conducted on four real-world datasets demonstrate that GA-DTCDR significantly outperforms the state-of-the-art approaches.
【Keywords】: Machine Learning: Recommender Systems; Machine Learning: Transfer, Adaptation, Multi-task Learning;
【Paper Link】 【Pages】:3009-3015
【Authors】: Zheng Wang ; Feiping Nie ; Lai Tian ; Rong Wang ; Xuelong Li
【Abstract】: In this paper, we first propose a novel Structured Sparse Subspace Learning (S^3L) module to address the long-standing subspace sparsity issue. Building on this module, we design a new discriminative feature selection method, named Subspace Sparsity Discriminant Feature Selection (S^2DFS), which enables the following new functionalities: 1) S^2DFS directly combines a trace-ratio objective with a structured sparse subspace constraint via the L2,0-norm to learn a row-sparse subspace, which improves the discriminability of the model and avoids the parameter-tuning trouble of methods that use L2,1-norm regularization; 2) an alternating iterative optimization algorithm based on the proposed S^3L module is presented to explicitly solve the proposed problem, with a closed-form solution and a strict convergence proof. To the best of our knowledge, this objective function and solver are first proposed in this paper, providing a new direction for the development of feature selection methods. Extensive experiments conducted on several high-dimensional datasets demonstrate the discriminability of the features selected by S^2DFS compared with several related state-of-the-art feature selection methods. Source MATLAB code: https://github.com/StevenWangNPU/L20-FS.
【Keywords】: Machine Learning: Feature Selection; Learning Sparse Models; Data Mining: Feature Extraction, Selection and Dimensionality Reduction;
【Paper Link】 【Pages】:3016-3022
【Authors】: Umang Bhatt ; Adrian Weller ; José M. F. Moura
【Abstract】: A feature-based model explanation denotes how much each input feature contributes to a model's output for a given data point. As the number of proposed explanation functions grows, we lack quantitative evaluation criteria to help practitioners know when to use which explanation function. This paper proposes quantitative evaluation criteria for feature-based explanations: low sensitivity, high faithfulness, and low complexity. We devise a framework for aggregating explanation functions. We develop a procedure for learning an aggregate explanation function with lower complexity and then derive a new aggregate Shapley value explanation function that minimizes sensitivity.
【Keywords】: Machine Learning: Explainable Machine Learning; AI Ethics: Explainability; Machine Learning: Interpretability;
【Paper Link】 【Pages】:3023-3029
【Authors】: Chuhan Wu ; Fangzhao Wu ; Tao Qi ; Yongfeng Huang
【Abstract】: Modeling user interest is critical for accurate news recommendation. Existing news recommendation methods usually infer user interest from click behaviors on news. However, users may click a news article because they are attracted by its title shown on the news website homepage, yet may not be satisfied with its content after reading; in many cases, users close the news page quickly after clicking. In this paper we propose to model user interest from both click behaviors on news titles and reading behaviors on news content for news recommendation. More specifically, we propose a personalized reading speed metric to measure users' satisfaction with news content. We learn embeddings of users from the news content they have read and their satisfaction with it to model their interest in news content. In addition, we learn another user embedding from the news titles they have clicked to model their preference for news titles. We combine both kinds of user embeddings into a unified user representation for news recommendation. We train the user representation model using two supervised learning tasks built from user behaviors, i.e., news-title-based click prediction and news-content-based satisfaction prediction, to encourage our model to recommend news articles that not only are likely to be clicked but also have content that satisfies the user. Experiments on a real-world dataset show our method can effectively boost the performance of user modeling for news recommendation.
【Keywords】: Machine Learning: Recommender Systems; Humans and AI: Personalization and User Modeling;
【Paper Link】 【Pages】:3030-3036
【Authors】: Hanning Gao ; Lingfei Wu ; Po Hu ; Fangli Xu
【Abstract】: The task of RDF-to-text generation is to generate a corresponding descriptive text given a set of RDF triples. Most of the previous approaches either cast this task as a sequence-to-sequence problem or employ graph-based encoder for modeling RDF triples and decode a text sequence. However, none of these methods can explicitly model both local and global structure information between and within the triples. To address these issues, we propose to jointly learn local and global structure information via combining two new graph-augmented structural neural encoders (i.e., a bidirectional graph encoder and a bidirectional graph-based meta-paths encoder) for the input triples. Experimental results on two different WebNLG datasets show that our proposed model outperforms the state-of-the-art baselines. Furthermore, we perform a human evaluation that demonstrates the effectiveness of the proposed method by evaluating generated text quality using various subjective metrics.
【Keywords】: Machine Learning: Relational Learning; Data Mining: Mining Graphs, Semi Structured Data, Complex Data; Natural Language Processing: Natural Language Generation;
【Paper Link】 【Pages】:3037-3043
【Authors】: Peixi Peng ; Junliang Xing ; Lili Cao
【Abstract】: This paper aims to learn multi-agent cooperation where each agent performs its actions in a decentralized way. In this case, it is very challenging to learn decentralized policies when the rewards are global and sparse. Recently, learning from demonstrations (LfD) has provided a promising way to handle this challenge. However, in many practical tasks, the available demonstrations are often sub-optimal. To learn better policies from these sub-optimal demonstrations, this paper follows a centralized-learning and decentralized-execution framework and proposes a novel hybrid learning method based on multi-agent actor-critic. First, the expert trajectory returns generated from demonstration actions are used to pre-train the centralized critic network. Then, multi-agent decisions are made by best response dynamics based on the critic and used to train the decentralized actor networks. Finally, the demonstrations are updated by the actor networks, and the critic and actor networks are learned jointly by running the above two steps alternately. We evaluate the proposed approach on a real-time strategy combat game. Experimental results show that the approach outperforms many competing demonstration-based methods.
【Keywords】: Machine Learning: Reinforcement Learning; Agent-based and Multi-agent Systems: Multi-agent Learning; Multidisciplinary Topics and Applications: Computer Games;
【Paper Link】 【Pages】:3044-3050
【Authors】: Chao Qian ; Hang Xiong ; Ke Xue
【Abstract】: Bayesian optimization (BO) is a popular approach for expensive black-box optimization, with applications including parameter tuning, experimental design, and robotics. BO usually models the objective function by a Gaussian process (GP), and iteratively samples the next data point by maximizing an acquisition function. In this paper, we propose a new general framework for BO by generating pseudo-points (i.e., data points whose objective values are not evaluated) to improve the GP model. With the classic acquisition function, i.e., upper confidence bound (UCB), we prove that the cumulative regret can be generally upper bounded. Experiments using UCB and other acquisition functions, i.e., probability of improvement (PI) and expectation of improvement (EI), on synthetic as well as real-world problems clearly show the advantage of generating pseudo-points.
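The UCB acquisition step the abstract refers to can be sketched with a toy one-dimensional Gaussian-process surrogate. This is a minimal pure-Python illustration of the standard BO loop's "maximize mu + beta * sigma" step, not the paper's pseudo-point method; the RBF length-scale, noise level, and beta are illustrative choices.

```python
import math

def rbf(x1, x2, ls=1.0):
    # Squared-exponential (RBF) kernel for scalar inputs.
    return math.exp(-0.5 * ((x1 - x2) / ls) ** 2)

def solve(A, b):
    # Gaussian elimination with partial pivoting; fine for tiny systems.
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def gp_posterior(X, y, x_star, noise=1e-6):
    # GP posterior mean and variance at x_star given observations (X, y).
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    k_star = [rbf(a, x_star) for a in X]
    alpha = solve(K, y)
    mu = sum(ks * al for ks, al in zip(k_star, alpha))
    v = solve(K, k_star)
    var = max(rbf(x_star, x_star) - sum(ks * vi for ks, vi in zip(k_star, v)), 0.0)
    return mu, var

def ucb_next_point(X, y, candidates, beta=2.0):
    # Sample next where the upper confidence bound mu + beta * sigma is largest.
    def ucb(x):
        mu, var = gp_posterior(X, y, x)
        return mu + beta * math.sqrt(var)
    return max(candidates, key=ucb)
```

With beta > 0 the rule trades off exploitation (high posterior mean) against exploration (high posterior variance), which is the acquisition the paper's pseudo-points are designed to sharpen.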
【Keywords】: Machine Learning: Bayesian Optimization; Heuristic Search and Game Playing: Heuristic Search; Heuristic Search and Game Playing: Heuristic Search and Machine Learning;
【Paper Link】 【Pages】:3051-3057
【Authors】: Bowen Weng ; Huaqing Xiong ; Yingbin Liang ; Wei Zhang
【Abstract】: Existing convergence analyses of Q-learning mostly focus on the vanilla stochastic gradient descent (SGD) type of updates. Although Adaptive Moment Estimation (Adam) is commonly used in practical Q-learning algorithms, no convergence guarantee has been provided for Q-learning with this type of update. In this paper, we first characterize the convergence rate of Q-AMSGrad, which is the Q-learning algorithm with the AMSGrad update (a commonly adopted alternative of Adam for theoretical analysis). To further improve the performance, we propose to incorporate a momentum restart scheme into Q-AMSGrad, resulting in the so-called Q-AMSGradR algorithm. The convergence rate of Q-AMSGradR is also established. Our experiments on a linear quadratic regulator problem demonstrate that the two proposed Q-learning algorithms outperform vanilla Q-learning with SGD updates. The two algorithms also exhibit significantly better performance than the DQN learning method over a batch of Atari 2600 games.
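The AMSGrad update mentioned above differs from Adam only in keeping a running element-wise maximum of the second-moment estimate, which is what makes convergence analyses tractable. A minimal sketch of one update step (the hyperparameter defaults are illustrative, not the paper's settings):

```python
def amsgrad_step(theta, grad, state, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    # One AMSGrad step. state = (m, v, v_hat): first moment, second moment,
    # and the running element-wise maximum of v — the only change vs. Adam.
    m, v, v_hat = state
    new_m, new_v, new_vh, new_theta = [], [], [], []
    for i, g in enumerate(grad):
        mi = b1 * m[i] + (1 - b1) * g
        vi = b2 * v[i] + (1 - b2) * g * g
        vhi = max(v_hat[i], vi)  # the AMSGrad max step
        new_theta.append(theta[i] - lr * mi / (vhi ** 0.5 + eps))
        new_m.append(mi)
        new_v.append(vi)
        new_vh.append(vhi)
    return new_theta, (new_m, new_v, new_vh)
```

Because v_hat is non-decreasing, the effective step sizes are non-increasing per coordinate, avoiding the known non-convergence counterexamples for Adam.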
【Keywords】: Machine Learning: Reinforcement Learning; Machine Learning: Deep Reinforcement Learning; Machine Learning: Deep Learning; Constraints and SAT: Constraint Optimization;
【Paper Link】 【Pages】:3058-3064
【Authors】: Jun Huang ; Linchuan Xu ; Jing Wang ; Lei Feng ; Kenji Yamanishi
【Abstract】: Existing multi-label learning (MLL) approaches mainly assume all the labels are observed and construct classification models with a fixed set of target labels (known labels). However, in some real applications, multiple latent labels may exist outside this set and hide in the data, especially for large-scale data sets. Discovering and exploring the latent labels hidden in the data may not only find interesting knowledge but also help us to build a more robust learning model. In this paper, a novel approach named DLCL (i.e., Discovering Latent Class Labels for MLL) is proposed which can not only discover the latent labels in the training data but also predict new instances with the latent and known labels simultaneously. Extensive experiments show a competitive performance of DLCL against other state-of-the-art MLL approaches.
【Keywords】: Machine Learning: Classification; Machine Learning: Multi-instance;Multi-label;Multi-view learning; Data Mining: Classification, Semi-Supervised Learning;
【Paper Link】 【Pages】:3065-3072
【Authors】: Xiaoxing Wang ; Chao Xue ; Junchi Yan ; Xiaokang Yang ; Yonggang Hu ; Kewei Sun
【Abstract】: Differentiable architecture search (DARTS) has been a promising one-shot architecture search approach for its mathematical formulation and competitive results. However, besides its high memory utilization and large computation requirements, many research works have shown that DARTS also often suffers from notable over-fitting and thus does not work robustly on some new tasks. In this paper, we propose a one-shot neural architecture search method referred to as MergeNAS, which merges different types of operations (e.g., convolutions) into one operation. This merge-based approach not only reduces the search cost (to about half a GPU day), but also alleviates over-fitting by reducing the redundant parameters. Extensive experiments on different search spaces and various datasets have been conducted to verify our approach, showing that MergeNAS can converge to a stable architecture and achieve better performance with fewer parameters and lower search cost. For test accuracy and its stability, MergeNAS outperforms all NAS baseline methods implemented on NAS-Bench-201, including DARTS, ENAS, RS, BOHB, GDAS, and hand-crafted ResNet.
【Keywords】: Machine Learning: Deep Learning; Machine Learning: Deep Learning: Convolutional networks;
【Paper Link】 【Pages】:3073-3079
【Authors】: Hui Xue ; Zheng-Fan Wu
【Abstract】: Recently, deep spectral kernel networks (DSKNs) have attracted wide attention. They consist of periodic computational elements that can be activated across the whole feature space. In theory, DSKNs have the potential to reveal input-dependent and long-range characteristics, and thus are expected to perform more competitively than prevailing networks. But in practice, they are still unable to achieve the desired effects. The structural superiority of DSKNs comes at the cost of difficult optimization: the periodicity of the computational elements leads to many poor and dense local minima in the loss landscape, so DSKNs are more likely to get stuck in these local minima and perform worse than expected. Hence, in this paper, we propose the novel Bayesian random Kernel mapping Networks (BaKer-Nets), whose preferable learning processes escape randomly from most local minima. Specifically, BaKer-Nets consist of two core components: 1) a prior-posterior bridge is derived to model the uncertainty of computational elements in a principled way; 2) a Bayesian learning paradigm is presented to optimize the prior-posterior bridge efficiently. With well-tuned uncertainty, BaKer-Nets can not only explore more potential solutions to avoid local minima, but also exploit these ensemble solutions to strengthen their robustness. Systematic experiments demonstrate the significance of BaKer-Nets in improving learning processes while preserving the structural superiority.
【Keywords】: Machine Learning: Kernel Methods; Machine Learning: Deep Learning;
【Paper Link】 【Pages】:3080-3086
【Authors】: Hu Ding ; Fan Yang ; Mingyue Wang
【Abstract】: The density-based clustering method Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is a popular method for outlier recognition and has received tremendous attention from many different areas. A major issue of the original DBSCAN is that its time complexity can be as large as quadratic. Most existing DBSCAN algorithms focus on developing efficient index structures to speed up the procedure in low-dimensional Euclidean space. However, to the best of our knowledge, research on DBSCAN in high-dimensional Euclidean space or general metric spaces is still quite limited. In this paper, we consider the metric DBSCAN problem under the assumption that the inliers (excluding the outliers) have a low doubling dimension. We apply a novel randomized k-center clustering idea to reduce the complexity of the range query, which is the most time-consuming step in the whole DBSCAN procedure. Our proposed algorithms do not need to build any complicated data structures and are easy to implement in practice. The experimental results show that our algorithms can significantly outperform the existing DBSCAN algorithms in terms of running time.
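The range query the abstract identifies as the bottleneck is the inner loop of textbook DBSCAN. A plain-Python sketch of that baseline makes the cost visible: the naive region_query below scans all n points per call, which is exactly the step the paper's k-center idea accelerates (this is the classic algorithm, not the paper's method).

```python
def region_query(points, idx, eps, dist):
    # Naive range query: O(n) per call — the bottleneck the paper speeds up.
    return [j for j, q in enumerate(points) if dist(points[idx], q) <= eps]

def dbscan(points, eps, min_pts, dist):
    # Textbook DBSCAN over an arbitrary metric `dist`.
    # Labels: -1 = noise/outlier, 0.. = cluster ids, None = unvisited.
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps, dist)
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        seeds = list(neighbors)
        while seeds:
            j = seeds.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = region_query(points, j, eps, dist)
            if len(nb) >= min_pts:  # j is a core point: expand the cluster
                seeds.extend(nb)
    return labels
```

Since `dist` is an arbitrary callable, the same code runs in any metric space; the quadratic worst case comes from the repeated linear scans in region_query.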
【Keywords】: Machine Learning: Clustering; Machine Learning: Unsupervised Learning; Data Mining: Clustering, Unsupervised Learning;
【Paper Link】 【Pages】:3087-3093
【Authors】: Luting Yang ; Jianyi Yang ; Shaolei Ren
【Abstract】: Contextual bandit is a classic multi-armed bandit setting, where side information (i.e., context) is available before arm selection. A standard assumption is that exact contexts are perfectly known prior to arm selection and only single feedback is returned. In this work, we focus on multi-feedback bandit learning with probabilistic contexts, where a bundle of contexts are revealed to the agent along with their corresponding probabilities at the beginning of each round. This models such scenarios as where contexts are drawn from the probability output of a neural network and the reward function is jointly determined by multiple feedback signals. We propose a kernelized learning algorithm based on upper confidence bound to choose the optimal arm in reproducing kernel Hilbert space for each context bundle. Moreover, we theoretically establish an upper bound on the cumulative regret with respect to an oracle that knows the optimal arm given probabilistic contexts, and show that the bound grows sublinearly with time. Our simulation on machine learning model recommendation further validates the sub-linearity of our cumulative regret and demonstrates that our algorithm outperforms the approach that selects arms based on the most probable context.
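A simplified, non-kernelized illustration of UCB selection over a probabilistic context bundle: each arm is scored by its probability-weighted estimated reward plus an exploration bonus. The paper's actual algorithm works in an RKHS; the tabular estimates and the bonus form below are assumptions for illustration only.

```python
import math

def ucb_select(arm_stats, bundle, t, alpha=1.0):
    # arm_stats: {arm: {context: (mean_reward, pull_count)}}
    # bundle: [(context, probability)] revealed at the start of the round.
    # Score = sum_c p(c) * mean(arm, c) + alpha * sqrt(log t / pulls).
    best, best_score = None, -math.inf
    for arm, stats in arm_stats.items():
        score, total = 0.0, 0
        for ctx, p in bundle:
            mean, count = stats.get(ctx, (0.0, 0))
            score += p * mean
            total += count
        bonus = alpha * math.sqrt(math.log(max(t, 2)) / max(total, 1))
        if score + bonus > best_score:
            best, best_score = arm, score + bonus
    return best
```

Weighting by the bundle probabilities, rather than committing to the single most probable context, is exactly the distinction the paper's experiments highlight.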
【Keywords】: Machine Learning: Online Learning; Machine Learning: Recommender Systems;
【Paper Link】 【Pages】:3094-3100
【Authors】: Tianpei Yang ; Jianye Hao ; Zhaopeng Meng ; Zongzhang Zhang ; Yujing Hu ; Yingfeng Chen ; Changjie Fan ; Weixun Wang ; Wulong Liu ; Zhaodong Wang ; Jiajie Peng
【Abstract】: Transfer learning has shown great potential to accelerate Reinforcement Learning (RL) by leveraging prior knowledge from past learned policies of relevant tasks. Existing approaches either transfer previous knowledge by explicitly computing similarities between tasks or select appropriate source policies to provide guided explorations. However, an approach that directly optimizes the target policy by alternately utilizing knowledge from appropriate source policies, without explicitly measuring the similarities, is currently missing. In this paper, we propose a novel Policy Transfer Framework (PTF) that takes advantage of this idea. PTF learns when and which source policy is best to reuse for the target policy and when to terminate it, by modeling multi-policy transfer as an option learning problem. PTF can easily be combined with existing DRL methods, and experimental results show it significantly accelerates RL and surpasses state-of-the-art policy transfer methods in terms of learning efficiency and final performance in both discrete and continuous action spaces.
【Keywords】: Machine Learning: Reinforcement Learning; Machine Learning: Transfer, Adaptation, Multi-task Learning;
【Paper Link】 【Pages】:3101-3107
【Authors】: Thanh Tan Nguyen ; Nan Ye ; Peter L. Bartlett
【Abstract】: We consider learning a convex combination of basis models, and present some new theoretical and empirical results that demonstrate the effectiveness of a greedy approach. Theoretically, we first consider whether we can use linear, instead of convex, combinations, and obtain generalization results similar to existing ones for learning from a convex hull. We obtain a negative result that even the linear hull of very simple basis functions can have unbounded capacity, and is thus prone to overfitting; on the other hand, convex hulls are still rich but have bounded capacities. Secondly, we obtain a generalization bound for a general class of Lipschitz loss functions. Empirically, we first discuss how a convex combination can be greedily learned with early stopping, and how a convex combination can be non-greedily learned when the number of basis models is known a priori. Our experiments suggest that the greedy scheme is competitive with or better than several baselines, including boosting and random forests. The greedy algorithm requires little effort in hyper-parameter tuning, and also seems able to adapt to the underlying complexity of the problem. Our code is available at https://github.com/tan1889/gce.
【Keywords】: Machine Learning: Ensemble Methods; Machine Learning: Learning Theory;
【Paper Link】 【Pages】:3108-3116
【Authors】: Mohammadamin Barekatain ; Ryo Yonetani ; Masashi Hamaya
【Abstract】: Transfer reinforcement learning (RL) aims at improving the learning efficiency of an agent by exploiting knowledge from other source agents trained on relevant tasks. However, it remains challenging to transfer knowledge between different environmental dynamics without having access to the source environments. In this work, we explore a new challenge in transfer RL, where only a set of source policies collected under diverse unknown dynamics is available for learning a target task efficiently. To address this problem, the proposed approach, MULTI-source POLicy AggRegation (MULTIPOLAR), comprises two key techniques. We learn to aggregate the actions provided by the source policies adaptively to maximize the target task performance. Meanwhile, we learn an auxiliary network that predicts residuals around the aggregated actions, which ensures the target policy's expressiveness even when some of the source policies perform poorly. We demonstrated the effectiveness of MULTIPOLAR through an extensive experimental evaluation across six simulated environments ranging from classic control problems to challenging robotics simulations, under both continuous and discrete action spaces. The demo videos and code are available on the project webpage: https://omron-sinicx.github.io/multipolar/.
【Keywords】: Machine Learning: Reinforcement Learning; Machine Learning: Transfer, Adaptation, Multi-task Learning; Machine Learning: Deep Reinforcement Learning;
【Paper Link】 【Pages】:3117-3123
【Authors】: Da Yu ; Huishuai Zhang ; Wei Chen ; Jian Yin ; Tie-Yan Liu
【Abstract】: Gradient perturbation, widely used for differentially private optimization, injects noise at every iterative update to guarantee differential privacy. Previous work first determines the noise level that can satisfy the privacy requirement and then analyzes the utility of noisy gradient updates as in the non-private case. In contrast, we explore how the privacy noise affects the optimization property. We show that for differentially private convex optimization, the utility guarantee of differentially private (stochastic) gradient descent is determined by an expected curvature rather than the minimum curvature. The expected curvature, which represents the average curvature over the optimization path, is usually much larger than the minimum curvature. By using the expected curvature, we show that gradient perturbation can achieve a significantly improved utility guarantee that can theoretically justify the advantage of gradient perturbation over other perturbation methods. Finally, our extensive experiments suggest that gradient perturbation with the advanced composition method indeed outperforms other perturbation approaches by a large margin, matching our theoretical findings.
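The gradient-perturbation mechanism the abstract analyzes is, in its common DP-SGD form, per-example gradient clipping followed by Gaussian noise. A minimal sketch of one noisy update (the clipping norm, noise multiplier, and averaging convention are illustrative choices, not the paper's analysis settings):

```python
import math
import random

def dp_gradient_step(theta, per_example_grads, lr, clip, sigma, rng=random):
    # Clip each per-example gradient to L2 norm `clip`, sum, add Gaussian
    # noise scaled to the clipping bound, average, then take an SGD step.
    d = len(theta)
    summed = [0.0] * d
    for g in per_example_grads:
        norm = math.sqrt(sum(x * x for x in g))
        scale = min(1.0, clip / norm) if norm > 0 else 1.0
        for i in range(d):
            summed[i] += g[i] * scale
    n = len(per_example_grads)
    noisy = [(summed[i] + rng.gauss(0.0, sigma * clip)) / n for i in range(d)]
    return [theta[i] - lr * noisy[i] for i in range(d)]
```

With sigma = 0 this reduces to clipped SGD; the privacy guarantee (and the utility trade-off the paper studies via the expected curvature) comes from the noise added at every iteration.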
【Keywords】: Machine Learning: Federated Learning; Multidisciplinary Topics and Applications: Security and Privacy;
【Paper Link】 【Pages】:3124-3130
【Authors】: Yuying Xing ; Guoxian Yu ; Jun Wang ; Carlotta Domeniconi ; Xiangliang Zhang
【Abstract】: Multi-view, Multi-instance, and Multi-label Learning (M3L) can model complex objects (bags), which are represented with different feature views, made of diverse instances, and annotated with discrete non-exclusive labels. Existing M3L approaches assume a complete correspondence between bags and views, and also assume a complete annotation for training. However, in practice, neither the correspondence between bags, nor the bags' annotations are complete. To tackle such a weakly-supervised M3L task, a solution called WSM3L is introduced. WSM3L adapts multimodal dictionary learning to learn a shared dictionary (representational space) across views and individual encoding vectors of bags for each view. The label similarity and feature similarity of encoded bags are jointly used to match bags across views. In addition, it replenishes the annotations of a bag based on the annotations of its neighborhood bags, and introduces a dispatch and aggregation term to dispatch bag-level annotations to instances and to reversely aggregate instance-level annotations to bags. WSM3L unifies these objectives and processes in a joint objective function to predict the instance-level and bag-level annotations in a coordinated fashion, and it further introduces an alternative solution for the objective function optimization. Extensive experimental results show the effectiveness of WSM3L on benchmark datasets.
【Keywords】: Machine Learning: Classification; Machine Learning: Multi-instance;Multi-label;Multi-view learning;
【Paper Link】 【Pages】:3131-3138
【Authors】: Rundong Wang ; Runsheng Yu ; Bo An ; Zinovi Rabinovich
【Abstract】: Hierarchical reinforcement learning (HRL) is a promising approach to solve tasks with long time horizons and sparse rewards. It is often implemented as a high-level policy assigning subgoals to a low-level policy. However, it suffers from the high-level non-stationarity problem, since the low-level policy is constantly changing. The non-stationarity also leads to a data efficiency problem: policies need more data at non-stationary states to stabilize training. To address these issues, we propose a novel HRL method: Interactive Influence-based Hierarchical Reinforcement Learning (I^2HRL). First, inspired by agent modeling, we enable interaction between the low-level and high-level policies to stabilize high-level policy training: the high-level policy makes decisions conditioned on the received low-level policy representation as well as the state of the environment. Second, we further stabilize the high-level policy via an information-theoretic regularization with minimal dependence on the changing low-level policy. Third, we propose influence-based exploration to more frequently visit the non-stationary states where more transition data is needed. We experimentally validate the effectiveness of the proposed solution on several tasks in MuJoCo domains, demonstrating that our approach can significantly boost learning performance and accelerate learning compared with state-of-the-art HRL methods.
【Keywords】: Machine Learning: Deep Reinforcement Learning;
【Paper Link】 【Pages】:3139-3145
【Authors】: Wantong Lu ; Yantao Yu ; Yongzhe Chang ; Zhen Wang ; Chenhui Li ; Bo Yuan
【Abstract】: Factorization Machines (FMs) refer to a class of general predictors working with real valued feature vectors, which are well-known for their ability to estimate model parameters under significant sparsity and have found successful applications in many areas such as the click-through rate (CTR) prediction. However, standard FMs only produce a single fixed representation for each feature across different input instances, which may limit the CTR model’s expressive and predictive power. Inspired by the success of Input-aware Factorization Machines (IFMs), which aim to learn more flexible and informative representations of a given feature according to different input instances, we propose a novel model named Dual Input-aware Factorization Machines (DIFMs) that can adaptively reweight the original feature representations at the bit-wise and vector-wise levels simultaneously. Furthermore, DIFMs strategically integrate various components including Multi-Head Self-Attention, Residual Networks and DNNs into a unified end-to-end model. Comprehensive experiments on two real-world CTR prediction datasets show that the DIFM model can outperform several state-of-the-art models consistently.
【Keywords】: Machine Learning: Recommender Systems; Multidisciplinary Topics and Applications: Recommender Systems; Data Mining: Classification, Semi-Supervised Learning;
【Paper Link】 【Pages】:3146-3152
【Authors】: Kwei-Herng Lai ; Daochen Zha ; Yuening Li ; Xia Hu
【Abstract】: Policy distillation, which transfers a teacher policy to a student policy, has achieved great success in challenging tasks of deep reinforcement learning. This teacher-student framework requires a well-trained teacher model, which is computationally expensive; moreover, the performance of the student model can be limited by the teacher model if the teacher is not optimal. In light of collaborative learning, we study the feasibility of involving joint intellectual efforts from diverse perspectives of student models. In this work, we introduce dual policy distillation (DPD), a student-student framework in which two learners operate on the same environment to explore different perspectives of the environment and extract knowledge from each other to enhance their learning. The key challenge in developing this dual learning framework is to identify the beneficial knowledge from the peer learner for contemporary learning-based reinforcement learning algorithms, since it is unclear whether knowledge distilled from an imperfect and noisy peer learner would be helpful. To address this challenge, we theoretically justify that distilling knowledge from a peer learner leads to policy improvement, and propose a disadvantageous distillation strategy based on the theoretical results. The conducted experiments on several continuous control tasks show that the proposed framework achieves superior performance with a learning-based agent and function approximation, without the use of expensive teacher models.
【Keywords】: Machine Learning: Reinforcement Learning; Agent-based and Multi-agent Systems: Multi-agent Learning;
【Paper Link】 【Pages】:3153-3159
【Authors】: Hui Xu ; Chong Zhang ; Jiaxing Wang ; Deqiang Ouyang ; Yu Zheng ; Jie Shao
【Abstract】: Efficient exploration is a major challenge in Reinforcement Learning (RL) and has been studied extensively. However, for a new task, existing methods explore either by taking actions that maximize task-agnostic objectives (such as information gain) or by applying a simple dithering strategy (such as noise injection), which might not be effective enough. In this paper, we investigate whether previous learning experiences can be leveraged to guide exploration in a new task. To this end, we propose a novel Exploration with Structured Noise in Parameter Space (ESNPS) approach. ESNPS utilizes meta-learning and directly uses meta-policy parameters, which contain prior knowledge, as structured noise to perturb the base model for effective exploration in new tasks. Experimental results on four groups of tasks: cheetah velocity, cheetah direction, ant velocity and ant direction demonstrate the superiority of ESNPS against a number of competitive baselines.
【Keywords】: Machine Learning: Deep Reinforcement Learning;
【Paper Link】 【Pages】:3160-3166
【Authors】: Feiping Nie ; Han Zhang ; Rong Wang ; Xuelong Li
【Abstract】: In this paper, we present a technique that definitively addresses pairwise constraints in semi-supervised clustering. Our method formulates cannot-link relations and propagates them flexibly over the affinity graph. The pairwise constrained instances are provably guaranteed to lie in the same or different connected components of the graph. Combined with the Laplacian rank constraint, the proposed model learns a Pairwise Constrained structured Optimal Graph (PCOG), from which the specified c clusters supporting the known pairwise constraints are directly obtained. An efficient algorithm based on label propagation is designed to solve the formulation. Additionally, we provide a compact criterion for acquiring the key pairwise constraints that promote semi-supervised graph clustering. Substantial experimental results show that the proposed method achieves significant improvements using only a few prior pairwise constraints.
【Keywords】: Machine Learning: Semi-Supervised Learning; Machine Learning: Clustering;
【Paper Link】 【Pages】:3167-3173
【Authors】: Hongting Zhang ; Pan Zhou ; Qiben Yan ; Xiao-Yang Liu
【Abstract】: Audio adversarial examples, imperceptible to humans, have been constructed to attack automatic speech recognition (ASR) systems. However, the adversarial examples generated by existing approaches usually incorporate noticeable noise, especially during periods of silence and pauses. Moreover, the added noise often breaks the temporal dependency property of the original audio, which can be easily detected by state-of-the-art defense mechanisms. In this paper, we propose a new Iterative Proportional Clipping (IPC) algorithm that preserves temporal dependency in audio to generate more robust adversarial examples. We are motivated by the observation that temporal dependency in audio has a significant effect on human perception. Following this observation, we leverage a proportional clipping strategy to reduce noise during low-intensity periods. Experimental results and a user study both suggest that the generated adversarial examples can significantly reduce human-perceptible noise and resist defenses based on temporal structure.
【Keywords】: Machine Learning: Adversarial Machine Learning; Natural Language Processing: Speech;
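The proportional clipping idea described in the abstract above is concrete enough to sketch: bound each perturbation sample by a fixed fraction of the corresponding original sample's magnitude, so that silent regions receive near-zero noise. A minimal illustrative sketch in plain Python; the function name and the clipping ratio are assumptions, not the paper's actual implementation:

```python
def proportional_clip(audio, perturbation, ratio=0.05):
    """Clip each perturbation sample to at most `ratio` times the magnitude
    of the corresponding original audio sample, so near-silent regions get
    near-zero added noise (a sketch of the clipping idea only)."""
    clipped = []
    for x, d in zip(audio, perturbation):
        bound = ratio * abs(x)          # allowed noise scales with the signal
        clipped.append(max(-bound, min(bound, d)))
    return clipped

# A silent sample (x = 0.0) forces its noise to zero; louder samples admit more.
print(proportional_clip([0.0, 1.0, -0.5], [0.2, 0.02, 0.2], ratio=0.05))
```

Because the bound tracks the signal envelope, the clipped perturbation inherits the audio's temporal structure, which is the property the IPC defense-evasion argument relies on.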
【Paper Link】 【Pages】:3174-3180
【Authors】: Xiaobin Tang ; Jing Zhang ; Bo Chen ; Yang Yang ; Hong Chen ; Cuiping Li
【Abstract】: Knowledge graph alignment aims to link equivalent entities across different knowledge graphs. To utilize both the graph structures and side information such as names, descriptions and attributes, most existing works propagate the side information, especially names, through linked entities using graph neural networks. However, due to the heterogeneity of different knowledge graphs, alignment accuracy suffers from aggregating different neighbors. This work presents an interaction model that leverages only the side information. Instead of aggregating neighbors, we compute the interactions between neighbors, which can capture fine-grained matches of neighbors. The interactions of attributes are modeled similarly. Experimental results show that our model significantly outperforms the best state-of-the-art methods by 1.9-9.7% in terms of HitRatio@1 on the DBP15K dataset.
【Keywords】: Machine Learning: Knowledge-based Learning;
【Paper Link】 【Pages】:3181-3187
【Authors】: Shanshan Wang ; Lei Zhang
【Abstract】: Existing adversarial domain adaptation methods mainly consider the marginal distribution, which may lead to either under-transfer or negative transfer. To address this problem, we present a self-adaptive re-weighted adversarial domain adaptation approach, which enhances domain alignment from the perspective of the conditional distribution. To promote positive transfer and combat negative transfer, we reduce the weight of the adversarial loss for well-aligned features while increasing the adversarial force for those poorly aligned, as measured by the conditional entropy. Additionally, a triplet loss leveraging source samples and pseudo-labeled target samples is employed on the confused domain. This metric loss ensures that intra-class sample pairs are closer than inter-class pairs, achieving class-level alignment. In this way, highly accurate pseudo-labeled target samples and semantic alignment can be captured simultaneously in the co-training process. Our method achieves a low joint error of the ideal source and target hypothesis, so the expected target error can be upper bounded following Ben-David’s theorem. Empirical evidence demonstrates that the proposed model outperforms the state of the art on standard domain adaptation datasets.
【Keywords】: Machine Learning: Transfer, Adaptation, Multi-task Learning; Machine Learning: Classification; Machine Learning: Unsupervised Learning;
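The class-level alignment via triplet loss mentioned in the abstract above follows the standard triplet formulation. A plain-Python sketch of that standard loss (illustrative names; squared Euclidean distance is an assumption, the paper applies it to source samples and pseudo-labeled target samples):

```python
def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss on feature vectors: push the anchor-positive
    distance below the anchor-negative distance by at least `margin`.
    Returns 0 when the constraint is already satisfied."""
    def dist2(a, b):
        # squared Euclidean distance between two feature vectors
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return max(0.0, dist2(anchor, positive) - dist2(anchor, negative) + margin)

# Well-separated triplet incurs no loss; a violating triplet incurs `margin` minus the gap.
print(triplet_loss([0.0, 0.0], [0.0, 0.0], [2.0, 0.0], margin=1.0))  # -> 0.0
```

Minimizing this over anchor/positive pairs from the same class (and negatives from other classes) is what enforces the intra-class-closer-than-inter-class property the abstract describes.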
【Paper Link】 【Pages】:3188-3194
【Authors】: Miao Zhang ; Huiqi Li ; Shirui Pan ; Taoping Liu ; Steven W. Su
【Abstract】: One-shot neural architecture search (NAS) has received wide attention due to its computational efficiency. Most state-of-the-art one-shot NAS methods use the validation accuracy obtained by inheriting weights from the supernet as a stepping stone to search for the best-performing architecture, adopting a bilevel optimization pattern under the assumption that this validation accuracy approximates the test accuracy after re-training. However, recent works have found that there is no positive correlation between this validation accuracy and the test accuracy for these one-shot NAS methods, and that reward-based sampling for supernet training also entails the rich-get-richer problem. To handle this deceptive problem, this paper presents a new approach, Efficient Novelty-driven Neural Architecture Search, which samples the most abnormal architecture to train the supernet. Specifically, a single-path supernet is adopted, and only the weights of a single architecture sampled by our novelty search are optimized in each step, greatly reducing the memory demand. Experiments demonstrate the effectiveness and efficiency of our novelty-search-based architecture sampling method.
【Keywords】: Machine Learning: Deep Learning; Heuristic Search and Game Playing: Heuristic Search and Machine Learning; Heuristic Search and Game Playing: Meta-Reasoning and Meta-heuristics; Machine Learning: Online Learning;
【Paper Link】 【Pages】:3195-3201
【Authors】: Qiulin Zhang ; Zhuqing Jiang ; Qishuo Lu ; Jia'nan Han ; Zhengxin Zeng ; Shang-hua Gao ; Aidong Men
【Abstract】: Many effective solutions have been proposed to reduce the redundancy of models for inference acceleration. Nevertheless, common approaches mostly focus on eliminating less important filters or constructing efficient operations, while ignoring the pattern redundancy in feature maps. We reveal that many feature maps within a layer share similar but not identical patterns. However, it is difficult to identify whether features with similar patterns are redundant or contain essential details. Therefore, instead of directly removing uncertain redundant features, we propose a split-based convolutional operation, namely SPConv, that tolerates features with similar patterns but requires less computation. Specifically, we split input feature maps into a representative part and an uncertain redundant part, where intrinsic information is extracted from the representative part through relatively heavy computation, while tiny hidden details in the uncertain redundant part are processed with some lightweight operation. To recalibrate and fuse these two groups of processed features, we propose a parameter-free feature fusion module. Moreover, SPConv is formulated to replace the vanilla convolution in a plug-and-play way. Without any bells and whistles, experimental results on benchmarks demonstrate that SPConv-equipped networks consistently outperform state-of-the-art baselines in both accuracy and inference time on GPU, with FLOPs and parameters dropped sharply.
【Keywords】: Machine Learning: Deep Learning: Convolutional networks; Machine Learning: Feature Selection; Learning Sparse Models;
【Paper Link】 【Pages】:3202-3208
【Authors】: Jinglin Xu ; Xiangsen Zhang ; Wenbin Li ; Xinwang Liu ; Junwei Han
【Abstract】: Three-dimensional (3D) object classification is widely involved in various computer vision applications, e.g., autonomous driving and simultaneous localization and mapping, and has attracted considerable attention in the community. However, solving 3D object classification by directly employing 3D convolutional neural networks (CNNs) generally suffers from high computational cost. Besides, existing view-based methods cannot adequately explore the content relationships between views. To this end, this work proposes a novel multi-view framework that jointly uses multiple 2D-CNNs to capture discriminative information with relationships, together with a new multi-view loss fusion strategy, in an end-to-end manner. Specifically, we utilize multiple 2D views of a 3D object as input and integrate the intra-view and inter-view information of each view through the view-specific 2D-CNN and a series of modules (outer product, view pair pooling, 1D convolution, and fully connected transformation). Furthermore, we design a novel view ensemble mechanism that selects several discriminative and informative views to jointly infer the category of a 3D object. Extensive experiments demonstrate that the proposed method outperforms current state-of-the-art methods on 3D object classification. More importantly, this work provides a new way to improve 3D object classification by fully utilizing well-established 2D-CNNs.
【Keywords】: Machine Learning: Classification; Machine Learning: Multi-instance;Multi-label;Multi-view learning;
【Paper Link】 【Pages】:3209-3215
【Authors】: Hanyuan Zhang ; Xinyu Zhang ; Qize Jiang ; Baihua Zheng ; Zhenbang Sun ; Weiwei Sun ; Changhu Wang
【Abstract】: Trajectory similarity computation is a core problem in the field of trajectory data queries. However, the high time complexity of calculating trajectory similarity has always been a bottleneck in real-world applications. Learning-based methods can map trajectories into a uniform embedding space so that the similarity of two trajectories can be computed from their embeddings in constant time. In this paper, we propose a novel trajectory representation learning framework, Traj2SimVec, that performs scalable and robust trajectory similarity computation. We use a simple and fast trajectory simplification and indexing approach to obtain triplet training samples efficiently. We make the framework more robust by making full use of sub-trajectory similarity information as auxiliary supervision. Furthermore, the framework supports point matching queries by modeling the optimal matching relationship of trajectory points under different distance metrics. Comprehensive experiments on real-world datasets demonstrate that our model substantially outperforms all existing approaches.
【Keywords】: Machine Learning: Deep Learning; Data Mining: Mining Spatial, Temporal Data; Multidisciplinary Topics and Applications: Transportation;
【Paper Link】 【Pages】:3216-3222
【Authors】: Kangzhi Zhao ; Yong Zhang ; Hongzhi Yin ; Jin Wang ; Kai Zheng ; Xiaofang Zhou ; Chunxiao Xing
【Abstract】: Next Point-of-Interest (POI) recommendation plays an important role in location-based services. State-of-the-art methods learn the POI-level sequential patterns in the user's check-in sequence but ignore the subsequence patterns that often represent the socio-economic activities or coherence of preference of the users. However, it is challenging to integrate the semantic subsequences due to the difficulty of predefining the granularity of the complex but meaningful subsequences. In this paper, we propose the Adaptive Sequence Partitioner with Power-law Attention (ASPPA) to automatically identify each semantic subsequence of POIs and discover their sequential patterns. Our model adopts a state-based stacked recurrent neural network to hierarchically learn the latent structures of the user's check-in sequence. We also design a power-law attention mechanism to integrate the domain knowledge in spatial and temporal contexts. Extensive experiments on two real-world datasets demonstrate the effectiveness of our model.
【Keywords】: Machine Learning: Recommender Systems; Data Mining: Mining Spatial, Temporal Data;
【Paper Link】 【Pages】:3223-3229
【Authors】: Yongbiao Gao ; Yu Zhang ; Xin Geng
【Abstract】: Label distribution learning (LDL) is a novel machine learning paradigm that assigns to an instance a description degree of each label. However, most training datasets contain only simple logical labels rather than label distributions, due to the difficulty of obtaining label distributions directly. We propose to use prior knowledge to recover the label distributions. The process of recovering the label distributions from the logical labels is called label enhancement. In this paper, we formulate label enhancement as a dynamic decision process: the label distribution is adjusted by a series of actions conducted by a reinforcement learning agent according to sequential state representations, with the target state defined by the prior knowledge. Experimental results show that the proposed approach outperforms state-of-the-art methods in both age estimation and image emotion recognition.
【Keywords】: Machine Learning: Multi-instance;Multi-label;Multi-view learning; Machine Learning: Classification; Machine Learning: Deep Reinforcement Learning; Machine Learning Applications: Applications of Reinforcement Learning;
【Paper Link】 【Pages】:3230-3236
【Authors】: Jie Wen ; Zheng Zhang ; Yong Xu ; Bob Zhang ; Lunke Fei ; Guo-Sen Xie
【Abstract】: In recent years, incomplete multi-view clustering, which studies the challenging multi-view clustering problem with missing views, has received growing research interest. Although a series of methods have been proposed to address this issue, the following problems still exist: 1) Almost all of the existing methods are based on shallow models, which makes it difficult to obtain discriminative common representations. 2) These methods are generally sensitive to noise or outliers, since negative samples are treated equally with important samples. In this paper, we propose a novel incomplete multi-view clustering network, called Cognitive Deep Incomplete Multi-view Clustering Network (CDIMC-net), to address these issues. Specifically, it captures the high-level features and local structure of each view by incorporating view-specific deep encoders and a graph embedding strategy into one framework. Moreover, based on human cognition, i.e., learning from easy to hard, it introduces a self-paced strategy that selects the most confident samples for model training, which can reduce the negative influence of outliers. Experimental results on several incomplete datasets show that CDIMC-net outperforms state-of-the-art incomplete multi-view clustering methods.
【Keywords】: Machine Learning: Deep Learning; Machine Learning: Dimensionality Reduction and Manifold Learning; Data Mining: Clustering, Unsupervised Learning;
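The self-paced "easy to hard" strategy in the CDIMC-net abstract follows a common recipe: keep only the samples whose current loss falls below an age parameter that grows over training, so harder samples enter later. A minimal sketch under that standard formulation (names are illustrative, not the paper's code):

```python
def self_paced_select(losses, lam):
    """Self-paced sample selection sketch: return the indices of samples
    whose current loss is below the age parameter `lam`. Growing `lam`
    across epochs admits progressively harder samples into training."""
    return [i for i, loss in enumerate(losses) if loss < lam]

# Early in training (small lam) only the easiest, most confident samples are kept.
print(self_paced_select([0.1, 0.9, 0.4], lam=0.5))  # -> [0, 2]
```

Outliers tend to keep large losses throughout training, so they are excluded by the threshold, which is the mechanism behind the robustness claim in the abstract.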
【Paper Link】 【Pages】:3237-3243
【Authors】: Zhiwei Zhang ; Shifeng Chen ; Lei Sun
【Abstract】: One-class novelty detection aims to identify anomalous instances that do not conform to the expected normal instances. In this paper, Generative Adversarial Networks (GANs) based on an encoder-decoder-encoder pipeline are used for detection and achieve state-of-the-art performance. However, deep neural networks are too over-parameterized to deploy on resource-limited devices. Therefore, Progressive Knowledge Distillation with GANs (P-KDGAN) is proposed to learn compact and fast novelty detection networks. P-KDGAN is a novel attempt to connect two standard GANs by a designed distillation loss for transferring knowledge from the teacher to the student. The progressive learning of knowledge distillation is a two-step approach that continuously improves the performance of the student GAN and achieves better performance than single-step methods. In the first step, the student GAN learns the basic knowledge entirely from the teacher, guided by the pre-trained teacher GAN with fixed weights. In the second step, joint fine-tuning is adopted for the knowledgeable teacher and student GANs to further improve performance and stability. Experimental results on CIFAR-10, MNIST, and FMNIST show that our method improves the performance of the student GAN by 2.44%, 1.77%, and 1.73% when compressing the computation at ratios of 24.45:1, 311.11:1, and 700:1, respectively.
【Keywords】: Machine Learning: Adversarial Machine Learning; Machine Learning: Deep Learning; Machine Learning: Deep Learning: Convolutional networks;
【Paper Link】 【Pages】:3244-3250
【Authors】: Jiankai Sun ; Jie Zhao ; Huan Sun ; Srinivasan Parthasarathy
【Abstract】: Routing newly posted questions (a.k.a. cold questions) to potential answerers with suitable expertise on Community Question Answering sites (CQAs) is an important and challenging task. Existing methods either focus only on embedding graph structural information and are less effective for newly posted questions, or adopt manually engineered feature vectors that are not as representative as graph embeddings. Therefore, we propose to address the challenge of leveraging heterogeneous graph and textual information for cold question routing by designing an end-to-end framework that jointly learns CQA node embeddings and finds the best answerers for cold questions. We conducted extensive experiments to confirm the usefulness of incorporating textual information from question tags and demonstrate that an end-to-end framework can achieve promising performance on routing newly posted questions asked by both existing users and newly registered users.
【Keywords】: Machine Learning: Recommender Systems; Data Mining: Mining Graphs, Semi Structured Data, Complex Data; Machine Learning Applications: Applications of Supervised Learning; Data Mining: Mining Text, Web, Social Media;
【Paper Link】 【Pages】:3251-3257
【Authors】: Yucheng Zhao ; Chong Luo ; Zheng-Jun Zha ; Wenjun Zeng
【Abstract】: In this paper, we introduce the Transformer to time-domain methods for single-channel speech separation. The Transformer has the potential to boost speech separation performance because of its strong sequence modeling capability. However, its computational complexity, which grows quadratically with the sequence length, has made it largely inapplicable to speech applications. To tackle this issue, we propose a novel variation of the Transformer, named multi-scale group Transformer (MSGT). The key ideas are group self-attention, which significantly reduces the complexity, and multi-scale fusion, which retains the Transformer's ability to capture long-term dependencies. We implement two versions of MSGT with different complexities and apply them to a well-known time-domain speech separation method called Conv-TasNet. By simply replacing the original temporal convolutional network (TCN) with MSGT, our approach, called MSGT-TasNet, achieves a large gain over Conv-TasNet on both the WSJ0-2mix and WHAM! benchmarks. Without bells and whistles, the performance of MSGT-TasNet is already on par with the SOTA methods.
【Keywords】: Machine Learning: Deep Learning: Sequence Modeling; Natural Language Processing: Speech;
【Paper Link】 【Pages】:3258-3266
【Authors】: Beitong Zhou ; Jun Liu ; Weigao Sun ; Ruijuan Chen ; Claire Tomlin ; Ye Yuan
【Abstract】: We propose a novel technique for improving the stochastic gradient descent (SGD) method to train deep networks, which we term pbSGD. The proposed pbSGD method simply raises the stochastic gradient to a certain power elementwise during iterations and introduces only one additional parameter, namely, the power exponent (when it equals 1, pbSGD reduces to SGD). We further propose pbSGD with momentum, which we term pbSGDM. The main results of this paper present comprehensive experiments on popular deep learning models and benchmark datasets. Empirical results show that the proposed pbSGD and pbSGDM obtain faster initial training speed than adaptive gradient methods, generalization ability comparable with SGD, and improved robustness to hyper-parameter selection and vanishing gradients. pbSGD is essentially a gradient modifier via a nonlinear transformation; as such, it is orthogonal and complementary to other techniques for accelerating gradient-based optimization, such as learning rate schedules. Finally, we present a convergence rate analysis for both the pbSGD and pbSGDM methods. The theoretical rates of convergence match the best known theoretical rates for SGD and SGDM on nonconvex functions.
【Keywords】: Machine Learning: Deep Learning;
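The pbSGD update described above is simple enough to state directly: replace the gradient g with sign(g)·|g|^γ elementwise before the usual SGD step. A plain-Python sketch (the sign-preserving power form is inferred from the abstract's description; the function name and defaults are illustrative):

```python
def pbsgd_step(params, grads, lr=0.1, gamma=0.8):
    """One pbSGD step on lists of scalars: raise each gradient elementwise
    to the power gamma while preserving its sign, then take an ordinary
    SGD step. gamma = 1 recovers plain SGD."""
    powered = [(1.0 if g >= 0 else -1.0) * abs(g) ** gamma for g in grads]
    return [p - lr * pg for p, pg in zip(params, powered)]

# With gamma = 1 the powered gradient equals the raw gradient (plain SGD step).
print(pbsgd_step([1.0], [0.5], lr=0.1, gamma=1.0))
```

Since the modification touches only the gradient, it composes freely with learning rate schedules and momentum, matching the abstract's "orthogonal and complementary" remark.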
【Paper Link】 【Pages】:3267-3275
【Authors】: Jinghui Chen ; Dongruo Zhou ; Yiqi Tang ; Ziyan Yang ; Yuan Cao ; Quanquan Gu
【Abstract】: Adaptive gradient methods, which adopt historical gradient information to automatically adjust the learning rate, despite their fast convergence, have been observed to generalize worse than stochastic gradient descent (SGD) with momentum in training deep neural networks. This leaves how to close the generalization gap of adaptive gradient methods an open problem. In this work, we show that adaptive gradient methods such as Adam and Amsgrad are sometimes "over adapted". We design a new algorithm, called the Partially adaptive momentum estimation method, which unifies Adam/Amsgrad with SGD by introducing a partial adaptive parameter $p$, to achieve the best of both worlds. We also prove the convergence rate of our proposed algorithm to a stationary point in the stochastic nonconvex optimization setting. Experiments on standard benchmarks show that our proposed algorithm can maintain a fast convergence rate like Adam/Amsgrad while generalizing as well as SGD in training deep neural networks. These results suggest that practitioners may pick up adaptive gradient methods once again for faster training of deep neural networks.
【Keywords】: Machine Learning: Deep Learning;
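The partially adaptive idea above can be sketched by replacing Amsgrad's square root of the second-moment estimate with a power p: p = 1/2 recovers the Amsgrad-style update, and p → 0 approaches SGD with momentum. A plain-Python sketch (illustrative names; bias correction is omitted for brevity, and the exact form here is an assumption based on the abstract, not the paper's code):

```python
def padam_step(theta, g, m, v, vhat, lr=0.1, b1=0.9, b2=0.999,
               p=0.125, eps=1e-8):
    """One partially adaptive momentum step on lists of scalars.
    m, v are the first/second moment estimates; vhat keeps the running
    elementwise max of v (Amsgrad-style). The denominator uses vhat**p
    instead of sqrt(vhat), with p in [0, 1/2]."""
    m = [b1 * mi + (1 - b1) * gi for mi, gi in zip(m, g)]
    v = [b2 * vi + (1 - b2) * gi * gi for vi, gi in zip(v, g)]
    vhat = [max(vh, vi) for vh, vi in zip(vhat, v)]
    theta = [t - lr * mi / (vh ** p + eps)
             for t, mi, vh in zip(theta, m, vhat)]
    return theta, m, v, vhat

# One step from scratch on a single parameter with gradient 1.0.
theta, m, v, vhat = padam_step([1.0], [1.0], [0.0], [0.0], [0.0])
print(theta)
```

The single knob p interpolates between the two regimes, which is how the method tries to keep Adam-like early speed while recovering SGD-like generalization.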
【Paper Link】 【Pages】:3276-3282
【Authors】: Qinghua Ding ; Kaiwen Zhou ; James Cheng
【Abstract】: Riemannian gradient descent (RGD) is a simple, popular and efficient algorithm for leading eigenvector computation [AMS08]. However, the existing analysis of RGD for the eigenproblem is still not tight: it is O(log(n/epsilon)/Delta^2) due to [Xu et al., 2018]. In this paper, we show that RGD in fact converges at rate O(log(n/epsilon)/Delta), and give instances that show the tightness of our result. This improves the best prior analysis by a quadratic factor. We also give a tight convergence analysis of a deterministic variant of Oja's rule due to [Oja, 1982], showing that it also enjoys a fast convergence rate of O(log(n/epsilon)/Delta); previous papers only gave asymptotic characterizations [Oja, 1982; Oja, 1989; Yi et al., 2005]. Our tools for proving the convergence results include an innovative reduction-and-chaining technique and a noisy fixed-point iteration argument. We also give empirical justification of our convergence rates on synthetic and real data.
【Keywords】: Machine Learning: Dimensionality Reduction and Manifold Learning;
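For concreteness, a standard form of the RGD iteration for the leading eigenvector is Rayleigh-quotient ascent on the unit sphere; the explicit form below is assumed from the usual setting [AMS08], not quoted from the paper:

```latex
% Project the Euclidean gradient of the Rayleigh quotient x^T A x onto the
% tangent space of the sphere at x_t, take a step of size \eta, then retract
% by normalization:
x_{t+1} \;=\; \frac{x_t + \eta\,(I - x_t x_t^{\top}) A x_t}
                   {\bigl\lVert x_t + \eta\,(I - x_t x_t^{\top}) A x_t \bigr\rVert_2}
```

Here Delta denotes the eigengap of the symmetric matrix A, which is the quantity appearing in the O(log(n/epsilon)/Delta) rate.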
【Paper Link】 【Pages】:3283-3290
【Authors】: Yuan Zhuang ; Zhenguang Liu ; Peng Qian ; Qi Liu ; Xiang Wang ; Qinming He
【Abstract】: The security problems of smart contracts have drawn extensive attention due to the enormous financial losses caused by vulnerabilities. Existing methods on smart contract vulnerability detection heavily rely on fixed expert rules, leading to low detection accuracy. In this paper, we explore using graph neural networks (GNNs) for smart contract vulnerability detection. Particularly, we construct a contract graph to represent both syntactic and semantic structures of a smart contract function. To highlight the major nodes, we design an elimination phase to normalize the graph. Then, we propose a degree-free graph convolutional neural network (DR-GCN) and a novel temporal message propagation network (TMP) to learn from the normalized graphs for vulnerability detection. Extensive experiments show that our proposed approach significantly outperforms state-of-the-art methods in detecting three different types of vulnerabilities.
【Keywords】: Machine Learning: Knowledge-based Learning; Multidisciplinary Topics and Applications: Security and Privacy; Machine Learning Applications: Applications of Supervised Learning;
【Paper Link】 【Pages】:3291-3298
【Authors】: Danbing Zou ; Qikui Zhu ; Pingkun Yan
【Abstract】: Domain adaptation aims to alleviate the problem of retraining a pre-trained model when applying it to a different domain, which otherwise requires a large amount of additional training data from the target domain. This objective is usually achieved by establishing connections between the source domain labels and target domain data. However, this imbalanced source-to-target one-way pass may not eliminate the domain gap, which limits the performance of the pre-trained model. In this paper, we propose an innovative Dual-Scheme Fusion Network (DSFN) for unsupervised domain adaptation. By building both source-to-target and target-to-source connections, this balanced joint information flow helps reduce the domain gap and further improves network performance. The mechanism is also applied at the inference stage, where both the original input target image and the generated source images are segmented with the proposed joint network, and the results are fused to obtain a more robust segmentation. Extensive experiments on unsupervised cross-modality medical image segmentation are conducted on two tasks: brain tumor segmentation and cardiac structure segmentation. The experimental results show that our method achieves significant performance improvement over other state-of-the-art domain adaptation methods.
【Keywords】: Machine Learning: Deep Learning; Machine Learning: Deep Learning: Convolutional networks; Machine Learning: Deep-learning Theory;
【Paper Link】 【Pages】:3299-3305
【Authors】: Tanli Zuo ; Yukun Qiu ; Weishi Zheng
【Abstract】: Graph convolutional networks (GCNs) have been widely used to process graph-structured data. However, existing GCN methods do not explicitly extract critical structures, which reflect the intrinsic properties of a graph. In this work, we propose a novel GCN module named Neighbor Combinatorial ATtention (NCAT) to find critical structures in graph-structured data. NCAT attempts to match combinatorial neighbors with learnable patterns and assigns different weights to each combination based on the matching degree between the patterns and the combinations. By stacking several NCAT modules, we can extract hierarchical structures that are helpful for downstream tasks. Our experimental results show that NCAT achieves state-of-the-art performance on several benchmark graph classification datasets. In addition, we interpret what kinds of features our model learns by visualizing the extracted critical structures.
【Keywords】: Machine Learning: Deep Learning; Data Mining: Mining Graphs, Semi Structured Data, Complex Data;
【Paper Link】 【Pages】:3307-3313
【Authors】: Chunheng Jiang ; Jianxi Gao ; Malik Magdon-Ismail
【Abstract】: Inferring topological characteristics of complex networks from observed data is critical to understanding the dynamical behavior of networked systems, ranging from the Internet and the World Wide Web to biological networks and social networks. Prior studies usually focus on structure-based estimation to infer network sizes, degree distributions, average degrees, and more. Little effort has been made to estimate the specific degree of each vertex from a sampled induced graph, which prevents us from measuring the lethality of nodes in protein networks and influencers in social networks. Current approaches dramatically fail for a tiny sampled induced graph and require a specific sampling method and a large sample size. These approaches also neglect information about the vertex state, which represents the dynamical behavior of the networked system, such as the biomass of species or the expression of a gene, and which is useful for degree estimation. We fill this gap by developing a framework to infer individual vertex degrees using both the sampled topology and the vertex state. We combine mean-field theory with combinatorial optimization to learn vertex degrees. Experimental results on real networks with a variety of dynamics demonstrate that our framework produces reliable degree estimates and dramatically improves existing link prediction methods when the sampled degrees are replaced with our estimated degrees.
【Keywords】: Machine Learning Applications: Networks; Uncertainty in AI: Approximate Probabilistic Inference; Heuristic Search and Game Playing: Combinatorial Search and Optimisation;
【Paper Link】 【Pages】:3314-3320
【Authors】: Penghao Sun ; Zehua Guo ; Junchao Wang ; Junfei Li ; Julong Lan ; Yuxiang Hu
【Abstract】: To improve the processing efficiency of jobs in distributed computing, the concept of the coflow has been proposed. A coflow is a collection of flows that are semantically correlated in a multi-stage computation task. A job consists of multiple coflows and can usually be formulated as a Directed Acyclic Graph (DAG). Proper scheduling of coflows can significantly reduce the completion time of jobs in distributed computing; however, this scheduling problem is proven NP-hard. Unlike existing schemes that use hand-crafted heuristic algorithms to solve this problem, in this paper we propose a Deep Reinforcement Learning (DRL) framework named DeepWeave to generate coflow scheduling policies. To improve the inter-coflow scheduling ability in the job DAG, DeepWeave employs a Graph Neural Network (GNN) to process the DAG information. DeepWeave learns from historical workload traces to train the neural networks of the DRL agent and encodes the scheduling policy in the neural networks, which make coflow scheduling decisions without expert knowledge or a pre-assumed model. The proposed scheme is evaluated with a simulator using real-life traces. Simulation results show that DeepWeave completes jobs at least 1.7X faster than the state-of-the-art solutions.
【Keywords】: Machine Learning Applications: Applications of Reinforcement Learning; Machine Learning Applications: Networks;
【Paper Link】 【Pages】:3321-3327
【Authors】: Dongxiao He ; Lu Zhai ; Zhigang Li ; Di Jin ; Liang Yang ; Yuxiao Huang ; Philip S. Yu
【Abstract】: Network embedding, which learns a low-dimensional representation of the nodes in a network, has been used in many network analysis tasks. Several network embedding methods, including some based on generative adversarial networks (GANs, a promising deep learning technique), have been proposed recently. Existing GAN-based methods typically use a GAN to learn a Gaussian distribution as a prior for network embedding. However, this strategy makes it difficult to distinguish the node representation from the Gaussian distribution. Moreover, it does not make full use of the essential advantage of GANs (namely, adversarially learning the representation mechanism rather than the representation itself), which compromises the method's performance. To address this problem, we propose to apply the adversarial idea to the representation mechanism, i.e., to the encoding mechanism within an autoencoder framework. Specifically, we use the mutual information between node attributes and embeddings as a reasonable alternative to this encoding mechanism (one that is much easier to track). Additionally, we introduce another GAN-based mapping mechanism as a competitor into the adversarial learning system. A range of empirical results demonstrate the effectiveness of the proposed approach.
【Keywords】: Machine Learning Applications: Networks; Machine Learning: Adversarial Machine Learning; Multidisciplinary Topics and Applications: Web Analysis of Communities;
【Paper Link】 【Pages】:3328-3334
【Authors】: Kazuhiko Shinoda ; Hirotaka Kaji ; Masashi Sugiyama
【Abstract】: Positive-confidence (Pconf) classification [Ishida et al., 2018] is a promising weakly-supervised learning method that trains a binary classifier only from positive data equipped with confidence. In practice, however, the confidence may be skewed by bias arising in the annotation process. The Pconf classifier cannot be properly learned with skewed confidence, and consequently classification performance may deteriorate. In this paper, we introduce a parameterized model of the skewed confidence and propose a method for selecting the hyperparameter that cancels out the negative impact of the skew, under the assumption that the misclassification rate of positive samples is available as prior knowledge. We demonstrate the effectiveness of the proposed method through a synthetic experiment with simple linear models and benchmark problems with neural network models. We also apply our method to drivers' drowsiness prediction to show that it works well on a real-world problem where confidence is obtained from manual annotation.
【Keywords】: Machine Learning Applications: Applications of Unsupervised Learning; Humans and AI: Personalization and User Modeling; Multidisciplinary Topics and Applications: Transportation;
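The hyperparameter-selection idea in the abstract above can be illustrated with a toy sketch. The power-law skew model and the `correct`/`fit_gamma` helpers below are hypothetical stand-ins for illustration, not the paper's actual parameterization: assuming the observed confidence is a power transform of the true confidence, a known misclassification rate of positive samples pins down the skew parameter.

```python
def correct(r_obs, gamma):
    # Invert the assumed skew model r_obs = r_true ** gamma.
    return r_obs ** (1.0 / gamma)

def fit_gamma(r_obs_list, target_error):
    """Grid-search the gamma whose corrected confidences imply an average
    positive-misclassification rate closest to the known target rate."""
    grid = [i / 20 for i in range(5, 65)]  # gamma candidates 0.25 .. 3.2
    def implied_error(g):
        return sum(1.0 - correct(r, g) for r in r_obs_list) / len(r_obs_list)
    return min(grid, key=lambda g: abs(implied_error(g) - target_error))

# True confidences skewed with gamma = 2; knowing the true error rate
# (0.1625) lets the grid search recover the skew parameter.
r_true = [0.9, 0.8, 0.7, 0.95]
r_obs = [r ** 2 for r in r_true]
print(fit_gamma(r_obs, target_error=0.1625))  # recovers gamma = 2.0
```

Because the implied error is monotone in gamma, the grid point that reproduces the known rate is unique, mirroring the paper's idea of using the known misclassification rate to cancel the skew.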
【Paper Link】 【Pages】:3335-3341
【Authors】: Thanh Vu ; Dat Quoc Nguyen ; Anthony Nguyen
【Abstract】: ICD coding is the process of assigning International Classification of Diseases diagnosis codes to clinical/medical notes documented by health professionals (e.g., clinicians). This process requires significant human resources and is thus costly and prone to error. To address the problem, machine learning has been utilized for automatic ICD coding. Previous state-of-the-art models were based on convolutional neural networks using one or several fixed window sizes. However, the lengths of, and interdependence between, text fragments related to ICD codes in clinical text vary significantly, making it difficult to decide on the best window sizes. In this paper, we propose a new label attention model for automatic ICD coding that can handle both the varying lengths and the interdependence of the ICD-code-related text fragments. Furthermore, because the majority of ICD codes are used infrequently, producing extremely imbalanced data, we additionally propose a hierarchical joint learning mechanism that extends our label attention model to handle this issue by exploiting the hierarchical relationships among the codes. Our label attention model achieves new state-of-the-art results on three benchmark MIMIC datasets, and the joint learning mechanism helps improve performance on infrequent codes.
【Keywords】: Machine Learning Applications: Bio/Medicine; Natural Language Processing: Text Classification; Machine Learning: Classification; Machine Learning: Deep Learning;
【Paper Link】 【Pages】:3342-3348
【Authors】: Ainaz Hajimoradlou ; Gioachino Roberti ; David Poole
【Abstract】: Landslides, movements of soil and rock under the influence of gravity, are common phenomena that cause significant human and economic losses every year. Experts use heterogeneous features such as slope, elevation, land cover, lithology, rock age, and rock family to predict landslides. To work with such features, we adapted convolutional neural networks to consider relative spatial information for the prediction task. Traditional filters in these networks either have a fixed orientation or are rotationally invariant. Intuitively, the filters should orient uphill, but there is not enough data to learn the concept of uphill; instead, it can be provided as prior knowledge. We propose a model called Locally Aligned Convolutional Neural Network (LACNN) that follows the ground surface at multiple scales to predict possible landslide occurrence for a single point. To validate our method, we created a standardized dataset of georeferenced images with the heterogeneous features as inputs, and compared our method to several baselines, including linear regression, a neural network, and a convolutional network, using log-likelihood error and Receiver Operating Characteristic curves on the test set. Our model achieves a 2-7% improvement in accuracy and a 2-15% boost in log-likelihood compared to the baselines.
【Keywords】: Machine Learning Applications: Applications of Supervised Learning; Machine Learning Applications: Environmental; Machine Learning: Knowledge-based Learning;
【Paper Link】 【Pages】:3349-3355
【Authors】: Sebastian Schmoll ; Matthias Schubert
【Abstract】: We show that the task of collecting stochastic, spatially distributed resources (Stochastic Resource Collection, SRC) can be modeled as a Semi-Markov Decision Process. Our Deep Q-Network (DQN) based approach uses a novel scalable and transferable artificial neural network architecture. Our concrete SRC use case is an officer (a single agent) trying to maximize the number of fined parking violations in their area. We evaluate our approach on an environment based on real-world parking data from the city of Melbourne. In small, hence simple, settings with short distances between resources and few simultaneous violations, our approach is comparable to previous work. When the size of the network (and hence the number of resources) grows, our solution significantly outperforms preceding methods. Moreover, a trained agent applied to a non-overlapping new area also outperforms existing approaches.
【Keywords】: Machine Learning Applications: Applications of Reinforcement Learning; Multidisciplinary Topics and Applications: Transportation; Machine Learning: Deep Reinforcement Learning; Planning and Scheduling: Markov Decisions Processes;
【Paper Link】 【Pages】:3356-3363
【Authors】: Yudong Luo ; Oliver Schulte ; Pascal Poupart
【Abstract】: A major task of sports analytics is to rank players based on the impact of their actions. Recent methods have applied reinforcement learning (RL) to assess the value of actions from a learned action value or Q-function. A fundamental challenge for estimating action values is that explicit reward signals (goals) are very sparse in many team sports, such as ice hockey and soccer. This paper combines Q-function learning with inverse reinforcement learning (IRL) to provide a novel player ranking method. We treat professional play as expert demonstrations for learning an implicit reward function. Our method alternates single-agent IRL to learn a reward function for multiple agents; we provide a theoretical justification for this procedure. Knowledge transfer is used to combine learned rewards and observed rewards from goals. Empirical evaluation, based on 4.5M play-by-play events in the National Hockey League (NHL), indicates that player ranking using the learned rewards achieves high correlations with standard success measures and temporal consistency throughout a season.
【Keywords】: Machine Learning Applications: Applications of Reinforcement Learning; Machine Learning: Reinforcement Learning; Agent-based and Multi-agent Systems: Other;
【Paper Link】 【Pages】:3364-3370
【Authors】: Govind Sharma ; Prasanna Patil ; M. Narasimha Murty
【Abstract】: Ordinary networks lossily (if not incorrectly) represent higher-order relations, i.e., those among multiple entities rather than a pair. This calls for more expressive structures such as hypergraphs. Akin to the link prediction problem in graphs, we address hyperlink (higher-order link) prediction in hypergraphs. The handful of solutions in the literature seem to have merely scratched the surface, and we provide improvements over them. Motivated by observations in recent literature, we first formulate a "clique-closure" hypothesis (viz., hyperlinks are more likely to form from near-cliques than from non-cliques), test it on real hypergraphs, and then exploit it for the problem at hand. In the process, we generalize hyperlink prediction on two fronts: (1) from small-sized to arbitrary-sized hyperlinks, and (2) from a couple of domains to a handful. We perform experiments (both the hypothesis test and hyperlink prediction) on multiple real datasets, report results, and provide both quantitative and qualitative arguments favoring our method's performance over the state of the art.
【Keywords】: Machine Learning Applications: Networks; Data Mining: Mining Graphs, Semi Structured Data, Complex Data; Machine Learning: Structured Prediction;
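The clique-closure hypothesis above lends itself to a quick sketch (a hypothetical helper, not the authors' code): in the pairwise clique-expansion projection of a hypergraph, the edge density of a candidate node set measures how close the candidate is to a clique, so near-clique candidates score close to 1.

```python
from itertools import combinations

def edge_density(candidate, edges):
    """Fraction of the candidate node set's pairs that are connected in
    the pairwise (clique-expansion) projection of the hypergraph."""
    edge_set = {frozenset(e) for e in edges}
    pairs = list(combinations(sorted(candidate), 2))
    if not pairs:
        return 1.0
    present = sum(frozenset(p) in edge_set for p in pairs)
    return present / len(pairs)

# A candidate with 5 of its 6 pairs already present is a near-clique
# and, under the hypothesis, a likely future hyperlink.
proj = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
print(round(edge_density({1, 2, 3, 4}, proj), 3))  # 0.833
```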
【Paper Link】 【Pages】:3371-3377
【Authors】: Ruimin Shen ; Yan Zheng ; Jianye Hao ; Zhaopeng Meng ; Yingfeng Chen ; Changjie Fan ; Yang Liu
【Abstract】: Generating diverse behaviors for game artificial intelligence (Game AI) has long been recognized as a challenging task in the game industry. Designing a Game AI with a satisfying behavioral characteristic (style) depends heavily on domain knowledge and is hard to achieve manually. Deep reinforcement learning sheds light on automating Game AI design. However, most existing efforts focus on creating a superhuman Game AI, ignoring the importance of behavioral diversity in games. To bridge this gap, we introduce a new framework, named EMOGI, which can automatically generate desirable styles with almost no domain knowledge. More importantly, EMOGI succeeds in creating a range of diverse styles, providing behavior-diverse Game AIs. Evaluations on Atari and real commercial games indicate that, compared to existing algorithms, EMOGI performs better at generating diverse behaviors and significantly improves the efficiency of Game AI design.
【Keywords】: Machine Learning Applications: Applications of Reinforcement Learning; Machine Learning Applications: Game Playing; Heuristic Search and Game Playing: Game Playing and Machine Learning;
【Paper Link】 【Pages】:3378-3384
【Authors】: Yuanrui Dong ; Peng Zhao ; Hanqiao Yu ; Cong Zhao ; Shusen Yang
【Abstract】: The emerging edge-cloud collaborative Deep Learning (DL) paradigm aims at improving practical DL deployments in terms of cloud bandwidth consumption, response latency, and data privacy preservation. Focusing on bandwidth-efficient edge-cloud collaborative training of DNN-based classifiers, we present CDC, a Classification Driven Compression framework that reduces bandwidth consumption while preserving the classification accuracy of edge-cloud collaborative DL. Specifically, for resource-limited edge servers, we develop a lightweight autoencoder with classification guidance that preserves classification-relevant features during compression, allowing edges to upload only the latent code of raw data for accurate global training on the cloud. Additionally, we design an adjustable quantization scheme that adaptively navigates the tradeoff between bandwidth consumption and classification accuracy under different network conditions, where only fine-tuning is required for rapid compression ratio adjustment. Results of extensive experiments demonstrate that, compared with DNN training on raw data, CDC consumes 14.9× less bandwidth with an accuracy loss of no more than 1.06%, and compared with DNN training on data compressed by an autoencoder without guidance, CDC incurs at least 100% lower accuracy loss.
【Keywords】: Machine Learning Applications: Applications of Supervised Learning; Agent-based and Multi-agent Systems: Coordination and Cooperation; Machine Learning: Federated Learning;
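The adjustable quantization knob described above can be sketched in miniature. The uniform quantizer below is an assumption for illustration (the paper's actual scheme is learned and adaptive): the bit width trades reconstruction fidelity against the number of bits shipped to the cloud.

```python
def quantize(x, bits, lo=-1.0, hi=1.0):
    """Uniform quantizer with an adjustable bit width: clamp x to
    [lo, hi] and snap it to the nearest of 2**bits evenly spaced levels.
    Fewer bits mean less bandwidth but a coarser latent code."""
    levels = (1 << bits) - 1          # number of steps between lo and hi
    x = min(max(x, lo), hi)
    step = (hi - lo) / levels
    return lo + round((x - lo) / step) * step

# With 2 bits the levels are {-1, -1/3, 1/3, 1}, so 0.337 snaps to 1/3.
print(quantize(0.337, 2))
```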
【Paper Link】 【Pages】:3385-3391
【Authors】: Honglu Zhou ; Shuyuan Xu ; Zuohui Fu ; Gerard de Melo ; Yongfeng Zhang ; Mubbasir Kapadia
【Abstract】: Multiscale modeling has yielded immense success on various machine learning tasks. However, it has not been properly explored for the prominent task of information diffusion, which aims to understand how information propagates among users in online social networks. Whether and when a specific user adopts a piece of information propagated from another user is affected by complex interactions, and is thus very challenging to model. Current state-of-the-art techniques invoke deep neural models with vector representations of users. In this paper, we present a Hierarchical Information Diffusion (HID) framework that integrates user representation learning and multiscale modeling. The proposed framework can be layered on top of any information diffusion technique that leverages user representations, boosting the predictive power and learning efficiency of the original technique. Extensive experiments on three real-world datasets showcase the superiority of our method.
【Keywords】: Machine Learning Applications: Web Sciences; Data Mining: Applications; Data Mining: Mining Text, Web, Social Media; Data Mining: Mining Graphs, Semi Structured Data, Complex Data;
【Paper Link】 【Pages】:3393-3399
【Authors】: Quan Yuan ; Jun Chen ; Chao Lu ; Haifeng Huang
【Abstract】: Automatic diagnosis has long suffered from a lack of reliable corpora for training trustworthy predictive models. Moreover, most previous deep learning based diagnosis models adopt sequence learning techniques (CNNs or RNNs), which makes it difficult to extract complex structural information, e.g., graph structure, among the critical medical entities. In this paper, we propose to build the diagnosis model on high-standard EMR documents from real hospitals to improve the accuracy and credibility of the resulting model. Meanwhile, we introduce a Graph Convolutional Network into the model, which alleviates the sparse-feature problem and facilitates the extraction of structural information for diagnosis. Moreover, we propose a mutual attentive network to enhance the representation of inputs for better model performance. Evaluation on real EMR documents demonstrates that the proposed model is more accurate than previous sequence learning based diagnosis models. The proposed model has been integrated into the information systems of hundreds of primary health care facilities in China to assist physicians in the diagnostic process.
【Keywords】: Multidisciplinary Topics and Applications: Biology and Medicine; Machine Learning Applications: Bio/Medicine; Natural Language Processing: NLP Applications and Tools;
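The graph convolution mentioned above reduces to a single propagation step. The sketch below is the standard GCN layer H' = ReLU(D^-1/2 (A+I) D^-1/2 · H · W) in pure Python, illustrating the building block rather than the paper's full diagnosis model.

```python
import math

def gcn_layer(adj, feats, weight, relu=True):
    """One standard GCN propagation step over a dense adjacency matrix:
    add self-loops, symmetrically normalize, propagate, then transform."""
    n = len(adj)
    a_hat = [[adj[i][j] + (i == j) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    def matmul(A, B):
        return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
                for row in A]
    out = matmul(matmul(norm, feats), weight)
    return [[max(0.0, x) if relu else x for x in row] for row in out]

# Two connected nodes mix their features after one propagation step.
adj = [[0, 1], [1, 0]]
feats = [[1.0, 0.0], [0.0, 1.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]
print(gcn_layer(adj, feats, weight))  # [[0.5, 0.5], [0.5, 0.5]]
```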
【Paper Link】 【Pages】:3400-3406
【Authors】: Ke Li ; Lisi Chen ; Shuo Shang
【Abstract】: We investigate the problem of optimal route planning for massive-scale trips: given a traffic-aware road network and a set of trip queries Q, we aim to find a route for each trip such that the global travel-time cost over all queries in Q is minimized. The problem is designed for a range of applications such as traffic-flow management, route planning, and congestion prevention during rush hours. The exact algorithm has exponential time complexity and is computationally prohibitive for application scenarios in dynamic traffic networks. To address this challenge, we propose a greedy algorithm and an epsilon-refining algorithm. Extensive experiments offer insight into the accuracy and efficiency of our proposed algorithms.
【Keywords】: Multidisciplinary Topics and Applications: Databases; Multidisciplinary Topics and Applications: Transportation; Planning and Scheduling: Real-time Planning; Planning and Scheduling: Planning Algorithms;
【Paper Link】 【Pages】:3407-3414
【Authors】: Ke Wang ; Xuyan Chen ; Ning Chen ; Ting Chen
【Abstract】: Automatic diagnosis based on clinical notes is critical, especially in the emergency department, where a fast and professional result is vital to ensuring proper and timely treatment. Previous works formalize this task as plain text classification and fail to utilize the medically significant tree structure of the International Classification of Diseases (ICD) coding system. Moreover, external medical knowledge has rarely been used before; we explore it by extracting relevant materials from Wikipedia or Baidupedia. In this paper, we propose a knowledge-based tree decoding model (K-BTD) whose inference procedure is a top-down decoding process from the root node to the leaf nodes. The stepwise inference procedure enables the model to support the decision at each step, which visualizes the diagnosis procedure and adds to the interpretability of the final predictions. Experiments on real-world data from the emergency department of a large-scale hospital indicate that the proposed model outperforms all baselines in both micro-F1 and macro-F1, and reduces the semantic distance dramatically.
【Keywords】: Multidisciplinary Topics and Applications: Biology and Medicine; AI Ethics: Explainability; Natural Language Processing: Text Classification; Natural Language Processing: NLP Applications and Tools;
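The top-down decoding described above can be sketched as a greedy root-to-leaf walk. The tree slice, the scores, and the `tree_decode` helper below are hypothetical illustrations (the actual K-BTD scorer is a learned network conditioned on the clinical note):

```python
def tree_decode(children, score, root="ROOT"):
    """Greedy top-down decoding: starting from the root, repeatedly move
    to the highest-scoring child until a leaf is reached. The returned
    path doubles as step-by-step support for the final prediction."""
    path, node = [root], root
    while children.get(node):
        node = max(children[node], key=score)
        path.append(node)
    return path

# Tiny hypothetical slice of an ICD-like hierarchy with made-up scores.
tree = {"ROOT": ["A00-B99", "C00-D49"], "A00-B99": ["A00", "A01"], "C00-D49": []}
scores = {"A00-B99": 0.9, "C00-D49": 0.4, "A00": 0.3, "A01": 0.7}
print(tree_decode(tree, scores.get))  # ['ROOT', 'A00-B99', 'A01']
```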
【Paper Link】 【Pages】:3415-3421
【Authors】: Pengyang Wang ; Yanjie Fu ; Yuanchun Zhou ; Kunpeng Liu ; Xiaolin Li ; Kien A. Hua
【Abstract】: In this paper, we design and evaluate a new substructure-aware Graph Representation Learning (GRL) approach. GRL aims to map graph structure information into low-dimensional representations. While extensive efforts have been made to model global and/or local structure information, GRL can be further improved with substructure information. Some recent studies exploit adversarial learning to incorporate substructure awareness, but are hindered by unstable convergence. This study addresses the central research question: is there a better way to integrate substructure awareness into GRL? As subsets of the graph structure, the substructures of interest (i.e., subgraphs) are unique and representative for differentiating graphs, which leads to high correlation between the representations of the graph-level structure and its substructures. Since mutual information (MI) evaluates the mutual dependence between two variables, we develop an MI-induced substructure-aware GRL method. We decompose the GRL pipeline into two stages: (1) the node level, where we maximize the MI between the original and learned representations, following the intuition that the two should be highly correlated; and (2) the graph level, where we preserve substructures by maximizing the MI between the graph-level structure and substructure representations. Finally, we present extensive experimental results demonstrating the improved performance of our method on real-world data.
【Keywords】: Multidisciplinary Topics and Applications: Recommender Systems; Data Mining: Applications; Data Mining: Mining Spatial, Temporal Data;
【Paper Link】 【Pages】:3422-3429
【Authors】: Seyed-Iman Mirzadeh ; Hassan Ghasemzadeh
【Abstract】: With recent advances in both machine learning and embedded systems research, the demand to deploy computational models for real-time execution on edge devices has increased substantially. Without such on-device deployment, the frequent transmission of sensor data to the cloud drains batteries rapidly due to the energy cost of wireless data transmission. This rapid power dissipation considerably reduces the battery lifetime of the system, jeopardizing the real-world utility of smart devices. It is well established that for difficult machine learning tasks, higher-performing models often require more computation and are thus not power-efficient choices for deployment on edge devices. However, the trade-offs between performance and power consumption are not well studied. While numerous methods (e.g., model compression) have been developed to obtain an optimal model, they focus on improving the efficiency of a "single" model. In an entirely new direction, we introduce an effective method to find a combination of "multiple" models that is optimal in terms of power-efficiency and performance, by solving an optimization problem in which both performance and power consumption are taken into account. Experimental results demonstrate that on the ImageNet dataset we can achieve a 20% energy reduction with only a 0.3% accuracy drop compared to Squeeze-and-Excitation Networks. Compared to a pruned convolutional neural network for human activity recognition, our proposed policy achieves 1.3% higher accuracy while consuming 1.7% less energy.
【Keywords】: Multidisciplinary Topics and Applications: Real-Time Systems; Multidisciplinary Topics and Applications: Ubiquitous Computing Systems; Machine Learning: Ensemble Methods; Machine Learning: Deep Learning;
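The combination search described above can be sketched as a small brute-force optimizer. The accuracy/energy numbers and the `max` stand-in for the ensemble's combined accuracy are assumptions for illustration; the paper solves a much larger optimization over measured model profiles.

```python
from itertools import combinations

# Hypothetical (accuracy, energy) profiles, not the paper's measurements.
models = {"A": (0.70, 1.0), "B": (0.75, 2.0), "C": (0.72, 1.5)}

def best_combo(models, energy_budget, combine=max):
    """Exhaustively pick the subset of models with the highest combined
    accuracy whose total energy stays within the budget."""
    best, best_acc = None, -1.0
    names = list(models)
    for r in range(1, len(names) + 1):
        for subset in combinations(names, r):
            acc = combine(models[m][0] for m in subset)
            energy = sum(models[m][1] for m in subset)
            if energy <= energy_budget and acc > best_acc:
                best, best_acc = subset, acc
    return best, best_acc

# Under a 2.6-unit budget, running B alone beats any affordable combo.
print(best_combo(models, energy_budget=2.6))  # (('B',), 0.75)
```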
【Paper Link】 【Pages】:3430-3436
【Authors】: Zhouxing Su ; Shihao Huang ; Chungen Li ; Zhipeng Lü
【Abstract】: The inventory routing problem (IRP), which is NP-hard, tackles the combination of inventory management and transportation optimization in supply chains. It seeks a minimum-cost schedule in which a single vehicle performs deliveries over multiple periods so that no customer runs out of stock. Specifically, an IRP solution specifies how many products should be delivered to which customer during each period, as well as the route in each period. We propose a two-stage matheuristic (TSMH) algorithm to solve the IRP. The first stage optimizes the overall schedule and generates an initial solution by a relax-and-repair method. The second stage employs an iterated tabu search procedure for fine-grained optimization of the current solution. Tested on the 220 most commonly used benchmark instances, TSMH compares favorably to the state-of-the-art algorithms. The experimental results show that the proposed algorithm obtains not only optimal solutions for most small instances, but also better upper bounds for 40 of 60 large instances. These results demonstrate that TSMH is effective and efficient in solving the IRP. In addition, comparative experiments justify the importance of its two optimization stages.
【Keywords】: Multidisciplinary Topics and Applications: Transportation; Heuristic Search and Game Playing: Combinatorial Search and Optimisation; Heuristic Search and Game Playing: Meta-Reasoning and Meta-heuristics; Heuristic Search and Game Playing: Heuristic Search;
【Paper Link】 【Pages】:3437-3443
【Authors】: Xiaotian Hao ; Junqi Jin ; Jianye Hao ; Jin Li ; Weixun Wang ; Yi Ma ; Zhenzhe Zheng ; Han Li ; Jian Xu ; Kun Gai
【Abstract】: Bipartite b-matching is fundamental in algorithm design and has been widely applied to diverse settings such as economic markets and labor markets. These practical problems usually exhibit two distinct features, large scale and dynamics, which require the matching algorithm to be re-executed at regular intervals. However, existing exact and approximate algorithms usually fail in such settings, requiring either intolerable running time or excessive computational resources. To address this issue, based on the key observation that successive matching instances do not vary much, we propose NeuSearcher, which leverages knowledge learned from previous instances to solve new problem instances. Specifically, we design a multichannel graph neural network to predict the threshold of the matched edges, by which the search region can be significantly reduced. We further propose a parallel heuristic search algorithm that iteratively improves the solution quality until convergence. Experiments on both open and industrial datasets demonstrate that NeuSearcher is 2 to 3 times faster while achieving exactly the same matching solutions as the state-of-the-art approximation approaches.
【Keywords】: Multidisciplinary Topics and Applications: Recommender Systems; Machine Learning Applications: Applications of Supervised Learning; Constraints and SAT: Constraint Optimization;
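For context, the problem above can be grounded with a minimal greedy baseline for weighted bipartite b-matching (a sketch of the classical problem, not the NeuSearcher algorithm): scan edges by decreasing weight and accept each edge while both endpoints have capacity left.

```python
def greedy_b_matching(edges, caps):
    """Greedy weighted bipartite b-matching. `edges` is a list of
    (left, right, weight) triples; `caps` maps each vertex to its
    capacity b, i.e. how many matched edges it may participate in."""
    cap = dict(caps)                      # remaining capacity per vertex
    matched, total = [], 0.0
    for u, v, w in sorted(edges, key=lambda e: -e[2]):
        if cap.get(u, 0) > 0 and cap.get(v, 0) > 0:
            cap[u] -= 1
            cap[v] -= 1
            matched.append((u, v))
            total += w
    return matched, total

# Vertex "x" has capacity 2, so it can absorb both left-side vertices.
edges = [("a", "x", 3.0), ("a", "y", 2.0), ("b", "x", 2.5)]
caps = {"a": 1, "b": 1, "x": 2, "y": 1}
print(greedy_b_matching(edges, caps))  # ([('a', 'x'), ('b', 'x')], 5.5)
```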
【Paper Link】 【Pages】:3444-3451
【Authors】: Run Wang ; Felix Juefei-Xu ; Lei Ma ; Xiaofei Xie ; Yihao Huang ; Jian Wang ; Yang Liu
【Abstract】: In recent years, generative adversarial networks (GANs) and their variants have achieved unprecedented success in image synthesis. They are widely adopted for synthesizing facial images, which raises potential security concerns as the fakes spread and fuel misinformation. However, robust detectors of these AI-synthesized fake faces are still in their infancy and not ready to fully tackle this emerging challenge. In this work, we propose a novel approach, named FakeSpotter, that spots AI-synthesized fake faces by monitoring neuron behaviors. Studies on neuron coverage and interactions have shown that they can serve as testing criteria for deep learning systems, especially under adversarial attacks. Here, we conjecture that monitoring neuron behavior can also serve as an asset in detecting fake faces, since layer-by-layer neuron activation patterns may capture subtle features that are important to the fake detector. Experimental results on detecting four types of fake faces synthesized with state-of-the-art GANs, and on evading four perturbation attacks, show the effectiveness and robustness of our approach.
【Keywords】: Multidisciplinary Topics and Applications: Security and Privacy;
【Paper Link】 【Pages】:3452-3458
【Authors】: Huiyuan Chen ; Jing Li
【Abstract】: Precise medicine recommendations provide more effective treatments with fewer drug side effects. A key step is to understand the mechanistic relationships among drugs, targets, and diseases. Tensor-based models can explore drug-target-disease relationships from large amounts of labeled data. However, existing tensor models fail to capture complex nonlinear dependencies in tensor data. In addition, rich medical knowledge remains far less studied, which may lead to unsatisfactory results. Here we propose a Neural Tensor Network (NeurTN) to assist personalized medicine treatments. NeurTN seamlessly combines tensor algebra and deep neural networks, offering a more powerful way to capture the nonlinear relationships among drugs, targets, and diseases. To leverage medical knowledge, we augment NeurTN with geometric neural networks that capture the structural information of both drugs' chemical structures and targets' sequences. Extensive experiments on real-world datasets demonstrate the effectiveness of the NeurTN model.
【Keywords】: Multidisciplinary Topics and Applications: AI for Life Science; Multidisciplinary Topics and Applications: Biology and Medicine; Data Mining: Applications; Machine Learning Applications: Bio/Medicine;
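The tensor-algebra component above can be illustrated with the classical CP-style trilinear interaction score that neural tensor models generalize with nonlinearity; the embedding values below are made up for illustration.

```python
def trilinear_score(u, v, w):
    """CP-style trilinear interaction score for a (drug, target, disease)
    triple: the sum over latent dimensions of the elementwise product of
    the three embeddings. A classical tensor-factorization baseline,
    not NeurTN itself."""
    return sum(a * b * c for a, b, c in zip(u, v, w))

# Two latent dimensions: 1.0*0.5*2.0 + 2.0*1.0*1.0 = 3.0
print(trilinear_score([1.0, 2.0], [0.5, 1.0], [2.0, 1.0]))  # 3.0
```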
【Paper Link】 【Pages】:3459-3465
【Authors】: Renjun Hu ; Xinjiang Lu ; Chuanren Liu ; Yanyan Li ; Hao Liu ; Jingjing Gu ; Shuai Ma ; Hui Xiong
【Abstract】: While Point-of-Interest (POI) recommendation has been a popular topic of study for some time, little progress has been made in understanding why and how people make their decisions when selecting POIs. To this end, in this paper, we propose a user decision profiling framework, named PROUD, which can identify the key factors in people's decisions on choosing POIs. Specifically, we treat each user decision as a set of factors and provide a method for learning factor embeddings. A unique perspective of our approach is to identify key factors, while seamlessly preserving decision structures, via a novel scalar projection maximization objective. Exactly solving the objective is non-trivial due to a sparsity constraint. To address this, PROUD adopts a self-projection attention and an L2-regularized sparse activation to directly estimate the likelihood of each factor being a key factor. Finally, extensive experiments on real-world data validate the advantage of PROUD in preserving user decision structures. Our case study also indicates that the identified key decision factors help provide more interpretable recommendations and analyses.
【Keywords】: Multidisciplinary Topics and Applications: Other; Multidisciplinary Topics and Applications: Ubiquitous Computing Systems; Data Mining: Applications;
【Paper Link】 【Pages】:3466-3472
【Authors】: Rong-Cheng Tu ; Xianling Mao ; Wei Wei
【Abstract】: Most unsupervised hashing methods map images into semantic-similarity-preserving hash codes by constructing a local semantic similarity structure as guiding information, i.e., treating each point as similar to its k nearest neighbours. However, some of an image's k nearest neighbours may in fact be dissimilar to it; these noisy datapoints damage retrieval performance. To tackle this problem, we propose a novel deep unsupervised hashing method, called MLS3RDUH, which reduces the influence of noisy datapoints to further enhance retrieval performance. Specifically, the proposed method first defines a novel similarity matrix that utilises the intrinsic manifold structure of the feature space and the cosine similarity of datapoints to reconstruct the local semantic similarity structure. Then a novel log-cosh hashing loss function is used to optimize the hashing network to generate compact hash codes, with the defined similarity as guiding information. Extensive experiments on three public datasets show that the proposed method outperforms state-of-the-art baselines.
【Keywords】: Multidisciplinary Topics and Applications: Information Retrieval; Machine Learning: Unsupervised Learning; Machine Learning: Deep Learning;
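The log-cosh loss mentioned above owes its robustness to the shape of log(cosh(x)): quadratic near zero, linear for large residuals, so noisy neighbours are penalized far less than under a squared loss. The numerically stable form below is a standard identity, not the paper's full pairwise loss.

```python
import math

def log_cosh(x):
    """Numerically stable log(cosh(x)), via the identity
    log(cosh(x)) = |x| + log1p(exp(-2|x|)) - log(2).
    Behaves like x**2 / 2 near 0 and like |x| - log(2) for large |x|."""
    ax = abs(x)
    return ax + math.log1p(math.exp(-2.0 * ax)) - math.log(2.0)

# Quadratic growth for small residuals, linear growth for large ones.
print(round(log_cosh(0.1), 4))   # ≈ 0.005  (close to 0.1**2 / 2)
print(round(log_cosh(10.0), 4))  # ≈ 9.3069 (close to 10 - log 2)
```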
【Paper Link】 【Pages】:3473-3479
【Authors】: Sourav Medya ; Tiyani Ma ; Arlei Silva ; Ambuj K. Singh
【Abstract】: K-cores are maximal induced subgraphs in which all vertices have degree at least k. These dense patterns have applications in community detection, network visualization, and protein function prediction. However, k-cores can be quite unstable under network modifications, which motivates the question: how resilient is the k-core structure of a network, such as the Web or Facebook, to edge deletions? We investigate this question from an algorithmic perspective. More specifically, we study the problem of computing a small set of edges whose removal minimizes the k-core structure of a network. This paper provides a comprehensive characterization of the hardness of the k-core minimization problem (KCM), including inapproximability and parameterized complexity. Motivated by these challenges, we propose a novel algorithm inspired by the Shapley value, a cooperative game-theoretic concept, that is able to leverage the strong interdependencies among the effects of edge removals in the search space. We efficiently approximate Shapley values using a randomized algorithm with probabilistic guarantees. Our experiments show that the proposed algorithm outperforms competing solutions at k-core minimization while being able to handle large graphs. Moreover, we illustrate how KCM can be applied in analyzing the k-core resilience of networks.
【Keywords】: Multidisciplinary Topics and Applications: Social Sciences; Data Mining: Mining Graphs, Semi Structured Data, Complex Data; Data Mining: Theoretical Foundations;
【Paper Link】 【Pages】:3480-3486
【Authors】: Ferdinando Fioretto ; Lesia Mitridati ; Pascal Van Hentenryck
【Abstract】: This paper introduces a differentially private (DP) mechanism to protect the information exchanged during the coordination of sequential and interdependent markets. This coordination represents a classic Stackelberg game and relies on the exchange of sensitive information between the system agents. The paper is motivated by the observation that the perturbation introduced by traditional DP mechanisms fundamentally changes the underlying optimization problem and even leads to unsatisfiable instances. To remedy this limitation, the paper introduces the Privacy-Preserving Stackelberg Mechanism (PPSM), a framework that enforces the notions of feasibility and fidelity (i.e. near-optimality) of the privacy-preserving information to the original problem objective. PPSM complies with the notion of differential privacy and ensures that the outcomes of the privacy-preserving coordination mechanism are close to optimal for each agent. Experimental results on several gas and electricity market benchmarks based on a real case study demonstrate the effectiveness of the proposed approach. A full version of this paper [Fioretto et al., 2020b] contains complete proofs and additional discussion on the motivating application.
【Keywords】: Multidisciplinary Topics and Applications: Security and Privacy; Agent-based and Multi-agent Systems: Noncooperative Games; Constraints and SAT: Constraint Optimization;
【Paper Link】 【Pages】:3487-3493
【Authors】: Rui Liu ; Huilin Peng ; Yong Chen ; Dell Zhang
【Abstract】: Personalized news recommendation can help users stay on top of current affairs without being overwhelmed by the endless torrents of online news. However, the freshness or timeliness of news has been largely ignored by current news recommendation systems. In this paper, we propose a novel approach dubbed HyperNews, which explicitly models the effect of timeliness on news recommendation. Furthermore, we introduce an auxiliary task of predicting the so-called "active-time" that users spend on each news article. Our key finding is that it is beneficial to address the problem of news recommendation together with the related problem of active-time prediction in a multi-task learning framework. Specifically, we train a double-task deep neural network (with a built-in timeliness module) to carry out news recommendation and active-time prediction simultaneously. To the best of our knowledge, such a "kill-two-birds-with-one-stone" solution has seldom been tried in the field of news recommendation before. Our extensive experiments on real-life news datasets have not only confirmed the mutual reinforcement of news recommendation and active-time prediction but also demonstrated significant performance improvements over state-of-the-art news recommendation techniques.
【Keywords】: Multidisciplinary Topics and Applications: Information Retrieval; Multidisciplinary Topics and Applications: Recommender Systems; Data Mining: Mining Text, Web, Social Media;
【Paper Link】 【Pages】:3494-3500
【Authors】: Andrea Piazzoni ; Jim Cherian ; Martin Slavík ; Justin Dauwels
【Abstract】: Sensing and Perception (S&P) is a crucial component of an autonomous system (such as a robot), especially when deployed in highly dynamic environments where it is required to react to unexpected situations. This is particularly true in the case of Autonomous Vehicles (AVs) driving on public roads. However, the current evaluation metrics for perception algorithms are typically designed to measure their accuracy per se and do not account for their impact on the decision making subsystem(s). This limitation does not help developers and third party evaluators to answer a critical question: is the performance of a perception subsystem sufficient for the decision making subsystem to make robust, safe decisions? In this paper, we propose a simulation-based methodology towards answering this question. At the same time, we show how to analyze the impact of different kinds of sensing and perception errors on the behavior of the autonomous system.
【Keywords】: Multidisciplinary Topics and Applications: Transportation; Multidisciplinary Topics and Applications: Validation and Verification; Robotics: Vision and Perception; Agent-based and Multi-agent Systems: Agent-Based Simulation and Emergence;
【Paper Link】 【Pages】:3501-3507
【Authors】: Yunqiu Shao ; Jiaxin Mao ; Yiqun Liu ; Weizhi Ma ; Ken Satoh ; Min Zhang ; Shaoping Ma
【Abstract】: Legal case retrieval is a specialized IR task that involves retrieving supporting cases given a query case. Compared with traditional ad-hoc text retrieval, the legal case retrieval task is more challenging since the query case is much longer and more complex than common keyword queries. Besides that, the definition of relevance between a query case and a supporting case is beyond general topical relevance and it is therefore difficult to construct a large-scale case retrieval dataset, especially one with accurate relevance judgments. To address these challenges, we propose BERT-PLI, a novel model that utilizes BERT to capture the semantic relationships at the paragraph-level and then infers the relevance between two cases by aggregating paragraph-level interactions. We fine-tune the BERT model with a relatively small-scale case law entailment dataset to adapt it to the legal scenario and employ a cascade framework to reduce the computational cost. We conduct extensive experiments on the benchmark of the relevant case retrieval task in COLIEE 2019. Experimental results demonstrate that our proposed method outperforms existing solutions.
【Keywords】: Multidisciplinary Topics and Applications: Information Retrieval; Natural Language Processing: Information Retrieval; Machine Learning Applications: Other;
【Paper Link】 【Pages】:3508-3514
【Authors】: Jinhuan Liu ; Xuemeng Song ; Zhaochun Ren ; Liqiang Nie ; Zhaopeng Tu ; Jun Ma
【Abstract】: In recent years, there has been a growing interest in fashion analysis (e.g., clothing matching) due to the huge economic value of the fashion industry. The essential problem is to model the compatibility between complementary fashion items, such as the top and bottom in clothing matching. The majority of existing work on fashion analysis has focused on measuring the item-item compatibility in a latent space with deep learning methods. In this work, we aim to improve the compatibility modeling by sketching a compatible template for a given item as an auxiliary link between fashion items. Specifically, we propose an end-to-end Auxiliary Template-enhanced Generative Compatibility Modeling (AT-GCM) scheme, which introduces an auxiliary complementary template generation network equipped with pixel-wise consistency and compatible template regularization. Extensive experiments on two real-world datasets demonstrate the superiority of the proposed approach.
【Keywords】: Multidisciplinary Topics and Applications: Information Retrieval; Multidisciplinary Topics and Applications: Recommender Systems;
【Paper Link】 【Pages】:3515-3521
【Authors】: Dongxiao He ; Yue Song ; Di Jin ; Zhiyong Feng ; Binbin Zhang ; Zhizhi Yu ; Weixiong Zhang
【Abstract】: Community detection, aiming at partitioning a network into multiple substructures, is of practical importance. Graph convolutional network (GCN), a new deep-learning technique, has recently been developed for community detection. Markov Random Fields (MRF) have been combined with GCN in the MRFasGCN method to improve accuracy. However, the existing GCN community-finding methods are semi-supervised, even though community finding is essentially an unsupervised learning problem. We developed a new GCN approach for unsupervised community detection under the framework of Autoencoder. We cast MRFasGCN as an encoder and then derived node community membership in the hidden layer of the encoder. We introduced a community-centric dual decoder to reconstruct network structures and node attributes separately in an unsupervised fashion, for faithful community detection in the input space. We designed a local enhancement scheme to encourage nodes with more common neighbors and more similar attributes to have similar community memberships. Experimental results on real networks showed that our new method outperformed the best existing methods, demonstrating the effectiveness of the novel decoding mechanism, which generates links and attributes together, over the commonly used approach of reconstructing links alone.
【Keywords】: Multidisciplinary Topics and Applications: Web Analysis of Communities; Data Mining: Mining Graphs, Semi Structured Data, Complex Data; Machine Learning Applications: Networks;
【Paper Link】 【Pages】:3522-3528
【Authors】: Yumin Su ; Liang Zhang ; Quanyu Dai ; Bo Zhang ; Jinyao Yan ; Dan Wang ; Yongjun Bao ; Sulong Xu ; Yang He ; Weipeng Yan
【Abstract】: Conversion rate (CVR) prediction is becoming increasingly important in the multi-billion dollar online display advertising industry. It has two major challenges: firstly, the scarce user history data is very complicated and non-linear; secondly, the time delay between the clicks and the corresponding conversions can be very large, e.g., ranging from seconds to weeks. Existing models usually suffer from such scarce and delayed conversion behaviors. In this paper, we propose a novel deep learning framework to tackle the two challenges. Specifically, we extract the pre-trained embedding from impressions/clicks to assist in conversion models and propose an inner/self-attention mechanism to capture the fine-grained personalized product purchase interests from the sequential click data. Besides, to overcome the time-delay issue, we calibrate the delay model by learning a dynamic hazard function from the abundant post-click data, which is more in line with the real distribution. Empirical experiments with real-world user behavior data prove the effectiveness of the proposed method.
【Keywords】: Multidisciplinary Topics and Applications: Information Retrieval; Humans and AI: Personalization and User Modeling;
【Paper Link】 【Pages】:3529-3535
【Authors】: Harry W. H. Wong ; Jack P. K. Ma ; Donald P. H. Wong ; Lucien K. L. Ng ; Sherman S. M. Chow
【Abstract】: Privacy-preserving deep neural network (DNN) inference remains an intriguing problem even after rapid developments across different communities. One challenge is that cryptographic techniques such as homomorphic encryption (HE) do not natively support non-linear computations (e.g., sigmoid). A recent work, BAYHENN (Xie et al., IJCAI'19), considers HE over the Bayesian neural network (BNN). The novelty lies in "meta-prediction" over a few noisy DNNs. The claim was that the clients can get intermediate outputs (to apply non-linear functions) but are still prevented from learning the exact model parameters, which was justified via the widely-used learning-with-error (LWE) assumption (with Gaussian noises as the error). This paper refutes the security claim of BAYHENN via both theoretical and empirical analyses. We formally define a security game with different oracle queries capturing two realistic threat models. Our attack assuming a semi-honest adversary reveals all the parameters of single-layer BAYHENN, which generalizes to recovering the whole model that is "as good as" the BNN approximation of the original DNN, either under the malicious adversary model or with an increased number of oracle queries. This shows the need for rigorous security analysis ("the noise introduced by BNN can obfuscate the model" fails -- it is beyond what LWE guarantees) and calls for the collaboration between cryptographers and machine-learning experts to devise practical yet provably-secure solutions.
【Keywords】: Multidisciplinary Topics and Applications: Security and Privacy; Machine Learning: Deep Learning;
【Paper Link】 【Pages】:3536-3543
【Authors】: Zhi Li ; Bo Wu ; Qi Liu ; Likang Wu ; Hongke Zhao ; Tao Mei
【Abstract】: Complementary recommendations, which aim at providing users with product suggestions that are supplementary and compatible with their obtained items, have become a hot topic in both academia and industry in recent years. Existing work has mainly focused on modeling the co-purchased relations between two items, but the compositional associations of item collections are largely unexplored. Actually, when a user chooses complementary items for the purchased products, it is intuitive that she will consider the visual semantic coherence (such as color collocations, texture compatibilities) in addition to global impressions. Towards this end, in this paper, we propose a novel Content Attentive Neural Network (CANN) to model the comprehensive compositional coherence on both global contents and semantic contents. Specifically, we first propose a Global Coherence Learning (GCL) module based on multi-head attention to model the global compositional coherence. Then, we generate the semantic-focal representations from different semantic regions and design a Focal Coherence Learning (FCL) module to learn the focal compositional coherence from different semantic-focal representations. Finally, we optimize the CANN with a novel compositional optimization strategy. Extensive experiments on large-scale real-world data clearly demonstrate the effectiveness of CANN compared with several state-of-the-art methods.
【Keywords】: Multidisciplinary Topics and Applications: Recommender Systems; Data Mining: Mining Text, Web, Social Media; Humans and AI: Personalization and User Modeling; Machine Learning: Deep Learning;
【Paper Link】 【Pages】:3544-3550
【Authors】: Yankai Chen ; Jie Zhang ; Yixiang Fang ; Xin Cao ; Irwin King
【Abstract】: Given a graph G and a query vertex q, the topic of community search (CS), aiming to retrieve a dense subgraph of G containing q, has gained much attention. Most existing works focus on undirected graphs, which overlooks the rich information carried by the edge directions. Recently, the problem of community search over directed graphs (or CSD problem) has been studied [Fang et al., 2019b]; it finds a connected subgraph containing q, where the in-degree and out-degree of each vertex within the subgraph are at least k and l, respectively. However, existing solutions are inefficient, especially on large graphs. To tackle this issue, in this paper we propose a novel index called D-Forest, which allows a CSD query to be answered in optimal time. We further propose efficient index construction methods. Extensive experiments on six real large graphs show that our index-based query algorithm is up to two orders of magnitude faster than existing solutions.
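The degree condition alone (in-degree at least k, out-degree at least l) can be checked by iterative peeling; the naive sketch below omits the connectivity and query-vertex requirements of the full CSD problem and is an illustration, not the paper's D-Forest index:

```python
def dcore(nodes, edges, k, l):
    """Return the maximal subgraph in which every surviving vertex
    has in-degree >= k and out-degree >= l.

    edges is a set of directed pairs (u, v). Vertices violating the
    degree condition are peeled until a fixed point is reached.
    """
    alive = set(nodes)
    changed = True
    while changed:
        changed = False
        for v in list(alive):
            indeg = sum(1 for (u, w) in edges if w == v and u in alive)
            outdeg = sum(1 for (u, w) in edges if u == v and w in alive)
            if indeg < k or outdeg < l:
                alive.remove(v)
                changed = True
    return alive
```

An index like D-Forest exists precisely because recomputing this peeling per query is too slow on large graphs.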
【Keywords】: Multidisciplinary Topics and Applications: Web Analysis of Communities; Data Mining: Big Data, Large-Scale Systems; Data Mining: Mining Graphs, Semi Structured Data, Complex Data;
【Paper Link】 【Pages】:3551-3557
【Authors】: Lu Zhang ; Zhu Sun ; Jie Zhang ; Yu Lei ; Chen Li ; Ziqing Wu ; Horst Kloeden ; Felix Klanner
【Abstract】: Studies on next point-of-interest (POI) recommendation mainly seek to learn users' transition patterns with certain historical check-ins. However, in reality, users' movements are typically uncertain (i.e., fuzzy and incomplete), and most existing methods suffer from the transition pattern vanishing issue. To alleviate this issue, we propose a novel interactive multi-task learning (iMTL) framework to better exploit the interplay between activity and location preference. Specifically, iMTL introduces: (1) a temporal-aware activity encoder equipped with fuzzy characterization over uncertain check-ins to unveil the latent activity transition patterns; (2) a spatial-aware location preference encoder to capture the latent location transition patterns; and (3) a task-specific decoder to make use of the learned latent transition patterns and enhance both activity and location prediction tasks in an interactive manner. Extensive experiments on three real-world datasets show the superiority of iMTL.
【Keywords】: Multidisciplinary Topics and Applications: Recommender Systems; Humans and AI: Personalization and User Modeling;
【Paper Link】 【Pages】:3558-3564
【Authors】: Aman Abidi ; Rui Zhou ; Lu Chen ; Chengfei Liu
【Abstract】: Enumerating maximal bicliques in a bipartite graph is an important problem in data mining, with innumerable real-world applications across different domains such as web community, bioinformatics, etc. Although substantial research has been conducted on this problem, surprisingly, we find that pivot-based search space pruning, which is quite effective in clique enumeration, has not been exploited in the biclique scenario. Therefore, in this paper, we explore pivot-based pruning for biclique enumeration. We propose an algorithm implementing the pivot-based pruning, powered by an effective index structure, the Containment Directed Acyclic Graph (CDAG). Meanwhile, existing literature indicates contradictory findings on the order of vertex selection in biclique enumeration. As such, we re-examine the problem and suggest an offline ordering of vertices which expedites the pivot pruning. We conduct an extensive performance study using real-world datasets from a wide range of domains. The experimental results demonstrate that our algorithm is more scalable and outperforms all existing algorithms across all datasets, achieving significant speedups.
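For contrast with the pivot-pruned approach, a naive consensus-style enumeration of maximal bicliques (exponential in the left vertex set, which is exactly the blow-up pivot pruning targets) might look like:

```python
from itertools import combinations

def maximal_bicliques(left, adj):
    """Naively enumerate maximal bicliques of a bipartite graph.

    adj maps each left vertex to its set of right neighbours. For
    every non-empty subset of the left side, take the common right
    neighbourhood, then close back on the left; each resulting pair
    is a maximal biclique.
    """
    found = set()
    for r in range(1, len(left) + 1):
        for subset in combinations(left, r):
            right = set.intersection(*(adj[v] for v in subset))
            if not right:
                continue
            # close on the left: every vertex adjacent to all of `right`
            closed_left = frozenset(v for v in left if right <= adj[v])
            found.add((closed_left, frozenset(right)))
    return found
```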
【Keywords】: Multidisciplinary Topics and Applications: Databases; Data Mining: Mining Graphs, Semi Structured Data, Complex Data;
【Paper Link】 【Pages】:3565-3571
【Authors】: Ziye Zhu ; Yun Li ; Hanghang Tong ; Yu Wang
【Abstract】: Bug localization plays an important role in software quality control. Many supervised machine learning models have been developed based on historical bug-fix information. Despite being successful, these methods often require sufficient historical data (i.e., labels), which is not always available especially for newly developed software projects. In response, cross-project bug localization techniques have recently emerged, whose key idea is to transfer knowledge from a label-rich source project to locate bugs in the target project. However, a major limitation of these existing techniques lies in that they fail to capture the specificity of each individual project, and are thus prone to negative transfer. To address this issue, we propose an adversarial transfer learning bug localization approach, focusing on transferring only the common characteristics (i.e., public information) across projects. Specifically, our approach (CooBa) learns the indicative public information from cross-project bug reports through a shared encoder, and extracts the private information from code files by an individual feature extractor for each project. CooBa further incorporates an adversarial learning mechanism to ensure that public information shared between multiple projects can be effectively extracted. Extensive experiments on four large-scale real-world data sets demonstrate that the proposed CooBa significantly outperforms the state-of-the-art techniques.
【Keywords】: Multidisciplinary Topics and Applications: Knowledge-based Software Engineering; Data Mining: Mining Text, Web, Social Media;
【Paper Link】 【Pages】:3573-3579
【Authors】: Rana Alshaikh ; Zied Bouraoui ; Steven Schockaert
【Abstract】: Conceptual spaces are geometric meaning representations in which similar entities are represented by similar vectors. They are widely used in cognitive science, but there has been relatively little work on learning such representations from data. In particular, while standard representation learning methods can be used to induce vector space embeddings from text corpora, these differ from conceptual spaces in two crucial ways. First, the dimensions of a conceptual space correspond to salient semantic features, known as quality dimensions, whereas the dimensions of learned vector space embeddings typically lack any clear interpretation. This has been partially addressed in previous work, which has shown that it is possible to identify directions in learned vector spaces which capture semantic features. Second, conceptual spaces are normally organised into a set of domains, each of which is associated with a separate vector space. In contrast, learned embeddings represent all entities in a single vector space. Our hypothesis in this paper is that such single-space representations are sub-optimal for learning quality dimensions, due to the fact that semantic features are often only relevant to a subset of the entities. We show that this issue can be mitigated by identifying features in a hierarchical fashion. Intuitively, the top-level features split the vector space into different domains, making it possible to subsequently identify domain-specific quality dimensions.
【Keywords】: Natural Language Processing: Natural Language Processing; Machine Learning: Interpretability; Humans and AI: Cognitive Modeling;
【Paper Link】 【Pages】:3580-3586
【Authors】: Qian Liu ; Bei Chen ; Jiaqi Guo ; Jian-Guang Lou ; Bin Zhou ; Dongmei Zhang
【Abstract】: Recently, semantic parsing in context has received considerable attention, which is challenging due to complex contextual phenomena. Previous works verified their proposed methods in limited scenarios, which motivates us to conduct an exploratory study on context modeling methods under real-world semantic parsing in context. We present a grammar-based decoding semantic parser and adapt typical context modeling methods on top of it. We evaluate 13 context modeling methods on two large complex cross-domain datasets, and our best model achieves state-of-the-art performances on both datasets with significant improvements. Furthermore, we summarize the most frequent contextual phenomena, with a fine-grained analysis on representative models, which may shed light on potential research directions. Our code is available at https://github.com/microsoft/ContextualSP.
【Keywords】: Natural Language Processing: Natural Language Semantics; Natural Language Processing: Coreference Resolution; Natural Language Processing: Dialogue; Natural Language Processing: Natural Language Processing;
【Paper Link】 【Pages】:3587-3593
【Authors】: Guanhua Chen ; Yun Chen ; Yong Wang ; Victor O. K. Li
【Abstract】: Leveraging lexical constraints is extremely significant in domain-specific machine translation and interactive machine translation. Previous studies mainly focus on extending the beam search algorithm or augmenting the training corpus by replacing source phrases with the corresponding target translation. These methods either suffer from heavy computation cost during inference or depend on the quality of a bilingual dictionary pre-specified by the user or constructed with statistical machine translation. In response to these problems, we present a conceptually simple and empirically effective data augmentation approach for lexically constrained neural machine translation. Specifically, we make constraint-aware training data by first randomly sampling phrases of the reference as constraints, and then packing them together into the source sentence with a separation symbol. Extensive experiments on several language pairs demonstrate that our approach achieves superior translation results over the existing systems, improving translation of constrained sentences without hurting the unconstrained ones.
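The augmentation step can be sketched in a few lines; the token and separator names below are illustrative, not the paper's:

```python
import random

def make_constrained_example(src_tokens, ref_tokens, max_len=3,
                             sep="<sep>", seed=None):
    """Sample a random contiguous phrase from the reference and pack
    it onto the source side after a separator token, mimicking the
    constraint-aware augmentation described in the abstract.
    """
    rng = random.Random(seed)
    n = len(ref_tokens)
    length = rng.randint(1, min(max_len, n))   # phrase length
    start = rng.randint(0, n - length)         # phrase position
    constraint = ref_tokens[start:start + length]
    return src_tokens + [sep] + constraint
```

At training time the model sees such augmented sources paired with the original references, so it learns to copy the packed phrase into its output.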
【Keywords】: Natural Language Processing: Machine Translation; Natural Language Processing: Natural Language Processing;
【Paper Link】 【Pages】:3594-3600
【Authors】: Zhiwei Yang ; Hechang Chen ; Jiawei Zhang ; Jing Ma ; Yi Chang
【Abstract】: Named entity recognition (NER) is a fundamental task in the natural language processing (NLP) area. Recently, representation learning methods (e.g., character embedding and word embedding) have achieved promising recognition results. However, existing models only consider partial features derived from words or characters while failing to integrate semantic and syntactic information (e.g., capitalization, inter-word relations, keywords, lexical phrases, etc.) from multi-level perspectives. Intuitively, multi-level features can be helpful when recognizing named entities from complex sentences. In this study, we propose a novel framework called attention-based multi-level feature fusion (AMFF), which captures multi-level features from different perspectives to improve NER. Our model consists of four components to respectively capture the local character-level, global character-level, local word-level, and global word-level features, which are then fed into a BiLSTM-CRF network for the final sequence labeling. Extensive experimental results on four benchmark datasets show that our proposed model outperforms a set of state-of-the-art baselines.
【Keywords】: Natural Language Processing: Information Extraction; Natural Language Processing: Tagging, chunking, and parsing; Natural Language Processing: Named Entities; Natural Language Processing: Natural Language Processing;
【Paper Link】 【Pages】:3601-3607
【Authors】: Hengyi Cai ; Hongshen Chen ; Yonghao Song ; Xiaofang Zhao ; Dawei Yin
【Abstract】: Humans benefit from previous experiences when taking actions. Similarly, related examples from the training data also provide exemplary information for neural dialogue models when responding to a given input message. However, effectively fusing such exemplary information into dialogue generation is non-trivial: useful exemplars are required to be not only literally similar, but also topic-related with the given context. Noisy exemplars impair the neural dialogue model's understanding of the conversation topics and even corrupt the response generation. To address these issues, we propose an exemplar-guided neural dialogue generation model where exemplar responses are retrieved in terms of both text similarity and topic proximity through a two-stage exemplar retrieval model. In the first stage, a small subset of conversations is retrieved from the training set given a dialogue context. These candidate exemplars are then finely ranked by topical proximity to choose the best-matched exemplar response. To further induce the neural dialogue generation model to consult the exemplar response and the conversation topics more faithfully, we introduce a multi-source sampling mechanism to provide the dialogue model with both local exemplary semantics and global topical guidance during decoding. Empirical evaluations on a large-scale conversation dataset show that the proposed approach significantly outperforms the state-of-the-art in terms of both quantitative metrics and human evaluations.
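The two-stage retrieve-then-rerank idea can be sketched generically; the similarity functions below are placeholders, not the paper's text-matching or topic models:

```python
def two_stage_retrieve(query, corpus, text_sim, topic_sim, k=10):
    """Two-stage exemplar retrieval: shortlist the corpus by cheap
    text similarity, then re-rank the shortlist by topical proximity
    and return the best-matched exemplar.
    """
    shortlist = sorted(corpus, key=lambda d: text_sim(query, d),
                       reverse=True)[:k]
    return max(shortlist, key=lambda d: topic_sim(query, d))
```

The cheap first stage keeps the expensive topical ranking tractable, since it only ever scores k candidates.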
【Keywords】: Natural Language Processing: Dialogue; Natural Language Processing: Natural Language Generation; Natural Language Processing: Natural Language Processing; Machine Learning: Deep Learning: Sequence Modeling;
【Paper Link】 【Pages】:3608-3614
【Authors】: Jian Liu ; Yubo Chen ; Jun Zhao
【Abstract】: Identifying causal relations of events is a crucial language understanding task. Despite many efforts on this task, existing methods lack the ability to adopt background knowledge, and they typically generalize poorly to new, previously unseen data. In this paper, we present a new method for event causality identification, aiming to address the limitations of previous methods. On the one hand, our model can leverage external knowledge for reasoning, which can greatly enrich the representation of events; on the other hand, our model can mine event-agnostic, context-specific patterns, via a mechanism called event mention masking generalization, which can greatly enhance the ability of our model to handle new, previously unseen cases. In experiments, we evaluate our model on three benchmark datasets and show our model outperforms previous methods by a significant margin. Moreover, we perform 1) cross-topic adaptation, 2) exploitation of unseen predicates, and 3) cross-task adaptation to evaluate the generalization ability of our model. Experimental results show that our model demonstrates a definite advantage over previous methods.
【Keywords】: Natural Language Processing: Information Extraction; Natural Language Processing: Knowledge Extraction; Natural Language Processing: Discourse;
【Paper Link】 【Pages】:3615-3621
【Authors】: Jiale Han ; Bo Cheng ; Xu Wang
【Abstract】: Multi-hop knowledge base question answering (KBQA) aims at finding the answers to a factoid question by reasoning across multiple triples. Note that when a human performs multi-hop reasoning, one tends to concentrate on a specific relation at each hop and pinpoint the group of entities connected by that relation. Hypergraph convolutional networks (HGCN) can simulate this behavior by leveraging hyperedges that connect more than two nodes, going beyond pairwise connections. However, HGCN is designed for undirected graphs and does not consider the direction of information transmission. We introduce the directed HGCN (DHGCN) to adapt to knowledge graphs with directionality. Inspired by humans' hop-by-hop reasoning, we propose an interpretable KBQA model based on DHGCN, namely two-phase hypergraph based reasoning with dynamic relations, which explicitly updates relation information and dynamically pays attention to different relations at different hops. Moreover, the model predicts relations hop-by-hop to generate an intermediate relation path. We conduct extensive experiments on two widely used multi-hop KBQA datasets to prove the effectiveness of our model.
【Keywords】: Natural Language Processing: Question Answering;
【Paper Link】 【Pages】:3622-3628
【Authors】: Jian Liu ; Leyang Cui ; Hanmeng Liu ; Dandan Huang ; Yile Wang ; Yue Zhang
【Abstract】: Machine reading is a fundamental task for testing the capability of natural language understanding, which is closely related to human cognition in many aspects. With the rise of deep learning techniques, algorithmic models rival human performance on simple QA, and thus increasingly challenging machine reading datasets have been proposed. Though various challenges such as evidence integration and commonsense knowledge have been integrated, one of the fundamental capabilities in human reading, namely logical reasoning, has not been fully investigated. We build a comprehensive dataset, named LogiQA, which is sourced from expert-written questions for testing human logical reasoning. It consists of 8,678 QA instances, covering multiple types of deductive reasoning. Results show that state-of-the-art neural models perform far worse than the human ceiling. Our dataset can also serve as a benchmark for reinvestigating logical AI under the deep learning NLP setting. The dataset is freely available at https://github.com/lgw863/LogiQA-dataset.
【Keywords】: Natural Language Processing: Resources and Evaluation; Natural Language Processing: Question Answering;
【Paper Link】 【Pages】:3629-3636
【Authors】: Zhongyang Li ; Xiao Ding ; Ting Liu ; J. Edward Hu ; Benjamin Van Durme
【Abstract】: We present a conditional text generation framework that posits sentential expressions of possible causes and effects. This framework depends on two novel resources we develop in the course of this work: a very large-scale collection of English sentences expressing causal patterns (CausalBank); and a refinement over previous work on constructing large lexical causal knowledge graphs (Cause Effect Graph). Further, we extend prior work in lexically-constrained decoding to support disjunctive positive constraints. Human assessment confirms that our approach gives high-quality and diverse outputs. Finally, we use CausalBank to perform continued training of an encoder supporting a recent state-of-the-art model for causal reasoning, leading to a 3-point improvement on the COPA challenge set, with no change in model architecture.
【Keywords】: Natural Language Processing: Natural Language Generation; Natural Language Processing: Knowledge Extraction; Natural Language Processing: Natural Language Processing; Knowledge Representation and Reasoning: Action, Change and Causality;
【Paper Link】 【Pages】:3637-3643
【Authors】: Shifeng Li ; Shi Feng ; Daling Wang ; Kaisong Song ; Yifei Zhang ; Weichao Wang
【Abstract】: Generating emotional responses is crucial for building human-like dialogue systems. However, existing studies have focused only on generating responses by controlling the agents' emotions, while the feelings of the users, which are the ultimate concern of a dialogue system, have been neglected. In this paper, we propose a novel variational model named EmoElicitor to generate appropriate responses that can elicit a user's specific emotion. We incorporate the next-round utterance after the response into the posterior network to enrich the context, and we decompose the single latent variable into several sequential ones to guide response generation with the help of a pre-trained language model. Extensive experiments conducted on a real-world dataset show that EmoElicitor not only performs better than the baselines in terms of diversity and semantic similarity, but can also elicit emotion with higher accuracy.
【Keywords】: Natural Language Processing: Dialogue; Natural Language Processing: Natural Language Generation;
【Paper Link】 【Pages】:3644-3650
【Authors】: Yu Zeng ; Yan Gao ; Jiaqi Guo ; Bei Chen ; Qian Liu ; Jian-Guang Lou ; Fei Teng ; Dongmei Zhang
【Abstract】: Neural semantic parsers usually fail to parse long and complicated utterances into nested SQL queries, due to the large search space. In this paper, we propose a novel recursive semantic parsing framework called RECPARSER to generate the nested SQL query layer-by-layer. It decomposes the complicated nested SQL query generation problem into several progressive non-nested SQL query generation problems. Furthermore, we propose a novel Question Decomposer module to explicitly encourage RECPARSER to focus on different components of an utterance when predicting SQL queries of different layers. Experiments on the Spider dataset show that our approach is more effective than previous works at predicting nested SQL queries. In addition, we achieve an overall accuracy that is comparable with state-of-the-art approaches.
【Keywords】: Natural Language Processing: Natural Language Generation; Natural Language Processing: Natural Language Processing; Natural Language Processing: Natural Language Semantics;
【Paper Link】 【Pages】:3651-3657
【Authors】: Zhijiang Guo ; Guoshun Nan ; Wei Lu ; Shay B. Cohen
【Abstract】: The goal of medical relation extraction is to detect relations among entities, such as genes, mutations and drugs in medical texts. Dependency tree structures have been proven useful for this task. Existing approaches to such relation extraction leverage off-the-shelf dependency parsers to obtain a syntactic tree or forest for the text. However, for the medical domain, low parsing accuracy may lead to error propagation downstream in the relation extraction pipeline. In this work, we propose a novel model which treats the dependency structure as a latent variable and induces it from the unstructured text in an end-to-end fashion. Our model can be understood as composing task-specific dependency forests that capture non-local interactions for better relation extraction. Extensive results on four datasets show that our model is able to significantly outperform state-of-the-art systems without relying on any direct tree supervision or pre-training.
【Keywords】: Natural Language Processing: Information Extraction; Natural Language Processing: Natural Language Processing;
【Paper Link】 【Pages】:3658-3664
【Authors】: Hao Nie ; Xianpei Han ; Le Sun ; Chi Man Wong ; Qiang Chen ; Suhui Wu ; Wei Zhang
【Abstract】: Entity alignment (EA) aims to identify entities located in different knowledge graphs (KGs) that refer to the same real-world object. To learn the entity representations, most EA approaches rely on either translation-based methods, which capture the local relation semantics of entities, or graph convolutional networks (GCNs), which exploit the global KG structure. Afterward, the aligned entities are identified based on their distances. In this paper, we propose to jointly leverage the global KG structure and entity-specific relational triples for better entity alignment. Specifically, a global structure and local semantics preserving network is proposed to learn entity representations in a coarse-to-fine manner. Experiments on several real-world datasets show that our method significantly outperforms other entity alignment approaches and achieves new state-of-the-art performance.
【Keywords】: Natural Language Processing: Embeddings; Natural Language Processing: Named Entities; Natural Language Processing: Natural Language Processing;
【Paper Link】 【Pages】:3665-3671
【Authors】: Cheng Fu ; Xianpei Han ; Jiaming He ; Le Sun
【Abstract】: Entity resolution (ER) aims to identify data records referring to the same real-world entity. Most existing ER approaches rely on the assumption that the entity records to be resolved are homogeneous, i.e., their attributes are aligned. Unfortunately, entities in real-world datasets are often heterogeneous, usually coming from different sources and being represented using different attributes. Furthermore, the entities’ attribute values may be redundant, noisy, missing, misplaced, or misspelled—we refer to it as the dirty data problem. To resolve the above problems, this paper proposes an end-to-end hierarchical matching network (HierMatcher) for entity resolution, which can jointly match entities in three levels—token, attribute, and entity. At the token level, a cross-attribute token alignment and comparison layer is designed to adaptively compare heterogeneous entities. At the attribute level, an attribute-aware attention mechanism is proposed to denoise dirty attribute values. Finally, the entity level matching layer effectively aggregates all matching evidence for the final ER decisions. Experimental results show that our method significantly outperforms previous ER methods on homogeneous, heterogeneous and dirty datasets.
【Keywords】: Natural Language Processing: Coreference Resolution; Natural Language Processing: Information Extraction; Data Mining: Classification, Semi-Supervised Learning;
【Paper Link】 【Pages】:3672-3678
【Authors】: Juntao Li ; Ruidan He ; Hai Ye ; Hwee Tou Ng ; Lidong Bing ; Rui Yan
【Abstract】: Recent research indicates that pretraining cross-lingual language models on large-scale unlabeled texts yields significant performance improvements over various cross-lingual and low-resource tasks. Through training on one hundred languages and terabytes of texts, cross-lingual language models have proven to be effective in leveraging high-resource languages to enhance low-resource language processing and outperform monolingual models. In this paper, we further investigate the cross-lingual and cross-domain (CLCD) setting when a pretrained cross-lingual language model needs to adapt to new domains. Specifically, we propose a novel unsupervised feature decomposition method that can automatically extract domain-specific features and domain-invariant features from the entangled pretrained cross-lingual representations, given unlabeled raw texts in the source language. Our proposed model leverages mutual information estimation to decompose the representations computed by a cross-lingual model into domain-invariant and domain-specific parts. Experimental results show that our proposed method achieves significant performance improvements over the state-of-the-art pretrained cross-lingual language model in the CLCD setting.
【Keywords】: Natural Language Processing: Sentiment Analysis and Text Mining;
【Paper Link】 【Pages】:3679-3686
【Authors】: Yuncheng Hua ; Yuan-Fang Li ; Gholamreza Haffari ; Guilin Qi ; Wei Wu
【Abstract】: A compelling approach to complex question answering is to convert the question to a sequence of actions, which can then be executed on the knowledge base to yield the answer, aka the programmer-interpreter approach. Using training questions similar to the test question, meta-learning enables the programmer to quickly adapt to unseen questions and tackle potential distributional biases. However, this comes at the cost of manually labeling similar questions to learn a retrieval model, which is tedious and expensive. In this paper, we present a novel method that automatically learns a retrieval model alternately with the programmer from weak supervision, i.e., the system’s performance with respect to the produced answers. To the best of our knowledge, this is the first attempt to train the retrieval model jointly with the programmer. Our system leads to state-of-the-art performance on a large-scale task for complex question answering over knowledge bases. We have released our code at https://github.com/DevinJake/MARL.
【Keywords】: Natural Language Processing: Natural Language Processing; Natural Language Processing: Question Answering;
【Paper Link】 【Pages】:3687-3693
【Authors】: Weijing Huang ; Xianfeng Liao ; Zhiqiang Xie ; Jiang Qian ; Bojin Zhuang ; Shaojun Wang ; Jing Xiao
【Abstract】: Due to improvements in Language Modeling, emerging NLP assistant tools for text generation greatly reduce the human workload of writing documents. However, generating legal text poses greater challenges than ordinary text because of its high requirement for keeping logic reasonable, which Language Modeling alone cannot currently guarantee. To generate reasonable legal documents, we propose a novel method, CoLMQA, which (1) combines Language Modeling and Question Answering, (2) generates text with slots by Language Modeling, and (3) fills the slots by our proposed Question Answering method named Transformer-based Key-Value Memory Networks. In CoLMQA, the slots represent the text parts that need to be highly constrained by logic, such as the name of the law and the number of the law article. The Question Answering component fills the slots in context with the help of a Legal Knowledge Base to keep the logic reasonable. Experiments verify the quality of legal documents generated by CoLMQA, which surpass documents generated by pure Language Modeling.
【Keywords】: Natural Language Processing: Natural Language Generation; Natural Language Processing: Question Answering; Natural Language Processing: NLP Applications and Tools;
【Paper Link】 【Pages】:3694-3701
【Authors】: Xuancheng Huang ; Jiacheng Zhang ; Zhixing Tan ; Derek F. Wong ; Huanbo Luan ; Jingfang Xu ; Maosong Sun ; Yang Liu
【Abstract】: System combination is an important technique for combining the hypotheses of different machine translation systems to improve translation performance. Although early statistical approaches to system combination have been proven effective in analyzing the consensus between hypotheses, they suffer from the error propagation problem due to the use of pipelines. While this problem has been alleviated by end-to-end training of multi-source sequence-to-sequence models recently, these neural models do not explicitly analyze the relations between hypotheses and fail to capture their agreement because the attention to a word in a hypothesis is calculated independently, ignoring the fact that the word might occur in multiple hypotheses. In this work, we propose an approach to modeling voting for system combination in machine translation. The basic idea is to enable words in hypotheses from different systems to vote on words that are representative and should get involved in the generation process. This can be done by quantifying the influence of each voter and its preference for each candidate. Our approach combines the advantages of statistical and neural methods since it can not only analyze the relations between hypotheses but also allow for end-to-end training. Experiments show that our approach is capable of better taking advantage of the consensus between hypotheses and achieves significant improvements over state-of-the-art baselines on Chinese-English and English-German machine translation tasks.
【Keywords】: Natural Language Processing: Machine Translation; Natural Language Processing: Natural Language Processing;
【Paper Link】 【Pages】:3702-3708
【Authors】: Xin Lian ; Kshitij Jain ; Jakub Truszkowski ; Pascal Poupart ; Yaoliang Yu
【Abstract】: We study unsupervised multilingual alignment, the problem of finding word-to-word translations between multiple languages without using any parallel data. One popular strategy is to reduce multilingual alignment to the much simplified bilingual setting, by picking one of the input languages as the pivot language that we transit through. However, it is well-known that transiting through a poorly chosen pivot language (such as English) may severely degrade the translation quality, since the assumed transitive relations among all pairs of languages may not be enforced in the training process. Instead of going through a rather arbitrarily chosen pivot language, we propose to use the Wasserstein barycenter as a more informative ``mean'' language: it encapsulates information from all languages and minimizes all pairwise transportation costs. We evaluate our method on standard benchmarks and demonstrate state-of-the-art performances.
【Keywords】: Natural Language Processing: Machine Translation; Machine Learning: Unsupervised Learning; Machine Learning Applications: Applications of Unsupervised Learning;
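The barycenter idea above can be illustrated in one dimension, where the Wasserstein-2 barycenter of equally sized empirical samples reduces to averaging their sorted values. This is a minimal sketch under that simplifying assumption (the paper works over high-dimensional word-embedding distributions, and the function name here is illustrative, not from the paper):

```python
import numpy as np

def w2_barycenter_1d(samples_list, weights=None):
    # For 1-D distributions, the Wasserstein-2 barycenter of equally sized
    # empirical samples is the (weighted) average of their sorted values,
    # i.e. an average in quantile space rather than in value space.
    sorted_s = [np.sort(np.asarray(s, dtype=float)) for s in samples_list]
    if weights is None:
        weights = np.full(len(sorted_s), 1.0 / len(sorted_s))
    return np.average(np.stack(sorted_s), axis=0, weights=weights)
```

For example, the barycenter of the samples {0, 1, 2} and {2, 3, 4} is {1, 2, 3}: each quantile sits midway between the corresponding quantiles of the inputs, which is the "mean language" intuition the abstract describes.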
【Paper Link】 【Pages】:3709-3715
【Authors】: Conghui Tan ; Di Jiang ; Jinhua Peng ; Xueyang Wu ; Qian Xu ; Qiang Yang
【Abstract】: Due to the rising awareness of privacy protection and the voluminous scale of speech data, it is becoming infeasible for Automatic Speech Recognition (ASR) system developers to train the acoustic model with complete data as before. In this paper, we propose a novel Divide-and-Merge paradigm to solve salient problems plaguing the ASR field. In the Divide phase, multiple acoustic models are trained based upon different subsets of the complete speech data, while in the Merge phase two novel algorithms are utilized to generate a high-quality acoustic model based upon those trained on data subsets. We first propose the Genetic Merge Algorithm (GMA), which is a highly specialized algorithm for optimizing acoustic models but suffers from low efficiency. We further propose the SGD-Based Optimizational Merge Algorithm (SOMA), which effectively alleviates the efficiency bottleneck of GMA and maintains superior performance. Extensive experiments on public data show that the proposed methods can significantly outperform the state-of-the-art.
【Keywords】: Natural Language Processing: Speech;
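As background for the Merge phase, the simplest way to combine models trained on disjoint data subsets is plain parameter averaging. The sketch below (names hypothetical, parameters represented as name-to-array dicts) shows that naive baseline; GMA and SOMA can be viewed as searching for better combinations than the uniform average:

```python
import numpy as np

def merge_models(param_sets, weights=None):
    # Naive weighted parameter averaging over models trained on different
    # data subsets -- a simplified stand-in for a learned merge.
    # param_sets: list of dicts mapping parameter name -> numpy array.
    if weights is None:
        weights = [1.0 / len(param_sets)] * len(param_sets)
    merged = {}
    for name in param_sets[0]:
        merged[name] = sum(w * p[name] for w, p in zip(weights, param_sets))
    return merged
```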
【Paper Link】 【Pages】:3716-3722
【Authors】: Tanya Chowdhury ; Sachin Kumar ; Tanmoy Chakraborty
【Abstract】: Attentional, RNN-based encoder-decoder architectures have obtained impressive performance on abstractive summarization of news articles. However, these methods fail to account for long term dependencies within the sentences of a document. This problem is exacerbated in multi-document summarization tasks such as summarizing the popular opinion in threads present in community question answering (CQA) websites such as Yahoo! Answers and Quora. These threads contain answers which often overlap or contradict each other. In this work, we present a hierarchical encoder based on structural attention to model such inter-sentence and inter-document dependencies. We set the popular pointer-generator architecture and some of the architectures derived from it as our baselines and show that they fail to generate good summaries in a multi-document setting. We further illustrate that our proposed model achieves significant improvement over the baseline in both single and multi-document summarization settings -- in the former setting, it beats the baseline by 1.31 and 7.8 ROUGE-1 points on CNN and CQA datasets, respectively; in the latter setting, the performance is further improved by 1.6 ROUGE-1 points on the CQA dataset.
【Keywords】: Natural Language Processing: Natural Language Summarization; Natural Language Processing: Other;
【Paper Link】 【Pages】:3723-3729
【Authors】: Zechang Li ; Yuxuan Lai ; Yansong Feng ; Dongyan Zhao
【Abstract】: Recently, semantic parsing has attracted much attention in the community. Although many neural modeling efforts have greatly improved the performance, it still suffers from the data scarcity issue. In this paper, we propose a novel semantic parser for domain adaptation, where we have much fewer annotated data in the target domain compared to the source domain. Our semantic parser benefits from a two-stage coarse-to-fine framework, thus can provide different and accurate treatments for the two stages, i.e., focusing on domain invariant and domain specific information, respectively. In the coarse stage, our novel domain discrimination component and domain relevance attention encourage the model to learn transferable domain general structures. In the fine stage, the model is guided to concentrate on domain related details. Experiments on a benchmark dataset show that our method consistently outperforms several popular domain adaptation strategies. Additionally, we show that our model can well exploit limited target data to capture the difference between the source and target domain, even when the target domain has far fewer training instances.
【Keywords】: Natural Language Processing: Natural Language Processing; Natural Language Processing: Natural Language Semantics;
【Paper Link】 【Pages】:3730-3736
【Authors】: Yimeng Chen ; Yanyan Lan ; Ruibin Xiong ; Liang Pang ; Zhiming Ma ; Xueqi Cheng
【Abstract】: Embedding-based evaluation measures have shown promising improvements in correlation with human judgments in natural language generation. In these measures, various intrinsic metrics are used in the computation, including generalized precision, recall, F-score and the earth mover's distance. However, the relations between these metrics are unclear, making it difficult to determine which measure to use in real applications. In this paper, we provide an in-depth study of the relations between these metrics. Inspired by optimal transportation theory, we prove that these metrics correspond to the optimal transport problem with different hard marginal constraints. However, these hard marginal constraints may cause the problem of incomplete and noisy matching in the evaluation process. Therefore we propose a family of new evaluation metrics, namely Lazy Earth Mover's Distances, based on the more general unbalanced optimal transport problem. Experimental results on WMT18 and WMT19 show that our proposed metrics have the ability to produce more consistent evaluation results with human judgements, as compared with existing intrinsic metrics.
【Keywords】: Natural Language Processing: Natural Language Generation; Natural Language Processing: Machine Translation; Natural Language Processing: Dialogue;
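For intuition about the underlying metric, the (balanced) earth mover's distance between two 1-D histograms on a shared support equals the L1 distance between their cumulative distributions. This is background for the standard balanced case only; the paper's Lazy EMD relaxes exactly these hard marginal constraints:

```python
import numpy as np

def emd_1d(p, q):
    # Balanced 1-D earth mover's distance between two histograms defined
    # on the same equally spaced support: the L1 distance between CDFs.
    p = np.asarray(p, dtype=float) / np.sum(p)
    q = np.asarray(q, dtype=float) / np.sum(q)
    return float(np.abs(np.cumsum(p - q)).sum())
```

Moving all mass two bins to the right, e.g. from [1, 0, 0] to [0, 0, 1], costs 2, matching the "work = mass x distance" picture.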
【Paper Link】 【Pages】:3737-3743
【Authors】: Hainan Zhang ; Yanyan Lan ; Liang Pang ; Hongshen Chen ; Zhuoye Ding ; Dawei Yin
【Abstract】: Topic drift is a common phenomenon in multi-turn dialogue. Therefore, an ideal dialogue generation model should be able to capture the topic information of each context, detect the relevant context, and produce appropriate responses accordingly. However, existing models usually use word- or sentence-level similarities to detect the relevant contexts, which fail to well capture the topic-level relevance. In this paper, we propose a new model, named STAR-BTM, to tackle this problem. Firstly, the Biterm Topic Model is pre-trained on the whole training dataset. Then, the topic-level attention weights are computed based on the topic representation of each context. Finally, the attention weights and the topic distribution are utilized in the decoding process to generate the corresponding responses. Experimental results on both Chinese customer services data and English Ubuntu dialogue data show that STAR-BTM significantly outperforms several state-of-the-art methods, in terms of both metric-based and human evaluations.
【Keywords】: Natural Language Processing: Dialogue; Natural Language Processing: Natural Language Generation;
【Paper Link】 【Pages】:3744-3750
【Authors】: Hyeongju Kim ; Hyeonseung Lee ; Woo Hyun Kang ; Hyung Yong Kim ; Nam Soo Kim
【Abstract】: For multi-channel speech recognition, speech enhancement techniques such as denoising or dereverberation are conventionally applied as a front-end processor. Deep learning-based front-ends using such techniques require aligned clean and noisy speech pairs which are generally obtained via data simulation. Recently, several joint optimization techniques have been proposed to train the front-end without parallel data within an end-to-end automatic speech recognition (ASR) scheme. However, the ASR objective is sub-optimal and insufficient for fully training the front-end, which still leaves room for improvement. In this paper, we propose a novel approach which incorporates flow-based density estimation for the robust front-end using non-parallel clean and noisy speech. Experimental results on the CHiME-4 dataset show that the proposed method outperforms the conventional techniques where the front-end is trained only with ASR objective.
【Keywords】: Natural Language Processing: Speech; Machine Learning: Deep Generative Models; Machine Learning: Transfer, Adaptation, Multi-task Learning;
【Paper Link】 【Pages】:3751-3758
【Authors】: Yongrui Chen ; Huiying Li ; Yuncheng Hua ; Guilin Qi
【Abstract】: Formal query building is an important part of complex question answering over knowledge bases. It aims to build correct executable queries for questions. Recent methods try to rank candidate queries generated by a state-transition strategy. However, this candidate generation strategy ignores the structure of queries, resulting in a considerable number of noisy queries. In this paper, we propose a new formal query building approach that consists of two stages. In the first stage, we predict the query structure of the question and leverage the structure to constrain the generation of the candidate queries. We propose a novel graph generation framework to handle the structure prediction task and design an encoder-decoder model to predict the argument of the predetermined operation in each generative step. In the second stage, we follow the previous methods to rank the candidate queries. The experimental results show that our formal query building approach outperforms existing methods on complex questions while staying competitive on simple questions.
【Keywords】: Natural Language Processing: Natural Language Processing; Natural Language Processing: Question Answering;
【Paper Link】 【Pages】:3759-3765
【Authors】: Ye Lin ; Yanyang Li ; Tengbo Liu ; Tong Xiao ; Tongran Liu ; Jingbo Zhu
【Abstract】: 8-bit integer inference, as a promising direction in reducing both the latency and storage of deep neural networks, has made great progress recently. On the other hand, previous systems still rely on 32-bit floating point for certain functions in complex models (e.g., Softmax in Transformer), and make heavy use of quantization and de-quantization. In this work, we show that after a principled modification on the Transformer architecture, dubbed Integer Transformer, an (almost) fully 8-bit integer inference algorithm Scale Propagation could be derived. De-quantization is adopted when necessary, which makes the network more efficient. Our experiments on WMT16 En<->Ro, WMT14 En<->De and En->Fr translation tasks as well as the WikiText-103 language modelling task show that the fully 8-bit Transformer system achieves comparable performance with the floating point baseline but requires a nearly 4x smaller memory footprint.
【Keywords】: Natural Language Processing: Natural Language Processing; Natural Language Processing: Machine Translation;
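The quantize/de-quantize round trip that the abstract says previous systems use heavily can be sketched as symmetric per-tensor int8 quantization. This is a generic illustration of that baseline operation, not the paper's Scale Propagation, which is designed precisely to avoid repeated round trips:

```python
import numpy as np

def quantize(x, scale=None):
    # Symmetric per-tensor int8 quantization: x is approximated by scale * q,
    # with q an int8 tensor in [-127, 127].
    if scale is None:
        scale = float(np.abs(x).max()) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Map int8 values back to approximate float32 values.
    return q.astype(np.float32) * scale
```

The round trip is lossy (about 1% of the max magnitude here), which is why minimizing the number of quantize/de-quantize steps matters.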
【Paper Link】 【Pages】:3766-3772
【Authors】: Sixing Wu ; Ying Li ; Dawei Zhang ; Yang Zhou ; Zhonghai Wu
【Abstract】: Insufficient semantic understanding of dialogue always leads to generic responses in generative dialogue systems. Recently, high-quality knowledge bases have been introduced to enhance dialogue understanding, as well as to reduce the prevalence of boring responses. Although such knowledge-aware approaches have shown tremendous potential, they always utilize the knowledge in a black-box fashion. As a result, the generation process is somewhat uncontrollable, and it is also not interpretable. In this paper, we introduce a topic fact-based commonsense knowledge-aware approach, TopicKA. Different from previous works, TopicKA generates responses conditioned not only on the query message but also on a topic fact with an explicit semantic meaning, which also controls the direction of generation. Topic facts are recommended by a recommendation network trained under the Teacher-Student framework. To integrate the recommendation network and the generation network, this paper designs four schemes, which include two non-sampling schemes and two sampling methods. We collected and constructed a large-scale Chinese commonsense knowledge graph. Experimental results on an open Chinese benchmark dataset indicate that our model outperforms baselines in terms of both the objective and the subjective metrics.
【Keywords】: Natural Language Processing: Dialogue; Natural Language Processing: Natural Language Generation;
【Paper Link】 【Pages】:3773-3779
【Authors】: Yang Bai ; Ziran Li ; Ning Ding ; Ying Shen ; Hai-Tao Zheng
【Abstract】: We study the problem of infobox-to-text generation that aims to generate a textual description from a key-value table. Representing the input infobox as a sequence, previous neural methods using end-to-end models without order-planning suffer from the problems of incoherence and inadaptability to disordered input. Recent planning-based models only implement static order-planning to guide the generation, which may cause error propagation between planning and generation. To address these issues, we propose a Tree-like PLanning based Attention Network (Tree-PLAN) which leverages both static order-planning and dynamic tuning to guide the generation. A novel tree-like tuning encoder is designed to dynamically tune the static order-plan for better planning by merging the most relevant attributes together layer by layer. Experiments conducted on two datasets show that our model outperforms previous methods on both automatic and human evaluation, and demonstrate that our model has better adaptability to disordered input.
【Keywords】: Natural Language Processing: Natural Language Generation; Natural Language Processing: Natural Language Processing;
【Paper Link】 【Pages】:3780-3786
【Authors】: Ziran Li ; Zibo Lin ; Ning Ding ; Hai-Tao Zheng ; Ying Shen
【Abstract】: Generating a textual description from a set of RDF triplets is a challenging task in natural language generation. Recent neural methods have become the mainstream for this task, which often generate sentences from scratch. However, due to the huge gap between the structured input and the unstructured output, the input triples alone are insufficient to decide an expressive and specific description. In this paper, we propose a novel anchor-to-prototype framework to bridge the gap between structured RDF triples and natural text. The model retrieves a set of prototype descriptions from the training data and extracts writing patterns from them to guide the generation process. Furthermore, to make a more precise use of the retrieved prototypes, we employ a triple anchor that aligns the input triples into groups so as to better match the prototypes. Experimental results on both English and Chinese datasets show that our method significantly outperforms the state-of-the-art baselines in terms of both automatic and manual evaluation, demonstrating the benefit of learning guidance from retrieved prototypes to facilitate triple-to-text generation.
【Keywords】: Natural Language Processing: Natural Language Generation; Natural Language Processing: Natural Language Processing; Natural Language Processing: NLP Applications and Tools; Natural Language Processing: Other;
【Paper Link】 【Pages】:3787-3793
【Authors】: Jie Liu ; Shaowei Chen ; Bingquan Wang ; Jiaxin Zhang ; Na Li ; Tong Xu
【Abstract】: Joint entity and relation extraction is critical for many natural language processing (NLP) tasks, which has attracted increasing research interest. However, it is still faced with the challenges of identifying the overlapping relation triplets along with the entire entity boundary and detecting the multi-type relations. In this paper, we propose an attention-based joint model, which mainly contains an entity extraction module and a relation detection module, to address the challenges. The key of our model is devising a supervised multi-head self-attention mechanism as the relation detection module to learn the token-level correlation for each relation type separately. With the attention mechanism, our model can effectively identify overlapping relations and flexibly predict the relation type with its corresponding intensity. To verify the effectiveness of our model, we conduct comprehensive experiments on two benchmark datasets. The experimental results demonstrate that our model achieves state-of-the-art performances.
【Keywords】: Natural Language Processing: Information Extraction; Natural Language Processing: Natural Language Processing;
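The core scoring step of such a per-relation attention head can be sketched as scaled dot-product self-attention over token representations, yielding a softmax-normalized token-to-token matrix whose entries play the role of relation intensities. This is a generic sketch under assumed shapes, not the authors' exact parameterization:

```python
import numpy as np

def relation_head_scores(X, Wq, Wk):
    # One attention head scores one relation type: project tokens to
    # queries/keys, take scaled dot products, and softmax-normalize each row
    # to get a token-to-token correlation matrix.
    Q, K = X @ Wq, X @ Wk
    scores = (Q @ K.T) / np.sqrt(K.shape[1])
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)
```

With one head per relation type, each head's matrix can be read independently, which is what lets the model flexibly handle overlapping and multi-type relations.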
【Paper Link】 【Pages】:3794-3800
【Authors】: Xin Liu ; Kai Liu ; Xiang Li ; Jinsong Su ; Yubin Ge ; Bin Wang ; Jiebo Luo
【Abstract】: The lack of sufficient training data in many domains poses a major challenge to the construction of domain-specific machine reading comprehension (MRC) models with satisfying performance. In this paper, we propose a novel iterative multi-source mutual knowledge transfer framework for MRC. As an extension of the conventional knowledge transfer with one-to-one correspondence, our framework focuses on many-to-many mutual transfer, which involves synchronous executions of multiple many-to-one transfers in an iterative manner. Specifically, to update a target-domain MRC model, we first consider other domain-specific MRC models as individual teachers, and employ knowledge distillation to train a multi-domain MRC model, which is differentially required to fit the training data and match the outputs of these individual models according to their domain-level similarities to the target domain. After being initialized by the multi-domain MRC model, the target-domain MRC model is fine-tuned to match both its training data and the output of its previous best model simultaneously via knowledge distillation. Compared with previous approaches, our framework can continuously enhance all domain-specific MRC models by enabling each model to iteratively and differentially absorb the domain-shared knowledge from others. Experimental results and in-depth analyses on several benchmark datasets demonstrate the effectiveness of our framework.
【Keywords】: Natural Language Processing: Natural Language Processing; Natural Language Processing: Question Answering;
【Paper Link】 【Pages】:3801-3807
【Authors】: Xiaoyuan Yi ; Zhenghao Liu ; Wenhao Li ; Maosong Sun
【Abstract】: Text style transfer pursues altering the style of a sentence while keeping its main content unchanged. Due to the lack of parallel corpora, most recent work focuses on unsupervised methods and has achieved noticeable progress. Nonetheless, the intractability of completely disentangling content from style for text leads to a contradiction between content preservation and style transfer accuracy. To address this problem, we propose a style instance supported method, StyIns. Instead of representing styles with embeddings or latent variables learned from single sentences, our model leverages the generative flow technique to extract underlying stylistic properties from multiple instances of each style, which form a more discriminative and expressive latent style space. By combining such a space with the attention-based structure, our model can better maintain the content and simultaneously achieve high transfer accuracy. Furthermore, the proposed method can be flexibly extended to semi-supervised learning so as to utilize available limited paired data. Experiments on three transfer tasks, sentiment modification, formality rephrasing, and poeticness generation, show that StyIns obtains a better balance between content and style, outperforming several recent baselines.
【Keywords】: Natural Language Processing: Natural Language Generation; Natural Language Processing: Natural Language Processing;
【Paper Link】 【Pages】:3808-3815
【Authors】: Haoyu Zhang ; Dingkun Long ; Guangwei Xu ; Muhua Zhu ; Pengjun Xie ; Fei Huang ; Ji Wang
【Abstract】: Fine-grained entity typing (FET) is a fundamental task for various entity-leveraging applications. Although great success has been achieved, existing systems still have challenges in handling noisy samples in training data introduced by distant supervision methods. To address this noise, previous studies either process the clean samples (i.e., those with only one label) and noisy samples (i.e., those with multiple labels) with different strategies, or filter the noisy labels based on the assumption that the distantly-supervised label set certainly contains the correct type label. In this paper, we propose a probabilistic automatic relabeling method which treats all training samples uniformly. Our method aims to estimate the pseudo-truth label distribution of each sample, and the pseudo-truth distribution is treated as part of the trainable parameters which are jointly updated during the training process. The proposed approach does not rely on any prerequisite or extra supervision, making it effective in real applications. Experiments on several benchmarks show that our method outperforms previous approaches and alleviates the noisy labeling problem.
【Keywords】: Natural Language Processing: Information Extraction; Natural Language Processing: Named Entities; Natural Language Processing: Natural Language Processing; Natural Language Processing: NLP Applications and Tools;
【Paper Link】 【Pages】:3816-3822
【Authors】: Chuanxin Tang ; Chong Luo ; Zhiyuan Zhao ; Wenxuan Xie ; Wenjun Zeng
【Abstract】: For single-channel speech enhancement, both time-domain and time-frequency-domain methods have their respective pros and cons. In this paper, we present a cross-domain framework named TFT-Net, which takes time-frequency spectrogram as input and produces time-domain waveform as output. Such a framework takes advantage of the knowledge we have about spectrogram and avoids some of the drawbacks that T-F-domain methods have been suffering from. In TFT-Net, we design an innovative dual-path attention block (DAB) to fully exploit correlations along the time and frequency axes. We further discover that a sample-independent DAB (SDAB) achieves a good tradeoff between enhanced speech quality and complexity. Ablation studies show that both the cross-domain design and the SDAB block bring large performance gain. When logarithmic MSE is used as the training criteria, TFT-Net achieves the highest SDR and SSNR among state-of-the-art methods on two major speech enhancement benchmarks.
【Keywords】: Natural Language Processing: Speech; Machine Learning: Deep Learning;
【Paper Link】 【Pages】:3823-3829
【Authors】: Makoto Nakatsuji ; Sohei Okui
【Abstract】: Machine reading comprehension methods that generate answers by referring to multiple passages for a question have gained much attention in AI and NLP communities. The current methods, however, do not investigate the relationships among multiple passages in the answer generation process, even though topics correlated among the passages may be answer candidates. Our method, called neural answer Generation through Unified Memories over Multiple Passages (GUM-MP), solves this problem as follows. First, it determines which tokens in the passages are matched to the question. In particular, it investigates matches between tokens in positive passages, which are assigned to the question, and those in negative passages, which are not related to the question. Next, it determines which tokens in the passage are matched to other passages assigned to the same question and at the same time it investigates the topics in which they are matched. Finally, it encodes the token sequences with the above two matching results into unified memories in the passage encoders and learns the answer sequence by using an encoder-decoder with a multiple-pointer-generator mechanism. As a result, GUM-MP can generate answers by pointing to important tokens present across passages. Evaluations indicate that GUM-MP generates much more accurate results than the current models do.
【Keywords】: Natural Language Processing: Natural Language Generation; Natural Language Processing: Natural Language Processing; Natural Language Processing: Question Answering;
【Paper Link】 【Pages】:3830-3836
【Authors】: Xin Liu ; Jiefu Ou ; Yangqiu Song ; Xin Jiang
【Abstract】: Implicit discourse relation classification is one of the most difficult parts of shallow discourse parsing, as relation prediction without explicit connectives requires language understanding at both the text span level and the sentence level. Previous studies mainly focus on the interactions between two arguments. We argue that a powerful contextualized representation module, a bilateral multi-perspective matching module, and a global information fusion module are all important to implicit discourse analysis. We propose a novel model to combine these modules together. Extensive experiments show that our proposed model outperforms BERT and other state-of-the-art systems on the PDTB dataset by around 8% and on the CoNLL 2016 datasets by around 16%. We also analyze the effectiveness of different modules in the implicit discourse relation classification task and demonstrate how different levels of representation learning can affect the results.
【Keywords】: Natural Language Processing: Discourse; Machine Learning: Deep Learning;
【Paper Link】 【Pages】:3837-3844
【Authors】: Edoardo Barba ; Luigi Procopio ; Niccolò Campolungo ; Tommaso Pasini ; Roberto Navigli
【Abstract】: The knowledge acquisition bottleneck strongly affects the creation of multilingual sense-annotated data, hence limiting the power of supervised systems when applied to multilingual Word Sense Disambiguation. In this paper, we propose a semi-supervised approach based upon a novel label propagation scheme, which, by jointly leveraging contextualized word embeddings and the multilingual information enclosed in a knowledge base, projects sense labels from a high-resource language, i.e., English, to lower-resourced ones. Backed by several experiments, we provide empirical evidence that our automatically created datasets are of a higher quality than those generated by other competitors and lead a supervised model to achieve state-of-the-art performances in all multilingual Word Sense Disambiguation tasks. We make our datasets available for research purposes at https://github.com/SapienzaNLP/mulan.
【Keywords】: Natural Language Processing: Natural Language Semantics; Natural Language Processing: Resources and Evaluation;
【Paper Link】 【Pages】:3845-3852
【Authors】: Qingkai Min ; Libo Qin ; Zhiyang Teng ; Xiao Liu ; Yue Zhang
【Abstract】: Dialogue state modules are a useful component in a task-oriented dialogue system. Traditional methods find dialogue states by manually labeling training corpora, upon which neural models are trained. However, the labeling process can be costly, slow, error-prone, and more importantly, cannot cover the vast range of domains in real-world dialogues for customer service. We propose the task of dialogue state induction, building two neural latent variable models that mine dialogue states automatically from unlabeled customer service dialogue records. Results show that the models can effectively find meaningful dialogue states. In addition, equipped with induced dialogue states, a state-of-the-art dialogue system gives better performance compared with not using a dialogue state module.
【Keywords】: Natural Language Processing: Dialogue;
【Paper Link】 【Pages】:3853-3860
【Authors】: Libo Qin ; Minheng Ni ; Yue Zhang ; Wanxiang Che
【Abstract】: Multi-lingual contextualized embeddings, such as multilingual-BERT (mBERT), have shown success in a variety of zero-shot cross-lingual tasks. However, these models are limited by having inconsistent contextualized representations of subwords across different languages. Existing work addresses this issue by bilingual projection and fine-tuning techniques. We propose a data augmentation framework to generate multi-lingual code-switching data to fine-tune mBERT, which encourages the model to align representations from the source and multiple target languages at once by mixing their context information. Compared with the existing work, our method does not rely on bilingual sentences for training, and requires only one training process for multiple target languages. Experimental results on five tasks with 19 languages show that our method leads to significantly improved performance on all the tasks compared with mBERT.
【Keywords】: Natural Language Processing: Natural Language Processing;
【Paper Link】 【Pages】:3861-3867
【Authors】: Jinglin Liu ; Yi Ren ; Xu Tan ; Chen Zhang ; Tao Qin ; Zhou Zhao ; Tie-Yan Liu
【Abstract】: Non-autoregressive translation (NAT) achieves faster inference speed but at the cost of worse accuracy compared with autoregressive translation (AT). Since AT and NAT can share model structure and AT is an easier task than NAT due to the explicit dependency on previous target-side tokens, a natural idea is to gradually shift the model training from the easier AT task to the harder NAT task. To smooth the shift from AT training to NAT training, in this paper, we introduce semi-autoregressive translation (SAT) as intermediate tasks. SAT contains a hyperparameter k, and each k value defines a SAT task with a different degree of parallelism. In particular, SAT covers AT and NAT as its special cases: it reduces to AT when k=1 and to NAT when k=N (N is the length of the target sentence). We design curriculum schedules to gradually shift k from 1 to N, with different pacing functions and numbers of tasks trained at the same time. We call our method task-level curriculum learning for NAT (TCL-NAT). Experiments on IWSLT14 De-En, IWSLT16 En-De, WMT14 En-De and De-En datasets show that TCL-NAT achieves significant accuracy improvements over previous NAT baselines and reduces the performance gap between NAT and AT models to 1-2 BLEU points, demonstrating the effectiveness of our proposed method.
【Keywords】: Natural Language Processing: Machine Translation;
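The curriculum schedule described above, shifting k from 1 (AT) toward N (NAT) over training, can be sketched with simple pacing functions. The linear and exponential forms below are common pacing choices and only assumptions here, not the paper's exact schedules.

```python
def linear_pacing(step, total_steps, N):
    """Linearly interpolate the SAT parallelism degree k from 1 (AT) to N (NAT)
    as training progresses."""
    frac = min(step / total_steps, 1.0)
    return max(1, round(1 + frac * (N - 1)))

def exp_pacing(step, total_steps, N):
    """Exponential pacing: stay near the easier AT task longer,
    then shift quickly toward NAT late in training."""
    frac = min(step / total_steps, 1.0)
    return max(1, round(N ** frac))
```

Either function can drive training: at each step, the current k selects which SAT task the model is trained on.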
【Paper Link】 【Pages】:3868-3874
【Authors】: Hui Liu ; Zhan Shi ; Jia-Chen Gu ; Quan Liu ; Si Wei ; Xiaodan Zhu
【Abstract】: Dialogue disentanglement aims to separate intermingled messages into detached sessions. The existing research focuses on two-step architectures, in which a model first retrieves the relationships between two messages and then divides the message stream into separate clusters. Almost all existing work puts significant efforts on selecting features for message-pair classification and clustering, while ignoring the semantic coherence within each session. In this paper, we introduce the first end-to-end transition-based model for online dialogue disentanglement. Our model captures the sequential information of each session as the online algorithm proceeds on processing a dialogue. The coherence in a session is hence modeled when messages are sequentially added into their best-matching sessions. Meanwhile, the research field still lacks data for studying end-to-end dialogue disentanglement, so we construct a large-scale dataset by extracting coherent dialogues from online movie scripts. We evaluate our model on both the dataset we developed and the publicly available Ubuntu IRC dataset [Kummerfeld et al., 2019]. The results show that our model significantly outperforms the existing algorithms. Further experiments demonstrate that our model better captures the sequential semantics and obtains more coherent disentangled sessions.
【Keywords】: Natural Language Processing: Dialogue; Natural Language Processing: Information Extraction; Natural Language Processing: Information Retrieval;
【Paper Link】 【Pages】:3875-3881
【Authors】: Wei Song ; Ziyao Song ; Lizhen Liu ; Ruiji Fu
【Abstract】: Organization evaluation is an important dimension of automated essay scoring. This paper focuses on discourse element (i.e., functions of sentences and paragraphs) based organization evaluation. Existing approaches mostly separate discourse element identification and organization evaluation. In contrast, we propose a neural hierarchical multi-task learning approach for jointly optimizing sentence and paragraph level discourse element identification and organization evaluation. We represent the organization as a grid to simulate the visual layout of an essay and integrate discourse elements at multiple linguistic levels. Experimental results show that the multi-task learning based organization evaluation can achieve significant improvements compared with existing work and pipeline baselines. Multiple level discourse element identification also benefits from multi-task learning through mutual enhancement.
【Keywords】: Natural Language Processing: Discourse; Natural Language Processing: NLP Applications and Tools;
【Paper Link】 【Pages】:3882-3890
【Authors】: Peter Clark ; Oyvind Tafjord ; Kyle Richardson
【Abstract】: Beginning with McCarthy's Advice Taker (1959), AI has pursued the goal of providing a system with explicit, general knowledge and having the system reason over that knowledge. However, expressing the knowledge in a formal (logical or probabilistic) representation has been a major obstacle to this research. This paper investigates a modern approach to this problem where the facts and rules are provided as natural language sentences, thus bypassing a formal representation. We train transformers to reason (or emulate reasoning) over these sentences using synthetically generated data. Our models, which we call RuleTakers, provide the first empirical demonstration that this kind of soft reasoning over language is learnable, can achieve high (99%) accuracy, and generalizes to test data requiring substantially deeper chaining than seen during training (95%+ scores). We also demonstrate that the models transfer well to two hand-authored rulebases, and to rulebases paraphrased into more natural language. These findings are significant as they suggest a new role for transformers, namely as limited "soft theorem provers" operating over explicit theories in language. This in turn suggests new possibilities for explainability, correctability, and counterfactual reasoning in question-answering.
【Keywords】: Natural Language Processing: Question Answering;
【Paper Link】 【Pages】:3891-3897
【Authors】: Kaitao Song ; Xu Tan ; Jianfeng Lu
【Abstract】: Neural machine translation (NMT) generates the next target token conditioned on the previous ground-truth target tokens during training but on the previously generated target tokens during inference, which causes a discrepancy between training and inference as well as error propagation, and affects the translation accuracy. In this paper, we introduce an error correction mechanism into NMT, which corrects the error information in the previously generated tokens to better predict the next token. Specifically, we introduce two-stream self-attention from XLNet into the NMT decoder, where the query stream is used to predict the next token, while the content stream is used to correct the error information from the previously predicted tokens. We leverage scheduled sampling to simulate the prediction errors during training. Experiments on three IWSLT translation datasets and two WMT translation datasets demonstrate that our method achieves improvements over the Transformer baseline and scheduled sampling. Further experimental analyses also verify the effectiveness of our proposed error correction mechanism in improving the translation quality.
【Keywords】: Natural Language Processing: Machine Translation; Natural Language Processing: Natural Language Generation;
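Scheduled sampling, which the abstract above uses to simulate inference-time prediction errors during training, can be sketched as follows. The inverse-sigmoid decay and the constant k are standard choices for this technique and are assumptions here, not the paper's exact setting.

```python
import math
import random

def teacher_forcing_prob(step, k=1000.0):
    """Inverse-sigmoid decay: early in training the model sees ground-truth
    tokens almost always; later, its own predictions dominate."""
    return k / (k + math.exp(step / k))

def choose_input_token(gold_token, predicted_token, step, rng=random):
    """Feed the gold token with probability p(step), otherwise the model's own
    output, simulating the errors the decoder will face at inference time."""
    return gold_token if rng.random() < teacher_forcing_prob(step) else predicted_token
```

During training, each decoder position calls `choose_input_token` to decide whether its input comes from the reference translation or from the model's previous prediction.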
【Paper Link】 【Pages】:3898-3904
【Authors】: Haochen Shi ; Siliang Tang ; Xiaotao Gu ; Bo Chen ; Zhigang Chen ; Jian Shao ; Xiang Ren
【Abstract】: The recent success of Distant Supervision (DS) brings abundant labeled data for the task of fine-grained entity typing (FET) without human annotation. However, the heuristically generated labels inevitably bring a significant distribution gap, namely dataset shift, between the distantly labeled training set and the manually curated test set. Considerable efforts have been made to alleviate this problem from the label perspective by either intelligently denoising the training labels, or designing noise-aware loss functions. Despite their progress, the dataset shift can hardly be eliminated completely. In this work, complementary to the label perspective, we reconsider this problem from the model perspective: Can we learn a more robust typing model with the existence of dataset shift? To this end, we propose a novel regularization module based on virtual adversarial training (VAT). The proposed approach first uses a self-paced sample selection function to select suitable samples for VAT, then constructs virtual adversarial perturbations based on the selected samples, and finally regularizes the model to be robust to such perturbations. Experiments on two benchmarks demonstrate the effectiveness of the proposed method, with an average 3.8%, 2.5%, and 3.2% improvement in accuracy, Macro F1 and Micro F1 respectively compared to the next best method.
【Keywords】: Natural Language Processing: Information Extraction; Natural Language Processing: Named Entities; Natural Language Processing: Natural Language Processing;
【Paper Link】 【Pages】:3905-3911
【Authors】: Zeyun Tang ; Yongliang Shen ; Xinyin Ma ; Wei Xu ; Jiale Yu ; Weiming Lu
【Abstract】: Multi-hop reading comprehension across multiple documents has attracted much attention recently. In this paper, we propose a novel approach to tackle this multi-hop reading comprehension problem. Inspired by human reasoning processes, we introduce a path-based graph with reasoning paths extracted from supporting documents. The path-based graph combines the ideas of graph-based and path-based approaches, so it is better suited for multi-hop reasoning. Meanwhile, we propose Gated-GCN to accumulate evidence on the path-based graph, which contains a new question-aware gating mechanism to regulate the usefulness of information propagating across documents and to add question information during reasoning. We evaluate our approach on the WikiHop dataset, and it achieves state-of-the-art accuracy against previously published approaches. In particular, our ensemble model surpasses human performance by 4.2%.
【Keywords】: Natural Language Processing: Natural Language Processing; Natural Language Processing: Question Answering;
【Paper Link】 【Pages】:3912-3918
【Authors】: Antoine Gourru ; Julien Velcin ; Julien Jacques
【Abstract】: Gaussian Embedding of Linked Documents (GELD) is a new method that embeds linked documents (e.g., citation networks) onto a pretrained semantic space (e.g., a set of word embeddings). We formulate the problem in such a way that we model each document as a Gaussian distribution in the word vector space. We design a generative model that combines both words and links in a consistent way. Leveraging the variance of a document allows us to model the uncertainty related to word and link generation. In most cases, our method outperforms state-of-the-art methods when using our document vectors as features for usual downstream tasks. In particular, GELD achieves better accuracy in classification and link prediction on Cora and Dblp. In addition, we demonstrate qualitatively the convenience of several properties of our method. We provide the implementation of GELD and the evaluation datasets to the community (https://github.com/AntoineGourru/DNEmbedding).
【Keywords】: Natural Language Processing: Embeddings; Machine Learning: Probabilistic Machine Learning; Machine Learning: Learning Generative Models;
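GELD models each document as a Gaussian distribution in a pretrained word vector space, with the variance capturing uncertainty in word and link generation. The toy sketch below, an unweighted mean with a diagonal variance, is a deliberate simplification for illustration, not the paper's generative model.

```python
def gaussian_doc_embedding(word_vectors):
    """Represent a document as a diagonal Gaussian over a pretrained word
    space: the mean summarizes content, the per-dimension variance models
    uncertainty about generation."""
    n = len(word_vectors)
    dim = len(word_vectors[0])
    mean = [sum(v[d] for v in word_vectors) / n for d in range(dim)]
    var = [sum((v[d] - mean[d]) ** 2 for v in word_vectors) / n
           for d in range(dim)]
    return mean, var
```

A document whose words cluster tightly in the embedding space gets a low variance; one mixing distant topics gets a high variance, which downstream tasks can exploit.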
【Paper Link】 【Pages】:3919-3925
【Authors】: Tianming Wang ; Xiaojun Wan ; Shaowei Yao
【Abstract】: AMR-to-text generation is a challenging task of generating texts from graph-based semantic representations. Recent studies formalize this task as a graph-to-sequence learning problem and use various graph neural networks to model the graph structure. In this paper, we propose a novel approach that generates texts from AMR graphs while reconstructing the input graph structures. Our model employs a graph attention mechanism to aggregate information for encoding the inputs. Moreover, better node representations are learned by optimizing two simple but effective auxiliary reconstruction objectives: a link prediction objective, which requires predicting the semantic relationship between nodes, and a distance prediction objective, which requires predicting the distance between nodes. Experimental results on two benchmark datasets show that our proposed model improves considerably over strong baselines and achieves new state-of-the-art results.
【Keywords】: Natural Language Processing: Natural Language Generation; Natural Language Processing: Natural Language Processing;
【Paper Link】 【Pages】:3926-3932
【Authors】: Qianhui Wu ; Zijia Lin ; Börje F. Karlsson ; Biqing Huang ; Jianguang Lou
【Abstract】: Prior work in cross-lingual named entity recognition (NER) with no/little labeled data falls into two primary categories: model transfer- and data transfer-based methods. In this paper, we find that both method types can complement each other, in the sense that, the former can exploit context information via language-independent features but sees no task-specific information in the target language; while the latter generally generates pseudo target-language training data via translation but its exploitation of context information is weakened by inaccurate translations. Moreover, prior work rarely leverages unlabeled data in the target language, which can be effortlessly collected and potentially contains valuable information for improved results. To handle both problems, we propose a novel approach termed UniTrans to Unify both model and data Transfer for cross-lingual NER, and furthermore, leverage the available information from unlabeled target-language data via enhanced knowledge distillation. We evaluate our proposed UniTrans over 4 target languages on benchmark datasets. Our experimental results show that it substantially outperforms the existing state-of-the-art methods.
【Keywords】: Natural Language Processing: Information Extraction; Natural Language Processing: Named Entities; Natural Language Processing: Natural Language Processing;
【Paper Link】 【Pages】:3933-3940
【Authors】: Hongfei Xu ; Deyi Xiong ; Josef van Genabith ; Qiuhui Liu
【Abstract】: Existing Neural Machine Translation (NMT) systems are generally trained on a large amount of sentence-level parallel data, and during prediction sentences are independently translated, ignoring cross-sentence contextual information. This leads to inconsistency between translated sentences. In order to address this issue, context-aware models have been proposed. However, document-level parallel data constitutes only a small part of the parallel data available, and many approaches build context-aware models based on a pre-trained frozen sentence-level translation model in a two-step training manner. The computational cost of these approaches is usually high. In this paper, we propose to make the most of layers pre-trained on sentence-level data in contextual representation learning, reusing representations from the sentence-level Transformer and significantly reducing the cost of incorporating contexts in translation. We find that representations from shallow layers of a pre-trained sentence-level encoder play a vital role in source context encoding, and propose to perform source context encoding upon weighted combinations of pre-trained encoder layers' outputs. Instead of separately performing source context and input encoding, we propose to iteratively and jointly encode the source input and its contexts and to generate input-aware context representations with a cross-attention layer and a gating mechanism, which resets irrelevant information in context encoding. Our context-aware Transformer model outperforms the recent CADec [Voita et al., 2019c] on the English-Russian subtitle data and is about twice as fast in training and decoding.
【Keywords】: Natural Language Processing: Machine Translation;
【Paper Link】 【Pages】:3941-3947
【Authors】: Jun Xu ; Zeyang Lei ; Haifeng Wang ; Zheng-Yu Niu ; Hua Wu ; Wanxiang Che
【Abstract】: How to generate informative, coherent and sustainable open-domain conversations is a non-trivial task. Previous work on knowledge-grounded conversation generation focuses on improving dialog informativeness with little attention to dialog coherence. In this paper, to enhance multi-turn dialog coherence, we propose to leverage event chains to help determine a sketch of a multi-turn dialog. We first extract event chains from narrative texts and connect them as a graph. We then present a novel event graph grounded Reinforcement Learning (RL) framework. It conducts high-level response content (simply an event) planning by learning to walk over the graph, and then produces a response conditioned on the planned content. In particular, we devise a novel multi-policy decision making mechanism to foster a coherent dialog with both appropriate content ordering and high contextual relevance. Experimental results indicate the effectiveness of this framework in terms of dialog coherence and informativeness.
【Keywords】: Natural Language Processing: Dialogue;
【Paper Link】 【Pages】:3948-3954
【Authors】: Tianyang Zhao ; Zhao Yan ; Yunbo Cao ; Zhoujun Li
【Abstract】: Recent advances cast entity-relation extraction as a multi-turn question answering (QA) task and provide an effective solution based on machine reading comprehension (MRC) models. However, they use a single question to characterize the meaning of entities and relations, which is intuitively not enough because of the variety of context semantics. Meanwhile, existing models enumerate all relation types to generate questions, which is inefficient and easily leads to confusing questions. In this paper, we improve the existing MRC-based entity-relation extraction model through diverse question answering. First, a diverse question answering mechanism is introduced to detect entity spans, and two answer selection strategies are designed to integrate different answers. Then, we propose to predict a subset of potential relations and filter out irrelevant ones to generate questions effectively. Finally, entity and relation extraction are integrated in an end-to-end way and optimized through joint learning. Experimental results show that the proposed method significantly outperforms baseline models, improving the relation F1 to 62.1% (+1.9%) on ACE05 and 71.9% (+3.0%) on CoNLL04. Our implementation is available at https://github.com/TanyaZhao/MRC4ERE.
【Keywords】: Natural Language Processing: Information Extraction; Natural Language Processing: Natural Language Processing;
【Paper Link】 【Pages】:3955-3961
【Authors】: Mingtong Liu ; Erguang Yang ; Deyi Xiong ; Yujie Zhang ; Chen Sheng ; Changjian Hu ; Jinan Xu ; Yufeng Chen
【Abstract】: Paraphrase generation is of great importance to many downstream tasks in natural language processing. Recent efforts have focused on generating paraphrases in specific syntactic forms, which generally relies heavily on manually annotated paraphrase data that is not easily available for many languages and domains. In this paper, we propose a novel end-to-end framework that leverages existing large-scale bilingual parallel corpora to generate paraphrases under the control of syntactic exemplars. In order to train one model over the two languages of the parallel corpora, we embed their sentences into the same content and style spaces with shared content and style encoders using cross-lingual word embeddings. We propose an adversarial discriminator to disentangle the content and style spaces, and employ a latent variable to model the syntactic style of a given exemplar in order to guide the two decoders for generation. Additionally, we introduce cycle and masking learning schemes to efficiently train the model. Experiments and analyses demonstrate that the proposed model, trained only on bilingual parallel data, is capable of generating diverse paraphrases with desirable syntactic styles. Fine-tuning the trained model on a small paraphrase corpus makes it substantially outperform state-of-the-art paraphrase generation models trained on a larger paraphrase dataset.
【Keywords】: Natural Language Processing: Natural Language Generation; Natural Language Processing: Natural Language Processing;
【Paper Link】 【Pages】:3962-3968
【Authors】: Shijie Yang ; Liang Li ; Shuhui Wang ; Weigang Zhang ; Qingming Huang ; Qi Tian
【Abstract】: Building intelligent agents to generate realistic Weibo comments is challenging. For such realistic Weibo comments, the key criterion is improving diversity while maintaining coherency. Considering that the variability of linguistic comments arises from multi-level sources, including both discourse-level properties and word-level selections, we improve the comment diversity by leveraging such inherent hierarchy. In this paper, we propose a structured latent variable recurrent network, which exploits the hierarchical-structured latent variables with stochastic attention to model the variations of comments. First, we endow both discourse-level and word-level latent variables with hierarchical and temporal dependencies for constructing multi-level hierarchy. Second, we introduce a stochastic attention to infer the key-words of interest in the input post. As a result, diverse comments can be generated with both discourse-level properties and local-word selections. Experiments on open-domain Weibo data show that our model generates more diverse and realistic comments.
【Keywords】: Natural Language Processing: Natural Language Generation; Natural Language Processing: NLP Applications and Tools; Machine Learning: Deep Generative Models;
【Paper Link】 【Pages】:3969-3975
【Authors】: Yi Yang ; Hongan Wang ; Jiaqi Zhu ; Yunkun Wu ; Kailong Jiang ; Wenli Guo ; Wandong Shi
【Abstract】: Dataless text classification has attracted increasing attention recently. It needs only very few seed words per category to classify documents, which is much cheaper than supervised text classification that requires massive labeling efforts. However, most existing models pay attention to long texts, but achieve unsatisfactory performance on short texts, which have become increasingly popular on the Internet. In this paper, we first propose a novel model named Seeded Biterm Topic Model (SeedBTM), extending BTM to solve the problem of dataless short text classification with seed words. It takes advantage of both word co-occurrence information in the topic model and category-word similarity from widely used word embeddings as the prior topic-in-set knowledge. Moreover, with the same approach, we also propose the Seeded Twitter Biterm Topic Model (SeedTBTM), which extends Twitter-BTM and utilizes additional user information to achieve higher classification accuracy. Experimental results on five real short-text datasets show that our models outperform the state-of-the-art methods, and perform especially well when the categories are overlapping and interrelated.
【Keywords】: Natural Language Processing: Text Classification; Data Mining: Mining Text, Web, Social Media; Machine Learning: Knowledge-based Learning; Machine Learning: Learning Graphical Models;
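The category-word similarity prior that SeedBTM derives from word embeddings can be sketched as below. The cosine scoring, max-over-seeds aggregation, and the floor constant are illustrative assumptions for this sketch, not the paper's exact formulation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den

def category_word_prior(word_vec, seed_vecs_per_category, floor=1e-3):
    """Turn category-word embedding similarities into a normalized prior over
    categories: the kind of topic-in-set knowledge a seeded topic model can
    inject. Each category is scored by its best-matching seed word."""
    scores = [max(max(cosine(word_vec, s) for s in seeds), floor)
              for seeds in seed_vecs_per_category]
    z = sum(scores)
    return [s / z for s in scores]
```

Words close to a category's seed words in embedding space receive most of the prior mass for that category, biasing topic inference accordingly.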
【Paper Link】 【Pages】:3976-3982
【Authors】: Tao Gui ; Jiacheng Ye ; Qi Zhang ; Yaqian Zhou ; Yeyun Gong ; Xuanjing Huang
【Abstract】: Document-level label consistency is an effective indicator that different occurrences of a particular token sequence are very likely to have the same entity types. Previous work focused on better context representations and used the CRF for label decoding. However, CRF-based methods are inadequate for modeling document-level label consistency. This work introduces a novel two-stage label refinement approach to handle document-level label consistency, where a key-value memory network is first used to record draft labels predicted by the base model, and then a multi-channel Transformer makes refinements on these draft predictions based on the explicit co-occurrence relationship derived from the memory network. In addition, in order to mitigate the side effects of incorrect draft labels, Bayesian neural networks are used to indicate the labels with a high probability of being wrong, which can greatly assist in preventing the incorrect refinement of correct draft labels. The experimental results on three named entity recognition benchmarks demonstrated that the proposed method significantly outperformed the state-of-the-art methods.
【Keywords】: Natural Language Processing: Named Entities; Natural Language Processing: Tagging, chunking, and parsing;
【Paper Link】 【Pages】:3983-3989
【Authors】: Zaixiang Zheng ; Xiang Yue ; Shujian Huang ; Jiajun Chen ; Alexandra Birch
【Abstract】: Document-level machine translation outperforms sentence-level models by a small margin, but has failed to be widely adopted. We argue that previous research did not make clear use of the global context, and propose a new document-level NMT framework that deliberately models the local context of each sentence with awareness of the global context of the document in both source and target languages. We specifically design the model to be able to deal with documents containing any number of sentences, including single sentences. This unified approach allows our model to be trained elegantly on standard datasets without needing to train on sentence- and document-level data separately. Experimental results demonstrate that our model outperforms Transformer baselines and previous document-level NMT models by margins of up to 2.1 BLEU over state-of-the-art baselines. We also provide analyses showing that the benefit of context extends far beyond the two or three neighboring sentences that previous studies have typically incorporated.
【Keywords】: Natural Language Processing: Machine Translation;
【Paper Link】 【Pages】:3990-3996
【Authors】: Chengkun Zhang ; Junbin Gao
【Abstract】: Hyperbolic space is a well-defined space with constant negative curvature. Recent research demonstrates its ability to capture complex hierarchical structures, owing to its exceptionally high capacity and continuous tree-like properties. This paper ties hyperbolic space's strengths to the power-law structure of documents by introducing a hyperbolic neural network architecture named Hyperbolic Hierarchical Attention Network (Hype-HAN). Hype-HAN defines three levels of embeddings (word/sentence/document) and two layers of hyperbolic attention mechanism (word-to-sentence/sentence-to-document) on Riemannian geometries of the Lorentz model, Klein model and Poincaré model. Situated on the evolving embedding spaces, we utilize both conventional GRUs (Gated Recurrent Units) and hyperbolic GRUs with Möbius operations. Hype-HAN is applied to large-scale datasets. The empirical experiments show the effectiveness of our method.
【Keywords】: Natural Language Processing: Embeddings; Natural Language Processing: Text Classification; Machine Learning: Explainable Machine Learning;
【Paper Link】 【Pages】:3997-4003
【Authors】: Dongling Xiao ; Han Zhang ; Yu-Kun Li ; Yu Sun ; Hao Tian ; Hua Wu ; Haifeng Wang
【Abstract】: Current pre-training works in natural language generation pay little attention to the problem of exposure bias on downstream tasks. To address this issue, we propose an enhanced multi-flow sequence to sequence pre-training and fine-tuning framework named ERNIE-GEN, which bridges the discrepancy between training and inference with an infilling generation mechanism and a noise-aware generation method. To make generation closer to human writing patterns, this framework introduces a span-by-span generation flow that trains the model to predict semantically-complete spans consecutively rather than predicting word by word. Unlike existing pre-training methods, ERNIE-GEN incorporates multi-granularity target sampling to construct pre-training data, which enhances the correlation between encoder and decoder. Experimental results demonstrate that ERNIE-GEN achieves state-of-the-art results with a much smaller amount of pre-training data and parameters on a range of language generation tasks, including abstractive summarization (Gigaword and CNN/DailyMail), question generation (SQuAD), dialogue generation (Persona-Chat) and generative question answering (CoQA). The source codes and pre-trained models have been released at https://github.com/PaddlePaddle/ERNIE/ernie-gen.
【Keywords】: Natural Language Processing: Natural Language Generation; Natural Language Processing: Natural Language Processing; Natural Language Processing: Natural Language Summarization;
【Paper Link】 【Pages】:4004-4010
【Authors】: Hongming Zhang ; Daniel Khashabi ; Yangqiu Song ; Dan Roth
【Abstract】: Commonsense knowledge acquisition is a key problem for artificial intelligence. Conventional methods of acquiring commonsense knowledge generally require laborious and costly human annotations, which are not feasible on a large scale. In this paper, we explore a practical way of mining commonsense knowledge from linguistic graphs, with the goal of transferring cheap knowledge obtained with linguistic patterns into expensive commonsense knowledge. The result is a conversion of ASER [Zhang et al., 2020], a large-scale selectional preference knowledge resource, into TransOMCS, of the same representation as ConceptNet [Liu and Singh, 2004] but two orders of magnitude larger. Experimental results demonstrate the transferability of linguistic knowledge to commonsense knowledge and the effectiveness of the proposed approach in terms of quantity, novelty, and quality. TransOMCS is publicly available at: https://github.com/HKUST-KnowComp/TransOMCS.
【Keywords】: Natural Language Processing: Knowledge Extraction; Natural Language Processing: Resources and Evaluation;
【Paper Link】 【Pages】:4011-4017
【Authors】: Jipeng Zhang ; Roy Ka-Wei Lee ; Ee-Peng Lim ; Wei Qin ; Lei Wang ; Jie Shao ; Qianru Sun
【Abstract】: Math word problem (MWP) solving is challenging due to a limitation of the training data: only one “standard” solution is available per problem. MWP models often simply fit this solution rather than truly understanding or solving the problem, so their generalization to diverse word scenarios is limited. To address this problem, this paper proposes a novel approach, TSN-MD, which leverages a teacher network to integrate the knowledge of equivalent solution expressions and then regularize the learning behavior of the student network. In addition, we introduce a multiple-decoder student network that generates multiple candidate solution expressions, from which the final answer is selected by voting. In experiments, we conduct extensive comparisons and ablative studies on two large-scale MWP benchmarks, and show that TSN-MD surpasses the state-of-the-art works by a large margin. More intriguingly, the visualization results demonstrate that TSN-MD not only produces correct final answers but also generates diverse equivalent expressions of the solution.
【Keywords】: Natural Language Processing: Natural Language Generation; Natural Language Processing: Question Answering;
【Paper Link】 【Pages】:4018-4024
【Authors】: Congzheng Song ; Shanghang Zhang ; Najmeh Sadoughi ; Pengtao Xie ; Eric P. Xing
【Abstract】: The International Classification of Diseases (ICD) is a list of classification codes for the diagnoses. Automatic ICD coding is a multi-label text classification problem with noisy clinical document inputs and long-tailed label distribution, making it difficult for fine-grained classification on both frequent and zero-shot codes at the same time, i.e. generalized zero-shot ICD coding. In this paper, we propose a latent feature generation framework to improve the prediction on unseen codes without compromising the performance on seen codes. Our framework generates semantically meaningful features for zero-shot codes by exploiting ICD code hierarchical structure and reconstructing the code-relevant keywords with a novel cycle architecture. To the best of our knowledge, this is the first adversarial generative model for generalized zero-shot learning on multi-label text classification. Extensive experiments demonstrate the effectiveness of our approach. On the public MIMIC-III dataset, our methods improve the F1 score from nearly 0 to 20.91% for the zero-shot codes, and increase the AUC score by 3% (absolute improvement) from previous state of the art. Code is available at https://github.com/csong27/gzsl_text.
【Keywords】: Natural Language Processing: Text Classification; Machine Learning: Transfer, Adaptation, Multi-task Learning; Natural Language Processing: NLP Applications and Tools; Machine Learning Applications: Bio/Medicine;
【Paper Link】 【Pages】:4025-4031
【Authors】: Linshu Ouyang ; Yongzheng Zhang ; Hui Liu ; Yige Chen ; Yipeng Wang
【Abstract】: Authorship verification is an important problem with many applications. State-of-the-art deep authorship verification methods typically leverage character-level language models to encode author-specific writing styles. However, they often fail to capture syntactic-level patterns, leading to sub-optimal accuracy in cross-topic scenarios. Also, due to imperfect cross-author parameter sharing, it is difficult for them to distinguish author-specific writing style from common patterns, leading to data-inefficient learning.
This paper introduces a novel POS-level (part-of-speech) gated-RNN-based language model to effectively learn author-specific syntactic styles. The author-agnostic syntactic information obtained from a POS tagger pre-trained on large external datasets greatly reduces the number of effective parameters of our model, enabling the model to learn accurate author-specific syntactic styles with limited training data. We also utilize a gated architecture to learn common syntactic writing styles with a small set of shared parameters, letting the author-specific parameters focus on each author's special syntactic styles. Extensive experimental results show that our method achieves significantly better accuracy than state-of-the-art competing methods, especially in cross-topic scenarios (over 5% in terms of AUC-ROC).
【Keywords】: Natural Language Processing: Natural Language Processing; Natural Language Processing: NLP Applications and Tools; Data Mining: Mining Text, Web, Social Media; Natural Language Processing: Other;
【Paper Link】 【Pages】:4032-4038
【Authors】: Shan Zhao ; Minghao Hu ; Zhiping Cai ; Fang Liu
【Abstract】: Joint extraction of entities and their relations benefits from the close interaction between named entities and their relation information. Therefore, how to effectively model such cross-modal interactions is critical for the final performance. Previous works have used simple methods such as label-feature concatenation to perform coarse-grained semantic fusion among cross-modal instances, but fail to capture fine-grained correlations over token and label spaces, resulting in insufficient interactions. In this paper, we propose a deep Cross-Modal Attention Network (CMAN) for joint entity and relation extraction. The network is carefully constructed by stacking multiple attention units in depth to fully model dense interactions over token-label spaces, in which two basic attention units are proposed to explicitly capture fine-grained correlations across different modalities (e.g., token-to-token and label-to-token). Experimental results on the CoNLL04 dataset show that our model obtains state-of-the-art results by achieving 90.62% F1 on entity recognition and 72.97% F1 on relation classification. On the ADE dataset, our model surpasses existing approaches by more than 1.9% F1 on relation classification. Extensive analyses further confirm the effectiveness of our approach.
【Keywords】: Natural Language Processing: Information Extraction; Natural Language Processing: Named Entities;
【Paper Link】 【Pages】:4039-4045
【Authors】: Yang Zhao ; Jiajun Zhang ; Yu Zhou ; Chengqing Zong
【Abstract】: Knowledge graphs (KGs) store much structured information on various entities, many of which are not covered by the parallel sentence pairs of neural machine translation (NMT). To improve the translation quality of these entities, in this paper we propose a novel KG-enhanced NMT method. Specifically, we first induce the new translation results of these entities by transforming the source and target KGs into a unified semantic space. We then generate adequate pseudo parallel sentence pairs that contain these induced entity pairs. Finally, the NMT model is jointly trained on the original and pseudo sentence pairs. Extensive experiments on Chinese-to-English and English-to-Japanese translation tasks demonstrate that our method significantly outperforms the strong baseline models in translation quality, especially in handling the induced entities.
【Keywords】: Natural Language Processing: Machine Translation; Natural Language Processing: Natural Language Generation; Knowledge Representation and Reasoning: Semantic Web;
【Paper Link】 【Pages】:4046-4053
【Authors】: Yu Zhang ; Houquan Zhou ; Zhenghua Li
【Abstract】: Estimating probability distributions is one of the core issues in the NLP field. However, in both the deep learning (DL) and pre-DL eras, unlike the vast applications of linear-chain CRFs in sequence labeling tasks, very few works have applied tree-structured CRFs to constituency parsing, mainly due to the complexity and inefficiency of the inside-outside algorithm. This work presents a fast and accurate neural CRF constituency parser. The key idea is to batchify the inside algorithm for loss computation by direct large tensor operations on GPU, and meanwhile avoid the outside algorithm for gradient computation via efficient back-propagation. We also propose a simple two-stage bracketing-then-labeling parsing approach to improve efficiency further. To improve the parsing performance, inspired by recent progress in dependency parsing, we introduce a new scoring architecture based on boundary representation and biaffine attention, and a beneficial dropout strategy. Experiments on PTB, CTB5.1, and CTB7 show that our two-stage CRF parser achieves new state-of-the-art performance both with and without BERT, and can parse over 1,000 sentences per second. We release our code at https://github.com/yzhangcs/crfpar.
【Keywords】: Natural Language Processing: Tagging, chunking, and parsing;
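The batchified inside computation described in the abstract above can be illustrated with a small NumPy sketch for a simplified, bracket-only (0th-order) tree CRF; the function names and the per-width vectorization shown here are our own simplification, not the authors' released implementation:

```python
import numpy as np

def logsumexp(a, axis):
    """Numerically stable log-sum-exp along an axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return (m + np.log(np.sum(np.exp(a - m), axis=axis, keepdims=True))).squeeze(axis)

def inside_log_partition(span_scores):
    """Inside algorithm over fenceposts 0..n, batched per span width.
    span_scores[i, j] is the (log-space) score of span (i, j)."""
    n = span_scores.shape[0] - 1
    inside = np.full((n + 1, n + 1), -np.inf)
    for i in range(n):                       # width-1 spans: the leaves
        inside[i, i + 1] = span_scores[i, i + 1]
    for w in range(2, n + 1):
        i = np.arange(0, n - w + 1)          # left fenceposts of all width-w spans
        j = i + w
        k = i[:, None] + np.arange(1, w)     # all split points, shape (num_spans, w-1)
        left = inside[i[:, None], k]         # inside scores of (i, k), batched
        right = inside[k, j[:, None]]        # inside scores of (k, j), batched
        inside[i, j] = span_scores[i, j] + logsumexp(left + right, axis=1)
    return inside[0, n]                      # log partition function over binary trees
```

With all span scores set to zero, the result is the log of the number of binary trees over the sentence (the Catalan number), which is a convenient sanity check. On GPU, the same per-width batching maps directly onto large tensor operations.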
【Paper Link】 【Pages】:4054-4060
【Authors】: Yue Yuan ; Xiaofei Zhou ; Shirui Pan ; Qiannan Zhu ; Zeliang Song ; Li Guo
【Abstract】: Joint extraction of entities and relations is an important task in natural language processing (NLP), which aims to capture all relational triplets from plain texts. This is a big challenge because some of the triplets extracted from one sentence may have overlapping entities. Most existing methods perform entity recognition followed by relation detection between every possible entity pair, which usually suffers from numerous redundant operations. In this paper, we propose a relation-specific attention network (RSAN) to handle this issue. RSAN utilizes a relation-aware attention mechanism to construct specific sentence representations for each relation, and then performs sequence labeling to extract the corresponding head and tail entities. Experiments on two public datasets show that our model can effectively extract overlapping triplets and achieve state-of-the-art performance.
【Keywords】: Natural Language Processing: Information Extraction; Natural Language Processing: Knowledge Extraction; Data Mining: Mining Graphs, Semi Structured Data, Complex Data; Data Mining: Mining Text, Web, Social Media;
【Paper Link】 【Pages】:4062-4068
【Authors】: Dor Atzmon ; Jiaoyang Li ; Ariel Felner ; Eliran Nachmani ; Shahaf S. Shperberg ; Nathan Sturtevant ; Sven Koenig
【Abstract】: In the Multi-Agent Meeting problem (MAM), the task is to find a meeting location for multiple agents, as well as a path for each agent to that location. In this paper, we introduce MM*, a Multi-Directional Heuristic Search algorithm that finds the optimal meeting location under different cost functions. MM* generalizes the Meet in the Middle (MM) bidirectional search algorithm to the case of finding an optimal meeting location for multiple agents. Several admissible heuristics are proposed, and experiments demonstrate the benefits of MM*.
【Keywords】: Planning and Scheduling: Planning and Scheduling; Agent-based and Multi-agent Systems: Multi-agent Planning; Heuristic Search and Game Playing: Heuristic Search;
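The problem setting of the abstract above can be made concrete with a naive exhaustive baseline on a 4-connected grid: run BFS from each agent and pick the cell minimizing the chosen cost objective (sum or makespan). This is a sketch of the problem, not the authors' MM* algorithm, and all names here are ours:

```python
from collections import deque

def bfs_dists(grid, start):
    """Shortest path lengths from `start` on a 4-connected grid ('#' = wall)."""
    rows, cols = len(grid), len(grid[0])
    dist = {start: 0}
    q = deque([start])
    while q:
        r, c = q.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != '#' \
                    and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                q.append((nr, nc))
    return dist

def meeting_point(grid, agents, cost=sum):
    """Return (cell, value) minimizing `cost` over the agents' path lengths.
    cost=sum gives total travel; cost=max gives the makespan objective."""
    tables = [bfs_dists(grid, a) for a in agents]
    best = None
    for cell in tables[0]:
        if all(cell in t for t in tables):          # reachable by every agent
            c = cost(t[cell] for t in tables)
            if best is None or c < best[1]:
                best = (cell, c)
    return best
```

The exhaustive scan is exponentially wasteful on large maps, which is exactly the gap a heuristic multi-directional search such as MM* is designed to close.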
【Paper Link】 【Pages】:4069-4075
【Authors】: George K. Atia ; Andre Beckus ; Ismail Alkhouri ; Alvaro Velasquez
【Abstract】: The formal synthesis of automated or autonomous agents has elicited strong interest from the artificial intelligence community in recent years. This problem space broadly entails the derivation of decision-making policies for agents acting in an environment such that a formal specification of behavior is satisfied. Popular formalisms for such specifications include the quintessential Linear Temporal Logic (LTL) and Computation Tree Logic (CTL) which reason over infinite sequences and trees, respectively, of states. However, the related and relevant problem of reasoning over the frequency with which states are visited infinitely and enforcing behavioral specifications on the same has received little attention. That problem, known as Steady-State Policy Synthesis (SSPS) or steady-state control, is the focus of this paper. Prior related work has been mostly confined to unichain Markov Decision Processes (MDPs), while a tractable solution to the general multichain setting heretofore remains elusive. In this paper, we provide a solution to the latter within the context of multichain MDPs over a class of policies that account for all possible transitions in the given MDP. The solution policy is derived from a novel linear program (LP) that encodes constraints on the limiting distributions of the Markov chain induced by said policy. We establish a one-to-one correspondence between the feasible solutions of the LP and the stationary distributions of the induced Markov chains. The derived policy is shown to maximize the reward among the constrained class of stationary policies and to satisfy the specification constraints even when it does not exercise all possible transitions.
【Keywords】: Planning and Scheduling: Planning under Uncertainty; Planning and Scheduling: Markov Decisions Processes; Agent-based and Multi-agent Systems: Formal Verification, Validation and Synthesis;
【Paper Link】 【Pages】:4076-4083
【Authors】: Daniel Höller ; Pascal Bercher ; Gregor Behnke
【Abstract】: In HTN planning, the hierarchy has a wide impact on solutions. First, there is (usually) no state-based goal; the objective is given via the hierarchy. Second, it enforces actions to be in a plan. Third, planners are not allowed to add actions apart from those introduced via decomposition, i.e. via the hierarchy. However, no heuristic considers the interplay of hierarchy and actions in the plan exactly (without relaxation), because this makes heuristic calculation NP-hard even under delete relaxation. We introduce the problem class of delete- and ordering-free HTN planning as a basis for novel HTN heuristics and show that its plan existence problem is still NP-complete. We then introduce heuristics based on the new class using an integer programming model to solve it.
【Keywords】: Planning and Scheduling: Hierarchical planning; Planning and Scheduling: Search in Planning and Scheduling;
【Paper Link】 【Pages】:4084-4090
【Authors】: Eli Boyarski ; Ariel Felner ; Daniel Harabor ; Peter J. Stuckey ; Liron Cohen ; Jiaoyang Li ; Sven Koenig
【Abstract】: Conflict-Based Search (CBS) is a leading algorithm for optimal Multi-Agent Path Finding (MAPF). CBS variants typically compute MAPF solutions using some form of A* search. However, they often do so under strict time limits so as to avoid exhausting the available memory. In this paper, we present IDCBS, an iterative-deepening variant of CBS which can be executed without exhausting the memory and without strict time limits. IDCBS can be substantially faster than CBS due to incremental methods that it uses when processing CBS nodes.
【Keywords】: Planning and Scheduling: Planning and Scheduling; Agent-based and Multi-agent Systems: Multi-agent Planning; Heuristic Search and Game Playing: Heuristic Search;
【Paper Link】 【Pages】:4091-4097
【Authors】: Rebecca Eifler ; Marcel Steinmetz ; Álvaro Torralba ; Jörg Hoffmann
【Abstract】: Justifying a plan to a user requires answering questions about the space of possible plans. Recent work introduced a framework for doing so via plan-property dependencies, where plan properties p are Boolean functions on plans, and p entails q if all plans that satisfy p also satisfy q. We extend this work in two ways. First, we introduce new algorithms for computing plan-property dependencies, leveraging symbolic search and devising pruning methods for this purpose. Second, while the properties p were previously limited to goal facts and so-called action-set (AS) properties, here we extend them to LTL. Our new algorithms vastly outperform the previous ones, and our methods for LTL cause little overhead on AS properties.
【Keywords】: Planning and Scheduling: Planning Algorithms; Planning and Scheduling: Search in Planning and Scheduling;
【Paper Link】 【Pages】:4098-4105
【Authors】: Ryo Kuroiwa ; Alex Fukunaga
【Abstract】: Although symbolic bidirectional search is successful in optimal classical planning, state-of-the-art satisficing planners do not use bidirectional search. Previous bidirectional search planners for satisficing planning behaved similarly to a trivial portfolio that independently executes forward and backward search, without the desired "meet-in-the-middle" behavior of bidirectional search, where the forward and backward search frontiers intersect at some point relatively far from the forward and backward start states. In this paper, we propose Top-to-Top Bidirectional Search (TTBS), a new bidirectional search strategy with front-to-front heuristic evaluation. We show that TTBS strongly exhibits "meet-in-the-middle" behavior and can solve instances solved by neither forward nor backward search on a number of domains.
【Keywords】: Planning and Scheduling: Planning Algorithms; Planning and Scheduling: Planning and Scheduling; Planning and Scheduling: Search in Planning and Scheduling;
【Paper Link】 【Pages】:4106-4112
【Authors】: Shant Boodaghians ; Federico Fusco ; Stefano Leonardi ; Yishay Mansour ; Ruta Mehta
【Abstract】: Efficient and truthful mechanisms to price time on remote servers/machines have been the subject of much work in recent years due to the importance of the cloud market. This paper considers online revenue maximization for a unit-capacity server, when jobs are non-preemptive, in the Bayesian setting: at each time step, one job arrives, with parameters drawn from an underlying distribution. We design an efficiently computable truthful posted-price mechanism, which maximizes revenue in expectation and in retrospect, up to additive error. The prices are posted prior to learning the agent's type, and the computed pricing scheme is deterministic. We also show that the pricing mechanism is robust to learning the job distribution from samples, where polynomially many samples suffice to obtain near-optimal prices.
【Keywords】: Planning and Scheduling: Markov Decisions Processes; Planning and Scheduling: Planning under Uncertainty; Agent-based and Multi-agent Systems: Algorithmic Game Theory;
【Paper Link】 【Pages】:4113-4120
【Authors】: Marnix Suilen ; Nils Jansen ; Murat Cubuktepe ; Ufuk Topcu
【Abstract】: We study the problem of policy synthesis for uncertain partially observable Markov decision processes (uPOMDPs). The transition probability function of uPOMDPs is only known to belong to a so-called uncertainty set, for instance in the form of probability intervals. Such a model arises when, for example, an agent operates under information limitation due to imperfect knowledge about the accuracy of its sensors. The goal is to compute a policy for the agent that is robust against all possible probability distributions within the uncertainty set. In particular, we are interested in a policy that robustly ensures the satisfaction of temporal logic and expected reward specifications. We state the underlying optimization problem as a semi-infinite quadratically-constrained quadratic program (QCQP), which has finitely many variables and infinitely many constraints. Since QCQPs are non-convex in general and practically infeasible to solve, we resort to the so-called convex-concave procedure to convexify the QCQP. Even though convex, the resulting optimization problem still has infinitely many constraints and is NP-hard. For uncertainty sets that form convex polytopes, we provide a transformation of the problem to a convex QCQP with finitely many constraints. We demonstrate the feasibility of our approach by means of several case studies that highlight typical bottlenecks for our problem. In particular, we show that we are able to solve benchmarks with hundreds of thousands of states, hundreds of different observations, and we investigate the effect of different levels of uncertainty in the models.
【Keywords】: Planning and Scheduling: Planning under Uncertainty; Planning and Scheduling: POMDPs; Uncertainty in AI: Markov Decision Processes;
【Paper Link】 【Pages】:4121-4127
【Authors】: Steven Carr ; Nils Jansen ; Ufuk Topcu
【Abstract】: Recurrent neural networks (RNNs) have emerged as an effective representation of control policies in sequential decision-making problems. However, a major drawback in the application of RNN-based policies is the difficulty in providing formal guarantees on the satisfaction of behavioral specifications, e.g. safety and/or reachability. By integrating techniques from formal methods and machine learning, we propose an approach to automatically extract a finite-state controller (FSC) from an RNN, which, when composed with a finite-state system model, is amenable to existing formal verification tools. Specifically, we introduce an iterative modification to the so-called quantized bottleneck insertion technique to create an FSC as a randomized policy with memory. For the cases in which the resulting FSC fails to satisfy the specification, verification generates diagnostic information. We utilize this information to either adjust the amount of memory in the extracted FSC or perform focused retraining of the RNN. While generally applicable, we detail the resulting iterative procedure in the context of policy synthesis for partially observable Markov decision processes (POMDPs), which is known to be notoriously hard. The numerical experiments show that the proposed approach outperforms traditional POMDP synthesis methods by 3 orders of magnitude while coming within 2% of optimal benchmark values.
【Keywords】: Planning and Scheduling: POMDPs; Planning and Scheduling: Planning with Incomplete information; Planning and Scheduling: Markov Decisions Processes;
【Paper Link】 【Pages】:4128-4134
【Authors】: Francesco Leofante ; Enrico Giunchiglia ; Erika Ábrahám ; Armando Tacchella
【Abstract】: We consider the problem of planning with arithmetic theories, and focus on generating optimal plans for numeric domains with constant and state-dependent action costs. Solving these problems efficiently requires a seamless integration between propositional and numeric reasoning. We propose a novel approach that leverages Optimization Modulo Theories (OMT) solvers to implement a domain-independent optimal theory-planner. We present a new encoding for optimal planning in this setting and we evaluate our approach using well-known, as well as new, numeric benchmarks.
【Keywords】: Planning and Scheduling: Planning Algorithms; Constraints and SAT: Satisfiability Modulo Theories;
【Paper Link】 【Pages】:4135-4142
【Authors】: Michael H. Lim ; Claire Tomlin ; Zachary N. Sunberg
【Abstract】: Partially observable Markov decision processes (POMDPs) with continuous state and observation spaces have powerful flexibility for representing real-world decision and control problems but are notoriously difficult to solve. Recent online sampling-based algorithms that use observation likelihood weighting have shown unprecedented effectiveness in domains with continuous observation spaces. However, there has been no formal theoretical justification for this technique. This work offers such a justification, proving that a simplified algorithm, partially observable weighted sparse sampling (POWSS), will estimate Q-values accurately with high probability and can be made to perform arbitrarily near the optimal solution by increasing computational power.
【Keywords】: Planning and Scheduling: POMDPs; Planning and Scheduling: Planning under Uncertainty; Planning and Scheduling: Real-time Planning;
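The observation likelihood weighting that the abstract above analyzes can be sketched in a few lines for a generative POMDP model: propagate each sampled state through the transition simulator, then weight it by the likelihood of the actual observation. This is a toy importance-weighting sketch under our own naming, not the POWSS algorithm itself:

```python
import random

def weighted_belief_update(particles, action, observation, step, obs_likelihood,
                           rng=random):
    """One likelihood-weighted belief update over state particles.

    step(s, a, rng)            -- samples s' from the generative transition model
    obs_likelihood(o, s', a)   -- density/probability of observing o in s'
    Returns (new_particles, normalized_weights).
    """
    new_particles, weights = [], []
    for s in particles:
        s2 = step(s, action, rng)                      # s' ~ T(. | s, a)
        new_particles.append(s2)
        weights.append(obs_likelihood(observation, s2, action))
    total = sum(weights)
    if total == 0:                                     # degenerate case: fall back to uniform
        return new_particles, [1.0 / len(new_particles)] * len(new_particles)
    return new_particles, [w / total for w in weights]
```

Crucially, the weights use the observation *density* rather than rejection sampling on exact observation matches, which is what makes the scheme workable in continuous observation spaces.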
【Paper Link】 【Pages】:4143-4151
【Authors】: Max Waters ; Lin Padgham ; Sebastian Sardiña
【Abstract】: This work investigates the problem of optimising a partial-order plan’s (POP) flexibility through the simultaneous transformation of its action ordering and variable binding constraints. While the former has been extensively studied through the notions of deordering and reordering, the latter has received much less attention. We show that a plan’s variable bindings are often related to resource usage and their reinstantiation can yield more flexible plans. To do so, we extend existing POP optimality criteria to support variable reinstantiation, and prove that checking if a plan can be optimised further is NP-complete. We also propose a MaxSAT-based technique for increasing plan flexibility and provide a thorough experimental evaluation that suggests that there are benefits in action reinstantiation.
【Keywords】: Planning and Scheduling: Planning and Scheduling;
【Paper Link】 【Pages】:4152-4160
【Authors】: Silvan Sievers ; Florian Pommerening ; Thomas Keller ; Malte Helmert
【Abstract】: Cost partitioning is a method for admissibly combining admissible heuristics. In this work, we extend this concept to merge-and-shrink (M&S) abstractions that may use labels that do not directly correspond to operators. We investigate how optimal and saturated cost partitioning (SCP) interact with M&S transformations and develop a method to compute SCPs during the computation of M&S. Experiments show that SCP significantly improves M&S on standard planning benchmarks.
【Keywords】: Planning and Scheduling: Planning Algorithms; Planning and Scheduling: Search in Planning and Scheduling; Heuristic Search and Game Playing: Heuristic Search;
【Paper Link】 【Pages】:4161-4167
【Authors】: Andrés Occhipinti Liberman ; Rasmus Kræmmer Rendsvig
【Abstract】: Propositional Dynamic Epistemic Logic (DEL) provides an expressive framework for epistemic planning, but lacks desirable features that are standard in first-order planning languages (such as problem-independent action representations via action schemas). A recent epistemic planning formalism based on First-Order Dynamic Epistemic Logic (FODEL) combines the strengths of DEL (higher-order epistemics) with those of first-order languages (lifted representation), yielding benefits in terms of expressiveness and representational succinctness. This paper studies the plan existence problem for FODEL planning, showing that while the problem is generally undecidable, the cases of single-agent planning and multi-agent planning with non-modal preconditions are decidable.
【Keywords】: Planning and Scheduling: Distributed;Multi-agent Planning; Planning and Scheduling: Theoretical Foundations of Planning; Knowledge Representation and Reasoning: Reasoning about Knowledge and Belief; Agent-based and Multi-agent Systems: Multi-agent Planning;
【Paper Link】 【Pages】:4168-4175
【Authors】: Michael Saint-Guillain ; Tiago Stegun Vaquero ; Jagriti Agrawal ; Steve A. Chien
【Abstract】: Most existing works on Probabilistic Simple Temporal Networks (PSTNs) base their frameworks on well-defined probability distributions. This paper addresses the PSTN Dynamic Controllability (DC) robustness measure, i.e. the execution success probability of a network under dynamic control. We consider PSTNs where the probability distributions of the contingent edges are ordinary distributed (e.g. non-parametric, non-symmetric). We introduce the concepts of dispatching protocol (DP) and DP-robustness, the probability of success under a predefined dynamic policy. We propose a fixed-parameter pseudo-polynomial time algorithm to compute the exact DP-robustness of any PSTN under the NextFirst protocol, apply it to various PSTN datasets, including the real case of planetary exploration in the context of the Mars 2020 rover, and propose an original structural analysis.
【Keywords】: Planning and Scheduling: Planning under Uncertainty; Planning and Scheduling: Robot Planning; Planning and Scheduling: Temporal and Hybrid planning;
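The robustness computations above ultimately reduce to probability mass calculations over discrete, possibly non-parametric duration distributions. As a hedged illustration only (this is not the paper's NextFirst algorithm), the sketch below computes the probability that the sum of two independent, arbitrarily distributed integer-valued contingent durations meets a deadline via discrete convolution, in time pseudo-polynomial in the deadline:

```python
import numpy as np

def deadline_success_probability(d1, d2, deadline):
    """P(X1 + X2 <= deadline) for independent integer-valued durations.

    d1[k] = P(X1 == k) and d2[k] = P(X2 == k); the distributions may be
    non-parametric and non-symmetric, as in the PSTN setting above.
    """
    total = np.convolve(d1, d2)          # distribution of X1 + X2
    return float(total[:deadline + 1].sum())
```

For example, with d1 = d2 = [0.5, 0.5] (duration 0 or 1 with equal probability), the success probability for deadline 1 is 0.75.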
【Paper Link】 【Pages】:4176-4182
【Authors】: Shahaf S. Shperberg ; Andrew Coles ; Erez Karpas ; Solomon Eyal Shimony ; Wheeler Ruml
【Abstract】: If a planning agent is considering taking a bus, for example, the time that passes during its planning can affect the feasibility of its plans, as the bus may depart before the agent has found a complete plan. Previous work on this situated temporal planning setting proposed an abstract deliberation scheduling scheme for maximizing the probability of finding a plan that is still feasible at the time it is found. In this paper, we extend the deliberation scheduling approach to address problems in which plans can differ in their cost. Like the planning deadlines, these costs can be uncertain until a complete plan has been found. We show that finding a deliberation policy that minimizes expected cost is PSPACE-hard and that even for known costs and deadlines the optimal solution is a contingent, rather than sequential, schedule. We then analyze special cases of the problem and use these results to propose a greedy scheme that considers both the uncertain deadlines and costs. Our empirical evaluation shows that the greedy scheme performs well in practice on a variety of problems, including some generated from planner search trees.
【Keywords】: Planning and Scheduling: Temporal and Hybrid planning; Heuristic Search and Game Playing: Meta-Reasoning and Meta-heuristics;
【Paper Link】 【Pages】:4183-4189
【Authors】: Florent Teichteil-Königsbuch ; Miquel Ramírez ; Nir Lipovetzky
【Abstract】: Width-based planning algorithms have been demonstrated to be competitive with state-of-the-art heuristic search and SAT-based approaches, without requiring access to a model of action effects and preconditions; they need only access to a black-box simulator. The search of width-based planners is guided by a measure of the novelty of states, which requires observations of simulator states to be given as a set of features. This paper proposes agnostic feature mapping mechanisms that define the features online, as exploration progresses and the domain of continuous state variables is revealed. We demonstrate the effectiveness of these features on the OpenAI gym "classical control" suite of benchmarks. We compare our online planners with state-of-the-art deep reinforcement learning algorithms, and show that width-based planners using our features can find policies of the same quality with significantly fewer computational resources.
【Keywords】: Planning and Scheduling: Planning Algorithms; Planning and Scheduling: Real-time Planning; Planning and Scheduling: Search in Planning and Scheduling; Machine Learning: Reinforcement Learning;
【Paper Link】 【Pages】:4190-4198
【Authors】: Yunbo Wang ; Bo Liu ; Jiajun Wu ; Yuke Zhu ; Simon S. Du ; Fei-Fei Li ; Joshua B. Tenenbaum
【Abstract】: A major difficulty in solving continuous POMDPs is inferring the multi-modal distribution of the unobserved true states and making the planning algorithm dependent on the perceived uncertainty. We cast POMDP filtering and planning problems as two closely related Sequential Monte Carlo (SMC) processes, one over the real states and the other over the future optimal trajectories, and combine the merits of these two parts in a new model named the DualSMC network. In particular, we first introduce an adversarial particle filter that leverages the adversarial relationship between its internal components. Based on the filtering results, we then propose a planning algorithm that extends the previous SMC planning approach [Piche et al., 2018] to continuous POMDPs with an uncertainty-dependent policy. Crucially, not only can DualSMC handle complex observations such as image input, but it also remains highly interpretable. It is shown to be effective in three continuous POMDP domains: the floor positioning domain, the 3D light-dark navigation domain, and a modified Reacher domain.
【Keywords】: Planning and Scheduling: Planning Algorithms; Planning and Scheduling: Planning under Uncertainty; Planning and Scheduling: Planning with Incomplete information;
【Paper Link】 【Pages】:4199-4205
【Abstract】: Several scientific studies have reported the existence of an income gap among rideshare drivers based on demographic factors such as gender, age, race, etc. In this paper, we study the income inequality among rideshare drivers due to discriminative cancellations from riders, and the tradeoff between income inequality (called the fairness objective) and system efficiency (called the profit objective). We propose an online bipartite-matching model where riders are assumed to arrive sequentially following a distribution known in advance. The highlight of our model is the concept of an acceptance rate between any pair of driver and rider types, where types are defined based on demographic factors. Specifically, we assume each rider can accept or cancel the driver assigned to her, each occurring with a certain probability that reflects the degree of acceptance of the rider type towards the driver type. We construct a bi-objective linear program as a valid benchmark and propose two LP-based parameterized online algorithms. Rigorous online competitive ratio analysis is offered to demonstrate the flexibility and efficiency of our online algorithms in balancing the two conflicting goals, promotion of fairness and of profit. Experimental results on a real-world dataset are provided as well, which confirm our theoretical predictions.
【Keywords】: Planning and Scheduling: Scheduling; Agent-based and Multi-agent Systems: Resource Allocation; AI Ethics: Fairness;
【Paper Link】 【Pages】:4206-4212
【Authors】: Yifan Xu ; Pan Xu ; Jianping Pan ; Jun Tao
【Abstract】: With the popularity of the Internet, traditional offline resource allocation has evolved into a new form, called online resource allocation. It features the online arrivals of agents in the system and the real-time decision-making requirement upon the arrival of each online agent. Both offline and online resource allocation have wide applications in various real-world matching markets ranging from ridesharing to crowdsourcing. There are some emerging applications such as rebalancing in bike sharing and trip-vehicle dispatching in ridesharing, which involve a two-stage resource allocation process. The process consists of an offline phase and another sequential online phase, and both phases compete for the same set of resources. In this paper, we propose a unified model which incorporates both offline and online resource allocation into a single framework. Our model assumes non-uniform and known arrival distributions for online agents in the second online phase, which can be learned from historical data. We propose a parameterized linear programming (LP)-based algorithm, which is shown to be at most a constant factor of 1/4 from the optimal. Experimental results on the real dataset show that our LP-based approaches outperform the LP-agnostic heuristics in terms of robustness and effectiveness.
【Keywords】: Planning and Scheduling: Scheduling; Agent-based and Multi-agent Systems: Economic Paradigms, Auctions and Market-Based Systems; Agent-based and Multi-agent Systems: Resource Allocation;
【Paper Link】 【Pages】:4213-4219
【Authors】: Jan Bürmann ; Jie Zhang
【Abstract】: In full-knowledge multi-robot adversarial patrolling, a group of robots has to detect an adversary who knows the robots' strategy. The adversary can easily take advantage of any deterministic patrolling strategy, which necessitates the employment of a randomised strategy. While the Markov decision process has been the dominant methodology for computing the penetration detection probabilities, we apply enumerative combinatorics to characterise these probabilities. This allows us to provide closed formulae for them and facilitates characterising optimal random defence strategies. Compared to iteratively updating the Markov transition matrices, our method significantly reduces the time and space complexity of solving the problem. We use this method to tackle four penetration configurations.
【Keywords】: Planning and Scheduling: Robot Planning; Planning and Scheduling: Planning under Uncertainty; Planning and Scheduling: Theoretical Foundations of Planning;
【Paper Link】 【Pages】:4221-4228
【Authors】: Jing Liang ; Utsav Patel ; Adarsh Jagan Sathyamoorthy ; Dinesh Manocha
【Abstract】: We present a novel high-fidelity 3-D simulator that significantly reduces the sim-to-real gap for collision avoidance in dense crowds using Deep Reinforcement Learning (DRL). Our simulator models realistic crowd and pedestrian behaviors, along with friction, sensor noise and delays in the simulated robot model. We also describe a technique to incrementally control the randomness and complexity of training scenarios to achieve better convergence and generalization capabilities. We demonstrate the effectiveness of our simulator by training a policy that fuses data from multiple perception sensors such as a 2-D lidar and a depth camera to detect pedestrians and compute smooth, collision-free velocities. Our novel reward function and multi-sensor formulation result in smooth and unobtrusive navigation. We evaluate the learned policy on two differential drive robots in new dense crowd scenarios, narrow corridors, T- and L-junctions, etc. We observe that our algorithm outperforms prior dynamic navigation techniques in terms of metrics such as success rate, trajectory length, mean time to goal, and smoothness.
【Keywords】: Robotics: Motion and Path Planning; Robotics: Learning in Robotics; Planning and Scheduling: Robot Planning;
【Paper Link】 【Pages】:4229-4235
【Authors】: Bojie Shen ; Muhammad Aamir Cheema ; Daniel Harabor ; Peter J. Stuckey
【Abstract】: We consider optimal and anytime algorithms for the Euclidean Shortest Path Problem (ESPP) in two dimensions. Our approach leverages ideas from two recent works: Polyanya, a mesh-based ESPP planner which we use to represent and reason about the environment, and Compressed Path Databases, a speedup technique for pathfinding on grids and spatial networks, which we exploit to compute fast candidate paths. In a range of experiments and empirical comparisons we show that: (i) the auxiliary data structures required by the new method are cheap to build and store; (ii) for optimal search, the new algorithm is faster than a range of recent ESPP planners, with speedups ranging from several factors to over one order of magnitude; (iii) for anytime search, where feasible solutions are needed fast, we report even better runtimes.
【Keywords】: Robotics: Motion and Path Planning; Heuristic Search and Game Playing: Heuristic Search;
【Paper Link】 【Pages】:4237-4244
【Authors】: Yuqiao Chen ; Yibo Yang ; Sriraam Natarajan ; Nicholas Ruozzi
【Abstract】: Lifted inference algorithms exploit model symmetry to reduce computational cost in probabilistic inference. However, most existing lifted inference algorithms operate only over discrete domains or continuous domains with restricted potential functions. We investigate two approximate lifted variational approaches that apply to domains with general hybrid potentials, and are expressive enough to capture multi-modality. We demonstrate that the proposed variational methods are highly scalable and can exploit approximate model symmetries even in the presence of a large amount of continuous evidence, outperforming existing message-passing-based approaches in a variety of settings. Additionally, we present a sufficient condition for the Bethe variational approximation to yield a non-trivial estimate over the marginal polytope.
【Keywords】: Uncertainty in AI: Approximate Probabilistic Inference; Uncertainty in AI: Graphical Models; Uncertainty in AI: Statistical Relational AI;
【Paper Link】 【Pages】:4245-4251
【Authors】: Niels Grüttemeier ; Christian Komusiewicz
【Abstract】: We study the problem of learning the structure of an optimal Bayesian network when additional structural constraints are posed on the network or on its moralized graph. More precisely, we consider the constraint that the moralized graph can be transformed to a graph from a sparse graph class Π by at most k vertex deletions. We show that for Π being the graphs with maximum degree 1, an optimal network can be computed in polynomial time when k is constant, extending previous work that gave an algorithm with such a running time for Π being the class of edgeless graphs [Korhonen & Parviainen, NIPS 2015]. We then show that further extensions or improvements are presumably impossible. For example, we show that when Π is the set of graphs in which each component has size at most three, then learning an optimal network is NP-hard even if k=0. Finally, we show that learning an optimal network with at most k edges in the moralized graph is presumably not fixed-parameter tractable with respect to k and that, in contrast, computing an optimal network with at most k arcs is fixed-parameter tractable in k.
【Keywords】: Uncertainty in AI: Bayesian Networks; Heuristic Search and Game Playing: Combinatorial Search and Optimisation;
【Paper Link】 【Pages】:4252-4258
【Authors】: Timothy van Bremen ; Ondrej Kuzelka
【Abstract】: We study the symmetric weighted first-order model counting task and present ApproxWFOMC, a novel anytime method for efficiently bounding the weighted first-order model count of a sentence given an unweighted first-order model counting oracle. The algorithm has applications to inference in a variety of first-order probabilistic representations, such as Markov logic networks and probabilistic logic programs. Crucially for many applications, no assumptions are made on the form of the input sentence. Instead, the algorithm makes use of the symmetry inherent in the problem by imposing cardinality constraints on the number of possible true groundings of a sentence's literals. Realising the first-order model counting oracle in practice using the approximate hashing-based model counter ApproxMC3, we show how our algorithm is competitive with existing approximate and exact techniques for inference in first-order probabilistic models. We additionally provide PAC guarantees on the accuracy of the bounds generated.
【Keywords】: Uncertainty in AI: Approximate Probabilistic Inference; Uncertainty in AI: Statistical Relational AI; Constraints and SAT: SAT: Algorithms and Techniques;
【Paper Link】 【Pages】:4259-4265
【Authors】: Cheng Chen ; Luo Luo ; Weinan Zhang ; Yong Yu ; Yijiang Lian
【Abstract】: The linear contextual bandit is a sequential decision-making problem in which an agent decides among sequential actions given their corresponding contexts. Since large-scale data sets are becoming more and more common, we study linear contextual bandits in high-dimensional situations. Recent works focus on employing matrix sketching methods to accelerate contextual bandits. However, the matrix approximation error brings additional terms into the regret bound. In this paper we first propose a novel matrix sketching method called Spectral Compensation Frequent Directions (SCFD). We then propose an efficient approach to contextual bandits by adopting SCFD to approximate the covariance matrices. By maintaining and manipulating sketched matrices, our method needs only O(md) space and O(md) updating time in each round, where d is the dimensionality of the data and m is the sketching size. Theoretical analysis reveals that our method has better regret bounds than previous methods in high-dimensional cases. Experimental results demonstrate the effectiveness of our algorithm and verify our theoretical guarantees.
【Keywords】: Uncertainty in AI: Sequential Decision Making; Machine Learning: Online Learning;
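The abstract above does not spell out SCFD itself; for context, a minimal sketch of the underlying Frequent Directions primitive it builds on is given below. This maintains an m x d sketch B with the deterministic guarantee ||A^T A - B^T B||_2 <= 2 ||A||_F^2 / m; the spectral-compensation step that distinguishes SCFD is not reproduced here.

```python
import numpy as np

def frequent_directions(A, m):
    """Streaming Frequent Directions sketch (assumes d >= m, m even).

    Processes the rows of the n x d matrix A one at a time and returns
    an m x d sketch B satisfying ||A^T A - B^T B||_2 <= 2 ||A||_F^2 / m.
    """
    n, d = A.shape
    B = np.zeros((m, d))
    next_free = 0
    for row in A:
        if next_free == m:                      # sketch is full: shrink it
            _, s, Vt = np.linalg.svd(B, full_matrices=False)
            delta = s[m // 2] ** 2
            s = np.sqrt(np.maximum(s ** 2 - delta, 0.0))
            B = s[:, None] * Vt                 # rows m//2 onward are now zero
            next_free = m // 2
        B[next_free] = row                      # insert the new row
        next_free += 1
    return B
```

Each shrink step frees half the sketch rows, so the amortized cost per row is one size-m SVD every m/2 rows, which is what makes the O(md) per-round update in the abstract plausible.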
【Paper Link】 【Pages】:4266-4274
【Authors】: Kalev Kask ; Bobak Pezeshki ; Filjor Broka ; Alexander T. Ihler ; Rina Dechter
【Abstract】: Abstraction Sampling (AS) is a recently introduced enhancement of Importance Sampling that exploits stratification by using a notion of abstractions: groupings of similar nodes into abstract states. It was previously shown that AS performs particularly well when sampling over an AND/OR search space; however, existing schemes were limited to ``proper'' abstractions in order to ensure unbiasedness, severely hindering scalability. In this paper, we introduce AOAS, a new Abstraction Sampling scheme on AND/OR search spaces that allows more flexible use of abstractions by circumventing the properness requirement. We analyze the properties of this new algorithm and, in an extensive empirical evaluation on five benchmarks comprising over 480 problems, with comparisons against other state-of-the-art algorithms, illustrate AOAS's properties and show that it provides a far more powerful and competitive Abstraction Sampling framework.
【Keywords】: Uncertainty in AI: Approximate Probabilistic Inference; Uncertainty in AI: Bayesian Networks; Uncertainty in AI: Graphical Models;
【Paper Link】 【Pages】:4275-4282
【Authors】: Haifeng Qian
【Abstract】: This paper proposes a new generative model called the neural belief reasoner (NBR). It differs from previous models in that it specifies a belief function rather than a probability distribution. Its implementation consists of neural networks, fuzzy-set operations and belief-function operations, and query-answering, sample-generation and training algorithms are presented. This paper studies NBR in two tasks. The first is a synthetic unsupervised-learning task, which demonstrates NBR's ability to perform multi-hop reasoning, reasoning with uncertainty and reasoning about conflicting information. The second is supervised learning: a robust MNIST classifier for 4 and 9, the most challenging pair of digits. This classifier needs no adversarial training, and it substantially exceeds the state of the art in adversarial robustness as measured by the L2 metric, while at the same time maintaining 99.1% accuracy on natural images.
【Keywords】: Uncertainty in AI: Uncertainty Representations; Machine Learning: Adversarial Machine Learning; Machine Learning: Unsupervised Learning; Machine Learning: Neuro-Symbolic Methods;
【Paper Link】 【Pages】:4283-4290
【Authors】: Manfred Jaeger ; Oliver Schulte
【Abstract】: A generative probabilistic model for relational data consists of a family of probability distributions for relational structures over domains of different sizes. In most existing statistical relational learning (SRL) frameworks, these models are not projective in the sense that the marginal of the distribution for size-n structures on induced substructures of size k
【Keywords】: Uncertainty in AI: Statistical Relational AI; Machine Learning: Relational Learning; Machine Learning: Probabilistic Machine Learning;
【Paper Link】 【Pages】:4291-4297
【Authors】: Debarun Bhattacharjya ; Dharmashankar Subramanian ; Tian Gao
【Abstract】: Many real-world domains involve co-evolving relationships between events, such as meals and exercise, and time-varying random variables, such as a patient's blood glucose levels. In this paper, we propose a general framework for modeling joint temporal dynamics involving continuous time transitions of discrete state variables and irregular arrivals of events over the timeline. We show how conditional Markov processes (as represented by continuous time Bayesian networks) and multivariate point processes (as represented by graphical event models) are among various processes that are covered by the framework. We introduce and compare two simple and interpretable yet practical joint models within the framework with relevant baselines on simulated and real-world datasets, using a graph search algorithm for learning. The experiments highlight the importance of jointly modeling event arrivals and state variable transitions to better fit joint temporal datasets, and the framework opens up possibilities for models involving even more complex dynamics whenever suitable.
【Keywords】: Uncertainty in AI: Graphical Models; Machine Learning: Learning Graphical Models; Data Mining: Mining Spatial, Temporal Data; Uncertainty in AI: Uncertainty Representations;
【Paper Link】 【Pages】:4299-4305
【Authors】: Luca Romeo ; Giuseppe Armentano ; Antonio Nicolucci ; Marco Vespasiani ; Giacomo Vespasiani ; Emanuele Frontoni
【Abstract】: The prediction of the risk profile related to cardiopathy complications is a core research task that could support clinical decision making. However, the design and implementation of a clinical decision support system based on Electronic Health Record (EHR) temporal data comprises several challenges. Several single-task learning approaches consider the prediction of the risk profile related to a specific diabetes complication (i.e., cardiopathy) independently from other complications. Accordingly, the state-of-the-art multi-task learning (MTL) model encapsulates only the temporal relatedness among the EHR data. However, this assumption might be restrictive in the clinical scenario, where both spatial and temporal constraints should be taken into account. The aim of this study is to propose two different MTL procedures, called spatio-temporal lasso (STL-MTL) and spatio-temporal group lasso (STGL-MTL), which encode the spatio-temporal relatedness using a regularization term and a graph-based approach (i.e., encoding the task relatedness using a structure matrix). Experimental results on a real-world EHR dataset demonstrate the robust performance and the interpretability of the proposed approach.
【Keywords】: Machine Learning: Transfer, Adaptation, Multi-task Learning; Machine Learning Applications: Bio/Medicine; Machine Learning: Classification; Machine Learning Applications: Applications of Supervised Learning;
【Paper Link】 【Pages】:4306-4312
【Authors】: Moein Khajehnejad ; Ahmad Asgharian Rezaei ; Mahmoudreza Babaei ; Jessica Hoffmann ; Mahdi Jalili ; Adrian Weller
【Abstract】: Influence maximization is a widely studied topic in network science, where the aim is to reach the maximum possible number of nodes, while only targeting a small initial set of individuals. It has critical applications in many fields, including viral marketing, information propagation, news dissemination, and vaccinations. However, the objective does not usually take into account whether the final set of influenced nodes is fair with respect to sensitive attributes, such as race or gender. Here we address fair influence maximization, aiming to reach minorities more equitably. We introduce Adversarial Graph Embeddings: we co-train an auto-encoder for graph embedding and a discriminator to discern sensitive attributes. This leads to embeddings which are similarly distributed across sensitive attributes. We then find a good initial set by clustering the embeddings. We believe we are the first to use embeddings for the task of fair influence maximization. While there are typically trade-offs between fairness and influence maximization objectives, our experiments on synthetic and real-world datasets show that our approach dramatically reduces disparity while remaining competitive with state-of-the-art influence maximization methods.
【Keywords】: AI Ethics: Fairness; Machine Learning: Adversarial Machine Learning; Natural Language Processing: Embeddings; Machine Learning Applications: Networks;
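Influence spread in this line of work is typically estimated by Monte-Carlo simulation of a diffusion process. As a hedged sketch only, the snippet below runs one cascade under the standard independent cascade model; the uniform activation probability p is an assumption for illustration, and the paper's diffusion details may differ.

```python
import random

def independent_cascade(adj, seeds, p, rng=random):
    """One run of the independent cascade model on a directed graph.

    adj maps each node to its out-neighbors; each newly activated node
    gets a single chance to activate each inactive neighbor, succeeding
    with probability p.
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        newly = []
        for u in frontier:
            for v in adj.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    newly.append(v)
        frontier = newly
    return active
```

Averaging len(independent_cascade(...)) over many runs estimates the expected spread of a seed set; a fairness-aware variant would additionally track the spread within each demographic group.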
【Paper Link】 【Pages】:4313-4321
【Authors】: Junwen Bai ; Shufeng Kong ; Carla P. Gomes
【Abstract】: Multi-label classification is the challenging task of predicting the presence and absence of multiple targets, involving representation learning and label correlation modeling. We propose a novel framework for multi-label classification, Multivariate Probit Variational AutoEncoder (MPVAE), that effectively learns latent embedding spaces as well as label correlations. MPVAE learns and aligns two probabilistic embedding spaces for labels and features respectively. The decoder of MPVAE takes in the samples from the embedding spaces and models the joint distribution of output targets under a Multivariate Probit model by learning a shared covariance matrix. We show that MPVAE outperforms the existing state-of-the-art methods on important computational sustainability applications as well as on other application domains, using public real-world datasets. MPVAE is further shown to remain robust under noisy settings. Lastly, we demonstrate the interpretability of the learned covariance by a case study on a bird observation dataset.
【Keywords】: Machine Learning: Multi-instance;Multi-label;Multi-view learning; Machine Learning: Deep Generative Models; Uncertainty in AI: Approximate Probabilistic Inference; Machine Learning: Interpretability;
【Paper Link】 【Pages】:4322-4329
【Authors】: Hau Chan ; Long Tran-Thanh ; Vignesh Viswanathan
【Abstract】: Standard disaster response involves using drones (or helicopters) for reconnaissance and using people on the ground to mitigate the damage. In this paper, we look at the problem of wildfires and propose an efficient resource allocation strategy to cope with both a dynamically changing environment and uncertainty. In particular, we propose Firefly, a new resource allocation algorithm that can provably achieve optimal or near-optimal solutions with high probability by first efficiently allocating observation drones to collect information and reduce uncertainty, and then allocating the firefighting units to extinguish the fire. For the former, Firefly uses a combination of a maximum set coverage formulation and a novel utility estimation technique; for the latter, it uses a knapsack formulation to calculate the allocation. We also demonstrate empirically, using a real-world dataset, that Firefly achieves up to 80-90% of the performance of the offline optimal solution in most cases, even with a small number of drones.
【Keywords】: Agent-based and Multi-agent Systems: Resource Allocation; Multidisciplinary Topics and Applications: Other; Uncertainty in AI: Other; Machine Learning Applications: Environmental;
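The two allocation subproblems named in the abstract have classic textbook counterparts. As a hedged illustration (not Firefly's utility-estimation machinery), greedy maximum set coverage for the drone stage and a 0/1 knapsack DP for the firefighting stage look like:

```python
def greedy_max_coverage(sets, k):
    """Classic greedy (1 - 1/e)-approximation for maximum coverage:
    pick at most k sets, each time maximizing newly covered elements."""
    covered, chosen = set(), []
    for _ in range(k):
        i = max(range(len(sets)), key=lambda j: len(sets[j] - covered))
        if not sets[i] - covered:   # no set adds anything new
            break
        chosen.append(i)
        covered |= sets[i]
    return chosen, covered

def knapsack(values, costs, budget):
    """0/1 knapsack DP: best total value under an integer cost budget."""
    dp = [0] * (budget + 1)
    for v, c in zip(values, costs):
        for b in range(budget, c - 1, -1):   # iterate down: each item used once
            dp[b] = max(dp[b], dp[b - c] + v)
    return dp[budget]
```

For instance, greedy_max_coverage([{1, 2, 3}, {3, 4}, {4, 5}], 2) covers all five elements with two sets, and knapsack over estimated fire-damage reductions and unit costs picks the firefighting allocation.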
【Paper Link】 【Pages】:4330-4337
【Authors】: Yingwei Zhang ; Yiqiang Chen ; Hanchao Yu ; Zeping Lv ; Qing Li ; Xiaodong Yang
【Abstract】: Discriminating pathologic cognitive decline from the expected decline of normal aging is an important research topic for elderly care and health monitoring. However, most cognitive assessment methods only work when the data distributions of the training set and testing set are consistent. Enabling existing cognitive assessment models to adapt to the data in new cognitive assessment tasks is a significant challenge. In this paper, we propose a novel domain adaptation method, namely the Fine-Grained Adaptation Random Forest (FAT), to bridge the cognitive assessment gap when the data distribution changes. FAT is composed of two essential parts: 1) an information gain based model evaluation strategy (IGME) and 2) a domain adaptation tree growing mechanism (DATG). IGME is used to evaluate every individual tree, and DATG is used to transfer the source model to the target domain. To evaluate the performance of FAT, we conduct experiments in real clinical environments. Experimental results demonstrate that FAT is significantly more accurate and efficient compared with other state-of-the-art methods.
【Keywords】: Humans and AI: Cognitive Modeling; Humans and AI: Cognitive Systems; Machine Learning: Transfer, Adaptation, Multi-task Learning; Humans and AI: Human-Computer Interaction;
【Paper Link】 【Pages】:4338-4344
【Authors】: Pramith Devulapalli ; Bistra Dilkina ; Yexiang Xue
【Abstract】: Models capturing parameterized random walks on graphs have been widely adopted in wildlife conservation to study species dispersal as a function of landscape features. Learning the probabilistic model empowers ecologists to understand animal responses to conservation strategies. By exploiting the connection between random walks and simple electric networks, we show that learning a random walk model can be reduced to finding the optimal graph Laplacian for a circuit. We propose a moment matching strategy that correlates the model’s hitting and commuting times with those observed empirically. To find the best Laplacian, we propose a neural network capable of back-propagating gradients through the matrix inverse in an end-to-end fashion. We developed a scalable method called CGInv which back-propagates the gradients through a neural network encoding each layer as a conjugate gradient iteration. To demonstrate its effectiveness, we apply our computational framework to applications in landscape connectivity modeling. Our experiments successfully demonstrate that our framework effectively and efficiently recovers the ground-truth configurations.
【Keywords】: Machine Learning Applications: Environmental; Machine Learning: Learning Graphical Models; Machine Learning: Deep Learning; Machine Learning: Time-series;Data Streams;
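The random-walk/electric-network connection used in this work means hitting times are the solution of a linear system in the walk's transition matrix (equivalently, in the graph Laplacian). A small illustrative sketch using a dense solve follows; the paper's CGInv instead back-propagates through conjugate-gradient iterations for scalability.

```python
import numpy as np

def expected_hitting_times(W, target):
    """Expected number of steps for a weighted random walk to first
    reach `target`, from every node of an undirected graph.

    W is a symmetric nonnegative weight (conductance) matrix; the walk
    moves from u to v with probability W[u, v] / sum_v W[u, v].
    Solves h = 1 + P h on non-target nodes with h[target] = 0.
    """
    n = W.shape[0]
    P = W / W.sum(axis=1, keepdims=True)       # transition probabilities
    rest = [i for i in range(n) if i != target]
    A = np.eye(n - 1) - P[np.ix_(rest, rest)]
    h = np.zeros(n)
    h[rest] = np.linalg.solve(A, np.ones(n - 1))
    return h
```

On the unweighted path graph 0-1-2 with target node 2, this gives hitting times [4, 3, 0], matching the standard hand calculation; moment matching as described above would compare such model-implied hitting and commute times against empirically observed ones.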
【Paper Link】 【Pages】:4345-4351
【Authors】: Zhengcong Fei
【Abstract】: Tandem mass spectrometry is the most widely used technology to identify proteins in a complex biological sample; it produces a large number of spectra, each representative of a protein subsequence named a peptide. In this paper, we propose a hierarchical multi-stage framework, referred to as DeepTag, to identify the peptide sequence for each given spectrum. Compared with traditional one-stage generation, our sequencing model starts the inference with a selected high-confidence guiding tag and provides the complete sequence based on this guiding tag. Besides, we introduce a cross-modality refining module to assist the decoder in focusing on effective peaks, and fine-tune with a reinforcement learning technique. Experiments on different public datasets demonstrate that our method achieves new state-of-the-art performance on the peptide identification task, leading to a marked improvement in terms of both precision and recall.
【Keywords】: Computer Vision: Biomedical Image Understanding; Natural Language Processing: NLP Applications and Tools;
【Paper Link】 【Pages】:4352-4358
【Authors】: Chung-Kyun Han ; Shih-Fen Cheng
【Abstract】: The trend of moving online in the retail industry has created great pressure for the logistics industry to catch up both in terms of volume and response time. On one hand, volume is fluctuating at greater magnitude, making peaks higher; on the other hand, customers are also expecting shorter response times. As a result, logistics service providers are pressured to expand and keep up with the demand. Expanding fleet capacity, however, is not sustainable, as capacity built for the peak seasons would be mostly vacant during ordinary days. One promising solution is to engage crowdsourced workers, who are not employed full-time but would be willing to help with the deliveries if their schedules permit. The challenge, however, is to choose appropriate sets of tasks that would not cause too much disruption to their intended routes, while satisfying each delivery task's time window requirement. In this paper, we propose a decision-support algorithm to select delivery tasks for a single crowdsourced worker that best fit his/her upcoming route, both in terms of additional travel time and the time window requirements at all stops along the route, while at the same time satisfying the tasks' delivery time windows. Our major contributions are the formulation of the problem and the design of an efficient exact algorithm based on the branch-and-cut approach. The major innovation we introduce is the efficient generation of promising valid inequalities via our separation heuristics. In all numerical instances we study, our approach manages to reach optimality, yet with far lower computational resource requirements than the plain integer linear programming formulation. The greedy heuristic, while time-efficient, only achieves around 40-60% of the optimum in all cases. To illustrate how our solver could help in advancing the sustainability objective, we also quantify the reduction in the carbon footprint.
【Keywords】: Humans and AI: Human Computation and Crowdsourcing; Planning and Scheduling: Planning Algorithms; Multidisciplinary Topics and Applications: Transportation; Planning and Scheduling: Search in Planning and Scheduling;
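The greedy baseline the abstract benchmarks against (40-60% of optimum) can be sketched under heavily simplified assumptions; this is a hypothetical illustration, not the authors' branch-and-cut algorithm — each task is reduced to a single detour cost and one deadline, and route-level interactions are ignored:

```python
# Hypothetical sketch of a greedy baseline for crowdsourced task selection.
# Each task has an extra detour time and a delivery deadline; the worker
# accepts tasks in order of smallest detour while each deadline stays feasible.

def greedy_select(tasks, start_time):
    """tasks: list of (name, detour, deadline); returns accepted task names."""
    accepted, t = [], start_time
    for name, detour, deadline in sorted(tasks, key=lambda x: x[1]):
        if t + detour <= deadline:      # task still meets its time window
            t += detour                 # the detour delays every later stop
            accepted.append(name)
    return accepted

tasks = [("A", 5, 20), ("B", 2, 10), ("C", 8, 12)]
print(greedy_select(tasks, start_time=0))   # B (detour 2), then A; C misses
```

An exact method such as branch-and-cut would instead search over all feasible subsets, which is why the greedy gap can be large.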
【Paper Link】 【Pages】:4359-4365
【Authors】: Chao Huang ; Chuxu Zhang ; Peng Dai ; Liefeng Bo
【Abstract】: Predicting anomalies (e.g., blocked driveways and vehicle collisions) in urban space plays an important role in assisting governments and communities in building smart city applications, ranging from intelligent transportation to public safety. However, predicting urban anomalies is not trivial due to the following two factors: i) the sequential transition regularities of anomaly occurrences are complex, exhibiting high-order and dynamic correlations; ii) the interactions between region, time and anomaly category are multi-dimensional in real-world urban anomaly forecasting scenarios. How to fuse multiple relations from spatial, temporal and categorical dimensions in the predictive framework remains a significant challenge. To address these two challenges, we propose a Cross-Interaction Hierarchical Attention network model (CHAT) which uncovers the dynamic occurrence patterns of time-stamped urban anomaly data. Our CHAT framework automatically captures the relevance of past anomaly occurrences across different time steps, and discriminates which types of cross-modal interactions are more important for making future predictions. Experimental results demonstrate the superiority of the CHAT framework over state-of-the-art baselines.
【Keywords】: Data Mining: Mining Spatial, Temporal Data; Data Mining: Applications; Data Mining: Mining Data Streams;
【Paper Link】 【Pages】:4366-4374
【Authors】: Ashiqur R. KhudaBukhsh ; Shriphani Palakodety ; Jaime G. Carbonell
【Abstract】: Code mixing (or code switching) is a common phenomenon observed in social-media content generated by a linguistically diverse user base. Studies show that in the Indian subcontinent, a substantial fraction of social media posts exhibit code switching. While the difficulties posed by code mixed documents to further downstream analyses are well understood, lending visibility to code mixed documents under certain scenarios may have utility that has been previously overlooked. For instance, a document written in a mixture of multiple languages can be partially accessible to a wider audience; this could be particularly useful if a considerable fraction of the audience lacks fluency in one of the component languages. In this paper, we provide a systematic approach to sample code mixed documents leveraging a polyglot embedding based method that requires minimal supervision. In the context of the 2019 India-Pakistan conflict triggered by the Pulwama terror attack, we demonstrate an untapped potential of harnessing code mixing for human well-being: starting from an existing hostility-diffusing hope speech classifier trained solely on English documents, code mixed documents are utilized to perform cross-lingual sampling and retrieve hope speech content written in a low-resource but widely used language, Romanized Hindi. Our proposed pipeline requires minimal supervision and holds promise in substantially reducing web moderation efforts. A further exploratory study on a new COVID-19 dataset introduced in this paper demonstrates the generalizability of our cross-lingual sampling technique.
【Keywords】: Natural Language Processing: Natural Language Processing; Natural Language Processing: Information Retrieval; Natural Language Processing: Embeddings; Data Mining: Mining Text, Web, Social Media;
【Paper Link】 【Pages】:4375-4381
【Authors】: Shufeng Kong ; Junwen Bai ; Jae Hee Lee ; Di Chen ; Andrew Allyn ; Michelle Stuart ; Malin Pinsky ; Katherine Mills ; Carla Gomes
【Abstract】: A key problem in computational sustainability is to understand the distribution of species across landscapes over time. This question gives rise to challenging large-scale prediction problems since (i) hundreds of species have to be simultaneously modeled and (ii) the survey data are usually inflated with zeros due to the absence of species for a large number of sites. The problem of tackling both issues simultaneously, which we refer to as the zero-inflated multi-target regression problem, has not been addressed by previous methods in statistics and machine learning. In this paper, we propose a novel deep model for the zero-inflated multi-target regression problem. To this end, we first model the joint distribution of multiple response variables as a multivariate probit model and then couple the positive outcomes with a multivariate log-normal distribution. By penalizing the difference between the two distributions’ covariance matrices, a link between both distributions is established. The whole model is cast as an end-to-end learning framework and we provide an efficient learning algorithm for our model that can be fully implemented on GPUs. We show that our model outperforms the existing state-of-the-art baselines on two challenging real-world species distribution datasets concerning bird and fish populations.
【Keywords】: Machine Learning Applications: Applications of Supervised Learning; Machine Learning: Multi-instance;Multi-label;Multi-view learning; Machine Learning: Deep Learning; Machine Learning: Big data; Scalability;
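The link between the two distributions described above — penalizing the difference between their covariance matrices — can be illustrated in miniature. The squared Frobenius norm used here is an assumption for illustration (the abstract only says the difference is penalized), and the matrices are made up; the full probit/log-normal model is not reproduced:

```python
# Illustrative covariance-alignment penalty (not the authors' full model):
# the smaller this value, the more the probit and log-normal components agree.

def frobenius_penalty(cov_a, cov_b):
    """Squared Frobenius norm of the difference of two covariance matrices."""
    return sum((a - b) ** 2
               for row_a, row_b in zip(cov_a, cov_b)
               for a, b in zip(row_a, row_b))

cov_probit = [[1.0, 0.3], [0.3, 1.0]]     # hypothetical probit covariance
cov_lognorm = [[1.0, 0.5], [0.5, 1.0]]    # hypothetical log-normal covariance
print(frobenius_penalty(cov_probit, cov_lognorm))   # 2 * 0.2**2
```

In an end-to-end framework this term would simply be added to the training loss with some weight.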
【Paper Link】 【Pages】:4382-4388
【Authors】: Kehinde Owoeye
【Abstract】: Early forecasting of bird migration patterns has important applications, for example in reducing avian biodiversity loss. An estimated 100 million to 1 billion birds die yearly during migration due to fatal collisions with human-made infrastructure such as buildings, high-tension lines, wind turbines and aircraft, raising a huge concern for conservationists. Building models that can forecast accurate migration patterns is therefore important to enable the optimal management of these critical infrastructures with the aim of reducing biodiversity loss. While previous works have largely focused on forecasting migration intensities and the onset of just one migration state, predicting several migration states at even finer granularity is more useful for optimally managing the infrastructure that causes these deaths. In this work, we consider the task of forecasting migration patterns of the Turkey Vulture (Cathartes aura), collected with the aid of satellite telemetry over multiple years at a resolution of one hour. We use a deep Bidirectional-GRU recurrent neural network augmented with an auxiliary task, where the state information of one layer is used to initialise the other. Empirical results on a variety of experiments show that our approach can accurately forecast migration up to one week in advance, performing better than a variety of baselines.
【Keywords】: Data Mining: Mining Spatial, Temporal Data; Multidisciplinary Topics and Applications: AI for Life Science; Machine Learning: Classification; Machine Learning: Deep Learning: Sequence Modeling;
【Paper Link】 【Pages】:4389-4395
【Authors】: Amulya Yadav ; Roopali Singh ; Nikolas Siapoutis ; Anamika Barman-Adhikari ; Yu Liang
【Abstract】: This paper presents CORTA, a software agent that designs personalized rehabilitation programs for homeless youth suffering from opioid addiction. Many rehabilitation centers treat opioid addiction in homeless youth by prescribing rehabilitation programs that are tailored to the underlying causes of addiction. To date, rehabilitation centers have relied on ad-hoc assessments and unprincipled heuristics to deliver rehabilitation programs to homeless youth suffering from opioid addiction, which greatly undermines the effectiveness of the delivered programs. CORTA addresses these challenges via three novel contributions. First, CORTA utilizes a first-of-its-kind real-world dataset collected from ~1400 homeless youth to build causal inference models which predict the likelihood of opioid addiction among these youth. Second, utilizing counterfactual predictions generated by our causal inference models, CORTA solves novel optimization formulations to assign appropriate rehabilitation programs to the correct set of homeless youth in order to minimize the expected number of homeless youth suffering from opioid addiction. Third, we provide a rigorous experimental analysis of CORTA along different dimensions, e.g., the importance of causal modeling, the importance of optimization, and the impact of incorporating fairness considerations. Our simulation results show that CORTA outperforms baselines by ~110% in minimizing the number of homeless youth suffering from opioid addiction.
【Keywords】: AI Ethics: Societal Impact of AI; Multidisciplinary Topics and Applications: Social Sciences; Multidisciplinary Topics and Applications: Other; Machine Learning Applications: Bio/Medicine;
【Paper Link】 【Pages】:4396-4402
【Authors】: Budhitama Subagdja ; Han Yi Tay ; Ah-Hwee Tan
【Abstract】: Most of today's AI technologies are geared towards mastering performance on specific tasks by learning from huge volumes of data. However, less attention has been given to making AI understand its own purposes or behave in a socially responsible way. In this paper, a new model of agent is presented with the capacity to represent itself as a distinct individual with an identity, a mind of its own, unique experiences, and social lives. In this way, the agent can interact with its surroundings and other agents seamlessly and meaningfully. A practical framework for developing an agent architecture with this model of self and self-awareness is proposed, allowing a self to be ascribed to an existing intelligent agent architecture in order to enable its social ability, interactivity, and co-presence with others. Possible applications are discussed with some exemplifying cases based on an implementation of a conversational agent.
【Keywords】: Humans and AI: Cognitive Modeling; Agent-based and Multi-agent Systems: Human-Agent Interaction; Humans and AI: Human-AI Collaboration; Humans and AI: Personalization and User Modeling;
【Paper Link】 【Pages】:4403-4409
【Authors】: Qingsong Xie ; Shikui Tu ; Guoxing Wang ; Yong Lian ; Lei Xu
【Abstract】: For the problem of early detection of atrial fibrillation (AF) from electrocardiogram (ECG), it is difficult to capture subject-invariant discriminative features from ECG signals, due to the high variation in ECG morphology across subjects and the noise in ECG. In this paper, we propose a Discrete Biorthogonal Wavelet Transform (DBWT) based Convolutional Neural Network (CNN) for AF detection, called DBWT-AFNet for short. In DBWT-AFNet, rather than directly feeding ECG into the CNN, DBWT is used to separate sub-signals in the frequency band of the heart beat from ECG, whose output is fed to the CNN for AF diagnosis. Such sub-signals are better than the raw ECG for subject-invariant CNN representation learning because noisy information irrelevant to the heart beat has been largely filtered out. To strengthen the generalization ability of the CNN to discover subject-invariant patterns in ECG, skip connections are exploited to propagate information well in the neural network and channel attention is designed to adaptively highlight informative channel-wise features. Experiments show that the proposed DBWT-AFNet outperforms state-of-the-art methods, especially for ECG segment classification across different subjects, where no data from testing subjects have been used in training.
【Keywords】: Multidisciplinary Topics and Applications: Biology and Medicine; Machine Learning Applications: Bio/Medicine; Data Mining: Mining Data Streams; Machine Learning: Time-series;Data Streams;
【Paper Link】 【Pages】:4410-4416
【Authors】: Kumar Ayush ; Burak Uzkent ; Marshall Burke ; David B. Lobell ; Stefano Ermon
【Abstract】: Accurate local-level poverty measurement is an essential task for governments and humanitarian organizations to track progress towards improving livelihoods and distribute scarce resources. Recent computer vision advances in using satellite imagery to predict poverty have shown increasing accuracy, but they do not generate features that are interpretable to policymakers, inhibiting adoption by practitioners. Here we demonstrate an interpretable computational framework to accurately predict poverty at a local level by applying object detectors to high resolution (30cm) satellite images. Using the weighted counts of objects as features, we achieve a Pearson's r^2 of 0.539 in predicting village-level poverty in Uganda, a 31% improvement over existing (and less interpretable) benchmarks. Feature importance and ablation analysis reveal intuitive relationships between object counts and poverty predictions. Our results suggest that interpretability does not have to come at the cost of performance, at least in this important domain.
【Keywords】: Machine Learning Applications: Applications of Supervised Learning; Machine Learning: Deep Learning: Convolutional networks; AI Ethics: Societal Impact of AI; Machine Learning: Interpretability;
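The headline metric above, Pearson's r^2 between predicted and surveyed village-level poverty, can be computed as follows; the sample values are invented for illustration:

```python
import math

# Pearson's r^2: squared linear correlation between predictions and targets.

def pearson_r2(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return (cov / (sx * sy)) ** 2

predicted = [0.2, 0.4, 0.5, 0.8]   # hypothetical model outputs
surveyed = [0.1, 0.5, 0.4, 0.9]    # hypothetical ground truth
print(round(pearson_r2(predicted, surveyed), 3))
```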
【Paper Link】 【Pages】:4417-4423
【Authors】: Connor Riley ; Pascal Van Hentenryck ; Enpeng Yuan
【Abstract】: This paper considers the dispatching of large-scale real-time ride-sharing systems to address congestion issues faced by many cities. The goal is to serve all customers (service guarantees) with a small number of vehicles while minimizing waiting times under constraints on ride duration. This paper proposes an end-to-end approach that tightly integrates a state-of-the-art dispatching algorithm, a machine-learning model to predict zone-to-zone demand over time, and a model predictive control optimization to relocate idle vehicles. Experiments using historic taxi trips in New York City indicate that this integration decreases average waiting times by about 30% over all test cases and reaches close to 55% on the largest instances for high-demand zones.
【Keywords】: Constraints and SAT: Constraint Optimization; Machine Learning Applications: Other; Planning and Scheduling: Planning and Scheduling; Uncertainty in AI: Sequential Decision Making;
【Paper Link】 【Pages】:4424-4430
【Authors】: Feng Zhang ; Ningxuan Feng ; Yani Liu ; Cheng Yang ; Jidong Zhai ; Shuhao Zhang ; Bingsheng He ; Jiazao Lin ; Xiaoyong Du
【Abstract】: In big cities, there are plenty of parking spaces, but we often find nowhere to park. For example, New York has 1.4 million cars and 4.4 million on-street parking spaces, but it is still not easy to find a parking place near our destination, especially during peak hours. The reason is the lack of prediction of parking behavior. If we could predict parking behavior in advance, we could ease this parking problem that affects human well-being. We observe that parking lots have periodic parking patterns, which are an important factor for parking behavior prediction. Unfortunately, existing work ignores such periodic parking patterns in parking behavior prediction, and thus incurs low accuracy. To solve this problem, we propose PewLSTM, a novel periodic weather-aware LSTM model that successfully predicts parking behavior based on historical records, weather, environments, and weekdays. PewLSTM has been successfully integrated into a real parking space reservation system, ThsParking, which is one of the top smart parking platforms in China. Based on 452,480 real parking records in 683 days from 10 parking lots, PewLSTM yields 85.3% parking prediction accuracy, which is about 20% higher than the state-of-the-art parking behavior prediction method. The code and data can be obtained from https://github.com/NingxuanFeng/PewLSTM.
【Keywords】: Machine Learning Applications: Applications of Supervised Learning; Machine Learning: Explainable Machine Learning; Machine Learning: Deep Learning; Machine Learning: Deep Learning: Sequence Modeling;
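The periodic-pattern observation behind PewLSTM can be sketched with a deliberately crude baseline (hypothetical, far weaker than the paper's weather-aware LSTM): predict occupancy from the historical mean at the same hour of the week.

```python
from collections import defaultdict

# Periodic baseline: average historical occupancy per hour-of-week bucket.

def periodic_baseline(records):
    """records: list of (hour_of_week, occupied) with hour_of_week in 0..167."""
    totals = defaultdict(lambda: [0, 0])   # hour -> [occupied sum, count]
    for hour, occupied in records:
        totals[hour][0] += occupied
        totals[hour][1] += 1
    return {h: s / n for h, (s, n) in totals.items()}

history = [(9, 1), (9, 1), (9, 0), (17, 0), (17, 0)]   # made-up records
model = periodic_baseline(history)
print(model[9], model[17])   # 2/3 occupied at hour 9, never at hour 17
```

A learned model such as PewLSTM additionally conditions on weather and recent history rather than relying on the periodic average alone.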
【Paper Link】 【Pages】:4431-4437
【Authors】: Mingyang Zhang ; Tong Li ; Yong Li ; Pan Hui
【Abstract】: The increasing amount of urban data enables us to investigate urban dynamics, assist urban planning, and eventually, make our cities more livable and sustainable. In this paper, we focus on learning an embedding space from urban data for urban regions. For the first time, we propose a multi-view joint learning model to learn comprehensive and representative urban region embeddings. We first model different types of region correlations based on both human mobility and inherent region properties. Then, we apply a graph attention mechanism to learn region representations from each view of the built correlations. Moreover, we introduce a joint learning module that boosts the region embedding learning by sharing cross-view information and fuses multi-view embeddings by learning adaptive weights. Finally, we exploit the learned embeddings in the downstream applications of land usage classification and crime prediction in urban areas with real-world data. Extensive experimental results demonstrate that by exploiting our proposed joint learning model, the performance is improved by a large margin on both tasks compared with state-of-the-art methods.
【Keywords】: Data Mining: Mining Spatial, Temporal Data; Data Mining: Applications; Humans and AI: Other; Data Mining: Mining Graphs, Semi Structured Data, Complex Data;
【Paper Link】 【Pages】:4439-4445
【Authors】: Cuneyt Gurcan Akcora ; Yitao Li ; Yulia R. Gel ; Murat Kantarcioglu
【Abstract】: The recent proliferation of cryptocurrencies that allow for pseudo-anonymous transactions has resulted in a spike of various e-crime activities and, in particular, cryptocurrency payments demanded as ransom in hacking attacks that encrypt sensitive user data. Currently, most hackers use Bitcoin for payments, and existing ransomware detection tools depend only on a couple of heuristics and/or tedious data gathering steps. By capitalizing on recent advances in Topological Data Analysis, we propose a novel, efficient and tractable framework to automatically predict new ransomware transactions in a ransomware family, given only limited records of past transactions. Moreover, our new methodology exhibits high utility for detecting the emergence of new ransomware families, that is, detecting ransomware with no past records of transactions.
【Keywords】: AI for banking: AI for cryptocurrencies; AI for regulation: AI for financial crime detection; AI for lending: AI for blockchain;
【Paper Link】 【Pages】:4446-4452
【Authors】: Honglei Guo ; Bang An ; Zhili Guo ; Zhong Su
【Abstract】: Unstructured document compliance checking is always a big challenge for banks, since huge volumes of contracts and regulations written in natural language require professionals' interpretation and judgment. Traditional rule-based or keyword-based methods cannot precisely characterize the deep semantic distribution in unstructured document semantic compliance checking due to the semantic complexity of contracts and regulations. Deep Semantic Compliance Advisor (DSCA) is an unstructured document compliance checking platform which provides multi-level semantic comparison using deep learning algorithms. For statement-level semantic comparison, a Graph Neural Network (GNN) based syntactic sentence encoder is proposed to capture the intricate syntactic and semantic clues of the statement sentences. This GNN-based encoder outperforms existing syntactic sentence encoders in deep semantic comparison and is more beneficial for long sentences. For clause-level semantic comparison, an attention-based semantic relatedness detection model is applied to find the most relevant legal clauses. DSCA significantly enhances the productivity of legal professionals in unstructured document compliance checking for banks.
【Keywords】: AI for regulation: AI for corporate governance and regulation; AI for regulation: AI for international regulation; AI for regulation: General;
【Paper Link】 【Pages】:4453-4460
【Authors】: Lu Bai ; Lixin Cui ; Yue Wang ; Yuhang Jiao ; Edwin R. Hancock
【Abstract】: Network representations are powerful tools for the analysis of time-varying financial complex systems consisting of multiple co-evolving financial time series, e.g., stock prices. In this work, we develop a new kernel-based similarity measure between dynamic time-varying financial networks. Our idea is to transform each original financial network into a quantum-based entropy time series and compute the similarity measure based on the classical dynamic time warping framework associated with the entropy time series. The proposed method bridges the gap between graph kernels and the classical dynamic time warping framework for multiple financial time series analysis. Experiments on time-varying networks abstracted from financial time series in the New York Stock Exchange (NYSE) database demonstrate that our approach can effectively discriminate the abrupt structural changes in terms of extreme financial events.
【Keywords】: Foundation for AI in FinTech: Analyzing big financial data; Foundation for AI in FinTech: Analyzing high-dimensional, sequential and evolving financial data;
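The classical dynamic time warping (DTW) framework that the similarity measure above builds on can be implemented in a few lines; the quantum-entropy transform of the networks is not reproduced here, so the inputs are just plain numeric series:

```python
# Minimal dynamic time warping distance between two time series, via the
# standard O(n*m) dynamic program with absolute-difference local cost.

def dtw(a, b):
    inf = float("inf")
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j],      # stretch a
                                 d[i][j - 1],      # stretch b
                                 d[i - 1][j - 1])  # advance both
    return d[n][m]

print(dtw([1, 2, 3], [1, 2, 2, 3]))   # 0.0: the repeated 2 aligns for free
```

This elasticity to local time shifts is what makes DTW suitable for comparing entropy series of networks that evolve at slightly different speeds.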
【Paper Link】 【Pages】:4461-4468
【Authors】: Yueyang Zhong ; YeeMan Bergstrom ; Amy Ward
【Abstract】: This paper studies when a market-making firm should place orders to maximize its expected net profit, while also constraining risk, assuming orders are maintained on an electronic limit order book (LOB). To do this, we use a model-free and off-policy method, Q-learning, coupled with state aggregation, to develop a trading strategy that can be implemented using a simple lookup table. Our main training dataset is derived from event-by-event data recording the state of the LOB. Our proposed trading strategy has passed both in-sample and out-of-sample testing in the backtester of the market-making firm with whom we are collaborating, and it also outperforms other benchmark strategies. As a result, the firm desires to put the strategy into production.
【Keywords】: AI for trading: AI for algorithmic trading; AI for trading: AI for strategic trading and strategy design; AI for trading: AI for trading incentive and strategy optimization;
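Tabular Q-learning with state aggregation, the core of the lookup-table strategy above, can be sketched on a toy problem; the inventory-control environment and reward below are invented for illustration and are not the firm's LOB data:

```python
import random

# Toy tabular Q-learning: the table q maps (aggregated state, action) to a
# value estimate, exactly the "simple lookup table" form described above.

def train(episodes=500, alpha=0.1, gamma=0.9, eps=0.2, seed=0):
    rng = random.Random(seed)
    actions = (-1, 0, 1)            # sell one unit, hold, buy one unit
    q = {}                          # (aggregated state, action) -> value
    for _ in range(episodes):
        state = 0                   # e.g. a bucketed inventory level
        for _ in range(10):
            if rng.random() < eps:  # epsilon-greedy exploration
                action = rng.choice(actions)
            else:
                action = max(actions, key=lambda a: q.get((state, a), 0.0))
            next_state = max(-2, min(2, state + action))
            reward = -abs(next_state)          # penalize inventory risk
            best_next = max(q.get((next_state, a), 0.0) for a in actions)
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = next_state
    return q

q = train()
best = max((-1, 0, 1), key=lambda a: q.get((0, a), 0.0))
print(best)   # the learned policy holds at zero inventory
```

State aggregation enters through the choice of `state`: many raw LOB configurations map to one coarse bucket, keeping the table small enough to deploy.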
【Paper Link】 【Pages】:4469-4475
【Authors】: Xia Cai
【Abstract】: Aiming to improve the performance of existing reversion based online portfolio selection strategies, we propose a novel multi-period strategy named “Vector Autoregressive Weighting Reversion” (VAWR). First, the vector autoregressive moving-average algorithm used in time series prediction is adapted to explore the dynamic relationships between different assets for more accurate price prediction. Second, we design a modified online passive aggressive technique and advance a scheme that weighs investment risk against cumulative experience to update the closed-form portfolio. Theoretical analysis and experimental results confirm the effectiveness and robustness of our strategy. Compared with the state-of-the-art strategies, VAWR greatly increases cumulative wealth, and it obtains the highest annualized percentage yield and Sharpe ratio on various public datasets. These improvements and its easy implementation support practical applications of VAWR.
【Keywords】: AI for trading: AI for portfolio analytics; AI for trading: AI for algorithmic trading; AI for trading: AI for strategic trading and strategy design; AI for trading: General;
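A hedged sketch of an online passive-aggressive, mean-reversion-style portfolio update in the spirit of the technique above; the closed form below, and the crude clipping standing in for a proper simplex projection, are simplifications for illustration, not the authors' exact VAWR scheme:

```python
# Passive-aggressive portfolio update: if the portfolio's expected price
# relative already exceeds eps, do nothing (passive); otherwise shift weight
# aggressively toward assets with higher predicted relatives.

def pa_update(weights, predicted_relatives, eps=1.1):
    n = len(weights)
    mean = sum(predicted_relatives) / n
    centered = [x - mean for x in predicted_relatives]
    norm_sq = sum(c * c for c in centered)
    expected = sum(w * x for w, x in zip(weights, predicted_relatives))
    tau = max(0.0, (eps - expected) / norm_sq) if norm_sq > 0 else 0.0
    raw = [w + tau * c for w, c in zip(weights, centered)]
    clipped = [max(0.0, w) for w in raw]    # crude stand-in for a proper
    total = sum(clipped)                    # projection onto the simplex
    return [w / total for w in clipped]

w = pa_update([0.5, 0.5], [0.9, 1.2])
print(w)   # weight moves toward the asset with the higher predicted relative
```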
【Paper Link】 【Pages】:4476-4482
【Authors】: Di Chen ; Yada Zhu ; Xiaodong Cui ; Carla P. Gomes
【Abstract】: Real-world applications often involve domain-specific and task-based performance objectives that are not captured by the standard machine learning losses, but are critical for decision making. A key challenge for direct integration of more meaningful domain and task-based evaluation criteria into an end-to-end gradient-based training process is the fact that often such performance objectives are not necessarily differentiable and may even require additional decision-making optimization processing. We propose the Task-Oriented Prediction Network (TOPNet), an end-to-end learning scheme that automatically integrates task-based evaluation criteria into the learning process via a learnable surrogate loss function, which directly guides the model towards the task-based goal. A major benefit of the proposed TOPNet learning scheme lies in its capability of automatically integrating non-differentiable evaluation criteria, which makes it particularly suitable for diversified and customized task-based evaluation criteria in real-world tasks. We validate the performance of TOPNet on two real-world financial prediction tasks, revenue surprise forecasting and credit risk modeling. The experimental results demonstrate that TOPNet significantly outperforms both traditional modeling with standard losses and modeling with hand-crafted heuristic differentiable surrogate losses.
【Keywords】: Foundation for AI in FinTech: Computational intelligence for FinTech; Foundation for AI in FinTech: General;
【Paper Link】 【Pages】:4483-4489
【Authors】: Dawei Cheng ; Xiaoyang Wang ; Ying Zhang ; Liqing Zhang
【Abstract】: A guaranteed loan is a debt obligation with the promise that, if one corporation gets trapped in risks, its guarantors will back the loan. As more and more companies become involved, they form complex networks. Detecting and predicting risk guarantees in these networked loans is important for the loan issuer. Therefore, in this paper, we propose a dynamic graph-based attention neural network for risk guarantee relationship prediction (DGANN). In particular, each guarantee is represented as an edge in dynamic loan networks, while companies are denoted as nodes. We present an attention-based graph neural network to encode the edges that preserves the financial status as well as the network structures. The experimental results show that DGANN significantly improves risk prediction accuracy in both precision and recall compared with state-of-the-art baselines. We also conduct empirical studies to uncover risk guarantee patterns from the learned attentional network features. The results provide an alternative way for loan risk management, which may inspire more work in the future.
【Keywords】: Foundation for AI in FinTech: Data mining and knowledge discovery for FinTech; AI for risk and security: AI for financial network risk; AI for banking: AI for credit loan;
【Paper Link】 【Pages】:4490-4496
【Authors】: Xin Liang ; Dawei Cheng ; Fangzhou Yang ; Yifeng Luo ; Weining Qian ; Aoying Zhou
【Abstract】: The share prices of listed companies in the stock trading market are prone to be influenced by various events. Performing event detection could help people to timely identify investment risks and opportunities accompanying these events. Financial events inherently present hierarchical structures, which can be represented as tree-structured schemes in real-life applications, and detecting events can be modeled as a hierarchical multi-label text classification problem, where an event is designated to a tree node with a sequence of hierarchical event category labels. Conventional hierarchical multi-label text classification methods usually ignore the hierarchical relationships existing in the event classification scheme, and treat the hierarchical labels associated with an event as uniform labels, where correct or wrong label predictions are assigned equal rewards or penalties. In this paper, we propose a neural hierarchical multi-label text classification method, namely F-HMTC, for a financial application scenario with massive event category labels. F-HMTC learns latent features based on bidirectional encoder representations from transformers, and directly maps them to hierarchical labels with a delicate hierarchy-based loss layer. We conduct extensive experiments on a private financial dataset with elaborately-annotated labels, and F-HMTC consistently outperforms state-of-the-art baselines by substantial margins. We will release both the source code and dataset on the first author's repository.
【Keywords】: Foundation for AI in FinTech: Data mining and knowledge discovery for FinTech; Foundation for AI in FinTech: Deep learning and representation for FinTech;
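One way a hierarchy-based loss can reward partially correct label paths, as motivated above, is sketched below; this penalty is a hypothetical illustration (shared-prefix counting on example labels invented here), not the paper's actual loss layer:

```python
# Hierarchy-aware penalty: an error deep in the label tree, where the
# predicted path shares ancestors with the truth, costs less than an
# error at the root, unlike a uniform multi-label loss.

def hierarchical_penalty(true_path, pred_path):
    """Paths are label sequences from the root, e.g. ('finance', 'M&A', 'bid')."""
    shared = 0
    for t, p in zip(true_path, pred_path):
        if t != p:
            break
        shared += 1
    # a full miss costs len(true_path); each shared ancestor forgives one unit
    return len(true_path) - shared

print(hierarchical_penalty(("finance", "M&A", "bid"),
                           ("finance", "M&A", "rumor")))   # 1: wrong leaf only
print(hierarchical_penalty(("finance", "M&A", "bid"),
                           ("sports", "M&A", "bid")))      # 3: wrong from root
```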
【Paper Link】 【Pages】:4497-4505
【Authors】: Yimu Ji ; Weiheng Gu ; Fei Chen ; Xiaoying Xiao ; Jing Sun ; Shangdong Liu ; Jing He ; Yunyao Li ; Kaixiang Zhang ; Fen Mei ; Fei Wu
【Abstract】: The traditional blockchain has the shortcoming that a single chain can only deal with one or a few specific data types. The research question of how to enable blockchain to deal with various data types has not been well studied. In this paper, we propose a single-chain based extension model of blockchain for fintech (SEBF). For the financial environment, we design a four-layer architecture for this model. By employing an external trusted oracle group and a financial regulator agency, a variety of data types can be effectively stored in the blockchain, such that data type extension based on a single chain is realized. The experimental results indicate that the proposed model can improve the efficiency of simplified payment verification.
【Keywords】: AI for lending: AI for blockchain; AI for banking: General;
【Paper Link】 【Pages】:4506-4512
【Authors】: Weili Chen ; Xiongfeng Guo ; Zhiguang Chen ; Zibin Zheng ; Yutong Lu
【Abstract】: In recent years, blockchain technology has created a new cryptocurrency world and has attracted a lot of attention. It is also rampant with various scams. For example, phishing scams have grabbed a lot of money and have become an important threat to users' financial security in the blockchain ecosystem. To help deal with this issue, this paper proposes a systematic approach to detect phishing accounts based on blockchain transactions, and takes Ethereum as an example to verify its effectiveness. Specifically, we propose a graph-based cascade feature extraction method based on transaction records and a lightGBM-based Dual-sampling Ensemble algorithm to build the identification model. Extensive experiments show that the proposed algorithm can effectively identify phishing scams.
【Keywords】: AI for risk and security: AI for financial security; AI for banking: AI for cryptocurrencies; AI for banking: AI for digital currencies;
【Paper Link】 【Pages】:4513-4519
【Authors】: Zhuang Liu ; Degen Huang ; Kaiyu Huang ; Zhuang Li ; Jun Zhao
【Abstract】: There is growing interest in financial text mining tasks. Over the past few years, progress in Natural Language Processing (NLP) based on deep learning has advanced rapidly, and deep learning has shown promising results on financial text mining models. However, as NLP models require large amounts of labeled training data, applying deep learning to financial text mining is often unsuccessful due to the lack of labeled training data in financial fields. To address this issue, we present FinBERT (BERT for Financial Text Mining), a domain-specific language model pre-trained on large-scale financial corpora. In FinBERT, different from BERT, we construct six pre-training tasks covering more knowledge, simultaneously trained on general corpora and financial domain corpora, which enables the FinBERT model to better capture language knowledge and semantic information. The results show that our FinBERT outperforms all current state-of-the-art models. Extensive experimental results demonstrate the effectiveness and robustness of FinBERT. The source code and pre-trained models of FinBERT are available online.
【Keywords】: Foundation for AI in FinTech: Data mining and knowledge discovery for FinTech; Foundation for AI in FinTech: Deep learning and representation for FinTech; Foundation for AI in FinTech: General; Foundation for AI in FinTech: Analyzing big financial data; AI for lending: General; AI for marketing: General; AI for marketing: AI for consumer sentiment analysis; AI for payment: AI for payment risk modeling; Other areas: Financial decision-support system;
【Paper Link】 【Pages】:4520-4526
【Authors】: Jinho Lee ; Raehyun Kim ; Seok-Won Yi ; Jaewoo Kang
【Abstract】: Generating an investment strategy using advanced deep learning methods in stock markets has recently been a topic of interest. Most existing deep learning methods focus on proposing an optimal model or network architecture by maximizing return. However, these models often fail to consider and adapt to the continuously changing market conditions. In this paper, we propose the Multi-Agent reinforcement learning-based Portfolio management System (MAPS). MAPS is a cooperative system in which each agent is an independent "investor" creating its own portfolio. In the training procedure, each agent is guided to act as diversely as possible while maximizing its own return with a carefully designed loss function. As a result, MAPS as a system ends up with a diversified portfolio. Experimental results with 12 years of US market data show that MAPS outperforms most of the baselines in terms of Sharpe ratio. Furthermore, our results show that adding more agents to our system allows us to achieve a higher Sharpe ratio by lowering risk through a more diversified portfolio.
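The evaluation metric above, the Sharpe ratio, and the diversification effect of combining agents can be made concrete with a small sketch. The equal-weight combination of agent portfolios and the annualization convention are illustrative assumptions, not details from the paper.

```python
import numpy as np

def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio of a series of periodic returns.
    252 trading days per year is the usual convention for daily US data."""
    excess = np.asarray(returns) - risk_free
    return np.sqrt(periods_per_year) * excess.mean() / excess.std(ddof=1)

def combined_portfolio(agent_returns):
    """Equal-weight combination of per-agent return series (an assumed
    aggregation rule, for illustrating the diversification effect)."""
    return np.mean(agent_returns, axis=0)
```

When agents behave diversely, their return series are weakly or negatively correlated, so the combined series keeps the mean while shrinking the standard deviation, raising the Sharpe ratio.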
【Keywords】: AI for trading: AI for novel financial models; AI for trading: AI for portfolio analytics; AI for trading: AI for predictive trading;
【Paper Link】 【Pages】:4527-4533
【Authors】: Xin Huang ; Duan Li
【Abstract】: Traditional modeling of mean-variance portfolio selection often assumes full knowledge of the statistics of assets' returns. This is, however, not always the case in real financial markets. This paper deals with an ambiguous mean-variance portfolio selection problem with a mixture model on the returns of risky assets, where the proportions of the different component distributions are assumed to be unknown to the investor, but constant at any time instant. Taking into consideration the updates of the proportions from future observations is essential for finding an optimal policy with an active learning feature, but makes the problem intractable with classical methods. Using reinforcement learning, we derive an investment policy with a learning feature in a two-level framework. At the lower level, a time-decomposed approach (dynamic programming) is adopted to solve a family of scenario subcases, in each of which the series of component distributions along multiple time periods is specified. At the upper level, a scenario-decomposed approach (the progressive hedging algorithm) is applied to iteratively aggregate the scenario solutions from the lower level based on the current knowledge of the proportions, and this two-level solution framework is repeated in a rolling-horizon manner. We carry out experimental studies to illustrate the execution of our policy scheme.
【Keywords】: AI for trading: AI for portfolio analytics;
【Paper Link】 【Pages】:4534-4540
【Authors】: Ruocheng Guo ; Jundong Li ; Yichuan Li ; K. Selçuk Candan ; Adrienne Raglin ; Huan Liu
【Abstract】: Networked observational data presents new opportunities for learning individual causal effects, which play an indispensable role in decision making. Such data poses the challenge of confounding bias. Previous work presents two desiderata for handling confounding bias. On the treatment-group level, we aim to balance the distributions of confounder representations. On the individual level, it is desirable to capture patterns of hidden confounders that predict treatment assignments. Existing methods show the potential of utilizing network information to handle confounding bias, but they only try to satisfy one of the two desiderata, because the two seem to contradict each other: when the two distributions of confounder representations overlap heavily, the treated and the controlled become hard to discriminate. In this work, we formulate the two desiderata as a minimax game. We propose IGNITE, which learns representations of confounders from networked observational data and is trained via a minimax game to achieve the two desiderata. Experiments verify the efficacy of IGNITE on two datasets under various settings.
【Keywords】: Foundation for AI in FinTech: Data mining and knowledge discovery for FinTech; AI for marketing: AI for econometrics;
【Paper Link】 【Pages】:4541-4547
【Authors】: Wei Li ; Ruihan Bao ; Keiko Harimoto ; Deli Chen ; Jingjing Xu ; Qi Su
【Abstract】: Stock movement prediction is a hot topic in the FinTech area. Previous works usually predict the price movement on a daily basis, although the market impact of news can be absorbed much sooner, and the exact time is hard to estimate. In this work, we propose a more practical objective: predicting the overnight stock movement between the previous close price and the open price. As no trading operation occurs after market close, the market impact of overnight news is reflected in the overnight movement. One big obstacle for such a task is the lack of data; in this work, we collect and publish a dataset of overnight stock price movements built from Reuters financial news. Another challenge is that stocks in the market are not independent, a fact omitted by previous works. To make use of the connections among stocks, we propose an LSTM Relational Graph Convolutional Network (LSTM-RGCN) model, which models the connections among stocks with their correlation matrix. Extensive experimental results show that our model outperforms the baseline models. Further analysis shows that the introduction of the graph enables our model to predict the movement of stocks that are not directly associated with news, as well as the whole market, which is not possible with most previous methods.
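The abstract models inter-stock connections with a correlation matrix. One hedged way to turn such a matrix into the normalized adjacency a graph convolution layer consumes is sketched below; the absolute-correlation threshold and the symmetric normalization are illustrative conventions, not necessarily the paper's.

```python
import numpy as np

def correlation_adjacency(returns, threshold=0.5):
    """Build a normalized adjacency matrix from a (T, N) matrix of stock
    returns: connect stocks whose absolute return correlation exceeds
    the threshold. Self-loops are always present (|corr(x, x)| = 1),
    so every degree is at least 1 and the normalization is well defined."""
    corr = np.corrcoef(returns, rowvar=False)       # (N, N) correlations
    adj = (np.abs(corr) > threshold).astype(float)  # binary edges
    # symmetric normalization D^{-1/2} A D^{-1/2}, standard for GCN layers
    deg = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
```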
【Keywords】: AI for trading: AI for predictive trading; AI for trading: General;
【Paper Link】 【Pages】:4548-4554
【Authors】: Siyu Lin ; Peter A. Beling
【Abstract】: In this article, we propose an end-to-end adaptive framework for optimal trade execution based on Proximal Policy Optimization (PPO). We use two methods to account for the time dependencies in the market data, based on two different neural network architectures: 1) long short-term memory (LSTM) networks, and 2) fully-connected networks (FCN) that stack the most recent limit order book (LOB) information as model inputs. The proposed framework can make trade execution decisions directly from level-2 LOB information such as bid/ask prices and volumes, without the manually designed attributes used in previous research. Furthermore, we use a sparse reward function, which gives the agent a reward signal at the end of each episode indicating its relative performance against the baseline model, rather than implementation shortfall (IS) or a shaped reward function. The experimental results demonstrate advantages over IS and the shaped reward function in terms of performance and simplicity. The proposed framework outperforms commonly used industry baseline models such as TWAP, VWAP, and AC, as well as several Deep Reinforcement Learning (DRL) models, on most of the 14 US equities in our experiments.
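Of the baselines named above, TWAP is the simplest to state: split the parent order into equal child orders at evenly spaced times over the execution horizon. A minimal sketch (the remainder-handling rule is an illustrative choice):

```python
def twap_schedule(total_shares, n_slices):
    """Time-Weighted Average Price baseline: trade equal-sized child
    orders at evenly spaced times. Any remainder shares go to the
    earliest slices so the schedule sums exactly to the parent order."""
    base, rem = divmod(total_shares, n_slices)
    return [base + (1 if i < rem else 0) for i in range(n_slices)]
```

Because TWAP ignores price and volume signals entirely, it is the natural floor that learned execution policies are expected to beat.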
【Keywords】: Foundation for AI in FinTech: Reinforcement learning for FinTech; AI for trading: AI for algorithmic trading;
【Paper Link】 【Pages】:4555-4561
【Authors】: Guang Liu ; Yuzhao Mao ; Qi Sun ; Hailong Huang ; Weiguo Gao ; Xuan Li ; Jianping Shen ; Ruifan Li ; Xiaojie Wang
【Abstract】: Stock Trend Prediction (STP) has drawn wide attention from various fields, especially Artificial Intelligence. Most previous studies are single-scale oriented, which results in information loss from a multi-scale perspective. In fact, multi-scale behavior is vital for making intelligent investment decisions: a mature investor will thoroughly investigate the state of a stock market at various time scales. To automatically learn the multi-scale information in stock data, we propose a Multi-scale Two-way Deep Neural Network. It learns multi-scale patterns from two types of scale information, wavelet-based and downsampling-based, using eXtreme Gradient Boosting and a Recurrent Convolutional Neural Network, respectively. After combining the patterns learned by the two branches, our model achieves state-of-the-art performance on FI-2010 and CSI-2016, where the latter is our published long-range stock dataset intended to help future studies of the STP task. Extensive experimental results on the two datasets indicate that multi-scale information can significantly improve STP performance and that our model is superior in capturing such information.
【Keywords】: Foundation for AI in FinTech: Analyzing big financial data; Foundation for AI in FinTech: Data mining and knowledge discovery for FinTech; Foundation for AI in FinTech: Modeling financial market microstructure; AI for trading: AI for novel financial models; AI for risk and security: AI for market movement and change analysis; Other areas: Financial decision-support system; Foundation for AI in FinTech: General;
【Paper Link】 【Pages】:4562-4568
【Authors】: Kei Nakagawa ; Shuhei Noma ; Masaya Abe
【Abstract】: The problem of finding the optimal portfolio for investors is called the portfolio optimization problem. Such a problem mainly concerns the expectation and variability of return (i.e., mean and variance). Although the variance would be the most fundamental risk measure to be minimized, it has several drawbacks. Conditional Value-at-Risk (CVaR) is a relatively new risk measure that addresses some of the shortcomings of well-known variance-related risk measures, and it has gained popularity because of its computational efficiency. CVaR is defined as the expected value of the loss that occurs beyond a certain probability level (β). However, portfolio optimization problems that use CVaR as a risk measure are formulated with a single β and may output significantly different portfolios depending on how β is selected; we confirm that even small changes in β can result in huge changes in the overall portfolio structure. To address this problem, we propose RM-CVaR: Regularized Multiple β-CVaR Portfolio. We perform experiments on well-known benchmarks to evaluate the proposed portfolio. Compared with various portfolios, RM-CVaR demonstrates superior performance, achieving both higher risk-adjusted returns and lower maximum drawdown.
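The CVaR definition quoted above (the expected loss beyond probability level β) can be estimated empirically from sample losses. A minimal sketch, using numpy's default quantile interpolation as an illustrative convention:

```python
import numpy as np

def empirical_cvar(losses, beta=0.95):
    """Empirical CVaR at level beta: the mean of the losses at or above
    the beta-quantile (the Value-at-Risk). Convention: larger 'loss'
    values are worse."""
    losses = np.sort(np.asarray(losses, dtype=float))
    var = np.quantile(losses, beta)   # Value-at-Risk at level beta
    tail = losses[losses >= var]      # the worst (1 - beta) tail
    return tail.mean()
```

Evaluating this at several β values on the same loss sample shows directly the sensitivity the paper highlights: the estimate moves through very different parts of the tail as β changes.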
【Keywords】: AI for trading: AI for portfolio analytics; AI for risk and security: AI for financial risk analytics; AI for risk and security: AI for institutional risk modeling; AI for wealth: AI for roboadvising; Other areas: Financial decision-support system; AI for trading: General;
【Paper Link】 【Pages】:4576-4582
【Authors】: Zhen Ye ; Yu Qin ; Wei Xu
【Abstract】: Financial risk is an essential indicator for investment, which can help investors understand the market and companies better. Among the many factors influencing financial risk, researchers find the earnings conference call to be the most significant one. Predicting financial volatility after the earnings conference call is critical to beneficiaries, including investors and company managers. However, previous work mainly focuses on feature extraction at the word or document level; the vital structure of conferences, the alternating dialogue, is ignored. In this paper, we introduce our Multi-Round Q&A Attention Network, which takes the dialogue form into account in the first place. Based on data from earnings call transcripts, we apply our model to extract features of each round of dialogue through a bidirectional attention mechanism and predict the volatility after earnings conference call events. The results show that our model significantly outperforms the previous state-of-the-art methods and other baselines over three different periods.
【Keywords】: Foundation for AI in FinTech: Analyzing big financial data; AI for risk and security: AI for financial risk analytics; AI for risk and security: AI for financial risk factors and prediction;
【Paper Link】 【Pages】:4583-4589
【Authors】: Lorenzo Bisi ; Luca Sabbioni ; Edoardo Vittori ; Matteo Papini ; Marcello Restelli
【Abstract】: The use of reinforcement learning in algorithmic trading is of growing interest, since it offers the opportunity of making a profit through the development of autonomous artificial traders that do not depend on hard-coded rules. In such a framework, keeping uncertainty under control is as important as maximizing expected returns. Risk aversion has been addressed in reinforcement learning through measures related to the distribution of returns. However, in trading it is essential to keep the risk of portfolio positions under control at the intermediate steps. In this paper, we define a novel measure of risk, which we call reward volatility, consisting of the variance of the rewards under the state-occupancy measure. This new risk measure is shown to bound the return variance, so that reducing the former also constrains the latter. We derive a policy gradient theorem with a new objective function that exploits the mean-volatility relationship. Furthermore, we adapt TRPO, the well-known policy gradient algorithm with monotonic improvement guarantees, in a risk-averse manner. Finally, we test the proposed approach in two financial environments using real market data.
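The distinction the abstract draws, variance of per-step rewards versus variance of episodic returns, can be illustrated on sampled trajectories. This is a hedged empirical sketch (pooling rewards across trajectories as a stand-in for the state-occupancy measure), not the paper's estimator; the final inequality shows the kind of bound claimed, for fixed-horizon episodes.

```python
import numpy as np

def reward_volatility(trajectories):
    """Variance of single-step rewards pooled across all trajectories,
    an empirical stand-in for variance under the state-occupancy measure."""
    all_rewards = np.concatenate([np.asarray(t, dtype=float) for t in trajectories])
    return all_rewards.var()

def return_variance(trajectories):
    """Variance of (undiscounted) episodic returns."""
    returns = np.array([np.sum(t) for t in trajectories])
    return returns.var()
```

A trajectory whose rewards oscillate wildly but cancel out has high reward volatility and zero return variance; the converse cannot happen, which is why constraining reward volatility also constrains the return variance.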
【Keywords】: Foundation for AI in FinTech: Reinforcement learning for FinTech; AI for trading: AI for algorithmic trading;
【Paper Link】 【Pages】:4590-4596
【Authors】: Thomas Spooner ; Rahul Savani
【Abstract】: We show that adversarial reinforcement learning (ARL) can be used to produce market making agents that are robust to adversarial and adaptively-chosen market conditions. To apply ARL, we turn the well-studied single-agent model of Avellaneda and Stoikov [2008] into a discrete-time zero-sum game between a market maker and an adversary. The adversary acts as a proxy for other market participants that would like to profit at the market maker's expense. We empirically compare two conventional single-agent RL agents with ARL, and show that our ARL approach leads to: 1) the emergence of risk-averse behaviour without constraints or domain-specific penalties; 2) significant improvements in performance across a set of standard metrics, evaluated with or without an adversary in the test environment; and 3) improved robustness to model uncertainty. We empirically demonstrate that our ARL method consistently converges, and we prove for several special cases that the profiles we converge to correspond to Nash equilibria in a simplified single-stage game.
【Keywords】: Foundation for AI in FinTech: Reinforcement learning for FinTech; Foundation for AI in FinTech: Computational intelligence for FinTech; AI for trading: AI for algorithmic trading; AI for trading: AI for high frequency (cross-market) trading; AI for trading: AI for strategic trading and strategy design; AI for trading: AI for trading incentive and strategy optimization;
【Paper Link】 【Pages】:4597-4603
【Authors】: Xuan-Hong Dang ; Syed Yousaf Shah ; Petros Zerfos
【Abstract】: Multimodal analysis that incorporates time series and textual corpora as input data sources is becoming a promising approach, especially in the financial industry. However, the main focus of such analysis has been on achieving high prediction accuracy rather than on understanding the association between the two data modalities. In this work, we address the important problem of automatically discovering a small set of top news articles associated with a given time series. Towards this goal, we propose a novel multi-modal neural model called MSIN that jointly learns from both the numerical time series and the categorical text articles in order to unearth the correlation between them. Through multiple steps of data interrelation between the two data modalities, MSIN learns to focus on a small subset of text articles that best align with the current performance of the time series. This succinct set is discovered in a timely manner and presented as recommended documents for the given time series, making MSIN an automated information-filtering system. We empirically evaluate its performance on discovering the daily top relevant news articles, collected from Thomson Reuters, for two given stock time series, AAPL and GOOG, over a period of seven consecutive years. The experimental results demonstrate that MSIN achieves up to 84.9% and 87.2% recall of the ground-truth articles, respectively, superior to SOTA algorithms that rely on conventional attention mechanisms in deep learning.
【Keywords】: Foundation for AI in FinTech: Data mining and knowledge discovery for FinTech; Foundation for AI in FinTech: Intelligent financial recommendation; AI for trading: AI for novel financial models;
【Paper Link】 【Pages】:4604-4610
【Authors】: Naman Goel ; Cyril van Schreven ; Aris Filos-Ratsikas ; Boi Faltings
【Abstract】: Blockchain-based systems allow various kinds of financial transactions to be executed in a decentralized manner. However, these systems often rely on a trusted third party (an oracle) to get correct information about the real-world events that trigger the financial transactions. In this paper, we identify the two biggest challenges in building decentralized, trustless and transparent oracles. The first challenge is acquiring correct information about real-world events without relying on a trusted information provider. We show how a peer-consistency incentive mechanism can be used to acquire truthful information from an untrusted and self-interested crowd, even when the crowd has outside incentives to provide wrong information. The second is a system design and implementation challenge. For the first time, we show how to implement a trustless and transparent oracle in Ethereum. We discuss various non-trivial issues that arise in implementing peer-consistency mechanisms in Ethereum, suggest several optimizations to reduce gas cost, and provide an empirical analysis.
【Keywords】: Foundation for AI in FinTech: AI for financial infrastructure; Foundation for AI in FinTech: Modeling economic incentives; Foundation for AI in FinTech: Modeling economic mechanisms and social welfare; AI for lending: AI for blockchain; AI for lending: AI for smart contracts; Foundation for AI in FinTech: General;
【Paper Link】 【Pages】:4611-4618
【Authors】: Cheng Wang
【Abstract】: It is usually taken for granted that the occurrence of unauthorized behaviors is necessary for fraud detection in online payment services. However, we seek to break this stereotype in this work. We strive to design an ex-ante anti-fraud method that can work before unauthorized behaviors occur. The feasibility of our solution is supported by the combination of a characteristic and a finding in online payment fraud scenarios. The well-recognized characteristic is that online payment frauds are mostly caused by account compromise. Our finding is that account theft is indeed predictable based on users' high-risk behaviors, without relying on the behaviors of thieves. Accordingly, we propose an account risk prediction scheme to realize ex-ante fraud detection. It takes in an account's historical transaction sequence and outputs its risk score. The risk score is then used as early evidence of whether a new transaction is fraudulent, before the occurrence of that transaction. We examine our method on a real-world B2C transaction dataset from a commercial bank. Experimental results show that the ex-ante detection method can prevent more than 80% of the fraudulent transactions before they actually occur. When the proposed method is combined with interim detection to form a real-time anti-fraud system, it can detect more than 94% of fraudulent transactions while maintaining a very low false alarm rate (less than 0.1%).
【Keywords】: AI for banking: AI for banking risk and fraud modeling; AI for payment: AI for payment risk modeling; AI for payment: AI for payment security;
【Paper Link】 【Pages】:4619-4625
【Authors】: Chi Seng Pun ; Lei Wang ; Hoi Ying Wong
【Abstract】: Modern-day trading practice resembles a thought experiment, where investors imagine various possibilities of the future stock market and invest accordingly. Generative adversarial networks (GANs) are highly relevant to this trading practice in two ways. First, a GAN generates synthetic data by a neural network that is technically indistinguishable from reality, which guarantees the reasonableness of the experiment. Second, a GAN generates multitudes of fake data, which implements half of the experiment. In this paper, we present a new GAN architecture and adapt it to the portfolio risk minimization problem by adding a regression network to the GAN (implementing the second half of the experiment). The new architecture is termed GANr. Battling against two distinct networks, a discriminator and a regressor, GANr's generator aims to simulate a stock market that is close to reality while allowing for all possible scenarios. The resulting portfolio resembles a robust portfolio with data-driven ambiguity. Our empirical studies show that the GANr portfolio is more resilient to bleak financial scenarios than CLSGAN and LASSO portfolios.
【Keywords】: AI for trading: AI for portfolio analytics; AI for trading: AI for novel financial models; Foundation for AI in FinTech: Analyzing highdimentional, sequential and evolving financial data; Other areas: Financial decision-support system; AI for risk and security: AI for financial risk analytics;
【Paper Link】 【Pages】:4626-4632
【Authors】: Xintong Wang ; Michael P. Wellman
【Abstract】: We propose an adversarial learning framework to capture the evolving game between a regulator who develops tools to detect market manipulation and a manipulator who obfuscates actions to evade detection. The model includes three main parts: (1) a generator that learns to adapt original manipulation order streams to resemble trading patterns of a normal trader while preserving the manipulation intent; (2) a discriminator that differentiates the adversarially adapted manipulation order streams from normal trading activities; and (3) an agent-based simulator that evaluates the manipulation effect of adapted outputs. We conduct experiments on simulated order streams associated with a manipulator and a market-making agent respectively. We show examples of adapted manipulation order streams that mimic a specified market maker's quoting patterns and appear qualitatively different from the original manipulation strategy we implemented in the simulator. These results demonstrate the possibility of automatically generating a diverse set of (unseen) manipulation strategies that can facilitate the training of more robust detection algorithms.
【Keywords】: AI for regulation: AI for financial fraud detection; Foundation for AI in FinTech: Deep learning and representation for FinTech; AI for regulation: AI for financial market regulation, design and policy implication; AI for regulation: AI for financial crime detection;
【Paper Link】 【Pages】:4633-4639
【Authors】: Zihao Wang ; Jia Liu ; Hengbin Cui ; Chunxiang Jin ; Minghui Yang ; Yafang Wang ; Xiaolong Li ; Renxin Mao
【Abstract】: With the rapid growth of internet finance and the booming of financial lending, intelligent calling for debt collection in FinTech companies has drawn increasing attention. Nowadays, the widely used intelligent calling systems are based on dialogue flow, namely configuring the interaction flow with a finite-state machine. In our debt collection scenario, the complete dialogue flow contains more than one thousand interactive paths. All the dialogue procedures are manually specified, which incurs extremely high maintenance costs and is error-prone. To solve this problem, we propose a behavior-cloning-based collection robot framework without any dialogue flow configuration, called two-stage behavior cloning (TSBC). In the first stage, we use a multi-label classification model to obtain policies that may be able to cope with the current situation according to the dialogue state; in the second stage, we score several scripts under each obtained policy and select the script with the highest score as the reply for the current state. This framework makes full use of massive manual collection records without labeling and fully absorbs human wisdom and experience. We have conducted extensive experiments in both single-round and multi-round scenarios and shown the effectiveness of the proposed system: the accuracy of a single round of dialogue is improved by 5%, and the accuracy of multiple rounds of dialogue is improved by 3.1%.
【Keywords】: Foundation for AI in FinTech: Automated financial systems and services; Foundation for AI in FinTech: Deep learning and representation for FinTech; AI for banking: AI for smarter banking services; Foundation for AI in FinTech: Analyzing big financial data;
【Paper Link】 【Pages】:4640-4646
【Authors】: Qianggang Ding ; Sifan Wu ; Hao Sun ; Jiadong Guo ; Jian Guo
【Abstract】: Predicting the price movement of financial securities like stocks is an important but challenging task, due to the uncertainty of financial markets. In this paper, we propose a novel approach based on the Transformer to tackle the stock movement prediction task, and we present several enhancements to the basic Transformer. First, we propose a Multi-Scale Gaussian Prior to enhance the locality of the Transformer. Second, we develop an Orthogonal Regularization to avoid learning redundant heads in the multi-head self-attention mechanism. Third, we design a Trading Gap Splitter for the Transformer to learn hierarchical features of high-frequency financial data. Compared with popular recurrent neural networks such as LSTMs, the proposed method has the advantage of mining extremely long-term dependencies from financial time series. Experimental results show that our proposed models outperform several competitive methods in stock price prediction tasks on the NASDAQ exchange market and the China A-shares market.
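A Gaussian prior for attention locality can be illustrated in a simplified, single-head form: add a distance-based Gaussian penalty to the self-attention logits before the softmax. The exact parameterization below is an assumption for illustration; the paper's multi-scale variant would presumably assign a different scale per head.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def local_attention(scores, sigma):
    """Add a Gaussian locality prior to (T, T) attention logits:
    key position j is penalized by -(i - j)^2 / (2 sigma^2) relative to
    query position i. Smaller sigma means more local attention; using
    several sigmas across heads gives a multi-scale prior."""
    T = scores.shape[0]
    i = np.arange(T)
    bias = -((i[:, None] - i[None, :]) ** 2) / (2.0 * sigma ** 2)
    return softmax(scores + bias, axis=-1)
```

With a tiny sigma the rows concentrate near the diagonal (local attention); with a huge sigma the bias vanishes and the distribution reverts to plain softmax attention.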
【Keywords】: Foundation for AI in FinTech: Deep learning and representation for FinTech; Foundation for AI in FinTech: Modeling financial structure and hierarchy; AI for trading: AI for algorithmic trading; AI for trading: AI for novel financial mechanisms; AI for trading: AI for novel financial models; AI for trading: AI for predictive trading; AI for trading: General;
【Paper Link】 【Pages】:4647-4653
【Authors】: Ke Xu ; Yifan Zhang ; Deheng Ye ; Peilin Zhao ; Mingkui Tan
【Abstract】: Portfolio selection is an important yet challenging task in AI for FinTech. One of the key issues is how to represent the non-stationary price series of assets in a portfolio, which is important for portfolio decisions. The existing methods, however, fall short of capturing: 1) the complicated sequential patterns for asset price series and 2) the price correlations among multiple assets. In this paper, under a deep reinforcement learning paradigm for portfolio selection, we propose a novel Relation-aware Transformer (RAT) to handle these aspects. Specifically, being equipped with our newly developed attention modules, RAT is structurally innovated to capture both sequential patterns and asset correlations for portfolio selection. Based on the extracted sequential features, RAT is able to make profitable portfolio decisions regarding each asset via a newly devised leverage operation. Extensive experiments on real-world crypto-currency and stock datasets verify the state-of-the-art performance of RAT.
【Keywords】: Foundation for AI in FinTech: Reinforcement learning for FinTech; AI for trading: AI for portfolio analytics; AI for wealth: AI for digital asset management;
【Paper Link】 【Pages】:4654-4660
【Authors】: Wenbo Zheng ; Lan Yan ; Chao Gou ; Fei-Yue Wang
【Abstract】: Credit card transaction fraud costs billions of dollars to card issuers every year. Moreover, credit card transaction datasets are very skewed: there are far fewer samples of fraud than of legitimate transactions. Due to data security and privacy concerns, different banks are usually not allowed to share their transaction datasets. These problems make it difficult for traditional models to learn the patterns of fraud and to detect it. In this paper, we introduce a novel framework, termed federated meta-learning, for fraud detection. Different from traditional approaches trained with data centralized in the cloud, our model enables banks to learn a fraud detection model with the training data distributed on their own local databases. A shared global model is constructed by aggregating locally computed updates of the fraud detection model. Banks can collectively reap the benefits of the shared model without sharing their datasets, protecting the sensitive information of cardholders. To achieve good classification performance, we further formulate an improved triplet-like metric learning objective and design a novel meta-learning-based classifier, which allows joint comparison with K negative samples in each mini-batch. Experimental results demonstrate that the proposed approach achieves significantly higher performance compared with other state-of-the-art approaches.
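The triplet-like objective with K negatives per mini-batch can be illustrated in simplified form. This generic (K+2)-tuplet hinge loss is an assumption about the shape of the objective, not the paper's exact formulation:

```python
import numpy as np

def k_negative_triplet_loss(anchor, positive, negatives, margin=1.0):
    """Hinge loss pulling the anchor toward the positive while pushing
    it at least `margin` farther from each of K negatives; the K hinge
    terms are averaged, mimicking joint comparison with K negative
    samples inside one mini-batch."""
    d_pos = np.linalg.norm(anchor - positive)
    d_negs = np.linalg.norm(anchor - np.asarray(negatives), axis=1)
    return float(np.mean(np.maximum(0.0, d_pos - d_negs + margin)))
```

The loss is zero once every negative sits at least `margin` farther from the anchor than the positive does, so only hard negatives contribute gradient.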
【Keywords】: AI for banking: AI for banking risk and fraud modeling; AI for banking: AI for credit analysis and pricing; AI for payment: AI for payment risk modeling; AI for regulation: AI for financial crime detection; AI for regulation: AI for financial fraud detection;
【Paper Link】 【Pages】:4661-4667
【Authors】: Shuo Yang ; Zhiqiang Zhang ; Jun Zhou ; Yang Wang ; Wang Sun ; Xingyu Zhong ; Yanming Fang ; Quan Yu ; Yuan Qi
【Abstract】: Small and Medium-sized Enterprises (SMEs) play a vital role in the modern economy. In recent years, financial risk analysis for SMEs has attracted a lot of attention from financial institutions. However, financial risk analysis for SMEs usually suffers from a data deficiency problem, especially for mobile financial institutions, which seldom collect credit-related data directly from SMEs. Fortunately, although credit-related information on SMEs is hard to acquire in sufficient quantity, the interactive relationships between SMEs, which may contain valuable information about financial risk, are usually available to mobile financial institutions. Identifying the credit-related relationships of SMEs among massive interactions helps to model SMEs comprehensively and thus improves the performance of financial risk analysis.
In this paper, tackling the data deficiency problem of financial risk analysis for SMEs, we propose an innovative financial risk analysis framework based on graph-based supply chain mining. Specifically, to capture the credit-related topological structure and temporal variation information of SMEs, we design and employ a novel spatial-temporal aware graph neural network to mine supply chain relationships on an SME graph, and then analyze credit risk based on the mined supply chain graph. Experimental results on real-world financial datasets prove the effectiveness of our proposal for financial risk analysis for SMEs.
【Keywords】: AI for risk and security: AI for financial risk factors and prediction; AI for banking: AI for credit loan; AI for lending: General;
【Paper Link】 【Pages】:4668-4674
【Authors】: Quanzhi Li ; Qiong Zhang
【Abstract】: There is a massive amount of news on financial events every day. In this paper, we present a unified model for detecting, classifying and summarizing financial events. This model exploits a multi-task learning approach, in which a pre-trained BERT model is used to encode the news articles, and the encoded information is shared by the event type classification, detection and summarization tasks. For event summarization, we use a Transformer structure as the decoder. In addition to the input document encoded by BERT, the decoder also utilizes the predicted event type and cluster information, so that it can focus on the specific aspects of the event when generating the summary. Our experiments show that our approach outperforms other methods.
【Keywords】: Foundation for AI in FinTech: Data mining and knowledge discovery for FinTech; Foundation for AI in FinTech: Computational intelligence for FinTech;
【Paper Link】 【Pages】:4675-4681
【Authors】: Shuoyao Wang ; Diwei Zhu
【Abstract】: With the explosive growth of transaction activities in online payment systems, effective and real-time regulation becomes a critical problem for payment service providers. Thanks to the rapid development of artificial intelligence (AI), AI-enabled regulation emerges as a promising solution. One main challenge of AI-enabled regulation is how to utilize multimedia information, i.e., multimodal signals, in Financial Technology (FinTech). Inspired by the attention mechanism in natural language processing, we propose a novel cross-modal and intra-modal attention network (CIAN) to investigate the relation between the text and transaction. More specifically, we integrate the text and transaction information to enhance text-trade joint-embedding learning, which clusters positive pairs and pushes negative pairs away from each other. Another challenge of intelligent regulation is the interpretability of complicated machine learning models. To meet the requirements of financial regulation, we design a CIAN-Explainer to interpret how the attention mechanism interacts with the original features, which is formulated as a low-rank matrix approximation problem. Using real datasets from the largest online payment system, WeChat Pay of Tencent, we conduct experiments to validate the practical application value of CIAN, where our method outperforms the state-of-the-art methods.
【Keywords】: Foundation for AI in FinTech: Data mining and knowledge discovery for FinTech; AI for regulation: AI for international regulation; AI for regulation: AI for financial crime detection; Other areas: Interpretability of FinTech;
【Paper Link】 【Pages】:4682-4689
【Authors】: Mengying Zhu ; Xiaolin Zheng ; Yan Wang ; Qianqiao Liang ; Wenfang Zhang
【Abstract】: Online portfolio selection (OLPS) is a fundamental and challenging problem in financial engineering, which faces two practical constraints during real trading, i.e., a cardinality constraint and non-zero transaction costs. In order to achieve greater feasibility in financial markets, in this paper, we propose a novel online portfolio selection method named LExp4.TCGP with a theoretical guarantee of sublinear regret to address the OLPS problem with the two constraints. In addition, we incorporate side information into our method based on contextual bandits, which further improves the effectiveness of our method. Extensive experiments conducted on four representative real-world datasets demonstrate that our method significantly outperforms the state-of-the-art methods when a cardinality constraint and non-zero transaction costs co-exist.
【Keywords】: AI for trading: AI for portfolio analytics; AI for trading: AI for strategic trading and strategy design; Foundation for AI in FinTech: Reinforcement learning for FinTech; AI for trading: General;
【Paper Link】 【Pages】:4691-4695
【Authors】: Guojing Zhou ; Hamoon Azizsoltani ; Markel Sanz Ausin ; Tiffany Barnes ; Min Chi
【Abstract】: In interactive e-learning environments such as Intelligent Tutoring Systems, there are pedagogical decisions to make at two main levels of granularity: whole problems and single steps. In recent years, there has been growing interest in applying data-driven techniques for adaptive decision making that can dynamically tailor students' learning experiences. Most existing data-driven approaches, however, treat these pedagogical decisions equally, or independently, disregarding the long-term impact that tutor decisions may have across these two levels of granularity. In this paper, we propose and apply an offline Gaussian Processes based Hierarchical Reinforcement Learning (HRL) framework to induce a hierarchical pedagogical policy that makes decisions at both problem and step levels. An empirical classroom study shows that the HRL policy is significantly more effective than a Deep Q-Network (DQN) induced policy and a random yet reasonable baseline policy.
【Keywords】: Humans and AI: Computer-Aided Education; Machine Learning Applications: Applications of Reinforcement Learning; Planning and Scheduling: Hierarchical planning;
【Paper Link】 【Pages】:4696-4700
【Authors】: Tom Decroos ; Lotte Bransen ; Jan Van Haaren ; Jesse Davis
【Abstract】: Although objectively assessing the impact of the individual actions performed by soccer players during games is a crucial task, most traditional metrics have substantial shortcomings. First, many metrics only consider rare actions like shots and goals, which account for less than 2% of all on-the-ball actions. Second, they fail to account for the context in which the actions occurred. This work summarizes several important contributions. First, we describe a language for representing individual player actions on the pitch. This language unifies several existing formats, which greatly simplifies automated analysis, and it is becoming widely used in the soccer analytics community. Second, we describe our framework for valuing any type of player action based on its impact on the game outcome while accounting for the context in which the action happened. This framework enables giving a broad overview of a player's performance, including quantifying a player's total offensive and defensive contributions to their team. Third, we provide illustrative use cases that highlight the workings and benefits of our framework.
【Keywords】: Machine Learning Applications: Applications of Supervised Learning; Machine Learning: Time-series; Data Streams; Data Mining: Classification, Semi-Supervised Learning; Multidisciplinary Topics and Applications: Other;
【Paper Link】 【Pages】:4701-4705
【Authors】: Zhenpeng Chen ; Sheng Shen ; Ziniu Hu ; Xuan Lu ; Qiaozhu Mei ; Xuanzhe Liu
【Abstract】: Sentiment classification typically relies on a large amount of labeled data. In practice, the availability of labels is highly imbalanced among different languages. To tackle this problem, cross-lingual sentiment classification approaches aim to transfer knowledge learned from one language that has abundant labeled examples (i.e., the source language, usually English) to another language with fewer labels (i.e., the target language). The source and the target languages are usually bridged through off-the-shelf machine translation tools. Through such a channel, cross-language sentiment patterns can be successfully learned from English and transferred into the target languages. This approach, however, often fails to capture sentiment knowledge specific to the target language. In this paper, we employ emojis, which are widely available in many languages, as a new channel to learn both the cross-language and the language-specific sentiment patterns. We propose a novel representation learning method that uses emoji prediction as an instrument to learn respective sentiment-aware representations for each language. The learned representations are then integrated to facilitate cross-lingual sentiment classification.
【Keywords】: Natural Language Processing: Text Classification; Data Mining: Mining Text, Web, Social Media;
【Paper Link】 【Pages】:4706-4710
【Authors】: Maurizio Ferrari Dacrema ; Paolo Cremonesi ; Dietmar Jannach
【Abstract】: The development of continuously improved machine learning algorithms for personalized item ranking lies at the core of today's research in the area of recommender systems. Over the years, the research community has developed widely-agreed best practices for comparing algorithms and demonstrating progress with offline experiments. Unfortunately, we find this accepted research practice can easily lead to phantom progress due to the following reasons: limited reproducibility, comparison with complex but weak and non-optimized baseline algorithms, over-generalization from a small set of experimental configurations. To assess the extent of such problems, we analyzed 18 research papers published recently at top-ranked conferences. Only 7 were reproducible with reasonable effort, and 6 of them could often be outperformed by relatively simple heuristic methods, e.g., nearest neighbors. In this paper, we discuss these observations in detail, and reflect on the related fundamental problem of over-reliance on offline experiments in recommender systems research.
【Keywords】: Machine Learning: Recommender Systems; Machine Learning: Deep Learning; Multidisciplinary Topics and Applications: Validation and Verification;
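The "relatively simple heuristic methods, e.g., nearest neighbors" that the paper found hard to beat can be sketched as an item-based cosine-similarity recommender. This is a minimal hypothetical variant for illustration, not the exact baselines the authors evaluated:

```python
import numpy as np

def item_knn_scores(R, user):
    """Score unseen items for one user by summing the cosine
    similarity between each candidate item's interaction column
    and the items the user already consumed."""
    norms = np.linalg.norm(R, axis=0)
    norms[norms == 0] = 1.0                    # avoid division by zero
    sim = (R.T @ R) / np.outer(norms, norms)   # item-item cosine matrix
    np.fill_diagonal(sim, 0.0)                 # ignore self-similarity
    scores = sim @ R[user]                     # aggregate over user history
    scores[R[user] > 0] = -np.inf              # mask already-seen items
    return scores

# Toy interaction matrix: rows = users, columns = items.
R = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 0, 1, 1]], dtype=float)
best = int(np.argmax(item_knn_scores(R, user=0)))
```

For user 0 (who consumed items 0 and 1), item 2 is recommended because it co-occurs with both consumed items in user 1's history.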
【Paper Link】 【Pages】:4711-4715
【Authors】: Giuseppe Cuccu ; Julian Togelius ; Philippe Cudré-Mauroux
【Abstract】: Deep reinforcement learning applied to vision-based problems like Atari games maps pixels directly to actions; internally, the deep neural network bears the responsibility of both extracting useful information and making decisions based on it. By separating image processing from decision-making, one could better understand the complexity of each task, as well as potentially find smaller policy representations that are easier for humans to understand and may generalize better. To this end, we propose a new method for learning policies and compact state representations separately but simultaneously for policy approximation in reinforcement learning. State representations are generated by an encoder based on two novel algorithms: Increasing Dictionary Vector Quantization makes the encoder capable of growing its dictionary size over time, to address new observations; and Direct Residuals Sparse Coding encodes observations by aiming for highest information inclusion. We test our system on a selection of Atari games using tiny neural networks of only 6 to 18 neurons (depending on the game's controls). These are still capable of achieving results comparable---and occasionally superior---to state-of-the-art techniques which use two orders of magnitude more neurons.
【Keywords】: Machine Learning: Reinforcement Learning; Machine Learning Applications: Game Playing; Machine Learning: Unsupervised Learning;
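The growing-dictionary idea behind Increasing Dictionary Vector Quantization can be illustrated with a minimal sketch. This is a hypothetical simplification, not the paper's implementation; the distance threshold and nearest-entry encoding below are assumptions:

```python
import numpy as np

def encode_growing_vq(observations, threshold):
    """Encode each observation by its nearest dictionary entry;
    an observation farther than `threshold` from every entry is
    added as a new entry, so the dictionary grows over time as
    novel observations arrive."""
    dictionary = []
    codes = []
    for obs in observations:
        if dictionary:
            dists = [np.linalg.norm(obs - d) for d in dictionary]
            best = int(np.argmin(dists))
            if dists[best] <= threshold:
                codes.append(best)     # close enough: reuse entry
                continue
        dictionary.append(obs)         # novel: grow the dictionary
        codes.append(len(dictionary) - 1)
    return codes, dictionary

obs = [np.array([0.0, 0.0]), np.array([0.1, 0.0]),
       np.array([5.0, 5.0]), np.array([0.05, 0.05])]
codes, dictionary = encode_growing_vq(obs, threshold=1.0)
```

Here the two near-origin observations share entry 0, and the distant observation at (5, 5) triggers dictionary growth.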
【Paper Link】 【Pages】:4716-4720
【Authors】: Giovanni Amendola ; Carmine Dodaro ; Marco Maratea
【Abstract】: The issue of formally describing solving algorithms in various fields such as Propositional Satisfiability (SAT), Quantified SAT, Satisfiability Modulo Theories, Answer Set Programming (ASP), and Constraint ASP has relatively recently been addressed by employing abstract solvers. In this paper we deal with cautious reasoning tasks in ASP, and design, implement and test novel abstract solutions, borrowed from backbone computation in SAT. By employing abstract solvers, we also formally show that the algorithms for solving cautious reasoning tasks in ASP are strongly related to those for computing backbones of Boolean formulas. Some of the new solutions have been implemented in the ASP solver WASP and tested.
【Keywords】: Knowledge Representation and Reasoning: Logics for Knowledge Representation; Knowledge Representation and Reasoning: Non-monotonic Reasoning, Common-Sense Reasoning;
【Paper Link】 【Pages】:4721-4725
【Authors】: Pedro Cabalar ; Jorge Fandinno ; Luis Fariñas del Cerro
【Abstract】: Epistemic logic programs constitute an extension of the stable model semantics to deal with new constructs called "subjective literals." Informally speaking, a subjective literal allows checking whether some objective literal is true in all or some stable models. However, its associated semantics has proved to be non-trivial, since the truth of subjective literals may interfere with the set of stable models it is supposed to query. As a consequence, no clear agreement has been reached and different semantic proposals have been made in the literature. In this paper, we review an extension of the well-known splitting property for logic programs to the epistemic case. This "epistemic splitting property" is defined as a general condition that can be checked on any arbitrary epistemic semantics. Its satisfaction has desirable consequences both in the representation of conformant planning problems and in the encoding of the so-called subjective constraints.
【Keywords】: Knowledge Representation and Reasoning: Logics for Knowledge Representation; Knowledge Representation and Reasoning: Non-monotonic Reasoning, Common-Sense Reasoning; Knowledge Representation and Reasoning: Reasoning about Knowledge and Belief;
【Paper Link】 【Pages】:4726-4729
【Authors】: Dylan J. Foster ; Vasilis Syrgkanis
【Abstract】: We provide excess risk guarantees for statistical learning in a setting where the population risk with respect to which we evaluate a target parameter depends on an unknown parameter that must be estimated from data (a "nuisance parameter"). We analyze a two-stage sample splitting meta-algorithm that takes as input two arbitrary estimation algorithms: one for the target parameter and one for the nuisance parameter. We show that if the population risk satisfies a condition called Neyman orthogonality, the impact of the nuisance estimation error on the excess risk bound achieved by the meta-algorithm is of second order. Our theorem is agnostic to the particular algorithms used for the target and nuisance and only makes an assumption on their individual performance. This enables the use of a plethora of existing results from statistical learning and machine learning literature to give new guarantees for learning with a nuisance component. Moreover, by focusing on excess risk rather than parameter estimation, we can give guarantees under weaker assumptions than in previous works and accommodate the case where the target parameter belongs to a complex nonparametric class. We characterize conditions on the metric entropy such that oracle rates---rates of the same order as if we knew the nuisance parameter---are achieved. We also analyze the rates achieved by specific estimation algorithms such as variance-penalized empirical risk minimization, neural network estimation and sparse high-dimensional linear model estimation. We highlight the applicability of our results in four settings of central importance in the literature: 1) heterogeneous treatment effect estimation, 2) offline policy optimization, 3) domain adaptation, and 4) learning with missing data.
【Keywords】: Machine Learning: Learning Theory;
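The two-stage sample-splitting meta-algorithm can be sketched as follows. The concrete estimators below (a mean for the nuisance, a least-squares coefficient for the target) are hypothetical stand-ins for the arbitrary algorithms the framework accepts:

```python
import numpy as np

def two_stage_meta_algorithm(X, y, w, fit_nuisance, fit_target):
    """Two-stage sample splitting: estimate the nuisance parameter
    on the first half of the data, then plug that estimate into the
    target-risk minimization on the second half."""
    half = len(y) // 2
    g_hat = fit_nuisance(X[:half], w[:half])          # stage 1
    return fit_target(X[half:], y[half:], g_hat)      # stage 2

# Toy instantiation (assumed for illustration): the nuisance is the
# mean of an auxiliary variable w, and the target is the slope of y
# on x after subtracting the nuisance estimate.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 1))
w = 2.0 + rng.normal(scale=0.1, size=100)
y = 3.0 * X[:, 0] + 2.0 + rng.normal(scale=0.1, size=100)

fit_nuisance = lambda X, w: w.mean()
def fit_target(X, y, g_hat):
    resid = y - g_hat
    return float(X[:, 0] @ resid / (X[:, 0] @ X[:, 0]))

theta = two_stage_meta_algorithm(X, y, w, fit_nuisance, fit_target)
```

Because the squared-error risk here is insensitive to first-order errors in the nuisance estimate, the small error in `g_hat` barely perturbs `theta`, which recovers the true slope of 3.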
【Paper Link】 【Pages】:4730-4734
【Authors】: Hai Huang ; Fabien Gandon
【Abstract】: A Linked Data crawler performs a selection to focus on collecting linked RDF (including RDFa) data on the Web. From the perspectives of throughput and coverage, given a newly discovered and targeted URI, the key issue for Linked Data crawlers is to decide whether this URI is likely to dereference into an RDF data source and is therefore worth downloading the representation it points to. Current solutions adopt heuristic rules to filter irrelevant URIs, but when the heuristics are too restrictive, this hampers the coverage of crawling. In this paper, we propose and compare approaches to learn strategies for crawling Linked Data on the Web by predicting whether a newly discovered URI will lead to an RDF data source or not. We detail the features used in predicting the relevance and the methods we evaluated, including a promising adaptation of the FTRL-proximal online learning algorithm. We compare several options through extensive experiments, including existing crawlers as baseline methods, to evaluate their efficiency.
【Keywords】: Knowledge Representation and Reasoning: Semantic Web; Data Mining: Mining Text, Web, Social Media; Data Mining: Feature Extraction, Selection and Dimensionality Reduction; Data Mining: Applications;
【Paper Link】 【Pages】:4735-4739
【Authors】: Yelena Mejova ; Kyriaki Kalimeri
【Abstract】: In this study, we present a unique demographically representative dataset of 15k US residents that combines technology use logs with surveys on moral views, human values, and emotional contagion. First, we show which values determine the adoption of Health & Fitness mobile applications, finding that users who prioritize the value of purity and de-emphasize the values of conformity, hedonism, and security are more likely to use such apps. Further, we achieve a weighted AUROC of .673 in predicting whether an individual exercises, and find a strong link between exercise and respondent socioeconomic status, as well as the value of loyalty.
【Keywords】: Humans and AI: Personalization and User Modeling; Humans and AI: Human-Computer Interaction; Multidisciplinary Topics and Applications: Social Sciences;
【Paper Link】 【Pages】:4740-4744
【Authors】: Eoin M. Kenny ; Elodie Ruelle ; Anne Geoghegan ; Laurence Shalloo ; Micheál O'Leary ; Michael O'Donovan ; Mohammed Temraz ; Mark T. Keane
【Abstract】: Smart agriculture (SmartAg) has emerged as a rich domain for AI-driven decision support systems (DSS); however, it is often challenged by user-adoption issues. This paper reports a case-based reasoning (CBR) system, PBI-CBR, that predicts grass growth for dairy farmers, that combines predictive accuracy and explanations to improve user adoption. PBI-CBR’s key novelty is its use of Bayesian methods for case-base maintenance in a regression domain. Experiments report the tradeoff between predictive accuracy and explanatory capability for different variants of PBI-CBR, and how updating Bayesian priors each year improves performance.
【Keywords】: Knowledge Representation and Reasoning: Case-based Reasoning; Machine Learning: Interpretability; Machine Learning: Explainable Machine Learning; AI Ethics: Explainability;
【Paper Link】 【Pages】:4745-4749
【Authors】: Xiang Lisa Li ; Jason Eisner
【Abstract】: Pre-trained word embeddings like ELMo and BERT contain rich syntactic and semantic information, resulting in state-of-the-art performance on various tasks. We propose a very fast variational information bottleneck (VIB) method to nonlinearly compress these embeddings, keeping only the information that helps a discriminative parser. We compress each word embedding to either a discrete tag or a continuous vector. In the discrete version, our automatically compressed tags form an alternative tag set: we show experimentally that our tags capture most of the information in traditional POS tag annotations, but our tag sequences can be parsed more accurately at the same level of tag granularity. In the continuous version, we show experimentally that moderately compressing the word embeddings by our method yields a more accurate parser in 8 of 9 languages, unlike simple dimensionality reduction.
【Keywords】: Natural Language Processing: Embeddings; Natural Language Processing: Tagging, chunking, and parsing; Natural Language Processing: Natural Language Processing;
【Paper Link】 【Pages】:4750-4754
【Authors】: Zhichao Lu ; Ian Whalen ; Yashesh D. Dhebar ; Kalyanmoy Deb ; Erik D. Goodman ; Wolfgang Banzhaf ; Vishnu Naresh Boddeti
【Abstract】: Convolutional neural networks (CNNs) are the backbones of deep learning paradigms for numerous vision tasks. Early advancements in CNN architectures are primarily driven by human expertise and elaborate design. Recently, neural architecture search (NAS) was proposed with the aim of automating the network design process and generating task-dependent architectures. This paper introduces NSGA-Net -- an evolutionary search algorithm that explores a space of potential neural network architectures in three steps, namely, a population initialization step that is based on prior knowledge from hand-crafted architectures, an exploration step comprising crossover and mutation of architectures, and finally an exploitation step that utilizes the hidden useful knowledge stored in the entire history of evaluated neural architectures in the form of a Bayesian Network. The integration of these components allows an efficient design of architectures that are competitive with, and in many cases outperform, both manually and automatically designed architectures on the CIFAR-10 classification task. The flexibility provided by simultaneously obtaining multiple architecture choices for different compute requirements further differentiates our approach from other methods in the literature.
【Keywords】: Machine Learning: Deep Learning: Convolutional networks; Machine Learning: Classification;
【Paper Link】 【Pages】:4755-4759
【Authors】: Vikram Mohanty ; David Thames ; Sneha Mehta ; Kurt Luther
【Abstract】: Identifying people in historical photographs is important for interpreting material culture, correcting the historical record, and creating economic value, but it is also a complex and challenging task. In this paper, we focus on identifying portraits of soldiers who participated in the American Civil War (1861-65). Millions of these portraits survive, but only 10-20% are identified. We created Photo Sleuth, a web-based platform that combines crowdsourced human expertise and automated face recognition to support Civil War portrait identification. Our mixed-methods evaluation of Photo Sleuth one month after its public launch showed that it helped users successfully identify unknown portraits.
【Keywords】: Humans and AI: Human-AI Collaboration; Humans and AI: Human-Computer Interaction; Humans and AI: Human Computation and Crowdsourcing; Computer Vision: Biometrics, Face and Gesture Recognition;
【Paper Link】 【Pages】:4760-4764
【Authors】: Mohan Sridharan ; Tiago Mota
【Abstract】: Our architecture uses non-monotonic logical reasoning with incomplete commonsense domain knowledge, and incremental inductive learning, to guide the construction of deep network models from a small number of training examples. Experimental results in the context of a robot reasoning about the partial occlusion of objects and the stability of object configurations in simulated images indicate an improvement in reliability and a reduction in computational effort in comparison with an architecture based just on deep networks.
【Keywords】: Knowledge Representation and Reasoning: Non-monotonic Reasoning, Common-Sense Reasoning; Machine Learning: Deep Learning; Machine Learning: Online Learning; Robotics: Robotics and Vision;
【Paper Link】 【Pages】:4765-4769
【Authors】: Hélène Verhaeghe ; Siegfried Nijssen ; Gilles Pesant ; Claude-Guy Quimper ; Pierre Schaus
【Abstract】: Decision trees are among the most popular classification models in machine learning. Traditionally, they are learned using greedy algorithms. However, such algorithms have their disadvantages: it is difficult to limit the size of the decision trees while maintaining a good classification accuracy, and it is hard to impose additional constraints on the models that are learned. For these reasons, there has been a recent interest in exact and flexible algorithms for learning decision trees. In this paper, we introduce a new approach to learn decision trees using constraint programming. Compared to earlier approaches, we show that our approach obtains better performance, while still being sufficiently flexible to allow for the inclusion of constraints. Our approach builds on three key building blocks: (1) the use of AND/OR search, (2) the use of caching, (3) the use of the CoverSize global constraint proposed recently for the problem of itemset mining. This allows our constraint programming approach to deal in a much more efficient way with the decompositions in the learning problem.
【Keywords】: Constraints and SAT: Constraints and Data Mining ; Constraints and Machine Learning; Constraints and SAT: Constraint Optimization; Constraints and SAT: Constraints: Modeling, Solvers, Applications; Constraints and SAT: Global Constraints;
【Paper Link】 【Pages】:4770-4774
【Authors】: Florian Pommerening ; Gabriele Röger ; Malte Helmert ; Hadrien Cambazard ; Louis-Martin Rousseau ; Domenico Salvagnin
【Abstract】: Optimal cost partitioning of classical planning heuristics has been shown to lead to excellent heuristic values but is often prohibitively expensive to compute. We analyze the application of Lagrangian decomposition, a classical tool in mathematical programming, to cost partitioning of operator-counting heuristics. This allows us to view the computation as an iterative process that can be seeded with any cost partitioning and that improves over time. In the case of non-negative cost partitioning of abstraction heuristics the computation reduces to independent shortest path problems and does not require an LP solver.
【Keywords】: Planning and Scheduling: Search in Planning and Scheduling; Planning and Scheduling: Theoretical Foundations of Planning; Heuristic Search and Game Playing: Heuristic Search; Heuristic Search and Game Playing: Combinatorial Search and Optimisation;
【Paper Link】 【Pages】:4775-4779
【Authors】: Shahaf S. Shperberg ; Ariel Felner ; Nathan Sturtevant ; Solomon Eyal Shimony ; Avi Hayoun
【Abstract】: Recent work on bidirectional search defined a lower bound on the costs of paths between pairs of nodes, and introduced a new algorithm, NBS, which is based on this bound. Building on these results, we introduce DVCBS, a new algorithm that aims to further reduce the number of expansions. Generalizing beyond specific algorithms, we then propose a method for enhancing heuristics by propagating such lower bounds (lb-propagation) between frontiers. This lb-propagation can be used in existing algorithms, often improving their performance, as well as making them "well behaved".
【Keywords】: Heuristic Search and Game Playing: Heuristic Search;
【Paper Link】 【Pages】:4780-4784
【Authors】: Kuniyuki Takahashi ; Jethro Tan
【Abstract】: Estimation of tactile properties from vision, such as slipperiness or roughness, is important to effectively interact with the environment. These tactile properties help humans, as well as robots, decide which actions they should choose and how to perform them. We, therefore, propose a model to estimate the degree of tactile properties from visual perception alone (e.g., the level of slipperiness or roughness). Our method extends an encoder-decoder network, in which the latent variables are visual and tactile features. In contrast to previous works, our method does not require manual labeling, but only RGB images and the corresponding tactile sensor data. All our data is collected with a webcam and tactile sensor mounted on the end-effector of a robot, which strokes the material surfaces. We show that our model generalizes to materials not included in the training data.
【Keywords】: Robotics: Learning in Robotics; Robotics: Vision and Perception;
【Paper Link】 【Pages】:4785-4789
【Authors】: Hernán Vargas ; Carlos Buil Aranda ; Aidan Hogan ; Claudia López
【Abstract】: As the adoption of knowledge graphs grows, more and more non-expert users need to be able to explore and query such graphs. These users are not typically familiar with graph query languages such as SPARQL, and may not be familiar with the knowledge graph's structure. In this extended abstract, we provide a summary of our work on a language and visual interface -- called RDF Explorer -- that helps non-expert users navigate and query knowledge graphs. A usability study over Wikidata shows that users successfully complete more tasks with RDF Explorer than with the existing Wikidata Query Helper interface.
【Keywords】: Knowledge Representation and Reasoning: Semantic Web; Multidisciplinary Topics and Applications: Databases;
【Paper Link】 【Pages】:4790-4794
【Authors】: Wen Zhang ; Yang Feng ; Qun Liu
【Abstract】: Neural Machine Translation (NMT) generates target words sequentially by predicting the next word conditioned on the context words. At training time, it predicts with the ground truth words as context, while at inference it has to generate the entire sequence from scratch. This discrepancy in the fed context leads to error accumulation in the translation. Furthermore, word-level training requires strict matching between the generated sequence and the ground truth sequence, which leads to overcorrection of different but reasonable translations. In this paper, we address these issues by sampling context words not only from the ground truth sequence but also from the predicted sequence during training. Experimental results on NIST Chinese->English and WMT2014 English->German translation tasks demonstrate that our method achieves significant improvements on multiple data sets compared to strong baselines.
【Keywords】: Natural Language Processing: Machine Translation; Natural Language Processing: Natural Language Processing;
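The core idea of sampling context words from either the ground truth or the model's own predictions can be sketched at the token level. This is a conceptual illustration with an assumed per-token sampling probability; the paper applies the idea inside an NMT decoder during training:

```python
import random

def mix_context(ground_truth, predicted, sample_prob):
    """Build a training context by drawing each position from the
    model's predicted sequence with probability `sample_prob`, and
    from the ground-truth sequence otherwise."""
    return [p if random.random() < sample_prob else g
            for g, p in zip(ground_truth, predicted)]

random.seed(0)
gold = ["the", "cat", "sat", "down"]
pred = ["a", "cat", "sits", "down"]

# sample_prob = 0 recovers pure teacher forcing; sample_prob = 1
# feeds back only the model's own predictions, as at inference.
assert mix_context(gold, pred, 0.0) == gold
assert mix_context(gold, pred, 1.0) == pred
```

Intermediate values of `sample_prob` expose the model to its own errors during training, narrowing the train/inference context gap the abstract describes.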
【Paper Link】 【Pages】:4796-4802
【Authors】: Maroua Bahri ; Albert Bifet ; Silviu Maniu ; Heitor Murilo Gomes
【Abstract】: Mining high-dimensional data streams poses a fundamental challenge to machine learning as the presence of high numbers of attributes can remarkably degrade any mining task's performance. In the past several years, dimension reduction (DR) approaches have been successfully applied for different purposes (e.g., visualization). Due to their high-computational costs and numerous passes over large data, these approaches pose a hindrance when processing infinite data streams that are potentially high-dimensional. The latter increases the resource-usage of algorithms that could suffer from the curse of dimensionality. To cope with these issues, some techniques for incremental DR have been proposed. In this paper, we provide a survey on reduction approaches designed to handle data streams and highlight the key benefits of using these approaches for stream mining algorithms.
【Keywords】: Machine Learning: general;
【Paper Link】 【Pages】:4803-4811
【Authors】: Tathagata Chakraborti ; Sarath Sreedharan ; Subbarao Kambhampati
【Abstract】: In this paper, we provide a comprehensive outline of the different threads of work in Explainable AI Planning (XAIP) that have emerged as a focus area in the last couple of years, and contrast them with earlier efforts in the field in terms of techniques, target users, and delivery mechanisms. We hope that the survey will provide guidance to new researchers in automated planning on the role of explanations in the effective design of human-in-the-loop systems, as well as provide established researchers with some perspective on the evolution of the exciting world of explainable planning.
【Keywords】: Safe, Explainable, and Trustworthy AI: general; Human aspects in AI: general; Planning and Scheduling: general;
【Paper Link】 【Pages】:4812-4818
【Authors】: Ramya Srinivasan ; Ajay Chander
【Abstract】: With the growing adoption of AI across fields such as healthcare, finance, and the justice system, explaining an AI decision has become more important than ever before. Development of human-centric explainable AI (XAI) systems necessitates an understanding of the requirements of the human-in-the-loop seeking the explanation. This includes the cognitive behavioral purpose that the explanation serves for its recipients, and the structure that the explanation uses to reach those ends. An understanding of the psychological foundations of explanations is thus vital for the development of effective human-centric XAI systems. Towards this end, we survey papers from the cognitive science literature that address the following broad questions: (1) what is an explanation, (2) what are explanations for, and (3) what are the characteristics of good and bad explanations. We organize the insights gained therein by highlighting the advantages and shortcomings of various explanation structures and theories, discuss their applicability across different domains, and analyze their utility to various types of humans-in-the-loop. We summarize the key takeaways for human-centric design of XAI systems, and recommend strategies to bridge the existing gap between XAI research and practical needs. We hope this work will spark the development of novel human-centric XAI systems.
【Keywords】: Human aspects in AI: general; Safe, Explainable, and Trustworthy AI: general;
【Paper Link】 【Pages】:4819-4825
【Authors】: Rémy Portelas ; Cédric Colas ; Lilian Weng ; Katja Hofmann ; Pierre-Yves Oudeyer
【Abstract】: Automatic Curriculum Learning (ACL) has become a cornerstone of recent successes in Deep Reinforcement Learning (DRL). These methods shape the learning trajectories of agents by challenging them with tasks adapted to their capacities. In recent years, they have been used to improve sample efficiency and asymptotic performance, to organize exploration, to encourage generalization or to solve sparse reward problems, among others. To do so, ACL mechanisms can act on many aspects of learning problems. They can optimize domain randomization for Sim2Real transfer, organize task presentations in multi-task robotic settings, order sequences of opponents in multi-agent scenarios, etc. The ambition of this work is dual: 1) to present a compact and accessible introduction to the Automatic Curriculum Learning literature and 2) to draw a bigger picture of the current state of the art in ACL to encourage the cross-breeding of existing concepts and the emergence of new ideas.
【Keywords】: Machine Learning: general; Agent-based and Multi-agent Systems: general;
【Paper Link】 【Pages】:4826-4832
【Authors】: Giovanni Da San Martino ; Stefano Cresci ; Alberto Barrón-Cedeño ; Seunghak Yu ; Roberto Di Pietro ; Preslav Nakov
【Abstract】: Propaganda campaigns aim at influencing people's mindset with the purpose of advancing a specific agenda. They exploit the anonymity of the Internet, the micro-profiling ability of social networks, and the ease of automatically creating and managing coordinated networks of accounts, to reach millions of social network users with persuasive messages, specifically targeted to topics each individual user is sensitive to, and ultimately influencing the outcome on a targeted issue. In this survey, we review the state of the art on computational propaganda detection from the perspective of Natural Language Processing and Network Analysis, arguing about the need for combined efforts between these communities. We further discuss current challenges and future research directions.
【Keywords】: Natural Language Processing: general; Machine Learning: general;
【Paper Link】 【Pages】:4833-4839
【Authors】: Andrew Cropper ; Sebastijan Dumancic ; Stephen H. Muggleton
【Abstract】: Common criticisms of state-of-the-art machine learning include poor generalisation, a lack of interpretability, and a need for large amounts of training data. We survey recent work in inductive logic programming (ILP), a form of machine learning that induces logic programs from data, which has shown promise at addressing these limitations. We focus on new methods for learning recursive programs that generalise from few examples, a shift from using hand-crafted background knowledge to learning background knowledge, and the use of different technologies, notably answer set programming and neural networks. As ILP approaches 30, we also discuss directions for future research.
【Keywords】: Machine Learning: general; Knowledge Representation and Reasoning: general;
【Paper Link】 【Pages】:4840-4846
【Authors】: Allegra De Filippo ; Michele Lombardi ; Michela Milano
【Abstract】: Optimization problems under uncertainty are traditionally solved either via offline or online methods. Offline approaches can obtain high-quality robust solutions, but have a considerable computational cost. Online algorithms can react to unexpected events once they are observed, but often run under strict time constraints, preventing the computation of optimal solutions. Many real world problems, however, have both offline and online elements: a substantial amount of time and information is frequently available (offline) before an online problem is solved (e.g. energy production forecasts, or historical travel times in routing problems); in other cases both offline (i.e. strategic) and online (i.e. operational) decisions need to be made. Surprisingly, the interplay of these offline and online phases has received little attention: like in the blind men and the elephant tale, we risk missing the whole picture, and the benefits that could come from integrated offline/online optimization. In this survey we highlight the potential shortcomings of pure methods when applied to mixed offline/online problems, we review the strategies that have been designed to take advantage of this integration, and we suggest directions for future research.
【Keywords】: Constraints and Satisfiability: general; Planning and Scheduling: general; Machine Learning: general;
【Paper Link】 【Pages】:4847-4853
【Authors】: Sarah Keren ; Avigdor Gal ; Erez Karpas
【Abstract】: Goal recognition is the task of recognizing the objective of agents based on online observations of their behavior. Goal recognition design (GRD), the focus of this survey, facilitates goal recognition by the analysis and redesign of goal recognition models. In a nutshell, given a model of a domain and a set of possible goals, a solution to a GRD problem determines: (1) to what extent do actions performed by an agent reveal the agent’s objective? and (2) what is the best way to modify the model so that the objective of an agent can be detected as early as possible? GRD answers these questions by offering a solution for assessing and minimizing the maximal progress of any agent before recognition is guaranteed. This approach is relevant to any domain in which efficient goal recognition is essential and in which the model can be redesigned. Applications include intrusion detection, assisted cognition, computer games, and human-robot collaboration. This survey presents the solutions developed for evaluation and optimization in the GRD context, a discussion on the use of GRD in a variety of real-world applications, and suggestions of possible future avenues of GRD research.
【Keywords】: Agent-based and Multi-agent Systems: general; Planning and Scheduling: general;
【Paper Link】 【Pages】:4854-4860
【Authors】: Shen Gao ; Xiuying Chen ; Zhaochun Ren ; Dongyan Zhao ; Rui Yan
【Abstract】: Text summarization is the research area aiming at creating a short and condensed version of an original document, which conveys the main idea of the document in a few words. This research topic has started to attract the attention of a large community of researchers, and it is nowadays counted as one of the most promising research areas. In general, text summarization algorithms aim at taking a plain text document as input and then outputting a summary. However, in real-world applications, most of the data is not in a plain text format. Instead, there is much manifold information to be summarized, such as the summary of a web page based on a query in a search engine, extremely long documents (e.g., academic papers), dialogue histories, and so on. In this paper, we focus on surveying these new summarization tasks and approaches in real-world applications.
【Keywords】: Natural Language Processing: general;
【Paper Link】 【Pages】:4861-4867
【Authors】: Yuxiao Dong ; Ziniu Hu ; Kuansan Wang ; Yizhou Sun ; Jie Tang
【Abstract】: Representation learning has offered a revolutionary learning paradigm for various AI domains. In this survey, we examine and review the problem of representation learning with a focus on heterogeneous networks, which consist of different types of vertices and relations. The goal of this problem is to automatically project objects, most commonly vertices, in an input heterogeneous network into a latent embedding space such that both the structural and relational properties of the network can be encoded and preserved. The embeddings (representations) can then be used as features for machine learning algorithms addressing the corresponding network tasks. To learn expressive embeddings, current research developments fall into two major categories: shallow embedding learning and graph neural networks. After a thorough review of the existing literature, we identify several critical challenges that remain unaddressed and discuss future directions. Finally, we build the Heterogeneous Graph Benchmark to facilitate open research for this rapidly-developing topic.
【Keywords】: Knowledge Representation and Reasoning: general;
【Paper Link】 【Pages】:4868-4876
【Authors】: Stefan Kramer
【Abstract】: Learning higher-level representations from data has been on the agenda of AI research for several decades. In this paper, I will give a survey of various approaches to learning symbolic higher-level representations: feature construction and constructive induction, predicate invention, propositionalization, pattern mining, and mining time series patterns. Finally, I will give an outlook on how approaches to learning higher-level representations, symbolic and neural, can benefit from each other to solve current issues in machine learning.
【Keywords】: Machine Learning: general; Knowledge Representation and Reasoning: general;
【Paper Link】 【Pages】:4877-4884
【Authors】: Luís C. Lamb ; Artur S. d'Avila Garcez ; Marco Gori ; Marcelo O. R. Prates ; Pedro H. C. Avelar ; Moshe Y. Vardi
【Abstract】: Neural-symbolic computing has now become the subject of interest of both academic and industry research laboratories. Graph Neural Networks (GNNs) have been widely used in relational and symbolic domains, with widespread application of GNNs in combinatorial optimization, constraint satisfaction, relational reasoning and other scientific domains. The need for improved explainability, interpretability and trust of AI systems in general demands principled methodologies, as suggested by neural-symbolic computing. In this paper, we review the state-of-the-art on the use of GNNs as a model of neural-symbolic computing. This includes the application of GNNs in several domains as well as their relationship to current developments in neural-symbolic computing.
【Keywords】: Safe, Explainable, and Trustworthy AI: general; Knowledge Representation and Reasoning: general; Machine Learning: general; Constraints and Satisfiability: general;
【Paper Link】 【Pages】:4885-4891
【Authors】: Jérôme Lang
【Abstract】: Most solution concepts in collective decision making are defined assuming complete knowledge of individuals' preferences and of the mechanism used for aggregating them. This is often impractical or unrealistic. Under incomplete knowledge, a solution advocated by many consists in quantifying over all completions of the incomplete preference profile (or all instantiations of the incompletely specified mechanism). Voting rules can be `modalized' this way (leading to the notions of possible and necessary winners), and so can efficiency and fairness notions in fair division, stability concepts in coalition formation, and more. I give here a survey of works along this line.
【Keywords】: Agent-based and Multi-agent Systems: general;
【Paper Link】 【Pages】:4892-4898
【Authors】: Levi H. S. Lelis
【Abstract】: In this paper we review several planning algorithms developed for zero-sum games with exponential action spaces, i.e., spaces that grow exponentially with the number of game components that can act simultaneously at a given game state. As an example, real-time strategy games have exponential action spaces because the number of actions available grows exponentially with the number of units controlled by the player. We also present a unifying perspective in which several existing algorithms can be described as an instantiation of a variant of NaiveMCTS. In addition to describing several existing planning algorithms for exponential action spaces, we show that other instantiations of this variant of NaiveMCTS represent novel and promising algorithms to be studied in future works.
【Keywords】: Games and Virtual Environments: general; Heuristic Search: general; Planning and Scheduling: general; Agent-based and Multi-agent Systems: general;
【Paper Link】 【Pages】:4899-4906
【Authors】: João Marques-Silva ; Carlos Mencía
【Abstract】: The analysis of inconsistent formulas finds an ever-increasing range of applications, which include axiom pinpointing in description logics, fault localization in software, model-based diagnosis, optimization problems, and also explainability of machine learning models. This paper overviews approaches for analyzing inconsistent formulas, focusing on finding and enumerating explanations of and corrections for inconsistency, but also on solving optimization problems modeled as inconsistent formulas.
【Keywords】: Constraints and Satisfiability: general;
【Paper Link】 【Pages】:4907-4913
【Authors】: Sandeep Mathias ; Diptesh Kanojia ; Abhijit Mishra ; Pushpak Bhattacharyya
【Abstract】: Gaze behaviour has been used as a way to gather cognitive information for a number of years. In this paper, we discuss the use of gaze behaviour in solving different tasks in natural language processing (NLP) without having to record it at test time. This is because the collection of gaze behaviour is a costly task, both in terms of time and money. Hence, in this paper, we focus on research done to alleviate the need for recording gaze behaviour at run time. We also mention different eye tracking corpora in multiple languages, which are currently available and can be used in natural language processing. We conclude our paper by discussing applications in one domain, education, and how learning gaze behaviour can help in solving the tasks of complex word identification and automatic essay grading.
【Keywords】: Natural Language Processing: general;
【Paper Link】 【Pages】:4914-4921
【Authors】: Lavindra de Silva ; Felipe Meneguzzi ; Brian Logan
【Abstract】: The BDI model forms the basis of much of the research on symbolic models of agency and agent-oriented software engineering. While many variants of the basic BDI model have been proposed in the literature, there has been no systematic review of research on BDI agent architectures in over 10 years. In this paper, we survey the main approaches to each component of the BDI architecture, how these have been realised in agent programming languages, and discuss the trade-offs inherent in each approach.
【Keywords】: Agent-based and Multi-agent Systems: general;
【Paper Link】 【Pages】:4922-4928
【Authors】: Pauli Miettinen ; Stefan Neumann
【Abstract】: The goal of Boolean Matrix Factorization (BMF) is to approximate a given binary matrix as the product of two low-rank binary factor matrices, where the product of the factor matrices is computed under the Boolean algebra. While the problem is computationally hard, it is also attractive because the binary nature of the factor matrices makes them highly interpretable. In the last decade, BMF has received a considerable amount of attention in the data mining and formal concept analysis communities and, more recently, the machine learning and the theory communities also started studying BMF. In this survey, we give a concise summary of the efforts of all of these communities and raise some open questions which in our opinion require further investigation.
【Keywords】: Machine Learning: general;
【Paper Link】 【Pages】:4929-4935
【Authors】: Arpita Roy ; Shimei Pan
【Abstract】: Word embedding, a process to automatically learn the mathematical representations of words from unlabeled text corpora, has gained a lot of attention recently. Since words are the basic units of a natural language, the more precisely we can represent the morphological, syntactic and semantic properties of words, the better we can support downstream Natural Language Processing (NLP) tasks. Since traditional word embeddings are mainly designed to capture the semantic relatedness between co-occurring words in a predefined context, they may not be effective in encoding other information that is important for different NLP applications. In this survey, we summarize the recent advances in incorporating extra knowledge to enhance word embedding. We also identify the limitations of existing work and point out a few promising future directions.
【Keywords】: Natural Language Processing: general; Machine Learning: general;
【Paper Link】 【Pages】:4936-4942
【Authors】: Tommaso Pasini
【Abstract】: Word Sense Disambiguation (WSD) is the task of identifying the meaning of a word in a given context. It lies at the base of Natural Language Processing as it provides semantic information for words. In the last decade, great strides have been made in this field and much effort has been devoted to mitigating the knowledge acquisition bottleneck problem, i.e., the problem of semantically annotating texts at a large scale and in different languages. This issue is ubiquitous in WSD as it hinders the creation of both multilingual knowledge bases and manually-curated training sets. In this work, we first introduce the reader to the task of WSD through a short historical digression and then take stock of the advancements made to alleviate the knowledge acquisition bottleneck problem. In doing so, we survey the literature on manual, semi-automatic and automatic approaches to creating English and multilingual corpora tagged with sense annotations, and present a clear overview of supervised models for WSD. Finally, we provide our view on the future directions that we foresee for the field.
【Keywords】: Natural Language Processing: general;
【Paper Link】 【Pages】:4943-4950
【Authors】: Luc De Raedt ; Sebastijan Dumancic ; Robin Manhaeve ; Giuseppe Marra
【Abstract】: Neuro-symbolic and statistical relational artificial intelligence both integrate frameworks for learning with logical reasoning. This survey identifies several parallels across seven different dimensions between these two fields. These cannot only be used to characterize and position neuro-symbolic artificial intelligence approaches but also to identify a number of directions for further research.
【Keywords】: Knowledge Representation and Reasoning: general;
【Paper Link】 【Pages】:4951-4958
【Authors】: Ruohan Zhang ; Akanksha Saran ; Bo Liu ; Yifeng Zhu ; Sihang Guo ; Scott Niekum ; Dana H. Ballard ; Mary M. Hayhoe
【Abstract】: Human gaze reveals a wealth of information about internal cognitive state. Thus, gaze-related research has significantly increased in computer vision, natural language processing, decision learning, and robotics in recent years. We provide a high-level overview of the research efforts in these fields, including collecting human gaze data sets, modeling gaze behaviors, and utilizing gaze information in various applications, with the goal of enhancing communication between these research areas. We discuss future challenges and potential applications that work towards a common goal of human-centered artificial intelligence.
【Keywords】: Human aspects in AI: general; Machine Learning: general; Vision: general;
【Paper Link】 【Pages】:4959-4965
【Authors】: Giuseppe De Giacomo ; Antonio Di Stasio ; Francesco Fuggitti ; Sasha Rubin
【Abstract】: We review PLTLf and PLDLf, the pure-past versions of the well-known logics on finite traces LTLf and LDLf, respectively. PLTLf and PLDLf are logics about the past, and so scan the trace backwards from the end towards the beginning. Because of this, we can exploit a foundational result on reverse languages to get an exponential improvement, over LTLf/LDLf, for computing the corresponding DFA. This exponential improvement is reflected in several forms of sequential decision making involving temporal specifications, such as planning and decision problems in non-deterministic and non-Markovian domains. Interestingly, PLTLf (resp., PLDLf) has the same expressive power as LTLf (resp., LDLf), but transforming a PLTLf (resp., PLDLf) formula into its equivalent LTLf (resp., LDLf) formula is quite expensive. Hence, to take advantage of the exponential improvement, properties of interest must be directly expressed in PLTLf/PLDLf.
【Keywords】: Planning and Scheduling: general; Knowledge Representation and Reasoning: general;
【Paper Link】 【Pages】:4966-4972
【Authors】: Toby Walsh
【Abstract】: I survey recent progress on a classic and challenging problem in social choice: the fair division of indivisible items. I discuss how a computational perspective has provided interesting insights into and understanding of how to divide items fairly and efficiently. This has involved bringing to bear tools such as those used in knowledge representation, computational complexity, approximation methods, game theory, online analysis and communication complexity.
【Keywords】: Agent-based and Multi-agent Systems: general;
【Paper Link】 【Pages】:4973-4980
【Authors】: Zheng Wang ; Zhixiang Wang ; Yinqiang Zheng ; Yang Wu ; Wenjun Zeng ; Shin'ichi Satoh
【Abstract】: An efficient and effective person re-identification (ReID) system relieves the users from painful and boring video watching and accelerates the process of video analysis. Recently, with the explosive demands of practical applications, a lot of research effort has been dedicated to heterogeneous person re-identification (Hetero-ReID). In this paper, we provide a comprehensive review of state-of-the-art Hetero-ReID methods that address the challenge of inter-modality discrepancies. According to the application scenario, we classify the methods into four categories --- low-resolution, infrared, sketch, and text. We begin with an introduction to ReID, and make a comparison between Homogeneous ReID (Homo-ReID) and Hetero-ReID tasks. Then, we describe and compare existing datasets for performing evaluations, and survey the models that have been widely employed in Hetero-ReID. We also summarize and compare the representative approaches from two perspectives, i.e., the application scenario and the learning pipeline. We conclude with a discussion of some future research directions. Follow-up updates are available at https://github.com/lightChaserX/Awesome-Hetero-reID
【Keywords】: Vision: general; Human aspects in AI: general; Information Retrieval and Filtering: general;
【Paper Link】 【Pages】:4981-4987
【Authors】: Fanzhen Liu ; Shan Xue ; Jia Wu ; Chuan Zhou ; Wenbin Hu ; Cécile Paris ; Surya Nepal ; Jian Yang ; Philip S. Yu
【Abstract】: As communities represent similar opinions, similar functions, similar purposes, etc., community detection is an important and extremely useful tool in both scientific inquiry and data analytics. However, the classic methods of community detection, such as spectral clustering and statistical inference, are falling by the wayside as deep learning techniques demonstrate an increasing capacity to handle high-dimensional graph data with impressive performance. Thus, a survey of current progress in community detection through deep learning is timely. Structured around three broad research streams in this domain – deep neural networks, deep graph embedding, and graph neural networks – this article summarizes the contributions of the various frameworks, models, and algorithms in each stream, along with the current challenges that remain unsolved and the future research opportunities yet to be explored.
【Keywords】: Machine Learning: general;
【Paper Link】 【Pages】:4988-4996
【Authors】: Junchi Yan ; Shuang Yang ; Edwin R. Hancock
【Abstract】: This survey gives a selective review of recent developments in machine learning (ML) for combinatorial optimization (CO), especially for graph matching. The synergy of these two well-developed areas (ML and CO) can potentially bring transformative change to artificial intelligence, whose foundation relates to these two building blocks. For its representativeness and wide applicability, this paper focuses on the problem of weighted graph matching, especially from the learning perspective. For graph matching, we show that many learning techniques, e.g. convolutional neural networks, graph neural networks, and reinforcement learning, can be effectively incorporated in the paradigm for extracting the node features, graph structure features, and even the matching engine. We further present an outlook on new settings for learning graph matching, directions towards combinatorial optimization solvers more tightly integrated with prediction models, and the mutual embrace of traditional solvers and machine learning components.
【Keywords】: Constraints and Satisfiability: general; Machine Learning: general; Planning and Scheduling: general;
【Paper Link】 【Pages】:4997-5003
【Authors】: Sheng Li ; Handong Zhao
【Abstract】: Artificial intelligence systems are changing every aspect of our daily life. In the past decades, numerous approaches have been developed to characterize user behavior, in order to deliver personalized experiences to users in scenarios like online shopping or movie recommendation. This paper presents a comprehensive survey of recent advances in user modeling from the perspective of representation learning. In particular, we formulate user modeling as a process of learning latent representations for users. We discuss both static and sequential representation learning methods for the purpose of user modeling, and review representative approaches in each category, such as matrix factorization, deep collaborative filtering, and recurrent neural networks. Both shallow and deep learning methods are reviewed and discussed. Finally, we conclude this survey and discuss a number of open research problems that would inspire further research in this field.
【Keywords】: Information Retrieval and Filtering: general; Machine Learning: general;
【Paper Link】 【Pages】:5005-5009
【Authors】: Felicidad Aguado ; Pedro Cabalar ; Jorge Fandinno ; David Pearce ; Gilberto Pérez ; Concepción Vidal
【Abstract】: This work tackles the problem of checking strong equivalence of logic programs that may contain local auxiliary atoms, to be removed from their stable models and to be forbidden in any external context. We call this property projective strong equivalence (PSE). It has been recently proved that not any logic program containing auxiliary atoms can be reformulated, under PSE, as another logic program or formula without them -- this is known as strongly persistent forgetting. In this paper, we introduce a conservative extension of Equilibrium Logic and its monotonic basis, the logic of Here-and-There, in which we deal with a new connective we call fork. We provide a semantic characterisation of PSE for forks and use it to show that, in this extension, it is always possible to forget auxiliary atoms under strong persistence. We further define when the obtained fork is representable as a regular formula.
【Keywords】: Knowledge Representation and Reasoning: Logics for Knowledge Representation; Knowledge Representation and Reasoning: Non-monotonic Reasoning, Common-Sense Reasoning; Knowledge Representation and Reasoning: Knowledge Representation Languages;
【Paper Link】 【Pages】:5010-5014
【Authors】: Mohammad Mahdi Amirian ; Saeed Shiry Ghidary
【Abstract】: We present improvements in maximum a-posteriori inference for Markov Logic, a widely used SRL formalism. Several approaches, including Cutting Plane Aggregation (CPA), perform inference through translation to Integer Linear Programs. Aggregation exploits context-specific symmetries independently of evidence and reduces the size of the program. We show that many more symmetries occur in long ground clauses; these are ignored by CPA but can be exploited by higher-order aggregations. We propose Full-Constraint-Aggregation, an algorithm superior to CPA which exploits the ignored symmetries via a lifted translation method and some constraint relaxations. RDBMS and heuristic techniques are involved to improve the overall performance. We introduce Xeggora as an evolutionary extension of RockIt, the query engine that uses CPA. Xeggora evaluation on real-world benchmarks shows progress in efficiency compared to RockIt, especially for models with long formulas.
【Keywords】: Machine Learning: Probabilistic Machine Learning; Constraints and SAT: SAT: Solvers and Applications; Knowledge Representation and Reasoning: Reasoning about Knowledge and Belief; Knowledge Representation and Reasoning: Logics for Knowledge Representation;
【Paper Link】 【Pages】:5015-5019
【Authors】: Nelly Barbot ; Laurent Miclet ; Henri Prade
【Abstract】: Analogical proportions are statements of the form “x is to y as z is to t”, where x, y, z, t are items of the same nature, or not. In this paper, we more particularly consider “relational proportions” of the form “object A has the same relationship with attribute a as object B with attribute b”. We provide a formal definition for relational proportions, and investigate how they can be extracted from a formal context, in the setting of formal concept analysis.
【Keywords】: Knowledge Representation and Reasoning: Case-based Reasoning; Knowledge Representation and Reasoning: Other; Data Mining: Theoretical Foundations;
【Paper Link】 【Pages】:5020-5024
【Authors】: Omer Ben-Porat ; Lital Kuchy ; Sharon Hirsch ; Guy Elad ; Roi Reichart ; Moshe Tennenholtz
【Abstract】: The connection between messaging and action is fundamental both to web applications, such as web search and sentiment analysis, and to economics. However, while prominent online applications exploit messaging in natural (human) language in order to predict non-strategic action selection, the economics literature focuses on the connection between structured stylized messaging to strategic decisions in games and multi-agent encounters. This paper aims to connect these two strands of research, which we consider highly timely and important due to the vast online textual communication on the web. Particularly, we introduce the following question: can free text expressed in natural language serve for the prediction of action selection in an economic context, modeled as a game? We initiate research on this question by providing preliminary positive results.
【Keywords】: Agent-based and Multi-agent Systems: Economic Paradigms, Auctions and Market-Based Systems; Machine Learning Applications: Applications of Supervised Learning; Natural Language Processing: Text Classification;
【Paper Link】 【Pages】:5025-5029
【Authors】: Piero A. Bonatti
【Abstract】: Many modern applications of description logics (DLs, for short), such as biomedical ontologies and semantic web policies, provide compelling motivations for extending DLs with an overriding mechanism analogous to the homonymous feature of object-oriented programming. Rational closure (RC) is one of the candidate semantics for such extensions, and one of the most intensively studied. So far, however, it has been limited to strict fragments of SROIQ(D) – the logic on which OWL2 is founded. In this paper we prove that RC cannot be extended to logics that do not satisfy the disjoint model union property, including SROIQ(D). Then we introduce a refinement of RC called stable rational closure that overcomes the dependency on the disjoint model union property. Our results show that stable RC is a natural extension of RC. However, its positive features come at a price: stable RC re-introduces one of the undesirable features of other nonmonotonic logics, namely, deductive closures may not exist and may not be unique.
【Keywords】: Knowledge Representation and Reasoning: Non-monotonic Reasoning, Common-Sense Reasoning; Knowledge Representation and Reasoning: Description Logics and Ontologies; Knowledge Representation and Reasoning: Logics for Knowledge Representation;
【Paper Link】 【Pages】:5030-5034
【Authors】: Jirí Cermák ; Viliam Lisý ; Branislav Bosanský
【Abstract】: Information abstraction is one of the methods for tackling large extensive-form games (EFGs). Removing some information available to players reduces the memory required for computing and storing strategies. We present novel domain-independent abstraction methods for creating very coarse abstractions of EFGs that still compute strategies that are (near) optimal in the original game. First, the methods start with an arbitrary abstraction of the original game (domain-specific or the coarsest possible). Next, they iteratively detect which information is required in the abstract game so that a (near) optimal strategy in the original game can be found and include this information into the abstract game. Moreover, the methods are able to exploit imperfect-recall abstractions where players can even forget the history of their own actions. We present two algorithms that follow these steps -- FPIRA, based on fictitious play, and CFR+IRA, based on counterfactual regret minimization. The experimental evaluation confirms that our methods can closely approximate Nash equilibrium of large games using abstraction with only 0.9% of information sets of the original game.
【Keywords】: Agent-based and Multi-agent Systems: Noncooperative Games; Uncertainty in AI: Sequential Decision Making;
【Paper Link】 【Pages】:5035-5039
【Authors】: Martin C. Cooper ; Achref El Mouelhi ; Cyril Terrioux
【Abstract】: We investigate rules which allow variable elimination in binary CSP (constraint satisfaction problem) instances while conserving satisfiability. We propose new rules and compare them, both theoretically and experimentally. We give optimised algorithms to apply these rules and show that each defines a novel tractable class. Using our variable-elimination rules in preprocessing allowed us to solve more benchmark problems than without.
【Keywords】: Constraints and SAT: Constraint Satisfaction;
【Paper Link】 【Pages】:5040-5044
【Authors】: Yi-Dong Shen ; Thomas Eiter
【Abstract】: [Gelfond and Lifschitz, 1991] introduced simple disjunctive logic programs and defined the answer set semantics called GL-semantics. We observed that the requirement of GL-semantics, i.e., that an answer set should be a minimal model of the GL-reduct, may be too strong and may exclude some answer sets that would be reasonably acceptable. To address this, we present a novel and more permissive semantics, called determining inference semantics.
【Keywords】: Knowledge Representation and Reasoning: Knowledge Representation Languages; Knowledge Representation and Reasoning: Non-monotonic Reasoning, Common-Sense Reasoning; Knowledge Representation and Reasoning: Logics for Knowledge Representation;
【Paper Link】 【Pages】:5045-5049
【Authors】: Oswin Krause ; Asja Fischer ; Christian Igel
【Abstract】: Estimating the normalization constants (partition functions) of energy-based probabilistic models (Markov random fields) with high accuracy is required for measuring performance, monitoring the training progress of adaptive models, and conducting likelihood ratio tests. We devised a unifying theoretical framework for algorithms that estimate the partition function, including Annealed Importance Sampling (AIS) and Bennett's Acceptance Ratio method (BAR). The unification reveals conceptual similarities of, and differences between, the different approaches and suggests new algorithms. The framework is based on a generalized form of Crooks' equality, which links the expectation over a distribution of samples generated by a transition operator to the expectation over the distribution induced by the reversed operator. Different ways of sampling, such as parallel tempering and path sampling, are covered by the framework. We performed experiments in which we estimated the partition function of restricted Boltzmann machines (RBMs) and Ising models. We found that BAR using parallel tempering worked well with a small number of bridging distributions, while path-sampling-based AIS performed best with many bridging distributions. The normalization constant is measured w.r.t. a reference distribution, and the choice of this distribution turned out to be very important in our experiments. Overall, BAR gave the best empirical results, outperforming AIS.
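The simplest member of the family of estimators this line of work unifies is a plain importance-sampling estimate of a normalization constant w.r.t. a reference distribution. A minimal sketch, assuming a 1-D Gaussian-shaped energy model and a uniform reference distribution (both illustrative choices, not the RBMs or Ising models used in the paper):

```python
import math
import random

random.seed(0)

# Unnormalized target: p*(x) = exp(-x^2 / 2); the true normalizer is Z = sqrt(2*pi).
def p_star(x):
    return math.exp(-x * x / 2.0)

# Reference distribution: uniform on [-5, 5], density 1/(hi - lo) (an illustrative choice).
lo, hi = -5.0, 5.0
n = 200_000
samples = [random.uniform(lo, hi) for _ in range(n)]

# Z ≈ E_ref[ p*(x) / q(x) ], with q(x) = 1/(hi - lo) on the support.
z_hat = sum(p_star(x) * (hi - lo) for x in samples) / n

true_z = math.sqrt(2 * math.pi)  # ≈ 2.5066
assert abs(z_hat - true_z) / true_z < 0.02
```

The variance of this naive estimator grows quickly as target and reference distributions diverge, which is exactly what bridging distributions in AIS and BAR are designed to mitigate.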
【Keywords】: Machine Learning: Learning Graphical Models; Machine Learning: Learning Generative Models;
【Paper Link】 【Pages】:5050-5054
【Authors】: James R. Foulds ; Mijung Park ; Kamalika Chaudhuri ; Max Welling
【Abstract】: Many applications of Bayesian data analysis involve sensitive information such as personal documents or medical records, motivating methods which ensure that privacy is protected. We introduce a general privacy-preserving framework for Variational Bayes (VB), a widely used optimization-based Bayesian inference method. Our framework respects differential privacy, the gold-standard privacy criterion. The iterative nature of variational Bayes presents a challenge since iterations increase the amount of noise needed to ensure privacy. We overcome this by combining: (1) an improved composition method, called the moments accountant, and (2) the privacy amplification effect of subsampling mini-batches from large-scale data in stochastic learning. We empirically demonstrate the effectiveness of our method on LDA topic models, evaluated on Wikipedia. In the full paper we extend our method to a broad class of models, including Bayesian logistic regression and sigmoid belief networks.
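The privacy noise at the heart of such frameworks is typically injected via the Gaussian mechanism. A minimal sketch, assuming the classic (ε, δ) calibration for a clipped mean; the data and parameters are illustrative assumptions, not the paper's LDA setting:

```python
import math
import random

random.seed(1)

# Gaussian mechanism: add noise calibrated to the query's sensitivity.
# Classic calibration: sigma >= sqrt(2 ln(1.25/delta)) * sensitivity / eps.
def gaussian_mechanism(value, sensitivity, eps, delta):
    sigma = math.sqrt(2 * math.log(1.25 / delta)) * sensitivity / eps
    return value + random.gauss(0.0, sigma)

# Private mean of values clipped to [0, 1]: the mean has sensitivity 1/n.
data = [0.2, 0.4, 0.9, 0.5, 0.7]
true_mean = sum(data) / len(data)
private_mean = gaussian_mechanism(true_mean, sensitivity=1 / len(data), eps=1.0, delta=1e-5)
```

Iterating such a mechanism (as variational Bayes must) accumulates privacy loss across iterations; the moments accountant mentioned in the abstract gives a tighter bound on that accumulation than naive composition.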
【Keywords】: Multidisciplinary Topics and Applications: Security and Privacy; Uncertainty in AI: Approximate Probabilistic Inference; Machine Learning: Probabilistic Machine Learning; Natural Language Processing: Natural Language Processing;
【Paper Link】 【Pages】:5055-5059
【Authors】: Vincent François-Lavet ; Guillaume Rabusseau ; Joelle Pineau ; Damien Ernst ; Raphael Fonteneau
【Abstract】: When an agent has limited information on its environment, the suboptimality of an RL algorithm can be decomposed into the sum of two terms: a term related to an asymptotic bias (suboptimality with unlimited data) and a term due to overfitting (additional suboptimality due to limited data). In the context of reinforcement learning with partial observability, this paper provides an analysis of the tradeoff between these two error sources. In particular, our theoretical analysis formally characterizes how a smaller state representation increases the asymptotic bias while decreasing the risk of overfitting.
【Keywords】: Machine Learning: Reinforcement Learning; Planning and Scheduling: POMDPs; Knowledge Representation and Reasoning: Reasoning about Knowledge and Belief; Machine Learning: Learning Theory;
【Paper Link】 【Pages】:5060-5064
【Authors】: Patrick Hohenecker ; Thomas Lukasiewicz
【Abstract】: The ability to conduct logical reasoning is a fundamental aspect of intelligent human behavior, and thus an important problem along the way to human-level artificial intelligence. Traditionally, logic-based symbolic methods from the field of knowledge representation and reasoning have been used to equip agents with capabilities that resemble human logical reasoning qualities. More recently, however, there has been an increasing interest in using machine learning rather than logic-based symbolic formalisms to tackle these tasks. In this paper, we employ state-of-the-art methods for training deep neural networks to devise a novel model that is able to learn how to effectively perform logical reasoning in the form of basic ontology reasoning.
【Keywords】: Machine Learning: Neuro-Symbolic Methods; Knowledge Representation and Reasoning: Description Logics and Ontologies; Machine Learning: Deep Learning;
【Paper Link】 【Pages】:5065-5069
【Authors】: Dieuwke Hupkes ; Verna Dankers ; Mathijs Mul ; Elia Bruni
【Abstract】: Despite a multitude of empirical studies, little consensus exists on whether neural networks are able to generalise compositionally. As a response to this controversy, we present a set of tests that provide a bridge between, on the one hand, the vast amount of linguistic and philosophical theory about compositionality of language and, on the other, the successful neural models of language. We collect different interpretations of compositionality and translate them into five theoretically grounded tests for models that are formulated on a task-independent level. To demonstrate the usefulness of this evaluation paradigm, we instantiate these five tests on a highly compositional data set which we dub PCFG SET, apply the resulting tests to three popular sequence-to-sequence models and provide an in-depth analysis of the results.
【Keywords】: Natural Language Processing: Natural Language Semantics; Machine Learning: Deep Learning; Natural Language Processing: Natural Language Processing; Machine Learning: Interpretability;
【Paper Link】 【Pages】:5070-5074
【Authors】: O-Joun Lee ; Jason J. Jung
【Abstract】: This study aims to represent stories in narrative works (i.e., creative works that contain stories) with a fixed-length vector. We apply subgraph-based graph embedding models to dynamic social networks of characters that appear in stories (character networks). We suppose that interactions between characters reflect the content of stories. We discretize the interactions by discovering the subgraphs and learn representations of stories by predicting occurrences of the subgraphs in the corresponding character networks. We find subgraphs rooted in each character in each scene at multiple scales, using the WL (Weisfeiler-Lehman) relabeling process. To predict occurrences of subgraphs, we apply two approaches: (i) considering changes in subgraphs according to scenes and (ii) focusing on subgraphs in the last scene. We evaluated the proposed models by measuring the similarity between real movies using the vector representations generated by the models.
【Keywords】: Machine Learning: Learning Generative Models; Multidisciplinary Topics and Applications: Social Sciences; Machine Learning Applications: Networks;
【Paper Link】 【Pages】:5075-5079
【Authors】: Salvador Lucas
【Abstract】: The semantics of computational systems (e.g., relational and knowledge data bases, query-answering systems, programming languages, etc.) can often be expressed as (the specification of) a logical theory Th. Queries, goals, and claims about the behavior or features of the system can be expressed as formulas φ which should be checked with respect to the intended model of Th, which is often huge or even incomputable. In this paper we show how to prove such semantic properties φ of Th by just finding a model A of Th∪{φ}∪Zφ, where Zφ is an appropriate (possibly empty) theory depending on φ only. Applications to relational and deductive databases, rewriting-based systems, logic programming, and answer set programming are discussed.
【Keywords】: Knowledge Representation and Reasoning: Automated Reasoning; Tractable Languages and Knowledge compilation; Knowledge Representation and Reasoning: Logics for Knowledge Representation; Multidisciplinary Topics and Applications: Validation and Verification;
【Paper Link】 【Pages】:5080-5084
【Authors】: Pavlos Vougiouklis ; Eddy Maddalena ; Jonathon S. Hare ; Elena Simperl
【Abstract】: We investigate the problem of generating natural language summaries from knowledge base triples. Our approach is based on a pointer-generator network, which, in addition to generating regular words from a fixed target vocabulary, is able to verbalise triples in several ways. We undertake an automatic and a human evaluation on single- and open-domain summary generation tasks. Both show that our approach significantly outperforms other data-driven baselines.
【Keywords】: Natural Language Processing: Natural Language Generation; Machine Learning: Deep Generative Models; Machine Learning: Knowledge-based Learning;
【Paper Link】 【Pages】:5085-5089
【Authors】: Artuur Leeuwenberg ; Marie-Francine Moens
【Abstract】: Time is deeply woven into how people perceive, and communicate about, the world. Almost unconsciously, we provide our language utterances with temporal cues, like verb tenses, and we can hardly produce sentences without such cues. Extracting temporal cues from text, and constructing a global temporal view about the order of described events, is a major challenge of automatic natural language understanding. Temporal reasoning, the process of combining different temporal cues into a coherent temporal view, plays a central role in temporal information extraction. This article presents a comprehensive survey of the research from the past decades on temporal reasoning for automatic temporal information extraction from text, providing a case study on the integration of symbolic reasoning with machine learning-based information extraction systems.
【Keywords】: Knowledge Representation and Reasoning: Qualitative, Geometric, Spatial, Temporal Reasoning; Natural Language Processing: Information Extraction; Machine Learning: Structured Prediction; Natural Language Processing: Natural Language Semantics;
【Paper Link】 【Pages】:5090-5094
【Authors】: Kedian Mu
【Abstract】: As one of the fundamental properties used to characterize inconsistency measures for knowledge bases, the property of free formula independence captures well the intuition that free formulas are independent of the amount of inconsistency in a knowledge base for cases where inconsistency is characterized in terms of minimal inconsistent subsets. But it has been argued that not all free formulas are independent of inconsistency in some other contexts of inconsistency characterization. In this paper, we propose a notion of Bi-free formula to describe formulas that are free from inconsistency in both syntactic characterization and paraconsistent models in the framework of Priest's minimally inconsistent LP. Then we propose the property of Bi-free formula independence, which is more suitable for characterizing the role of formulas free from inconsistency in measuring inconsistency from both syntactic and semantic perspectives.
【Keywords】: Knowledge Representation and Reasoning: Logics for Knowledge Representation;
【Paper Link】 【Pages】:5095-5099
【Authors】: Alvaro Perez-Diaz ; Enrico H. Gerding ; Frank McGroarty
【Abstract】: We consider a scenario where self-interested Electric Vehicle (EV) aggregators compete in the day-ahead electricity market in order to purchase the electricity needed to meet EV requirements. We propose a novel decentralised bidding coordination algorithm based on the Alternating Direction Method of Multipliers (ADMM). Our simulations using real market and driver data from Spain show that the algorithm is able to significantly reduce energy costs for all participants. Furthermore, we postulate that strategic manipulation by deviating agents is possible in decentralised algorithms like ADMM. Hence, we describe and analyse different possible attack vectors and propose a mathematical framework to quantify and detect manipulation. Our simulations show that our ADMM-based algorithm can be effectively disrupted by manipulative attacks, achieving convergence to a different, non-optimal solution that benefits the attacker. At the same time, our proposed manipulation detection algorithm achieves very high accuracy.
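The coordination pattern behind ADMM-based bidding can be illustrated on a toy consensus problem: each agent privately optimises its own objective, and a coordination step plus dual updates drive all agents toward agreement. A minimal sketch, assuming quadratic local objectives and three agents; the targets and penalty parameter are illustrative, not the paper's EV market model:

```python
# Consensus ADMM on a toy problem: agent i minimizes (x - a_i)^2 subject to
# agreement on a shared value z. The optimum of the sum is the mean of the a_i.
a = [1.0, 3.0, 8.0]   # illustrative local targets
rho = 1.0             # illustrative penalty parameter
z = 0.0
x = [0.0] * len(a)
u = [0.0] * len(a)    # scaled dual variables

for _ in range(200):
    # Local (private) updates: argmin_x (x - a_i)^2 + (rho/2)(x - z + u_i)^2
    x = [(2 * a_i + rho * (z - u_i)) / (2 + rho) for a_i, u_i in zip(a, u)]
    # Coordination step: average of local variables plus scaled duals.
    z = sum(x_i + u_i for x_i, u_i in zip(x, u)) / len(a)
    # Dual updates penalize disagreement with the consensus value.
    u = [u_i + x_i - z for x_i, u_i in zip(u, x)]

assert abs(z - sum(a) / len(a)) < 1e-6
```

The attack surface the abstract describes comes from exactly these exchanged quantities: a deviating agent can misreport its local update and steer the consensus away from the optimum.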
【Keywords】: Agent-based and Multi-agent Systems: Coordination and Cooperation; Planning and Scheduling: Distributed;Multi-agent Planning; Multidisciplinary Topics and Applications: Transportation;
【Paper Link】 【Pages】:5100-5104
【Authors】: Gilles Pesant
【Abstract】: The distinctive driving force of constraint programming (CP) to solve combinatorial problems has been a privileged access to problem structure through the high-level models it uses. We investigate a richer propagation medium for CP made possible by recent work on counting solutions inside constraints. Beliefs about individual variable-value assignments are exchanged between constraints and iteratively adjusted. Its advantage over standard belief propagation is that the higher-level models do not tend to create as many cycles, which are known to be problematic for convergence. We find that it significantly improves search guidance.
【Keywords】: Constraints and SAT: Constraints: Modeling, Solvers, Applications; Constraints and SAT: Constraints and Data Mining ; Constraints and Machine Learning; Heuristic Search and Game Playing: Combinatorial Search and Optimisation;
【Paper Link】 【Pages】:5105-5109
【Authors】: Matthew Stephenson ; Jochen Renz ; Xiaoyu Ge
【Abstract】: In this paper we present several proofs for the computational complexity of the physics-based video game Angry Birds. We are able to demonstrate that solving levels for different versions of Angry Birds is either NP-hard, PSPACE-hard, PSPACE-complete or EXPTIME-hard, depending on the maximum number of birds available and whether the game engine is deterministic or stochastic. We believe that this is the first time that a single-player video game has been proven EXPTIME-hard.
【Keywords】: Knowledge Representation and Reasoning: Computational Complexity of Reasoning; Multidisciplinary Topics and Applications: Computer Games;
【Paper Link】 【Pages】:5110-5114
【Authors】: Rodrigo Agerri ; German Rigau
【Abstract】: In this paper we present a language independent system to model Opinion Target Extraction (OTE) as a sequence labelling task. The system consists of a combination of clustering features implemented on top of a simple set of shallow local features. Experiments on the well known Aspect Based Sentiment Analysis (ABSA) benchmarks show that our approach is very competitive across languages, obtaining, at the time of writing, best results for six languages in seven different datasets. Furthermore, the results provide further insights into the behaviour of clustering features for sequence labeling tasks. Finally, we also show that these results can be outperformed by recent advances in contextual word embeddings and the transformer architecture. The system and models generated in this work are available for public use and to facilitate reproducibility of results.
【Keywords】: Natural Language Processing: Sentiment Analysis and Text Mining; Natural Language Processing: Information Extraction; Natural Language Processing: NLP Applications and Tools;
【Paper Link】 【Pages】:5115-5119
【Authors】: Zhenisbek Assylbekov ; Rustem Takhanov
【Abstract】: This paper takes a step towards the theoretical analysis of the relationship between word embeddings and context embeddings in models such as word2vec. We start from basic probabilistic assumptions on the nature of word vectors, context vectors, and text generation. These assumptions are supported either empirically or theoretically by the existing literature. Next, we show that under these assumptions the widely-used word-word PMI matrix is approximately a random symmetric Gaussian ensemble. This, in turn, implies that context vectors are reflections of word vectors in approximately half the dimensions. As a direct application of our result, we suggest a theoretically grounded way of tying weights in the SGNS model.
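The word-word PMI matrix at the center of this analysis can be computed directly from co-occurrence counts. A minimal sketch, assuming a toy corpus and a symmetric context window (both illustrative); note that a symmetric window makes the resulting PMI matrix symmetric, the setting in which the paper's random-matrix view applies:

```python
import numpy as np

# Toy corpus and symmetric co-occurrence window (both are illustrative assumptions).
corpus = "the cat sat on the mat the dog sat on the rug".split()
window = 2

vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}

# Count co-occurrences within the window, in both directions.
cooc = np.zeros((len(vocab), len(vocab)))
for i, w in enumerate(corpus):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if i != j:
            cooc[idx[w], idx[corpus[j]]] += 1

# PMI(w, c) = log( P(w, c) / (P(w) P(c)) ), with a small epsilon against log(0).
total = cooc.sum()
p_wc = cooc / total
p_w = cooc.sum(axis=1) / total
eps = 1e-12
pmi = np.log((p_wc + eps) / (np.outer(p_w, p_w) + eps))

# A symmetric window yields a symmetric co-occurrence matrix, hence symmetric PMI.
assert np.allclose(pmi, pmi.T)
```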
【Keywords】: Natural Language Processing: Embeddings; Machine Learning: Probabilistic Machine Learning; Machine Learning: Tensor and Matrix Methods; Machine Learning: Unsupervised Learning;
【Paper Link】 【Pages】:5120-5124
【Authors】: Pavel Naumov ; Jia Tao
【Abstract】: Logical systems containing knowledge and know-how modalities have been investigated in several recent works. Independently, epistemic modal logics in which every knowledge modality is labeled with a degree of uncertainty have been proposed. This article combines these two research lines by introducing a bimodal logic containing knowledge and know-how modalities, both labeled with a degree of uncertainty. The main technical results are soundness, completeness, and incompleteness of the proposed logical system with respect to two classes of semantics.
【Keywords】: Knowledge Representation and Reasoning: Logics for Knowledge Representation; Knowledge Representation and Reasoning: Knowledge Representation and Game Theory; Social Choice; Knowledge Representation and Reasoning: Reasoning about Knowledge and Belief;
【Paper Link】 【Pages】:5125-5129
【Authors】: Michael C. Thrun ; Alfred Ultsch
【Abstract】: The Databionic swarm (DBS) is a flexible and robust clustering framework that consists of three independent modules: swarm-based projection, high-dimensional data visualization, and representation-guided clustering. The first module is the parameter-free projection method Pswarm, which exploits concepts of self-organization and emergence, game theory, and swarm intelligence. The second module is a parameter-free high-dimensional data visualization technique called the topographic map. It uses the generalized U-matrix, which enables one to estimate, first, whether any cluster tendency exists and, second, the number of clusters. The third module offers a clustering method which can be verified by the visualization and vice versa. Benchmarking w.r.t. conventional algorithms demonstrated that DBS can outperform them. Several applications showed that cluster structures provided by DBS are meaningful. As an example, a clustering of worldwide country-related data w.r.t. the COVID-19 pandemic is presented here. Code and data are made available as open source.
【Keywords】: Agent-based and Multi-agent Systems: Multi-agent Learning; Agent-based and Multi-agent Systems: Noncooperative Games; Computer Vision: Statistical Methods and Machine Learning; Data Mining: Clustering, Unsupervised Learning;
【Paper Link】 【Pages】:5130-5134
【Authors】: Eric Timmons ; Brian C. Williams
【Abstract】: State estimation methods based on hybrid discrete and continuous state models have emerged as a method of precisely computing belief states for real-world systems; however, they have difficulty scaling to systems with more than a handful of components. Classical, consistency-based diagnosis methods scale to this level by combining best-first enumeration and conflict-directed search. While best-first methods have been developed for hybrid estimation, conflict-directed methods have thus far been elusive, as conflicts summarize constraint violations, but probabilistic hybrid estimation is relatively unconstrained. In this paper we present an approach (ABC) that unifies best-first enumeration and conflict-directed search in relatively unconstrained problems through the concept of "bounding" conflicts, an extension of conflicts that represent tighter bounds on the cost of regions of the search space. Experiments show that an ABC-powered state estimator produces estimates up to an order of magnitude faster than the current state of the art, particularly on large systems.
【Keywords】: Heuristic Search and Game Playing: Combinatorial Search and Optimisation; Uncertainty in AI: Approximate Probabilistic Inference;
【Paper Link】 【Pages】:5135-5139
【Authors】: Ferdinando Fioretto ; Pascal Van Hentenryck
【Abstract】: Many applications of machine learning and optimization operate on sensitive data streams, posing significant privacy risks for individuals whose data appear in the stream. Motivated by an application in energy systems, this paper presents OptStream, a novel algorithm for releasing differentially private data streams under the w-event model of privacy. The procedure ensures privacy while guaranteeing bounded error on the released data stream. OptStream is evaluated on a test case involving the release of a real data stream from the largest European transmission operator. Experimental results show that OptStream may not only improve the accuracy of state-of-the-art methods by at least one order of magnitude but also support accurate load forecasting on the privacy-preserving data.
【Keywords】: Multidisciplinary Topics and Applications: Security and Privacy; Constraints and SAT: Constraint Optimization;
【Paper Link】 【Pages】:5140-5144
【Authors】: Xi Alice Gao ; James R. Wright ; Kevin Leyton-Brown
【Abstract】: In many settings, an effective way of evaluating objects of interest is to collect evaluations from dispersed individuals and to aggregate these evaluations together. Some examples are categorizing online content and evaluating student assignments via peer grading. For this data science problem, one challenge is to motivate participants to conduct such evaluations carefully and to report them honestly, particularly when doing so is costly. Existing approaches, notably peer-prediction mechanisms, can incentivize truth telling in equilibrium. However, they also give rise to equilibria in which agents do not pay the costs required to evaluate accurately, and hence fail to elicit useful information. We show that this problem is unavoidable whenever agents are able to coordinate using low-cost signals about the items being evaluated (e.g., text labels or pictures). We then consider ways of circumventing this problem by comparing agents' reports to ground truth, which is available in practice when there exist trusted evaluators---such as teaching assistants in the peer grading scenario---who can perform a limited number of unbiased (but noisy) evaluations. Of course, when such ground truth is available, a simpler approach is also possible: rewarding each agent based on agreement with ground truth with some probability, and unconditionally rewarding the agent otherwise. Surprisingly, we show that the simpler mechanism achieves stronger incentive guarantees given less access to ground truth than a large set of peer-prediction mechanisms.
【Keywords】: Agent-based and Multi-agent Systems: Algorithmic Game Theory;
【Paper Link】 【Pages】:5145-5148
【Authors】: Dianmu Zhang ; Blake Hannaford
【Abstract】: Inverse kinematics solves the problem of how to control robot arm joints to achieve desired end effector positions, which is critical to any robot arm design and to implementations of control algorithms. It is a common misunderstanding that closed-form inverse kinematics analysis is solved. Popular software and algorithms, such as gradient descent or any multi-variate equation-solving algorithm, claim to solve inverse kinematics, but only at the numerical level. While numerical inverse kinematics solutions are relatively straightforward to obtain, these methods often fail, even when the inverse kinematics solutions exist. Therefore, closed-form inverse kinematics analysis is superior, but there is no generalized automated algorithm. Until now, the high-level logical reasoning involved in solving closed-form inverse kinematics has made it hard to automate, so it has been handled by human experts. We developed IKBT, a knowledge-based intelligent system that can mimic human experts' behaviors in solving closed-form inverse kinematics using Behavior Trees. Knowledge and rules used by engineers when solving closed-form inverse kinematics are encoded as actions in the Behavior Tree. The order of applying these rules is governed by higher-level composite nodes, which resembles the logical reasoning process of engineers. It is also the first time that the dependency of joint variables, an important issue in inverse kinematics analysis, is automatically tracked in graph form. Besides generating closed-form solutions, IKBT also explains its solving strategies in a human (engineer) interpretable form. This is a proof-of-concept of using Behavior Trees to solve high-cognitive problems.
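The core behavior-tree mechanics the paper builds on, composite nodes governing the order in which rule actions are tried, can be sketched generically. The node classes below are a standard minimal behavior-tree pattern; the toy "rules" and variable names are hypothetical, not IKBT's actual rule set:

```python
# Minimal behavior-tree sketch: a Sequence composite node ticks its children in
# order and succeeds only if all of them succeed, mirroring how a higher-level
# node can order rule applications.
class Action:
    def __init__(self, name, fn):
        self.name, self.fn = name, fn

    def tick(self, state):
        return self.fn(state)  # returns "success" or "failure"

class Sequence:
    """Succeeds only if every child succeeds, in order."""
    def __init__(self, children):
        self.children = children

    def tick(self, state):
        for child in self.children:
            if child.tick(state) != "success":
                return "failure"
        return "success"

# Two hypothetical "rules": solve a joint variable, then record its dependency.
def solve_var(state):
    state["solved"].append("theta1")
    return "success"

def track_dependency(state):
    if "theta1" in state["solved"]:
        state["deps"]["theta2"] = ["theta1"]
        return "success"
    return "failure"

state = {"solved": [], "deps": {}}
tree = Sequence([Action("solve", solve_var), Action("deps", track_dependency)])
assert tree.tick(state) == "success"
```

Because the second action fails unless the first has already run, the Sequence node enforces the kind of rule ordering the abstract attributes to composite nodes.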
【Keywords】: Knowledge Representation and Reasoning: Automated Reasoning; Tractable Languages and Knowledge compilation; Knowledge Representation and Reasoning: Reasoning about Knowledge and Belief; Humans and AI: Cognitive Systems; Robotics: Motion and Path Planning;
【Paper Link】 【Pages】:5150-5153
【Authors】: Mingming Gong
【Abstract】: Modern machine learning techniques can discover complicated statistical dependencies between random variables, usually in the form of a statistical model, and make use of these dependencies to perform predictions on future observations. However, many real problems involve causal inference, which aims to infer how the data generating system should behave under changing conditions. To perform causal inference, we need not only statistical dependencies but also causal structures to determine the system’s behavior under external interventions. In this paper, I will be focusing on two essential problems that bridge causality and learning and investigate how they can benefit from each other. On the one hand, since conducting randomized controlled experiments for causal structure discovery is often expensive or infeasible, it would be valuable to investigate how we can explore modern machine learning algorithms to search for causal structures from observational data. On the other hand, since causal structure provides information about the distribution changing properties, it can be used as a fundamental tool to tackle a major challenge for machine learning: the capability of generalization to new distributions and prediction in non-stationary environments.
【Keywords】: Knowledge Representation and Reasoning: Action, Change and Causality; Uncertainty in AI: Graphical Models;
【Paper Link】 【Pages】:5154-5158
【Authors】: Alexey Ignatiev
【Abstract】: Explainable artificial intelligence (XAI) represents arguably one of the most crucial challenges being faced by the area of AI these days. Although the majority of approaches to XAI are of heuristic nature, recent work proposed the use of abductive reasoning to compute provably correct explanations for machine learning (ML) predictions. The proposed rigorous approach was shown to be useful not only for computing trustable explanations but also for validating explanations computed heuristically. It was also applied to uncover a close relationship between XAI and verification of ML models. This paper overviews the advances of the rigorous logic-based approach to XAI and argues that it is indispensable if trustable XAI is of concern.
【Keywords】: Machine Learning: Explainable Machine Learning; Machine Learning: Classification; Constraints and SAT: Constraints and Data Mining ; Constraints and Machine Learning; Multidisciplinary Topics and Applications: Validation and Verification;
【Paper Link】 【Pages】:5159-5163
【Authors】: Rivka Levitan
【Abstract】: Entrainment, the phenomenon of conversational partners’ speech becoming more similar to each other, is generally accepted to be an important aspect of human-human and human-machine communication. However, there is a gap between accepted psycholinguistic models of entrainment and the body of empirical findings, which includes a large number of unexplained negative results. Existing research does not provide insights specific enough to guide the implementation of entraining spoken dialogue systems or the interpretation of entrainment as a measure of quality. A more integrated model of entrainment is proposed, which looks for consistent explanations of entrainment behavior on specific features and how they interact with speaker, session, and utterance characteristics.
【Keywords】: Natural Language Processing: Dialogue; Natural Language Processing: Speech; Humans and AI: Human-Computer Interaction;
【Paper Link】 【Pages】:5164-5168
【Authors】: Risheng Liu
【Abstract】: Numerous tasks at the core of statistics, learning, and vision areas are specific cases of ill-posed inverse problems. Recently, learning-based (e.g., deep) iterative methods have been empirically shown to be useful for these problems. Nevertheless, integrating learnable structures into iterations is still a laborious process, which can only be guided by intuitions or empirical insights. Moreover, there is a lack of rigorous analysis of the convergence behaviors of these reimplemented iterations, and thus the significance of such methods is a little bit vague. We move beyond these limits and propose a theoretically guaranteed optimization learning paradigm, a generic and provable paradigm for nonconvex inverse problems, and develop a series of convergent deep models. Our theoretical analysis reveals that the proposed optimization learning paradigm allows us to generate globally convergent trajectories for learning-based iterative methods. Thanks to the superiority of our framework, we achieve state-of-the-art performance on different real applications.
【Keywords】: Machine Learning: Deep Learning; Computer Vision: Structural and Model-Based Approaches, Knowledge Representation and Reasoning; Computer Vision: Biomedical Image Understanding; Constraints and SAT: Constraint Optimization;
【Paper Link】 【Pages】:5169-5173
【Authors】: Nicholas Mattei
【Abstract】: Research in both computational social choice and preference reasoning uses tools and techniques from computer science, generally algorithms and complexity analysis, to examine topics in group decision making. This has brought tremendous progress in the last decades, creating new avenues for research and results in areas including voting and resource allocation. I argue that of equal importance to the theoretical results are impacts in research and development from the empirical part of the computer scientist's toolkit: data, system building, and human interaction. I highlight work by myself and others to establish data-driven, application-driven research in the computational social choice and preference reasoning areas. Along the way, I highlight interesting application domains and important results from the community in driving this area to make concrete, real-world impact.
【Keywords】: Agent-based and Multi-agent Systems: Computational Social Choice; Agent-based and Multi-agent Systems: Voting; Agent-based and Multi-agent Systems: Algorithmic Game Theory; AI Ethics: Moral Decision Making;
【Paper Link】 【Pages】:5174-5177
【Authors】: Taiki Todo
【Abstract】: My research can be summarized as mechanism design under uncertainty. Traditional mechanism design focuses on static environments where all the (possibly probabilistic) information about the agents is observable by the mechanism designer. In practice, however, it is possible that the set of participating agents and/or some of their actions are not observable a priori. We therefore focused on various kinds of uncertainty in mechanism design and developed/analyzed several market mechanisms that incentivise agents to behave in a sincere way.
【Keywords】: Agent-based and Multi-agent Systems: Economic Paradigms, Auctions and Market-Based Systems; Agent-based and Multi-agent Systems: Algorithmic Game Theory; Agent-based and Multi-agent Systems: Computational Social Choice;
【Paper Link】 【Pages】:5178-5182
【Authors】: Lijun Zhang
【Abstract】: The usual goal of online learning is to minimize the regret, which measures the performance of the online learner against a fixed comparator. However, it is not suitable for changing environments, in which the best decision may change over time. To address this limitation, new performance measures, including dynamic regret and adaptive regret, have been proposed to guide the design of online algorithms. In dynamic regret, the learner is compared with a sequence of comparators, and in adaptive regret, the learner is required to minimize the regret over every interval. In this paper, we review the recent developments in this area and highlight our contributions. Specifically, we have proposed novel algorithms to minimize the dynamic regret and adaptive regret, and investigated the relationship between them.
【Keywords】: Machine Learning: Online Learning; Machine Learning: Time-series;Data Streams; Machine Learning: Big data; Scalability;
【Paper Link】 【Pages】:5184-5185
【Authors】: Mark Zolotas ; Yiannis Demiris
【Abstract】: Robots supplied with the ability to infer human intent have many applications in assistive robotics. In these applications, robots rely on accurate models of human intent to administer appropriate assistance. However, the effectiveness of this assistance also heavily depends on whether the human can form accurate mental models of robot behaviour. The research problem is therefore to establish a transparent interaction, such that both the robot and the human understand each other's underlying "intent". We situate this problem in our Explainable Shared Control paradigm and present ongoing efforts to achieve transparency in human-robot collaboration.
【Keywords】: Robotics: Human Robot Interaction; Humans and AI: Human-AI Collaboration;
【Paper Link】 【Pages】:5186-5187
【Authors】: Nicolas Bougie ; Ryutaro Ichise
【Abstract】: Deep reinforcement learning (DRL) methods traditionally struggle with tasks where environment rewards are sparse or delayed, which means that exploration remains one of the key challenges of DRL. Instead of solely relying on extrinsic rewards, many state-of-the-art methods use intrinsic curiosity as an exploration signal. While they hold the promise of better local exploration, discovering global exploration strategies is beyond the reach of current methods. We propose a novel end-to-end intrinsic reward formulation that introduces high-level exploration in reinforcement learning. Our curiosity signal is driven by a fast reward that deals with local exploration and a slow reward that incentivizes long-horizon exploration strategies. We formulate curiosity as the error in an agent's ability to reconstruct the observations given their contexts. Experimental results show that this high-level exploration enables our agents to outperform prior work in several Atari games.
【Keywords】: Machine Learning: Deep Reinforcement Learning; Machine Learning: Reinforcement Learning; Agent-based and Multi-agent Systems: Other;
【Paper Link】 【Pages】:5188-5189
【Authors】: Ya-nan Han ; Jian-wei Liu ; Xiong-lin Luo
【Abstract】: There is growing interest in low rank representation (LRR) for subspace clustering. Existing latent LRR methods can exploit the global structure of data when the observations are insufficient and/or grossly corrupted, but they cannot capture the intrinsic structure because they neglect the local information of the data. In this paper, we propose an improved latent LRR model that jointly incorporates a distance regularization and a non-negative regularization, which can effectively discover the global and local structure of data for graph learning and improve the expressiveness of the model. We then develop an efficient iterative algorithm to optimize the improved latent LRR model. In addition, traditional subspace clustering assumes a fixed number of clusters, which makes model selection inefficient. We therefore develop an efficient automatic subspace clustering method via the bias-variance trade-off, in which clusters can be automatically added and discarded on the fly.
【Keywords】: Machine Learning: Feature Selection; Learning Sparse Models; Machine Learning: Clustering; Constraints and SAT: Constraint Optimization; Computer Vision: Statistical Methods and Machine Learning;
【Paper Link】 【Pages】:5190-5191
【Authors】: Shiwei Liu
【Abstract】: Deep neural networks perform well on test data when they are highly overparameterized, which, however, also makes them costly to train and deploy. As a leading approach to address this problem, sparse neural networks have been widely used to significantly reduce the size of networks, making them more efficient during training and deployment without compromising performance. Recently, sparse neural networks, either compressed from a pre-trained model or obtained by training from scratch, have been observed to generalize as well as or even better than their dense counterparts. However, conventional techniques to find well-fitted sparse sub-networks are expensive, and the mechanisms underlying this phenomenon are far from clear. To tackle these problems, this Ph.D. research aims to study the generalization of sparse neural networks and to propose more efficient approaches that can yield sparse neural networks with generalization bounds.
【Keywords】: Machine Learning: Feature Selection; Learning Sparse Models; Machine Learning: Cost-Sensitive Learning; Machine Learning: Deep Learning;
【Paper Link】 【Pages】:5192-5193
【Authors】: Xinghao Yang ; Wei Liu
【Abstract】: Estimates of people's movement behaviour within a country can provide valuable information for government strategic resource planning. In this paper, we propose to utilize multi-domain statistical data to estimate people's movements under the assumption that most people tend to move to areas with similar or better living conditions. We design a Multi-domain Matrix Factorization (MdMF) model to discover the underlying consistency patterns from these cross-domain data and estimate the movement trends using the proposed model. This research can provide important theoretical support to governments and agencies in strategic resource planning and investment.
【Keywords】: Data Mining: Clustering, Unsupervised Learning; Machine Learning: Multi-instance;Multi-label;Multi-view learning; Machine Learning: Tensor and Matrix Methods; Machine Learning: Clustering;
【Paper Link】 【Pages】:5194-5195
【Authors】: Guillaume Lorthioir ; Katsumi Inoue
【Abstract】: Digital games have proven to be valuable simulation environments for plan and goal recognition. However, goal recognition is a hard problem, especially in digital games, where players unintentionally achieve goals through exploratory actions, abandon goals with little warning, or adopt new goals based upon recent or prior events. In this paper, we describe a method that uses simulation and Bayesian programming to infer the player's strategy in a real-time strategy (RTS) game, as well as how it could be used to build more adaptive AI for this kind of game and thus make games more challenging and entertaining for players.
【Keywords】: Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation; Computer Vision: Action Recognition; Humans and AI: Human-Computer Interaction; Multidisciplinary Topics and Applications: Computer Games;
【Paper Link】 【Pages】:5196-5197
【Authors】: Amith Manoharan
【Abstract】: Unmanned aerial vehicles (UAVs) have reached significant maturity over the past several years for safe civilian operations such as mapping and search and rescue. Operational performance can be significantly improved by deploying multiple cooperating UAVs and by optimal decision making. In this work, we present the use of nonlinear model predictive control (NMPC) for two different applications involving cooperative UAVs.
【Keywords】: Robotics: Multi-Robot Systems; Robotics: Localization, Mapping, State Estimation; Robotics: Motion and Path Planning; Agent-based and Multi-agent Systems: Coordination and Cooperation;
【Paper Link】 【Pages】:5198-5199
【Authors】: Bonaventure C. Molokwu ; Ziad Kobti
【Abstract】: Social Network Analysis (SNA) has become a very interesting research topic with regard to Artificial Intelligence (AI) because a wide range of activities, comprising animate and inanimate entities, can be examined by means of social graphs. Consequently, classification and prediction tasks in SNA remain open problems with respect to AI. Latent representations about social graphs can be effectively exploited for training AI models in a bid to detect clusters via classification of actors as well as predict ties with regard to a given social network. The inherent representations of a social graph are relevant to understanding the nature and dynamics of a given social network. Thus, our research work proposes a unique hybrid model: Representation Learning via Knowledge-Graph Embeddings and ConvNet (RLVECN). RLVECN is designed for studying and extracting meaningful representations from social graphs to aid in node classification, community detection, and link prediction problems. RLVECN utilizes an edge sampling approach for exploiting features of the social graph via learning the context of each actor with respect to its neighboring actors.
【Keywords】: Machine Learning: Deep Learning; Data Mining: Classification, Semi-Supervised Learning; Data Mining: Feature Extraction, Selection and Dimensionality Reduction; Data Mining: Mining Graphs, Semi Structured Data, Complex Data;
【Paper Link】 【Pages】:5200-5201
【Authors】: Zarmeen Nasim
【Abstract】: This research is an endeavor to combine deep-learning-based language modeling with classical topic modeling techniques to produce interpretable topics for a given set of documents in Urdu, a low resource language. The existing topic modeling techniques produce a collection of words, often un-interpretable, as suggested topics without integrating them into a semantically correct phrase/sentence. The proposed approach would first build an accurate Part of Speech (POS) tagger for the Urdu language using a publicly available corpus of many million sentences. Using semantically rich feature extraction approaches including Word2Vec and BERT, the proposed approach, in the next step, would experiment with different clustering and topic modeling techniques to produce a list of potential topics for a given set of documents. Finally, this list of topics would be sent to a labeler module to produce syntactically correct phrases that will represent interpretable topics.
【Keywords】: Natural Language Processing: Natural Language Processing; Natural Language Processing: NLP Applications and Tools; Natural Language Processing: Embeddings; Natural Language Processing: Natural Language Summarization;
【Paper Link】 【Pages】:5202-5203
【Authors】: Nat Pavasant ; Masayuki Numao ; Ken-ichi Fukui
【Abstract】: This paper proposes a method to detect changes in causal relations over a multi-dimensional sequence of events. The Cluster Sequence Mining algorithm is modified to extract causal relations in the form of g-patterns: pairs of clusters of events whose occurrence times are determined by Granger causality. The paper also proposes the pattern time signature, a probability density function of the cluster sequence occurring at any given time. Synthetic data were used for validation. The results show that the proposed algorithm can correctly identify changes in causal relations even under noisy data.
【Keywords】: Data Mining: Mining Spatial, Temporal Data; Data Mining: Frequent Pattern Mining; Machine Learning: Time-series;Data Streams; Machine Learning: Clustering;
【Paper Link】 【Pages】:5204-5205
【Authors】: Wenqi Zhao ; Satoshi Oyama ; Masahito Kurihara
【Abstract】: Counterfactual explanations help users to understand the behaviors of machine learning models by changing the inputs for the existing outputs. For an image classification task, an example counterfactual visual explanation explains: "for an example that belongs to class A, what changes do we need to make to the input so that the output is more inclined to class B." Our research considers changing the attribute description text of class A on the basis of the attributes of class B and generating counterfactual images on the basis of the modified text. We can use the prediction results of the model on counterfactual images to find the attributes that have the greatest effect when the model is predicting classes A and B. We applied our method to a fine-grained image classification dataset and used the generative adversarial network to generate natural counterfactual visual explanations. To evaluate these explanations, we used them to assist crowdsourcing workers in an image classification task. We found that, within a specific range, they improved classification accuracy.
【Keywords】: Machine Learning: Interpretability; Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation;
【Paper Link】 【Pages】:5206-5207
【Authors】: Ramon Ruiz-Dolz
【Abstract】: Computational Argumentation studies the definition of models able to either have a debate, persuade users in decision making or assist humans with argument analysis. In this work, some of our initial contributions and the foundations of this research field are presented.
【Keywords】: Knowledge Representation and Reasoning: Computational Models of Argument; Humans and AI: Computer-Aided Education; Natural Language Processing: Discourse;
【Paper Link】 【Pages】:5208-5209
【Authors】: Kyungwoo Song
【Abstract】: Context modeling helps understand the data, such as sentence or user behavior. Contextual information captures the important underlying feature, and it enhances the relationship between data instances or hidden representations. As the importance of the sequential model grows, so does the importance of the sequential contextual modeling. Under the sequential data, we need to consider the context change over time. In this paper, we present our research works on context modeling and its dynamics modeling over time. Furthermore, we extend our research to handle the multi-granularity of sequential context modeling to consider rich context representations.
【Keywords】: Machine Learning: Deep Learning; Machine Learning: Deep Learning: Sequence Modeling; Data Mining: Mining Text, Web, Social Media; Machine Learning: Probabilistic Machine Learning;
【Paper Link】 【Pages】:5210-5211
【Authors】: Julissa Villanueva Llerena
【Abstract】: Tractable Deep Probabilistic Models (TPMs) are generative models based on arithmetic circuits that allow for exact marginal inference in linear time. These models have obtained promising results in several machine learning tasks. Like many other models, TPMs can produce over-confident incorrect inferences, especially on regions with small statistical support. In this work, we will develop efficient estimators of the predictive uncertainty that are robust to data scarcity and outliers. We investigate two approaches. The first approach measures the variability of the output to perturbations of the model weights. The second approach captures the variability of the prediction to changes in the model architecture. We will evaluate the approaches on challenging tasks such as image completion and multilabel classification.
【Keywords】: Trust, Fairness, Bias: General; Uncertainty in AI: Graphical Models; Machine Learning: Probabilistic Machine Learning;
【Paper Link】 【Pages】:5212-5213
【Authors】: Jennifer Williams
【Abstract】: Preliminary experiments in this dissertation show that it is possible to factorize specific types of information from the speech signal in an abstract embedding space using machine learning. This information includes characteristics of the recording environment, speaking style, and speech quality. Based on these findings, a new technique is proposed to factorize multiple types of information from the speech signal simultaneously using a combination of state-of-the-art machine learning methods for speech processing. Successful speech signal factorization will lead to advances across many speech technologies, including improved speaker identification, detection of speech audio deep fakes, and controllable expression in speech synthesis.
【Keywords】: Natural Language Processing: Speech; Machine Learning: Deep Learning; Machine Learning: Multi-instance;Multi-label;Multi-view learning; Multidisciplinary Topics and Applications: Security and Privacy;
【Paper Link】 【Pages】:5214-5215
【Authors】: Lu Yin
【Abstract】: Knowledge present in a domain is well expressed as relationships between corresponding concepts. For example, in zoology, animal species form complex hierarchies; in genomics, the different (parts of) molecules are organized in groups and subgroups based on their functions; plants, molecules, and astronomical objects all form complex taxonomies. Nevertheless, when applying supervised machine learning (ML) in such domains, we commonly reduce the complex and rich knowledge to a fixed set of labels. This oversimplifies and limits the potential impact that the ML solution can deliver. The main reason for such a reductionist approach is the difficulty in eliciting the domain knowledge from the experts. Developing a label structure with sufficient fidelity and providing comprehensive multi-label annotation can be exceedingly labor-intensive in many real-world applications. Here, we provide a method for efficient hierarchical knowledge elicitation (HKE) from experts working with high-dimensional data such as images or videos. Our method is based on psychometric testing and active deep metric learning. The developed models embed the high-dimensional data in a metric space where distances are semantically meaningful, and the data can be organized in a hierarchical structure.
【Keywords】: Humans and AI: Cognitive Modeling; Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation; Machine Learning: Semi-Supervised Learning; Machine Learning: Active Learning;
【Paper Link】 【Pages】:5216-5217
【Authors】: Hanhua Zhu
【Abstract】: Deep reinforcement learning (DRL) has increased the range of successful applications of reinforcement learning (RL) techniques, but it also brings challenges such as low sample efficiency. In this work, I propose generalized representation learning methods to obtain a compact state space suitable for RL from a raw observation state. I expect these new methods to increase the sample efficiency of RL through understandable representations of state and thereby improve the performance of RL.
【Keywords】: Machine Learning: Deep Reinforcement Learning; Heuristic Search and Game Playing: Game Playing and Machine Learning; Machine Learning: Probabilistic Machine Learning;
【Paper Link】 【Pages】:5219-5221
【Authors】: Aaron Hunter ; John Agapeyev
【Abstract】: The process of belief revision occurs in many applications where agents may have incorrect or incomplete information. One important theoretical model of belief revision is the well-known AGM approach. Unfortunately, there are few tools available for solving AGM revision problems quickly; this has limited the use of AGM operators for practical applications. In this demonstration paper, we describe GenC, a tool that is able to quickly calculate the result of AGM belief revision for formulas with hundreds of variables and millions of clauses. GenC uses an AllSAT solver and parallel processing to solve revision problems at a rate much faster than existing systems. The solver works for the class of parametrised difference operators, which is an extensive class of revision operators that use a weighted Hamming distance to measure the similarity between states. We demonstrate how GenC can be used as a stand-alone tool or as a component of a reasoning system for a variety of applications.
【Keywords】: Knowledge Representation and Reasoning: general; Multi-agent Systems: general; Uncertainty in AI: general;
【Paper Link】 【Pages】:5222-5224
【Authors】: Gaël Aglin ; Siegfried Nijssen ; Pierre Schaus
【Abstract】: Decision Trees (DTs) are widely used Machine Learning (ML) models with a broad range of applications. The interest in these models has increased even further in the context of Explainable AI (XAI), as decision trees of limited depth are very interpretable models. However, traditional algorithms for learning DTs are heuristic in nature; they may produce trees that are of suboptimal quality under depth constraints. We introduce PyDL8.5, a Python library to infer depth-constrained Optimal Decision Trees (ODTs). PyDL8.5 provides an interface for DL8.5, an efficient algorithm for inferring depth-constrained ODTs. The library provides an easy-to-use, scikit-learn-compatible interface. It can be used not only for classification tasks, but also for regression, clustering, and other tasks. We introduce an interface that allows users to easily implement these other learning tasks, and we provide a number of examples of how to use the library.
【Keywords】: Machine Learning: general; Constraints and Satisfiability: general;
【Paper Link】 【Pages】:5225-5227
【Authors】: Valentijn Borghuis ; Luca Angioloni ; Lorenzo Brusci ; Paolo Frasconi
【Abstract】: We demonstrate a pattern-based MIDI music generation system with a generation strategy based on Wasserstein autoencoders and a novel variant of pianoroll descriptions of patterns which employs separate channels for note velocities and note durations and can be fed into classic DCGAN-style convolutional architectures. We trained the system on two new datasets (in the acid-jazz and high-pop genres) composed by musicians in our team with music generation in mind. Our demonstration shows that moving smoothly in the latent space allows us to generate meaningful sequences of four-bars patterns.
【Keywords】: Machine Learning: general; Human-Computer Interactive Systems: general;
【Paper Link】 【Pages】:5228-5230
【Authors】: Roy Assaf ; Ioana Giurgiu ; Jonas Pfefferle ; Serge Monney ; Haris Pozidis ; Anika Schumann
【Abstract】: Anomaly detection in data storage systems is a challenging problem due to the high-dimensional sequential data involved and the lack of labels. The state of the art for automating anomaly detection in these systems typically relies on hand-crafted rules and thresholds, which mainly allow distinguishing between normal and abnormal behavior of each indicator in isolation. In this work we present an end-to-end framework based on convolutional autoencoders that not only allows for anomaly detection on multivariate time series data, but also provides explainability. This is done by identifying similar historic anomalies and extracting the most influential indicators. These are then presented to relevant personnel such as system designers and architects, or to support engineers for further analysis. We demonstrate the application of this framework along with an intuitive interactive web interface developed for data storage system anomaly detection. We discuss how this framework, along with its explainability aspects, enables support engineers to effectively tackle abnormal behaviors, all while allowing for crucial feedback.
【Keywords】: Machine Learning: general; Human-Computer Interactive Systems: general;
【Paper Link】 【Pages】:5231-5233
【Authors】: Sagar Sen ; Pierre Bernabé ; Erik Johannes B. L. G. Husom
【Abstract】: Tracking physical effort from physiological signals has enabled people to manage required activity levels in our increasingly sedentary and automated world. Breathing is a physiological process that is a reactive representation of our physical effort. In this demo, we present DeepVentilation, a deep learning system to predict minute ventilation in litres of air a person moves in one minute uniquely from real-time measurement of rib-cage breathing forces. DeepVentilation has been trained on input signals of expansion and contraction of the rib-cage obtained using a non-invasive respiratory inductance plethysmography sensor to predict minute ventilation as observed from a face/head mounted exercise spirometer. The system is used to track physical effort closely matching our perception of actual exercise intensity. The source code for the demo is available here: https://github.com/simula-vias/DeepVentilation
【Keywords】: Machine Learning: general; Human-Computer Interactive Systems: general; Uncertainty in AI: general;
【Paper Link】 【Pages】:5234-5236
【Authors】: Sondre Hamnvik ; Pierre Bernabé ; Sagar Sen
【Abstract】: Obstructive sleep apnea is a serious sleep disorder that affects an estimated one billion adults worldwide. It causes breathing to repeatedly stop and start during sleep, which over years increases the risk of hypertension, heart disease, stroke, Alzheimer's, and cancer. In this demo, we present Yolo4Apnea, a deep learning system extending the You Only Look Once (Yolo) system to detect sleep apnea events from abdominal breathing patterns in real time, enabling immediate awareness and action. Abdominal breathing is measured using a respiratory inductance plethysmography sensor worn around the stomach. The source code is available at https://github.com/simula-vias/Yolo4Apnea
【Keywords】: Machine Learning: general; Computer Vision: general; Uncertainty in AI: general;
【Paper Link】 【Pages】:5237-5239
【Authors】: Shreyas Kolala Venkataramanaiah ; Xiaocong Du ; Zheng Li ; Shihui Yin ; Yu Cao ; Jae-sun Seo
【Abstract】: Training of deep Convolutional Neural Networks (CNNs) requires a tremendous amount of computation and memory, and thus GPUs are widely used to meet the computation demands of these complex training tasks. However, lacking the flexibility to exploit architectural optimizations, GPUs have poor energy efficiency and are hard to deploy on energy-constrained platforms. FPGAs are highly suitable for training, such as real-time learning at the edge, as they provide higher energy efficiency and better flexibility to support algorithmic evolution. This paper first develops a training accelerator on FPGA, with 16-bit fixed-point computing and various training modules. Furthermore, leveraging model segmentation techniques from Progressive Segmented Training, the newly developed FPGA accelerator is applied to online learning, achieving much lower computation cost. We demonstrate the performance of representative CNNs trained for CIFAR-10 on an Intel Stratix-10 MX FPGA, evaluating both the conventional training procedure and the online learning algorithm.
【Keywords】: Machine Learning: general; Computer Vision: general;
【Paper Link】 【Pages】:5240-5242
【Authors】: Jorge Fernandez ; Olivier Gasquet ; Andreas Herzig ; Dominique Longin ; Emiliano Lorini ; Frédéric Maris ; Pierre Régnier
【Abstract】: This work deals with logical formalization and problem solving using automated solvers. We present the automatic translator TouIST that provides a simple language to generate logical formulas from a problem description. Our tool allows us to model many static or dynamic combinatorial problems and to benefit from the regular improvements of SAT, QBF or SMT solvers in order to solve these problems efficiently. In particular, we show how to use TouIST to solve different classes of planning tasks in Artificial Intelligence.
【Keywords】: Constraints and Satisfiability: general; Planning and Scheduling: general;
【Paper Link】 【Pages】:5243-5245
【Authors】: Xavier Gillard ; Pierre Schaus ; Vianney Coppé
【Abstract】: This paper presents ddo, a generic and efficient library to solve constraint optimization problems with decision diagrams. To that end, our framework implements the branch-and-bound approach recently introduced by Bergman et al. (2016) to solve dynamic programs to optimality. Our library allowed us to successfully reproduce the results of Bergman et al. for MISP, MCP, and MAX2SAT while using a single generic library. As an additional benefit, our ddo library is able to exploit parallel computing without imposing any constraint on the user (apart from memory safety). Ddo is released as an open-source Rust library (crate) alongside companion example programs that solve the aforementioned problems. To the best of our knowledge, this is the first public implementation of a generic library to solve combinatorial optimization problems with branch-and-bound MDDs.
【Keywords】: Constraints and Satisfiability: general; Knowledge Representation and Reasoning: general;
【Paper Link】 【Pages】:5246-5248
【Authors】: Johannes Huegle ; Christopher Hagedorn ; Matthias Uflacker
【Abstract】: The efficiency of modern automotive body shop assembly lines is highly related to the reduction of downtimes due to failures and quality deviations within the manufacturing process. Consequently, implementing tools in assembly lines for on-line monitoring and failure diagnosis, also with a view to improving troubleshooting, is of great importance. While the identification of root causes and the elimination of failures are usually built upon individual on-site expert knowledge, causal graphical models (CGMs) have opened the possibility of a purely data-driven assessment. In this demo, we showcase how a CGM of the production process is incorporated into a monitoring tool that functions as a decision-support system for an operator of a modern automotive body shop assembly line and enables fast and effective handling of failures and quality deviations.
【Keywords】: Machine Learning: general; Knowledge Representation and Reasoning: general;
【Paper Link】 【Pages】:5249-5251
【Authors】: Jette Henderson ; Shubham Sharma ; Alan H. Gee ; Valeri Alexiev ; Steve Draper ; Carlos Marin ; Yessel Hinojosa ; Christine Draper ; Michael Perng ; Luis Aguirre ; Michael Li ; Sara Rouhani ; Shorya Consul ; Susan Michalski ; Akarsh Prasad ; Mayank Chutani ; Aditya Kumar ; Shahzad Alam ; Prajna Kandarpa ; Binnu Jesudasan ; Colton Lee ; Michael Criscolo ; Sinead Williamson ; Matt Sanchez ; Joydeep Ghosh
【Abstract】: As more companies and governments build and use machine learning models to automate decisions, there is an ever-growing need to monitor and evaluate these models' behavior once they are deployed. Our team at CognitiveScale has developed a toolkit called Cortex Certifai to answer this need. Cortex Certifai is a framework that assesses aspects of robustness, fairness, and interpretability of any classification or regression model trained on tabular data, without requiring access to its internal workings. Additionally, Cortex Certifai allows users to compare models along these different axes and only requires 1) query access to the model and 2) an “evaluation” dataset. At its foundation, Cortex Certifai generates counterfactual explanations, which are synthetic data points close to input data points but differing in terms of model prediction. The tool then harnesses characteristics of these counterfactual explanations to analyze different aspects of the supplied model and delivers evaluations relevant to a variety of different stakeholders (e.g., model developers, risk analysts, compliance officers). Cortex Certifai can be configured and executed using a command-line interface (CLI), within jupyter notebooks, or on the cloud, and the results are recorded in JSON files and can be visualized in an interactive console. Using these reports, stakeholders can understand, monitor, and build trust in their AI systems. In this paper, we provide a brief overview of a demonstration of Cortex Certifai's capabilities.
【Keywords】: Machine Learning: general;
【Paper Link】 【Pages】:5252-5254
【Authors】: Pengwei Hu ; Chenhao Lin ; Hui Su ; Shaochun Li ; Xue Han ; Yuan Zhang ; Jing Mei
【Abstract】: Social media use runs through our lives, and users' emotions are affected by it. Previous studies have reported social organizations and psychologists using social media to find depressed patients. However, due to the variety of content published by users, it is difficult for a system to jointly consider the text, the image, and even the hidden information behind the image. To address this problem, we propose a new system for social media screening of depressed patients named BlueMemo. We collect real-time posts from Twitter. From these posts, text features, image features, and visual attributes are extracted as three modalities and fed into a multi-modal fusion and classification model. The proposed BlueMemo can help physicians and clinicians quickly and accurately identify users at potential risk for depression.
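The three-modality fusion step can be sketched at its simplest as late fusion of per-modality risk scores. The weights, threshold, and function names below are assumptions for illustration, not BlueMemo's published model:

```python
# Illustrative late-fusion sketch: each modality (text, image, visual
# attributes) yields a risk score in [0, 1]; a weighted sum gives the
# final score, which is thresholded into a screening decision.

def fuse_modalities(text_score, image_score, attribute_score,
                    weights=(0.5, 0.3, 0.2)):
    scores = (text_score, image_score, attribute_score)
    return sum(w * s for w, s in zip(weights, scores))

def classify(fused_score, threshold=0.5):
    return "at-risk" if fused_score >= threshold else "not-at-risk"

fused = fuse_modalities(0.9, 0.6, 0.4)
```

In practice such fusion is usually learned jointly inside the classification network rather than fixed by hand; the sketch only shows how the three modalities combine into one decision.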
【Keywords】: Knowledge Representation and Reasoning: general; Machine Learning: general; Natural Language Processing: general; Computer Vision: general;
【Paper Link】 【Pages】:5255-5257
【Authors】: Hen-Hsen Huang
【Abstract】: This work presents AutoSurvey, an intelligent system that performs a literature survey and generates a summary specific to a research draft. A neural model for information structure analysis is employed to extract fine-grained information from the abstracts of previous work, and a novel evolutionary multi-source summarization model is proposed for generating the related-work summary. This system is useful for both academic and educational purposes.
【Keywords】: Natural Language Processing: general;
【Paper Link】 【Pages】:5258-5260
【Authors】: Weiyi Huang ; Jiahao Jiang ; Qiang Qu ; Min Yang
【Abstract】: Question answering (QA) in the legal domain has gained increasing popularity as people seek legal advice. However, existing QA systems struggle to comprehend the legal context and provide jurisdictionally relevant answers due to the lack of domain expertise. In this paper, we develop an Artificial Intelligence Law Assistant (AILA) for question answering in the domain of Chinese law. The AILA system automatically comprehends users' natural language queries with the help of a legal knowledge graph (KG) and provides the best matching answers for given queries. In addition, AILA provides visual cues to interpret the input queries and candidate answers based on the legal KG. Experimental results on a large-scale legal QA corpus show the effectiveness of AILA. To the best of our knowledge, AILA is the first Chinese legal QA system that integrates domain knowledge from a legal KG to comprehend questions and answers for ranking QA pairs. AILA is available at http://bmilab.ticp.io:48478/.
【Keywords】: Natural Language Processing: general;
【Paper Link】 【Pages】:5261-5263
【Authors】: Joanne T. Kim ; Sookyung Kim ; Brenden K. Petersen
【Abstract】: Discovering tractable mathematical expressions that best explain a dataset is a long-standing challenge in artificial intelligence. This problem, known as symbolic regression, is relevant when one seeks to generate new physical knowledge and insights. Since practitioners are primarily interested in knowledge generation, the ability to interact with a symbolic regression algorithm would be highly valuable. Thus, we present an interactive symbolic regression framework that allows users not only to configure runs, but also to control the system during training. The interface provides real-time visualization and diagnostics to help guide the user as they control the algorithm on the fly.
【Keywords】: Machine Learning: general; Knowledge Representation and Reasoning: general;
【Paper Link】 【Pages】:5264-5266
【Authors】: Daochen Zha ; Kwei-Herng Lai ; Songyi Huang ; Yuanpu Cao ; Keerthana Reddy ; Juan Vargas ; Alex Nguyen ; Ruzhe Wei ; Junyu Guo ; Xia Hu
【Abstract】: We present RLCard, a Python platform for reinforcement learning research and development in card games. RLCard supports various card environments and several baseline algorithms with unified easy-to-use interfaces, aiming at bridging reinforcement learning and imperfect information games. The platform provides flexible configurations of state representation, action encoding, and reward design. RLCard also supports visualizations for algorithm debugging. In this demo, we showcase two representative environments and their visualization results. We conclude this demo with challenges and research opportunities brought by RLCard. A video is available on YouTube.
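The unified environment interface mentioned in the abstract typically follows a reset/step pattern. The class, game rules, and method signatures below are illustrative assumptions, not RLCard's actual API:

```python
# Hypothetical sketch of an RLCard-style card-game environment: the
# environment exposes reset/step, and an always-draw agent plays one
# episode of a trivial one-player game.
import random

class TinyCardEnv:
    """Draw cards worth 1-10 until the total reaches 21 or more."""

    def reset(self, seed=0):
        self.rng = random.Random(seed)
        self.total = 0
        return self.total

    def step(self, action):
        # action 0 = draw a card, action 1 = stop.
        if action == 0:
            self.total += self.rng.randint(1, 10)
        done = action == 1 or self.total >= 21
        reward = 1.0 if done and self.total <= 21 else 0.0
        return self.total, reward, done

env = TinyCardEnv()
state = env.reset(seed=42)
done = False
while not done:
    state, reward, done = env.step(0)  # naive policy: always draw
```

A platform like RLCard standardizes exactly this loop across many card games, so that baseline algorithms can be swapped in without changing the environment code.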
【Keywords】: Game Playing: general; Multi-agent Systems: general; Machine Learning: general;
【Paper Link】 【Pages】:5267-5269
【Authors】: Chang Liu ; Zhao Yong Lim ; Han Yu ; Zhiqi Shen ; Ian Dixon ; Zhanning Gao ; Pan Wang ; Peiran Ren ; Xuansong Xie ; Lizhen Cui ; Chunyan Miao
【Abstract】: Video editing is currently a highly skill- and time-intensive process. One of the most important tasks in video editing is composing the visual storyline. This paper outlines Visual Storyline Generator (VSG), an artificial intelligence (AI)-empowered system that automatically generates visual storylines based on a set of images and video footage provided by the user. It is designed to produce engaging and persuasive promotional videos with an easy-to-use interface. In addition, users can be involved in refining the AI-generated visual storylines. The editing results can be used as training data to further improve the AI algorithms in VSG.
【Keywords】: Computer Vision: general; Human-Computer Interactive Systems: general;
【Paper Link】 【Pages】:5270-5272
【Authors】: Baihan Lin
【Abstract】: This paper proposes a new interaction paradigm for virtual reality (VR) environments: a virtual mirror or window projected onto a virtual surface, representing the correct perspective geometry of a mirror or window reflecting the real world. This technique can be applied to various videos, live-streaming apps, and augmented and virtual reality settings to provide an interactive and immersive user experience. To support such a perspective-accurate representation, we implemented computer vision algorithms for feature detection and correspondence matching. To constrain the solutions, we incorporated an automatically tuned scaling factor on the homography transform matrix so that each image frame follows a smooth transition while keeping the user in sight. The system is a real-time rendering framework in which users can engage their real-life presence with the virtual space.
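The homography-with-smoothed-scaling idea can be sketched concretely. The exponential-smoothing scheme and helper names below are assumptions for illustration, not the paper's exact tuning rule:

```python
# Sketch: apply a 3x3 homography to 2D points, and blend a per-frame
# scaling factor toward its target so successive frames transition
# smoothly instead of jumping.

def apply_homography(H, point):
    x, y = point
    denom = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / denom,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / denom)

def smooth_scale(prev_scale, target_scale, alpha=0.2):
    # Exponential smoothing: move only a fraction alpha toward the target.
    return (1 - alpha) * prev_scale + alpha * target_scale

H = [[1.0, 0.0, 2.0],   # pure translation by (2, 3)
     [0.0, 1.0, 3.0],
     [0.0, 0.0, 1.0]]
warped = apply_homography(H, (1, 1))
```

In a real pipeline `H` would be estimated per frame from matched feature correspondences; the smoothing keeps the rendered mirror stable as those estimates fluctuate.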
【Keywords】: Computer Vision: general; Human-Computer Interactive Systems: general;
【Paper Link】 【Pages】:5273-5275
【Authors】: Philippe Esling ; Naotake Masuda ; Axel Chemla-Romeu-Santos
【Abstract】: Audio synthesizers are pervasive in modern music production. These highly complex audio generation functions provide a unique diversity through their large sets of parameters. However, this flexibility can also make them extremely hard and opaque to use, especially for non-expert users with no formal knowledge of signal processing. We recently introduced a novel formalization of the problem of synthesizer control as learning an invertible mapping between an audio latent space, extracted from the audio signal, and a target parameter latent space, extracted from the synthesizer's presets, using normalizing flows. In addition to modeling a continuous representation that eases intuitive exploration of the synthesizer, this approach provides a ground-breaking method for audio-based parameter inference, vocal control, and macro-control learning. Here, we discuss how these high-level features are integrated into new interaction schemes between a human user and the generating device: parameter inference from audio and high-level preset visualization and interpolation, usable in both offline and real-time situations. Moreover, we leverage LeapMotion devices so that hundreds of parameters can be controlled simply by moving one hand through space to explore the low-dimensional latent space, both empowering and facilitating the user's interaction with the synthesizer.
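The invertibility at the heart of the normalizing-flow formulation can be shown with a toy example. This single elementwise affine step is an illustration of invertibility only, not the paper's trained flow between audio and parameter latent spaces:

```python
# Toy invertible mapping in the spirit of a normalizing flow: one affine
# step whose forward and inverse transforms compose to the identity.
import math

def forward(z, scale, shift):
    # z -> x: elementwise affine map (invertible since exp(scale) != 0).
    return [zi * math.exp(scale) + shift for zi in z]

def inverse(x, scale, shift):
    # x -> z: exact inverse of the forward transform.
    return [(xi - shift) * math.exp(-scale) for xi in x]

z = [0.5, -1.0]
x = forward(z, 0.3, 1.0)
z_back = inverse(x, 0.3, 1.0)
```

Because every step is invertible, the same trained model can map audio latents to synthesizer parameters and parameters back to audio latents, which is what enables both inference and exploration in the system described.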
【Keywords】: Machine Learning: general; Human-Computer Interactive Systems: general; Knowledge Representation and Reasoning: general;
【Paper Link】 【Pages】:5276-5278
【Authors】: Beatriz San Miguel ; Aisha Naseer ; Hiroya Inakoshi
【Abstract】: To improve and ensure the trustworthiness and ethics of Artificial Intelligence (AI) systems, several initiatives around the globe are producing principles and recommendations, which are proving difficult to translate into technical solutions. A common trait among ethical AI requirements is accountability, which aims at ensuring responsibility, auditability, and reduction of the negative impact of AI systems. To put accountability into practice, this paper presents the Global-view Accountability Framework (GAF), which considers auditability and redress of conflicting information arising in a context where two or more AI systems can produce a negative impact. A technical implementation of the framework for automotive and motor insurance is demonstrated, with a focus on preventing and reporting harm rendered by autonomous vehicles.
【Keywords】: Uncertainty in AI: general; Multi-agent Systems: general;
【Paper Link】 【Pages】:5279-5281
【Authors】: Kang Loon Ng ; Zichen Chen ; Zelei Liu ; Han Yu ; Yang Liu ; Qiang Yang
【Abstract】: Federated Learning (FL) enables participants to "share" their sensitive local data in a privacy-preserving manner and collaboratively build machine learning models. In order to sustain long-term participation by high-quality data owners (especially if they are businesses), FL systems need to provide suitable incentives. To design an effective incentive scheme, it is important to understand how FL participants respond under such schemes. This paper proposes FedGame, a multi-player game to study how FL participants make action selection decisions under different incentive schemes. It allows human players to role-play under various conditions. The decision-making processes can be analyzed and visualized to inform FL incentive mechanism design in the future.
【Keywords】: Machine Learning: general; Game Playing: general; Human-Computer Interactive Systems: general;
【Paper Link】 【Pages】:5282-5284
【Authors】: William Ogallo ; Skyler Speakman ; Victor Akinwande ; Kush R. Varshney ; Aisha Walcott-Bryant ; Charity Wayua ; Komminist Weldemariam
【Abstract】: Improving maternal, newborn, and child health (MNCH) outcomes is a critical target for global sustainable development. Our research is centered on building predictive models, evaluating their interpretability, and generating actionable insights about the markers (features) and triggers (events) associated with vulnerability in MNCH. In this work, we demonstrate how a tool for inspecting "black box" machine learning models can be used to generate actionable insights from models trained on demographic health survey data to predict neonatal mortality.
【Keywords】: Machine Learning: general;
【Paper Link】 【Pages】:5285-5287
【Authors】: Andreas Persson ; Pedro Zuidberg Dos Martires ; Luc De Raedt ; Amy Loutfi
【Abstract】: Modeling object representations derived from perceptual observations, in a way that is semantically meaningful for both humans and autonomous agents, is a prerequisite for joint human-agent understanding of the world. A practical approach that aims to model such representations is perceptual anchoring, which handles the problem of mapping sub-symbolic sensor data to symbols and maintains these mappings over time. In this paper, we present ProbAnch, a modular data-driven anchoring framework whose implementation requires a variety of well-orchestrated components, including a probabilistic reasoning system.
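The symbol-grounding loop of perceptual anchoring can be sketched in miniature. The matching rule, distance threshold, and symbol naming below are toy assumptions, not ProbAnch's probabilistic implementation:

```python
# Toy perceptual-anchoring sketch: each incoming percept (label, position)
# is matched to an existing anchor when the label agrees and the position
# is close enough; otherwise a new anchor symbol is created. Matched
# anchors are updated, maintaining the symbol-to-percept mapping over time.

def anchor(percepts, threshold=1.0):
    anchors = {}  # symbol -> (label, position)
    counter = 0
    for label, pos in percepts:
        match = None
        for sym, (alabel, apos) in anchors.items():
            dist = ((pos[0] - apos[0]) ** 2 + (pos[1] - apos[1]) ** 2) ** 0.5
            if alabel == label and dist <= threshold:
                match = sym
                break
        if match is None:
            match = f"{label}-{counter}"  # acquire: new symbol
            counter += 1
        anchors[match] = (label, pos)     # re-acquire: update mapping
    return anchors

result = anchor([("cup", (0, 0)), ("cup", (0.5, 0)), ("cup", (5, 5))])
```

The two nearby "cup" percepts collapse into one anchor while the distant one spawns a second; a probabilistic system replaces the hard threshold with reasoning over uncertain associations.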
【Keywords】: Computer Vision: general; Uncertainty in AI: general;
【Paper Link】 【Pages】:5288-5290
【Authors】: Celia Cintas ; Ramya Raghavendra ; Victor Akinwande ; Aisha Walcott-Bryant ; Charity Wayua ; Komminist Weldemariam
【Abstract】: Contraceptive use improves the health of women and children in several ways, yet data show high rates of discontinuation that are not well understood. We introduce an AI-based decision platform capable of analyzing event data to identify patterns of contraceptive uptake that are unique to a subpopulation of interest. These discriminatory patterns provide valuable, interpretable insights to policy-makers. The sequences then serve as hypotheses for downstream causal analysis to estimate the effect of specific variables on discontinuation outcomes. Our platform provides a way to visualize, stratify, compare, and perform causal analysis on the covariates that determine contraceptive uptake behavior, yet is general enough to be extended to a variety of applications.
【Keywords】: Machine Learning: general;
【Paper Link】 【Pages】:5291-5293
【Authors】: Rolf Schwitter
【Abstract】: The PENG ASP system supports the writing of textual specifications with the help of a smart text editor that possesses knowledge about the structure of the specification language. Specifications written in PENG ASP are incrementally translated into executable answer set programs and vice versa. That means the system allows for lossless semantic round-tripping between a human-readable specification and an answer set program. This functionality is achieved by a single bi-directional logic grammar that serves at the same time as a text processor and a text generator. We demonstrate that the PENG ASP system can be used to bridge the gap between a (seemingly) informal specification and an executable answer set program.
【Keywords】: Knowledge Representation and Reasoning: general; Natural Language Processing: general; Human-Computer Interactive Systems: general;
【Paper Link】 【Pages】:5294-5296
【Authors】: Harrison Jun Yong Wong ; Zichao Deng ; Han Yu ; Jianqiang Huang ; Cyril Leung ; Chunyan Miao
【Abstract】: Order dispatch is an important area where artificial intelligence (AI) can benefit ride-sharing systems (e.g., Grab, Uber), which have become an integral part of our public transport networks. In this paper, we present a multi-agent testbed to study the spread of infectious diseases through such a system. It allows users to vary the parameters of the disease and of people's behaviours to study the interaction effects among technology, disease, and behaviour in such a complex environment.
【Keywords】: Human-Computer Interactive Systems: general; Multi-agent Systems: general;
【Paper Link】 【Pages】:5297-5299
【Authors】: Zhiwei Zeng ; Hongchao Jiang ; Yanci Zhang ; Zhiqi Shen ; Jun Ji ; Martin J. McKeown ; Jing Jih Chin ; Cyril Leung ; Chunyan Miao
【Abstract】: Population aging is becoming an increasingly important issue around the world. As people live longer, they also tend to suffer from more challenging medical conditions. Currently, there is a lack of a holistic technology-powered solution for providing quality care at affordable cost to patients suffering from co-morbidity. In this paper, we demonstrate a novel AI-powered solution to provide early detection of the onset of Dementia + Parkinson's disease (DPD) co-morbidity, a condition which severely limits a senior's ability to live actively and independently. We investigate useful in-game behaviour markers which can support machine learning-based predictive analytics on seniors' risk of developing DPD co-morbidity.
【Keywords】: Machine Learning: general; Human-Computer Interactive Systems: general;
【Paper Link】 【Pages】:5300-5302
【Authors】: Xi Chen ; Hao Zhai ; Danqian Liu ; Weifu Li ; Chaoyue Ding ; Qiwei Xie ; Hua Han
【Abstract】: Biologists often need to handle numerous video-based home-cage animal behavior analysis tasks that involve massive workloads. Therefore, we develop an AI-based multi-species tracking and segmentation system, SiamBOMB, for real-time and automatic home-cage animal behavioral analysis. In this system, a background-enhanced Siamese-based network with a replaceable modular design ensures the flexibility and generalizability of the system, and a user-friendly interface makes it convenient for biologists to use. This real-time AI system will effectively reduce the burden on biologists.
【Keywords】: Computer Vision: general; Machine Learning: general;
【Paper Link】 【Pages】:5303-5305
【Authors】: Xiaoyi Fu ; Jie Zhang ; Hao Yu ; Jiachen Li ; Dong Chen ; Jie Yuan ; Xindong Wu
【Abstract】: This paper presents HAO-Graph, a system that generates and visualizes knowledge graphs from a speech in real time. When a user speaks to the system, HAO-Graph transforms the voice into knowledge graphs with key phrases from the original speech as nodes and edges. Different from language-to-language systems, such as Chinese-to-English and English-to-English, HAO-Graph converts a speech into graphs and is the first of its kind. The effectiveness of HAO-Graph was verified via a satisfaction survey following a two-hour chairman's talk delivered to two thousand participants at an annual meeting.
【Keywords】: Natural Language Processing: general; Human-Computer Interactive Systems: general; Knowledge Representation and Reasoning: general; Speech Processing: general;
【Paper Link】 【Pages】:5306-5308
【Authors】: Wei Niu ; Pu Zhao ; Zheng Zhan ; Xue Lin ; Yanzhi Wang ; Bin Ren
【Abstract】: High-end mobile platforms increasingly serve as primary computing devices for a wide range of Deep Neural Network (DNN) applications. However, the constrained computation and storage resources on these devices still pose significant challenges for real-time DNN inference execution. To address this problem, we propose a set of hardware-friendly structured model pruning and compiler optimization techniques to accelerate DNN executions on mobile devices. This demo shows that these optimizations can enable real-time mobile execution of multiple DNN applications, including style transfer, DNN coloring, and super resolution.
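Structured pruning, as opposed to removing individual weights, drops whole rows or channels so the resulting matrix stays dense and hardware-friendly. The norm-based row selection below is a generic illustration of the idea, not the paper's exact scheme:

```python
# Minimal structured-pruning sketch: remove whole rows (e.g., output
# channels) of a weight matrix whose L2 norms are smallest, keeping a
# fixed fraction of rows in their original order.

def prune_rows(weights, keep_ratio=0.5):
    norms = [(sum(w * w for w in row) ** 0.5, i)
             for i, row in enumerate(weights)]
    keep = max(1, int(len(weights) * keep_ratio))
    kept = sorted(i for _, i in sorted(norms, reverse=True)[:keep])
    return [weights[i] for i in kept], kept

W = [[0.9, -0.8],   # large-norm row, kept
     [0.01, 0.02],  # small-norm row, pruned
     [0.5, 0.6],    # kept
     [0.0, 0.01]]   # pruned
pruned, kept_idx = prune_rows(W, keep_ratio=0.5)
```

Because entire rows disappear, the pruned layer is simply a smaller dense matrix, which is what lets a compiler generate efficient mobile kernels without sparse-indexing overhead.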
【Keywords】: Computer Vision: general; Machine Learning: general;
【Paper Link】 【Pages】:5309-5311
【Authors】: Chongsheng Zhang ; Ruixing Zong ; Shuang Cao ; Yi Men ; Bofeng Mo
【Abstract】: Oracle Bone Inscriptions (OBI) research is of great significance to both history and literature. In this paper, we introduce our contributions to AI-powered Oracle Bone (OB) fragment rejoining and OBI recognition. (1) We build a real-world dataset, OB-Rejoin, and propose an effective OB rejoining algorithm that yields a top-10 accuracy of 98.39%. (2) We design practical annotation software to facilitate OBI annotation and build OracleBone-8000, a large-scale dataset with character-level annotations. We adopt deep learning based scene text detection algorithms for OBI localization, which yield an F-score of 89.7%, and propose a novel deep template matching algorithm for OBI recognition that achieves an overall accuracy of 80.9%. Since we have been cooperating closely with OBI domain experts, these efforts help advance their research. The resources of this work are available at https://github.com/chongshengzhang/OracleBone.
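Classical template matching, which the paper's deep variant builds on, can be sketched with normalized cross-correlation (NCC) on small grayscale grids. This is a generic NCC sketch, not the paper's deep template matching network:

```python
# Illustrative template matching: slide the template over the image and
# return the offset whose patch has the highest normalized
# cross-correlation with the template.

def ncc(patch, template):
    n = len(template) * len(template[0])
    flat_p = [v for row in patch for v in row]
    flat_t = [v for row in template for v in row]
    mp, mt = sum(flat_p) / n, sum(flat_t) / n
    num = sum((p - mp) * (t - mt) for p, t in zip(flat_p, flat_t))
    dp = sum((p - mp) ** 2 for p in flat_p) ** 0.5
    dt = sum((t - mt) ** 2 for t in flat_t) ** 0.5
    return num / (dp * dt) if dp and dt else 0.0

def match(image, template):
    th, tw = len(template), len(template[0])
    best = (-2.0, (0, 0))
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            best = max(best, (ncc(patch, template), (r, c)))
    return best[1]

image = [[0, 0, 0, 0],
         [0, 0, 9, 8],
         [0, 0, 7, 9],
         [0, 0, 0, 0]]
template = [[9, 8], [7, 9]]
```

A deep variant replaces raw pixel values with learned feature maps before correlation, which makes the match robust to the erosion and noise typical of oracle bone rubbings.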
【Keywords】: Computer Vision: general; Machine Learning: general; Knowledge Representation and Reasoning: general;