I'm a second-year PhD student working with Prof. Andrew McCallum and his excellent group of students! I completed my Master's at the Language Technologies Institute at Carnegie Mellon University, where I had a lot of fun working with Prof. Chris Dyer and Prof. Norman Sadeh. I got my Bachelor's in Computer Science from NIT Calicut, India.
I spent a year in the fantastic Bay Area working as a Machine Learning Engineer at @WalmartLabs.
Existing question answering methods infer answers either from a knowledge base or from raw text. While knowledge base (KB) methods are good at answering compositional questions, their performance is often affected by the incompleteness of the KB. Au contraire, web text contains millions of facts that are absent in the KB, albeit in an unstructured form. Universal schema can support reasoning on the union of both structured KBs and unstructured text by aligning them in a common embedded space. In this paper we extend universal schema to natural language question answering, employing memory networks to attend to the large body of facts in the combination of text and KB. Our models can be trained in an end-to-end fashion on question-answer pairs. Evaluation results on the SPADES fill-in-the-blank question answering dataset show that exploiting universal schema for question answering is better than using either a KB or text alone. This model also outperforms the current state-of-the-art by 8.5 F1 points.
Our goal is to combine the rich multistep inference of symbolic logical reasoning with the generalization capabilities of neural networks. We are particularly interested in complex reasoning about entities and relations in text and large-scale knowledge bases (KBs). Neelakantan et al. (2015) use RNNs to compose the distributed semantics of multi-hop paths in KBs; however, for multiple reasons, the approach lacks accuracy and practicality. This paper proposes three significant modeling advances: (1) we learn to jointly reason about relations, entities, and entity-types; (2) we use neural attention modeling to incorporate multiple paths; (3) we learn to share strength in a single RNN that represents logical composition across all relations. On a large-scale Freebase+ClueWeb prediction task, we achieve a 25% error reduction, and a 53% error reduction on sparse relations due to shared strength. On chains of reasoning in WordNet we reduce error in mean quantile by 84% versus previous state-of-the-art.
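The abstract does not spell out how evidence from several paths between an entity pair is combined. As a rough illustration only, the sketch below pools per-path scores with log-sum-exp, a smooth approximation of taking the maximum. The pooling choice, the function name `pool_path_scores`, and the example scores are all assumptions made for this sketch, not details taken from the paper.

```python
import math
from typing import Sequence


def pool_path_scores(scores: Sequence[float]) -> float:
    """Combine per-path scores with a numerically stable log-sum-exp.

    Assumption: each score is a real-valued compatibility between one
    path and the query relation (hypothetical values, for illustration).
    """
    if not scores:
        raise ValueError("need at least one path score")
    m = max(scores)
    # Subtracting the max keeps exp() from overflowing on large scores.
    return m + math.log(sum(math.exp(s - m) for s in scores))


# Three hypothetical paths connecting the same entity pair.
path_scores = [2.0, 0.5, -1.0]
pooled = pool_path_scores(path_scores)
print(f"pooled score: {pooled:.3f}")
```

Unlike a hard maximum, this pooling lets every path contribute to the final score, while strong paths still dominate.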
Continuous space word embeddings learned from large, unstructured corpora have been shown to be effective at capturing semantic regularities in language. In this paper we replace LDA's parameterization of "topics" as categorical distributions over opaque word types with multivariate Gaussian distributions on the embedding space. This encourages the model to group words that are a priori known to be semantically related into topics. To perform inference, we introduce a fast collapsed Gibbs sampling algorithm based on Cholesky decompositions of covariance matrices of the posterior predictive distributions. We further derive a scalable algorithm that draws samples from stale posterior predictive distributions and corrects them with a Metropolis-Hastings step. Using vectors learned from a domain-general corpus (English Wikipedia), we report results on two document collections (20-newsgroups and NIPS). Qualitatively, Gaussian LDA infers different (but still very sensible) topics relative to standard LDA. Quantitatively, our technique outperforms existing models at dealing with OOV words in held-out documents.
Knowledge bases (KBs), both automatically and manually constructed, are often incomplete: many valid facts can be inferred from the KB by synthesizing existing information. A popular approach to KB completion is to infer new relations by combinatory reasoning over the information found along other paths connecting a pair of entities. Given the enormous size of KBs and the exponential number of paths, previous path-based models have considered only the problem of predicting a missing relation given two entities, or evaluating the truth of a proposed triple. Additionally, these methods have traditionally used random paths between fixed entity pairs or, more recently, learned to pick paths between them. We propose a new algorithm, MINERVA, which addresses the much more difficult and practical task of answering questions where the relation is known but only one of the two entities is given. Since random walks are impractical in a setting with combinatorially many destinations from a start node, we present a neural reinforcement learning approach which learns how to navigate the graph conditioned on the input query to find predictive paths. Empirically, this approach obtains state-of-the-art results on several datasets, significantly outperforming prior methods.
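To make the query-conditioned walk concrete, here is a minimal sketch of the setting only: an agent starts at the known entity and repeatedly picks an outgoing edge (or stops) based on the query relation. The toy graph, the relation names, and `scripted_policy` are hypothetical. In particular, the hand-written policy merely stands in for the learned neural policy and involves no reinforcement learning.

```python
from typing import Callable, Dict, List, Optional, Tuple

Edge = Tuple[str, str]  # (relation, target entity)

# Hypothetical toy knowledge graph: entity -> outgoing edges.
KB: Dict[str, List[Edge]] = {
    "Alice": [("lives_in", "Paris"), ("works_at", "AcmeCorp")],
    "AcmeCorp": [("headquartered_in", "Berlin")],
    "Paris": [("capital_of", "France")],
    "Berlin": [("capital_of", "Germany")],
}

Policy = Callable[[str, str, List[Edge]], Optional[Edge]]

# Stand-in for a learned policy: relations worth following per query.
USEFUL_RELATIONS = {"employer_city": {"works_at", "headquartered_in"}}


def scripted_policy(query: str, entity: str, actions: List[Edge]) -> Optional[Edge]:
    """Pick the first edge whose relation is useful for the query, else stop."""
    for relation, target in actions:
        if relation in USEFUL_RELATIONS.get(query, set()):
            return (relation, target)
    return None  # None means "stay here and end the walk"


def walk(start: str, query: str, policy: Policy,
         max_steps: int = 3) -> Tuple[str, List[Edge]]:
    """Walk from `start`, letting `policy` choose one edge per step."""
    current, path = start, []
    for _ in range(max_steps):
        choice = policy(query, current, KB.get(current, []))
        if choice is None:
            break
        path.append(choice)
        current = choice[1]
    return current, path


answer, path = walk("Alice", "employer_city", scripted_policy)
print(answer, path)
```

The walk never enumerates candidate answer entities up front; the destination is simply wherever the policy's chosen path ends.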
Entity resolution systems often rely on string similarity between entity mention spellings. These string similarity models are potentially more effective when they are learned for a particular domain. For example, “John A. Smith” is more similar to “John Austin Smith” than “John B. Smith” for Western names and “ABC” is more similar to “ABC Co.” than “CBC” for business names. However, an unweighted edit distance model would make wrong predictions in both of these cases. In this paper, we train neural network models of string similarity to predict how likely two mentions are to refer to the same entity based solely on the spelling of the mentions. Our approach uses recurrent neural network models to learn embedded representations of the characters of a string, and a learned model to score the alignment of the embedded representations of a pair of strings. We describe an approach using convolutional neural networks to score the alignment of the embedded representation. We compare our approach and several baseline approaches on large datasets and find that the convolutional alignment model significantly outperforms the next best baseline.
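The failure of unweighted edit distance on the two examples above can be checked directly. The sketch below uses a standard Levenshtein implementation in which insertions, deletions, and substitutions all cost 1. It illustrates the baseline only, not the learned similarity models described in the abstract.

```python
def edit_distance(a: str, b: str) -> int:
    """Unweighted Levenshtein distance between strings a and b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


pairs = [
    ("John A. Smith", "John Austin Smith"),  # same person, expanded initial
    ("John A. Smith", "John B. Smith"),      # different middle initial
    ("ABC", "ABC Co."),                      # same business, added suffix
    ("ABC", "CBC"),                          # different business
]
for left, right in pairs:
    print(f"{left!r} vs {right!r}: {edit_distance(left, right)}")
```

The distances come out as 5, 1, 4, and 1, so in both cases the unweighted model ranks the wrong candidate as closer.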
Relation extraction is one of the core challenges in automated knowledge base construction. One line of approach for relation extraction is to perform multi-hop reasoning on the paths connecting an entity pair to infer new relations. While these methods have been successfully applied for knowledge base completion, they do not utilize the entity or the entity type information to make predictions. In this work, we incorporate selectional preferences, i.e., the constraints that relations place on the allowed entity types of their candidate entities, into multi-hop relation extraction by including entity type information. We achieve a 17.67% improvement in MAP score in a relation extraction task when compared to a method that does not use entity type information.