A new platform for the development of large-scale Natural Language / Knowledge Representation and Reasoning Applications (Poster Abstract)

keywords: NLP, AI, Reasoning

Accepted for HPSG-2014 conference

We have created a development platform for Natural Language / Knowledge Representation and Reasoning applications that is analogous to the integrated development environments (IDEs) used by programmers of software applications. The platform (referred to herein as "HSA") and several applications built with it will be commercially available this year. HSA will also be made available to researchers.

Several advances have made HSA possible. The most significant are a new parallel, feature-logic-based parsing and generation algorithm; a new description logic engine based on OWL 2 EL++ and rough set logic; and a set of algorithms for aligning multiple lexical and construction resources into a "constructicon". Benchmark results and demonstrations will be provided during the poster session.
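As an illustration of the rough set idea mentioned above, the following minimal Python sketch computes the standard lower and upper approximations of a concept over a small set of entities. It is not HSA's engine: the function name `approximations`, the toy data, and the attribute names are all assumptions made for this example.

```python
from collections import defaultdict


def approximations(universe, attributes, target):
    """Return the (lower, upper) rough-set approximations of `target`.

    universe:   dict mapping an object id to a dict of attribute values
    attributes: attribute names used to decide indiscernibility
    target:     set of object ids belonging to the concept of interest
    """
    # Objects with identical values on the chosen attributes are
    # indiscernible and fall into the same equivalence class.
    classes = defaultdict(set)
    for obj, values in universe.items():
        key = tuple(values[a] for a in attributes)
        classes[key].add(obj)

    lower, upper = set(), set()
    for members in classes.values():
        if members <= target:   # class lies entirely inside the concept
            lower |= members
        if members & target:    # class overlaps the concept
            upper |= members
    return lower, upper


# Hypothetical toy data, invented for this sketch.
ENTITIES = {
    "sparrow": {"flies": True, "has_feathers": True},
    "eagle": {"flies": True, "has_feathers": True},
    "penguin": {"flies": False, "has_feathers": True},
    "ostrich": {"flies": False, "has_feathers": True},
    "bat": {"flies": True, "has_feathers": False},
}
CONCEPT = {"sparrow", "eagle", "penguin"}

LOWER, UPPER = approximations(ENTITIES, ["flies", "has_feathers"], CONCEPT)
print(sorted(LOWER))  # members that certainly belong to the concept
print(sorted(UPPER))  # members that possibly belong to the concept
```

In this toy case the penguin and the ostrich cannot be told apart by the two attributes, so both land in the boundary region: they appear in the upper approximation but not in the lower one.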

HSA is a multi-user, multi-process Web-based environment that combines large-scale deep linguistic processing, several reasoning and pattern inference engines, and test beds for linguistic and agent-based systems. HSA consists of a number of inter-related subsystems, the most important being:

  1. A development environment for large-scale lexicons (such as WordNet and VerbNet), gazetteers (such as geolocation datasets), and other encyclopedic knowledge sets (such as FrameNet, Wikidata, DBpedia, and more specialized sources such as the Open Biological Ontology Foundry). The unique capabilities of this subsystem include a user-friendly interface for knowledge editing and the ability to discover and create connections that bridge separate knowledge sets. The resources managed by this subsystem are directly usable by the other HSA subsystems.

  2. A computational linguistics subsystem consisting of a development environment for very large-scale grammars; multiple grammars, including a wide-coverage grammar for English based on Sign-Based Construction Grammar and a version of the English Resource Grammar; and a new proprietary parallel constraint-based feature logic engine for parsing and generating language. A morphology component accounts for wide-ranging language productivity phenomena and for the inferred semantics of lexemes that are not found in the lexicon, including semantically classified proper names and multi-word expressions.

  3. A description logic-based reasoning system built on OWL 2 EL++ and rough set logic. Important uses of the reasoner include adapting the semantic structures generated through natural language parsing for situation and context analysis and for agent-based micro-world modelling.

  4. A document processing subsystem for context-sensitive frame-based information extraction and summarization. Processed documents are semantically indexed for use by HSA and by content management systems. Document processing also generates new lexical data to extend existing lexicons with novel or context-sensitive terms (such as the jargon used by a particular company or work team).

  5. Software agent development tools including specialized inference engines.
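The constraint-based feature logic engine in subsystem 2 is proprietary, so its algorithm is not shown here. The core operation of any such engine is unification of feature structures, which the following minimal Python sketch illustrates. It makes simplifying assumptions: feature structures are plain nested dictionaries, with no types and no structure sharing (reentrancy), and the names `unify`, `UnificationFailure`, and the sample features are invented for this example.

```python
class UnificationFailure(Exception):
    """Raised when two feature structures carry conflicting values."""


def unify(left, right):
    """Return the unification of two untyped feature structures.

    A feature structure is either an atomic value (here a string) or a
    dict mapping feature names to feature structures. The inputs are not
    modified.
    """
    if isinstance(left, dict) and isinstance(right, dict):
        result = dict(left)
        for feature, value in right.items():
            if feature in result:
                # Both sides constrain this feature: unify recursively.
                result[feature] = unify(result[feature], value)
            else:
                # Only the right side constrains it: carry the value over.
                result[feature] = value
        return result
    if left == right:
        return left
    raise UnificationFailure(f"{left!r} conflicts with {right!r}")


# Hypothetical feature structures, invented for this sketch.
NOUN_PHRASE = {"CAT": "np", "AGR": {"NUM": "sg"}}
SUBJECT_SLOT = {"CAT": "np", "AGR": {"NUM": "sg", "PER": "3"}}
PLURAL = {"AGR": {"NUM": "pl"}}

MERGED = unify(NOUN_PHRASE, SUBJECT_SLOT)
print(MERGED)

try:
    unify(NOUN_PHRASE, PLURAL)
except UnificationFailure as err:
    print("unification failed:", err)
```

A singular noun phrase unifies with a third-person singular subject slot and gains the person feature, while unifying it with a plural constraint fails. A production engine would add a type hierarchy and reentrancy, as described by Carpenter (1992).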

The application deployment platform shares much of its codebase with the HSA development environment. Applications built with HSA may be hosted on industry-standard hardware and software environments or in the cloud.

Below is a screenshot of HSA showing WordNet 3.1 in the Lexicon subsystem.

[Screenshot: HSA Lexicon subsystem displaying WordNet 3.1]

References:

  1. Hans C. Boas and Ivan A. Sag, eds. 2012. Sign-Based Construction Grammar. CSLI Publications.
  2. Ray Jackendoff and Peter Culicover. 2005. Simpler Syntax. Oxford University Press.
  3. Jonathan Ginzburg and Ivan Sag. 2001. Interrogative Investigations: The Form, Meaning, and Use of English Interrogatives. CSLI Publications.
  4. Rochelle Lieber. 2009. Morphology and Lexical Semantics. Cambridge University Press.
  5. Bob Carpenter. 1992. The Logic of Typed Feature Structures. Cambridge University Press.
  6. http://www.w3.org/TR/owl2-profiles/
  7. Franz Baader, Sebastian Brandt, and Carsten Lutz. 2008. Pushing the EL Envelope Further. In Proc. of the Washington DC workshop on OWL: Experiences and Directions (OWLED08DC).
  8. http://wordnet.princeton.edu/
  9. http://verbs.colorado.edu/~mpalmer/projects/verbnet.html
  10. https://framenet.icsi.berkeley.edu/fndrupal/