Here you will find links to papers and abstracts by students and architects of NLP++.

2024

“Low-resource Medical Coding of Hospital Discharge Summaries” (to be published) – Aston Williamson
“Scalable Analysis of English Dictionary Files on HPCC Systems Big Data Platform” – Adarsh U, Jayanth C, David de Hilster, Hugo Watanuki

Adarsh U and Jayanth C, along with Professors Shobha and Jyoti Shetty from RSVE University in India, presenting their paper at a conference in Japan in early 2024.

2023

“Emotions Detection in Social Media Posts” – Pedro Rodrigues, Renato de Oliveira Moraes, David de Hilster, Hugo Watanuki

Pedro Rodrigues presenting his project on sentiment analysis of tweets at the 30th annual SIICUSP symposium.

2000s

“Text processing in an Integrated Development Environment (IDE): Integrating Natural Language Processing (NLP) techniques” – Paul Deane, Amnon Meyers, David de Hilster – 2001
“Integrated Development Environments for Natural Language Processing” – Text Analysis International, 2001
“Multi-Pass Multi-Strategy NLP” – Amnon Meyers, 2003
“Review: Software: VisualText” – Terence Langendoen, LinguistList.org, 2002

1990s

MUC: Message Understanding Conferences – 1991-1992

The Message Understanding Conferences (MUC) were initiated and financed by DARPA (Defense Advanced Research Projects Agency) to encourage the development of new and better methods of information extraction. The character of this competition, with many concurrent research teams competing against one another, required the development of standards for evaluation, e.g. the adoption of metrics like precision and recall.

Authors Amnon Meyers and David de Hilster participated in 1991 and 1992, with Amnon Meyers being instrumental in helping coordinate these first conferences. Motivated by the VOX system, DARPA and the Naval Ocean Systems Center (NOSC) launched the MUC series of workshops.

“MUC-3 Test Results and Analysis” – Meyers & de Hilster, 1991
“Description of the INLET System Used for MUC-3” – Meyers & de Hilster, 1991
“MUC-4 Test Results and Analysis” – Meyers & de Hilster, 1992
“Description of the TexUS System as Used for MUC-4” – Meyers & de Hilster, 1992

1980s

Papers by Amnon Meyers – 1980s

“VOX – An Extensible Natural Language Processor” – Amnon Meyers, 1985
“VOX Naval Text Understanding System” – Amnon Meyers, 1985

Papers by David de Hilster – 1980s

“Natural language processing at Battelle-Columbus” – Klaus Obermeier & David de Hilster, 1985
“DIID — Data independent interface for database” (abstract) – Klaus Obermeier & David de Hilster, 1986

Phonological Expert System – 1984

This paper was written by David de Hilster for a project during his master’s degree in Linguistics. He used LISP on a Xerox 1108 machine. The paper was printed out on a teletype machine and has recently been scanned into a PDF document. The document has not been converted to text; it is still in “image” form and is somewhat faded. Here is some text from the first pages.

The concept of a phonological expert system involves much more than what I have attempted to do in this project. It is fair to say that what I have done is created some visual “tools” which a phonological expert system can eventually use.

The tools are as follows:

Distinctive Features (DF) table inquiries.
Prefix and Suffix rule combinations of regular forms. (Harmony rules are excluded for now)

This part of the Phonological Expert System (PHONEX) is not “AI-ish” in the true sense of the word but one cannot attempt to build or experiment with PHONEX until the tools are in place. Therefore, I will only briefly discuss how I envision the future structures and concepts of PHONEX, focusing most of my attention on the tools which I have created.

What is a Phonological Expert System?

PHONEX will be an expert system which will be able to analyze the phonological surface forms of a given language and break them down into rules or exceptions. This data would then be used to assist in the understanding and generation of speech and will eventually enable a system to constantly update and analyze occurring speech. Thus, when the system encounters a new word, it can successfully use the current rule system to generate its essential forms (i.e. plural person …).

The data which PHONEX will eventually start with will be from two sources:

the surface forms stored in a DF table
and the surface forms of words stored in FRAMES.

The knowledge representation in PHONEX is yet to be determined. Before the knowledge base is constructed, certain tools must be made. One of these tools is rule generations for prefixes and suffixes. That is what my program is about.

Link to Paper

Phonological Expert System