Here you will find links to papers and abstracts by students and architects of NLP++.
2024
- “Low-resource Medical Coding of Hospital Discharge Summaries” (to be published) – Aston Williamson
- “Scalable Analysis of English Dictionary Files on HPCC Systems Big Data Platform” – Adarsh U, Jayanth C, David de Hilster, Hugo Watanuki

2023
- “Emotions Detection in Social Media Posts” – Pedro Rodrigues, Renato de Oliveira Moraes, David de Hilster, Hugo Watanuki

2000s
- “Text processing in an Integrated Development Environment (IDE): Integrating Natural Language Processing (NLP) techniques” – Paul Deane, Amnon Meyers, David de Hilster – 2001
- “Integrated Development Environments for Natural Language Processing” – Text Analysis International, 2001
- “Multi-Pass Multi-Strategy NLP” – Amnon Meyers, 2003
- “Review: Software: VisualText” – Terence Langendoen, LinguistList.org, 2002
1990s
MUC: Message Understanding Conferences – 1991-1992
The Message Understanding Conferences (MUC) for computing and computer science, were initiated and financed by DARPA (Defense Advanced Research Projects Agency) to encourage the development of new and better methods of information extraction. The character of this competition, many concurrent research teams competing against one another—required the development of standards for evaluation, e.g. the adoption of metrics like precision and recall.
Authors Amnon Meyers and David de Hilster participated in 1991 and 1992 with Amnon Meyers being instrumental in helping coordinate these first conferences. Motivated by the VOX system, DARPA and Naval Ocean Systems Center (NOSC) launched the MUC series of workshops.
- “MUC-3 Test Results and Analysis” – Meyers & de Hilster, 1991
- “Description of the INLET System Used for MUC-3” – Meyers & de Hilster, 1991
- “MUC-4 Test Results and Analysis” – Meyers & de Hilster, 1992
- “Description of the TexUS System as Used for MUC-4” – Meyers & de Hilster, 1992
1980s
Papers by Amnon Meyers – 1980s
- “VOX – An Extensible Natural Language Processor” – Amnon Meyers, 1985
- “VOX Naval Text Understanding System” – Amnon Meyers, 1985
Papers by David de Hilster – 1980s
- “Natural language processing at Battelle-Columbus” – Klaus Obermeier & David de Hilster, 1985
- “DIID — Data independent interface for database” (abstract) – Klaus Obermeier & David de Hilster, 1986
Phonological Expert System – 1984
This paper was written by David de Hilster for a project during this master’s degree in Linguistics. He used LISP on an Xerox 1108 machine. The paper was printed out on a teletype machine and has recently been scanned and is in a pdf document. This document has not been converted to text and is still in an “image” form and is somewhat faded. Here is some text from the first pages.
The concept or a phonological expert system involves much more then what I have attempted to do in this project. It is fair to say that what I have done is created some visual “tools” which a phonological expert system can eventually use.
The tools are as follows:
- Distinctive Features (DF) table inquires.
- Prefix and Suffix rule combinations of regular forms. (Harmony rules are excluded for now)
This part of the Phonological Expert System (PHONEX) is not “AI-ish” in the true sense of the word but one cannot attempt to build or experiment with PHONEX until the tools are in place. Therefore, I will only briefly discuss how I envision the future structures and concepts or PHONEX focusing most of my attention on the tools which I have created.
What is a Phonological Expert System?
PHONEX will be an expert system which will be able to analyze the phonological surface forms of a given language and break them down into rules or exceptions. This data would then be used to assist in the understanding and generation of speech and will eventually enable a system to constantly update and analyze occurring speech. Thus, when the system encounters a new word, it can successfully use the current rule system to generate its essential forms (i.e. plural person …).
The data which PHONEX Will eventually start with will be from two source s:
- the surface forms stored in a DF table
- and the surface forms of words stored in FRAMES.
The knowledge representation in PHONEX is yet to be determined. Before the knowledge base is constructed, certain tools must be made. One of these tools is rule generations for prefixes and suffixes. That is what my program is about.
Link to Paper
