Full English Parser – VisualText

NOTE: The latest VisualText is required for running and modifying TAIParse. Even the compiled version needs runtime libraries bundled with VisualText. DOWNLOAD HERE.

TAIParse also performs part-of-speech tagging at 94% accuracy on a blind test business article corpus.

Natural language processing (NLP) generally refers to the complete linguistic and conceptual processing of a text. To facilitate the construction of natural language processing products, TAI is now making available some general text analysis prototypes that can be used as a starting point for a host of applications, such as information extraction, categorization, summarization, and question parsing.

TAIParse is a general analyzer that emphasizes the minimal use of knowledge (“just-in-time” knowledge) to perform part-of-speech tagging, entity extraction, and shallow parsing. TAIParse is an excellent starting point for customizing your own text analysis capabilities. For one thing, TAIParse includes a full lexicon with part-of-speech information within its knowledge base. For another, it illustrates the latest features of the NLP++^® language in action. TAIParse further illustrates the ease of implementation of NLP systems with the VisualText^® IDE (SDK, tools, etc.). TAIParse includes these capabilities and more:

Zoning and “parsing-per-line” to characterize regions and formats in text
Dynamic and context-dependent part-of-speech assignment and parsing
Successive segmentation of text in a “divide-and-conquer” strategy
Treatment of unknown words
Noun phrase extraction
A semantic and discourse processing framework that ties into an ontology and dynamic representation of the analysis within the knowledge base
Processing INDEPENDENT of capitalization, so that, for example, all-uppercase text regions can be analyzed.
Robust analysis in the face of errors, misspellings, and ungrammatical text