Every Way to Run an NLP++ Analyzer

One of the strengths of NLP++ is that once you’ve written an analyzer in the VisualText VS Code extension, you are not locked into a single way of running it. The same glass-box, 100% rule-based analyzer can be driven from Python, from Node.js, from TypeScript, or straight from the command line — and it can run either interpreted from its .nlp source or compiled to native shared libraries for speed and distribution.

This article walks through all of the officially supported ways to run NLP++ analyzers, when to reach for each one, and where to find them.

The two run modes behind all of them

Before the individual packages, it helps to understand a distinction that runs through every option below.

Every NLP++ analyzer can run in one of two modes. Interpreted mode is the default: the engine runs your analyzer straight from its .nlp source files. This is ideal during development because every edit takes effect on the next run, and you can watch the parse tree evolve pass by pass. Compiled mode takes the analyzer’s rule passes and its knowledge base and compiles them once into native shared libraries — a .dll on Windows, a .dylib on macOS, a .so on Linux — which the engine loads at runtime. Compiled analyzers run entirely from that native code, so they’re faster on large inputs and can be shipped as a “frozen” build without bundling the source. The trade-off is that source edits no longer change the output until you recompile.
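
To make the trade-off concrete, here is a minimal sketch of two checks a build script might perform: picking the shared-library extension for the current platform, and deciding whether a compiled build has gone stale because a .nlp source file changed after it was built. The function names, the paths, and the timestamp comparison are illustrative assumptions rather than part of any NLP++ package, and the check deliberately ignores knowledge-base changes.

```python
import platform
from pathlib import Path

# Extensions named in the text: .dll (Windows), .dylib (macOS), .so (Linux).
_SUFFIX_BY_SYSTEM = {"Windows": ".dll", "Darwin": ".dylib", "Linux": ".so"}


def shared_library_suffix(system=None):
    """Return the shared-library extension for the given (or current) OS."""
    system = system or platform.system()
    try:
        return _SUFFIX_BY_SYSTEM[system]
    except KeyError:
        raise ValueError(f"unsupported platform: {system!r}") from None


def needs_recompile(source_dir, library_path):
    """Return True if the library is missing or any .nlp file is newer than it.

    Both paths are hypothetical: where a real build stages its library
    depends on the package or script that produced it.
    """
    library = Path(library_path)
    if not library.exists():
        return True
    built_at = library.stat().st_mtime
    # Any source file modified after the build means the output is stale.
    return any(src.stat().st_mtime > built_at
               for src in Path(source_dir).rglob("*.nlp"))
```

A script could call needs_recompile() before each run and fall back to interpreted mode, or trigger a rebuild, whenever it returns True.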

Most of the packages below expose both modes, and several offer a one-call cloud build so you never have to touch a C++ toolchain yourself. With that in mind, here are your options.

1. The NLPPlus Python package (production Python)

The NLPPlus Python package is the recommended way to call NLP++ analyzers from Python in production. Rather than shelling out to a command-line executable, it links the C++ libraries of the NLP Engine directly through native bindings, so calls run in-process and avoid the cost of starting a new process for each one.

Installation is a single pip command:

pip install nlpplus

Basic usage runs the default US-English parser and returns XML:

import NLPPlus
xml = NLPPlus.analyze("Hello world.")

The package ships with several ready-to-use analyzers — a full English parser plus extractors for telephone numbers, links, email addresses, and postal addresses — and because NLP++ is glass-box, you can copy any of them somewhere writable and edit them to your exact needs. It also exposes a rich results object giving you the parsed output.json, the final parse tree, and more.
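
Since analyze() returns XML, a natural first step is to load it with a standard XML parser. The sketch below assumes only that the returned string is well-formed XML with a single root element; the sample document is a stand-in made up for illustration and does not reflect the element names NLPPlus actually emits.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def tag_counts(xml_text):
    """Count how often each element tag appears in an XML string."""
    root = ET.fromstring(xml_text)
    return Counter(element.tag for element in root.iter())


# Stand-in XML for illustration only; this is NOT real NLPPlus output.
sample = "<doc><sentence><word>Hello</word><word>world</word></sentence></doc>"
print(tag_counts(sample).most_common())
```

Running tag_counts() on real analyzer output is a quick way to see which elements a given analyzer produces before writing more specific extraction code.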

For compiled mode, the simplest path is a single call to cloud_compile(), which generates the C++ trees, ships them to the public nlp-compile-service cloud builder, downloads the shared library for your platform, and stages it for you:

import NLPPlus
NLPPlus.cloud_compile("parse-en-us")
xml = NLPPlus.analyze("Hello world.", compiled=True)

Requires Python 3.10 or newer. This is the package to reach for whenever performance and Python are both priorities.
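
For batch work, the two-step pattern above extends naturally to "compile once, analyze many". The following is a minimal sketch that uses only the two calls shown in this section, analyze() and cloud_compile(); the analyze_batch helper and its engine parameter are hypothetical, added so the function can be exercised with a stub when NLPPlus is not installed.

```python
DEFAULT_ANALYZER = "parse-en-us"  # the analyzer name used in the example above


def analyze_batch(texts, engine=None, compiled=False):
    """Run each text through the default parser, optionally in compiled mode.

    `engine` is any object with analyze() and cloud_compile(); it defaults
    to the NLPPlus module. This helper is illustrative, not package API.
    """
    if engine is None:
        import NLPPlus as engine  # lazy import: only needed for real runs
    if compiled:
        # Compile once up front, then reuse the staged library for every text.
        engine.cloud_compile(DEFAULT_ANALYZER)
        return [engine.analyze(text, compiled=True) for text in texts]
    return [engine.analyze(text) for text in texts]
```

Called as analyze_batch(["Hello world.", "Goodbye."], compiled=True), it would issue one cloud build followed by two in-process compiled runs.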

2. The nlpplus Node.js package (production Node.js)

The nlpplus npm package is the Node.js peer of the Python package. It links the same C++ engine libraries into a native Node-API addon, so analyzer calls run in-process rather than spawning a subprocess.

npm install nlpplus

const nlpplus = require('nlpplus');
const xml = nlpplus.analyze('Hello world.');

It bundles the same set of analyzers (parse-en-us, address-parser, emailaddress, links, telephone), exposes an Engine class with a Results object for structured output, and supports the same compiled mode with an async cloudCompile() call. One practical note from the docs: when you create an Engine explicitly, always call engine.close() (for example in a finally block) so temporary working folders get cleaned up — this matters especially on Windows.

Requires Node.js 18 or newer. When a prebuilt binary exists for your platform it’s used automatically; otherwise the package builds from source at install time.

3. The Python NLPEngine class (simple scripting)

If you want something lighter-weight for non-production scripting, the VisualText/python repository provides an NLPEngine class that simply shells out to the command-line nlp.exe. The README is candid about this: the subprocess approach suits tasks that aren’t meant for production, and for production work the native NLPPlus package is the better choice.

from python.nlpengine import NLPEngine
nlp = NLPEngine(engineDir=".", analyzersDir="data")
nlp.analyzeFile("rfb", "text.txt", dev=True)

The class still supports the full compiled-mode workflow through compileAnalyzer() (which emits the C++ trees) and compileLocal() (which drives the platform’s compile script end-to-end and stages the shared libraries). For this class to work you need the NLP Engine command-line executable on your system, which comes from the per-OS bundles described below.

4. The TypeScript NLPEngine class (Node scripting)

The ts-nlp-engine repository offers a TypeScript class that calls nlp.exe via child_process. Like the Python class above, it’s the simpler shell-out model, and the README points you at the native nlpplus npm package for production workloads.

import { NLPEngine } from './nlp';
const nlpEngine = new NLPEngine(engineDir, analyzersDir);
nlpEngine.analyzeStr('Telephone-Numbers', 'text.txt', 'the phone number is 555-1212');
const output = nlpEngine.outputFileContents('Telephone-Numbers', 'text.txt', 'codes.json');

It’s a convenient choice if you would rather write your Node.js scripts in TypeScript and don’t mind the subprocess overhead. It includes a handful of helpers for reading analyzer output, managing input directories, clearing log files, and running analyzers in both interpreted and compiled modes.

5. Straight from the command line (per-OS engine bundles)

At the bottom of every option above sits the NLP Engine executable itself, nlp.exe. The three per-OS repositories package it as a ready-to-run binary distribution (no compilation required), together with the data knowledge bases and a Python wrapper.

Running an analyzer is a matter of pointing the executable at an analyzer folder (-ANA), a working directory containing the data/ tree (-WORK), and an input text file:

./nlp.exe -ANA data/rfb -WORK . data/rfb/input/text.txt

Add -DEV to keep the per-pass parse-tree and knowledge-base log files for inspection. Each bundle also includes a compile-analyzer script (.ps1/.bat on Windows, .sh on Linux and macOS) that handles compiled mode locally: a full compile of both the analyzer rules and the knowledge base, a KB-only compile for when just the knowledge base changed, or an analyzer-only compile for when only the rules changed. After compiling, add -COMPILED to your command to run against the native libraries instead of the interpreter.
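
If you drive the executable from a script, it can help to assemble the argument list in one place. The sketch below builds the command from the flags described above (-ANA, -WORK, -DEV, -COMPILED); the function names are hypothetical, and placing -DEV and -COMPILED before the input file is an assumption, since the text only says to add them to the command.

```python
import subprocess


def build_command(exe, analyzer_dir, work_dir, input_file,
                  dev=False, compiled=False):
    """Assemble an nlp.exe argument list from the flags described above."""
    cmd = [str(exe), "-ANA", str(analyzer_dir), "-WORK", str(work_dir)]
    if dev:
        cmd.append("-DEV")       # keep per-pass log files for inspection
    if compiled:
        cmd.append("-COMPILED")  # run against the native libraries
    # Assumption: optional flags go before the input file path.
    cmd.append(str(input_file))
    return cmd


def run_analyzer(*args, **kwargs):
    """Run the engine and raise CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_command(*args, **kwargs),
                          capture_output=True, text=True, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when analyzer or input paths contain spaces.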

These bundles track the upstream engine releases automatically, so a release tagged v3.x here contains exactly the binaries built from that version of the engine.

Which one should you use?

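A quick way to choose:

Production Python: the NLPPlus Python package.
Production Node.js: the nlpplus npm package.
Quick Python scripting: the NLPEngine class in the VisualText/python repository.
Node scripting in TypeScript: the NLPEngine class in the ts-nlp-engine repository.
Shell scripts or no wrapper at all: nlp.exe from the per-OS engine bundles.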

Whichever path you pick, you’re running the same analyzer you built and can see, inspect, and modify in VisualText — no statistical black box, no retraining, just deterministic, human-readable rules you own. And when you’re ready for speed or a frozen distributable, compiled mode is one call away.

To learn more about writing the analyzers themselves, visit visualtext.org, grab the NLP++ VS Code extension, or pick up the NLP++ textbook.
