WebSpellChecker security advisory WSC-SA-2026-001

Security advisory WSC-SA-2026-001 covering CVE-2026-40682, CVE-2026-42027, and CVE-2026-42440 in Apache OpenNLP bundled with WProofreader. Status: not exploitable, no customer action required.

Apache OpenNLP CVEs in WProofreader (opennlp-tools 1.9.4)

  • Advisory ID: WSC-SA-2026-001

  • Status: Not exploitable in the current deployment. Upgrade planned.

  • Date issued: 2026-05-12

  • Last updated: 2026-05-20

  • Affected product: WProofreader, all currently supported versions

  • Affected component: org.apache.opennlp:opennlp-tools:1.9.4

  • Component type: Bundled transitive dependency

  • CVEs covered: CVE-2026-40682, CVE-2026-42027, CVE-2026-42440

No customer action is required at this time.

Affected WProofreader versions

  • All versions prior to 6.13.0: Protected by architectural controls only. OpenNLP models load from trusted bundled resources. There is no external model substitution path.

  • WProofreader 6.13.0 and later: The same architectural controls apply. Bundled OpenNLP model files are also verified by SHA-256 at load time.

Summary

Container image scanners such as Docker Scout, Trivy, Grype, and Snyk flag org.apache.opennlp:opennlp-tools:1.9.4 inside the WProofreader Docker image.

WebSpellChecker reviewed the source code and execution paths in detail. These findings do not create risk in the current WProofreader runtime under normal operating conditions.

The vulnerable code is present in the image, and some of it runs during normal operation. However, none of the affected code paths receives attacker-controlled input.

WebSpellChecker is working to upgrade the affected dependency.

Background

WProofreader uses LanguageTool as its grammar engine. LanguageTool depends on Apache OpenNLP for English-language chunking, tokenization, and part-of-speech tagging.

The opennlp-tools 1.9.4 JAR ships in the WProofreader Docker image through this dependency chain.

The CVEs covered by this advisory require crafted input for specific OpenNLP code paths:

  • a crafted binary model file (.bin)

  • a crafted dictionary XML file

In WProofreader, these code paths read only JAR-bundled resources fixed at build time.

There is no HTTP API, configuration option, or file path that lets an external user or network attacker replace these inputs.

Detailed analysis per CVE

CVE-2026-40682 — XXE injection in DictionaryEntryPersistor

Upstream description

opennlp.tools.dictionary.serializer.DictionaryEntryPersistor configures its SAX parser without FEATURE_SECURE_PROCESSING. A crafted XML dictionary stream can trigger external entity resolution. This can lead to local file disclosure and SSRF.
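For readers unfamiliar with this class of issue, the following is a minimal sketch of a SAX parser configured to reject external entities. It is not the upstream OpenNLP fix. The class name HardenedSax is hypothetical, and the Apache feature URIs assume the JDK's built-in Xerces-based parser.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of OpenNLP, LanguageTool, or WProofreader.
class HardenedSax {

    /** Builds a SAX parser that refuses DOCTYPE declarations and external entities. */
    static SAXParser newParser() throws ParserConfigurationException, SAXException {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Assumption: the JDK's default parser, which recognizes these feature URIs.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
        factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        return factory.newSAXParser();
    }

    /** Returns true if the document parses, false if the parser rejects it. */
    static boolean parses(String xml) {
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        try {
            newParser().parse(new ByteArrayInputStream(bytes), new DefaultHandler());
            return true;
        } catch (SAXException rejected) {
            // A DOCTYPE declaration is reported as a fatal error and lands here.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this configuration a dictionary stream that declares a DOCTYPE is rejected before any entity is resolved, while plain XML still parses.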

Status in WProofreader

Not affected.

Justification

vulnerable_code_not_in_execute_path

Analysis

A full source search of the LanguageTool tree shows zero imports and zero invocations of the opennlp.tools.dictionary package.

The vulnerable class exists in the opennlp-tools JAR. It is never loaded or referenced at runtime.

There is no direct or transitive path to:

  • DictionaryEntryPersistor.create()

  • Dictionary(InputStream)

CVE-2026-42440 — Denial of service in AbstractModelReader

Upstream description

opennlp.tools.ml.model.AbstractModelReader reads a 32-bit integer from a binary model file. It then allocates an array of that size without bounds validation. A crafted model file with a maximum-sized integer field can trigger OutOfMemoryError and crash the JVM.
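The missing check can be illustrated with a minimal sketch that validates the count before allocating. This is not the OpenNLP reader or its fix. BoundedCountReader and the MAX_ENTRIES limit are hypothetical names and values chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical reader, not the OpenNLP AbstractModelReader.
class BoundedCountReader {

    // Assumed upper bound for illustration; a real limit depends on the model format.
    static final int MAX_ENTRIES = 1_000_000;

    /** Reads a 32-bit count, validates it, and only then allocates the array. */
    static int[] readEntries(DataInputStream in) throws IOException {
        int count = in.readInt();
        if (count < 0 || count > MAX_ENTRIES) {
            throw new IOException("Implausible entry count: " + count);
        }
        int[] entries = new int[count];
        for (int i = 0; i < count; i++) {
            entries[i] = in.readInt();
        }
        return entries;
    }

    /** Convenience wrapper for in-memory data; returns -1 when the input is rejected. */
    static int entryCount(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return readEntries(in).length;
        } catch (IOException e) {
            return -1;
        }
    }
}
```

A header that carries Integer.MAX_VALUE is rejected with an IOException instead of reaching the array allocation.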

Status in WProofreader

Not affected.

Justification

inline_mitigations_already_exist

Analysis

AbstractModelReader is invoked indirectly during English-language analysis.

The relevant call chain is:

The input stream is loaded through Tools.getStream("/en-pos-maxent.bin"). That path resolves through JLanguageTool.getDataBroker().getAsStream() and the JVM classpath.

The .bin file comes from the bundled Maven artifact edu.washington.cs.knowitall:opennlp-postag-models:1.5. This artifact is fixed at build time inside the WProofreader Docker image.

The vulnerable code runs during normal operation. However, the trusted model file contains a small and valid integer count.

The malicious precondition is absent. WProofreader does not expose any HTTP request, configuration setting, or file path that allows model substitution.

Starting with WProofreader 6.13.0, each bundled OpenNLP model is verified by SHA-256 at load time. A failed check blocks startup. This eliminates the post-compromise filesystem tampering path described below.

CVE-2026-42027 — Unsafe reflection in ExtensionLoader

Upstream description

ExtensionLoader.instantiateExtension(Class, String) calls Class.forName() on a class name from manifest.properties inside a model archive. This happens before the resolved class is validated against the expected extension interface. A crafted model archive can trigger arbitrary static initializer execution and pre-RCE gadget chains.
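The ordering problem can be shown with a minimal sketch that resolves the class without initializing it, checks the type, and only then instantiates. This is not the OpenNLP ExtensionLoader or its fix. SafeExtensionLoader, Untrusted, and Greeter are hypothetical names used for illustration.

```java
// Hypothetical loader, not the OpenNLP ExtensionLoader.
class SafeExtensionLoader {

    // Set by the static initializer of Untrusted, to show whether it ran.
    static boolean untrustedInitialized = false;

    /** A class that does not implement the expected interface. */
    static class Untrusted {
        static {
            untrustedInitialized = true;
        }
    }

    /** A legitimate extension. */
    static class Greeter implements Runnable {
        @Override
        public void run() {
        }
    }

    /** Loads className without initializing it, checks its type, then instantiates. */
    static <T> T instantiate(Class<T> expected, String className) {
        try {
            // initialize=false: static initializers do not run during lookup.
            ClassLoader loader = SafeExtensionLoader.class.getClassLoader();
            Class<?> candidate = Class.forName(className, false, loader);
            if (!expected.isAssignableFrom(candidate)) {
                throw new IllegalArgumentException(
                    className + " does not implement " + expected.getName());
            }
            return expected.cast(candidate.getDeclaredConstructor().newInstance());
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot load extension " + className, e);
        }
    }
}
```

Because the type check happens before initialization, a class name that does not resolve to the expected interface is rejected without running its static initializer.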

Status in WProofreader

Not affected.

Justification

inline_mitigations_already_exist

Analysis

ExtensionLoader.instantiateExtension() is invoked indirectly during BaseModel initialization when WProofreader loads an English OpenNLP model.

The triggering path is the same as for CVE-2026-42440. It ends in:

  • new POSModel(stream)

  • new TokenizerModel(stream)

  • new ChunkerModel(stream)

The .bin model archives are read from the classpath through Tools.getStream().

These archives come from bundled artifacts fixed at build time:

  • edu.washington.cs.knowitall:opennlp-tokenize-models:1.5

  • edu.washington.cs.knowitall:opennlp-postag-models:1.5

  • edu.washington.cs.knowitall:opennlp-chunk-models:1.5

The manifest.properties entries in these archives reference legitimate Apache OpenNLP serializer classes only.

The vulnerable code runs during normal operation. However, the class name passed to Class.forName() is never attacker controlled in the WProofreader runtime.

WProofreader does not expose any HTTP request, configuration setting, or file path that allows model archive or manifest substitution.

Starting with WProofreader 6.13.0, each bundled OpenNLP model is verified by SHA-256 at load time. A failed check blocks startup. This eliminates the post-compromise filesystem tampering path described below.

Mitigation controls

WProofreader implements multiple layers of defense for the affected code paths.

1. Architectural isolation (all versions)

OpenNLP model files load exclusively from the JVM classpath via Tools.getStream(). There is no API, configuration option, or file path that allows external substitution.

Models originate from edu.washington.cs.knowitall:opennlp-*-models:1.5 artifacts, fixed at build time inside the Docker image.

2. SHA-256 model integrity verification (WProofreader 6.13.0 and later)

At load time, each bundled OpenNLP model file is verified against its expected SHA-256 hash.

A failed verification prevents the service from starting. This closes the post-compromise filesystem tampering scenario described in the Attack surface considerations section below.

This control was added in upstream LanguageTool PR #12002.
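A load-time integrity check of this kind can be sketched as follows. This is an illustration only, not the code from the LanguageTool change. The class ModelIntegrity and its method names are hypothetical, and how the expected hash is stored is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

// Hypothetical verifier; not the code from the LanguageTool change.
class ModelIntegrity {

    /** Returns the lowercase hex SHA-256 digest of everything in the stream. */
    static String sha256Hex(InputStream in) throws IOException {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is unavailable", e);
        }
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            digest.update(buffer, 0, read);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** Throws SecurityException if the bytes do not match the hash pinned at build time. */
    static void verify(byte[] modelBytes, String expectedSha256) {
        String actual;
        try {
            actual = sha256Hex(new ByteArrayInputStream(modelBytes));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        byte[] expected = expectedSha256.toLowerCase(Locale.ROOT).getBytes(StandardCharsets.US_ASCII);
        if (!MessageDigest.isEqual(actual.getBytes(StandardCharsets.US_ASCII), expected)) {
            throw new SecurityException("Model hash mismatch: " + actual);
        }
    }
}
```

In a service, a caller would let the SecurityException abort startup, which matches the fail-closed behavior described above.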

Attack surface considerations

The only theoretical exploitation paths for CVE-2026-42440 and CVE-2026-42027 are:

  1. Supply chain compromise. An attacker tampers with edu.washington.cs.knowitall:opennlp-*-models:1.5 in WebSpellChecker's Artifactory or in Maven Central before the WProofreader Docker image is built.

  2. Post-compromise tampering. An attacker with write access to the running container filesystem replaces the bundled JAR.

WebSpellChecker mitigates the first path through artifact integrity verification during build.

The second path requires container or host compromise. At that point, the attacker already has greater capabilities than these CVEs provide.

Neither path is reachable through the WProofreader HTTP API or through normal customer interaction with the product.

Environmental CVSS assessment

The NVD base scores assume a worst-case deployment where an attacker can deliver a crafted model or dictionary file over the network without authentication.

That assumption does not match the WProofreader deployment model.

In the WProofreader runtime, the attack vector is limited to the local container filesystem. Exploitation also requires high privileges, high attack complexity, and a service restart.

CVE-2026-42027

  • Published severity: Critical 9.8

  • Environmental severity: Medium 6.3

  • Vector: CVSS:3.1/AV:L/AC:H/PR:H/UI:R/S:U/C:H/I:H/A:H

Rationale

Exploitation requires write access to the container filesystem in order to replace a bundled model archive. It also requires a service restart to trigger model reload.

Remote exploitation through the WProofreader HTTP API is not possible.
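As a worked check of the arithmetic, the sketch below applies the CVSS v3.1 base-score equations to the vector listed above for CVE-2026-42027. The class Cvss31Base is hypothetical and handles only Scope: Unchanged vectors. Whether the advisory's figures were derived with exactly these equations is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical calculator covering only Scope:Unchanged vectors.
class Cvss31Base {

    /** Numeric weights from the CVSS v3.1 specification (Scope: Unchanged). */
    static double weight(String metric, String value) {
        switch (metric + ":" + value) {
            case "AV:N": return 0.85;
            case "AV:A": return 0.62;
            case "AV:L": return 0.55;
            case "AV:P": return 0.2;
            case "AC:L": return 0.77;
            case "AC:H": return 0.44;
            case "PR:N": return 0.85;
            case "PR:L": return 0.62;
            case "PR:H": return 0.27;
            case "UI:N": return 0.85;
            case "UI:R": return 0.62;
            case "C:H": case "I:H": case "A:H": return 0.56;
            case "C:L": case "I:L": case "A:L": return 0.22;
            case "C:N": case "I:N": case "A:N": return 0.0;
            default: throw new IllegalArgumentException("Unsupported " + metric + ":" + value);
        }
    }

    /** Roundup to one decimal place as defined in the CVSS v3.1 specification. */
    static double roundUp(double value) {
        long scaled = Math.round(value * 100000);
        if (scaled % 10000 == 0) {
            return scaled / 100000.0;
        }
        return (Math.floor(scaled / 10000.0) + 1) / 10.0;
    }

    static double baseScore(String vector) {
        Map<String, String> m = new HashMap<>();
        for (String part : vector.split("/")) {
            String[] kv = part.split(":");
            m.put(kv[0], kv[1]);
        }
        if (!"U".equals(m.get("S"))) {
            throw new IllegalArgumentException("Only S:U is supported in this sketch");
        }
        double iss = 1 - (1 - weight("C", m.get("C")))
                       * (1 - weight("I", m.get("I")))
                       * (1 - weight("A", m.get("A")));
        double impact = 6.42 * iss;
        double exploitability = 8.22 * weight("AV", m.get("AV")) * weight("AC", m.get("AC"))
                                     * weight("PR", m.get("PR")) * weight("UI", m.get("UI"));
        if (impact <= 0) {
            return 0.0;
        }
        return roundUp(Math.min(impact + exploitability, 10.0));
    }
}
```

Calling Cvss31Base.baseScore("CVSS:3.1/AV:L/AC:H/PR:H/UI:R/S:U/C:H/I:H/A:H") yields 6.3, which falls in the Medium band (4.0 to 6.9).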

CVE-2026-40682

  • Published severity: Critical 9.1

  • Environmental severity: Medium 5.8

  • Vector: CVSS:3.1/AV:L/AC:H/PR:H/UI:R/S:U/C:H/I:H/A:N

Rationale

The vulnerable DictionaryEntryPersistor class is not invoked in the LanguageTool code path.

Even in a theoretical worst case where a crafted dictionary is placed on the filesystem, impact is limited to file disclosure and SSRF. There is no availability impact.

CVE-2026-42440

  • Published severity: High 7.5

  • Environmental severity: Medium 4.1

  • Vector: CVSS:3.1/AV:L/AC:H/PR:H/UI:R/S:U/C:N/I:N/A:H

Rationale

Exploitation requires replacing a bundled .bin model file with a crafted file that contains an oversized count field. It also requires a service restart.

Impact is limited to denial of service through OOM. There is no confidentiality or integrity impact.

Across all three CVEs, even in the worst-case scenario where an attacker gains write access to the container filesystem, the maximum environmental severity is Medium, not Critical.

Recommendation

No customer action is required at this time.

Customers who use image scanners and see these CVEs reported against the WProofreader image can apply the published OpenVEX document to suppress the findings.

WebSpellChecker is actively working with the LanguageTool community to upgrade to a fixed OpenNLP version.

Customers on WProofreader 6.13.0 and later benefit from the additional SHA-256 integrity verification described in the Mitigation controls section.

Upgrade to 6.13.0 or later for the strongest available defense. The not-affected status applies to all supported versions, so no upgrade is required for it.

Revision history

  • 2026-05-12 v1.0.0 Initial publication.

  • 2026-05-20 v2.0.0 Documented SHA-256 model integrity verification added in WProofreader 6.13.0 via upstream LanguageTool PR #12002. No change in CVE not-affected status or environmental severity. Version aligned with the published OpenVEX document.
