# WebSpellChecker security advisory WSC-SA-2026-001

### Apache OpenNLP CVEs in WProofreader (`opennlp-tools 1.9.4`)

* **Advisory ID:** `WSC-SA-2026-001`
* **Status:** Not exploitable in the current deployment. Upgrade planned.
* **Date issued:** 2026-05-12
* **Last updated:** 2026-05-12
* **Affected product:** WProofreader, all currently supported versions
* **Affected component:** `org.apache.opennlp:opennlp-tools:1.9.4`
* **Component type:** Bundled transitive dependency
* **CVEs covered:**
  * [CVE-2026-40682](https://nvd.nist.gov/vuln/detail/CVE-2026-40682) — Critical, CVSS 9.1
  * [CVE-2026-42027](https://nvd.nist.gov/vuln/detail/CVE-2026-42027) — Critical, CVSS 9.8
  * [CVE-2026-42440](https://nvd.nist.gov/vuln/detail/CVE-2026-42440) — High, CVSS 7.5

{% hint style="info" %}
No customer action is required at this time.
{% endhint %}

### Summary

Container image scanners such as Docker Scout, Trivy, Grype, and Snyk flag `org.apache.opennlp:opennlp-tools:1.9.4` inside the WProofreader Docker image.

WebSpellChecker reviewed the source code and execution paths in detail. The flagged vulnerabilities do not create risk in the current WProofreader runtime under normal operating conditions.

The vulnerable code is present in the image, and some of it executes during normal operation. However, none of the affected code paths receives attacker-controlled input.

WebSpellChecker is working to upgrade the affected dependency.

### Background

WProofreader uses LanguageTool as its grammar engine. LanguageTool depends on Apache OpenNLP for English-language chunking, tokenization, and part-of-speech tagging.

The `opennlp-tools 1.9.4` JAR is shipped in the WProofreader Docker image through this dependency chain.

The CVEs covered by this advisory require crafted input for specific OpenNLP code paths:

* a crafted binary model file (`.bin`)
* a crafted dictionary XML file

In WProofreader, these code paths read only JAR-bundled resources fixed at build time.

There is no HTTP API, configuration option, or file path that lets an external user or network attacker replace these inputs.

### Detailed analysis per CVE

#### CVE-2026-40682 — XXE injection in `DictionaryEntryPersistor`

**Upstream description**

`opennlp.tools.dictionary.serializer.DictionaryEntryPersistor` configures its SAX parser without `FEATURE_SECURE_PROCESSING`. A crafted XML dictionary stream can trigger external entity resolution. This can lead to local file disclosure and SSRF.

**Status in WProofreader**

Not affected.

**Justification**

`vulnerable_code_not_in_execute_path`

**Analysis**

A full source search of the LanguageTool tree shows zero imports and zero invocations of the `opennlp.tools.dictionary` package.

The vulnerable class exists in the `opennlp-tools` JAR. It is never loaded or referenced at runtime.

There is no direct or transitive path to:

* `DictionaryEntryPersistor.create()`
* `Dictionary(InputStream)`
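A source search of this kind can be approximated with a short script. The following is a minimal sketch, not the tooling WebSpellChecker used: it assumes a local checkout of the LanguageTool sources, and the function name `find_references` is hypothetical. It reports only textual references in `.java` files, so it does not by itself rule out reflective or transitive use.

```python
import re
from pathlib import Path

# Matches the package name itself and any class beneath it,
# e.g. "opennlp.tools.dictionary.Dictionary".
PATTERN = re.compile(r"\bopennlp\.tools\.dictionary\b")


def find_references(root):
    """Return (path, line number, line) for every textual reference found."""
    hits = []
    for path in sorted(Path(root).rglob("*.java")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if PATTERN.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits
```

An empty result from `find_references("path/to/languagetool")` is consistent with the finding above; any hit would need manual review.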

#### CVE-2026-42440 — Denial of service in `AbstractModelReader`

**Upstream description**

`opennlp.tools.ml.model.AbstractModelReader` reads a 32-bit integer from a binary model file. It then allocates an array of that size without bounds validation. A crafted model file with a maximum-sized integer field can trigger `OutOfMemoryError` and crash the JVM.
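The bug class described here can be illustrated outside OpenNLP. The sketch below is a generic Python analogue, not the upstream code or the upstream fix: it assumes the count is stored as a big-endian signed 32-bit integer, and the cap `MAX_COUNT` and the function name `read_count` are hypothetical. It shows the kind of bounds check whose absence makes the allocation unsafe.

```python
import io
import struct

MAX_COUNT = 100_000  # hypothetical upper bound for a plausible model


def read_count(stream, limit=MAX_COUNT):
    """Read a 32-bit count and validate it before anything is allocated."""
    raw = stream.read(4)
    if len(raw) != 4:
        raise ValueError("truncated stream: expected 4 bytes for the count")
    (count,) = struct.unpack(">i", raw)  # big-endian signed int (assumption)
    if count < 0 or count > limit:
        raise ValueError(f"implausible count {count}; refusing to allocate")
    return count


# A small, valid count is accepted.
small = io.BytesIO(struct.pack(">i", 3))
print(read_count(small))  # prints 3
```

A stream whose first four bytes encode `2**31 - 1` raises `ValueError` in this sketch instead of attempting a multi-gigabyte allocation.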

**Status in WProofreader**

Not affected.

**Justification**

`vulnerable_code_cannot_be_controlled_by_adversary`

**Analysis**

`AbstractModelReader` is invoked indirectly during English-language analysis.

The relevant call chain is:

```
HTTP request with English text
  -> TextChecker.checkText()
  -> JLanguageTool.getAnalyzedSentence()
  -> JLanguageTool.getRawAnalyzedSentence()
  -> Language.getChunker()
  -> English.createDefaultChunker()
  -> new EnglishChunker()
  -> new POSModel(stream) and similar
  -> OpenNLP internal AbstractModelReader.getOutcomes()
```

The input stream is loaded through `Tools.getStream("/en-pos-maxent.bin")`. That path resolves through `JLanguageTool.getDataBroker().getAsStream()` and the JVM classpath.

The `.bin` file comes from the bundled Maven artifact `edu.washington.cs.knowitall:opennlp-postag-models:1.5`. This artifact is fixed at build time inside the WProofreader Docker image.

The vulnerable code is executed during normal operation. However, the trusted model file contains a small and valid integer count.

The precondition for exploitation, an attacker-supplied model file, is absent. WProofreader does not expose any HTTP request, configuration setting, or file path that allows model substitution.

#### CVE-2026-42027 — Unsafe reflection in `ExtensionLoader`

**Upstream description**

`ExtensionLoader.instantiateExtension(Class, String)` calls `Class.forName()` on a class name read from `manifest.properties` inside a model archive. This happens before the resolved class is validated against the expected extension interface. A crafted model archive can therefore run the static initializer of any class on the classpath, which can serve as the first step of a gadget chain toward remote code execution.

**Status in WProofreader**

Not affected.

**Justification**

`vulnerable_code_cannot_be_controlled_by_adversary`

**Analysis**

`ExtensionLoader.instantiateExtension()` is invoked indirectly during `BaseModel` initialization when WProofreader loads an English OpenNLP model.

The triggering path is the same as for CVE-2026-42440. It ends in:

* `new POSModel(stream)`
* `new TokenizerModel(stream)`
* `new ChunkerModel(stream)`

The `.bin` model archives are read from the classpath through `Tools.getStream()`.

These archives come from bundled artifacts fixed at build time:

* `edu.washington.cs.knowitall:opennlp-tokenize-models:1.5`
* `edu.washington.cs.knowitall:opennlp-postag-models:1.5`
* `edu.washington.cs.knowitall:opennlp-chunk-models:1.5`

The `manifest.properties` entries in these archives reference legitimate Apache OpenNLP serializer classes only.
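A check of this kind can be scripted. The following is a minimal sketch under stated assumptions: it treats a model archive as a ZIP file with `manifest.properties` at its root, uses a simplified properties parser (no escapes or line continuations), flags any value that merely looks like a dotted Java class name, and assumes `opennlp.tools.` as the expected package prefix. The function names are hypothetical, and this is not the review procedure WebSpellChecker followed.

```python
import io
import re
import zipfile

# Heuristic: a dotted identifier such as "opennlp.tools.postag.POSTaggerFactory".
CLASS_NAME = re.compile(r"^[A-Za-z_$][\w$]*(\.[A-Za-z_$][\w$]*)+$")
ALLOWED_PREFIX = "opennlp.tools."  # assumption: expected package prefix


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def unexpected_classes(model_bytes):
    """Return manifest entries that name a class outside the allowed prefix."""
    with zipfile.ZipFile(io.BytesIO(model_bytes)) as archive:
        text = archive.read("manifest.properties").decode("iso-8859-1")
    return {
        key: value
        for key, value in parse_properties(text).items()
        if CLASS_NAME.match(value) and not value.startswith(ALLOWED_PREFIX)
    }
```

An empty dictionary means no manifest value points outside the assumed prefix. The model bytes can be read from the bundled JAR with `zipfile.ZipFile(jar_path).read("en-pos-maxent.bin")`, where `jar_path` is wherever the artifact sits in a given image.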

The vulnerable code is executed during normal operation. However, the class name passed to `Class.forName()` is never attacker controlled in the WProofreader runtime.

WProofreader does not expose any HTTP request, configuration setting, or file path that allows model archive or manifest substitution.

### Attack surface considerations

The only theoretical exploitation paths for CVE-2026-42440 and CVE-2026-42027 are:

1. **Supply chain compromise.** An attacker tampers with `edu.washington.cs.knowitall:opennlp-*-models:1.5` in WebSpellChecker's Artifactory or in Maven Central before the WProofreader Docker image is built.
2. **Post-compromise tampering.** An attacker with write access to the running container filesystem replaces the bundled JAR.

WebSpellChecker mitigates the first path through artifact integrity verification during build.
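The advisory does not describe how that verification is implemented. As an illustration only, one common approach is to compare each artifact's SHA-256 digest with a value pinned in the build configuration. In the sketch below, the function names and the shape of the `pinned` mapping are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file without loading it whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(pinned):
    """Return the paths whose digest differs from the pinned value.

    `pinned` maps a file path to its expected SHA-256 hex digest.
    """
    return [
        path
        for path, expected in pinned.items()
        if sha256_of(path) != expected.lower()
    ]
```

A build step would fail when `find_mismatches` returns a non-empty list, which stops a tampered model artifact from reaching the image.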

The second path requires container or host compromise. At that point, the attacker already has greater capabilities than these CVEs provide.

Neither path is reachable through the WProofreader HTTP API or through normal customer interaction with the product.

### Environmental CVSS assessment

The NVD base scores assume a worst-case deployment where an attacker can deliver a crafted model or dictionary file over the network without authentication.

That assumption does not match the WProofreader deployment model.

In the WProofreader runtime, the attack vector is limited to the local container filesystem (`AV:L`). Exploitation also requires high privileges (`PR:H`) and a service restart, and its attack complexity is high (`AC:H`).
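To show how a vector string maps to a numeric score, the following sketch implements the CVSS v3.1 base-score equations and metric weights from the FIRST specification. It is an illustrative calculator, not the tool WebSpellChecker used; the function names are hypothetical, and temporal and environmental modifiers are not handled.

```python
import math

# CVSS v3.1 metric weights.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value):
    """Round up to one decimal place, as defined for CVSS v3.1."""
    scaled = int(round(value * 100000))
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a 'CVSS:3.1/...' vector string."""
    prefix, *pairs = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError("expected a CVSS:3.1 vector")
    m = dict(pair.split(":") for pair in pairs)
    changed = m["S"] == "C"
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * pr * UI[m["UI"]]
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))
```

For example, `base_score("CVSS:3.1/AV:L/AC:H/PR:H/UI:R/S:U/C:H/I:H/A:H")` returns `6.3`. The same impact metrics with `AV:N/AC:L/PR:N/UI:N` return `9.8`, which shows how much of the score comes from the exploitability metrics alone.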

#### CVE-2026-42027

* **Published severity:** Critical 9.8
* **Environmental severity:** Medium 6.3
* **Vector:** `CVSS:3.1/AV:L/AC:H/PR:H/UI:R/S:U/C:H/I:H/A:H`

**Rationale**

Exploitation requires write access to the container filesystem in order to replace a bundled model archive. It also requires a service restart to trigger model reload.

Remote exploitation through the WProofreader HTTP API is not possible.

#### CVE-2026-40682

* **Published severity:** Critical 9.1
* **Environmental severity:** Medium 5.6
* **Vector:** `CVSS:3.1/AV:L/AC:H/PR:H/UI:R/S:U/C:H/I:H/A:N`

**Rationale**

The vulnerable `DictionaryEntryPersistor` class is not invoked in the LanguageTool code path.

Even in a theoretical worst case where a crafted dictionary is placed on the filesystem, impact is limited to file disclosure and SSRF. There is no availability impact.

#### CVE-2026-42440

* **Published severity:** High 7.5
* **Environmental severity:** Medium 4.0
* **Vector:** `CVSS:3.1/AV:L/AC:H/PR:H/UI:R/S:U/C:N/I:N/A:H`

**Rationale**

Exploitation requires replacing a bundled `.bin` model file with a crafted file that contains an oversized count field. It also requires a service restart.

Impact is limited to denial of service through OOM. There is no confidentiality or integrity impact.

Across all three CVEs, even in the worst-case scenario where an attacker gains write access to the container filesystem, the highest environmental severity is Medium, compared with published severities of High and Critical.

### Recommendation

No customer action is required at this time.

Customers who use image scanners and see these CVEs reported against the WProofreader image can apply the published OpenVEX document to suppress the findings:

* <https://files.webspellchecker.com/security/vex/opennlp-cves-v1.0.0.vex.json>
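Before applying a VEX document, it can be useful to inspect what it asserts. The sketch below parses an OpenVEX-style JSON document and lists the status and justification per vulnerability. The embedded `SAMPLE` is a hypothetical, abbreviated document that follows the field layout of the OpenVEX specification and mirrors the justifications in this advisory. It is not the content of the published file, and its `@id` and product identifier are placeholders.

```python
import json

# Hypothetical sample for illustration; not the published VEX file.
SAMPLE = """
{
  "@context": "https://openvex.dev/ns/v0.2.0",
  "@id": "https://example.invalid/vex/sample",
  "author": "Example Author",
  "timestamp": "2026-05-12T00:00:00Z",
  "version": 1,
  "statements": [
    {
      "vulnerability": {"name": "CVE-2026-40682"},
      "products": [{"@id": "pkg:maven/org.apache.opennlp/opennlp-tools@1.9.4"}],
      "status": "not_affected",
      "justification": "vulnerable_code_not_in_execute_path"
    },
    {
      "vulnerability": {"name": "CVE-2026-42440"},
      "products": [{"@id": "pkg:maven/org.apache.opennlp/opennlp-tools@1.9.4"}],
      "status": "not_affected",
      "justification": "vulnerable_code_cannot_be_controlled_by_adversary"
    }
  ]
}
"""


def statuses_by_cve(document):
    """Map each vulnerability name to its (status, justification) pair."""
    result = {}
    for statement in document.get("statements", []):
        name = statement.get("vulnerability", {}).get("name")
        if name:
            result[name] = (
                statement.get("status"),
                statement.get("justification"),
            )
    return result


for cve, (status, reason) in sorted(statuses_by_cve(json.loads(SAMPLE)).items()):
    print(cve, status, reason)
```

How a VEX file is supplied to a scanner differs by tool and version, so the scanner's own documentation is the reference for the exact option.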

WebSpellChecker is actively working with the LanguageTool community to upgrade to a fixed OpenNLP version.

### References

* [CVE-2026-40682](https://nvd.nist.gov/vuln/detail/CVE-2026-40682)
* [CVE-2026-42027](https://nvd.nist.gov/vuln/detail/CVE-2026-42027)
* [CVE-2026-42440](https://nvd.nist.gov/vuln/detail/CVE-2026-42440)
* [Apache OpenNLP project](https://opennlp.apache.org/)
* [OpenVEX specification](https://github.com/openvex/spec)

