
# WebSpellChecker security advisory WSC-SA-2026-001

### Apache OpenNLP CVEs in WProofreader (`opennlp-tools 1.9.4`)

* **Advisory ID:** `WSC-SA-2026-001`
* **Status:** Not exploitable in the current deployment. Upgrade planned.
* **Date issued:** 2026-05-12
* **Last updated:** 2026-05-20
* **Affected product:** WProofreader, all currently supported versions
* **Affected component:** `org.apache.opennlp:opennlp-tools:1.9.4`
* **Component type:** Bundled transitive dependency
* **CVEs covered:**
  * [CVE-2026-40682](https://nvd.nist.gov/vuln/detail/CVE-2026-40682) — Critical, CVSS 9.1
  * [CVE-2026-42027](https://nvd.nist.gov/vuln/detail/CVE-2026-42027) — Critical, CVSS 9.8
  * [CVE-2026-42440](https://nvd.nist.gov/vuln/detail/CVE-2026-42440) — High, CVSS 7.5

{% hint style="info" %}
No customer action is required at this time.
{% endhint %}

### Affected WProofreader versions

* **All versions prior to 6.13.0:** Protected by architectural controls only. OpenNLP models load from trusted bundled resources. There is no external model substitution path.
* **WProofreader 6.13.0 and later:** The same architectural controls apply. Bundled OpenNLP model files are also verified by SHA-256 at load time.

### Summary

Container image scanners such as Docker Scout, Trivy, Grype, and Snyk flag `org.apache.opennlp:opennlp-tools:1.9.4` inside the WProofreader Docker image.

WebSpellChecker reviewed the source code and execution paths in detail. These findings do not create risk in the current WProofreader runtime under normal operating conditions.

The vulnerable code is present in the image, and some of it runs during normal operation. However, none of the affected code paths receives attacker-controlled input.

WebSpellChecker is working to upgrade the affected dependency.

### Background

WProofreader uses LanguageTool as its grammar engine. LanguageTool depends on Apache OpenNLP for English-language chunking, tokenization, and part-of-speech tagging.

The `opennlp-tools 1.9.4` JAR ships in the WProofreader Docker image through this dependency chain.

The CVEs covered by this advisory require crafted input for specific OpenNLP code paths:

* a crafted binary model file (`.bin`)
* a crafted dictionary XML file

In WProofreader, these code paths read only JAR-bundled resources fixed at build time.

There is no HTTP API, configuration option, or file path that lets an external user or network attacker replace these inputs.

### Detailed analysis per CVE

#### CVE-2026-40682 — XXE injection in `DictionaryEntryPersistor`

**Upstream description**

`opennlp.tools.dictionary.serializer.DictionaryEntryPersistor` configures its SAX parser without `FEATURE_SECURE_PROCESSING`. A crafted XML dictionary stream can trigger external entity resolution. This can lead to local file disclosure and SSRF.
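For readers unfamiliar with this weakness class, the following is a minimal sketch of a SAX parser configured to refuse DOCTYPE declarations and external entities. It is not the upstream OpenNLP fix. The class name `HardenedSaxSketch` and its helper methods are hypothetical, and the sketch assumes the JDK's built-in parser, which recognizes the feature URIs shown.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not part of OpenNLP or WProofreader.
final class HardenedSaxSketch {

    // Builds a SAX parser that rejects DOCTYPE declarations and external entities.
    // Assumes the JDK's default parser implementation supports these feature URIs.
    static SAXParser newHardenedParser() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
        factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        return factory.newSAXParser();
    }

    // Returns true if the document parses, false if the parser rejects it.
    static boolean parses(String xml) {
        try {
            newHardenedParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                new DefaultHandler());
            return true;
        } catch (Exception e) {
            return false;
        }
    }
}
```

With this configuration, a plain dictionary document still parses, while a document that declares an external entity in a DOCTYPE is rejected before any entity is resolved.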

**Status in WProofreader**

Not affected.

**Justification**

`vulnerable_code_not_in_execute_path`

**Analysis**

A full source search of the LanguageTool tree shows zero imports and zero invocations of the `opennlp.tools.dictionary` package.

The vulnerable class exists in the `opennlp-tools` JAR. It is never loaded or referenced at runtime.

There is no direct or transitive path to:

* `DictionaryEntryPersistor.create()`
* `Dictionary(InputStream)`

#### CVE-2026-42440 — Denial of service in `AbstractModelReader`

**Upstream description**

`opennlp.tools.ml.model.AbstractModelReader` reads a 32-bit integer from a binary model file. It then allocates an array of that size without bounds validation. A crafted model file with a maximum-sized integer field can trigger `OutOfMemoryError` and crash the JVM.
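The pattern behind this issue can be illustrated with a short sketch that validates a declared count before allocating. This is an illustration of the general technique only, not the upstream patch. `BoundedCountSketch`, `MAX_ENTRIES`, and the simplified count-plus-strings layout are assumptions made for the example and do not reflect the real OpenNLP model format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical illustration class; the binary layout here is simplified.
final class BoundedCountSketch {

    // Assumed upper bound for the example; a real limit depends on the format.
    static final int MAX_ENTRIES = 1_000_000;

    // Reads a 32-bit count and rejects implausible values before allocating.
    static String[] readLabels(DataInputStream in) throws IOException {
        int count = in.readInt();
        if (count < 0 || count > MAX_ENTRIES) {
            throw new IOException("Implausible entry count: " + count);
        }
        String[] labels = new String[count];
        for (int i = 0; i < count; i++) {
            labels[i] = in.readUTF();
        }
        return labels;
    }

    // Demo helper: serializes a declared count followed by the given labels.
    static byte[] encode(int declaredCount, String... labels) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(declaredCount);
            for (String label : labels) {
                out.writeUTF(label);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Demo helper: true if the data is read successfully, false if rejected.
    static boolean accepts(byte[] data) {
        try {
            readLabels(new DataInputStream(new ByteArrayInputStream(data)));
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

A stream that declares `Integer.MAX_VALUE` entries is rejected with an `IOException` instead of reaching the array allocation.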

**Status in WProofreader**

Not affected.

**Justification**

`inline_mitigations_already_exist`

**Analysis**

`AbstractModelReader` is invoked indirectly during English-language analysis.

The relevant call chain is:

```
HTTP request with English text
  -> TextChecker.checkText()
  -> JLanguageTool.getAnalyzedSentence()
  -> JLanguageTool.getRawAnalyzedSentence()
  -> Language.getChunker()
  -> English.createDefaultChunker()
  -> new EnglishChunker()
  -> new POSModel(stream) and similar
  -> OpenNLP internal AbstractModelReader.getOutcomes()
```

The input stream is loaded through `Tools.getStream("/en-pos-maxent.bin")`. That path resolves through `JLanguageTool.getDataBroker().getAsStream()` and the JVM classpath.

The `.bin` file comes from the bundled Maven artifact `edu.washington.cs.knowitall:opennlp-postag-models:1.5`. This artifact is fixed at build time inside the WProofreader Docker image.

The vulnerable code runs during normal operation. However, the trusted model file contains a small and valid integer count.

The malicious precondition is absent. WProofreader does not expose any HTTP request, configuration setting, or file path that allows model substitution.

Starting with WProofreader 6.13.0, each bundled OpenNLP model is verified by SHA-256 at load time. A failed check blocks startup. This eliminates the post-compromise filesystem tampering path described below.

#### CVE-2026-42027 — Unsafe reflection in `ExtensionLoader`

**Upstream description**

`ExtensionLoader.instantiateExtension(Class, String)` calls `Class.forName()` on a class name from `manifest.properties` inside a model archive. This happens before the resolved class is validated against the expected extension interface. A crafted model archive can trigger arbitrary static initializer execution and pre-RCE gadget chains.
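The ordering problem described above can be sketched as follows: resolve the class without initializing it, check it against the expected interface, and only then instantiate. This is a hedged illustration, not the upstream fix. Every name in it (`SafeExtensionSketch`, `Serializer`, `PlainSerializer`, `Unrelated`, the allowlist parameter) is hypothetical.

```java
import java.util.Set;

// Hypothetical illustration class; not the OpenNLP ExtensionLoader.
final class SafeExtensionSketch {

    // Stand-in for an expected extension interface.
    interface Serializer {
        String name();
    }

    public static final class PlainSerializer implements Serializer {
        public PlainSerializer() {}
        public String name() { return "plain"; }
    }

    // Not a Serializer; its static initializer records whether it ever ran.
    public static final class Unrelated {
        static { System.setProperty("sketch.unrelated.initialized", "true"); }
    }

    static <T> T instantiate(Class<T> expected, String className, Set<String> allowed)
            throws ReflectiveOperationException {
        if (!allowed.contains(className)) {
            throw new ClassNotFoundException("Class not on allowlist: " + className);
        }
        // initialize=false resolves the class without running static initializers.
        Class<?> candidate = Class.forName(className, false, expected.getClassLoader());
        if (!expected.isAssignableFrom(candidate)) {
            throw new ClassCastException(className + " does not implement " + expected.getName());
        }
        return expected.cast(candidate.getDeclaredConstructor().newInstance());
    }

    // Demo helper: true if the class loads as a Serializer, false if rejected.
    static boolean loads(String className, Set<String> allowed) {
        try {
            instantiate(Serializer.class, className, allowed);
            return true;
        } catch (ReflectiveOperationException | RuntimeException e) {
            return false;
        }
    }
}
```

In this sketch, a class that fails the interface check is rejected before its static initializer has run, even when its name appears on the allowlist.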

**Status in WProofreader**

Not affected.

**Justification**

`inline_mitigations_already_exist`

**Analysis**

`ExtensionLoader.instantiateExtension()` is invoked indirectly during `BaseModel` initialization when WProofreader loads an English OpenNLP model.

The triggering path is the same as for CVE-2026-42440. It ends in:

* `new POSModel(stream)`
* `new TokenizerModel(stream)`
* `new ChunkerModel(stream)`

The `.bin` model archives are read from the classpath through `Tools.getStream()`.

These archives come from bundled artifacts fixed at build time:

* `edu.washington.cs.knowitall:opennlp-tokenize-models:1.5`
* `edu.washington.cs.knowitall:opennlp-postag-models:1.5`
* `edu.washington.cs.knowitall:opennlp-chunk-models:1.5`

The `manifest.properties` entries in these archives reference legitimate Apache OpenNLP serializer classes only.

The vulnerable code runs during normal operation. However, the class name passed to `Class.forName()` is never attacker controlled in the WProofreader runtime.

WProofreader does not expose any HTTP request, configuration setting, or file path that allows model archive or manifest substitution.

Starting with WProofreader 6.13.0, each bundled OpenNLP model is verified by SHA-256 at load time. A failed check blocks startup. This eliminates the post-compromise filesystem tampering path described below.

### Mitigation controls

WProofreader implements multiple layers of defense for the affected code paths.

#### 1. Architectural isolation (all versions)

OpenNLP model files load exclusively from the JVM classpath via `Tools.getStream()`. There is no API, configuration option, or file path that allows external substitution.

Models originate from `edu.washington.cs.knowitall:opennlp-*-models:1.5` artifacts, fixed at build time inside the Docker image.
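As a rough illustration of classpath-only loading, the sketch below resolves a resource name through the class loader and never consults a caller-supplied filesystem path. It is not LanguageTool's `Tools.getStream()` implementation. `ClasspathSourceSketch` and its methods are hypothetical names used for the example.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration class; not LanguageTool's resource broker.
final class ClasspathSourceSketch {

    // Opens a resource by classpath name; fails if the classpath does not contain it.
    static InputStream open(String resourcePath) throws IOException {
        InputStream stream = ClasspathSourceSketch.class.getResourceAsStream(resourcePath);
        if (stream == null) {
            throw new IOException("Not found on classpath: " + resourcePath);
        }
        return stream;
    }

    // Demo helper: true if the resource can be opened from the classpath.
    static boolean exists(String resourcePath) {
        try (InputStream in = open(resourcePath)) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

Because the lookup is resolved against the classpath assembled at build time, the set of loadable files is determined by the packaged artifacts and not by request data.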

#### 2. SHA-256 model integrity verification (WProofreader 6.13.0 and later)

At load time, each bundled OpenNLP model file is verified against its expected SHA-256 hash.

A failed verification prevents the service from starting. This closes the post-compromise filesystem tampering scenario described in the Attack surface considerations section below.

This control was added in upstream LanguageTool PR [#12002](https://github.com/languagetool-org/languagetool/pull/12002).
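A load-time integrity check of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the upstream pull request. `ModelIntegritySketch`, its method names, and the choice to signal failure with an `IOException` are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration class; not the LanguageTool implementation.
final class ModelIntegritySketch {

    // Streams the input through SHA-256 and returns the raw digest bytes.
    static byte[] sha256(InputStream in) throws IOException {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is unavailable", e);
        }
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            digest.update(buffer, 0, read);
        }
        return digest.digest();
    }

    // Decodes a hex string such as a published checksum into bytes.
    static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // Throws if the content does not match the expected SHA-256 checksum.
    static void verify(InputStream model, String expectedHex) throws IOException {
        if (!MessageDigest.isEqual(sha256(model), fromHex(expectedHex))) {
            throw new IOException("Model checksum mismatch; refusing to load");
        }
    }

    // Demo helper: true if the bytes match the expected checksum.
    static boolean matches(byte[] content, String expectedHex) {
        try {
            verify(new ByteArrayInputStream(content), expectedHex);
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

In a service, the caller would typically treat the exception from `verify` as fatal during startup, which corresponds to the fail-closed behavior described above.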

### Attack surface considerations

The only theoretical exploitation paths for CVE-2026-42440 and CVE-2026-42027 are:

1. **Supply chain compromise.** An attacker tampers with `edu.washington.cs.knowitall:opennlp-*-models:1.5` in WebSpellChecker's Artifactory or in Maven Central before the WProofreader Docker image is built.
2. **Post-compromise tampering.** An attacker with write access to the running container filesystem replaces the bundled JAR.

WebSpellChecker mitigates the first path through artifact integrity verification during build.

The second path requires container or host compromise. At that point, the attacker already has greater capabilities than these CVEs provide.

Neither path is reachable through the WProofreader HTTP API or through normal customer interaction with the product.

### Environmental CVSS assessment

The NVD base scores assume a worst-case deployment where an attacker can deliver a crafted model or dictionary file over the network without authentication.

That assumption does not match the WProofreader deployment model.

In the WProofreader runtime, the attack vector is limited to the local container filesystem. Exploitation also requires high privileges, high attack complexity, and a service restart.

#### CVE-2026-42027

* **Published severity:** Critical 9.8
* **Environmental severity:** Medium 6.3
* **Vector:** `CVSS:3.1/AV:L/AC:H/PR:H/UI:R/S:U/C:H/I:H/A:H`

**Rationale**

Exploitation requires write access to the container filesystem in order to replace a bundled model archive. It also requires a service restart to trigger model reload.

Remote exploitation through the WProofreader HTTP API is not possible.

#### CVE-2026-40682

* **Published severity:** Critical 9.1
* **Environmental severity:** Medium 5.8
* **Vector:** `CVSS:3.1/AV:L/AC:H/PR:H/UI:R/S:U/C:H/I:H/A:N`

**Rationale**

The vulnerable `DictionaryEntryPersistor` class is not invoked in the LanguageTool code path.

Even in a theoretical worst case where a crafted dictionary is placed on the filesystem, impact is limited to file disclosure and SSRF. There is no availability impact.

#### CVE-2026-42440

* **Published severity:** High 7.5
* **Environmental severity:** Medium 4.1
* **Vector:** `CVSS:3.1/AV:L/AC:H/PR:H/UI:R/S:U/C:N/I:N/A:H`

**Rationale**

Exploitation requires replacing a bundled `.bin` model file with a crafted file that contains an oversized count field. It also requires a service restart.

Impact is limited to denial of service through OOM. There is no confidentiality or integrity impact.

Across all three CVEs, even in the worst-case scenario where an attacker gains write access to the container filesystem, the maximum environmental severity is Medium, not the published Critical or High.

### Recommendation

No customer action is required at this time.

Customers who use image scanners and see these CVEs reported against the WProofreader image can apply the published OpenVEX document to suppress the findings:

* <https://files.webspellchecker.com/security/vex/opennlp-cves-v2.0.0.vex.json>

WebSpellChecker is actively working with the LanguageTool community to upgrade to a fixed OpenNLP version.

Customers on WProofreader 6.13.0 and later benefit from the additional SHA-256 integrity verification described in the Mitigation controls section.

Upgrade to WProofreader 6.13.0 or later for the strongest available defense. No upgrade is required for the not-affected status to apply.

### References

* [CVE-2026-40682](https://nvd.nist.gov/vuln/detail/CVE-2026-40682)
* [CVE-2026-42027](https://nvd.nist.gov/vuln/detail/CVE-2026-42027)
* [CVE-2026-42440](https://nvd.nist.gov/vuln/detail/CVE-2026-42440)
* [Apache OpenNLP project](https://opennlp.apache.org/)
* [OpenVEX specification](https://github.com/openvex/spec)
* [LanguageTool PR #12002](https://github.com/languagetool-org/languagetool/pull/12002)
* [WProofreader OpenVEX document](https://files.webspellchecker.com/security/vex/opennlp-cves-v2.0.0.vex.json)

### Revision history

* **2026-05-12 v1.0.0** Initial publication.
* **2026-05-20 v2.0.0** Documented SHA-256 model integrity verification added in WProofreader 6.13.0 via upstream LanguageTool PR #12002. No change in CVE not-affected status or environmental severity. Version aligned with the published OpenVEX document.

