draft April 8, 2026

14 AI/LLM Vulnerability Classes Not Cataloged by OWASP or MITRE

novel-vulns owasp mitre-atlas gpu-security

Methodology

Autonomous research loop: 12 cycles, 32 independent research agents, 16 adversarial couplets. Covered OWASP LLM Top 10, Agentic Top 10, ML Security Top 10, and MITRE ATLAS (140 techniques). After deduplication: 53 unique vulnerability patterns. These 14 are the ones not adequately cataloged.

Tier 1: Genuinely Novel

1. Reasoning Chain Hijacking (CoT-Hijack)

Attacks targeting chain-of-thought safety reasoning in o1/o3/DeepSeek-R1/Gemini models. 99% attack success rate on Gemini 2.5 Pro. Targets the refusal direction vector in activation space, not the prompt boundary.

2. Inference-Time Compute DoS (ReasoningBomb)

Prompts inducing pathologically long reasoning chains, consuming 10-100x normal compute. Exploits the thinking budget, not token limits.
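One possible mitigation is to cap reasoning tokens separately from the output token limit. The sketch below is a minimal illustration, not a description of any vendor's API: the `ReasoningBudget` class, the baseline of 500 tokens, and the 5x multiplier are all hypothetical, and the token stream is assumed to be an ordinary Python iterable.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


class ReasoningBudgetExceeded(RuntimeError):
    """Raised when a request spends more reasoning tokens than allowed."""


@dataclass
class ReasoningBudget:
    # Hypothetical values: typical reasoning length for the workload and
    # how many multiples of it a single request may consume.
    baseline_tokens: int = 500
    max_multiplier: float = 5.0

    @property
    def limit(self) -> int:
        return int(self.baseline_tokens * self.max_multiplier)


def guard_reasoning(tokens: Iterable[str], budget: ReasoningBudget) -> Iterator[str]:
    """Yield reasoning tokens, aborting once the budget is exhausted."""
    for count, token in enumerate(tokens, start=1):
        if count > budget.limit:
            raise ReasoningBudgetExceeded(
                f"reasoning exceeded {budget.limit} tokens"
            )
        yield token
```

Because the guard wraps the stream, the request is cut off as soon as the cap is crossed rather than after the full 10-100x cost has been paid.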

3. Quantization-Induced Defense Blindspots

INT8 quantization reduces all backdoor defenses to 0% detection while keeping attack success above 99%. The blindspot exists because defenses are evaluated on FP32 models, while the models that ship are INT8.

4. Computational Graph Backdoors (ShadowLogic)

Logic embedded in ONNX computational graph topology — not weights. Invisible to all weight-based integrity checks. Demonstrated on Phi-3 mini (62% ASR) and Llama 3.2 3B (70% ASR).
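Why a weight-only check misses this can be shown with a toy model. The sketch below assumes a model is a plain dict with `weights` and `nodes`, which is a simplification and not the ONNX format. The `Equal`/`Where` trigger nodes and the `trigger` and `forced` names are hypothetical.

```python
import hashlib
import json


def weight_digest(model: dict) -> str:
    """Hash only the weight tensors, as a weight-based integrity check would."""
    payload = json.dumps(model["weights"], sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def graph_digest(model: dict) -> str:
    """Hash the operator list and its wiring (op type, inputs, outputs)."""
    nodes = [[n["op"], n["inputs"], n["outputs"]] for n in model["nodes"]]
    return hashlib.sha256(json.dumps(nodes).encode()).hexdigest()


clean = {
    "weights": {"w": [0.1, 0.2]},
    "nodes": [{"op": "MatMul", "inputs": ["x", "w"], "outputs": ["y"]}],
}

# Same weights, but extra nodes route the output through a trigger check.
tampered = {
    "weights": {"w": [0.1, 0.2]},
    "nodes": [
        {"op": "MatMul", "inputs": ["x", "w"], "outputs": ["y_raw"]},
        {"op": "Equal", "inputs": ["x", "trigger"], "outputs": ["hit"]},
        {"op": "Where", "inputs": ["hit", "forced", "y_raw"], "outputs": ["y"]},
    ],
}

weights_match = weight_digest(clean) == weight_digest(tampered)
graphs_match = graph_digest(clean) == graph_digest(tampered)
```

Here `weights_match` is true and `graphs_match` is false. In this toy, only a check that also covers the graph structure notices the change.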

5. Temporal Semantic Sleeper Agents

Multi-session backdoors using implicit memory with bitwise state accumulation. Requires entire interaction history to detect.
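A toy model of bitwise state accumulation shows why per-session inspection is not enough. This is an illustrative sketch only: the cue words, the bit assignments, and the `ARMED` mask are hypothetical, and a real backdoor would live in model behavior and memory rather than in an explicit lookup table.

```python
from functools import reduce
from typing import Sequence

# Hypothetical cues, each mapped to one bit of persistent state.
CUE_BITS = {"invoice": 0b001, "tuesday": 0b010, "export": 0b100}
ARMED = 0b111  # the trigger condition: all three bits set


def session_bits(text: str) -> int:
    """OR together the bits for every cue word present in one session."""
    words = set(text.lower().split())
    return reduce(lambda acc, w: acc | CUE_BITS.get(w, 0), words, 0)


def accumulated_state(sessions: Sequence[str]) -> int:
    """Fold the per-session bits into one state across the whole history."""
    state = 0
    for text in sessions:
        state |= session_bits(text)
    return state


def is_armed(sessions: Sequence[str]) -> bool:
    return accumulated_state(sessions) == ARMED


history = ["send the invoice", "meet on tuesday", "export the report"]
any_single_session_armed = any(is_armed([s]) for s in history)
full_history_armed = is_armed(history)
```

No single session sets more than one bit, so `any_single_session_armed` is false, while `full_history_armed` is true once all three sessions are considered together.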

6. KV-Cache Cross-Tenant Side Channels

Timing side-channels from shared KV-cache reuse allow prompt reconstruction. Published at NDSS 2025 (PROMPTPEEK).
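The difference between a shared and a tenant-scoped cache can be sketched with a toy prefix cache. This is a simplified model under stated assumptions, not the design of any serving framework: `PrefixCache` and its methods are hypothetical, and a cache hit stands in for the faster response an attacker would observe through timing.

```python
import hashlib


class PrefixCache:
    """Toy prompt-prefix cache; a hit models the observable fast path."""

    def __init__(self, tenant_scoped: bool):
        self.tenant_scoped = tenant_scoped
        self._store = {}

    def _key(self, tenant: str, prefix: str) -> str:
        # When scoped, the tenant ID is mixed into the cache key, so
        # identical prefixes from different tenants never collide.
        scope = tenant if self.tenant_scoped else ""
        return hashlib.sha256(f"{scope}\x00{prefix}".encode()).hexdigest()

    def lookup_or_fill(self, tenant: str, prefix: str) -> bool:
        """Return True on a cache hit, filling the entry on a miss."""
        key = self._key(tenant, prefix)
        hit = key in self._store
        self._store.setdefault(key, "kv-blocks-placeholder")
        return hit


shared = PrefixCache(tenant_scoped=False)
shared.lookup_or_fill("victim", "secret system prompt")
attacker_hit_shared = shared.lookup_or_fill("attacker", "secret system prompt")

isolated = PrefixCache(tenant_scoped=True)
isolated.lookup_or_fill("victim", "secret system prompt")
attacker_hit_isolated = isolated.lookup_or_fill("attacker", "secret system prompt")
```

With the shared cache the attacker's guess hits, which confirms the victim's prefix. With tenant scoping the same guess misses. The trade-off is that cross-tenant reuse, and its performance benefit, is lost.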

7. GPU Hardware Attacks (GPUBreach)

Rowhammer on NVIDIA GDDR6 GPU memory. Single bit-flip degrades accuracy from 80% to below 1%. Chains to full CPU root shell. No CVE assigned.
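To see why a single bit can matter so much, consider the float32 encoding of one weight. The sketch below uses only Python's standard `struct` module. The weight value 0.5 and the choice of bit 30 are illustrative assumptions, not a claim about which bits the attack targets.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Flip one bit (0 = least significant) of a float32 encoding."""
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    (flipped,) = struct.unpack("<f", struct.pack("<I", raw ^ (1 << bit)))
    return flipped


weight = 0.5

# Bit 30 is the top exponent bit: 0.5 becomes 2**127, about 1.7e38.
corrupted_exponent = flip_bit(weight, 30)

# Bit 0 is the lowest mantissa bit: the change is about 6e-8.
corrupted_mantissa = flip_bit(weight, 0)
```

A flip in the low mantissa bits is negligible, but a flip in the high exponent bit turns an ordinary weight into a value large enough to dominate every activation it feeds.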

Tier 2: Under-Cataloged

8-14

8. Persistent memory poisoning (>95% success)
9. Model merging backdoor propagation
10. Token smuggling (100% ASR with emoji)
11. LLM steganographic channels
12. Inference framework deserialization (ShadowMQ, copy-pasted across Meta/NVIDIA/Microsoft)
13. Cross-modal injection (94% inaudible jailbreak)
14. Adversarial drift detection evasion
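For the token smuggling entry, one possible pre-filter is to scan input for invisible Unicode code points before it reaches the model. The sketch assumes the payload is hidden in variation selectors, tag characters, or zero-width characters, which the text above does not specify. The function name `hidden_codepoints` is hypothetical.

```python
from typing import List


def hidden_codepoints(text: str) -> List[str]:
    """Return invisible code points that could carry a smuggled payload."""
    found = []
    for ch in text:
        cp = ord(ch)
        if (
            0xFE00 <= cp <= 0xFE0F            # variation selectors
            or 0xE0100 <= cp <= 0xE01EF       # variation selectors supplement
            or 0xE0000 <= cp <= 0xE007F       # tag characters
            or cp in (0x200B, 0x200C, 0x200D, 0x2060)  # zero-width characters
        ):
            found.append(f"U+{cp:04X}")
    return found


# An emoji followed by the letters "hi" encoded as invisible tag characters.
suspicious = "\U0001F600" + "".join(chr(0xE0000 + ord(c)) for c in "hi")
flags = hidden_codepoints(suspicious)
```

Legitimate emoji sequences also use U+FE0F, U+200D, and tag characters, so a check like this is a heuristic that needs allow-listing for known sequences rather than a blanket reject.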

Proposed CWEs

Eight new CWE variants are proposed for these classes. All of them map to existing CWE parents, so the taxonomy is extended without adding any new top-level CWE classes.