/claim #5838
Proposed Changes
This PR implements a comprehensive XSS Context Analyzer to replace blind fuzzing with intelligent, context-aware exploitation.
The solution moves away from “spray-and-pray” payloads to a Probe-and-Exploit architecture. It parses HTTP response bodies using a browser-grade HTML tokenizer to determine exactly where user input is reflected and whether it can be exploited.
Architecture & Approach
The analyzer operates in a two-phase process:
Phase 1: Smart Probing & Context Detection
- Smart Canary Injection: Sends a unique canary of the form Nucl3i<random><>'" as the probe value.
  - 6 random alphanumeric characters ensure uniqueness.
  - The trailing special characters (<, >, ', ") test filter behavior.
- Robust HTML Tokenization: Parses the response with a browser-grade tokenizer (golang.org/x/net/html).
  - Accurately identifies the parsing context, not just the string position.
  - Malformed HTML Handling: Implements “drain” logic to capture reflections in truncated or malformed HTML (e.g., unclosed tags), so that injection points in such responses are not missed.
  - Single-pass tokenization for O(n) performance.
- Context Classification: Detects 11+ distinct HTML contexts, including:
  - html_text - Plain HTML body content (supports fallback for malformed HTML)
  - html_attr_double_quoted / _single_quoted / _unquoted
  - script_code / script_string_double / _single_quoted / _template
  - rcdata (textarea, title), style_block, comment_block
  - event_handler - Event attributes (e.g., onclick)
- Exploitability Prioritization: Ranks contexts by exploitation difficulty (1-7 scale).
  - Prioritizes the easiest contexts first (e.g., html_text at rank 1 vs. script_string at rank 6).
  - Smart Sorting: Fallback contexts recovered from malformed HTML are sorted and prioritized by exploitability in the same way.
- Filter Detection & Fail-Fast: Analyzes which special characters survive encoding.
  - Compares the canary as sent with the canary as reflected.
  - Granular Analysis: Identifies whether specific characters (<, ', ") are allowed, even in edge cases such as malformed trailing content.
  - Fails fast if critical characters are encoded.
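
The canary and filter-detection steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's actual code: the names newCanary, survivingChars, and encodedForms are hypothetical, the entity list is a small sample, and the reflection is located with a plain substring search rather than the tokenizer-based analysis the analyzer uses.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"math/big"
	"strings"
)

const (
	canaryPrefix = "Nucl3i" // prefix taken from the PR description
	canarySuffix = `<>'"`   // special characters appended to the canary
	randomLen    = 6        // length of the random alphanumeric part
	alphabet     = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
)

// encodedForms lists a few common HTML-entity encodings per character.
// Assumption: the list is illustrative, not exhaustive.
var encodedForms = map[byte][]string{
	'<':  {"&lt;", "&#60;", "&#x3c;"},
	'>':  {"&gt;", "&#62;", "&#x3e;"},
	'\'': {"&#39;", "&#x27;", "&apos;"},
	'"':  {"&quot;", "&#34;", "&#x22;"},
}

// newCanary returns the marker (prefix plus random part) and the full probe
// value (marker plus special characters).
func newCanary() (string, string, error) {
	var sb strings.Builder
	sb.WriteString(canaryPrefix)
	for i := 0; i < randomLen; i++ {
		n, err := rand.Int(rand.Reader, big.NewInt(int64(len(alphabet))))
		if err != nil {
			return "", "", err
		}
		sb.WriteByte(alphabet[n.Int64()])
	}
	marker := sb.String()
	return marker, marker + canarySuffix, nil
}

// survivingChars looks at the first reflection of marker in body and reports
// which special characters follow it unencoded.
func survivingChars(body, marker string) map[byte]bool {
	survived := make(map[byte]bool, len(canarySuffix))
	idx := strings.Index(body, marker)
	if idx < 0 {
		return survived // canary not reflected at all
	}
	rest := body[idx+len(marker):]
	for i := 0; i < len(canarySuffix); i++ {
		c := canarySuffix[i]
		if rest != "" && rest[0] == c {
			survived[c] = true
			rest = rest[1:]
			continue
		}
		// Skip a recognized encoded form so the later characters line up.
		for _, enc := range encodedForms[c] {
			if len(rest) >= len(enc) && strings.EqualFold(rest[:len(enc)], enc) {
				rest = rest[len(enc):]
				break
			}
		}
	}
	return survived
}

func main() {
	marker, probe, err := newCanary()
	if err != nil {
		panic(err)
	}
	fmt.Println("probe value:", probe)

	// Simulated response in which the server encodes angle brackets only.
	body := "<p>Hello " + marker + `&lt;&gt;'"</p>`
	survived := survivingChars(body, marker)
	for i := 0; i < len(canarySuffix); i++ {
		c := canarySuffix[i]
		fmt.Printf("%q survived: %v\n", c, survived[c])
	}
}
```

In the simulated response, the two quote characters are reported as surviving and the angle brackets are not. A positional comparison like this can misread a stripped character when the page content that follows happens to start with the same character, which is one reason to combine it with tokenizer-derived context.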
Phase 2: Targeted Exploitation
- Context-Aware Payload Selection: Chooses optimal payload based on detected context and filter analysis.
- Verification: Sends targeted payload and verifies execution via pattern matching.
- Multi-Context Fallback: Tries up to 3 contexts if first attempt fails, ordered by exploitability rank.
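
The interplay of ranking, fail-fast, and multi-context fallback can be sketched as below. This is a hedged illustration, not the PR's implementation: the reflection type, planAttempts, exploitable, and the requiredChars table are hypothetical names; only the ranks for html_text (1) and script_string (6) and the limit of 3 attempts come from the description above, and the rank used for html_attr_double_quoted is a placeholder.

```go
package main

import (
	"fmt"
	"sort"
)

// reflection is one place where the canary was found in the response.
type reflection struct {
	context string
	rank    int // lower rank = easier to exploit (1-7 scale)
}

// requiredChars maps a context to the characters a breakout payload needs.
// Assumption: an illustrative subset, not the analyzer's real table.
var requiredChars = map[string]string{
	"html_text":               "<>",
	"html_attr_double_quoted": `"`,
	"html_attr_single_quoted": "'",
	"script_string_double":    `"`,
}

// maxAttempts mirrors the "up to 3 contexts" fallback limit.
const maxAttempts = 3

// exploitable reports whether every character needed for the context
// survived filtering. Unknown contexts are skipped in this sketch.
func exploitable(r reflection, survived map[byte]bool) bool {
	need, ok := requiredChars[r.context]
	if !ok {
		return false
	}
	for i := 0; i < len(need); i++ {
		if !survived[need[i]] {
			return false
		}
	}
	return true
}

// planAttempts drops unexploitable reflections (fail-fast), sorts the rest
// by rank, and keeps at most maxAttempts of them.
func planAttempts(found []reflection, survived map[byte]bool) []reflection {
	plan := make([]reflection, 0, len(found))
	for _, r := range found {
		if exploitable(r, survived) {
			plan = append(plan, r)
		}
	}
	sort.SliceStable(plan, func(i, j int) bool { return plan[i].rank < plan[j].rank })
	if len(plan) > maxAttempts {
		plan = plan[:maxAttempts]
	}
	return plan
}

func main() {
	found := []reflection{
		{"script_string_double", 6},
		{"html_text", 1},
		{"html_attr_double_quoted", 3}, // placeholder rank
	}
	// Quotes survived; angle brackets were encoded.
	survived := map[byte]bool{'"': true, '\'': true}
	for i, r := range planAttempts(found, survived) {
		fmt.Printf("attempt %d: %s (rank %d)\n", i+1, r.context, r.rank)
	}
}
```

Here html_text is skipped despite having the best rank, because its payload would need angle brackets that did not survive; the attribute context is tried first and the script string second.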
Performance Characteristics
| Metric | Traditional Fuzzing | Context Analyzer | Improvement |
| --- | --- | --- | --- |
| Requests per parameter | 50-100 payloads | ~4 (1 probe + up to 3 exploits) | 92-96% reduction |
| Context awareness | None | Full (11+ contexts) | ✅ |
| False positives | High | Low (fail-fast + verification) | ✅ |
| Time complexity | O(n×m) | O(n) single-pass | ✅ |
Key Features
- Precision: Distinguishes between 11+ distinct contexts.
- Fail-Fast Optimization: Skips unexploitable reflections immediately.
- Robustness: Handles RCDATA, comments, event handlers, and malformed trailing HTML (captured via post-tokenizer drain logic).
- Prioritization System: Targets “weakest link” first when multiple reflections exist.
- Clean Code: Zero magic numbers/strings, all configuration extracted to constants.
- Modern Go: stdlib error handling, proper resource cleanup with defer.
Proof of Work: Comprehensive Testing
Unit Test Results
Test Suite: pkg/fuzz/analyzers/xss/xss_test.go
- Execution: go test -v ./pkg/fuzz/analyzers/xss/...
- Results:
  - 39 test cases (covering standard contexts + edge cases)
  - 100% pass rate
  - Coverage: Context detection, specialized event handler logic, filter analysis, and malformed HTML handling.
Integration Test Results
Templates: integration_tests/fuzz/
- Execution: ./integration-test -protocol fuzzing
- Results:
  - fuzz/xss-context-test.yaml (General Context)
  - fuzz/fuzz-xss-attribute.yaml (Attributes)
  - fuzz/fuzz-xss-body.yaml (Body)
  - fuzz/fuzz-xss-comment.yaml (Comments)
  - fuzz/fuzz-xss-event.yaml (Event Handlers)
  - fuzz/fuzz-xss-script.yaml (Script Injection)
  - Success Rate: 100% (6/6 tests passed)
Checklist
- Pull request is created against the dev branch
- All checks passed (lint, unit/integration/regression tests etc.) with my changes
- I have added tests that prove my fix is effective or that my feature works
  - 39 unit tests with 100% pass rate
  - 6 integration tests covering all major XSS vectors
- I have added necessary documentation (if appropriate)
  - Comprehensive code comments
  - Clear function documentation
  - Architecture explained in code structure
Summary by CodeRabbit
- New Features
  - Added context-aware XSS analyzer with smart canary probing, context detection, targeted payload dispatch, and verification
  - Introduced a public XSS payload catalog and payload-selection/verification logic
  - Added playground endpoints to reflect inputs across multiple XSS contexts
  - Extended fuzzing with six new XSS test cases
- Tests
  - Added comprehensive unit tests covering context detection, payload selection, filters, edge cases, and robustness
- Chores
  - Minor integration-test control-flow and readiness improvements