feat(fuzz): Add XSS context analyzer with smart detection

/claim #5838

Proposed Changes

This PR adds an XSS Context Analyzer that replaces blind fuzzing with context-aware exploitation. Instead of “spray-and-pray” payloads, it uses a Probe-and-Exploit architecture: it parses HTTP response bodies with a browser-grade HTML tokenizer to determine where user input is reflected and whether that reflection can be exploited.

Architecture & Approach

The analyzer operates in a two-phase process:

Phase 1: Smart Probing & Context Detection

Smart Canary Injection: Sends a unique canary with format Nucl3i<random><>'".
- 6 random alphanumeric characters ensure uniqueness.
- Special characters (<>'") test filter behavior.
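A minimal sketch of how such a canary could be built, assuming the format described above (fixed prefix, 6 random alphanumeric characters, then the special characters). The names newCanary, canaryPrefix, canaryRandLen, and canarySpecials are illustrative and not taken from the PR's code.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"math/big"
)

// Illustrative constants; the names are assumptions, not the PR's identifiers.
const (
	canaryPrefix   = "Nucl3i"
	canaryRandLen  = 6
	canarySpecials = `<>'"`
	alphanumeric   = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
)

// newCanary returns prefix + 6 random alphanumeric characters + special characters.
func newCanary() (string, error) {
	buf := make([]byte, canaryRandLen)
	for i := range buf {
		// Pick a uniformly random index into the alphanumeric alphabet.
		n, err := rand.Int(rand.Reader, big.NewInt(int64(len(alphanumeric))))
		if err != nil {
			return "", err
		}
		buf[i] = alphanumeric[n.Int64()]
	}
	return canaryPrefix + string(buf) + canarySpecials, nil
}

func main() {
	canary, err := newCanary()
	if err != nil {
		panic(err)
	}
	fmt.Println(canary)
}
```

The prefix plus random part acts as the marker to search for in the response; the trailing special characters are what the filter analysis later inspects.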
Robust HTML Tokenization: Parses the response using a browser-grade tokenizer (golang.org/x/net/html).
- Accurately identifies parsing context, not just string position.
- Malformed HTML Handling: Implements “drain” logic to capture reflections in truncated or malformed HTML (e.g., unclosed tags), so that these injection points are not missed.
- Single-pass tokenization for O(n) performance.
Context Classification: Detects 11+ distinct HTML contexts, including:
- html_text - Plain HTML body content (supports fallback for malformed HTML)
- html_attr_double_quoted / _single_quoted / _unquoted
- script_code / script_string_double / _single_quoted / _template
- rcdata (textarea, title), style_block, comment_block
- event_handler - Event attributes (e.g., onclick)
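To illustrate how the script sub-contexts could be told apart, here is a simplified, stdlib-only sketch that scans a script body up to the canary and reports which kind of string literal, if any, is open at that point. It is not the PR's implementation: the type and function names are hypothetical, and the scan deliberately ignores JavaScript comments, regex literals, and `${...}` expressions inside template literals.

```go
package main

import (
	"fmt"
	"strings"
)

// jsContext is a hypothetical enum for the script sub-contexts.
type jsContext int

const (
	jsCode jsContext = iota // plain script code
	jsDoubleQuoted          // inside "..."
	jsSingleQuoted          // inside '...'
	jsTemplate              // inside `...`
)

// classifyScriptOffset scans script[:offset] and reports which literal is open there.
func classifyScriptOffset(script string, offset int) jsContext {
	var quote byte // 0 means "not inside a string literal"
	for i := 0; i < offset && i < len(script); i++ {
		c := script[i]
		switch {
		case quote != 0 && c == '\\':
			i++ // skip the escaped character
		case quote != 0 && c == quote:
			quote = 0 // closing delimiter
		case quote == 0 && (c == '"' || c == '\'' || c == '`'):
			quote = c // opening delimiter
		}
	}
	switch quote {
	case '"':
		return jsDoubleQuoted
	case '\'':
		return jsSingleQuoted
	case '`':
		return jsTemplate
	}
	return jsCode
}

// contextOfCanary locates the canary and classifies its position.
// It returns jsCode when the canary is absent.
func contextOfCanary(script, canary string) jsContext {
	idx := strings.Index(script, canary)
	if idx < 0 {
		return jsCode
	}
	return classifyScriptOffset(script, idx)
}

func main() {
	script := `var a = "x"; var b = 'CANARY';`
	fmt.Println(contextOfCanary(script, "CANARY") == jsSingleQuoted)
}
```

Tracking the open delimiter rather than the string position is what lets a later stage pick a payload that closes the right quote.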
Exploitability Prioritization: Ranks contexts by exploitation difficulty (1-7 scale).
- Prioritizes easiest contexts first (e.g., html_text rank 1 vs script_string rank 6).
- Smart Sorting: Even fallback contexts from malformed HTML are correctly sorted and prioritized based on their exploitability.
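The ranking step can be sketched as a stable sort over detected reflections. The reflection type and sortByExploitability function are hypothetical; only the ranks for html_text (1) and script_string (6) come from the description above, and the rank shown for comment_block is an assumed value for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// reflection is a hypothetical record of one place the canary was reflected.
type reflection struct {
	Context string
	Rank    int // 1 = easiest to exploit, 7 = hardest
}

// sortByExploitability orders reflections easiest-first.
// A stable sort keeps document order among reflections of equal rank.
func sortByExploitability(refs []reflection) {
	sort.SliceStable(refs, func(i, j int) bool {
		return refs[i].Rank < refs[j].Rank
	})
}

func main() {
	refs := []reflection{
		{Context: "script_string", Rank: 6},
		{Context: "comment_block", Rank: 3}, // assumed rank, for illustration only
		{Context: "html_text", Rank: 1},
	}
	sortByExploitability(refs)
	for _, r := range refs {
		fmt.Println(r.Rank, r.Context)
	}
}
```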
Filter Detection & Fail-Fast: Analyzes which special characters survived encoding.
- Compares canary before vs. after reflection.
- Granular Analysis: Correctly identifies if specific chars (<, ', ") are allowed even in edge cases like malformed trailing content.
- Fails fast if critical characters are encoded.
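A rough sketch of the before/after comparison, assuming the special characters are reflected directly after the canary's random marker. The function survivingChars and the encodings table are hypothetical, and the table lists only a few common encoded forms rather than everything a real filter might produce.

```go
package main

import (
	"fmt"
	"strings"
)

const specials = `<>'"`

// encodings lists a few common encoded forms per character (illustrative, not exhaustive).
var encodings = map[byte][]string{
	'<':  {"&lt;", "&#60;", "&#x3c;", "%3c"},
	'>':  {"&gt;", "&#62;", "&#x3e;", "%3e"},
	'\'': {"&#39;", "&#x27;", "&apos;", "%27"},
	'"':  {"&quot;", "&#34;", "&#x22;", "%22"},
}

// survivingChars reports, per special character, whether it was reflected raw
// immediately after marker. An empty map means the marker was not found.
func survivingChars(body, marker string) map[byte]bool {
	result := make(map[byte]bool, len(specials))
	idx := strings.Index(body, marker)
	if idx < 0 {
		return result
	}
	tail := body[idx+len(marker):]
	for i := 0; i < len(specials); i++ {
		ch := specials[i]
		if strings.HasPrefix(tail, string(ch)) {
			result[ch] = true // reflected unmodified
			tail = tail[1:]
			continue
		}
		result[ch] = false // encoded or stripped
		for _, enc := range encodings[ch] {
			// Consume a known encoded form so the next character lines up.
			if len(tail) >= len(enc) && strings.EqualFold(tail[:len(enc)], enc) {
				tail = tail[len(enc):]
				break
			}
		}
	}
	return result
}

func main() {
	body := `<p>Nucl3iAbC123&lt;&gt;'"</p>`
	surv := survivingChars(body, "Nucl3iAbC123")
	for i := 0; i < len(specials); i++ {
		fmt.Printf("%q survived=%v\n", specials[i], surv[specials[i]])
	}
}
```

A fail-fast check can then skip a context whose required character did not survive; which characters count as critical for each context is left to the analyzer.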

Phase 2: Targeted Exploitation

Context-Aware Payload Selection: Chooses optimal payload based on detected context and filter analysis.
Verification: Sends targeted payload and verifies execution via pattern matching.
Multi-Context Fallback: Tries up to 3 contexts if first attempt fails, ordered by exploitability rank.
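The fallback behaviour can be sketched as a bounded loop over contexts that are already sorted easiest-first. This is an illustration only: tryContexts, maxContextAttempts, and the exploit callback are hypothetical stand-ins for the payload selection and verification steps.

```go
package main

import "fmt"

// maxContextAttempts mirrors the "up to 3 contexts" limit described above.
const maxContextAttempts = 3

// tryContexts walks contexts in order and stops at the first verified success.
// It returns the successful context ("" if none) and the number of attempts made.
func tryContexts(contexts []string, exploit func(ctx string) bool) (string, int) {
	attempts := 0
	for _, ctx := range contexts {
		if attempts == maxContextAttempts {
			break
		}
		attempts++
		if exploit(ctx) {
			return ctx, attempts
		}
	}
	return "", attempts
}

func main() {
	contexts := []string{"html_text", "comment_block", "event_handler", "style_block"}
	// Simulated verifier: pretend only the second context is exploitable.
	hit, n := tryContexts(contexts, func(ctx string) bool {
		return ctx == "comment_block"
	})
	fmt.Println(hit, n)
}
```

Together with the single probe request, this cap is what bounds the cost at four requests per parameter.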

Performance Characteristics

Metric	Traditional Fuzzing	Context Analyzer	Improvement
Requests per parameter	50-100 payloads	Up to 4 (1 probe + up to 3 exploits)	92-96% reduction
Context awareness	None	Full (11+ contexts)	✅
False positives	High	Low (fail-fast + verification)	✅
Time complexity	O(n×m)	O(n) single-pass	✅

Key Features

Precision: Distinguishes between 11+ distinct contexts.
Fail-Fast Optimization: Skips unexploitable reflections immediately.
Robustness: Handles RCDATA, comments, event handlers, and malformed trailing HTML (captured via post-tokenizer drain logic).
Prioritization System: Targets “weakest link” first when multiple reflections exist.
Clean Code: Zero magic numbers/strings, all configuration extracted to constants.
Modern Go: stdlib error handling, proper resource cleanup with defer.

Proof of Work: Comprehensive Testing

Unit Test Results

Test Suite: pkg/fuzz/analyzers/xss/xss_test.go

Execution: go test -v ./pkg/fuzz/analyzers/xss/...
Results:
- 39 test cases (covering standard contexts + edge cases)
- 100% pass rate
- Coverage: Context detection, specialized event handler logic, filter analysis, and malformed HTML handling.

Integration Test Results

Templates: integration_tests/fuzz/

Execution: ./integration-test -protocol fuzzing
Results:
- fuzz/xss-context-test.yaml (General Context)
- fuzz/fuzz-xss-attribute.yaml (Attributes)
- fuzz/fuzz-xss-body.yaml (Body)
- fuzz/fuzz-xss-comment.yaml (Comments)
- fuzz/fuzz-xss-event.yaml (Event Handlers)
- fuzz/fuzz-xss-script.yaml (Script Injection)
Success Rate: 100% (6/6 tests passed)

Checklist

Pull request is created against the dev branch
All checks passed (lint, unit/integration/regression tests etc.) with my changes
I have added tests that prove my fix is effective or that my feature works
- 39 unit tests with 100% pass rate
- 6 integration tests covering all major XSS vectors
I have added necessary documentation (if appropriate)
- Comprehensive code comments
- Clear function documentation
- Architecture explained in code structure

Summary by CodeRabbit

New Features
- Added context-aware XSS analyzer with smart canary probing, context detection, targeted payload dispatch, and verification
- Introduced a public XSS payload catalog and payload-selection/verification logic
- Added playground endpoints to reflect inputs across multiple XSS contexts
- Extended fuzzing with six new XSS test cases
Tests
- Added comprehensive unit tests covering context detection, payload selection, filters, edge cases, and robustness
Chores
- Minor integration-test control-flow and readiness improvements
