Add XSS context analyzer with AST-based detection

Summary

Adds an XSS context analyzer that detects the precise injection context of reflected input in HTTP responses. This enables context-aware XSS payload generation in the fuzzer.

- 17 distinct context types detected: HTML text, comments, attribute values (double/single/unquoted), event handlers (onclick/onload/etc.), URL attributes (href/src/action), JavaScript strings (double/single/template literal), JS expressions, JS comments (line/block), CSS values, CSS url(), and style attributes
- Hybrid parsing approach: golang.org/x/net/html tokenizer for HTML-level context + gotreesitter AST parsing for JavaScript and CSS sub-context classification
- Canary-based detection: Injects a unique random marker, sends the request, then locates all reflection points in the response body
- Full test coverage: 31 tests covering all context types, real-world scenarios, edge cases, and the analyzer interface
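
As an illustration of how the context types could be modeled, here is a minimal sketch of an enum with string mappings. The Go identifiers (`Context`, `ContextHTMLText`, and so on) are hypothetical and not taken from the PR; only the string labels come from the real-world scenario table in this description, and only a subset of the 17 contexts is shown.

```go
package main

import "fmt"

// Context is a hypothetical enum for the injection context of a reflection.
type Context int

const (
	ContextUnknown Context = iota
	ContextHTMLText
	ContextAttrValueDoubleQuoted
	ContextEventHandler
	ContextURLAttribute
	ContextScriptStringDouble
	ContextScriptTemplateLiteral
	ContextCSSURL
)

// contextNames maps enum values to the labels reported in analyzer output.
// The labels are the ones listed in this description; the mapping itself
// is an assumption about how the real code is organized.
var contextNames = map[Context]string{
	ContextHTMLText:              "html_text",
	ContextAttrValueDoubleQuoted: "attr_value_double_quoted",
	ContextEventHandler:          "event_handler",
	ContextURLAttribute:          "url_attribute",
	ContextScriptStringDouble:    "script_string_double",
	ContextScriptTemplateLiteral: "script_template_literal",
	ContextCSSURL:                "css_url",
}

// String implements fmt.Stringer; unmapped values fall back to "unknown".
func (c Context) String() string {
	if name, ok := contextNames[c]; ok {
		return name
	}
	return "unknown"
}

func main() {
	fmt.Println(ContextScriptStringDouble) // prints "script_string_double"
}
```

Keeping the labels in a single map means a table-driven test can iterate over it to check that every context has a distinct, non-empty name.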

How it works

1. Generates a unique canary string: the prefix `gtss` followed by 8 random alphanumeric characters (12 characters total).
2. Sets the canary as the fuzz parameter value and sends the request.
3. Scans the response body for reflections of the canary.
4. For each reflection, classifies the HTML context, then sub-parses with the JS or CSS grammar if the reflection is inside a `<script>` or `<style>` block.
5. Returns the detected context(s) in `analyzer_details`.
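
The canary and scan steps can be sketched as follows. This is a standalone illustration, not the PR's code: the function names `generateCanary` and `findReflections`, the use of `crypto/rand`, and the lowercase-plus-digits alphabet are all assumptions. Only the `gtss` prefix and the 8-character random suffix come from the description.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"math/big"
	"strings"
)

const (
	canaryPrefix  = "gtss" // prefix stated in the PR description
	canaryRandLen = 8      // random suffix length stated in the PR description
	// alphabet is an assumption; the PR only says "alphanum".
	alphabet = "abcdefghijklmnopqrstuvwxyz0123456789"
)

// generateCanary returns the prefix plus 8 random alphanumeric characters.
func generateCanary() (string, error) {
	var sb strings.Builder
	sb.WriteString(canaryPrefix)
	limit := big.NewInt(int64(len(alphabet)))
	for i := 0; i < canaryRandLen; i++ {
		n, err := rand.Int(rand.Reader, limit)
		if err != nil {
			return "", err
		}
		sb.WriteByte(alphabet[n.Int64()])
	}
	return sb.String(), nil
}

// findReflections returns the byte offset of every non-overlapping
// occurrence of canary in body.
func findReflections(body, canary string) []int {
	if canary == "" {
		return nil
	}
	var offsets []int
	start := 0
	for {
		i := strings.Index(body[start:], canary)
		if i < 0 {
			break
		}
		offsets = append(offsets, start+i)
		start += i + len(canary)
	}
	return offsets
}

func main() {
	canary, err := generateCanary()
	if err != nil {
		panic(err)
	}
	body := "<h1>" + canary + "</h1><input value=\"" + canary + "\">"
	fmt.Println(canary, findReflections(body, canary))
}
```

Each returned offset would then be handed to the HTML-level classifier, so a response with several reflections yields one context per offset.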

Why gotreesitter for JS/CSS

Previous attempts at this feature used regex or flat tokenizers, which can’t distinguish between:

- `<script>var x = "REFLECTED"</script>` (string context: needs quote escape)
- `<script>var x = REFLECTED</script>` (expression context: direct injection)
- `<script>// REFLECTED</script>` (comment context: needs newline)

AST parsing with tree-sitter grammars handles these correctly via structural analysis rather than pattern matching.
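
To show why the distinction matters downstream, the sketch below maps each of the three JS contexts above to what a payload has to emit first to leave that context. The type and function names are hypothetical, and this is not the fuzzer's actual payload logic; it only encodes the three notes given in the examples.

```go
package main

import "fmt"

// jsContext is a hypothetical label for the JS sub-context of a reflection.
type jsContext string

const (
	jsStringDouble jsContext = "script_string_double"
	jsExpression   jsContext = "script_expression"   // assumed label
	jsLineComment  jsContext = "script_comment_line" // assumed label
)

// breakoutPrefix returns the characters a payload must emit first to
// escape the given context, following the three cases in the description.
// The boolean is false for contexts this sketch does not cover.
func breakoutPrefix(ctx jsContext) (string, bool) {
	switch ctx {
	case jsStringDouble:
		return `"`, true // close the double-quoted string
	case jsExpression:
		return "", true // already in code position, inject directly
	case jsLineComment:
		return "\n", true // a newline ends the line comment
	default:
		return "", false
	}
}

func main() {
	for _, ctx := range []jsContext{jsStringDouble, jsExpression, jsLineComment} {
		prefix, _ := breakoutPrefix(ctx)
		fmt.Printf("%s -> %q\n", ctx, prefix)
	}
}
```

A detector that reports the wrong one of these contexts would pick the wrong prefix, which is the failure mode the AST-based classification is meant to avoid.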

Files

| File | Purpose |
| --- | --- |
| `pkg/fuzz/analyzers/xss/context.go` | Context enum types and string mappings |
| `pkg/fuzz/analyzers/xss/html_context.go` | HTML tokenizer-based context detection |
| `pkg/fuzz/analyzers/xss/js_context.go` | JS sub-context via gotreesitter AST |
| `pkg/fuzz/analyzers/xss/css_context.go` | CSS sub-context via gotreesitter AST |
| `pkg/fuzz/analyzers/xss/analyzer.go` | Analyzer interface + registration |
| `pkg/fuzz/analyzers/xss/analyzer_test.go` | 31 table-driven tests + benchmarks |
| `pkg/protocols/http/http.go` | Blank import for auto-registration |
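
The blank import relies on Go's `init` mechanism: importing a package only for its side effects runs that package's `init` functions, which can register the analyzer. The sketch below shows the pattern in a single file. The `Analyzer` interface, the `Register` function, the `registry` map, and the name `xss_context` are all hypothetical stand-ins, not the project's actual API.

```go
package main

import (
	"fmt"
	"sort"
)

// Analyzer is a hypothetical minimal interface for this sketch.
type Analyzer interface {
	Name() string
}

// registry holds every analyzer registered at program start.
var registry = map[string]Analyzer{}

// Register adds an analyzer under its name.
func Register(a Analyzer) {
	registry[a.Name()] = a
}

type xssContextAnalyzer struct{}

func (xssContextAnalyzer) Name() string { return "xss_context" } // assumed name

// In a multi-package layout this init would live in the analyzer package,
// and a blank import (import _ "path/to/analyzer") would trigger it.
func init() {
	Register(xssContextAnalyzer{})
}

// registered returns the registered analyzer names in sorted order.
func registered() []string {
	names := make([]string, 0, len(registry))
	for name := range registry {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	fmt.Println(registered()) // prints "[xss_context]"
}
```

The trade-off with this pattern is that registration is implicit: removing the blank import silently drops the analyzer, so a test that asserts the analyzer is present in the registry is a useful guard.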

Build

Requires the grammar_set_core build tag for gotreesitter grammars (includes HTML/JS/CSS, ~1MB):

go build -tags grammar_set_core ./...
go test -tags grammar_set_core ./pkg/fuzz/analyzers/xss/ -v

Test Results

All 31 tests pass:

=== RUN   TestDetermineContext (22 sub-tests: all 17 context types)          PASS
=== RUN   TestNoReflection                                                   PASS
=== RUN   TestMultipleReflections                                            PASS
=== RUN   TestGenerateCanary                                                 PASS
=== RUN   TestRealWorldReflections (6 sub-tests)                             PASS
=== RUN   TestContextStrings (9 sub-tests)                                   PASS

Real-World Test Scenarios

TestRealWorldReflections validates context detection against realistic HTML from actual web applications:

| Scenario | Reflections | Contexts Detected |
| --- | --- | --- |
| Search results page (query in heading + input) | 2 | `html_text`, `attr_value_double_quoted` |
| Error page (param in script var + message) | 2 | `script_string_double`, `html_text` |
| Profile page (username in 6 places) | 6 | `css_url`, `url_attribute`, `html_text`, `url_attribute`, `attr_value_double_quoted`, `event_handler` |
| SPA boot page (config in JSON init) | 1 | `script_string_double` |
| Comment form (input in textarea + hidden) | 2 | `attr_value_double_quoted`, `html_text` |
| Template literal + event handler | 2 | `script_template_literal`, `event_handler` |

Benchmarks

goos: linux
goarch: amd64
cpu: Intel(R) Core(TM) Ultra 9 285

BenchmarkFindReflections/html_text-20            1371783       961 ns/op      4424 B/op       8 allocs/op
BenchmarkFindReflections/attribute-20            1108882      1074 ns/op      4592 B/op      13 allocs/op
BenchmarkFindReflections/event_handler-20        1238641       961 ns/op      4528 B/op      12 allocs/op
BenchmarkFindReflections/script_string-20           1953    613302 ns/op    397377 B/op    4491 allocs/op
BenchmarkFindReflections/script_template-20         1845    624138 ns/op    397694 B/op    4491 allocs/op
BenchmarkFindReflections/css_value-20              10000    101334 ns/op    111953 B/op    1357 allocs/op
BenchmarkFindReflections/css_url-20                11804    100782 ns/op    112031 B/op    1359 allocs/op
BenchmarkFindReflections/multi_reflect-20           1551    750427 ns/op    504590 B/op    5844 allocs/op

HTML-only contexts (text, attribute, event handler) run in about 1 μs. JS/CSS sub-parsing adds about 100 μs (CSS) to 600 μs (JS) per call for grammar loading, and the multi-reflection case takes about 750 μs. This is negligible for network-bound fuzzing.

Test plan

- All 31 context detection tests pass (22 unit + 6 real-world + 3 utility)
- Canary generation produces unique 12-char strings
- Multiple reflections in single response detected correctly
- No-reflection case returns false correctly
- Real-world HTML scenarios with multiple mixed contexts all classified correctly
- Full nuclei binary builds cleanly with grammar_set_core tag

/claim #5838

Summary by CodeRabbit

Release Notes

New Features
- Added XSS context detection analyzer to identify injection points and classify contexts (HTML text, attributes, JavaScript, CSS, etc.) during analysis.
Tests
- Added comprehensive test suite for XSS context detection.