Summary

Adds an XSS context analyzer that detects the precise injection context of reflected input in HTTP responses. This enables context-aware XSS payload generation in the fuzzer.

  • 17 distinct context types detected: HTML text, comments, attribute values (double/single/unquoted), event handlers (onclick/onload/etc.), URL attributes (href/src/action), JavaScript strings (double/single/template literal), JS expressions, JS comments (line/block), CSS values, CSS url(), and style attributes
  • Hybrid parsing approach: golang.org/x/net/html tokenizer for HTML-level context + gotreesitter AST parsing for JavaScript and CSS sub-context classification
  • Canary-based detection: Injects a unique random marker, sends the request, then locates all reflection points in the response body
  • Full test coverage: 31 tests covering all context types, real-world scenarios, edge cases, and the analyzer interface
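
As a rough illustration of what the context enum and its string mappings might look like, here is a minimal sketch. The string names are the ones that appear in the real-world test table in this PR; the Go identifiers (`Context`, `ContextHTMLText`, and so on) are assumptions made for the example, and only a subset of the 17 types is shown.

```go
package main

import "fmt"

// Context identifies where a reflected value landed in the response.
// The Go identifiers are hypothetical; only the string names below
// are taken from the PR's test results.
type Context int

const (
	ContextUnknown Context = iota
	ContextHTMLText
	ContextAttrValueDoubleQuoted
	ContextEventHandler
	ContextURLAttribute
	ContextScriptStringDouble
	ContextScriptTemplateLiteral
	ContextCSSURL
)

// contextNames maps each enum value to its reported name.
var contextNames = map[Context]string{
	ContextHTMLText:              "html_text",
	ContextAttrValueDoubleQuoted: "attr_value_double_quoted",
	ContextEventHandler:          "event_handler",
	ContextURLAttribute:          "url_attribute",
	ContextScriptStringDouble:    "script_string_double",
	ContextScriptTemplateLiteral: "script_template_literal",
	ContextCSSURL:                "css_url",
}

// String implements fmt.Stringer; unmapped values report "unknown".
func (c Context) String() string {
	if name, ok := contextNames[c]; ok {
		return name
	}
	return "unknown"
}

func main() {
	fmt.Println(ContextScriptTemplateLiteral) // script_template_literal
}
```

Keeping the names in a single map means a lookup test can iterate over it and catch a context that was added to the enum without a string mapping.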

How it works

  1. Generates a unique canary string (gtss + 8 random alphanumeric characters, 12 characters total)
  2. Sets canary as the fuzz parameter value and sends the request
  3. Scans the response body for canary reflections
  4. For each reflection: classifies the HTML context, then sub-parses with JS or CSS grammars if inside <script> or <style> blocks
  5. Returns the detected context(s) in analyzer_details
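
Steps 1 and 3 can be sketched with the standard library alone. This is an illustrative sketch, not the code in this PR: the function names `generateCanary` and `findReflections`, the lowercase-plus-digits alphabet, and the use of crypto/rand are all assumptions. Only the gtss prefix and the 8 random characters come from the description above.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"math/big"
	"strings"
)

const (
	canaryPrefix    = "gtss"
	canaryRandomLen = 8
	// Assumed alphabet; the PR only says "alphanumeric".
	alphabet = "abcdefghijklmnopqrstuvwxyz0123456789"
)

// generateCanary returns the prefix followed by 8 random characters.
func generateCanary() (string, error) {
	var sb strings.Builder
	sb.WriteString(canaryPrefix)
	limit := big.NewInt(int64(len(alphabet)))
	for i := 0; i < canaryRandomLen; i++ {
		n, err := rand.Int(rand.Reader, limit)
		if err != nil {
			return "", err
		}
		sb.WriteByte(alphabet[n.Int64()])
	}
	return sb.String(), nil
}

// findReflections returns the byte offset of every non-overlapping
// occurrence of canary in body, or nil when there is none.
func findReflections(body, canary string) []int {
	var offsets []int
	start := 0
	for {
		i := strings.Index(body[start:], canary)
		if i < 0 {
			return offsets
		}
		offsets = append(offsets, start+i)
		start += i + len(canary)
	}
}

func main() {
	canary, err := generateCanary()
	if err != nil {
		panic(err)
	}
	body := "<h1>" + canary + "</h1><input value=\"" + canary + "\">"
	fmt.Println(canary, findReflections(body, canary))
}
```

Each offset returned by the scan is the starting point for step 4, where the surrounding markup decides which context the reflection belongs to.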

Why gotreesitter for JS/CSS

Previous attempts at this feature used regex or flat tokenizers, which can’t distinguish between:

  • <script>var x = "REFLECTED"</script> (string context — needs quote escape)
  • <script>var x = REFLECTED</script> (expression context — direct injection)
  • <script>// REFLECTED</script> (comment context — needs newline)

AST parsing with tree-sitter grammars handles these correctly via structural analysis rather than pattern matching.
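
To show why the distinction matters downstream, the three cases above can be written as a mapping from JS sub-context to the characters a payload would need to emit first. This is a hedged sketch only: the type and function names are hypothetical, the `script_expression` and `script_line_comment` names are guesses (only `script_string_double` appears in this PR's test results), and the fuzzer's actual payload generation is not part of this description.

```go
package main

import "fmt"

// JSContext is a hypothetical name for a JavaScript sub-context.
type JSContext string

const (
	JSStringDouble JSContext = "script_string_double" // name from the test results
	JSExpression   JSContext = "script_expression"    // assumed name
	JSLineComment  JSContext = "script_line_comment"  // assumed name
)

// breakoutPrefix returns what must precede injected code in each
// context, following the three cases listed above. The boolean is
// false for contexts this sketch does not cover.
func breakoutPrefix(ctx JSContext) (string, bool) {
	switch ctx {
	case JSStringDouble:
		return `"`, true // close the string literal first
	case JSExpression:
		return "", true // already in code position
	case JSLineComment:
		return "\n", true // a newline ends the comment
	default:
		return "", false
	}
}

func main() {
	for _, ctx := range []JSContext{JSStringDouble, JSExpression, JSLineComment} {
		prefix, _ := breakoutPrefix(ctx)
		fmt.Printf("%s -> %q\n", ctx, prefix)
	}
}
```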

Files

File                                     Purpose
pkg/fuzz/analyzers/xss/context.go        Context enum types and string mappings
pkg/fuzz/analyzers/xss/html_context.go   HTML tokenizer-based context detection
pkg/fuzz/analyzers/xss/js_context.go     JS sub-context via gotreesitter AST
pkg/fuzz/analyzers/xss/css_context.go    CSS sub-context via gotreesitter AST
pkg/fuzz/analyzers/xss/analyzer.go       Analyzer interface + registration
pkg/fuzz/analyzers/xss/analyzer_test.go  31 table-driven tests + benchmarks
pkg/protocols/http/http.go               Blank import for auto-registration
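
For readers unfamiliar with the blank-import pattern, the sketch below shows the general idea in a single file: an `init` function registers the analyzer, so importing the package for its side effects (a blank import such as `import _ "…/xss"` in a separate file) is enough to make it available. Everything here is hypothetical, including the pared-down `Analyzer` interface, the `Register` helper, and the `xss_context` name; the real interface in the fuzzing package has more methods.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Analyzer is a pared-down, hypothetical stand-in for the real interface.
type Analyzer interface {
	Name() string
}

var (
	mu       sync.RWMutex
	registry = map[string]Analyzer{}
)

// Register stores an analyzer under its own name.
func Register(a Analyzer) {
	mu.Lock()
	defer mu.Unlock()
	registry[a.Name()] = a
}

// xssContextAnalyzer would live in its own package in a real layout.
type xssContextAnalyzer struct{}

func (xssContextAnalyzer) Name() string { return "xss_context" } // assumed name

// init runs when the package is loaded, which is what a blank
// import triggers when the analyzer lives in another package.
func init() {
	Register(xssContextAnalyzer{})
}

// registered lists analyzer names in sorted order.
func registered() []string {
	mu.RLock()
	defer mu.RUnlock()
	names := make([]string, 0, len(registry))
	for name := range registry {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	fmt.Println(registered()) // [xss_context]
}
```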

Build

Requires the grammar_set_core build tag for gotreesitter grammars (includes HTML/JS/CSS, ~1MB):

go build -tags grammar_set_core ./...
go test -tags grammar_set_core ./pkg/fuzz/analyzers/xss/ -v

Test Results

All 31 tests pass:

=== RUN TestDetermineContext (22 sub-tests: all 17 context types) PASS
=== RUN TestNoReflection PASS
=== RUN TestMultipleReflections PASS
=== RUN TestGenerateCanary PASS
=== RUN TestRealWorldReflections (6 sub-tests) PASS
=== RUN TestContextStrings (9 sub-tests) PASS

Real-World Test Scenarios

TestRealWorldReflections validates context detection against realistic HTML from actual web applications:

Scenario                                        Reflections  Contexts Detected
Search results page (query in heading + input)  2            html_text, attr_value_double_quoted
Error page (param in script var + message)      2            script_string_double, html_text
Profile page (username in 6 places)             6            css_url, url_attribute, html_text, url_attribute, attr_value_double_quoted, event_handler
SPA boot page (config in JSON init)             1            script_string_double
Comment form (input in textarea + hidden)       2            attr_value_double_quoted, html_text
Template literal + event handler                2            script_template_literal, event_handler

Benchmarks

goos: linux
goarch: amd64
cpu: Intel(R) Core(TM) Ultra 9 285
BenchmarkFindReflections/html_text-20 1371783 961 ns/op 4424 B/op 8 allocs/op
BenchmarkFindReflections/attribute-20 1108882 1074 ns/op 4592 B/op 13 allocs/op
BenchmarkFindReflections/event_handler-20 1238641 961 ns/op 4528 B/op 12 allocs/op
BenchmarkFindReflections/script_string-20 1953 613302 ns/op 397377 B/op 4491 allocs/op
BenchmarkFindReflections/script_template-20 1845 624138 ns/op 397694 B/op 4491 allocs/op
BenchmarkFindReflections/css_value-20 10000 101334 ns/op 111953 B/op 1357 allocs/op
BenchmarkFindReflections/css_url-20 11804 100782 ns/op 112031 B/op 1359 allocs/op
BenchmarkFindReflections/multi_reflect-20 1551 750427 ns/op 504590 B/op 5844 allocs/op

HTML-only contexts (text, attribute, event handler) run in ~1μs per response. JS/CSS sub-parsing adds roughly 100μs (CSS) to 600μs (JS) per response for grammar loading, which is negligible for network-bound fuzzing.
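
For anyone who wants to collect comparable numbers outside `go test`, the standard library's `testing.Benchmark` can be called from an ordinary program. The sketch below times only a plain substring scan for the canary, not the analyzer in this PR; `measureScan` and the sample body are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// measureScan benchmarks a plain substring search for the canary.
// It stands in for the real reflection finder, which is not shown here.
func measureScan(body, canary string) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			if !strings.Contains(body, canary) {
				b.Fatal("canary not found")
			}
		}
	})
}

func main() {
	body := strings.Repeat("<p>filler</p>", 100) + "<h1>gtssabcd1234</h1>"
	r := measureScan(body, "gtssabcd1234")
	fmt.Printf("%d iterations, %d ns/op, %d allocs/op\n",
		r.N, r.NsPerOp(), r.AllocsPerOp())
}
```

The figures will vary by machine, so they are only useful relative to each other, for example to compare an HTML-only body against one that forces JS sub-parsing.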

Test plan

  • All 31 context detection tests pass (22 unit + 6 real-world + 3 utility)
  • Canary generation produces unique 12-char strings
  • Multiple reflections in single response detected correctly
  • No-reflection case returns false correctly
  • Real-world HTML scenarios with multiple mixed contexts all classified correctly
  • Full nuclei binary builds cleanly with grammar_set_core tag

/claim #5838

Summary by CodeRabbit

Release Notes

  • New Features

    • Added XSS context detection analyzer to identify injection points and classify contexts (HTML text, attributes, JavaScript, CSS, etc.) during analysis.
  • Tests

    • Added comprehensive test suite for XSS context detection.

Claim

Total prize pool: $200
Total paid: $0
Status: Pending
Submitted: February 27, 2026
Last updated: February 27, 2026

Contributors

Oscar Villavicencio (@odvcencio): 100%

Sponsors

ProjectDiscovery (@projectdiscovery): $200