Adds an XSS context analyzer that detects the precise injection context of reflected input in HTTP responses. This enables context-aware XSS payload generation in the fuzzer.
- … onclick/onload/etc.), URL attributes (href/src/action), JavaScript strings (double/single/template literal), JS expressions, JS comments (line/block), CSS values, CSS url(), and style attributes
- golang.org/x/net/html tokenizer for HTML-level context + gotreesitter AST parsing for JavaScript and CSS sub-context classification
- … gtss + 8 random alphanum chars)
- … `<script>` or `<style>` blocks
- … analyzer_details

Previous attempts at this feature used regex or flat tokenizers, which can’t distinguish between:
- `<script>var x = "REFLECTED"</script>` (string context: needs quote escape)
- `<script>var x = REFLECTED</script>` (expression context: direct injection)
- `<script>// REFLECTED</script>` (comment context: needs newline)

AST parsing with tree-sitter grammars handles these correctly via structural analysis rather than pattern matching.
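To make the distinction above concrete, here is a minimal sketch of how a payload generator might pick a break-out prefix once the sub-context is known. The `JSContext` type, its constants, and `breakout` are hypothetical names for illustration only; they are not taken from this PR's code.

```go
package main

import "fmt"

// JSContext is a hypothetical enum covering the three script
// sub-contexts listed above. The PR's real type names may differ.
type JSContext int

const (
	JSString      JSContext = iota // reflected inside a double-quoted string
	JSExpression                   // reflected directly as an expression
	JSLineComment                  // reflected inside a // comment
)

// breakout returns the characters a payload would need to emit first
// in order to leave the given context before injecting code.
func breakout(ctx JSContext) string {
	switch ctx {
	case JSString:
		return `"` // close the string literal
	case JSLineComment:
		return "\n" // a newline ends a line comment
	default:
		return "" // expression context: nothing to escape
	}
}

func main() {
	for _, c := range []JSContext{JSString, JSExpression, JSLineComment} {
		fmt.Printf("context %d needs prefix %q\n", c, breakout(c))
	}
}
```

A regex that only sees `REFLECTED` inside a `<script>` block cannot choose between these three prefixes, which is the gap the AST-based classification is meant to close.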
| File | Purpose |
|---|---|
| `pkg/fuzz/analyzers/xss/context.go` | Context enum types and string mappings |
| `pkg/fuzz/analyzers/xss/html_context.go` | HTML tokenizer-based context detection |
| `pkg/fuzz/analyzers/xss/js_context.go` | JS sub-context via gotreesitter AST |
| `pkg/fuzz/analyzers/xss/css_context.go` | CSS sub-context via gotreesitter AST |
| `pkg/fuzz/analyzers/xss/analyzer.go` | Analyzer interface + registration |
| `pkg/fuzz/analyzers/xss/analyzer_test.go` | 31 table-driven tests + benchmarks |
| `pkg/protocols/http/http.go` | Blank import for auto-registration |
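The "blank import for auto-registration" entry refers to a common Go pattern: a package registers itself from `init()`, and another package imports it with `_` purely for that side effect. The sketch below collapses the pattern into one file so it runs on its own. The `Analyzer` interface shape, the `Register` function, and the name `xss_context` are assumptions for illustration, not this repository's actual API.

```go
package main

import (
	"fmt"
	"sort"
)

// Analyzer is a hypothetical minimal interface; the real one in
// analyzer.go is likely richer.
type Analyzer interface {
	Name() string
}

// registry maps analyzer names to implementations.
var registry = map[string]Analyzer{}

// Register adds an analyzer under its own name.
func Register(a Analyzer) { registry[a.Name()] = a }

// xssContextAnalyzer stands in for the analyzer added by this PR.
type xssContextAnalyzer struct{}

func (xssContextAnalyzer) Name() string { return "xss_context" }

// init runs when the package is loaded. In a multi-package layout,
// a blank import of the analyzer package is what triggers it.
func init() { Register(xssContextAnalyzer{}) }

// registered returns the sorted names of all registered analyzers.
func registered() []string {
	names := make([]string, 0, len(registry))
	for name := range registry {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	fmt.Println(registered())
}
```

The trade-off with this pattern is that registration becomes implicit: removing the blank import silently disables the analyzer, so the import line usually deserves a comment.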
Requires the grammar_set_core build tag for gotreesitter grammars (includes HTML/JS/CSS, ~1MB):
go build -tags grammar_set_core ./...
go test -tags grammar_set_core ./pkg/fuzz/analyzers/xss/ -v
All 31 tests pass:
=== RUN TestDetermineContext (22 sub-tests: all 17 context types) PASS
=== RUN TestNoReflection PASS
=== RUN TestMultipleReflections PASS
=== RUN TestGenerateCanary PASS
=== RUN TestRealWorldReflections (6 sub-tests) PASS
=== RUN TestContextStrings (9 sub-tests) PASS
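The test list above includes `TestGenerateCanary`, and the description mentions a canary of the form `gtss` + 8 random alphanum chars. A minimal sketch of such a generator follows. The function name, the lowercase-only alphabet, and the use of `crypto/rand` are assumptions; the PR may generate its canary differently.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

const (
	canaryPrefix    = "gtss" // prefix taken from the PR description
	canarySuffixLen = 8      // number of random characters appended
	// Assumed alphabet: lowercase letters and digits only, so the
	// canary survives case normalization by the target application.
	alphanum = "abcdefghijklmnopqrstuvwxyz0123456789"
)

// generateCanary returns the prefix followed by 8 random characters.
// The modulo mapping is slightly biased, which is acceptable for a
// marker string that only needs to be unlikely to collide.
func generateCanary() (string, error) {
	buf := make([]byte, canarySuffixLen)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	for i, b := range buf {
		buf[i] = alphanum[int(b)%len(alphanum)]
	}
	return canaryPrefix + string(buf), nil
}

func main() {
	canary, err := generateCanary()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(canary)
}
```

A canary made only of alphanumeric characters passes through most output encoders unchanged, so it can be located in the response regardless of which context it lands in.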
TestRealWorldReflections validates context detection against realistic HTML from actual web applications:
| Scenario | Reflections | Contexts Detected |
|---|---|---|
| Search results page (query in heading + input) | 2 | html_text, attr_value_double_quoted |
| Error page (param in script var + message) | 2 | script_string_double, html_text |
| Profile page (username in 6 places) | 6 | css_url, url_attribute, html_text, url_attribute, attr_value_double_quoted, event_handler |
| SPA boot page (config in JSON init) | 1 | script_string_double |
| Comment form (input in textarea + hidden) | 2 | attr_value_double_quoted, html_text |
| Template literal + event handler | 2 | script_template_literal, event_handler |
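Each scenario in the table starts from the same first step: locating every occurrence of the canary in the response body. A minimal sketch of that step, using only the standard library, is shown below. `reflectionOffsets` is a hypothetical helper and the sample body is invented to resemble the search results scenario; context classification at each offset is a separate step not shown here.

```go
package main

import (
	"fmt"
	"strings"
)

// reflectionOffsets returns the byte offset of every non-overlapping
// occurrence of canary in body, in order of appearance.
func reflectionOffsets(body, canary string) []int {
	if canary == "" {
		return nil
	}
	var offsets []int
	start := 0
	for {
		i := strings.Index(body[start:], canary)
		if i < 0 {
			return offsets
		}
		offsets = append(offsets, start+i)
		start += i + len(canary) // continue after this match
	}
}

func main() {
	// Invented body: the query is reflected in a heading and an input.
	body := `<h1>Results for gtssab12cd34</h1><input value="gtssab12cd34">`
	fmt.Println(reflectionOffsets(body, "gtssab12cd34"))
}
```

Each offset would then be handed to the HTML tokenizer pass to decide which of the contexts in the table applies at that position.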
goos: linux
goarch: amd64
cpu: Intel(R) Core(TM) Ultra 9 285
BenchmarkFindReflections/html_text-20 1371783 961 ns/op 4424 B/op 8 allocs/op
BenchmarkFindReflections/attribute-20 1108882 1074 ns/op 4592 B/op 13 allocs/op
BenchmarkFindReflections/event_handler-20 1238641 961 ns/op 4528 B/op 12 allocs/op
BenchmarkFindReflections/script_string-20 1953 613302 ns/op 397377 B/op 4491 allocs/op
BenchmarkFindReflections/script_template-20 1845 624138 ns/op 397694 B/op 4491 allocs/op
BenchmarkFindReflections/css_value-20 10000 101334 ns/op 111953 B/op 1357 allocs/op
BenchmarkFindReflections/css_url-20 11804 100782 ns/op 112031 B/op 1359 allocs/op
BenchmarkFindReflections/multi_reflect-20 1551 750427 ns/op 504590 B/op 5844 allocs/op
HTML-only contexts (text, attribute, event handler) run in ~1μs. JS and CSS sub-parsing, including grammar loading, adds roughly 100μs (CSS) to 600μs (JS) per call, which is negligible for network-bound fuzzing.
- … grammar_set_core tag

/claim #5838
Oscar Villavicencio
@odvcencio
ProjectDiscovery
@projectdiscovery