/claim #5838
Adds an XSS reflection context analyzer to the fuzzing engine (pkg/fuzz/analyzers/xss/). Given an HTTP response body and a marker string, it classifies the HTML context of the reflection into one of 8 types (body text, attribute, URL attribute, event handler, executable script, non-executable script data, style, or comment). This gives the fuzzer the information it needs to select context-appropriate payloads instead of blindly spraying all of them. Relates to #5838.
The fuzzing engine has no way to determine where in an HTML response a reflected value lands. Without this, XSS payload selection is blind: you either spray every payload everywhere (noisy, slow, false-positive-heavy) or miss valid injection points entirely. Issue #5838 asks for a context analyzer to fix this.
Adds pkg/fuzz/analyzers/xss/ — a standalone context classification package that takes a response body and a marker string, and returns which HTML context the marker was reflected into.
The function signature:
func AnalyzeReflectionContext(responseBody, marker string) (XSSContext, error)
It uses the golang.org/x/net/html tokenizer (already in go.mod) to walk the HTML token stream, with a state machine tracking inScript, inStyle, and script executability. No regex is used for HTML parsing.
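As an illustration of the "script executability" part of that state, here is a minimal, standard-library-only sketch. It is not the PR's code: the `attr` type, the function names, and the contents of `executableTypes` are assumptions made for this example. It only mirrors three behaviours described in this PR: a missing type means executable, MIME parameters after `;` are ignored, and the first `type` attribute wins when duplicates exist.

```go
package main

import (
	"fmt"
	"strings"
)

// executableTypes is an illustrative subset of script types that run as
// JavaScript. The lookup table in the PR may contain more entries.
var executableTypes = map[string]bool{
	"":                       true, // no type attribute: classic script
	"text/javascript":        true,
	"application/javascript": true,
	"module":                 true,
}

// attr is a hypothetical stand-in for one attribute read from the tokenizer.
type attr struct{ key, val string }

// firstScriptType returns the value of the first type attribute and
// ignores later duplicates, matching how browsers treat duplicate attributes.
func firstScriptType(attrs []attr) string {
	for _, a := range attrs {
		if strings.EqualFold(a.key, "type") {
			return a.val
		}
	}
	return ""
}

// isExecutableScriptType reports whether a script element with this type
// value would be treated as executable by this sketch.
func isExecutableScriptType(t string) bool {
	// Drop MIME parameters such as "; charset=utf-8" before the lookup.
	if i := strings.IndexByte(t, ';'); i >= 0 {
		t = t[:i]
	}
	return executableTypes[strings.ToLower(strings.TrimSpace(t))]
}

func main() {
	dupes := []attr{{"type", "application/json"}, {"type", "text/javascript"}}
	fmt.Println(isExecutableScriptType(firstScriptType(dupes)))           // false
	fmt.Println(isExecutableScriptType("text/javascript; charset=utf-8")) // true
}
```

In the analyzer itself this decision would presumably be made once per script start tag, with the result carried in the state machine until the matching end tag.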
| Context | When | Example |
|---|---|---|
| ContextHTMLBody | Text between tags | <p>MARKER</p> |
| ContextHTMLAttribute | Generic attribute | <input value="MARKER"> |
| ContextHTMLAttributeURL | URL attribute (href, src, action, ping, etc.) | <a href="/path/MARKER"> |
| ContextHTMLAttributeEvent | Event handler (onclick, onerror, etc.) | <div onclick="fn(MARKER)"> |
| ContextScript | Executable <script>, or javascript:/vbscript:/data:text/html/data:image/svg+xml URI in an executable sink | <a href="javascript:MARKER"> |
| ContextScriptData | Non-executable script (JSON, template, etc.) | <script type="application/json">MARKER</script> |
| ContextStyle | <style> block or style="" attribute | <div style="color:MARKER"> |
| ContextComment | HTML comment | <!-- MARKER --> |
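For readers who want to see how the eight contexts might look in code, the following is a hedged sketch of a context type with iota constants and a String() method. The constant names come from the table above; their order, the human-readable names in `contextNames`, and the fallback format for unknown values are assumptions for illustration, not a copy of context.go.

```go
package main

import "fmt"

// XSSContext identifies the HTML context a marker was reflected into.
type XSSContext int

// The order of these constants is an assumption made for this sketch.
const (
	ContextHTMLBody XSSContext = iota
	ContextHTMLAttribute
	ContextHTMLAttributeURL
	ContextHTMLAttributeEvent
	ContextScript
	ContextScriptData
	ContextStyle
	ContextComment
)

// contextNames maps each context to a readable name (hypothetical strings).
var contextNames = map[XSSContext]string{
	ContextHTMLBody:           "html_body",
	ContextHTMLAttribute:      "html_attribute",
	ContextHTMLAttributeURL:   "html_attribute_url",
	ContextHTMLAttributeEvent: "html_attribute_event",
	ContextScript:             "script",
	ContextScriptData:         "script_data",
	ContextStyle:              "style",
	ContextComment:            "comment",
}

// String returns the readable name, or a numeric fallback for unknown values.
func (c XSSContext) String() string {
	if name, ok := contextNames[c]; ok {
		return name
	}
	return fmt.Sprintf("XSSContext(%d)", int(c))
}

func main() {
	fmt.Println(ContextScriptData) // script_data
	fmt.Println(XSSContext(99))    // XSSContext(99)
}
```

Implementing fmt.Stringer keeps log lines and test failure messages readable without a separate lookup at each call site.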
These are the specific issues called out in community review of prior attempts:
- javascript: and vbscript: URIs classified as ContextScript, not ContextHTMLAttributeURL
- data:text/html, data:application/xhtml+xml, and data:image/svg+xml URIs classified as ContextScript
- <a href="javascript:..."> → ContextScript, but <img src="javascript:..."> → ContextHTMLAttributeURL (browsers don’t execute it)
- <script type="application/json"> classified as ContextScriptData, not ContextScript
- Duplicate type attributes use the first value per HTML5 spec (browsers ignore subsequent dupes)
- srcdoc attribute classified as ContextHTMLBody (renders full HTML)
- style attribute classified as ContextStyle, not ContextHTMLAttribute
- Event handlers (onclick, onerror, etc.) get their own ContextHTMLAttributeEvent
This is intentionally scoped to context classification only. It does not:
- Modify existing files (analyzers.go, request.go, http.go are all untouched)
- Implement the Analyzer interface (see “Integration path” below)
Prior PRs were likely not merged in part because they modified the shared Options struct in analyzers.go to add ResponseBody, renamed existing unexported functions, and mixed core context analysis with canary injection and payload replay logic. This PR avoids all of that.
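The URI-scheme rules listed among the review issues above can be sketched with the standard library alone. This is an assumption-laden illustration, not the PR's implementation: the `executableURLSinks` name is taken from this description, but its key format and contents here are a partial guess, and `hasDangerousScheme` and `classifyURLAttr` are hypothetical helpers that return context names as plain strings to keep the example self-contained.

```go
package main

import (
	"fmt"
	"strings"
)

// executableURLSinks lists "tag attr" pairs where a script-bearing URI is
// treated as executable. Illustrative subset; the PR's map may differ.
var executableURLSinks = map[string]bool{
	"a href":            true,
	"iframe src":        true,
	"form action":       true,
	"button formaction": true,
	"object data":       true,
}

// dangerousPrefixes are the URI prefixes this description treats as
// script-bearing.
var dangerousPrefixes = []string{
	"javascript:",
	"vbscript:",
	"data:text/html",
	"data:application/xhtml+xml",
	"data:image/svg+xml",
}

// hasDangerousScheme does a case-insensitive prefix check after trimming
// surrounding whitespace. It does not handle control characters embedded
// inside the scheme.
func hasDangerousScheme(value string) bool {
	v := strings.ToLower(strings.TrimSpace(value))
	for _, p := range dangerousPrefixes {
		if strings.HasPrefix(v, p) {
			return true
		}
	}
	return false
}

// classifyURLAttr promotes a URL attribute to ContextScript only when the
// URI is script-bearing and the tag+attribute pair is an executable sink.
func classifyURLAttr(tag, attrName, value string) string {
	key := strings.ToLower(tag) + " " + strings.ToLower(attrName)
	if hasDangerousScheme(value) && executableURLSinks[key] {
		return "ContextScript"
	}
	return "ContextHTMLAttributeURL"
}

func main() {
	fmt.Println(classifyURLAttr("a", "href", " JavaScript:alert(1)")) // ContextScript
	fmt.Println(classifyURLAttr("img", "src", "javascript:alert(1)")) // ContextHTMLAttributeURL
}
```

Keying the sink table on the tag and attribute together is what lets the same URI be classified differently in a href and img src.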
The fuzzing pipeline already has the response body available as bodyStr at pkg/protocols/http/request.go:985, in scope when analyzers execute (line 1013). To wire this up:
1. Add a ResponseBody string field to analyzers.Options
2. Pass bodyStr into the options at the existing analyzer call site
3. A new analyzer's Analyze() method calls AnalyzeReflectionContext(options.ResponseBody, marker) and selects payloads based on context
Example usage:
import "github.com/projectdiscovery/nuclei/v3/pkg/fuzz/analyzers/xss"
// responseBody is the HTML response from the server,
// marker is the unique string the fuzzer injected.
ctx, err := xss.AnalyzeReflectionContext(responseBody, marker)
if err != nil {
log.Fatal(err)
}
switch ctx {
case xss.ContextScript:
// use script breakout payloads
case xss.ContextHTMLAttribute:
// use attribute escape payloads
case xss.ContextHTMLAttributeEvent:
// use event handler payloads
case xss.ContextComment:
// use comment breakout payloads
// ... etc
}
To replicate and verify:
# clone and checkout the branch
git clone https://github.com/ZachL111/nuclei.git
cd nuclei
git checkout feat/xss-context-analyzer
# run the xss analyzer tests
go test ./pkg/fuzz/analyzers/xss/... -v -count=1
# verify no regressions in the rest of the fuzz package
go test ./pkg/fuzz/... -count=1
# build and vet
go build ./...
go vet ./pkg/fuzz/...
pkg/fuzz/analyzers/xss/
├── context.go — XSSContext type, iota constants, String()
├── analyzer.go — AnalyzeReflectionContext() + helpers (~340 lines)
└── analyzer_test.go — 63 table-driven test cases (~500 lines)
Zero existing files modified.
Changes made during review before submission:
- TagAttr() is a forward-only iterator on the tokenizer. The original code checked script type and marker in two separate loops, so the second loop saw no attributes. Merged both checks into a single scanAttributes pass. Added 3 test cases to cover <script src="MARKER"> variations.
- Added ping to URL attributes: missed on the first pass. The ping attribute on <a> tags fires a POST to the specified URL when clicked, so it’s a valid URL injection context.
- Added data:application/xhtml+xml to dangerous URI detection: data:text/html was covered, but data:application/xhtml+xml renders and executes script the same way in iframes. Added a test case for it.
- AnalyzeReflectionContext was returning nil error on every ErrorToken, including actual parse failures. Now checks tokenizer.Err() and only swallows io.EOF (normal end of document). Real errors get surfaced to the caller.
- foundType bool in scanAttributes was redundant since scriptType already defaults to "", which maps to executable in the lookup table. Cleaned it up.
- onauxclick, onbeforeinput, onformdata, onslotchange, onsecuritypolicyviolation were missing from the event handler set. Added them with test cases.
- type="text/javascript; charset=utf-8" was failing the exact lookup and getting misclassified as ContextScriptData. Now strips everything after ; before checking. Added a test case.
- urlAttrs, eventHandlers, executableScriptTypes, and contextNames had comments that didn’t follow Go’s godoc convention (comment must start with the entity name). Rewrote them so go doc and linters pick them up correctly.
- Added data:image/svg+xml to dangerous URI detection: SVG data URIs can contain embedded JavaScript (<svg onload=alert(1)>) that executes when rendered in iframe/object/embed. Added test cases for both iframe (ContextScript) and img (ContextHTMLAttributeURL, since browsers block SVG script execution in img tags).
- Fixed a type attribute parser differential: scanAttributes was overwriting scriptType on every type attribute encountered. HTML5 spec says browsers use the first attribute when dupes exist, so <script type="application/json" type="text/javascript"> should be non-executable. Now only records the first type. Added a test case.
- <img src="javascript:..."> was being classified as ContextScript even though browsers don’t execute javascript: in img src. Added an executableURLSinks map that restricts ContextScript promotion to tag+attr pairs that actually execute (a+href, iframe+src, form+action, button+formaction, object+data, etc.). Everything else stays ContextHTMLAttributeURL. Added test cases for img src and ping with javascript: URIs.
- Added vbscript: URI detection, which covers legacy Internet Explorer environments still deployed in corporate settings. Added a test case.
- Added longdesc to URL attributes: longdesc on img/iframe elements can contain navigable URIs. Added a test case.
$ go test ./pkg/fuzz/analyzers/xss/... -v -count=1
=== RUN TestAnalyzeReflectionContext
=== RUN TestAnalyzeReflectionContext/reflection_in_plain_HTML_body_text
=== RUN TestAnalyzeReflectionContext/reflection_in_nested_div_body_text
=== RUN TestAnalyzeReflectionContext/reflection_in_regular_attribute_value
=== RUN TestAnalyzeReflectionContext/reflection_in_class_attribute
=== RUN TestAnalyzeReflectionContext/reflection_in_data-custom_attribute
=== RUN TestAnalyzeReflectionContext/reflection_in_title_attribute
=== RUN TestAnalyzeReflectionContext/reflection_in_href_with_regular_URL
=== RUN TestAnalyzeReflectionContext/reflection_in_src_attribute
=== RUN TestAnalyzeReflectionContext/reflection_in_action_attribute
=== RUN TestAnalyzeReflectionContext/reflection_in_formaction_attribute
=== RUN TestAnalyzeReflectionContext/reflection_in_longdesc_attribute
=== RUN TestAnalyzeReflectionContext/reflection_in_onclick_handler
=== RUN TestAnalyzeReflectionContext/reflection_in_onmouseover_handler
=== RUN TestAnalyzeReflectionContext/reflection_in_onerror_handler
=== RUN TestAnalyzeReflectionContext/reflection_in_onload_handler
=== RUN TestAnalyzeReflectionContext/reflection_in_onauxclick_handler
=== RUN TestAnalyzeReflectionContext/reflection_in_onbeforeinput_handler
=== RUN TestAnalyzeReflectionContext/reflection_in_script_block_with_no_type
=== RUN TestAnalyzeReflectionContext/reflection_in_script_type=text/javascript
=== RUN TestAnalyzeReflectionContext/reflection_in_script_type=module
=== RUN TestAnalyzeReflectionContext/reflection_in_script_type=application/javascript
=== RUN TestAnalyzeReflectionContext/script_type_with_MIME_parameters_still_executable
=== RUN TestAnalyzeReflectionContext/javascript_URI_in_href_must_be_ContextScript
=== RUN TestAnalyzeReflectionContext/javascript_URI_with_whitespace_prefix
=== RUN TestAnalyzeReflectionContext/javascript_URI_case-insensitive
=== RUN TestAnalyzeReflectionContext/data:text/html_URI_in_src
=== RUN TestAnalyzeReflectionContext/data:application/xhtml+xml_URI_in_src
=== RUN TestAnalyzeReflectionContext/data:image/svg+xml_URI_in_iframe_src
=== RUN TestAnalyzeReflectionContext/data:image/svg+xml_URI_in_img_src_does_not_execute
=== RUN TestAnalyzeReflectionContext/vbscript_URI_in_href
=== RUN TestAnalyzeReflectionContext/javascript_URI_in_img_src_does_not_execute
=== RUN TestAnalyzeReflectionContext/javascript_URI_in_ping_does_not_execute
=== RUN TestAnalyzeReflectionContext/reflection_in_ping_attribute
=== RUN TestAnalyzeReflectionContext/reflection_in_script_type=application/json
=== RUN TestAnalyzeReflectionContext/reflection_in_script_type=text/template
=== RUN TestAnalyzeReflectionContext/reflection_in_script_type=text/x-handlebars-template
=== RUN TestAnalyzeReflectionContext/reflection_in_script_type=application/ld+json
=== RUN TestAnalyzeReflectionContext/duplicate_type_attributes_uses_first_per_HTML5_spec
=== RUN TestAnalyzeReflectionContext/reflection_in_style_block
=== RUN TestAnalyzeReflectionContext/reflection_in_style_attribute
=== RUN TestAnalyzeReflectionContext/reflection_in_HTML_comment
=== RUN TestAnalyzeReflectionContext/reflection_in_comment_between_tags
=== RUN TestAnalyzeReflectionContext/reflection_in_srcdoc_attribute
=== RUN TestAnalyzeReflectionContext/case-insensitive_marker_matching_(lowercase_body)
=== RUN TestAnalyzeReflectionContext/case-insensitive_marker_matching_(mixed_case_body)
=== RUN TestAnalyzeReflectionContext/case-insensitive_in_attribute
=== RUN TestAnalyzeReflectionContext/marker_not_found_in_response
=== RUN TestAnalyzeReflectionContext/empty_response_body
=== RUN TestAnalyzeReflectionContext/empty_marker
=== RUN TestAnalyzeReflectionContext/malformed_HTML_with_unclosed_tags
=== RUN TestAnalyzeReflectionContext/malformed_HTML_with_no_tags_at_all
=== RUN TestAnalyzeReflectionContext/malformed_script_tag_not_closed
=== RUN TestAnalyzeReflectionContext/broken_HTML_with_unclosed_attribute_quote
=== RUN TestAnalyzeReflectionContext/broken_HTML_with_missing_closing_quote_but_valid_parse
=== RUN TestAnalyzeReflectionContext/multiple_reflections_returns_first_context
=== RUN TestAnalyzeReflectionContext/reflection_in_self-closing_tag_attribute
=== RUN TestAnalyzeReflectionContext/reflection_in_script_src_attribute
=== RUN TestAnalyzeReflectionContext/reflection_in_script_src_with_type_attribute
=== RUN TestAnalyzeReflectionContext/script_tag_with_src_and_type_but_reflection_in_text
=== RUN TestAnalyzeReflectionContext/non-executable_script_with_marker_in_src_attribute
=== RUN TestAnalyzeReflectionContext/reflection_inside_noscript
=== RUN TestAnalyzeReflectionContext/reflection_inside_textarea
--- PASS: TestAnalyzeReflectionContext (0.04s)
=== RUN TestAnalyzeReflectionContext_NoPanic
--- PASS: TestAnalyzeReflectionContext_NoPanic (0.00s)
=== RUN TestXSSContextString
--- PASS: TestXSSContextString (0.00s)
PASS
ok github.com/projectdiscovery/nuclei/v3/pkg/fuzz/analyzers/xss 0.510s
$ go build ./... # zero errors
$ go vet ./pkg/fuzz/... # zero warnings
- Uses the golang.org/x/net/html tokenizer (no regex)
- go build ./... passes
- go vet ./pkg/fuzz/... passes
- Doc comments follow the // EntityName ... convention
Closes #5838
Zach
@ZachL111
ProjectDiscovery
@projectdiscovery