fix(xss): fix 4 context classification edge cases (#7086) by Gengyscan · Pull Request #7208 · projectdiscovery/nuclei

Gengyscan · 2026-03-13T18:47:57Z

Fix: 4 XSS Context Analyzer Edge Cases

Changes

Based on the XSS context analyzer from #7076, this PR fixes all 4 misclassification edge cases:

1. javascript: URIs in href/action/formaction

Attribute values starting with javascript: are now classified as ContextScript instead of ContextAttribute.

2. JSON script blocks

neo-by-projectdiscovery-dev · 2026-03-13T18:48:09Z

Neo - PR Security Review

No security issues found

Highlights

Commit abf46b5 contains code cleanup/refactoring (net -17 lines across 3 files)
All 4 XSS context analyzer edge case fixes verified intact
Both critical security fixes remain properly implemented

Hardening Notes

Consider adding text/xml and application/xml to the dangerous data URI MIME types list in pkg/fuzz/analyzers/xss/context.go line 95 for defense in depth, as these may be rendered as XHTML in some browser contexts


coderabbitai · 2026-03-13T18:48:21Z

Walkthrough

Adds a new XSS context analyzer (canary‑based reflection detection, context classification, payload selection, replay verification) with extensive tests and benchmarks; introduces thread‑safe random utilities (exported RandStringBytesMask, mutex‑guarded GetRandomInteger) and new response fields on Options; registers the analyzer via side‑effect import and guards analyzer parameter map initialization.

Changes

Cohort / File(s)	Summary
Analyzer core & utilities `pkg/fuzz/analyzers/analyzers.go`	Added package-level `randomMu` (`sync.Mutex`), mutex‑guarded random helpers (`GetRandomInteger`, exported `RandStringBytesMask`), replaced private random usage, and added new `Options` fields: `ResponseBody`, `ResponseHeaders`, `ResponseStatusCode`.
XSS analyzer implementation `pkg/fuzz/analyzers/xss/analyzer.go`	New `Analyzer` type registered as `"xss_context"` with `ApplyInitialTransformation`, `Analyze`, canary generation, character survival detection, context‑aware payload selection, and `replayAndVerify`.
HTML reflection tokenizer/classifier `pkg/fuzz/analyzers/xss/context.go`	New `DetectReflections(body, marker)` and helpers to classify reflections across tag names, attributes (quoted/unquoted, event handlers, `javascript:` URIs, `srcdoc`), scripts/styles/comments/RCDATA; includes `BestReflection`.
XSS types & helpers `pkg/fuzz/analyzers/xss/types.go`	Introduced `Context` enum (`String()/priority()`), `ReflectionInfo`, `CharacterSet`, event handler/script‑attribute maps, `isEventHandler`/`isScriptURLAttribute`, and `normalizeURIScheme`.
XSS tests & benchmarks `pkg/fuzz/analyzers/xss/context_test.go`	Extensive unit tests and benchmarks covering reflection detection, script‑string contexts, payload selection, character survival, HTML/CSP checks, edge cases, and performance.
HTTP integration `pkg/protocols/http/http.go`	Added side‑effect import `_ "github.com/projectdiscovery/nuclei/v3/pkg/fuzz/analyzers/xss"` to register the analyzer and integrated analyzer usage into request flow.
Request fuzzing init guard `pkg/protocols/http/request_fuzz.go`	Ensure `request.Analyzer.Parameters` map is initialized when applying an analyzer to avoid nil‑map assignments.
Module manifest `go.mod`	Module file updated to include new analyzer package dependencies.
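The two defensive patterns summarized in the table (a mutex around a shared random source, and a nil-map guard before writing analyzer parameters) can be sketched as follows. This is a minimal illustration, not the PR's code: `randomIntn` and `ensureParams` are hypothetical names, and only `randomMu` is taken from the summary above.

```go
package main

import (
	"fmt"
	"math/rand"
	"sync"
	"time"
)

var (
	// randomMu guards random, because a *rand.Rand built with rand.New
	// is not safe for concurrent use on its own.
	randomMu sync.Mutex
	random   = rand.New(rand.NewSource(time.Now().UnixNano()))
)

// randomIntn is a hypothetical helper that returns a value in [0, n)
// while holding the mutex.
func randomIntn(n int) int {
	randomMu.Lock()
	defer randomMu.Unlock()
	return random.Intn(n)
}

// ensureParams is a hypothetical guard: writing to a nil map panics in Go,
// so a nil map is replaced with an empty one before any assignment.
func ensureParams(params map[string]interface{}) map[string]interface{} {
	if params == nil {
		return make(map[string]interface{})
	}
	return params
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = randomIntn(100) // safe to call from many goroutines
		}()
	}
	wg.Wait()

	var params map[string]interface{} // nil map
	params = ensureParams(params)
	params["xss_canary"] = "ABC" // no panic after the guard
	fmt.Println(len(params))
}
```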

Sequence Diagram

sequenceDiagram
    participant Client
    participant FuzzEngine as Fuzz Engine
    participant Analyzer as XSS Analyzer
    participant HTMLParser as HTML Parser
    participant Target as Target Server

    Client->>FuzzEngine: start fuzz run
    FuzzEngine->>Analyzer: ApplyInitialTransformation([XSS_CANARY])
    Analyzer->>Analyzer: generateCanary(), store in params
    FuzzEngine->>Target: send fuzzed request
    Target->>FuzzEngine: response (body, headers)
    FuzzEngine->>Analyzer: Analyze(options with response)
    Analyzer->>HTMLParser: DetectReflections(body, canary)
    HTMLParser-->>Analyzer: []ReflectionInfo
    Analyzer->>Analyzer: BestReflection + detectCharacterSurvival
    Analyzer->>Analyzer: selectPayloads(reflection, charset)
    loop for each payload
        Analyzer->>Target: replayAndVerify(payload)
        Target->>Analyzer: response
        Analyzer->>Analyzer: verify unencoded reflection, check CSP
        alt confirmed
            Analyzer->>FuzzEngine: report positive result (details)
        end
    end

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~75 minutes


🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 28.79% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title directly addresses the four context classification edge cases documented in issue `#7086` (javascript: URIs, non-executable script types, case-sensitive reflection detection, and srcdoc handling).
Linked Issues check	✅ Passed	The PR implements all four coding objectives from `#7086`: javascript: URI script context classification [`#7086`], non-executable script type detection [`#7086`], case-insensitive reflection detection [`#7086`], and srcdoc as HTML context [`#7086`]. Additional enhancements align with security review findings.
Out of Scope Changes check	✅ Passed	All changes directly support the four edge-case fixes from `#7086` and their integration. HTTP module imports and request fuzzing guards enable the analyzer registration and proper parameter initialization without introducing unrelated functionality.


coderabbitai

🧹 Nitpick comments (2)

pkg/fuzz/analyzers/xss/context.go (1)
299-312: Pointer aliasing: BestReflection returns pointer to slice element.

The function returns &reflections[i], which points into the backing array of the original slice. Go's garbage collector keeps that array alive while the pointer is reachable, so the pointer cannot dangle, but any later write to the same slice element is visible through it. Since reflections is typically returned from DetectReflections and used immediately, this is likely safe in practice.
♻️ Safer alternative returning a copy
 func BestReflection(reflections []ReflectionInfo) *ReflectionInfo {
 	if len(reflections) == 0 {
 		return nil
 	}
 
-	best := &reflections[0]
+	bestIdx := 0
 	for i := 1; i < len(reflections); i++ {
-		if reflections[i].Context.priority() > best.Context.priority() {
-			best = &reflections[i]
+		if reflections[i].Context.priority() > reflections[bestIdx].Context.priority() {
+			bestIdx = i
 		}
 	}
-	return best
+	result := reflections[bestIdx]
+	return &result
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/fuzz/analyzers/xss/context.go` around lines 299 - 312, BestReflection
currently returns a pointer to a slice element (&reflections[i]) which can alias
the backing array; instead allocate a new ReflectionInfo, copy the chosen
element into it and return its address so the caller gets an independent copy.
Locate BestReflection and change the return to create a local var like selected
:= reflections[i] (or reflections[0] for the first) and return &selected; keep
the same logic comparing Context.priority() so behavior is unchanged.
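The difference between returning a pointer into the slice and returning a pointer to a copy can be seen in a small standalone sketch. The types below are simplified stand-ins for the ones in the PR, and the `priority()` ordering is an assumption made only for this example.

```go
package main

import "fmt"

// Context is a simplified stand-in for the analyzer's context enum.
type Context int

const (
	ContextHTMLText Context = iota
	ContextAttribute
	ContextScript
)

// priority uses the enum value as the rank; this ordering is assumed
// for the example only.
func (c Context) priority() int { return int(c) }

// ReflectionInfo is a reduced version of the struct named in the review.
type ReflectionInfo struct {
	Context Context
	TagName string
}

// bestByPointer returns a pointer into the caller's slice (aliasing).
func bestByPointer(reflections []ReflectionInfo) *ReflectionInfo {
	if len(reflections) == 0 {
		return nil
	}
	best := &reflections[0]
	for i := 1; i < len(reflections); i++ {
		if reflections[i].Context.priority() > best.Context.priority() {
			best = &reflections[i]
		}
	}
	return best
}

// bestByCopy returns a pointer to an independent copy of the best element.
func bestByCopy(reflections []ReflectionInfo) *ReflectionInfo {
	if len(reflections) == 0 {
		return nil
	}
	bestIdx := 0
	for i := 1; i < len(reflections); i++ {
		if reflections[i].Context.priority() > reflections[bestIdx].Context.priority() {
			bestIdx = i
		}
	}
	result := reflections[bestIdx]
	return &result
}

func main() {
	rs := []ReflectionInfo{
		{Context: ContextHTMLText, TagName: "p"},
		{Context: ContextScript, TagName: "script"},
	}
	aliased := bestByPointer(rs)
	copied := bestByCopy(rs)

	rs[1].TagName = "changed" // mutate the slice after selection

	fmt.Println(aliased.TagName, copied.TagName) // prints "changed script"
}
```

The copy costs one struct allocation per call, which is usually negligible next to an HTTP replay.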
pkg/fuzz/analyzers/xss/analyzer.go (1)
218-227: Character survival detection assumes specific canary+chars ordering.

The logic checks for sequential patterns like canary+"<" and canary+"<>". This works because canaryChars is <>"'/ and ApplyInitialTransformation appends it directly after the canary. The checks progressively verify each character's survival by checking cumulative prefixes.

However, the GreaterThan check (line 222) has redundant logic—it checks for both canary+"<>" and canary+">". The second check (canary+">") would match even if < was encoded but > wasn't, which may not be the intended semantic.
♻️ Clarify greater-than detection
 func detectCharacterSurvival(body string, canary string) CharacterSet {
 	return CharacterSet{
 		LessThan:     strings.Contains(body, canary+"<"),
-		GreaterThan:  strings.Contains(body, canary+"<>") || strings.Contains(body, canary+">"),
+		GreaterThan:  strings.Contains(body, canary+">"),
 		DoubleQuote:  strings.Contains(body, canary+`<>"`),
 		SingleQuote:  strings.Contains(body, canary+`<>"'`),
 		ForwardSlash: strings.Contains(body, canary+canaryChars), // full canary+chars survived
 	}
 }
This simplifies the check while still detecting if > survived (regardless of <).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/fuzz/analyzers/xss/analyzer.go` around lines 218 - 227, The GreaterThan
detection in detectCharacterSurvival is redundant and can produce misleading
matches; change the GreaterThan field to test only for the canary immediately
followed by '>' (i.e., strings.Contains(body, canary+">")) rather than checking
both canary+"<>" and canary+">", so GreaterThan reflects whether '>' survived
regardless of '<' and removes the cumulative-prefix check that can produce false
positives.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 20931324-7215-409d-bf89-d19a7bd5ea64

📥 Commits

Reviewing files that changed from the base of the PR and between 979c867 and 696a00e.

📒 Files selected for processing (8)

pkg/fuzz/analyzers/analyzers.go
pkg/fuzz/analyzers/xss/analyzer.go
pkg/fuzz/analyzers/xss/context.go
pkg/fuzz/analyzers/xss/context_test.go
pkg/fuzz/analyzers/xss/types.go
pkg/protocols/http/http.go
pkg/protocols/http/request.go
pkg/protocols/http/request_fuzz.go

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

pkg/fuzz/analyzers/xss/context.go (1)

170-188: Optional: Consolidate duplicate text handling logic.

The RCDATA and regular text branches have identical logic. Consider consolidating:

Proposed refactor

-			} else if inRCDATA {
-				tag := ""
-				if len(tagStack) > 0 {
-					tag = tagStack[len(tagStack)-1]
-				}
-				reflections = append(reflections, ReflectionInfo{
-					Context: ContextHTMLText,
-					TagName: tag,
-				})
-			} else {
+			} else {
+				// Both RCDATA and regular text are HTML text contexts
 				tag := ""
 				if len(tagStack) > 0 {
 					tag = tagStack[len(tagStack)-1]

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/fuzz/analyzers/xss/context.go` around lines 170 - 188, The two branches
handling inRCDATA and the plain text fallback contain identical logic; refactor
by merging them into a single branch: compute tag from tagStack (using tag := ""
and if len(tagStack) > 0 { tag = tagStack[len(tagStack)-1] }) once and then
append the ReflectionInfo with Context: ContextHTMLText and TagName: tag to
reflections, replacing the duplicated code in the inRCDATA and else blocks
(symbols: inRCDATA, tagStack, reflections, ReflectionInfo, ContextHTMLText,
TagName).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/fuzz/analyzers/xss/context.go`:
- Around line 276-297: detectAttrQuoting can mis-match substrings (e.g.,
"dataclass" vs "class"); change the single strings.Index lookup to loop
searching for attrAssign in rawLower and for each match verify a word-boundary
before the attr name (e.g., idx==0 or the preceding rune is not a letter, digit,
'-', '_' or ':', or is whitespace or '<'/'/'), only accepting the match when
that boundary check passes; use the updated index offset when calling
strings.Index again to continue searching; keep the rest of the logic (afterEq
checks and quote determinations) the same and reference the detectAttrQuoting
function, attrAssign, rawLower, idx and afterEq symbols.


ℹ️ Review info

⚙️ Run configuration


Run ID: 05ae5fd8-2ae3-4c1f-8f06-aa66bfbcf24a

📥 Commits

Reviewing files that changed from the base of the PR and between 696a00e and 5804653.

📒 Files selected for processing (2)

pkg/fuzz/analyzers/xss/analyzer.go
pkg/fuzz/analyzers/xss/context.go

🚧 Files skipped from review as they are similar to previous changes (1)

pkg/fuzz/analyzers/xss/analyzer.go

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/fuzz/analyzers/xss/context.go`:
- Around line 202-216: isExecutableScriptType incorrectly treats parameterized
types and an empty essence as executable and omits many legacy JS essences;
change it to reject any type string that contains a ';' (i.e., treat
parameterized types as non-executable), trim and lower-case the raw scriptType
into an essence but return false if the essence is empty, and replace the
current switch with an exact-match whitelist containing all MIME essences from
the spec: text/javascript, text/ecmascript, text/javascript1.0,
text/javascript1.1, text/javascript1.2, text/javascript1.3, text/javascript1.4,
text/javascript1.5, text/jscript, text/livescript, text/x-ecmascript,
text/x-javascript, application/javascript, application/ecmascript,
application/x-ecmascript, application/x-javascript, and the literal "module";
update the logic in isExecutableScriptType to only return true for exact matches
of those essences and false otherwise.
- Around line 92-94: The code currently promotes any attribute value beginning
with "javascript:" to ContextScript; restrict this to only executable URL
attributes by first checking the attribute name (e.g., attrName) against a
whitelist of navigation-target URL attributes ("href", "action", "formaction",
"src", "xlink:href") before setting ctx = ContextScript, following the existing
isEventHandler() pattern; implement the whitelist as a small helper or map
lookup and use case-insensitive comparison (strings.EqualFold) when matching the
attribute name so non-executable attributes like data-* or aria-* are not
upgraded to ContextScript.
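Both findings above can be sketched together. This is an illustration of the review suggestions, not the code that was merged: the function bodies are assumptions, `hasJavaScriptScheme` and `isJavaScriptURIContext` are hypothetical helpers, and the caller is assumed to have already handled a missing or empty `type` attribute, which browsers treat as classic JavaScript.

```go
package main

import (
	"fmt"
	"strings"
)

// executableScriptTypes holds the 16 JavaScript MIME essences listed in the
// review comment, plus the literal "module".
var executableScriptTypes = map[string]struct{}{
	"text/javascript": {}, "text/ecmascript": {},
	"text/javascript1.0": {}, "text/javascript1.1": {}, "text/javascript1.2": {},
	"text/javascript1.3": {}, "text/javascript1.4": {}, "text/javascript1.5": {},
	"text/jscript": {}, "text/livescript": {},
	"text/x-ecmascript": {}, "text/x-javascript": {},
	"application/javascript": {}, "application/ecmascript": {},
	"application/x-ecmascript": {}, "application/x-javascript": {},
	"module": {},
}

// isExecutableScriptType rejects parameterized types and empty essences,
// then requires an exact, case-insensitive match against the whitelist.
func isExecutableScriptType(scriptType string) bool {
	if strings.Contains(scriptType, ";") {
		return false
	}
	essence := strings.ToLower(strings.TrimSpace(scriptType))
	if essence == "" {
		return false
	}
	_, ok := executableScriptTypes[essence]
	return ok
}

// scriptURLAttributes uses the attribute names given in the review comment.
var scriptURLAttributes = map[string]struct{}{
	"href": {}, "action": {}, "formaction": {}, "src": {}, "xlink:href": {},
}

// isScriptURLAttribute matches the attribute name case-insensitively.
func isScriptURLAttribute(name string) bool {
	_, ok := scriptURLAttributes[strings.ToLower(name)]
	return ok
}

// hasJavaScriptScheme (hypothetical) drops tabs and newlines, trims leading
// control characters and spaces, then checks the scheme case-insensitively.
func hasJavaScriptScheme(value string) bool {
	cleaned := strings.Map(func(r rune) rune {
		if r == '\t' || r == '\n' || r == '\r' {
			return -1 // remove the character
		}
		return r
	}, value)
	cleaned = strings.TrimLeftFunc(cleaned, func(r rune) bool { return r <= ' ' })
	return strings.HasPrefix(strings.ToLower(cleaned), "javascript:")
}

// isJavaScriptURIContext gates the script classification on the attribute name.
func isJavaScriptURIContext(attrName, attrValue string) bool {
	return isScriptURLAttribute(attrName) && hasJavaScriptScheme(attrValue)
}

func main() {
	fmt.Println(isExecutableScriptType("text/javascript; charset=utf-8")) // false
	fmt.Println(isExecutableScriptType("application/json"))               // false
	fmt.Println(isJavaScriptURIContext("href", "javascript:alert(1)"))    // true
	fmt.Println(isJavaScriptURIContext("data-url", "javascript:alert(1)")) // false
}
```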

ℹ️ Review info

⚙️ Run configuration


Run ID: 2ae5597c-f505-45c0-a399-588e1044a72e

📥 Commits

Reviewing files that changed from the base of the PR and between 5804653 and d8171af.

📒 Files selected for processing (2)

pkg/fuzz/analyzers/xss/context.go
pkg/fuzz/analyzers/xss/context_test.go

🚧 Files skipped from review as they are similar to previous changes (1)

pkg/fuzz/analyzers/xss/context_test.go

Gengyscan · 2026-03-13T21:26:54Z

Pushed commit 2ae3b64 addressing all CodeRabbit findings plus two additional issues we discovered:

CodeRabbit findings addressed:

1. detectAttrQuoting word boundary (Minor): ✅ Already fixed in d8171af
2. javascript: URI gating (Major): ✅ Added isScriptURLAttribute() whitelist
3. isExecutableScriptType MIME essence (Major): ✅ Full 16-essence WHATWG whitelist + proper essence extraction

Additional fixes:

4. detectCharacterSurvival false negatives (Critical): Characters were concatenated (canary<>"'/) so only < was ever independently detectable. Now each character is interleaved with its own canary prefix.
5. Template literal payload (High): selectPayloads for ContextScriptString now includes ${alert(1)} for backtick template literal injection.

All fixes include corresponding test cases.

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

pkg/fuzz/analyzers/xss/context.go (1)
315-322: Consider adding < to attribute boundary characters.

The boundary check handles common whitespace but may miss the case where the attribute immediately follows the tag name opening (e.g., <divclass=... - malformed but possible). Adding < would handle this edge case.

However, since properly formatted HTML will always have whitespace between the tag name and attributes, this is a minor edge case that likely won't occur in practice.
Optional fix for edge case
 func isAttrBoundary(ch byte) bool {
 	switch ch {
-	case ' ', '\t', '\n', '\r', '/':
+	case ' ', '\t', '\n', '\r', '/', '<':
 		return true
 	default:
 		return false
 	}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/fuzz/analyzers/xss/context.go` around lines 315 - 322, The isAttrBoundary
function currently treats space, tab, newline, carriage return and '/' as
attribute boundary characters; update it to also consider the '<' character as a
boundary to catch malformed cases like "<divclass=..." by adding '<' to the
switch/case list in isAttrBoundary so it returns true for '<' as well.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/fuzz/analyzers/xss/analyzer.go`:
- Around line 218-227: The detectCharacterSurvival function is checking for
canary+char but the canary is generated with the critical character before the
canary (char+canary); update detectCharacterSurvival (and the CharacterSet
fields LessThan, GreaterThan, DoubleQuote, SingleQuote, ForwardSlash) to search
for the pattern where the character precedes the canary (e.g., "<"+canary,
">"+canary, `"`+canary, '\''+canary, "/"+canary) so detection matches the actual
generated canary sequence; alternatively, if you prefer, change the canary
generation to place the character after the canary but be consistent between
canary creation and the detectCharacterSurvival checks.

In `@pkg/fuzz/analyzers/xss/types.go`:
- Around line 197-207: The scriptURLAttributes map currently includes the
"poster" key which leads to false positives; remove the "poster" entry from the
scriptURLAttributes map in pkg/fuzz/analyzers/xss/types.go (the map named
scriptURLAttributes) so poster is no longer treated as a javascript-executing
attribute, and run/update any affected tests or comments that assumed poster was
executable.


ℹ️ Review info

⚙️ Run configuration


Run ID: e639f060-119d-4285-802a-bff2c49a0c66

📥 Commits

Reviewing files that changed from the base of the PR and between d8171af and 2ae3b64.

📒 Files selected for processing (4)

pkg/fuzz/analyzers/xss/analyzer.go
pkg/fuzz/analyzers/xss/context.go
pkg/fuzz/analyzers/xss/context_test.go
pkg/fuzz/analyzers/xss/types.go

coderabbitai · 2026-03-13T21:33:01Z

+// detectCharacterSurvival checks which XSS-critical characters survived server-side encoding
+func detectCharacterSurvival(body string, canary string) CharacterSet {
+	return CharacterSet{
+		LessThan:     strings.Contains(body, canary+"<"),
+		GreaterThan:  strings.Contains(body, canary+">"),
+		DoubleQuote:  strings.Contains(body, canary+`"`),
+		SingleQuote:  strings.Contains(body, canary+`'`),
+		ForwardSlash: strings.Contains(body, canary+"/"),
+	}
+}


⚠️ Potential issue | 🟠 Major

Bug: Character survival detection pattern doesn't match canary generation.

The canary is generated as canary + "<" + canary + ">" + ... which produces patterns like ABC<ABC>ABC".... However, detectCharacterSurvival looks for canary + ">" (i.e., ABC>) which doesn't exist in the generated string.

The actual pattern has >canary (e.g., >ABC), not canary> (e.g., ABC>).

Only LessThan and ForwardSlash will be detected correctly:

canary< → exists (ABC<ABC)

canary> → doesn't exist (pattern is >ABC, not ABC>)

canary" → doesn't exist (pattern is "ABC, not ABC")

canary' → doesn't exist (pattern is 'ABC, not ABC')

canary/ → exists (ABC/)

Proposed fix

Either change the canary generation to put the character after canary:

 func (a *Analyzer) ApplyInitialTransformation(data string, params map[string]interface{}) string {
 	if strings.Contains(data, "[XSS_CANARY]") {
 		canary := generateCanary()
 		if params != nil {
 			params["xss_canary"] = canary
 		}
-		// Each character gets its own canary prefix for independent detection.
-		canaryWithChars := canary + "<" + canary + ">" + canary + `"` + canary + "'" + canary + "/"
+		// Each character is immediately followed by canary for independent detection.
+		canaryWithChars := "<" + canary + ">" + canary + `"` + canary + "'" + canary + "/" + canary
 		data = strings.ReplaceAll(data, "[XSS_CANARY]", canaryWithChars)
 	}

And update detection to look for char + canary:

 func detectCharacterSurvival(body string, canary string) CharacterSet {
 	return CharacterSet{
-		LessThan:     strings.Contains(body, canary+"<"),
-		GreaterThan:  strings.Contains(body, canary+">"),
-		DoubleQuote:  strings.Contains(body, canary+`"`),
-		SingleQuote:  strings.Contains(body, canary+`'`),
-		ForwardSlash: strings.Contains(body, canary+"/"),
+		LessThan:     strings.Contains(body, "<"+canary),
+		GreaterThan:  strings.Contains(body, ">"+canary),
+		DoubleQuote:  strings.Contains(body, `"`+canary),
+		SingleQuote:  strings.Contains(body, `'`+canary),
+		ForwardSlash: strings.Contains(body, "/"+canary),
 	}
 }


Respectfully, this is a false positive. The interleaved canary pattern does work correctly.

The generated string is: canary<canary>canary"canary'canary/

For example with canary=ABC: ABC<ABC>ABC"ABC'ABC/

detectCharacterSurvival checks:

strings.Contains(body, canary+"<") — matches at position 0 (ABC<) ✅

strings.Contains(body, canary+">") — matches at position 4 (ABC>) ✅

strings.Contains(body, canary+'"') — matches at position 8 (ABC") ✅

strings.Contains(body, canary+"'") — matches at position 12 (ABC') ✅

strings.Contains(body, canary+"/") — matches at position 16 (ABC/) ✅

Each character is preceded by its own canary in the interleaved string, so canary + char is independently findable for all 5 characters. This was the whole point of the interleaving fix — the old code concatenated all chars after a single canary (canary<>"'/), making only < detectable.

@Gengyscan, you are absolutely right — I apologize for the incorrect review comment. Tracing through the generated string manually confirms your explanation:

With canary = "ABC", canaryWithChars = ABC<ABC>ABC"ABC'ABC/

Each canary + char pair is independently present:

ABC< at position 0 ✅

ABC> at position 4 ✅

ABC" at position 8 ✅

ABC' at position 12 ✅

ABC/ at position 16 ✅

The interleaving design is sound — each character is preceded by its own copy of the canary, so detectCharacterSurvival correctly finds all five patterns via canary + char. My original analysis was in error. This comment is a false positive.

✏️ Learnings added

Learnt from: Gengyscan Repo: projectdiscovery/nuclei PR: 7208 File: pkg/fuzz/analyzers/xss/analyzer.go:218-227 Timestamp: 2026-03-13T21:42:14.374Z Learning: In projectdiscovery/nuclei pkg/fuzz/analyzers/xss/analyzer.go, the XSS canary is generated as an interleaved string: `canary + "<" + canary + ">" + canary + '"' + canary + "'" + canary + "/"`. This means `detectCharacterSurvival` correctly uses `canary+char` patterns (e.g., `strings.Contains(body, canary+"<")`), because each special character is immediately preceded by its own copy of the canary in the generated string. This is intentional and correct — do not flag it as a mismatch between canary generation and detection.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

🧠 Learnings used

Learnt from: ayuxsec Repo: projectdiscovery/nuclei PR: 6769 File: pkg/installer/template.go:217-217 Timestamp: 2026-01-15T10:44:22.186Z Learning: In projectdiscovery/gologger, the Print() function writes to stderr and Silent() writes to stdout. During reviews, ensure code uses Print() for error/output going to stderr and Silent() for stdout; verify the chosen function matches the intended destination and that logs are directed to the correct stream for proper piping/redirects.

Learnt from: orbin123 Repo: projectdiscovery/nuclei PR: 6800 File: pkg/fuzz/analyzers/xss/analyzer.go:100-136 Timestamp: 2026-02-02T18:03:19.054Z Learning: In analyzers under pkg/fuzz/analyzers/, the Component interface is stateful and should be reused per request. For each request, call gr.Component.SetValue(gr.Key, payload) and then gr.Component.Rebuild() without cloning the component. This avoids state leakage; this pattern is used by existing analyzers (e.g., time_delay) and should be followed for all analyzers in this directory.

Learnt from: orbin123 Repo: projectdiscovery/nuclei PR: 6800 File: pkg/fuzz/analyzers/xss/context_detector.go:7-75 Timestamp: 2026-02-02T18:08:18.736Z Learning: In code reviews of the XSS context analyzer (pkg/fuzz/analyzers/xss/), preserve the design intent: use an exact-canary search during detection, because if the entire canary is heavily encoded there may be no exploitable reflection. The detection phase should only identify reflection points where the canary appears with sufficient unencoded characters to indicate exploitation potential. The implementation should follow the two-phase approach: 1) detect, 2) analyze available characters, 3) select payloads. Ensure changes do not alter this core design and verify this two-phase flow across files in pkg/fuzz/analyzers/xss/.

Learnt from: k-xeno Repo: projectdiscovery/nuclei PR: 6824 File: pkg/fuzz/parts.go:138-138 Timestamp: 2026-02-11T11:55:52.862Z Learning: In Go code related to fuzz analyzers (e.g., XSS analyzer in pkg/fuzz/analyzers), extract the canary from FuzzGenerated.Value (the final reflected string) rather than from GeneratedRequest.OriginalPayload. This ensures compatibility across all fuzzing modes, including KV-mode, and makes the canary source consistent and robust for all fuzz runs. Apply this pattern to all files under the pkg/fuzz directory that implement or reference canary extraction for fuzzed inputs.

@Gengyscan, you're absolutely right — I apologize for the incorrect analysis. Tracing through the full string makes it clear:

With canary ABC, the generated payload is:

ABC<ABC>ABC"ABC'ABC/

And detectCharacterSurvival correctly finds all five patterns:

Check	Substring sought	Present in string	Result
LessThan	ABC<	at pos 0	✅
GreaterThan	ABC>	at pos 4	✅
DoubleQuote	ABC"	at pos 8	✅
SingleQuote	ABC'	at pos 12	✅
ForwardSlash	ABC/	at pos 16	✅

Each character is immediately preceded by its own copy of the canary, so canary+char is independently detectable for all five cases. My original comment was wrong. The implementation is correct.

✏️ Learnings added

Learnt from: Gengyscan Repo: projectdiscovery/nuclei PR: 7208 File: pkg/fuzz/analyzers/xss/analyzer.go:218-227 Timestamp: 2026-03-13T21:42:12.740Z Learning: In pkg/fuzz/analyzers/xss/analyzer.go (projectdiscovery/nuclei), the canary payload is generated as `canary+"<"+canary+">"+canary+'"'+canary+"'"+canary+"/"`, producing an interleaved string like `ABC<ABC>ABC"ABC'ABC/`. detectCharacterSurvival correctly searches for `canary+char` (e.g., `ABC<`, `ABC>`, `ABC"`, `ABC'`, `ABC/`), all of which exist independently in the interleaved string. Do not flag this as a mismatch between generation and detection — it is intentional and correct.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/fuzz/analyzers/xss/types.go`:
- Around line 197-206: Remove the obsolete "codebase" key from the
scriptURLAttributes map literal in types.go: locate the scriptURLAttributes
variable and delete the "codebase": {} entry (and update the surrounding
comma/whitespace as needed), and optionally adjust the comment if it claims
these attributes execute javascript: URIs; run/update tests that might assert
scriptURLAttributes contents to avoid false positives.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8627cd1e-873b-416d-8d1a-56adc87815a2

📥 Commits

Reviewing files that changed from the base of the PR and between 2ae3b64 and ad292c5.

📒 Files selected for processing (1)

pkg/fuzz/analyzers/xss/types.go

Gengyscan · 2026-03-13T22:12:51Z

Pushed commit 5a0fac6 addressing all 3 Neo security review findings:

1. data: URI XSS bypass (Critical): data:text/html and data:image/svg+xml URIs in navigation-target attributes are now detected as ContextScript.

2. Tab/newline injection bypass (Major): Added normalizeURIScheme(), which strips internal ASCII whitespace (\t, \n, \r, \f) before scheme prefix matching. This mirrors browser URL parsing, which removes tab, LF, and CR from the input before the scheme is parsed.

3. vbscript: scheme (Minor): Added detection for legacy IE vbscript: URIs.

All fixes include tests (6 new test cases: 3 positive, 1 negative for data:, 2 whitespace variants).

Gengyscan · 2026-03-17T18:27:29Z

@coderabbitai resume

coderabbitai · 2026-03-17T18:27:36Z

✅ Actions performed

Reviews resumed.

Gengyscan · 2026-03-17T18:32:36Z

/claim #7086

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

pkg/fuzz/analyzers/xss/context_test.go (1)

739-781: Add a regression case for percent-encoded control-whitespace in scheme prefixes.

Please add a case like href="java%0ascript:...#nucleiXSScanary" (and %09) so decode+whitespace normalization behavior is locked by tests.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/fuzz/analyzers/xss/context_test.go` around lines 739 - 781, Update the
TestDetectReflections_EncodedSchemesAndXHTMLDataURI table to include regression
cases for percent-encoded control-whitespace in scheme prefixes: add at least
two new entries (e.g., name "percent-encoded LF in javascript scheme" with body
containing href="java%0ascript:alert(1)#nucleiXSScanary" and name
"percent-encoded TAB in javascript scheme" with
href="java%09script:alert(1)#nucleiXSScanary") and attrName "href"; keep them
inside the same loop so DetectReflections is called and the existing assertion
that a reflection with AttrName == "href" and Context == ContextScript still
holds. Ensure the marker "nucleiXSScanary" remains present in each body so the
detection logic triggers.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/fuzz/analyzers/xss/types.go`:
- Around line 223-239: The normalizeURIScheme function can reintroduce
whitespace via percent-decoding (url.PathUnescape) which bypasses the initial
uriSchemeWhitespaceRemover; update normalizeURIScheme so that after decoding the
scheme prefix (decodedPrefix) you run the same whitespace removal and
trimming/lowercasing (using uriSchemeWhitespaceRemover.Replace,
strings.TrimSpace and strings.ToLower) on decodedPrefix before concatenating
with normalized[colon:], ensuring any decoded \t/\n/\r/\f characters are
stripped and the scheme check remains reliable.
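
The decode-then-strip-again order described in this comment can be sketched as follows. This is an illustration under assumptions, not the code in types.go: `normalizeScheme` and `schemeWhitespace` are hypothetical stand-ins for normalizeURIScheme and uriSchemeWhitespaceRemover, and the sketch lower-cases the whole value because it is only used for classification.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// schemeWhitespace removes tab, LF, CR and FF, the characters named in the review.
var schemeWhitespace = strings.NewReplacer("\t", "", "\n", "", "\r", "", "\f", "")

// normalizeScheme is a hypothetical stand-in for normalizeURIScheme.
func normalizeScheme(val string) string {
	normalized := strings.ToLower(strings.TrimSpace(schemeWhitespace.Replace(val)))
	colon := strings.IndexByte(normalized, ':')
	if colon < 0 {
		return normalized // no scheme separator, nothing to decode
	}
	prefix, err := url.PathUnescape(normalized[:colon])
	if err != nil {
		return normalized // malformed escape: keep the undecoded form
	}
	// Decoding can reintroduce whitespace (%09, %0a, ...), so strip again.
	prefix = strings.ToLower(strings.TrimSpace(schemeWhitespace.Replace(prefix)))
	return prefix + normalized[colon:]
}

func main() {
	for _, v := range []string{
		"java%0ascript:alert(1)",
		"java\tscript:alert(1)",
		"https://example.com/a%20b",
	} {
		fmt.Printf("%q -> %q\n", v, normalizeScheme(v))
	}
}
```

Only the part before the first colon is decoded, so percent-escapes in the rest of the value are left untouched.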

ℹ️ Review info

⚙️ Run configuration

Run ID: f9582e0f-6fd0-4b5f-bd70-daea80ad9c7e

📥 Commits

Reviewing files that changed from the base of the PR and between 2ae3b64 and 5136b78.

📒 Files selected for processing (3)

pkg/fuzz/analyzers/xss/context.go
pkg/fuzz/analyzers/xss/context_test.go
pkg/fuzz/analyzers/xss/types.go

)

…dependent char survival checks

…CDATA branch

- restrict javascript: URI promotion to executable URL attributes only (href, action, formaction, src, xlink:href, data, poster, codebase)
- expand isExecutableScriptType to full WHATWG 16-essence whitelist + module, extract MIME essence before matching (text/javascript;charset=utf-8 is valid)
- fix detectCharacterSurvival: interleave canary per character for independent detection instead of concatenating all critical chars
- add template literal payload (dollar-brace interpolation) for ContextScriptString
- add tests for all four fixes

…javascript: URIs in video poster

The poster attribute on <video> specifies a preview image URL. Modern browsers do not execute javascript: URIs in the poster attribute - it is treated as a URL to an image resource, similar to <img src>. Including it caused false positives where reflections in poster attributes were incorrectly classified as executable script context. Addresses CodeRabbit review feedback.

…olver

codebase on <object> is a base-URL resolver for relative URIs, not a navigation target. Modern browsers do not execute javascript: URIs in codebase. Per MDN, genuine javascript: navigation targets are href, action, formaction, and iframe src. Firefox bug #1973900 confirms codebase is effectively dead. Addresses CodeRabbit follow-up suggestion.

Address Neo security review findings:

1. Tab/newline injection bypass (CWE-79): Browsers strip ASCII whitespace (tab, LF, CR) from URI schemes during parsing, so 'java\tscript:' executes as 'javascript:'. Added normalizeURIScheme() helper that strips internal control whitespace before prefix check.
2. data: URI XSS bypass (CWE-79): data:text/html and data:image/svg+xml URIs execute scripts when loaded in navigation-target attributes (iframe src, object data, a href). Now detected as ContextScript.
3. vbscript: URI scheme (CWE-79): Added detection for vbscript: scheme used in IE/legacy enterprise environments.

Tests added for all three vectors plus negative case (data:image/png should not be classified as script context).

…l+xml detection

- normalizeURIScheme now decodes percent-encoded prefix via url.PathUnescape
- Added application/xhtml+xml to executable data: URI MIME types
- Added 3 tests for encoded schemes and XHTML data URIs

Addresses neo-bot review: percent-encoded javascript:/data: bypass (CWE-79) and missing xhtml+xml MIME type detection.

…IScheme

coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/fuzz/analyzers/xss/analyzer.go`:
- Around line 79-82: The current canary presence and survival checks use
case-sensitive string containment (e.g., checking if body contains canary via
strings.Contains), which can miss reflections that change casing; update those
checks (the initial "if !strings.Contains(body, canary)" check and the later
survival/reflection checks around the canary comparison) to perform
case-insensitive containment comparisons, for example by normalizing both body
and canary to the same case with strings.ToLower before calling
strings.Contains (the standard library has no case-folding Contains variant),
so detection does not fail when the reflected canary's case is altered.
- Around line 171-173: The CSP note currently checks
hasCSP(options.ResponseHeaders) which can reflect the initial response; change
this to check the headers from the replay response instead (call hasCSP on the
replay response headers variable used elsewhere in this analyzer, e.g.,
replay.ResponseHeaders or replayHeaders) inside the same if block so the CSP
note reflects the replay response rather than options.ResponseHeaders.

In `@pkg/fuzz/analyzers/xss/context.go`:
- Around line 96-99: The code currently uses strings.Contains on the entire
data: URI (normalizedVal) which can match MIME substrings in the payload body;
instead extract only the media type portion immediately after "data:" (take the
substring after "data:", split on the first ',' to isolate the mediatype+params,
then split on ';' to drop parameters) and perform a case-insensitive
exact/prefix match against the known executable types ("text/html",
"image/svg+xml", "application/xhtml+xml") before setting ctx = ContextScript;
update the logic around normalizedVal (the data: branch) to use this parsed
mediaType and use strings.EqualFold or strings.HasPrefix as appropriate to avoid
misclassifying payload contents.
- Around line 289-321: detectAttrQuoting currently searches for attrName + "="
literally, missing cases like "href = ..." and misclassifying context; update
the search to locate the attrName using rawLower and isAttrBoundary for the char
before the name, then after finding attrName advance past the name and any
intervening whitespace, ensure the next non-space char is '=', then advance past
'=' and any whitespace to inspect the first char after the equals for '"' or
'\''; keep using rawToken, rawLower, attrName, isAttrBoundary and return the
same byte/boolean semantics when quoted vs unquoted.

ℹ️ Review info

⚙️ Run configuration

Run ID: 3a2d7f0e-e337-47b2-a2f7-762a0149de70

📥 Commits

Reviewing files that changed from the base of the PR and between 52f1d86 and d826c01.

📒 Files selected for processing (8)

pkg/fuzz/analyzers/analyzers.go
pkg/fuzz/analyzers/xss/analyzer.go
pkg/fuzz/analyzers/xss/context.go
pkg/fuzz/analyzers/xss/context_test.go
pkg/fuzz/analyzers/xss/types.go
pkg/protocols/http/http.go
pkg/protocols/http/request.go
pkg/protocols/http/request_fuzz.go

🚧 Files skipped from review as they are similar to previous changes (3)

pkg/protocols/http/request_fuzz.go
pkg/fuzz/analyzers/xss/types.go
pkg/fuzz/analyzers/xss/context_test.go

coderabbitai · 2026-03-18T04:00:33Z

+		if hasCSP(options.ResponseHeaders) {
+			details += " [note: CSP header present, may limit exploitability]"
+		}


⚠️ Potential issue | 🟡 Minor

CSP note should use replay response headers, not the initial response headers.

At Line 171, hasCSP(options.ResponseHeaders) can report stale/mismatched CSP status for the replay request.

🔧 Proposed fix

-		if hasCSP(options.ResponseHeaders) {
+		if hasCSP(resp.Header) {
 			details += " [note: CSP header present, may limit exploitability]"
 		}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@pkg/fuzz/analyzers/xss/analyzer.go` around lines 171 - 173, The CSP note currently checks hasCSP(options.ResponseHeaders) which can reflect the initial response; change this to check the headers from the replay response instead (call hasCSP on the replay response headers variable used elsewhere in this analyzer, e.g., replay.ResponseHeaders or replayHeaders) inside the same if block so the CSP note reflects the replay response rather than options.ResponseHeaders.
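
To make the distinction concrete, here is a small sketch of building the note from the replay response headers rather than the initial ones. It assumes the headers are available as an `http.Header`; `hasCSPHeader` and `detailsWithCSPNote` are hypothetical names, and the real hasCSP helper may check more than this single header.

```go
package main

import (
	"fmt"
	"net/http"
)

// hasCSPHeader is a hypothetical stand-in for the analyzer's hasCSP helper.
// It only looks at the enforcing Content-Security-Policy header.
func hasCSPHeader(h http.Header) bool {
	return h.Get("Content-Security-Policy") != ""
}

// detailsWithCSPNote appends the note based on whichever headers it is given;
// the review asks that these be the replay response headers.
func detailsWithCSPNote(details string, replayHeaders http.Header) string {
	if hasCSPHeader(replayHeaders) {
		details += " [note: CSP header present, may limit exploitability]"
	}
	return details
}

func main() {
	initial := http.Header{} // first response: no CSP
	replay := http.Header{}  // replay response: CSP present
	replay.Set("Content-Security-Policy", "default-src 'self'")

	fmt.Println(detailsWithCSPNote("reflected in attribute", initial))
	fmt.Println(detailsWithCSPNote("reflected in attribute", replay))
}
```

The two calls in `main` differ only in which response's headers are passed, which is exactly the mismatch the comment warns about.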

…survival

Addresses CodeRabbit review on commit d826c01:

- Early exit: strings.ToLower on both body and canary
- detectCharacterSurvival: use bodyL/canaryL for case-insensitive comparisons

…uoting detection

Addresses CodeRabbit review on commit d826c01:

- data: URI: parse media-type before comma/semicolon; prevents false positives when payload data contains MIME strings (e.g. data:text/plain,...text/html...)
- detectAttrQuoting: skip optional whitespace between attr name and = so href = "val" is correctly identified as double-quoted

Gengyscan · 2026-03-18T18:37:43Z

All 4 findings from the latest CodeRabbit review (commit d826c01) have been addressed:

analyzer.go — commit 66cf8d9

1. Early exit case-insensitive: strings.ToLower applied to both body and canary before the strings.Contains check.
2. detectCharacterSurvival case-insensitive: introduced bodyL/canaryL lowercase vars; all five character checks (<, >, ", ', /) now use them.

context.go — commit c554dbc
3. data: URI MIME parsing: media-type is extracted before the first , or ;, preventing false positives when the payload data portion contains a MIME string like text/html.
4. detectAttrQuoting whitespace: the search loop now skips optional whitespace between the attribute name and =, so href = "val" is handled correctly.
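
As an illustration of item 3, the media-type extraction can be sketched like this. It is a simplified sketch, not the code in context.go: `isExecutableDataURI` and `executableDataTypes` are hypothetical names, and the input is assumed to be already scheme-normalized.

```go
package main

import (
	"fmt"
	"strings"
)

// executableDataTypes lists the MIME types this thread treats as script-capable.
var executableDataTypes = map[string]struct{}{
	"text/html":             {},
	"image/svg+xml":         {},
	"application/xhtml+xml": {},
}

// isExecutableDataURI is a hypothetical helper that inspects only the media
// type of a data: URI, never the payload after the comma.
func isExecutableDataURI(val string) bool {
	lower := strings.ToLower(strings.TrimSpace(val))
	if !strings.HasPrefix(lower, "data:") {
		return false
	}
	rest := lower[len("data:"):]
	// Everything before the first ',' is the media type plus parameters.
	header, _, _ := strings.Cut(rest, ",")
	// Drop parameters such as ;charset=utf-8 or ;base64.
	mediaType, _, _ := strings.Cut(header, ";")
	_, found := executableDataTypes[strings.TrimSpace(mediaType)]
	return found
}

func main() {
	for _, v := range []string{
		"data:text/html,<script>alert(1)</script>",
		"data:image/svg+xml;base64,PHN2Zz4=",
		"data:text/plain,text/html",
		"data:image/png;base64,iVBORw0KGgo=",
	} {
		fmt.Printf("%-45s executable=%v\n", v, isExecutableDataURI(v))
	}
}
```

The third input is the false-positive case from the review: the string text/html appears only in the payload, so an exact match on the parsed media type rejects it.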

Minor (CSP): Properly threading replay response headers through replayAndVerify requires a broader refactor; tracked as a follow-up to keep this PR focused.

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (2)

pkg/fuzz/analyzers/xss/analyzer.go (1)

171-173: ⚠️ Potential issue | 🟡 Minor

Use replay response headers for CSP note, not initial response headers.

Line [171] checks hasCSP(options.ResponseHeaders), which may not match the replayed request that confirmed reflection.

🔧 Proposed fix

-		if hasCSP(options.ResponseHeaders) {
+		if hasCSP(resp.Header) {
 			details += " [note: CSP header present, may limit exploitability]"
 		}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/fuzz/analyzers/xss/analyzer.go` around lines 171 - 173, The CSP note
currently checks hasCSP(options.ResponseHeaders) which inspects the initial
response headers; change this to check the headers from the replayed/confirmed
response instead (e.g., call hasCSP on the replay response object such as
replayResponse.ResponseHeaders or replayResp.Headers used in the confirmation
step) so the "[note: CSP header present...]" annotation reflects the actual
headers seen when the reflection was confirmed.

pkg/fuzz/analyzers/xss/context.go (1)

306-313: ⚠️ Potential issue | 🟠 Major

detectAttrQuoting still misclassifies quoted attrs when whitespace appears after =.

Line [322] inspects the byte immediately after =, so inputs like href = "x" get treated as unquoted. This can downgrade context detection and payload choice.

🔧 Proposed fix (skip full ASCII whitespace around and after `=`)

-			for i < len(rawLower) && (rawLower[i] == ' ' || rawLower[i] == '\t') {
+			for i < len(rawLower) && (rawLower[i] == ' ' || rawLower[i] == '\t' || rawLower[i] == '\n' || rawLower[i] == '\r' || rawLower[i] == '\f') {
 				i++
 			}
@@
 	afterEq := idx + eqOffset + 1
+	for afterEq < len(rawLower) && (rawLower[afterEq] == ' ' || rawLower[afterEq] == '\t' || rawLower[afterEq] == '\n' || rawLower[afterEq] == '\r' || rawLower[afterEq] == '\f') {
+		afterEq++
+	}
 	if afterEq >= len(rawToken) {
 		return '"', false
 	}

Also applies to: 322-333

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/fuzz/analyzers/xss/context.go` around lines 306 - 313, detectAttrQuoting
misclassifies attributes with spaces after '=' (e.g. `href = "x"`) because it
only skips whitespace before '=' but then checks the byte immediately after '='
for a quote; update detectAttrQuoting to, once it finds '=' (using variables
rawLower, absPos, attrLower and setting idx and eqOffset), advance a new pointer
(e.g. j) past any ASCII whitespace (spaces and tabs) after the '=' before
inspecting for a quote character, and ensure eqOffset still points to the '='
position relative to absPos while quote detection uses the post-'=' pointer so
quoted attributes with spaces around '=' are detected correctly.
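
A compact sketch of the whitespace handling requested here, assuming a simplified signature. `attrQuote`, `isASCIISpace` and `isNameBoundary` are hypothetical names, and unlike the real detectAttrQuoting this version returns a zero byte for unquoted values and does not try to skip attribute names that occur inside another attribute's quoted value.

```go
package main

import (
	"fmt"
	"strings"
)

func isASCIISpace(c byte) bool {
	return c == ' ' || c == '\t' || c == '\n' || c == '\r' || c == '\f'
}

// isNameBoundary reports whether c may directly precede an attribute name.
func isNameBoundary(c byte) bool {
	return isASCIISpace(c) || c == '/' || c == '"' || c == '\''
}

// attrQuote (hypothetical) returns the quote byte used for attrName's value
// and whether the value is quoted at all.
func attrQuote(rawToken, attrName string) (byte, bool) {
	rawLower := strings.ToLower(rawToken)
	attrLower := strings.ToLower(attrName)
	searchStart := 0
	for searchStart < len(rawLower) {
		pos := strings.Index(rawLower[searchStart:], attrLower)
		if pos < 0 {
			return 0, false
		}
		absPos := searchStart + pos
		searchStart = absPos + 1
		if absPos > 0 && !isNameBoundary(rawLower[absPos-1]) {
			continue // part of a longer name such as data-href
		}
		i := absPos + len(attrLower)
		for i < len(rawLower) && isASCIISpace(rawLower[i]) {
			i++ // whitespace between the name and '='
		}
		if i >= len(rawLower) || rawLower[i] != '=' {
			continue
		}
		i++ // step over '='
		for i < len(rawLower) && isASCIISpace(rawLower[i]) {
			i++ // whitespace between '=' and the value
		}
		if i < len(rawLower) && (rawLower[i] == '"' || rawLower[i] == '\'') {
			return rawLower[i], true
		}
		return 0, false
	}
	return 0, false
}

func main() {
	q, ok := attrQuote(`<a href = "x">`, "href")
	fmt.Printf("%q %v\n", q, ok)
}
```

All indexing is done on the lower-cased copy, which avoids offset mismatches if lower-casing ever changes the byte length of the token.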

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/fuzz/analyzers/xss/analyzer.go`:
- Around line 131-143: The code mutates component state with
gr.Component.SetValue(gr.Key, payload) but only registers the restore defer
after gr.Component.Rebuild(), so if Rebuild() fails the mutated state leaks; to
fix, install the defer to restore the original value (using
gr.Component.SetValue(gr.Key, gr.OriginalValue)) immediately after a successful
SetValue call and before calling gr.Component.Rebuild(), so the original state
is always restored even when Rebuild() returns an error.

In `@pkg/fuzz/analyzers/xss/context.go`:
- Around line 24-25: Remove the dead inRCDATA state: delete the inRCDATA
variable declaration and remove the assignments to it (the writes that set
inRCDATA) in pkg/fuzz/analyzers/xss/context.go, and ensure any html.TextToken
handling does not reference or expect that state (adjust the html.TextToken
branch to use the existing context state instead). This eliminates the unused
local that causes a compile error while preserving correct token handling.

ℹ️ Review info

⚙️ Run configuration

Run ID: 0ba2edd6-f7d8-4023-8957-781a7fd27816

📥 Commits

Reviewing files that changed from the base of the PR and between d826c01 and c554dbc.

📒 Files selected for processing (2)

pkg/fuzz/analyzers/xss/analyzer.go
pkg/fuzz/analyzers/xss/context.go

- analyzer.go: move defer restore before Rebuild() to prevent component state mutation leaks when Rebuild fails early (finding #6)
- context.go: remove inRCDATA dead variable and its associated writes; both RCDATA and regular text are correctly handled as ContextHTMLText by the existing else branch (finding #7)
- types.go: remove rcdataElements map, now unreferenced after inRCDATA dead code removal
- findings #1-#5 (isExecutableScriptType MIME whitelist, normalizeURIScheme decode whitespace strip, case-insensitive canary checks, data: URI media-type parsing, detectAttrQuoting word boundary) were already addressed in prior commits

Gengyscan · 2026-03-18T19:29:52Z

Addressed all remaining CodeRabbit findings in commit abf46b5e:

Already fixed (from previous commits, CR comments referenced stale code):

isExecutableScriptType — full MIME whitelist + param stripping via ; split ✓
normalizeURIScheme — whitespace re-stripped after percent-decode ✓
Case-insensitive canary — strings.ToLower on both early exit and detectCharacterSurvival ✓
data: URI — media-type parsed before ,/; only ✓
detectAttrQuoting — word boundary check + whitespace tolerance around = ✓

Fixed in this commit:

analyzer.go:143 — moved defer restore immediately after SetValue, before Rebuild(). A Rebuild() failure no longer leaks the mutated component state.
context.go:25 — removed dead inRCDATA variable (written but never read) and unused rcdataElements map from types.go.
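
The ordering fix in analyzer.go can be illustrated with a stand-in component. This is a sketch under assumptions: `component` and `replay` are hypothetical, and the real Component interface in pkg/fuzz has different method signatures.

```go
package main

import (
	"errors"
	"fmt"
)

// component is a hypothetical stand-in for the stateful fuzz Component.
type component struct {
	values      map[string]string
	failRebuild bool
}

func (c *component) SetValue(key, value string) error {
	c.values[key] = value
	return nil
}

func (c *component) Rebuild() error {
	if c.failRebuild {
		return errors.New("rebuild failed")
	}
	return nil
}

// replay mutates the component, then restores it on every return path.
func replay(c *component, key, original, payload string) error {
	if err := c.SetValue(key, payload); err != nil {
		return err
	}
	// Registered before Rebuild so that a Rebuild error still restores the value.
	defer func() { _ = c.SetValue(key, original) }()

	if err := c.Rebuild(); err != nil {
		return fmt.Errorf("rebuild: %w", err)
	}
	// ... send the rebuilt request and inspect the response here ...
	return nil
}

func main() {
	c := &component{values: map[string]string{"q": "orig"}, failRebuild: true}
	err := replay(c, "q", "orig", "<svg onload=1>")
	fmt.Println(err, c.values["q"])
}
```

Had the defer been registered after Rebuild(), the early return on error would skip it and the payload would stay in the shared component for the next request.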

All tests pass (go test ./pkg/fuzz/... -timeout 120s).

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (2)

pkg/fuzz/analyzers/xss/analyzer.go (1)

171-173: ⚠️ Potential issue | 🟡 Minor

Use replay response headers for CSP note.

This still checks options.ResponseHeaders (initial response), so the note can be stale/mismatched for the replay request result.

🔧 Proposed fix

-		if hasCSP(options.ResponseHeaders) {
+		if hasCSP(resp.Header) {
 			details += " [note: CSP header present, may limit exploitability]"
 		}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/fuzz/analyzers/xss/analyzer.go` around lines 171 - 173, The CSP note
currently inspects the initial response headers via options.ResponseHeaders;
change it to inspect the headers from the replay result so the note reflects the
replayed response. Replace the hasCSP(options.ResponseHeaders) check and the
details append to use the replay response headers variable (the replay/result
response object used in this analyzer, e.g. replayResp.ResponseHeaders or
replayResult.ResponseHeaders) while keeping the hasCSP(...) call and the details
string unchanged otherwise so the message accurately reflects the replay
outcome.

pkg/fuzz/analyzers/xss/context.go (1)

298-303: ⚠️ Potential issue | 🟠 Major

Quoted attributes are still misdetected when whitespace follows =.

detectAttrQuoting handles spaces before =, but it reads the byte immediately after = without skipping whitespace. Cases like href = "x" are treated as unquoted, which can downgrade context accuracy and payload choice.

🔧 Proposed fix

 	for {
 		pos := strings.Index(rawLower[searchStart:], attrLower)
 		if pos < 0 {
 			break
 		}
 		absPos := searchStart + pos
 		if absPos == 0 || isAttrBoundary(rawLower[absPos-1]) {
 			// skip optional whitespace after attr name, then require =
 			i := absPos + len(attrLower)
-			for i < len(rawLower) && (rawLower[i] == ' ' || rawLower[i] == '\t') {
+			for i < len(rawLower) && (rawLower[i] == ' ' || rawLower[i] == '\t' || rawLower[i] == '\n' || rawLower[i] == '\r' || rawLower[i] == '\f') {
 				i++
 			}
 			if i < len(rawLower) && rawLower[i] == '=' {
 				idx = absPos
 				eqOffset = i - absPos
 				break
 			}
 		}
 		searchStart = absPos + 1
 	}
@@
 	afterEq := idx + eqOffset + 1
+	for afterEq < len(rawToken) && (rawToken[afterEq] == ' ' || rawToken[afterEq] == '\t' || rawToken[afterEq] == '\n' || rawToken[afterEq] == '\r' || rawToken[afterEq] == '\f') {
+		afterEq++
+	}
 	if afterEq >= len(rawToken) {
 		return '"', false
 	}

Also applies to: 312-317

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/fuzz/analyzers/xss/context.go` around lines 298 - 303, detectAttrQuoting
currently skips whitespace before '=' but when it finds '=' it immediately
inspects the next byte and misclassifies cases like `href = "x"` as unquoted;
update the logic in detectAttrQuoting to advance i past any spaces/tabs after
the '=' before inspecting the next byte so it correctly detects quoted
attributes (i.e., after setting idx and eqOffset, skip whitespace: while i <
len(rawLower) && (rawLower[i] == ' ' || rawLower[i] == '\t') { i++ } then check
rawLower[i] for quotes), and apply the same whitespace-after-equals fix to the
analogous check around the other occurrence referenced (the block at the second
spot ~312-317) so both places treat `=` followed by spaces then a quote as
quoted attributes.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/fuzz/analyzers/xss/context.go`:
- Around line 207-225: The code currently strips parameters from the MIME type
(variable normalized) before checking types, causing inputs like "module;foo" to
be treated as the special non-MIME token "module"; to fix, perform an ASCII
case-insensitive check for the exact token "module" (e.g.,
strings.EqualFold(normalized, "module") or compare lowercased) immediately
before the block that trims parameters and, if it matches, return true; only
then proceed to strip parameters and run the existing switch that lists MIME
types.

ℹ️ Review info

⚙️ Run configuration

Run ID: 829bb8ae-e546-4ac6-a2df-b0d0e7cbf4c0

📥 Commits

Reviewing files that changed from the base of the PR and between c554dbc and abf46b5.

📒 Files selected for processing (3)

pkg/fuzz/analyzers/xss/analyzer.go
pkg/fuzz/analyzers/xss/context.go
pkg/fuzz/analyzers/xss/types.go

🚧 Files skipped from review as they are similar to previous changes (1)

pkg/fuzz/analyzers/xss/types.go

coderabbitai · 2026-03-18T19:36:22Z

+	// Extract MIME essence by stripping parameters after ';'.
+	// Per WHATWG spec, "text/javascript; charset=utf-8" has essence "text/javascript" → executable.
+	if idx := strings.IndexByte(normalized, ';'); idx >= 0 {
+		normalized = strings.TrimSpace(normalized[:idx])
+		if normalized == "" {
+			return false // e.g. ";charset=utf-8" — invalid MIME type, not executable
+		}
+	}
+
+	switch normalized {
+	case "text/javascript", "text/ecmascript",
+		"text/javascript1.0", "text/javascript1.1", "text/javascript1.2",
+		"text/javascript1.3", "text/javascript1.4", "text/javascript1.5",
+		"text/jscript", "text/livescript",
+		"text/x-ecmascript", "text/x-javascript",
+		"application/javascript", "application/ecmascript",
+		"application/x-ecmascript", "application/x-javascript",
+		"module":
+		return true


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

According to the WHATWG HTML Standard, is <script type="module;foo"> executable, or must the value be an exact ASCII case-insensitive match of module?

💡 Result:

According to the WHATWG HTML Standard, <script type="module;foo"> is NOT executable as a module script. The value must be an exact ASCII case-insensitive match of "module".

Citations:

1: https://html.spec.whatwg.org/dev/scripting.html

2: https://html.spec.whatwg.org/multipage/scripting.html

3: https://html.spec.whatwg.org/multipage/scripting-1.html

🏁 Script executed:

cat -n pkg/fuzz/analyzers/xss/context.go | sed -n '190,230p'

Repository: projectdiscovery/nuclei

Length of output: 1578

type="module;foo" is incorrectly classified as executable.

The current code universally strips parameters after ; before checking the script type. For MIME types like "text/javascript; charset=utf-8", this is correct. However, module is not a MIME type—according to the WHATWG HTML Standard, it requires an exact ASCII case-insensitive match. The current logic would incorrectly classify type="module;foo" as executable when it is not.

Fix: Check for "module" before stripping parameters.

Proposed fix

 func isExecutableScriptType(hasType bool, scriptType string) bool {
 	if !hasType {
 		return true
 	}
 	normalized := strings.ToLower(strings.TrimSpace(scriptType))
 	if normalized == "" {
 		return true
 	}
+	// Module scripts require exact token match.
+	if normalized == "module" {
+		return true
+	}
 	// Extract MIME essence by stripping parameters after ';'.
 	// Per WHATWG spec, "text/javascript; charset=utf-8" has essence "text/javascript" → executable.
 	if idx := strings.IndexByte(normalized, ';'); idx >= 0 {
 		normalized = strings.TrimSpace(normalized[:idx])
 		if normalized == "" {
 			return false // e.g. ";charset=utf-8" - invalid MIME type, not executable
 		}
 	}
 	switch normalized {
 	case "text/javascript", "text/ecmascript",
 		"text/javascript1.0", "text/javascript1.1", "text/javascript1.2",
 		"text/javascript1.3", "text/javascript1.4", "text/javascript1.5",
 		"text/jscript", "text/livescript",
 		"text/x-ecmascript", "text/x-javascript",
 		"application/javascript", "application/ecmascript",
-		"application/x-ecmascript", "application/x-javascript",
-		"module":
+		"application/x-ecmascript", "application/x-javascript":
 		return true
 	default:
 		return false
 	}
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@pkg/fuzz/analyzers/xss/context.go` around lines 207 - 225, The code currently strips parameters from the MIME type (variable normalized) before checking types, causing inputs like "module;foo" to be treated as the special non-MIME token "module"; to fix, perform an ASCII case-insensitive check for the exact token "module" (e.g., strings.EqualFold(normalized, "module") or compare lowercased) immediately before the block that trims parameters and, if it matches, return true; only then proceed to strip parameters and run the existing switch that lists MIME types.

Mzack9999 · 2026-03-20T14:36:42Z

Closing this PR as the edge cases reported in Issue #7086 targeted the XSS context analyzer from PR #7076, which was not merged. The XSS context analyzer has been implemented via PR #7164 which correctly handles all reported edge cases by design. This PR also includes extensive reformatting of core HTTP protocol files. Thank you for your contribution.

auto-assign Bot requested a review from dwisiswant0 March 13, 2026 18:48

coderabbitai Bot reviewed Mar 13, 2026

Comment thread pkg/fuzz/analyzers/xss/context.go

coderabbitai Bot reviewed Mar 13, 2026

Comment thread pkg/fuzz/analyzers/xss/context.go Outdated

Comment thread pkg/fuzz/analyzers/xss/context.go

coderabbitai Bot reviewed Mar 13, 2026

Gengyscan commented Mar 13, 2026

Comment thread pkg/fuzz/analyzers/xss/types.go Outdated

coderabbitai Bot reviewed Mar 13, 2026

Comment thread pkg/fuzz/analyzers/xss/types.go Outdated

neo-by-projectdiscovery-dev Bot reviewed Mar 13, 2026

Comment thread pkg/fuzz/analyzers/xss/context.go Outdated

Comment thread pkg/fuzz/analyzers/xss/context.go Outdated

Comment thread pkg/fuzz/analyzers/xss/context.go Outdated

neo-by-projectdiscovery-dev Bot reviewed Mar 13, 2026

Comment thread pkg/fuzz/analyzers/xss/context.go

Comment thread pkg/fuzz/analyzers/xss/context.go Outdated

coderabbitai Bot reviewed Mar 17, 2026

Comment thread pkg/fuzz/analyzers/xss/types.go Outdated

Gengyscan and others added 9 commits March 18, 2026 03:38

ba7eacc fix(xss): fix 4 context classification edge cases (projectdiscovery#7086)

4541511 refactor: address review nitpicks - copy-return in BestReflection, independent char survival checks

d895bd0 fix(xss): word boundary check in detectAttrQuoting, merge duplicate RCDATA branch

d826c01 fix: properly handle whitespace after percent decoding in normalizeURIScheme

Gengyscan force-pushed the fix/xss-context-edge-cases branch from 52f1d86 to d826c01 Compare March 18, 2026 03:51

coderabbitai Bot reviewed Mar 18, 2026

Gengyscan added 2 commits March 18, 2026 19:37

66cf8d9 fix(xss): case-insensitive canary checks in early exit and character survival

Addresses CodeRabbit review on commit d826c01:

- Early exit: strings.ToLower on both body and canary
- detectCharacterSurvival: use bodyL/canaryL for case-insensitive comparisons
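As a rough illustration of the early-exit change described in this commit, the sketch below lower-cases both the response body and the canary before searching, so a server that changes the case of reflected input does not cause the analyzer to give up early. The helper name canaryReflected and the sample canary value are hypothetical and are not taken from the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// canaryReflected is a hypothetical helper: it reports whether the canary
// appears anywhere in the body, ignoring case differences introduced by the
// server (for example, upper-casing reflected input).
func canaryReflected(body, canary string) bool {
	if canary == "" {
		return false
	}
	return strings.Contains(strings.ToLower(body), strings.ToLower(canary))
}

func main() {
	body := "<p>Hello NUCLEI7X3Q</p>"

	// The canary was sent in lower case but reflected in upper case.
	fmt.Println(canaryReflected(body, "nuclei7x3q")) // true

	// No reflection at all: the analyzer could exit early here.
	fmt.Println(canaryReflected(body, "absent")) // false
}
```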

Gengyscan mentioned this pull request Mar 18, 2026

XSS Context Analyzer misclassifies javascript: URIs and JSON script blocks #7086

Closed

1 task

coderabbitai Bot reviewed Mar 18, 2026

Comment thread pkg/fuzz/analyzers/xss/analyzer.go Outdated

Comment thread pkg/fuzz/analyzers/xss/context.go Outdated

coderabbitai Bot reviewed Mar 18, 2026

Mzack9999 closed this Mar 20, 2026

mendarb mentioned this pull request Mar 20, 2026

fix(fuzz): harden XSS context analyzer edge cases (#7086) #7279

Closed

5 tasks

Check	Substring sought	Present in string	Result
`LessThan`	`ABC<`	at pos 0	✅
`GreaterThan`	`ABC>`	at pos 4	✅
`DoubleQuote`	`ABC"`	at pos 8	✅
`SingleQuote`	`ABC'`	at pos 12	✅
`ForwardSlash`	`ABC/`	at pos 16	✅
