Replaces github.com/BishopFox/jsluice (which depends on github.com/smacker/go-tree-sitter, a CGO library) with gotreesitter, a pure-Go tree-sitter runtime that requires no C compiler or CGO toolchain.
Resolves #1367
| File | Change |
|---|---|
| `go.mod` | Removed jsluice + go-tree-sitter; added gotreesitter (pure Go) |
| `pkg/utils/jsextract.go` | New: tree-sitter AST-based JS endpoint extraction (fetch, XHR, jQuery, location, window.open, import, string literals) |
| `pkg/utils/jsluice.go` | Rewritten: same `ExtractJsluiceEndpoints` API, now calls pure-Go extractor |
| `pkg/utils/jsluice_test.go` | Removed build tags (works everywhere now) |
| `pkg/utils/jsextract_test.go` | New: comprehensive tests + benchmarks |
| `pkg/engine/parser/parser_generic.go` | Removed `//go:build !(386 \|\| windows)` constraint |
| `pkg/engine/parser/parser_nojs.go` | Deleted (no longer needed; pure Go works on all platforms) |
Other attempts at this issue used goja (a JS interpreter) or regex patterns. This PR uses a proper tree-sitter JavaScript grammar parsed by a pure-Go runtime.
All existing katana tests pass:
ok github.com/projectdiscovery/katana/pkg/engine/parser 0.013s
ok github.com/projectdiscovery/katana/pkg/utils 0.017s
(all packages pass — full output omitted for brevity)
Performance benchmarks — gotreesitter (pure Go) vs jsluice (CGO):
Same machine, same inputs (Intel Core Ultra 9 285, Linux amd64):
| Input | gotreesitter (preloaded) | jsluice (CGO) | Speedup | Alloc Reduction |
|---|---|---|---|---|
| Small (430B) | 216μs / 190 allocs / 14KB | 408μs / 1,036 allocs / 44KB | 1.9x faster | 5.5x fewer allocs |
| Medium (1.5KB) | 1.04ms / 575 allocs / 47KB | 6.77ms / 2,396 allocs / 106KB | 6.5x faster | 4.2x fewer allocs |
| Large (8.1KB) | 10.4ms / 2,641 allocs / 266KB | 33.1ms / 11,775 allocs / 510KB | 3.2x faster | 4.5x fewer allocs |
Correctness parity with jsluice:
| Input | Overlap | go-only | jsluice-only | Notes |
|---|---|---|---|---|
| Small (430B) | 11 | 0 | 0 | 100% match |
| Medium (1.5KB) | 22 | 0 | 6 | see note below |
| Large (8.1KB) | 85 | 0 | 31 | see note below |
Note on jsluice-only URLs: All jsluice-only results are string concatenation fragments. For example, jsluice extracts `"/api/resource/"` from `"/api/resource/" + id`, while gotreesitter resolves the full expression as `"/api/resource/EXPR"` (with a placeholder for the dynamic part). gotreesitter reports no URLs that jsluice lacks (0 go-only), and each jsluice-only fragment is covered by its resolved concatenation. The fragment-vs-resolved behavior is a design choice; happy to adjust if the team prefers fragment extraction.
Cross-platform: now works everywhere
Previously, jsluice functionality was disabled on Windows and 386 via build tags (`//go:build !(386 || windows)`). With pure Go, the `--jsluice` flag now works on all platforms: no CGO, no C compiler, no cross-compilation toolchain needed.
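As a quick check of which targets the removed constraint excluded, the expression can be evaluated with the standard `go/build/constraint` package. This is a sketch: `builtFor` is a hypothetical helper, and it treats only GOOS and GOARCH as satisfied tags, whereas the real toolchain also sets tags such as `cgo` and release tags.

```go
package main

import (
	"fmt"
	"go/build/constraint"
)

// builtFor reports whether a file carrying the given //go:build line would
// be compiled for goos/goarch. Simplification: only the OS and architecture
// names count as satisfied tags.
func builtFor(line, goos, goarch string) (bool, error) {
	expr, err := constraint.Parse(line)
	if err != nil {
		return false, err
	}
	return expr.Eval(func(tag string) bool {
		return tag == goos || tag == goarch
	}), nil
}

func main() {
	const line = "//go:build !(386 || windows)"
	targets := [][2]string{
		{"linux", "amd64"},
		{"linux", "386"},
		{"windows", "amd64"},
	}
	for _, t := range targets {
		ok, err := builtFor(line, t[0], t[1])
		if err != nil {
			fmt.Println("parse error:", err)
			return
		}
		fmt.Printf("%s/%s built: %v\n", t[0], t[1], ok)
	}
}
```

Under the old constraint only the first target compiles the file; with the constraint removed, all three do.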
/claim #1367
Oscar Villavicencio
@odvcencio