This PR provides a complete, production-hardened fix for issue #819 where tlsx hangs indefinitely during long scans (~25k+ targets), resulting in truncated JSON output and resource exhaustion.
Unlike previous fixes that addressed only the core handshake timeout, this solution also closes the remaining goroutine, context, and file-descriptor leak paths described below.
After analyzing the original issue and existing PR attempts (#886, #926, #938), we identified four distinct timeout bugs that collectively caused the hang:
ztls.tlsHandshakeWithTimeout (CRITICAL)
Partially fixed by PR #938, but a goroutine leak remained.
// BEFORE (PR #938): errChan not drained in all paths
select {
case <-ctx.Done():
    _ = rawConn.Close()
    return err // ← Goroutine still blocked on Handshake()
case err := <-errChan:
    // ...
}
// AFTER: Explicit drain prevents accumulation
select {
case <-ctx.Done():
    _ = rawConn.Close()
    <-errChan // ← Always drain to prevent leak
    return err
case err := <-errChan:
    // ...
}
Why it matters: Over 25k targets, even a 1% leak rate leaves 250 orphaned goroutines. Combined with the other leak paths, this exhausted resources.
OpenSSL context leak (pkg/tlsx/openssl/openssl.go): new fix, this PR only
// BEFORE: defer cancel() in loop = leak
for _, v := range toEnumerate {
    ctx, cancel := context.WithTimeout(context.TODO(), timeout)
    defer cancel() // ← Never executed until function returns!
    // ...
}
// AFTER: Immediate cancel per iteration
for _, v := range toEnumerate {
    ctx, cancel := context.WithTimeout(context.TODO(), timeout)
    // ... operation ...
    cancel() // ← Explicit call, not defer
}
JARM pool acquire timeout (pkg/tlsx/jarm/jarm.go): new fix, this PR only
// BEFORE: context.TODO() never times out
conn, err := pool.Acquire(context.TODO())
// AFTER: Timeout context prevents indefinite block
ctx, cancel := context.WithTimeout(context.Background(), timeout)
conn, err := pool.Acquire(ctx)
cancel() // ← Release the timer as soon as Acquire returns
File writer Close (pkg/output/file_writer.go): enhanced in this PR with a flush guarantee
// BEFORE: Early return on Flush() error = fd leak
func (w *fileWriter) Close() error {
    if err := w.writer.Flush(); err != nil {
        return err // ← File never closed!
    }
    return w.file.Close()
}
// AFTER: Always close file, report flush error
func (w *fileWriter) Close() error {
    flushErr := w.writer.Flush()
    _ = w.file.Sync() // best effort; the flush and close errors take priority
    closeErr := w.file.Close()
    if flushErr != nil {
        return flushErr // ← File closed, error reported
    }
    return closeErr
}
Why it matters: The original issue showed output ending mid-JSON:
{"subject_cn":
This happened because buffered data was never flushed when the process hung. Our fix ensures every line is complete before exit.
| File | Change |
|---|---|
| pkg/tlsx/ztls/ztls.go | tlsHandshakeWithTimeout with guaranteed errChan drain; ... context.TODO()) |
| pkg/tlsx/tls/tls.go | HandshakeContext() with per-attempt timeout |
| pkg/tlsx/openssl/openssl.go | cancel() called immediately, not deferred |
| pkg/tlsx/jarm/jarm.go | |
| pkg/output/file_writer.go | |

This PR includes 5 comprehensive tests that verify timeout behavior under realistic conditions:
TestHandshakeTimeoutWithUnresponsiveServer (ztls): Simulates hosts that accept the connection but never respond.
TestHandshakeTimeoutWithSlowServer (ztls): Exact reproduction of issue #819, where the server reads the ClientHello but never sends a response.
TestGoroutineCleanupOnTimeout (ztls): Runs 5 consecutive timeout scenarios and verifies there is no goroutine accumulation.
TestHandshakeContextTimeoutWithUnresponsiveServer (tls): Verifies the ctls client respects the timeout.
TestGoroutineCleanupOnHandshakeTimeout (tls): High-concurrency cleanup test that verifies no leaks after repeated timeouts.
# Build succeeds
$ go build -v -ldflags '-s -w' -o "tlsx" cmd/tlsx/main.go
# All timeout tests pass
$ go test -v ./pkg/tlsx/tls/... ./pkg/tlsx/ztls/... -run "TestHandshake|TestGoroutine" -timeout 120s
=== RUN TestHandshakeTimeoutWithUnresponsiveServer
handshake_timeout_test.go:74: handshake correctly timed out after 2.001319291s
--- PASS: TestHandshakeTimeoutWithUnresponsiveServer (2.00s)
=== RUN TestHandshakeTimeoutWithSlowServer
handshake_timeout_test.go:132: slow-server handshake correctly timed out after 2.001091667s
--- PASS: TestHandshakeTimeoutWithSlowServer (2.00s)
=== RUN TestGoroutineCleanupOnTimeout
handshake_timeout_test.go:194: goroutine cleanup verified - no leaks detected
--- PASS: TestGoroutineCleanupOnTimeout (2.61s)
=== RUN TestHandshakeContextTimeoutWithUnresponsiveServer
handshake_timeout_test.go:70: handshake correctly timed out after 2.001146125s
--- PASS: TestHandshakeContextTimeoutWithUnresponsiveServer (2.00s)
=== RUN TestGoroutineCleanupOnHandshakeTimeout
handshake_timeout_test.go:188: goroutine cleanup verified - no leaks detected
--- PASS: TestGoroutineCleanupOnHandshakeTimeout (2.61s)
PASS
| Feature | PR #886 | PR #926 | PR #938 | This PR |
|---|---|---|---|---|
| ztls handshake timeout | ✅ | ✅ | ⚠️ Partial | ✅ Goroutine-safe |
| ztls cipher enum timeout | ✅ | ✅ | ✅ | ✅ + Config clone |
| tls cipher enum timeout | ✅ | ❌ | ✅ | ✅ |
| OpenSSL context leak | ⚠️ Partial | ❌ | ❌ | ✅ Fixed |
| JARM timeout | ❌ | ❌ | ❌ | ✅ Fixed |
| File writer mutex | ❌ | ✅ | ✅ | ✅ + Flush guarantee |
| Goroutine drain | ❌ | ❌ | ⚠️ Partial | ✅ Comprehensive |
| Regression tests | 2 | 0 | 3 | 5 |
Closes #819 /claim #819