NATS JetStream Consumer Replication and Fetch Failures

Detects NATS JetStream consumer replication issues and fetch failures that reduce message delivery reliability when consumers are configured with multiple replicas. This includes cases where a consumer with a replica count greater than 1 fails to retrieve messages because the requested batch size exceeds the number of available messages, leading to message loss, delivery delays, and inconsistent consumer state across the cluster.

This rule helps identify problems such as:

  • FetchNoWait operations requesting batch sizes larger than available message counts
  • Consumer replicas returning zero messages despite having outstanding acknowledgments
  • Pending message counts dropping to zero while expected messages remain undelivered
  • Consumer state inconsistencies when replica count exceeds 1 (typically with 3 replicas)
  • Batch processing failures in multi-replica consumer configurations
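
The core symptom in the list above can be written as a small predicate. This is only an illustrative sketch, not the logic of the detection rule itself: the `FetchObservation` record, its field names, and `is_suspect` are hypothetical and do not come from the rule file or from any NATS client API.

```python
from dataclasses import dataclass


@dataclass
class FetchObservation:
    """One FetchNoWait call as seen by a client (hypothetical record shape)."""
    replicas: int     # replica count configured on the consumer
    batch: int        # batch size requested by the fetch
    returned: int     # number of messages the fetch actually returned
    num_pending: int  # messages the consumer reported as pending before the fetch


def is_suspect(obs: FetchObservation) -> bool:
    """Return True when a fetch matches the symptom described above.

    The pattern: a replicated consumer is asked for more messages than are
    available, and returns nothing even though some messages are pending.
    """
    return (
        obs.replicas > 1
        and obs.num_pending > 0
        and obs.batch > obs.num_pending
        and obs.returned == 0
    )


if __name__ == "__main__":
    # Three replicas, 5 messages pending, batch of 100 requested, none returned.
    print(is_suspect(FetchObservation(replicas=3, batch=100, returned=0, num_pending=5)))
```

Under these assumptions, the same fetch against a single-replica consumer, or a fetch that returns the pending messages, would not be flagged.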

/claim #77 /close #77

Updated categories and tags

📋 Reproduction Steps

Run the reproduction script, then run the detection rule against the generated log:

# Make the script executable and run it
chmod +x run.sh
./run.sh

# Execute detection rule
cat test.log | preq -r fetchnowait-replica-bug.yaml -d

Reproducible test setup (maintainers invited): cre-nats
Live CRE detection: CRE Playground

https://github.com/user-attachments/assets/c89a9635-297d-4ab7-a23d-58ccc59ce593

Claim

Total prize pool: $200
Total paid: $0
Status: Pending
Submitted: June 15, 2025
Last updated: June 15, 2025

Contributors

SaikiranSurapalli (@SaikiranSurapalli17): 100%

Sponsors

Prequel (@prequel-dev): $200