Summary by mrge
Implemented a caching mechanism for agent execution that stores and reuses successful task plans based on task description, URL, and DOM state. This optimization reduces redundant LLM calls and improves performance for repeated tasks.
Performance Improvements
- Added plan caching in the `Planner` class to check for cached plans before making LLM calls
- Enhanced `MessageManager` to track the last used plan for caching successful executions
- Created cache utilities for storing and retrieving plans with proper fingerprinting
/claim #763
Problem
Previously, the planner invoked the LLM for every task execution, even if the task had already been executed under identical conditions. This led to:
- Repetitive and unnecessary API calls
- Increased execution time
- Higher infrastructure and usage costs
- Inefficient agent behavior for frequently repeated tasks
Solution
A content-aware caching layer was implemented, which stores and retrieves task plans based on a combination of:
- The task description
- The current URL
- A normalized version of the DOM structure
Key components of the solution include:
1. Deterministic Plan Fingerprinting
- A normalization function reduces noise in the DOM content to make fingerprinting more reliable.
- The fingerprint is a SHA-256 hash of the normalized DOM, the task description, and the URL (see the fingerprinting sketch after this list).
2. Integration with the Planner
- The `Planner` class checks the cache before making LLM calls (see the integration sketch after this list).
- Cached plans are only saved when they pass validation and match the expected format.
3. Safe and Efficient Cache Management
- Plans are saved using atomic file operations to prevent corruption during concurrent access.
- A cleanup system enforces expiration (default: 7 days) and a maximum cache size (default: 100 MB).
- Expired files are removed first; if the cache still exceeds its size limit, the oldest remaining files are evicted (see the cache-management sketch after this list).
4. Validation and Robustness
- Each plan is validated to ensure it includes the required structure and supported action types (see the validation sketch after this list).
- Plans with unsupported or incomplete fields are filtered out.
- Errors while reading from or writing to the cache are handled gracefully and logged for visibility.
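The four components can be illustrated with short Python sketches. First, fingerprinting (item 1): a minimal version that assumes a simple whitespace-collapsing normalizer; the PR's actual normalization rules may strip more than this.

```python
import hashlib
import re


def normalize_dom(dom: str) -> str:
    """Reduce noise so equivalent pages hash identically.

    Sketch only: collapse whitespace runs and lowercase. The real
    normalization in this PR may apply additional rules.
    """
    return re.sub(r"\s+", " ", dom).strip().lower()


def plan_fingerprint(task: str, url: str, dom: str) -> str:
    """SHA-256 over the task description, URL, and normalized DOM."""
    payload = "\n".join([task, url, normalize_dom(dom)])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```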
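Validation (item 4) might look like the following. The plan schema (`steps` and `action` keys) and the action set are assumptions for illustration, not the PR's exact format.

```python
SUPPORTED_ACTIONS = {"click", "type", "navigate", "scroll"}  # illustrative set


def validate_plan(plan: dict) -> bool:
    """Accept only plans with the expected structure and known action types."""
    steps = plan.get("steps")
    if not isinstance(steps, list) or not steps:
        return False
    for step in steps:
        # Reject unsupported or incomplete steps rather than caching them.
        if not isinstance(step, dict) or step.get("action") not in SUPPORTED_ACTIONS:
            return False
    return True
```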
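Cache management (item 3) hinges on two details: writing through a temp file plus `os.replace` so a concurrent reader never sees a half-written plan, and a cleanup pass that honors the 7-day and 100 MB defaults. A sketch, with the cache directory as an assumed location:

```python
import json
import os
import tempfile
import time
from pathlib import Path

CACHE_DIR = Path("~/.cache/agent-plans").expanduser()  # assumed location
MAX_AGE_S = 7 * 24 * 3600       # 7-day expiration (PR default)
MAX_SIZE = 100 * 1024 ** 2      # 100 MB cap (PR default)


def save_plan(key: str, plan: dict) -> None:
    """Write to a temp file, then atomically rename into place."""
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=CACHE_DIR)
    with os.fdopen(fd, "w") as f:
        json.dump(plan, f)
    os.replace(tmp, CACHE_DIR / f"{key}.json")  # atomic swap, no torn reads


def load_plan(key: str) -> dict | None:
    """Return the cached plan, or None on a miss or unreadable file."""
    try:
        return json.loads((CACHE_DIR / f"{key}.json").read_text())
    except (OSError, ValueError):
        return None  # treat corruption like a cache miss


def cleanup_cache() -> None:
    """Drop expired files first, then evict oldest until under the size cap."""
    now = time.time()
    files = sorted(CACHE_DIR.glob("*.json"), key=lambda p: p.stat().st_mtime)
    kept = []
    for p in files:
        if now - p.stat().st_mtime > MAX_AGE_S:
            p.unlink(missing_ok=True)  # expired
        else:
            kept.append(p)
    total = sum(p.stat().st_size for p in kept)
    for p in kept:  # oldest first, thanks to the mtime sort
        if total <= MAX_SIZE:
            break
        total -= p.stat().st_size
        p.unlink(missing_ok=True)
```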
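Finally, the `Planner` integration (item 2) ties the pieces together. Apart from the `Planner` name, everything here is a hypothetical composition of the helpers sketched above; the real class and its LLM interface are richer.

```python
class Planner:
    """Sketch of the cache-first planning path (method names assumed)."""

    def __init__(self, llm):
        self.llm = llm

    async def plan(self, task: str, url: str, dom: str) -> dict:
        key = plan_fingerprint(task, url, dom)
        cached = load_plan(key)
        if cached is not None and validate_plan(cached):
            return cached  # reuse a previously validated plan, no LLM call
        plan = await self.llm.generate_plan(task, url, dom)  # hypothetical API
        if validate_plan(plan):  # only well-formed plans are cached
            save_plan(key, plan)
            cleanup_cache()
        return plan
```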
Performance Impact
- Significantly reduces the number of LLM calls for repeated tasks
- Improves task execution speed
- Decreases system and API resource consumption
- Provides a smoother and more scalable experience in repetitive automation workflows
Testing
- Verified that identical inputs consistently return the same cached plan
- Confirmed that expired plans are removed and cache size limits are respected
- Ensured proper fallback behavior when cached plans are invalid or missing
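For the first check, a minimal pytest-style sketch using the hypothetical fingerprint helpers above: inputs that differ only in DOM whitespace should map to the same cache entry.

```python
def test_identical_inputs_share_a_fingerprint():
    # Whitespace noise is normalized away, so these two DOM snapshots
    # produce the same cache key and therefore hit the same cached plan.
    key1 = plan_fingerprint("log in", "https://example.com", "<body>  Hello </body>")
    key2 = plan_fingerprint("log in", "https://example.com", "<body> Hello  </body>")
    assert key1 == key2
```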
Benefits
- Faster and more efficient execution of automation tasks
- Reduced dependency on LLM for routine operations
- Lower operational costs, especially in high-frequency use cases
- Clean separation of cache logic from planning logic, making the system easier to maintain
Future Improvements
- Introduce DOM-level change detection for smarter cache invalidation
- Support versioning of cached plans tied to LLM updates
- Add cache monitoring and reporting capabilities
- Make cache behavior more configurable by environment or task type