Summary by mrge
Implemented a caching mechanism for agent execution that stores and reuses successful task plans based on task description, URL, and DOM state. This optimization reduces redundant LLM calls and improves performance for repeated tasks.
Performance Improvements
- Added plan caching in the `Planner` class to check for cached plans before making LLM calls
- Enhanced `MessageManager` to track the last used plan for caching successful executions
- Created cache utilities for storing and retrieving plans with proper fingerprinting
/claim #763
Problem
Previously, the planner invoked the LLM for every task execution, even if the task had already been executed under identical conditions. This led to:
- Repetitive and unnecessary API calls
- Increased execution time
- Higher infrastructure and usage costs
- Inefficient agent behavior for frequently repeated tasks
Solution
A content-aware caching layer was implemented, which stores and retrieves task plans based on a combination of:
- The task description
- The current URL
- A normalized version of the DOM structure
Key components of the solution include:
1. Deterministic Plan Fingerprinting
- A normalization function reduces noise in the DOM content to make fingerprinting more reliable.
- The fingerprint is a SHA-256 hash of the normalized DOM, the task description, and the URL (see the fingerprinting sketch after this list).
2. Integration with the Planner
- The `Planner` class checks the cache before making LLM calls (see the integration sketch after this list).
- Cached plans are only saved when they pass validation and match the expected format.
3. Safe and Efficient Cache Management
- Plans are saved using atomic file operations to prevent corruption during concurrent access.
- A cleanup system enforces expiration (default: 7 days) and a maximum cache size (default: 100 MB).
- Expired files are removed first; if the cache still exceeds its size limit, the oldest remaining files are evicted (see the cache-management sketch after this list).
4. Validation and Robustness
- Each plan is validated to ensure it includes the required structure and supported action types (see the validation sketch after this list).
- Plans with unsupported or incomplete fields are filtered out.
- Errors while reading from or writing to the cache are handled gracefully and logged for visibility.
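The four components can be illustrated with short Python sketches. First, fingerprinting (item 1): a minimal version that assumes a simple whitespace-collapsing normalizer; the PR's actual normalization rules may strip more than this.

```python
import hashlib
import re


def normalize_dom(dom: str) -> str:
    """Reduce noise so equivalent pages hash identically.

    Sketch only: collapse whitespace runs and lowercase. The real
    normalization in this PR may apply additional rules.
    """
    return re.sub(r"\s+", " ", dom).strip().lower()


def plan_fingerprint(task: str, url: str, dom: str) -> str:
    """SHA-256 over the task description, URL, and normalized DOM."""
    payload = "\n".join([task, url, normalize_dom(dom)])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```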
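Validation (item 4) might look like the following. The plan schema (`steps` and `action` keys) and the action set are assumptions for illustration, not the PR's exact format.

```python
SUPPORTED_ACTIONS = {"click", "type", "navigate", "scroll"}  # illustrative set


def validate_plan(plan: dict) -> bool:
    """Accept only plans with the expected structure and known action types."""
    steps = plan.get("steps")
    if not isinstance(steps, list) or not steps:
        return False
    for step in steps:
        # Reject unsupported or incomplete steps rather than caching them.
        if not isinstance(step, dict) or step.get("action") not in SUPPORTED_ACTIONS:
            return False
    return True
```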
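Cache management (item 3) hinges on two details: writing through a temp file plus `os.replace` so a concurrent reader never sees a half-written plan, and a cleanup pass that honors the 7-day and 100 MB defaults. A sketch, with the cache directory as an assumed location:

```python
import json
import os
import tempfile
import time
from pathlib import Path

CACHE_DIR = Path("~/.cache/agent-plans").expanduser()  # assumed location
MAX_AGE_S = 7 * 24 * 3600       # 7-day expiration (PR default)
MAX_SIZE = 100 * 1024 ** 2      # 100 MB cap (PR default)


def save_plan(key: str, plan: dict) -> None:
    """Write to a temp file, then atomically rename into place."""
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=CACHE_DIR)
    with os.fdopen(fd, "w") as f:
        json.dump(plan, f)
    os.replace(tmp, CACHE_DIR / f"{key}.json")  # atomic swap, no torn reads


def load_plan(key: str) -> dict | None:
    """Return the cached plan, or None on a miss or unreadable file."""
    try:
        return json.loads((CACHE_DIR / f"{key}.json").read_text())
    except (OSError, ValueError):
        return None  # treat corruption like a cache miss


def cleanup_cache() -> None:
    """Drop expired files first, then evict oldest until under the size cap."""
    now = time.time()
    files = sorted(CACHE_DIR.glob("*.json"), key=lambda p: p.stat().st_mtime)
    kept = []
    for p in files:
        if now - p.stat().st_mtime > MAX_AGE_S:
            p.unlink(missing_ok=True)  # expired
        else:
            kept.append(p)
    total = sum(p.stat().st_size for p in kept)
    for p in kept:  # oldest first, thanks to the mtime sort
        if total <= MAX_SIZE:
            break
        total -= p.stat().st_size
        p.unlink(missing_ok=True)
```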
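Finally, the `Planner` integration (item 2) ties the pieces together. Apart from the `Planner` name, everything here is a hypothetical composition of the helpers sketched above; the real class and its LLM interface are richer.

```python
class Planner:
    """Sketch of the cache-first planning path (method names assumed)."""

    def __init__(self, llm):
        self.llm = llm

    async def plan(self, task: str, url: str, dom: str) -> dict:
        key = plan_fingerprint(task, url, dom)
        cached = load_plan(key)
        if cached is not None and validate_plan(cached):
            return cached  # reuse a previously validated plan, no LLM call
        plan = await self.llm.generate_plan(task, url, dom)  # hypothetical API
        if validate_plan(plan):  # only well-formed plans are cached
            save_plan(key, plan)
            cleanup_cache()
        return plan
```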
Performance Impact
- Significantly reduces the number of LLM calls for repeated tasks
- Improves task execution speed
- Decreases system and API resource consumption
- Provides a smoother and more scalable experience in repetitive automation workflows
Testing
- Verified that identical inputs consistently return the same cached plan
- Confirmed that expired plans are removed and cache size limits are respected
- Ensured proper fallback behavior when cached plans are invalid or missing
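For the first check, a minimal pytest-style sketch using the hypothetical fingerprint helpers above: inputs that differ only in DOM whitespace should map to the same cache entry.

```python
def test_identical_inputs_share_a_fingerprint():
    # Whitespace noise is normalized away, so these two DOM snapshots
    # produce the same cache key and therefore hit the same cached plan.
    key1 = plan_fingerprint("log in", "https://example.com", "<body>  Hello </body>")
    key2 = plan_fingerprint("log in", "https://example.com", "<body> Hello  </body>")
    assert key1 == key2
```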
Benefits
- Faster and more efficient execution of automation tasks
- Reduced dependency on LLM for routine operations
- Lower operational costs, especially in high-frequency use cases
- Clean separation of cache logic from planning logic, making the system easier to maintain
Future Improvements
- Introduce DOM-level change detection for smarter cache invalidation
- Support versioning of cached plans tied to LLM updates
- Add cache monitoring and reporting capabilities
- Make cache behavior more configurable by environment or task type