/claim #760

What kind of change does this PR introduce?

Feature: New cursor replay strategy with visual feedback and self-correction

Summary

This PR addresses #760 by introducing a new cursor replay strategy that improves targeting accuracy using visual feedback and AI-powered self-correction.

Key Features:

  • Red dot visual feedback system for suggested target points
  • AI-powered accuracy analysis via OpenAI models
  • Self-correction mechanism based on visual feedback
  • Grid-based movement with recursive refinement for higher precision
  • Testing framework that measures targeting accuracy, action count, and time per target

This strategy lays the groundwork for improving OpenAdapt’s cursor control system in complex screen environments.
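
As a rough illustration of grid-based movement with recursive refinement, the sketch below narrows a screen region to one grid cell at a time until the cell falls under a pixel threshold. It is not the PR's GridCursorStrategy implementation: `refine`, `oracle_for`, `min_cell_px`, and `max_depth` are hypothetical names, and the oracle stands in for whatever actually picks the cell (for example, a model inspecting the screenshot).

```python
from typing import Callable, Tuple

Region = Tuple[float, float, float, float]  # (left, top, width, height) in pixels
Grid = Tuple[int, int]                      # (rows, cols)


def refine(
    region: Region,
    grid_size: Grid,
    choose_cell: Callable[[Region, Grid], Tuple[int, int]],
    min_cell_px: float = 4.0,
    depth: int = 0,
    max_depth: int = 10,
) -> Tuple[Tuple[float, float], int]:
    """Recursively shrink `region` to the cell returned by `choose_cell`.

    Returns the centre of the final region and the number of refinement steps.
    """
    left, top, width, height = region
    # Stop once the region is small enough or the depth limit is reached.
    if depth >= max_depth or max(width, height) <= min_cell_px:
        return (left + width / 2, top + height / 2), depth

    rows, cols = grid_size
    row, col = choose_cell(region, grid_size)
    cell_w, cell_h = width / cols, height / rows
    sub_region = (left + col * cell_w, top + row * cell_h, cell_w, cell_h)
    return refine(sub_region, grid_size, choose_cell, min_cell_px, depth + 1, max_depth)


def oracle_for(target: Tuple[float, float]) -> Callable[[Region, Grid], Tuple[int, int]]:
    """Hypothetical chooser that already knows the target (for demonstration only)."""
    tx, ty = target

    def choose(region: Region, grid_size: Grid) -> Tuple[int, int]:
        left, top, width, height = region
        rows, cols = grid_size
        # Clamp so a target on the far edge still maps to the last cell.
        col = min(int((tx - left) / (width / cols)), cols - 1)
        row = min(int((ty - top) / (height / rows)), rows - 1)
        return row, col

    return choose


point, steps = refine((0, 0, 1920, 1080), (4, 4), oracle_for((1234, 567)))
print(point, steps)
```

Each step divides the region by the grid dimensions, so a finer grid reaches a given cell size in fewer steps but asks the chooser to discriminate among more cells per step.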

Checklist

  • My code follows OpenAdapt’s style guidelines
      • Follows PEP 8
      • Uses consistent naming conventions
      • Maintains existing project structure
  • Self-reviewed my code
      • Verified edge cases
      • Validated parameter types
      • Checked error handling
  • Added tests
      • test_grid.py evaluates grid strategy
      • Metrics for accuracy, actions, and time
      • Test cases for various screen regions
  • Linted code
      • Used flake8 for Python linting
      • Fixed all issues
      • Removed unused imports
  • Commented the code
      • Explained AI logic
      • Documented grid algorithm
      • Clarified self-correction behavior
  • Updated documentation
      • Added docstrings for all methods/classes
      • Updated requirements.txt
      • Included usage examples in comments
  • All new and existing tests pass locally
      • Visual feedback tests
      • Grid strategy accuracy checks
      • OpenAI API integration tests

How can your code be run and tested?

  1. Install dependencies:
pip install -r requirements.txt
  2. Run the grid evaluation:
python -m experiments.cursor.test_grid

Example Output:

Grid Strategy Evaluation Results:
---------------------------------
Total test cases: 45
Average distance error: 5.2 pixels
Average actions per target: 4.3
Average time per target: 0.82 seconds

Results by grid size:
Grid size: 2x2
  Average error: 8.4 pixels
  Average actions: 3.0
  Average time: 0.65 seconds

Grid size: 4x4
  Average error: 4.2 pixels
  Average actions: 4.5
  Average time: 0.85 seconds

Grid size: 8x8
  Average error: 3.1 pixels
  Average actions: 5.5
  Average time: 0.96 seconds

  3. Test specific components:
from openadapt.strategies.cursor import CursorReplayStrategy
from experiments.cursor.grid import GridCursorStrategy

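# `recording`, `screenshot`, and `window_event` are assumed to be loaded already
# (for example, from an existing OpenAdapt recording).

# Visual feedback: paint a red dot at the suggested target point
strategy = CursorReplayStrategy(recording)
img_with_dot = strategy.paint_dot(screenshot, x=100, y=100)

# Grid approach: request the next action using a 4x4 grid
grid_strategy = GridCursorStrategy(recording, grid_size=(4, 4))
action = grid_strategy.get_next_action_event(screenshot, window_event)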
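
The per-grid-size averages in the example output of step 2 (error, actions, time) could be aggregated along the following lines. This is a minimal sketch, not the code in test_grid.py: `TrialResult` and `summarize` are hypothetical names, and the trial values are made up purely to exercise the function.

```python
import math
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Tuple


@dataclass
class TrialResult:
    """One evaluated target (hypothetical record layout)."""
    grid_size: Tuple[int, int]    # e.g. (4, 4)
    target: Tuple[float, float]   # intended (x, y) in pixels
    final: Tuple[float, float]    # cursor (x, y) after the last action
    actions: int                  # number of actions taken
    seconds: float                # wall-clock time for this target


def summarize(trials: List[TrialResult]) -> Dict[Tuple[int, int], Dict[str, float]]:
    """Group trials by grid size and average error, actions, and time."""
    by_grid = defaultdict(list)
    for trial in trials:
        by_grid[trial.grid_size].append(trial)

    summary = {}
    for grid_size, group in by_grid.items():
        summary[grid_size] = {
            # Distance error is the Euclidean distance between target and final point.
            "avg_error_px": mean(math.dist(t.target, t.final) for t in group),
            "avg_actions": mean(t.actions for t in group),
            "avg_seconds": mean(t.seconds for t in group),
        }
    return summary


# Illustrative values only, not measured results.
demo = [
    TrialResult((2, 2), (100, 100), (103, 104), actions=3, seconds=0.5),
    TrialResult((2, 2), (0, 0), (0, 0), actions=5, seconds=1.5),
]
for grid_size, stats in summarize(demo).items():
    print(f"Grid size: {grid_size[0]}x{grid_size[1]}", stats)
```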

Dependencies:

  • opencv-python for visual processing
  • numpy for grid calculations
  • openai for visual feedback evaluation

Claim

Total prize pool: $1,000
Total paid: $0
Status: Pending
Submitted: May 29, 2025
Last updated: May 29, 2025

Contributors

Tanmay

@TanCodeX

100%

Sponsors

OpenAdaptAI

@OpenAdaptAI

$1,000