/claim #21

Overview

This PR adds a new Rust example that captures the UI context of the currently focused application and processes it with a local LLM. It addresses bounty request #21, which called for a tool that can:

  • Detect keyboard shortcuts
  • Capture application UI trees
  • Send them to a local LLM
  • Copy the processed context to the clipboard

Implementation Details

Core Functionality

Global Hotkey Detection

  • Uses global-hotkey to register and detect global keyboard shortcuts
  • Default shortcut: Ctrl+Shift+C (configurable via CLI)
  • Keyboard events are processed asynchronously so the app stays responsive (see the sketch below)
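
A minimal sketch of the registration path, assuming the global-hotkey crate's GlobalHotKeyManager and event receiver (on some platforms the crate also needs a running event loop, omitted here):

```rust
use global_hotkey::{
    hotkey::{Code, HotKey, Modifiers},
    GlobalHotKeyEvent, GlobalHotKeyManager,
};

fn register_hotkey() -> Result<(GlobalHotKeyManager, HotKey), Box<dyn std::error::Error>> {
    // The manager must stay alive for the registration to remain active.
    let manager = GlobalHotKeyManager::new()?;
    // Default binding: Ctrl+Shift+C (overridable via the CLI).
    let hotkey = HotKey::new(Some(Modifiers::CONTROL | Modifiers::SHIFT), Code::KeyC);
    manager.register(hotkey)?;
    Ok((manager, hotkey))
}

fn hotkey_was_pressed(hotkey: &HotKey) -> bool {
    // Non-blocking poll so the main loop stays responsive.
    if let Ok(event) = GlobalHotKeyEvent::receiver().try_recv() {
        return event.id == hotkey.id();
    }
    false
}
```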

UI Tree Capture

  • Implements Windows-specific UI capture via Windows Accessibility APIs
  • Creates a traversable UI tree of the focused application
  • Captures each element's role, name, value, and hierarchical structure (see the data-structure sketch below)
  • Includes a fallback for non-Windows platforms for demo purposes
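
A sketch of the shape the captured tree takes; the type and field names here are illustrative rather than the exact ones in src/windows_ui.rs:

```rust
use serde::Serialize;

/// One node in the captured UI tree of the focused application.
#[derive(Debug, Serialize)]
struct UiNode {
    /// Accessibility role, e.g. "Button" or "Edit".
    role: String,
    /// Accessible name or label, if any.
    name: Option<String>,
    /// Current value, e.g. the text inside a text box.
    value: Option<String>,
    /// Child elements, preserving the hierarchy.
    children: Vec<UiNode>,
}

impl UiNode {
    /// Serialize the tree to the JSON that is embedded in the LLM prompt.
    fn to_json(&self) -> serde_json::Result<String> {
        serde_json::to_string_pretty(self)
    }
}
```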

Local LLM Processing with Ollama

  • Integrates with Ollama for local LLM processing
  • Default model: Gemma 2B (configurable)
  • Customizable system prompt
  • Serializes the UI tree to structured JSON so the model receives a predictable input (see the request sketch below)
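
A hedged sketch of the Ollama call, assuming the standard POST /api/generate endpoint on localhost:11434 and a non-streaming request; the function name and prompt wording are illustrative:

```rust
use serde_json::json;

/// Send the serialized UI tree to a local Ollama instance and return the summary.
async fn summarize_with_ollama(
    ui_json: &str,
    model: &str,         // e.g. "gemma:2b"
    system_prompt: &str, // customizable via the CLI
) -> Result<String, reqwest::Error> {
    let body = json!({
        "model": model,
        "system": system_prompt,
        "prompt": format!("Summarize this UI tree for a human reader:\n{ui_json}"),
        "stream": false,
    });

    let resp: serde_json::Value = reqwest::Client::new()
        .post("http://localhost:11434/api/generate")
        .json(&body)
        .send()
        .await?
        .json()
        .await?;

    // With stream=false, Ollama returns the generated text in the "response" field.
    Ok(resp["response"].as_str().unwrap_or_default().to_string())
}
```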

Clipboard Integration

  • Uses arboard to copy output to clipboard
  • Gives immediate feedback once the copy completes
  • Enables seamless pasting into other apps or AI tools (see the sketch below)
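
The clipboard step is essentially a one-liner with arboard; a sketch:

```rust
use arboard::Clipboard;

/// Place the LLM output on the system clipboard and confirm to the user.
fn copy_to_clipboard(text: &str) -> Result<(), arboard::Error> {
    let mut clipboard = Clipboard::new()?;
    clipboard.set_text(text.to_string())?;
    println!("Context copied to clipboard - paste it anywhere.");
    Ok(())
}
```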

Project Structure

  • src/main.rs – Core app logic, hotkey handling, Ollama integration
  • src/windows_ui.rs – Windows-specific UI capture logic
  • README.md – Setup and usage documentation
  • run.bat – Windows helper script

Benefits

  • Privacy-Focused: All processing stays local; no external server calls
  • Cross-Platform: Runs on Windows and macOS (full UI capture on Windows, demo fallback elsewhere)
  • Configurable: Custom model, hotkey, and system prompt (see the CLI sketch below)
  • Easy to Use: Single hotkey triggers full pipeline, result copied to clipboard
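
The configuration surface might look roughly like this with clap; the flag names and defaults are illustrative, see the README for the actual options:

```rust
use clap::Parser;

/// Command-line options for the example (illustrative flag names).
#[derive(Parser, Debug)]
struct Args {
    /// Ollama model to use.
    #[arg(long, default_value = "gemma:2b")]
    model: String,

    /// Global hotkey that triggers the capture.
    #[arg(long, default_value = "ctrl+shift+c")]
    hotkey: String,

    /// System prompt sent to the LLM.
    #[arg(long, default_value = "Summarize the UI context for a human reader.")]
    system_prompt: String,
}

fn main() {
    let args = Args::parse();
    println!("model={} hotkey={}", args.model, args.hotkey);
}
```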

How It Works

  1. The application listens for the configured hotkey (Ctrl+Shift+C by default)
  2. On trigger:
    • Captures UI structure of the focused application
    • Sends UI data to local Ollama instance
    • LLM (e.g., Gemma 2B) generates human-readable summary
    • Result is copied to clipboard

Users can then paste the context into any AI tool, IDE, or notes app.
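
Put together, the trigger path looks roughly like the following; the helper functions are illustrative stand-ins for the real capture, Ollama, and clipboard code sketched above:

```rust
// Stand-in helpers so the glue code below is self-contained.
fn capture_focused_ui_json() -> Result<String, Box<dyn std::error::Error>> {
    // Placeholder tree; the real code walks the focused app via windows_ui.rs or the fallback.
    Ok(r#"{"role":"Window","name":"Demo","children":[]}"#.to_string())
}

async fn summarize_with_ollama(
    ui_json: &str,
    _model: &str,
    _system_prompt: &str,
) -> Result<String, Box<dyn std::error::Error>> {
    // Stand-in for the real HTTP call to the local Ollama instance.
    Ok(format!("Summary of: {ui_json}"))
}

fn copy_to_clipboard(_text: &str) -> Result<(), Box<dyn std::error::Error>> {
    // Stand-in for the arboard call.
    Ok(())
}

async fn on_hotkey(model: &str, system_prompt: &str) -> Result<(), Box<dyn std::error::Error>> {
    let ui_json = capture_focused_ui_json()?;                                    // 1. capture UI tree
    let summary = summarize_with_ollama(&ui_json, model, system_prompt).await?;  // 2. local LLM summary
    copy_to_clipboard(&summary)?;                                                // 3. copy to clipboard
    Ok(())
}
```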


Testing

Tested on both Windows and macOS:

  • Windows: Captures detailed UI info from various apps
  • macOS: Fallback logic runs correctly
  • LLM Processing: Works reliably with Gemma 2B; also tested with Llama and Mistral

Claim

  • Prize pool: $100
  • Paid: $0
  • Status: Pending
  • Submitted: May 25, 2025 (last updated May 25, 2025)

Contributors

  • Kartikay Singh pundir (@Kartikayy007): 100%

Sponsors

  • screenpi.pe (@mediar-ai): $100