/claim #21

Overview

This PR adds a new Rust example that captures the UI context of the currently focused application and processes it with a local LLM. It addresses bounty request #21, which called for a tool that can:

  • Detect keyboard shortcuts
  • Capture application UI trees
  • Send them to a local LLM
  • Copy the processed context to the clipboard

Implementation Details

Core Functionality

Global Hotkey Detection

  • Uses global-hotkey to register and detect global keyboard shortcuts
  • Default shortcut: Ctrl+Shift+C (configurable via CLI)
  • Keyboard events are processed asynchronously so the app stays responsive (see the sketch below)
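
A minimal sketch of the registration path, assuming the global-hotkey crate's GlobalHotKeyManager and event receiver (on some platforms the crate also needs a running event loop, omitted here):

```rust
use global_hotkey::{
    hotkey::{Code, HotKey, Modifiers},
    GlobalHotKeyEvent, GlobalHotKeyManager,
};

fn register_hotkey() -> Result<(GlobalHotKeyManager, HotKey), Box<dyn std::error::Error>> {
    // The manager must stay alive for the registration to remain active.
    let manager = GlobalHotKeyManager::new()?;
    // Default binding: Ctrl+Shift+C (overridable via the CLI).
    let hotkey = HotKey::new(Some(Modifiers::CONTROL | Modifiers::SHIFT), Code::KeyC);
    manager.register(hotkey)?;
    Ok((manager, hotkey))
}

fn hotkey_was_pressed(hotkey: &HotKey) -> bool {
    // Non-blocking poll so the main loop stays responsive.
    if let Ok(event) = GlobalHotKeyEvent::receiver().try_recv() {
        return event.id == hotkey.id();
    }
    false
}
```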

UI Tree Capture

  • Implements Windows-specific UI capture via Windows Accessibility APIs
  • Creates a traversable UI tree of the focused application
  • Captures each element's role, name, value, and hierarchical structure (see the data-structure sketch below)
  • Includes a fallback for non-Windows platforms for demo purposes
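
A sketch of the shape the captured tree takes; the type and field names here are illustrative rather than the exact ones in src/windows_ui.rs:

```rust
use serde::Serialize;

/// One node in the captured UI tree of the focused application.
#[derive(Debug, Serialize)]
struct UiNode {
    /// Accessibility role, e.g. "Button" or "Edit".
    role: String,
    /// Accessible name or label, if any.
    name: Option<String>,
    /// Current value, e.g. the text inside a text box.
    value: Option<String>,
    /// Child elements, preserving the hierarchy.
    children: Vec<UiNode>,
}

impl UiNode {
    /// Serialize the tree to the JSON that is embedded in the LLM prompt.
    fn to_json(&self) -> serde_json::Result<String> {
        serde_json::to_string_pretty(self)
    }
}
```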

Local LLM Processing with Ollama

  • Integrates with Ollama for local LLM processing
  • Default model: Gemma 2B (configurable)
  • Customizable system prompt
  • Serializes the UI tree to structured JSON so the model receives a predictable input (see the request sketch below)
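
A hedged sketch of the Ollama call, assuming the standard POST /api/generate endpoint on localhost:11434 and a non-streaming request; the function name and prompt wording are illustrative:

```rust
use serde_json::json;

/// Send the serialized UI tree to a local Ollama instance and return the summary.
async fn summarize_with_ollama(
    ui_json: &str,
    model: &str,         // e.g. "gemma:2b"
    system_prompt: &str, // customizable via the CLI
) -> Result<String, reqwest::Error> {
    let body = json!({
        "model": model,
        "system": system_prompt,
        "prompt": format!("Summarize this UI tree for a human reader:\n{ui_json}"),
        "stream": false,
    });

    let resp: serde_json::Value = reqwest::Client::new()
        .post("http://localhost:11434/api/generate")
        .json(&body)
        .send()
        .await?
        .json()
        .await?;

    // With stream=false, Ollama returns the generated text in the "response" field.
    Ok(resp["response"].as_str().unwrap_or_default().to_string())
}
```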

Clipboard Integration

  • Uses arboard to copy output to clipboard
  • Gives immediate feedback once the copy completes
  • Enables seamless pasting into other apps or AI tools (see the sketch below)
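
The clipboard step is essentially a one-liner with arboard; a sketch:

```rust
use arboard::Clipboard;

/// Place the LLM output on the system clipboard and confirm to the user.
fn copy_to_clipboard(text: &str) -> Result<(), arboard::Error> {
    let mut clipboard = Clipboard::new()?;
    clipboard.set_text(text.to_string())?;
    println!("Context copied to clipboard - paste it anywhere.");
    Ok(())
}
```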

Project Structure

  • src/main.rs – Core app logic, hotkey handling, Ollama integration
  • src/windows_ui.rs – Windows-specific UI capture logic
  • README.md – Setup and usage documentation
  • run.bat – Windows helper script

Benefits

  • Privacy-Focused: All processing stays local; no external server calls
  • Cross-Platform: Runs on Windows and macOS (full UI capture on Windows, demo fallback elsewhere)
  • Configurable: Custom model, hotkey, and system prompt (see the CLI sketch below)
  • Easy to Use: Single hotkey triggers full pipeline, result copied to clipboard
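
The configuration surface might look roughly like this with clap; the flag names and defaults are illustrative, see the README for the actual options:

```rust
use clap::Parser;

/// Command-line options for the example (illustrative flag names).
#[derive(Parser, Debug)]
struct Args {
    /// Ollama model to use.
    #[arg(long, default_value = "gemma:2b")]
    model: String,

    /// Global hotkey that triggers the capture.
    #[arg(long, default_value = "ctrl+shift+c")]
    hotkey: String,

    /// System prompt sent to the LLM.
    #[arg(long, default_value = "Summarize the UI context for a human reader.")]
    system_prompt: String,
}

fn main() {
    let args = Args::parse();
    println!("model={} hotkey={}", args.model, args.hotkey);
}
```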

How It Works

  1. The application listens for the configured hotkey (Ctrl+Shift+C by default)
  2. On trigger:
    • Captures UI structure of the focused application
    • Sends UI data to local Ollama instance
    • LLM (e.g., Gemma 2B) generates human-readable summary
    • Result is copied to clipboard

Users can then paste the context into any AI tool, IDE, or notes app.
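
Put together, the trigger path looks roughly like the following; the helper functions are illustrative stand-ins for the real capture, Ollama, and clipboard code sketched above:

```rust
// Stand-in helpers so the glue code below is self-contained.
fn capture_focused_ui_json() -> Result<String, Box<dyn std::error::Error>> {
    // Placeholder tree; the real code walks the focused app via windows_ui.rs or the fallback.
    Ok(r#"{"role":"Window","name":"Demo","children":[]}"#.to_string())
}

async fn summarize_with_ollama(
    ui_json: &str,
    _model: &str,
    _system_prompt: &str,
) -> Result<String, Box<dyn std::error::Error>> {
    // Stand-in for the real HTTP call to the local Ollama instance.
    Ok(format!("Summary of: {ui_json}"))
}

fn copy_to_clipboard(_text: &str) -> Result<(), Box<dyn std::error::Error>> {
    // Stand-in for the arboard call.
    Ok(())
}

async fn on_hotkey(model: &str, system_prompt: &str) -> Result<(), Box<dyn std::error::Error>> {
    let ui_json = capture_focused_ui_json()?;                                    // 1. capture UI tree
    let summary = summarize_with_ollama(&ui_json, model, system_prompt).await?;  // 2. local LLM summary
    copy_to_clipboard(&summary)?;                                                // 3. copy to clipboard
    Ok(())
}
```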


Testing

Tested on both Windows and macOS:

  • Windows: Captures detailed UI info from various apps
  • macOS: Fallback logic runs correctly
  • LLM Processing: Works reliably with Gemma 2B; also tested with Llama and Mistral

Claim

  • Prize pool: $100
  • Paid: $0
  • Status: Pending
  • Submitted: May 25, 2025 (last updated May 25, 2025)

Contributors

  • Kartikay Singh pundir (@Kartikayy007): 100%

Sponsors

  • screenpi.pe (@mediar-ai): $100