/claim #1540

Summary

Extends the existing deeplink infrastructure in deeplink_actions.rs with new actions and adds a complete Raycast extension for controlling Cap.

Deeplink Changes

Added 10 new DeepLinkAction variants to apps/desktop/src-tauri/src/deeplink_actions.rs:

| Action | Description |
| --- | --- |
| PauseRecording | Pause the current recording |
| ResumeRecording | Resume a paused recording |
| TogglePauseRecording | Toggle pause/resume state |
| RestartRecording | Restart the current recording |
| TakeScreenshot | Take a screenshot (optional capture_mode, defaults to primary display) |
| ListCameras | List available cameras (JSON copied to clipboard) |
| SetCamera | Switch camera input |
| ListMicrophones | List available microphones (JSON copied to clipboard) |
| SetMicrophone | Switch microphone input |
| ListDisplays / ListWindows | List available capture targets (JSON copied to clipboard) |

  • capture_mode is optional in both StartRecording and TakeScreenshot — when omitted, falls back to the primary display
  • List* actions write JSON results to the system clipboard so external callers can read the data
  • Extracted resolve_capture_target helper to DRY up target resolution logic

Deeplink URL Format

Unit actions: cap-desktop://action?value="stop_recording"

Parameterized: cap-desktop://action?value={"take_screenshot":{"capture_mode":null}}
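
Both URL shapes can be produced by serializing the action to JSON and placing it in the value query parameter. The sketch below is illustrative only: buildDeeplink and DeeplinkAction are hypothetical names, not part of the PR, and percent-encoding the value is an assumption made here so that quotes and braces survive URL parsing.

```typescript
// Hypothetical helper, not taken from the PR: builds a cap-desktop:// deeplink
// from an action value that mirrors the JSON shapes shown above.
type DeeplinkAction = string | { [name: string]: Record<string, unknown> };

function buildDeeplink(action: DeeplinkAction): string {
  // Unit actions serialize to a JSON string, e.g. "stop_recording".
  // Parameterized actions serialize to a single-key JSON object.
  const value = JSON.stringify(action);
  // Assumption: the value is percent-encoded before being placed in the URL.
  return `cap-desktop://action?value=${encodeURIComponent(value)}`;
}

const stopUrl = buildDeeplink("stop_recording");
const screenshotUrl = buildDeeplink({ take_screenshot: { capture_mode: null } });
```

Decoding the value parameter of either URL with decodeURIComponent gives back the JSON shown in the format lines above.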

Raycast Extension

Created apps/raycast/ with 9 commands:

  • Start Instant Recording / Start Studio Recording
  • Stop / Pause / Resume / Toggle Pause / Restart Recording
  • Take Screenshot
  • Open Settings

All commands use the cap-desktop:// deeplink scheme to communicate with the desktop app. Error handling ensures failure toasts are shown only on actual errors, not unconditionally.

Closes #1540

Greptile Summary

This PR extends the deeplink infrastructure with 10 new DeepLinkAction variants (pause/resume/toggle/restart recording, screenshot, list and set camera/microphone/display/window) and adds a complete Raycast extension with 9 commands (instant/studio/saved recording modes, stop/pause/resume/toggle/restart, screenshot, and settings). All commands use the cap-desktop:// deeplink scheme.

Key implementation details:

  • All List* actions write JSON results to the clipboard for external callers
  • Permission guards correctly gate device access: SetCamera/SetMicrophone only check permission when activating a device
  • ListDisplays/ListWindows include screen recording permission checks
  • resolve_capture_target uses case-insensitive name matching
  • Raycast commands use start_current_recording with mode overrides instead of hardcoded display names
  • HUD messages use “dispatched” phrasing to reflect the fire-and-forget nature of deeplinks
  • Command descriptions in package.json clearly differentiate instant vs. studio modes

Confidence Score: 5/5

  • Safe to merge — comprehensive deeplink infrastructure expansion with proper permission guards, clipboard integration for list operations, and case-insensitive matching. Raycast extension correctly uses saved settings via mode overrides.
  • The prior review’s identified concerns were found to be false positives upon verification: (1) the Display::primary() concern was speculative about edge-case API behavior without confirming the actual implementation, and (2) the command description issue was already resolved in the current code (descriptions clearly differentiate instant vs. studio modes). All major features from prior iterations (clipboard output, permission guards, case-insensitive matching) are correctly implemented.
  • No files require special attention

Sequence Diagram

sequenceDiagram
    participant User
    participant Raycast
    participant macOS
    participant Cap as Cap Desktop (Tauri)
    participant Rust as deeplink_actions.rs
    User->>Raycast: Trigger command (e.g. Pause Recording)
    Raycast->>Raycast: closeMainWindow()
    Raycast->>macOS: open("cap-desktop://action?value=...")
    macOS-->>Raycast: OS acknowledges (immediate resolve)
    Raycast->>Raycast: showHUD("Pause recording dispatched")
    Note over Raycast,macOS: Raycast is done (fire-and-forget)
    macOS->>Cap: Dispatch deeplink URL
    Cap->>Rust: handle(app_handle, urls)
    Rust->>Rust: TryFrom<&Url> → DeepLinkAction
    alt Permission check fails
        Rust-->>Cap: Err("Permission not granted")
        Cap->>Cap: eprintln! / tracing error
    else Action executes
        Rust->>Cap: execute(app_handle)
        alt List* action
            Cap->>Cap: list_cameras / list_displays / list_windows / list_microphones
            Cap->>macOS: clipboard.write_text(json)
        else Recording control
            Cap->>Cap: start / stop / pause / resume / restart / toggle
        else SetCamera / SetMicrophone
            Cap->>Cap: set_camera_input / set_mic_input
        end
        Rust-->>Cap: Ok(())
    end

Comments Outside Diff (8)

  1. apps/desktop/src-tauri/src/deeplink_actions.rs, line 187-192 (link)

    organization_id is hard-coded to None in StartRecording, but StartCurrentRecording correctly reads it from RecordingSettingsStore (line 211). This inconsistency means recordings initiated via start_recording deeplinks will never be associated with a user’s organization—even if they are an organization member—silently disabling organization-level features like shared libraries and team uploads.

    Consider reading organization_id from the settings store to match StartCurrentRecording.

  2. apps/desktop/src-tauri/src/deeplink_actions.rs, line 187-190 (link)

    Silently discards settings read error for organization_id

    This .ok().flatten() chain has no inspect_err, so any I/O or deserialization error from RecordingSettingsStore::get() is silently swallowed. If the settings file is corrupt or unreadable the recording proceeds with organization_id: None — no log entry is emitted to help diagnose the problem.

    The StartCurrentRecording handler immediately below (line 204) reads from the same store and consistently logs the error with .inspect_err(|e| eprintln!(...)). The same pattern should be applied here for consistency.

  3. extensions/raycast/src/lib/deeplink.ts, line 13 (link)

    Error message doesn’t account for Cap not running

    open() on macOS with a custom URL scheme will throw if no application is registered for the scheme (i.e., Cap is not installed). However, it will also silently succeed if Cap is installed but not running: macOS will attempt to launch it, but the recorded action may never be processed if the app takes too long to start or if the deeplink handler fires before Cap is ready to receive it.

    The current error message "make sure it is installed" covers the case where open() throws, but users who have Cap installed yet not running may see the success HUD even though Cap never acted on the action. Consider making the message more complete.

  4. apps/desktop/src-tauri/DEEPLINKS.md, line 93-101 (link)

    Required: Yes is incorrect for Option<T> fields

    Both set_camera.id and set_microphone.label are backed by Option<T> in Rust. Serde’s default behavior is to deserialize a missing Option<T> field as None — meaning a caller can omit the JSON key entirely and get the same result as passing null. Documenting them as Required: Yes is misleading: callers are led to believe they must include the key.

    Suggested fix: change both set_camera.id and set_microphone.label to Required: No.

    This aligns with the treatment of other Option<T> fields like start_current_recording.mode and open_settings.page, which are both correctly documented as Required: No.

  5. apps/desktop/src-tauri/src/deeplink_actions.rs, line 150-157 (link)

    default_display_target validates with one API but resolves with another

    default_display_target calls cap_recording::screen_capture::list_displays() to confirm at least one display exists, but then returns an ID from scap_targets::Display::primary(). These are two separate enumeration APIs and may order or identify displays differently.

    If list_displays() is non-empty but Display::primary() returns an ID that is not present in the list returned by list_displays() (e.g. it uses a different backing API), callers will see inconsistent results: list_displays will not show the display that gets recorded by default.

    For consistency, the fallback should use the same API as the rest of the capture path:

    fn default_display_target() -> Result<ScreenCaptureTarget, String> {
        // Take the first display from the same source that ListDisplays uses.
        let displays = cap_recording::screen_capture::list_displays();
        displays
            .into_iter()
            .next()
            .map(|(s, _)| ScreenCaptureTarget::Display { id: s.id })
            .ok_or_else(|| "No displays found".to_string())
    }

    This ensures the “default” display is always the first entry from the same source that ListDisplays and resolve_capture_target use.

  6. apps/desktop/src-tauri/src/deeplink_actions.rs, line 303-318 (link)

    ListMicrophones and ListCameras return incompatible formats

    ListMicrophones writes only a sorted Vec<String> of labels to the clipboard, while ListCameras writes the full camera-struct array (including id and name fields). Neither DEEPLINKS.md nor any comment explains this difference to callers.

    While both formats happen to be consistent with what their paired Set* actions need (SetMicrophone needs a label string; SetCamera needs an id object), external callers who read the clipboard output to feed into the corresponding Set* call may be surprised that:

    • list_cameras output → they extract id from an object array
    • list_microphones output → they use the string directly

    Consider documenting the exact clipboard JSON schema for each List* action in DEEPLINKS.md so external callers know what to expect.
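
To make the asymmetry concrete, the two clipboard shapes described in this comment could be typed as in the sketch below. This is an illustration under stated assumptions, not the documented schema: the type and function names are hypothetical, the camera id is treated as an opaque value because its exact structure is not given here, and the sample device names are made up.

```typescript
// Assumed clipboard shapes, inferred from the review comment, not from DEEPLINKS.md.
interface CameraEntry {
  id: unknown; // opaque identifier; its exact structure is not specified here
  name: string;
}
type ListCamerasOutput = CameraEntry[];
type ListMicrophonesOutput = string[]; // sorted labels

// Hypothetical helper: cameras require extracting `id` from the matching object.
function setCameraPayload(cameras: ListCamerasOutput, name: string) {
  const match = cameras.find((camera) => camera.name === name);
  return match ? { set_camera: { id: match.id } } : undefined;
}

// Hypothetical helper: microphones use the label string directly.
function setMicrophonePayload(mics: ListMicrophonesOutput, label: string) {
  return mics.includes(label) ? { set_microphone: { label } } : undefined;
}

// Made-up sample clipboard contents, for illustration only.
const cameras: ListCamerasOutput = JSON.parse('[{"id":"cam-1","name":"Example Camera"}]');
const mics: ListMicrophonesOutput = JSON.parse('["Example Mic A","Example Mic B"]');

const cameraPayload = setCameraPayload(cameras, "Example Camera");
const micPayload = setMicrophonePayload(mics, "Example Mic B");
```

Both helpers return undefined when no device matches, so a caller does not accidentally send a null value for a device that was never listed.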

  7. apps/desktop/src-tauri/src/deeplink_actions.rs, line 292-298 (link)

    default_display_target does not select the primary display

    The function picks the first element from cap_recording::screen_capture::list_displays(), which in turn calls scap_targets::Display::list() → CGDisplay::active_displays() (macOS) / EnumDisplayMonitors() (Windows). Neither API guarantees that the primary/main display is first; the order is system-defined and varies by display configuration.

    This contradicts the PR description (“falls back to the primary display”), the DEEPLINKS.md docs (“Defaults to primary display when omitted or null”), and the take_screenshot table (“null uses the primary display”). On a multi-monitor setup, the “default” display will silently be whichever display the OS happens to enumerate first, not necessarily the display with the menu bar.

    To reliably select the primary display, use the CGMainDisplayID on macOS:

    fn default_display_target() -> Result<ScreenCaptureTarget, String> {
        // list_displays returns (Display, _); on macOS the primary display
        // has the CGMainDisplayID, so query it explicitly via scap or CGDisplay.
        cap_recording::screen_capture::list_displays()
            .into_iter()
            .find(|(s, _)| s.is_primary) // if the Display struct exposes this
            .or_else(|| cap_recording::screen_capture::list_displays().into_iter().next())
            .map(|(s, _)| ScreenCaptureTarget::Display { id: s.id })
            .ok_or_else(|| "No displays found".to_string())
    }

    If the Display struct does not expose an is_primary flag, it should be added so callers can distinguish the main display without relying on enumeration order.

  8. extensions/raycast/src/lib/deeplink.ts, line 1-15 (link)

    closeMainWindow silently swallowed may mask legitimate errors

    closeMainWindow().catch(() => undefined) discards every error unconditionally, including any non-trivial errors thrown by Raycast’s API. While it’s intentional to avoid failing on no-view commands (which have no window to close), logging the suppressed error would be helpful for debugging.

    This keeps the fire-and-forget behaviour intact while making unexpected failures observable in Raycast’s developer console.
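
A minimal sketch of that idea follows. It is not the extension's actual deeplink.ts: the Raycast and open calls are replaced by injected stand-ins so the example runs without dependencies, and dispatchDeeplink, DeeplinkDeps, showFailure, and the message wording are all hypothetical.

```typescript
// Stand-ins for the Raycast API and the URL opener (assumed shapes, for illustration).
interface DeeplinkDeps {
  closeMainWindow: () => Promise<void>;
  open: (url: string) => Promise<void>;
  showHUD: (message: string) => Promise<void>;
  showFailure: (message: string) => Promise<void>;
  log: (message: string, error: unknown) => void;
}

async function dispatchDeeplink(
  url: string,
  label: string,
  deps: DeeplinkDeps,
): Promise<boolean> {
  // Still swallow the failure so no-view commands keep working,
  // but record it instead of discarding it silently.
  await deps
    .closeMainWindow()
    .catch((error: unknown) => deps.log("closeMainWindow failed", error));

  try {
    await deps.open(url);
  } catch (error) {
    // The failure message is shown only when opening the URL actually throws.
    deps.log("open failed", error);
    await deps.showFailure(`${label} failed. Make sure Cap is installed.`);
    return false;
  }

  // "dispatched" wording: the OS accepted the URL, which does not prove Cap acted on it.
  await deps.showHUD(`${label} dispatched`);
  return true;
}
```

Passing the dependencies in as an object is only a convenience for this sketch; in the extension itself the same structure would presumably call the imported Raycast functions directly.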

Last reviewed commit: d377b34

Claim

Total prize pool: $200.10
Total paid: $0
Status: Pending
Submitted: March 04, 2026
Last updated: March 04, 2026

Contributors

Tedy Fazrin (@tedy69): 100%

Sponsors

Cap (@CapSoftware): $200
Abhishek Verma (@w3Abhishek): $0.10