/claim #1540
Extends the existing deeplink infrastructure in deeplink_actions.rs with new actions and adds a complete Raycast extension for controlling Cap.
Added 10 new DeepLinkAction variants to apps/desktop/src-tauri/src/deeplink_actions.rs:
| Action | Description |
|---|---|
| `PauseRecording` | Pause the current recording |
| `ResumeRecording` | Resume a paused recording |
| `TogglePauseRecording` | Toggle pause/resume state |
| `RestartRecording` | Restart the current recording |
| `TakeScreenshot` | Take a screenshot (optional `capture_mode`, defaults to primary display) |
| `ListCameras` | List available cameras (JSON copied to clipboard) |
| `SetCamera` | Switch camera input |
| `ListMicrophones` | List available microphones (JSON copied to clipboard) |
| `SetMicrophone` | Switch microphone input |
| `ListDisplays` / `ListWindows` | List available capture targets (JSON copied to clipboard) |
- `capture_mode` is optional in both `StartRecording` and `TakeScreenshot`; when omitted, it falls back to the primary display
- `List*` actions write JSON results to the system clipboard so external callers can read the data
- `resolve_capture_target` helper to DRY up target resolution logic

Unit actions: `cap-desktop://action?value="stop_recording"`
Parameterized: `cap-desktop://action?value={"take_screenshot":{"capture_mode":null}}`
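
For external callers, the two URL shapes above can be produced from an already-serialized JSON payload. The following is a minimal sketch, not code from this PR: `percent_encode` and `deeplink_url` are hypothetical helpers, and percent-encoding the `value` parameter is an assumption about what a caller would typically do so that quotes and braces survive URL handling.

```rust
/// Percent-encode every byte except RFC 3986 unreserved characters.
/// (Hypothetical helper; assumes the receiving side decodes the query value.)
fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Build a deeplink from a JSON payload that is already serialized.
fn deeplink_url(json_value: &str) -> String {
    format!("cap-desktop://action?value={}", percent_encode(json_value))
}

fn main() {
    // Unit action: the payload is a JSON string, so the quotes are part of it.
    println!("{}", deeplink_url(r#""stop_recording""#));
    // Parameterized action: the payload is a JSON object.
    println!("{}", deeplink_url(r#"{"take_screenshot":{"capture_mode":null}}"#));
}
```

Keeping the JSON serialization separate from the URL construction means the same helper serves unit and parameterized actions alike.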
Created apps/raycast/ with 9 commands:
All commands use the cap-desktop:// deeplink scheme to communicate with the desktop app. Error handling ensures failure toasts are shown only on actual errors, not unconditionally.
Closes #1540
This PR extends the deeplink infrastructure with 10 new DeepLinkAction variants (pause/resume/toggle/restart recording, screenshot, list and set camera/microphone/display/window) and adds a complete Raycast extension with 9 commands (instant/studio/saved recording modes, stop/pause/resume/toggle/restart, screenshot, and settings). All commands use the cap-desktop:// deeplink scheme.
Key implementation details:
- `List*` actions write JSON results to the clipboard for external callers
- `SetCamera`/`SetMicrophone` only check permission when activating a device
- `ListDisplays`/`ListWindows` include screen recording permission checks
- `resolve_capture_target` uses case-insensitive name matching
- … `start_current_recording` with mode overrides instead of hardcoded display names
- Command descriptions in `package.json` clearly differentiate instant vs. studio modes

… (1) the `Display::primary()` concern was speculative about edge-case API behavior without confirming the actual implementation, and (2) the command description issue was already resolved in the current code (descriptions clearly differentiate instant vs. studio modes). All major features from prior iterations (clipboard output, permission guards, case-insensitive matching) are correctly implemented.

sequenceDiagram
    participant User
    participant Raycast
    participant macOS
    participant Cap as Cap Desktop (Tauri)
    participant Rust as deeplink_actions.rs
    User->>Raycast: Trigger command (e.g. Pause Recording)
    Raycast->>Raycast: closeMainWindow()
    Raycast->>macOS: open("cap-desktop://action?value=...")
    macOS-->>Raycast: OS acknowledges (immediate resolve)
    Raycast->>Raycast: showHUD("Pause recording dispatched")
    Note over Raycast,macOS: Raycast is done — fire-and-forget
    macOS->>Cap: Dispatch deeplink URL
    Cap->>Rust: handle(app_handle, urls)
    Rust->>Rust: TryFrom<&Url> → DeepLinkAction
    alt Permission check fails
        Rust-->>Cap: Err("Permission not granted")
        Cap->>Cap: eprintln! / tracing error
    else Action executes
        Rust->>Cap: execute(app_handle)
        alt List* action
            Cap->>Cap: list_cameras / list_displays / list_windows / list_microphones
            Cap->>macOS: clipboard.write_text(json)
        else Recording control
            Cap->>Cap: start / stop / pause / resume / restart / toggle
        else SetCamera / SetMicrophone
            Cap->>Cap: set_camera_input / set_mic_input
        end
        Rust-->>Cap: Ok(())
    end
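
The parse-then-check-then-execute path in the diagram can be illustrated with a heavily simplified sketch. Everything here is an assumption for illustration: the real handler converts from `&Url` and deserializes JSON, whereas this version matches on a plain string; the `Action` enum is a small hypothetical subset; the `pause_recording` and `resume_recording` names are inferred by analogy with `stop_recording`; and a single boolean stands in for the per-action permission checks.

```rust
use std::convert::TryFrom;

/// Illustrative subset of actions; the real `DeepLinkAction` enum has more variants.
#[derive(Debug, PartialEq)]
enum Action {
    StopRecording,
    PauseRecording,
    ResumeRecording,
}

impl TryFrom<&str> for Action {
    type Error = String;

    /// Parse a unit-action deeplink of the form `cap-desktop://action?value="name"`.
    fn try_from(url: &str) -> Result<Self, Self::Error> {
        let value = url
            .strip_prefix("cap-desktop://action?value=")
            .ok_or_else(|| format!("not a Cap action deeplink: {url}"))?;
        match value.trim_matches('"') {
            "stop_recording" => Ok(Action::StopRecording),
            "pause_recording" => Ok(Action::PauseRecording),
            "resume_recording" => Ok(Action::ResumeRecording),
            other => Err(format!("unknown action: {other}")),
        }
    }
}

/// Parse first, then gate on permission, mirroring the two `alt` branches.
fn handle(url: &str, permission_granted: bool) -> Result<Action, String> {
    let action = Action::try_from(url)?;
    if !permission_granted {
        return Err("Permission not granted".to_string());
    }
    Ok(action)
}

fn main() {
    let url = r#"cap-desktop://action?value="pause_recording""#;
    match handle(url, true) {
        Ok(action) => println!("would execute {action:?}"),
        Err(e) => eprintln!("deeplink failed: {e}"),
    }
}
```

Because Raycast has already shown its HUD by the time this code runs, any `Err` here is only visible in Cap's own logs, which is the fire-and-forget trade-off the diagram notes.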
apps/desktop/src-tauri/src/deeplink_actions.rs, line 187-192 (link)
organization_id is hard-coded to None in StartRecording, but StartCurrentRecording correctly reads it from RecordingSettingsStore (line 211). This inconsistency means recordings initiated via start_recording deeplinks will never be associated with a user’s organization—even if they are an organization member—silently disabling organization-level features like shared libraries and team uploads.
Consider reading organization_id from the settings store to match StartCurrentRecording.
apps/desktop/src-tauri/src/deeplink_actions.rs, line 187-190 (link)
Silently discards settings read error for organization_id
This .ok().flatten() chain has no inspect_err, so any I/O or deserialization error from RecordingSettingsStore::get() is silently swallowed. If the settings file is corrupt or unreadable the recording proceeds with organization_id: None — no log entry is emitted to help diagnose the problem.
The StartCurrentRecording handler immediately below (line 204) reads from the same store and consistently logs the error with .inspect_err(|e| eprintln!(...)). The same pattern should be applied here for consistency.
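
A minimal, self-contained sketch of that logging pattern follows. It does not reproduce the reviewer's original suggestion: `RecordingSettings`, `resolve_organization_id`, the `String` type of `organization_id`, and the `String` error type are all hypothetical stand-ins for whatever `RecordingSettingsStore::get()` actually returns. `Result::inspect_err` is the standard-library method (stable since Rust 1.76).

```rust
/// Hypothetical stand-in for the settings the real store returns.
#[derive(Debug)]
struct RecordingSettings {
    organization_id: Option<String>,
}

/// Resolve the organization id, logging read failures before falling back to `None`.
fn resolve_organization_id(
    store_result: Result<Option<RecordingSettings>, String>,
) -> Option<String> {
    store_result
        // Log the failure instead of silently discarding it.
        .inspect_err(|e| eprintln!("Failed to read recording settings: {e}"))
        .ok()
        .flatten()
        .and_then(|settings| settings.organization_id)
}

fn main() {
    let ok = Ok(Some(RecordingSettings {
        organization_id: Some("org_123".to_string()),
    }));
    println!("{:?}", resolve_organization_id(ok));

    // The recording would still proceed with `None`, but the error is now logged.
    let broken = Err("settings file is corrupt".to_string());
    println!("{:?}", resolve_organization_id(broken));
}
```

The fallback behaviour is unchanged; the only difference from a bare `.ok().flatten()` chain is that a corrupt or unreadable settings file leaves a trace on stderr.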
extensions/raycast/src/lib/deeplink.ts, line 13 (link)
Error message doesn’t account for Cap not running
open() on macOS with a custom URL scheme will throw if no application is registered for the scheme (i.e., Cap is not installed). However, it will also silently succeed if Cap is installed but not running: macOS will attempt to launch it, but the requested action may never be processed if the app takes too long to start or if the deeplink handler fires before Cap is ready to receive it.
The current error message "make sure it is installed" covers the case where open() throws, but users who have Cap installed yet not running may see the success HUD even though Cap never acted on the action. Consider making the message more complete so that it also covers Cap not running.
apps/desktop/src-tauri/DEEPLINKS.md, line 93-101 (link)
Required: Yes is incorrect for Option<T> fields
Both set_camera.id and set_microphone.label are backed by Option<T> in Rust. Serde’s default behavior is to deserialize a missing Option<T> field as None — meaning a caller can omit the JSON key entirely and get the same result as passing null. Documenting them as Required: Yes is misleading: callers are led to believe they must include the key.
Suggested fix: change both set_camera.id and set_microphone.label to Required: No.
This aligns with the treatment of other Option<T> fields like start_current_recording.mode and open_settings.page, which are both correctly documented as Required: No.
apps/desktop/src-tauri/src/deeplink_actions.rs, line 150-157 (link)
default_display_target validates with one API but resolves with another
default_display_target calls cap_recording::screen_capture::list_displays() to confirm at least one display exists, but then returns an ID from scap_targets::Display::primary(). These are two separate enumeration APIs and may order or identify displays differently.
If list_displays() is non-empty but Display::primary() returns an ID that is not present in the list returned by list_displays() (e.g. it uses a different backing API), callers will see inconsistent results: list_displays will not show the display that gets recorded by default.
For consistency, the fallback should use the same API as the rest of the capture path:
    fn default_display_target() -> Result<ScreenCaptureTarget, String> {
        let displays = cap_recording::screen_capture::list_displays();
        displays
            .into_iter()
            .next()
            .map(|(s, _)| ScreenCaptureTarget::Display { id: s.id })
            .ok_or_else(|| "No displays found".to_string())
    }
This ensures the “default” display is always the first entry from the same source that ListDisplays and resolve_capture_target use.
apps/desktop/src-tauri/src/deeplink_actions.rs, line 303-318 (link)
ListMicrophones and ListCameras return incompatible formats
ListMicrophones writes only a sorted Vec<String> of labels to the clipboard, while ListCameras writes the full camera-struct array (including id and name fields). Neither DEEPLINKS.md nor any comment explains this difference to callers.
While both formats happen to be consistent with what their paired Set* actions need (SetMicrophone needs a label string; SetCamera needs an id object), external callers who read the clipboard output to feed into the corresponding Set* call may be surprised that:
- list_cameras output → they extract id from an object array
- list_microphones output → they use the string directly

Consider documenting the exact clipboard JSON schema for each List* action in DEEPLINKS.md so external callers know what to expect.
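
To make the asymmetry concrete, here is a hedged sketch of how an external caller might turn each clipboard result into the matching `Set*` payload. The helper names are hypothetical, the outer payload shapes (`{"set_camera":{"id":…}}` and `{"set_microphone":{"label":…}}`) are inferred from the field names mentioned in this review, and the camera `id` is treated as an opaque JSON fragment because its exact schema is not documented here.

```rust
/// Escape a string for embedding in JSON (quotes, backslashes, control characters).
fn json_string(s: &str) -> String {
    let mut out = String::from("\"");
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// `list_microphones` yields plain labels, so a label is wrapped as a JSON string.
fn set_microphone_payload(label: &str) -> String {
    format!(r#"{{"set_microphone":{{"label":{}}}}}"#, json_string(label))
}

/// `list_cameras` yields objects; the `id` value is copied through verbatim as raw JSON.
fn set_camera_payload(id_json: &str) -> String {
    format!(r#"{{"set_camera":{{"id":{}}}}}"#, id_json)
}

fn main() {
    // A label taken directly from the list_microphones string array.
    println!("{}", set_microphone_payload("USB Mic"));
    // A placeholder id fragment; the real shape comes from the list_cameras output.
    println!("{}", set_camera_payload(r#"{"hypothetical":"camera-1"}"#));
}
```

Passing the camera `id` through untouched avoids the caller having to know its internal structure, which is exactly the detail a documented schema in DEEPLINKS.md would settle.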
apps/desktop/src-tauri/src/deeplink_actions.rs, line 292-298 (link)
default_display_target does not select the primary display
The function picks the first element from cap_recording::screen_capture::list_displays(), which in turn calls scap_targets::Display::list() → CGDisplay::active_displays() (macOS) / EnumDisplayMonitors() (Windows). Neither API guarantees that the primary/main display is first; the order is system-defined and varies by display configuration.
This contradicts the PR description (“falls back to the primary display”), the DEEPLINKS.md docs (“Defaults to primary display when omitted or null”), and the take_screenshot table (“null uses the primary display”). On a multi-monitor setup, the “default” display will silently be whichever display the OS happens to enumerate first, not necessarily the display with the menu bar.
To reliably select the primary display, match against the ID returned by CGMainDisplayID() on macOS:
    fn default_display_target() -> Result<ScreenCaptureTarget, String> {
        // list_displays returns (Display, _); enumerate once so both lookups
        // see the same snapshot of attached displays.
        let displays: Vec<_> = cap_recording::screen_capture::list_displays()
            .into_iter()
            .collect();
        // Prefer the primary display (on macOS, the one matching CGMainDisplayID);
        // otherwise fall back to the first enumerated display.
        let index = displays
            .iter()
            .position(|(s, _)| s.is_primary) // if the Display struct exposes this
            .unwrap_or(0);
        displays
            .into_iter()
            .nth(index)
            .map(|(s, _)| ScreenCaptureTarget::Display { id: s.id })
            .ok_or_else(|| "No displays found".to_string())
    }
If the Display struct does not expose an is_primary flag, it should be added so callers can distinguish the main display without relying on enumeration order.
extensions/raycast/src/lib/deeplink.ts, line 1-15 (link)
closeMainWindow silently swallowed may mask legitimate errors
closeMainWindow().catch(() => undefined) discards every error unconditionally, including any non-trivial errors thrown by Raycast’s API. While it’s intentional to avoid failing on no-view commands (which have no window to close), logging the suppressed error would be helpful for debugging.
Logging inside the catch handler keeps the fire-and-forget behaviour intact while making unexpected failures observable in Raycast’s developer console.
Last reviewed commit: d377b34
Tedy Fazrin
@tedy69
Cap
@CapSoftware
Abhishek Verma
@w3Abhishek