impr (cli): Device id for uniqueness#797

Open
xav-db wants to merge 6 commits into dev from device-id-for-uniqueness

xav-db (Member) commented Jan 12, 2026

Greptile Overview

Greptile Summary

Overview

This PR adds device ID tracking to the Helix CLI metrics system for better uniqueness and correlation between CLI and container metrics. The device ID is derived from platform-specific machine identifiers (IOPlatformUUID on macOS, /etc/machine-id on Linux, MachineGuid on Windows), hashed with SHA256 for privacy, and included in all metrics events.

What Changed

Core Implementation:

  • Added get_device_id() function that retrieves platform-specific machine IDs and hashes them
  • Updated MetricsConfig struct to include device_id field
  • Modified all event creation functions to include device_id in RawEvent
  • Added device ID display to helix metrics status command
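The Linux branch of the machine-ID lookup described above can be sketched as follows. This is a minimal illustration and not the PR's code: `read_machine_id_from` is a hypothetical helper, and only the `/etc/machine-id` path comes from the PR description. The macOS and Windows branches are left out.

```rust
use std::fs;
use std::path::Path;

/// Hypothetical helper: read a machine ID from a file and trim whitespace.
/// Returns `None` if the file is missing, unreadable, or effectively empty.
fn read_machine_id_from(path: &Path) -> Option<String> {
    fs::read_to_string(path)
        .ok()
        .map(|raw| raw.trim().to_string())
        .filter(|id| !id.is_empty())
}

/// Linux lookup, assuming the ID lives at /etc/machine-id as the PR states.
#[cfg(target_os = "linux")]
fn get_machine_id() -> Option<String> {
    read_machine_id_from(Path::new("/etc/machine-id"))
}

/// Placeholder for platforms this sketch does not cover.
#[cfg(not(target_os = "linux"))]
fn get_machine_id() -> Option<String> {
    None
}
```

Returning `Option<String>` keeps a missing or unreadable ID from being fatal, so the caller can simply omit the device ID from events.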

Container Integration:

  • Modified DockerManager to pass HELIX_DEVICE_ID as environment variable to containers
  • Updated metrics library to read HELIX_DEVICE_ID from environment in container context
  • Ensures CLI and container metrics share the same device ID for correlation
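A rough sketch of the hand-off between the CLI and the container, under stated assumptions: only the variable name `HELIX_DEVICE_ID` and the use of a `LazyLock` static come from the PR. The helpers `device_id_env` and `container_device_id` are hypothetical, and how `DockerManager` actually attaches the variable is not shown in this conversation. `LazyLock` requires Rust 1.80 or later.

```rust
use std::sync::LazyLock;

/// Environment variable name taken from the PR description.
const DEVICE_ID_ENV: &str = "HELIX_DEVICE_ID";

/// CLI side (hypothetical helper): build the variable to hand to the
/// container, if a device ID is available.
fn device_id_env(device_id: Option<&str>) -> Option<(String, String)> {
    device_id.map(|id| (DEVICE_ID_ENV.to_string(), id.to_string()))
}

/// Container side: read the variable once and keep it for the lifetime of
/// the process. The single leak happens during static initialization.
static HELIX_DEVICE_ID: LazyLock<Option<&'static str>> = LazyLock::new(|| {
    std::env::var(DEVICE_ID_ENV)
        .ok()
        .map(|value| -> &'static str { value.leak() })
});

/// Hypothetical accessor used when building an event in the container.
fn container_device_id() -> Option<&'static str> {
    *HELIX_DEVICE_ID
}
```

Because the static is initialized at most once, repeated calls to `container_device_id()` return the same reference and do not allocate again.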

Dependencies:

  • Added sha2 and hex crates for cryptographic hashing

Critical Issues Found

🚨 MEMORY LEAKS - Multiple critical memory leaks that will cause unbounded memory growth:

  1. get_device_id() function (line 428): Leaks memory on EVERY call because it creates a new String and calls .leak() without caching. This function is called 8+ times during initialization and on every metrics event sent.

  2. load_metrics_config() function (line 156): Leaks entire config file contents on every call via .leak(). This function is called multiple times throughout the codebase.

  3. Test suite also affected: Test functions call get_device_id() multiple times, causing leaks during test execution.

Impact: In production, these leaks will cause memory usage to grow continuously during normal CLI operation, especially for users with metrics enabled who perform many operations.

Fix Required: Use LazyLock pattern to cache the device ID (similar to HELIX_DEVICE_ID in metrics/lib.rs) and remove unnecessary .leak() from config loading.
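To make the difference concrete, here is a small self-contained sketch that counts leaked allocations for both variants. `compute_device_id` is a hypothetical stand-in for the real lookup and hashing, and the counters exist only for illustration. `LazyLock` requires Rust 1.80 or later.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::LazyLock;

/// Count the strings leaked by each variant (illustration only).
static LEAKY_ALLOCS: AtomicUsize = AtomicUsize::new(0);
static CACHED_ALLOCS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for the machine-ID lookup and hashing.
fn compute_device_id() -> Option<String> {
    Some(String::from("0123456789abcdef0123456789abcdef"))
}

/// Leaky variant: every call allocates a fresh String and leaks it.
fn device_id_leaky() -> Option<&'static str> {
    compute_device_id().map(|s| -> &'static str {
        LEAKY_ALLOCS.fetch_add(1, Ordering::SeqCst);
        s.leak()
    })
}

/// Cached variant: the closure runs once, so at most one String is leaked.
fn device_id_cached() -> Option<&'static str> {
    static DEVICE_ID: LazyLock<Option<&'static str>> = LazyLock::new(|| {
        compute_device_id().map(|s| -> &'static str {
            CACHED_ALLOCS.fetch_add(1, Ordering::SeqCst);
            s.leak()
        })
    });
    *DEVICE_ID
}
```

Calling `device_id_leaky()` three times increments its counter three times, while any number of calls to `device_id_cached()` leaves its counter at one.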

Positive Aspects

  • Privacy-preserving design: Machine IDs are hashed with SHA256 together with a fixed "helix-device-id:" prefix, not sent raw
  • Cross-platform support: Handles macOS, Linux, and Windows appropriately
  • Good error handling: Returns Option and handles failures gracefully
  • Container correlation: Clever design to share device ID between CLI and containers
  • Comprehensive tests: Good test coverage for the hashing and platform-specific logic

Important Files Changed

File Analysis

| Filename | Score | Overview |
| --- | --- | --- |
| helix-cli/src/metrics_sender.rs | 1/5 | CRITICAL: Multiple memory leaks in get_device_id() function (called repeatedly), load_metrics_config() leaks file contents, and Default impl calls leaky function. Needs LazyLock pattern. |
| helix-cli/src/docker.rs | 3/5 | Adds device_id to container environment variables. Implementation is correct but inherits memory leak from get_device_id() function. Will be fixed once metrics_sender.rs is corrected. |
| metrics/src/lib.rs | 5/5 | Correctly implements HELIX_DEVICE_ID as LazyLock from environment variable, uses it in create_raw_event(). Pattern is correct - leak only happens once during static initialization. |
| metrics/src/events.rs | 5/5 | Adds device_id field to RawEvent struct. Clean addition with proper Option<&'static str> type, consistent with user_id and email fields. |

Sequence Diagram

sequenceDiagram
    participant CLI as Helix CLI
    participant MS as MetricsSender
    participant MID as get_machine_id()
    participant Hash as hash_to_device_id()
    participant Docker as DockerManager
    participant Container as Helix Container
    participant ML as Metrics Lib
    participant Server as Metrics Server

    Note over CLI,Server: Device ID Generation & Usage Flow

    CLI->>MS: Initialize MetricsConfig
    MS->>MS: get_device_id()
    MS->>MID: Get platform-specific machine ID
    alt macOS
        MID->>MID: Execute ioreg command
    else Linux
        MID->>MID: Read /etc/machine-id
    else Windows
        MID->>MID: Query registry for MachineGuid
    end
    MID-->>MS: Return machine_id (String)
    MS->>Hash: hash_to_device_id(machine_id)
    Hash->>Hash: SHA256("helix-device-id:" + machine_id)
    Hash-->>MS: Return hashed ID (32 hex chars)
    MS-->>CLI: device_id stored in config

    Note over CLI,Server: Sending Metrics Events

    CLI->>MS: send_*_event()
    MS->>MS: get_device_id()
    MS->>MS: Create RawEvent with device_id
    MS->>Server: POST metrics with device_id

    Note over CLI,Server: Container Deployment

    CLI->>Docker: Deploy container
    Docker->>MS: get_device_id()
    Docker->>Docker: Add HELIX_DEVICE_ID env var
    Docker->>Container: Start with environment
    Container->>ML: Read HELIX_DEVICE_ID from env
    ML->>ML: Store in static HELIX_DEVICE_ID
    Container->>ML: log_event()
    ML->>ML: create_raw_event() with device_id
    ML->>Server: POST metrics with device_id

greptile-apps Bot left a comment

5 files reviewed, 5 comments

Comment thread helix-cli/src/metrics_sender.rs Outdated
Comment on lines +428 to +432
    get_machine_id()
        .map(|id| hash_to_device_id(&id))
        .map(|s| -> &'static str { s.leak() })
}

This function causes a memory leak every time it's called because .leak() is called on a newly created String. This function is called repeatedly (lines 49, 66, 254, 287, 315, 343, 371, 400) - each call permanently leaks memory.

Solution: Use a LazyLock to compute the device ID once and cache it:

Suggested change
-     get_machine_id()
-         .map(|id| hash_to_device_id(&id))
-         .map(|s| -> &'static str { s.leak() })
- }
+ /// Get a deterministic device ID derived from the machine's unique identifier.
+ /// This ID is stable across CLI reinstalls and file deletions.
+ pub fn get_device_id() -> Option<&'static str> {
+     use std::sync::LazyLock;
+     static DEVICE_ID: LazyLock<Option<&'static str>> = LazyLock::new(|| {
+         get_machine_id()
+             .map(|id| hash_to_device_id(&id))
+             .map(|s| -> &'static str { s.leak() })
+     });
+     *DEVICE_ID
+ }

This ensures the leak only happens once during initialization, not on every call.

Comment on lines +553 to +569
        );
    }

    #[test]
    fn test_get_device_id_is_deterministic() {
        // Get device ID twice - should be the same
        let device_id1 = get_device_id();
        let device_id2 = get_device_id();

        #[cfg(any(target_os = "macos", target_os = "linux", target_os = "windows"))]
        {
            assert!(device_id1.is_some(), "Device ID should be available");
            assert!(device_id2.is_some(), "Device ID should be available");
            assert_eq!(
                device_id1.unwrap(),
                device_id2.unwrap(),
                "Device ID should be deterministic"
This test calls get_device_id() twice (lines 555-556), which currently causes two memory leaks due to the bug in get_device_id(). Once that function is fixed with a LazyLock pattern, the test will work correctly without leaking memory.

Additionally, the assertion on lines 562-565 compares the contents of two separately leaked strings, not their addresses. After the fix, both calls will return the same cached static reference, so the test will still pass correctly.
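For readers unfamiliar with the distinction, the sketch below contrasts content equality with allocation identity for `&'static str` values. All three helper names are hypothetical and exist only for this illustration.

```rust
/// Leak a fresh heap copy of `s`, mimicking what an uncached call does.
fn leak_copy(s: &str) -> &'static str {
    String::from(s).leak()
}

/// Content comparison: this is what `assert_eq!` performs on `&str` values.
fn same_contents(a: &str, b: &str) -> bool {
    a == b
}

/// Identity comparison: true only if both references point at the same
/// bytes (same address and same length).
fn same_allocation(a: &str, b: &str) -> bool {
    std::ptr::eq(a, b)
}
```

Two values produced by separate `leak_copy("abc")` calls satisfy `same_contents` but not `same_allocation`. A test that wants to verify caching, rather than determinism, would need the identity check.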

Comment on lines +474 to +494
    Command::new("reg")
        .args([
            "query",
            r"HKLM\SOFTWARE\Microsoft\Cryptography",
            "/v",
            "MachineGuid",
        ])
        .output()
        .ok()
        .and_then(|output| {
            let stdout = String::from_utf8_lossy(&output.stdout);
            stdout
                .lines()
                .find(|line| line.contains("MachineGuid"))
                .and_then(|line| line.split_whitespace().last())
                .map(|s| s.to_string())
        })
}

#[cfg(not(any(target_os = "macos", target_os = "linux", target_os = "windows")))]
fn get_machine_id() -> Option<String> {
The Windows registry parsing uses .split_whitespace().last() which assumes a specific output format from the reg query command. While this should work for most cases, there's a potential issue if the registry value contains spaces or if the output format changes.

Consider a more robust parsing approach:

The typical output is:

    MachineGuid    REG_SZ    {GUID}

A safer approach would be to:

  1. Find the line containing "MachineGuid"
  2. Split by "REG_SZ" (or other type indicators)
  3. Take the part after the type and trim it

However, since the current parsing should work for the standard reg query output format, this is a minor concern. If the command cannot be run, .ok() maps the error to None, and a missing or malformed line also yields None through find and and_then, so failures do not panic, which is good error handling.
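The three steps above can be sketched as a pure function over the command's stdout. This is one possible shape, not the PR's code: `parse_machine_guid` is a hypothetical name, and the sketch assumes the value type is always `REG_SZ`.

```rust
/// Hypothetical parser for `reg query ... /v MachineGuid` output.
/// Assumes the value is stored as REG_SZ; returns None otherwise.
fn parse_machine_guid(stdout: &str) -> Option<String> {
    stdout
        .lines()
        // Step 1: find the line for the MachineGuid value.
        .find(|line| line.trim_start().starts_with("MachineGuid"))
        // Step 2: split on the type indicator.
        .and_then(|line| line.split_once("REG_SZ"))
        // Step 3: take the part after the type and trim it.
        .map(|(_, value)| value.trim())
        .filter(|value| !value.is_empty())
        .map(str::to_string)
}
```

Unlike `.split_whitespace().last()`, this keeps a value intact even if it contains spaces, and it returns `None` when the expected type marker is absent.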

greptile-apps Bot commented Jan 12, 2026

Additional Comments (2)

helix-cli/src/metrics_sender.rs
The Default implementation calls get_device_id() which has a memory leak issue (see comment on line 428). Since load_metrics_config() returns MetricsConfig::default() when no config file exists (line 153), and load_metrics_config() is called multiple times throughout the code (lines 115, 245, 412, 416), this will cause repeated memory leaks.

Once the get_device_id() function is fixed with a LazyLock pattern, this will be resolved automatically.

helix-cli/src/metrics_sender.rs
This line leaks the entire config file contents into static memory. Since load_metrics_config() is called multiple times (lines 115, 245, 412, 416), this will leak memory on each call.

Solution: Remove the .leak() call since the content doesn't need to be static:

    let content = fs::read_to_string(&config_path)?;

The TOML deserializer doesn't require a static string, it can work with a regular String.
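A minimal sketch of loading a config without leaking. It rests on one assumption: the config struct owns its fields. If the real `MetricsConfig` borrows `&'static str` values from the file contents, dropping `.leak()` alone would not compile and the fields would need to become `String`. `DemoConfig`, `parse_demo_config`, and `load_demo_config` are hypothetical, and the line parser is only a stand-in for a real TOML deserializer.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical, simplified config that owns its data.
#[derive(Debug, Default, PartialEq)]
struct DemoConfig {
    user_id: Option<String>,
}

/// Stand-in for a TOML deserializer: handles only `user_id = "value"` lines.
/// It borrows `content` and copies what it needs into owned fields.
fn parse_demo_config(content: &str) -> DemoConfig {
    let mut config = DemoConfig::default();
    for line in content.lines() {
        if let Some((key, value)) = line.split_once('=') {
            if key.trim() == "user_id" {
                config.user_id = Some(value.trim().trim_matches('"').to_string());
            }
        }
    }
    config
}

/// Read the file into an ordinary String; it is freed when this returns.
fn load_demo_config(path: &Path) -> io::Result<DemoConfig> {
    let content = fs::read_to_string(path)?;
    Ok(parse_demo_config(&content))
}
```

Since the returned struct holds no references into `content`, the buffer can be dropped on every call and nothing accumulates.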

xav-db and others added 3 commits January 27, 2026 09:06
…. The .leak() call now only executes once via LazyLock, ensuring the device ID string is leaked exactly once rather than on every call.