feat: add TOTP two-factor authentication for dashboard login #8189

Open
Raven95676 wants to merge 8 commits into AstrBotDevs:master from Raven95676:feat/totp

@Raven95676 Raven95676 commented May 14, 2026

Adds TOTP.

Modifications

  • This is NOT a breaking change.

Screenshots or Test Results


Checklist

  • 😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.

  • 👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.

  • 🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.

  • 😮 My changes do not introduce malicious code.

Summary by Sourcery

Add TOTP-based two-factor authentication for dashboard login, including configuration UI, trusted device support, recovery codes, and rate limiting for sensitive auth endpoints.

New Features:

  • Introduce TOTP two-factor authentication for dashboard login, including setup, verification, disable endpoints and recovery-code based fallback.
  • Add dashboard UI flows for TOTP login (account, TOTP, and recovery stages) and device trust selection.
  • Provide configuration UI components to enable and manage dashboard TOTP, including viewing/rotating secrets and recovery codes.

Enhancements:

  • Extend dashboard authentication logic to support TOTP verification, recovery codes, and trusted device cookies while preserving existing password checks.
  • Add a new database model to persist trusted dashboard device tokens and revoke them on password or TOTP changes.
  • Introduce centralized error helper in the auth route and improve login error responses for TOTP-related failures.
  • Implement in-memory rate limiting for sensitive auth endpoints in the dashboard server middleware to mitigate brute-force attempts.

Build:

  • Add the pyotp dependency to project configuration for TOTP generation and verification.

@dosubot (bot) added labels on May 14, 2026: size:XXL (This PR changes 1000+ lines, ignoring generated files.) and area:webui (The bug / feature is about webui(dashboard) of astrbot.)

@sourcery-ai sourcery-ai Bot left a comment

Hey - I've found 3 issues, and left some high level feedback:

  • The in-memory _last_totp_timecode map in totp.py is never pruned and is keyed by secret forever; consider clearing entries when TOTP is disabled/rotated or periodically pruning to avoid unbounded growth over long runtimes.
  • The _AuthRateLimiter instances are keyed only by endpoint path, so all clients share the same small bucket for /api/auth/login and TOTP routes; if this is not intentional, consider including a client identifier (e.g., IP or user) in the key to prevent one user’s activity from throttling others.
  • The logic that disables TOTP and clears dashboard.totp (in totp_disable and in the recovery-code branch of login) is duplicated; extracting this into a small helper (e.g., disable_totp(config, db)) would reduce the risk of future drift between the two paths.
Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location path="astrbot/dashboard/server.py" line_range="43-48" />
<code_context>

 # Static assets shipped inside the wheel (built during `hatch build`).
 _BUNDLED_DIST = Path(__file__).parent / "dist"
+_RATE_LIMITED_ENDPOINTS: frozenset = frozenset(
+    {
+        "/api/auth/totp/disable",
+        "/api/auth/totp/setup",
+        "/api/auth/login",
+        "/api/auth/totp/verify-setup",
+    }
+)
</code_context>
<issue_to_address>
**🚨 issue (security):** Rate limiter is global per-endpoint and not scoped per client, which can let one client throttle all others.

Because the limiter is keyed only by `request.path` with a single `_AuthRateLimiter` (capacity=3, refill_rate=1.0) per endpoint, one noisy client can exhaust the shared bucket and cause a DoS for all users on that path. Please scope the limiter per client (e.g., using a composite key like `(request.path, request.remote_addr)` or another stable client identifier), and consider whether capacity/refill should be configurable for these auth endpoints.
</issue_to_address>
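
A minimal sketch of the per-client scoping suggested above, assuming a token bucket with the same capacity (3) and refill rate (1.0 per second) that the comment cites. The class name `PerClientRateLimiter`, its `allow` method, and the injectable `clock` are hypothetical names for illustration only; they are not taken from the PR, and the PR's `_AuthRateLimiter` may be structured differently.

```python
import time


class PerClientRateLimiter:
    """Token bucket keyed by (path, client_id) instead of path alone.

    Hypothetical illustration; not the PR's _AuthRateLimiter.
    """

    def __init__(self, capacity: int = 3, refill_rate: float = 1.0, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate  # tokens added per second
        self._clock = clock  # injectable so the behaviour can be tested
        # (path, client_id) -> (tokens remaining, time of last update)
        self._buckets: dict[tuple[str, str], tuple[float, float]] = {}

    def allow(self, path: str, client_id: str) -> bool:
        """Consume one token for this client on this path, if one is available."""
        now = self._clock()
        key = (path, client_id)
        tokens, last = self._buckets.get(key, (float(self.capacity), now))
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        tokens = min(float(self.capacity), tokens + (now - last) * self.refill_rate)
        allowed = tokens >= 1.0
        if allowed:
            tokens -= 1.0
        self._buckets[key] = (tokens, now)
        return allowed
```

With this keying, one client exhausting its bucket on `/api/auth/login` leaves other clients unaffected. Two caveats apply to the sketch: the bucket map itself grows with the number of distinct clients and would need pruning, and behind a reverse proxy the remote address may identify the proxy rather than the real client.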

### Comment 2
<location path="astrbot/core/utils/totp.py" line_range="24-33" />
<code_context>
+_last_totp_timecode: dict[str, int] = {}
</code_context>
<issue_to_address>
**suggestion (performance):** Replay protection map `_last_totp_timecode` can grow unbounded over long uptime or many secrets.

Because `_last_totp_timecode` is keyed by the raw secret and never cleaned up, it can grow indefinitely in long-lived processes or with frequent secret rotation. Consider adding a pruning strategy (e.g., when TOTP is disabled/rotated, or by keeping only the last N secrets) or keying by a stable hash and cleaning up when it changes, to prevent unbounded memory growth while preserving replay protection.

Suggested implementation:

```python
import asyncio
import hashlib
from collections import OrderedDict

# Limit how many secrets we keep in memory for TOTP replay protection.
# This prevents unbounded growth over long uptimes or frequent secret rotation.
MAX_TOTP_REPLAY_ENTRIES = 10_000

# Map of hashed TOTP secrets -> last accepted timecode.
# We key by a stable hash of the secret so rotation to a new secret naturally
# evicts old entries as the map reaches MAX_TOTP_REPLAY_ENTRIES.
_last_totp_timecode: "OrderedDict[str, int]" = OrderedDict()
_totp_replay_lock = asyncio.Lock()


def _totp_replay_key(secret: str) -> str:
    """Return a stable hash key for a TOTP secret suitable for in-memory indexing."""
    # Using SHA-256 avoids keeping the raw secret as the dictionary key and gives
    # us a fixed-size identifier that works well with pruning / rotation.
    return hashlib.sha256(secret.encode("utf-8")).hexdigest()


async def _get_last_totp_timecode(secret: str) -> int | None:
    """Fetch the last accepted TOTP timecode for a secret, if any."""
    async with _totp_replay_lock:
        key = _totp_replay_key(secret)
        return _last_totp_timecode.get(key)


async def _update_last_totp_timecode(secret: str, timecode: int) -> None:
    """Update the last accepted TOTP timecode for a secret, pruning old entries."""
    async with _totp_replay_lock:
        key = _totp_replay_key(secret)

        # Update/move to the end for a simple LRU-like eviction strategy.
        _last_totp_timecode[key] = timecode
        _last_totp_timecode.move_to_end(key)

        # Prune oldest secrets if we exceed the maximum size.
        while len(_last_totp_timecode) > MAX_TOTP_REPLAY_ENTRIES:
            _last_totp_timecode.popitem(last=False)

```

To fully implement this change and avoid unbounded growth:

1. **Imports**
   - At the top of `astrbot/core/utils/totp.py`, add:
     - `from collections import OrderedDict`
     - `import hashlib`
   - If `typing.OrderedDict` is used instead of the runtime class anywhere else, adjust the annotation accordingly:
     - `from collections.abc import MutableMapping` or `from typing import OrderedDict` depending on your typing style.

2. **Replace direct uses of `_last_totp_timecode`**
   - Anywhere in this file where `_last_totp_timecode` is accessed directly with the raw secret as a key, change those usages to call the helpers:
     - Read access:
       - Replace `last = _last_totp_timecode.get(secret)` (or similar) with:
         - `last = await _get_last_totp_timecode(secret)`
     - Write/update access:
       - Replace `_last_totp_timecode[secret] = timecode` or mutations inside an `async with _totp_replay_lock` block with:
         - `await _update_last_totp_timecode(secret, timecode)`
   - If there is existing manual locking around `_last_totp_timecode` using `_totp_replay_lock`, remove the redundant lock usage around call sites of `_get_last_totp_timecode` / `_update_last_totp_timecode`, since the helpers already handle locking.

3. **Type annotations**
   - If your project targets Python versions that do not support `int | None`, replace it with `Optional[int]` and add `from typing import Optional`.

These changes ensure the replay protection map is bounded in size, keyed by a stable hash instead of the raw secret, and automatically prunes old entries while preserving replay protection guarantees for recent/active secrets.
</issue_to_address>

### Comment 3
<location path="dashboard/src/components/shared/DashboardTotpDisableDialog.vue" line_range="45" />
<code_context>
+              color="error"
+              variant="tonal"
+              :loading="verifying"
+              :disabled="!recoveryCode || recoveryCode.length < 5"
+              @click="confirmDisable"
+            >
</code_context>
<issue_to_address>
**suggestion (performance):** Recovery code length check is very loose compared to the actual expected format, which can cause avoidable roundtrips.

`verify_recovery_code` expects a normalized 32‑char code, so enabling submit for very short values will almost always fail and cause unnecessary requests. Consider tightening the client-side check (e.g., minimum length close to the full code including dashes, or a regex) to align with the backend expectation and avoid redundant roundtrips.

Suggested implementation:

```
              color="error"
              variant="tonal"
              :loading="verifying"
              :disabled="!isValidRecoveryCode"
              @click="confirmDisable"
            >

```

To fully implement the suggestion, update the `<script>` section of `DashboardTotpDisableDialog.vue`:

1. Add a computed property (or method, if that’s how validation is done elsewhere) that normalizes and validates the recovery code against the backend expectation (32-char normalized code):

```ts
computed: {
  // ...
  isValidRecoveryCode(): boolean {
    if (!this.recoveryCode) return false
    // strip non-alphanumeric chars (e.g., dashes, spaces)
    const normalized = this.recoveryCode.replace(/[^A-Za-z0-9]/g, '')
    // backend expects a normalized 32-char code
    return normalized.length === 32
  },
},
```

2. If this component uses the `<script setup>` syntax instead of the options API, define it as a `computed`:

```ts
const isValidRecoveryCode = computed(() => {
  if (!recoveryCode.value) return false
  const normalized = recoveryCode.value.replace(/[^A-Za-z0-9]/g, '')
  return normalized.length === 32
})
```

3. Ensure `recoveryCode` is a `ref`/data property already present; otherwise, keep using the existing binding but feed it into this validation logic.

This will tighten the client-side check so the button only enables when the recovery code is very likely to pass `verify_recovery_code`, avoiding unnecessary roundtrips.
</issue_to_address>


Comment thread astrbot/dashboard/server.py
Comment on lines +24 to +33
_last_totp_timecode: dict[str, int] = {}
_totp_replay_lock = asyncio.Lock()


def _get_totp_config(config) -> dict:
    totp_config = config.get("dashboard", {}).get("totp", {})
    return totp_config if isinstance(totp_config, dict) else {}


def is_totp_enabled(config) -> bool:

Comment thread dashboard/src/components/shared/DashboardTotpDisableDialog.vue Outdated

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request implements Two-Factor Authentication (TOTP) for the WebUI, including backend verification logic, recovery code generation, and trusted device management. It introduces new authentication endpoints, a multi-stage login interface, and a rate limiter for security. Feedback highlights a potential Denial of Service vulnerability in the rate limiter, which is currently global rather than per-IP. Additionally, a bug was identified where TOTP verification uses naive local time instead of UTC, potentially causing failures on servers with non-UTC timezones.

Comment thread astrbot/dashboard/server.py Outdated
Comment thread astrbot/core/utils/totp.py Outdated
@Raven95676 Raven95676 linked an issue May 14, 2026 that may be closed by this pull request
2 tasks
Add auth_rate_limit config block to dashboard settings with enable
(default: true), average_interval (default: 1.0s), and max_burst
(default: 3) options. The dashboard auth middleware now reads from
config instead of using hardcoded values. The average_interval and
max_burst fields are conditionally shown only when rate limiting is
enabled.
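
As a rough sketch of how the middleware might read these options, the snippet below loads the three fields with the defaults stated in the commit message. The nesting under `dashboard.auth_rate_limit`, the `AuthRateLimitSettings` dataclass, and `load_auth_rate_limit` are assumptions for illustration; the actual config layout and loader in the PR may differ.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class AuthRateLimitSettings:
    """Defaults taken from the commit message above."""

    enable: bool = True
    average_interval: float = 1.0  # seconds between allowed requests, on average
    max_burst: int = 3  # requests allowed back-to-back


def load_auth_rate_limit(config: Mapping[str, Any]) -> AuthRateLimitSettings:
    """Read dashboard.auth_rate_limit (assumed path), falling back to defaults."""
    dashboard = config.get("dashboard", {})
    raw = dashboard.get("auth_rate_limit", {}) if isinstance(dashboard, Mapping) else {}
    if not isinstance(raw, Mapping):
        raw = {}  # tolerate a malformed block by using defaults
    defaults = AuthRateLimitSettings()
    return AuthRateLimitSettings(
        enable=bool(raw.get("enable", defaults.enable)),
        average_interval=float(raw.get("average_interval", defaults.average_interval)),
        max_burst=int(raw.get("max_burst", defaults.max_burst)),
    )
```

For example, `load_auth_rate_limit({})` yields the defaults, and a config that sets only `max_burst` keeps the default `enable` and `average_interval`.
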
@Sjshi763

I've wanted this for a long time! I was even wondering whether to open an issue. I'm curious, though: how is it implemented?

Development

Successfully merging this pull request may close these issues.

[Feature] Add verification code support to the login page

2 participants