
Introduce API Tokens with cluster_permissions and index_permissions directly associated with the token #5443

Open
cwperks wants to merge 61 commits into opensearch-project:main from cwperks:feature/api-tokens-cwperx

Conversation


@cwperks cwperks commented Jun 25, 2025

Description

Re-basing #5225 with the latest changes from main.

This PR introduces API Tokens — a new capability in the Security plugin that allows security admins to issue long-lived, scoped tokens and associate permissions directly with the token.

How it works

API Tokens are opaque tokens with the format os_<random>. When a token is created, a SHA-256 hash of the plaintext token is stored in a system index, .opensearch_security_api_tokens. The plaintext token is returned once at creation time and never stored. On each request, the incoming token is hashed and looked up in an in-memory cache populated from the index.
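The lookup-key derivation described above can be sketched as follows. This is a minimal illustration, not the plugin's actual code; the class and method names (`TokenHasher`, `sha256Hex`) are hypothetical:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

// Illustrative sketch: hash the plaintext token and use the hex digest as the
// key for the system-index document and the in-memory cache. The plaintext
// itself is shown once at creation time and never persisted.
public class TokenHasher {
    public static String sha256Hex(String plaintextToken) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(plaintextToken.getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(hash); // 64 lowercase hex characters
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }

    public static void main(String[] args) {
        // Only this digest would ever be stored or logged.
        System.out.println(TokenHasher.sha256Hex("os_example-token"));
    }
}
```

Because the same plaintext always hashes to the same digest, an incoming token can be hashed and matched against the cache without any reversible storage.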

Tokens are authenticated via the Authorization: ApiKey <token> header.

What is novel about this approach compared to OBO tokens is that permissions are scoped directly to the token rather than derived from the issuing user's roles. An admin can issue a token with only the permissions it needs — for example, read-only access to a single index — regardless of the admin's own permissions. This enforces the principle of least privilege and is a key building block toward deprecating Roles Injection, the mechanism plugins currently use to run async jobs with user-scoped permissions.

Revocation model

Tokens use a soft-delete revocation model. When a token is revoked via DELETE /_plugins/_security/api/apitokens/{id}, the document in the index is updated with a revoked_at timestamp rather than being deleted. This means:

  • Revoked tokens remain visible in the list endpoint with a revoked_at field, enabling UIs to display revocation history and audit trails.
  • Revocation is synchronous — the cache refresh is broadcast to all nodes and confirmed before the response is returned, so the token is immediately unusable cluster-wide.
  • During cache reload, tokens with revoked_at set are excluded from the in-memory authentication maps, so they cannot be used to authenticate.
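The reload behavior in the last bullet can be sketched like this. It is a simplified illustration with hypothetical names (`RevocationFilter`, `TokenEntry`), not the plugin's real cache code:

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Illustrative sketch of the soft-delete model: the index retains revoked
// documents (so the list endpoint can show revoked_at), but a cache reload
// drops any entry with revoked_at set from the authentication map.
public class RevocationFilter {
    public record TokenEntry(String name, Long revokedAt) {
        public boolean isRevoked() {
            return revokedAt != null;
        }
    }

    public static Map<String, TokenEntry> activeByName(List<TokenEntry> fromIndex) {
        return fromIndex.stream()
            .filter(entry -> !entry.isRevoked())       // revoked tokens cannot authenticate
            .collect(Collectors.toMap(TokenEntry::name, Function.identity()));
    }

    public static void main(String[] args) {
        var active = activeByName(List.of(
            new TokenEntry("my-token", null),
            new TokenEntry("old-token", 1741500000000L)));
        System.out.println(active.keySet()); // only the non-revoked token remains
    }
}
```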

Token Identity & Naming

Token names serve as the user-facing identity. They must be unique and match [a-zA-Z0-9_-]+. In audit logs and internal user contexts, tokens appear as token:<name> (e.g., token:my-service-token). The SHA-256 hash used for authentication lookup is internal-only and never exposed in API responses or audit logs.
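The naming rule and audit-log identity stated above can be expressed as a small sketch (class and method names are illustrative, not from the plugin):

```java
import java.util.regex.Pattern;

// Illustrative sketch of the documented naming rule: names must match
// [a-zA-Z0-9_-]+ and appear in audit logs as token:<name>.
public class TokenNames {
    private static final Pattern VALID_NAME = Pattern.compile("[a-zA-Z0-9_-]+");

    public static boolean isValidName(String name) {
        return name != null && VALID_NAME.matcher(name).matches();
    }

    public static String auditIdentity(String name) {
        return "token:" + name; // the SHA-256 hash is never exposed here
    }

    public static void main(String[] args) {
        System.out.println(isValidName("my-service-token")); // valid
        System.out.println(isValidName("bad name!"));        // rejected
        System.out.println(auditIdentity("my-service-token"));
    }
}
```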

System Index Protection

API tokens are denied access to all system indices regardless of their granted permissions. This is enforced at the privilege evaluation layer via SpecialIndexProtection and applies to both legacy and V4 privilege evaluation modes.
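Conceptually, the rule above is a hard deny that runs before granted permissions are consulted. The sketch below illustrates that ordering only; it uses exact index names where the real evaluator resolves patterns, and all names (`SystemIndexGuard`, `isAllowed`) are hypothetical:

```java
import java.util.Set;
import java.util.function.Predicate;

// Illustrative sketch: for API-token subjects, a system index is denied
// unconditionally, even if the token's index_permissions would otherwise match.
public class SystemIndexGuard {
    public static boolean isAllowed(String index, Set<String> grantedIndices,
                                    Predicate<String> isSystemIndex) {
        if (isSystemIndex.test(index)) {
            return false; // hard deny regardless of grants
        }
        return grantedIndices.contains(index); // simplified: real evaluation resolves patterns
    }

    public static void main(String[] args) {
        Predicate<String> sys = name -> name.startsWith(".");
        System.out.println(isAllowed(".opensearch_security_api_tokens",
            Set.of(".opensearch_security_api_tokens"), sys)); // denied despite grant
        System.out.println(isAllowed("logs-2025", Set.of("logs-2025"), sys));
    }
}
```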


API Reference

Create API Token

POST /_plugins/_security/api/apitokens

Request:

{
  "name": "my-token",
  "cluster_permissions": ["cluster:monitor/health"],
  "index_permissions": [
    {
      "index_pattern": ["logs-*"],
      "allowed_actions": ["indices:data/read/search"]
    }
  ],
  "expiration": 1800000
}

Response:

{
  "id": "Nd_pMRWeAC93ZGMhRa5CxX",
  "token": "os_abc123..."
}

The id is used to manage the token, such as listing or revoking it. The plaintext token is returned once and never stored — save it immediately.

List API Tokens

GET /_plugins/_security/api/apitokens

Returns all tokens, including revoked ones. Revoked tokens include a revoked_at field (epoch millis).

Response:

[
  {
    "id": "Nd_pMRWeAC93ZGMhRa5CxX",
    "name": "my-token",
    "iat": 1742000000000,
    "expiration": 1800000,
    "cluster_permissions": ["cluster:monitor/health"],
    "index_permissions": [
      {
        "index_pattern": ["logs-*"],
        "allowed_actions": ["indices:data/read/search"]
      }
    ]
  },
  {
    "id": "Xf_qNSZeBC04AHNiSb6DyY",
    "name": "old-token",
    "iat": 1741000000000,
    "expiration": 1800000,
    "revoked_at": 1741500000000,
    "cluster_permissions": ["cluster:monitor/health"],
    "index_permissions": []
  }
]

Revoke API Token

DELETE /_plugins/_security/api/apitokens/{id}

Response:

{
  "message": "Token Nd_pMRWeAC93ZGMhRa5CxX revoked successfully."
}

Revocation is a soft-delete — the token metadata is retained with a revoked_at timestamp. The token is immediately unusable after the response is returned. The cache refresh is broadcast synchronously to all nodes before the response is sent.

Using a Token

Pass the token in the Authorization header using the ApiKey scheme:

Authorization: ApiKey os_abc123...
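Client-side, attaching the header looks like the following sketch using the JDK's built-in HTTP client. The base URL and token value are placeholders, and the request is only built here, not sent:

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Illustrative sketch: build a search request that authenticates with an
// API token via the ApiKey scheme described above.
public class ApiKeyClient {
    public static HttpRequest searchRequest(String baseUrl, String token) {
        return HttpRequest.newBuilder()
            .uri(URI.create(baseUrl + "/logs-2025/_search"))
            .header("Authorization", "ApiKey " + token)
            .GET()
            .build();
    }

    public static void main(String[] args) {
        HttpRequest request = searchRequest("https://localhost:9200", "os_abc123");
        System.out.println(request.headers().firstValue("Authorization").orElse(""));
    }
}
```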

Example — search a permitted index:

GET /logs-2025/_search
Authorization: ApiKey os_abc123...

Response:

{
  "hits": {
    "total": { "value": 3, "relation": "eq" },
    "hits": [ ... ]
  }
}

Example — attempt a forbidden action:

DELETE /logs-2025
Authorization: ApiKey os_abc123...

Response:

{
  "error": {
    "type": "security_exception",
    "reason": "no permissions for [indices:admin/delete]"
  },
  "status": 403
}

Issues Resolved

Partially resolves #4009, limited to security admins in the initial release.

Check List

  • New functionality includes testing
  • New functionality has been documented
  • New Roles/Permissions have a corresponding security dashboards plugin PR
  • API changes companion pull request created
  • Commits are signed per the DCO using --signoff
  • V4 (nextgen) privilege evaluation mode supported and tested

derek-ho and others added 23 commits November 14, 2024 10:47
…00 tokens outstanding (opensearch-project#5147)

Signed-off-by: Derek Ho <dxho@amazon.com>
Signed-off-by: Craig Perkins <cwperx@amazon.com>

github-actions Bot commented May 4, 2026

PR Code Suggestions ✨

Latest suggestions up to 42ecbe0

Explore these optional code suggestions:

Possible issue
Token count includes revoked tokens incorrectly

The expiration value stored is an absolute epoch millisecond timestamp
(absoluteExpiration = Instant.now().toEpochMilli() + requestedExpiration), but the
TokenEntry.isExpired() method compares Instant.now().toEpochMilli() > expiration.
When requestedExpiration is 0 (no expiration), absoluteExpiration is set to 0, and
isExpired() returns false because expiration > 0 is false — this is correct.
However, the getTokenCount method counts all tokens including revoked ones from the
index, which may inflate the count and incorrectly block new token creation. The
getTokenCount should only count active (non-revoked) tokens.

src/main/java/org/opensearch/security/action/apitokens/ApiTokenAction.java [229-231]

-long absoluteExpiration = 0;
+public void getTokenCount(ActionListener<Long> listener) {
+    getApiTokens(ActionListener.wrap(
+        tokens -> listener.onResponse(tokens.values().stream().filter(t -> !t.isRevoked()).count()),
+        listener::onFailure
+    ));
+}
 
-if (requestedExpiration != 0) {
-    if (requestedExpiration < 0) {
-        sendErrorResponse(channel, RestStatus.BAD_REQUEST, "Token expiration duration must be positive.");
-        return;
-    }
-    long requestedExpirationSeconds = requestedExpiration / 1000;
-    if (maxExpirationSeconds > 0 && requestedExpirationSeconds > maxExpirationSeconds) {
-
Suggestion importance[1-10]: 7

Why: This is a valid bug where getTokenCount counts revoked tokens, potentially blocking new token creation prematurely. The improved code correctly filters out revoked tokens before counting, and the fix is accurate and directly addresses the issue.

Impact: Medium
Token name uniqueness check is cache-only

The tokenNameExists check only looks at the in-memory cache, which may not reflect
tokens that were created on other nodes or before the last reload. This can allow
duplicate token names to be created in a distributed cluster. Consider performing a
search against the index to check for name uniqueness, or at minimum document this
limitation clearly.

src/main/java/org/opensearch/security/action/apitokens/ApiTokenRepository.java [163-165]

 public boolean tokenNameExists(String name) {
+    // NOTE: This only checks the local in-memory cache. In a multi-node cluster,
+    // tokens created on other nodes may not yet be reflected here.
     return tokenCache.values().stream().anyMatch(entry -> name.equals(entry.name()));
 }
Suggestion importance[1-10]: 5

Why: The suggestion correctly identifies that tokenNameExists only checks the local in-memory cache, which could allow duplicate names in a distributed cluster. However, the 'improved_code' only adds a comment without actually fixing the issue, making it a documentation-only change with limited impact.

Impact: Low
Hard limit silently drops tokens beyond 10,000

The search is hard-capped at 10,000 results. If more than 10,000 tokens exist, the
excess tokens will be silently ignored during reload, leaving valid tokens
inaccessible and stale tokens in the cache. Consider using the Scroll or Search
After API for paginated retrieval, or at minimum log a warning when the result count
hits the limit.

src/main/java/org/opensearch/security/action/apitokens/ApiTokenIndexHandler.java [102]

-searchRequest.source(new SearchSourceBuilder().size(10_000));
+SearchSourceBuilder sourceBuilder = new SearchSourceBuilder().size(10_000);
+searchRequest.source(sourceBuilder);
+// TODO: Replace with scroll/search_after pagination to handle > 10,000 tokens
Suggestion importance[1-10]: 5

Why: The 10,000 hard cap on search results is a real scalability concern that could silently cause tokens to be inaccessible. However, the 'improved_code' only adds a TODO comment without actually implementing pagination, so it's essentially the same as the existing code with a comment added.

Impact: Low
Guard against negative expiry duration

The expiresInSeconds is computed at construction time using Instant.now(), which
means the value will drift over time and may be negative if the token is already
expired. Consider computing this value lazily or storing the expiry instant and
computing the duration on demand to ensure accuracy.

src/main/java/org/opensearch/security/authtoken/jwt/ExpiringBearerAuthToken.java [35]

-this.expiresInSeconds = Duration.between(Instant.now(), expiry.toInstant()).getSeconds();
+long seconds = Duration.between(Instant.now(), expiry.toInstant()).getSeconds();
+this.expiresInSeconds = Math.max(0, seconds);
Suggestion importance[1-10]: 5

Why: The expiresInSeconds could be negative if the token is already expired at construction time. Adding Math.max(0, seconds) prevents negative values, though the practical impact depends on how expiresInSeconds is used downstream.

Impact: Low
Non-atomic privilege map swap causes race condition

The two-step update (putAll then retainAll) on a ConcurrentHashMap is not atomic.
Between the two operations, a request could see a mix of old and new privileges —
for example, a revoked token's privileges could still be present while a new token's
privileges are already added. Consider replacing the map reference atomically using
an AtomicReference instead.

src/main/java/org/opensearch/security/privileges/PrivilegesConfiguration.java [116-118]

-// Atomic swap: add/update new entries, then remove stale ones
-tokenIdToActionPrivileges.putAll(newTokenPrivileges);
-tokenIdToActionPrivileges.keySet().retainAll(newTokenPrivileges.keySet());
+// Use atomic reference swap to avoid race conditions between putAll and retainAll
+Map<String, ActionPrivileges> snapshot = new ConcurrentHashMap<>(newTokenPrivileges);
+tokenIdToActionPrivileges.keySet().retainAll(snapshot.keySet());
+tokenIdToActionPrivileges.putAll(snapshot);
Suggestion importance[1-10]: 4

Why: The race condition between putAll and retainAll on a ConcurrentHashMap is a real concern, but the 'improved_code' only reorders the operations (retainAll before putAll) rather than using an AtomicReference as suggested in the description. The reordering doesn't actually solve the atomicity problem and could introduce different issues.

Impact: Low
General
Preserve interrupt status correctly in test utility

After calling Thread.currentThread().interrupt() to restore the interrupt flag,
calling fail() will throw an AssertionError, which may suppress the interrupt status
and make it harder to diagnose. Consider throwing an explicit exception or letting
the interrupt propagate before failing.

src/test/java/org/opensearch/security/util/ActionListenerUtils.java [60-69]

 void waitForCompletion() {
     try {
         if (!latch.await(5, TimeUnit.SECONDS)) {
             fail("Test timed out waiting for response");
         }
     } catch (InterruptedException e) {
         Thread.currentThread().interrupt();
-        fail("Test interrupted: " + e.getMessage());
+        throw new RuntimeException("Test interrupted", e);
     }
 }
Suggestion importance[1-10]: 4

Why: Throwing a RuntimeException after restoring the interrupt flag is cleaner than calling fail(), which throws AssertionError and may obscure the interrupt. This is a valid improvement for test reliability, though it's in test utility code with limited production impact.

Impact: Low
Eliminate duplicate configuration repository lookups

Both methods call
configurationRepository.getConfiguration(CType.CONFIG).getCEntry(CType.CONFIG.name())
independently, resulting in duplicate repository lookups. Extract the config
retrieval into a shared helper method to avoid redundant calls and improve
maintainability.

src/main/java/org/opensearch/security/rest/DashboardsInfoAction.java [197-211]

+private ConfigV7 getGeneralConfig() {
+    return configurationRepository.getConfiguration(CType.CONFIG).getCEntry(CType.CONFIG.name());
+}
+
 private boolean getApiTokensEnabled() {
-    ConfigV7 generalConfig = configurationRepository.getConfiguration(CType.CONFIG).getCEntry(CType.CONFIG.name());
+    ConfigV7 generalConfig = getGeneralConfig();
     if (generalConfig != null && generalConfig.dynamic != null && generalConfig.dynamic.api_tokens != null) {
         return Boolean.TRUE.equals(generalConfig.dynamic.api_tokens.getEnabled());
     }
     return false;
 }
 
 private long getMaxTokenExpirationSeconds() {
-    ConfigV7 generalConfig = configurationRepository.getConfiguration(CType.CONFIG).getCEntry(CType.CONFIG.name());
+    ConfigV7 generalConfig = getGeneralConfig();
     if (generalConfig != null && generalConfig.dynamic != null && generalConfig.dynamic.api_tokens != null) {
         return generalConfig.dynamic.api_tokens.getMaxExpirationSeconds();
     }
     return 0;
 }
Suggestion importance[1-10]: 4

Why: Both getApiTokensEnabled() and getMaxTokenExpirationSeconds() independently call configurationRepository.getConfiguration(CType.CONFIG).getCEntry(CType.CONFIG.name()), which is redundant. Extracting a shared helper improves maintainability and reduces duplicate repository calls.

Impact: Low
Clarify misleading comment on restricted endpoint logic

The comment says "Don't want to allow any api token access" but the method is
isAccessToRestrictedEndpoints, so returning true means the endpoint is restricted
regardless of the HTTP method. This blocks all HTTP methods (GET, DELETE, etc.)
uniformly, which may be intentional, but the comment is misleading and could cause
confusion. Clarify the comment to accurately describe the intent.

src/main/java/org/opensearch/security/util/AuthTokenUtils.java [34-36]

 case API_TOKEN_SUFFIX:
-    // Don't want to allow any api token access
+    // API token management endpoints are always restricted (all HTTP methods)
     return true;
Suggestion importance[1-10]: 2

Why: This is purely a comment improvement with no functional change. The suggestion only updates the comment text, which has minimal impact on correctness or functionality.

Impact: Low

Previous suggestions

Suggestions up to commit 67105b5
Possible issue
Fix thread context lifecycle in async REST handlers

The ThreadContext.StoredContext is closed at the end of doPrepareRequest, but the
returned RestChannelConsumer lambdas (e.g., handlePost, handleDelete, handleGet)
execute asynchronously after the context is already restored. This means the
transient user set via putTransient will no longer be present when the lambda
actually runs, potentially causing authorization failures or NPEs.

src/main/java/org/opensearch/security/action/apitokens/ApiTokenAction.java [131-144]

 RestChannelConsumer doPrepareRequest(RestRequest request, NodeClient client) {
     final var originalUserAndRemoteAddress = Utils.userAndRemoteAddressFrom(client.threadPool().getThreadContext());
-    try (final ThreadContext.StoredContext ctx = client.threadPool().getThreadContext().stashContext()) {
-        client.threadPool()
-            .getThreadContext()
-            .putTransient(ConfigConstants.OPENDISTRO_SECURITY_USER, originalUserAndRemoteAddress.getLeft());
-        return switch (request.method()) {
-            case POST -> handlePost(request, client);
-            case DELETE -> handleDelete(request, client);
-            case GET -> handleGet(request, client);
-            default -> throw new IllegalArgumentException(request.method() + " not supported");
+    // Store the user for use in the channel consumer
+    final User originalUser = originalUserAndRemoteAddress.getLeft();
+    return switch (request.method()) {
+        case POST -> channel -> {
+            try (final ThreadContext.StoredContext ctx = client.threadPool().getThreadContext().stashContext()) {
+                client.threadPool().getThreadContext().putTransient(ConfigConstants.OPENDISTRO_SECURITY_USER, originalUser);
+                handlePost(request, client).accept(channel);
+            }
         };
-    }
+        case DELETE -> channel -> {
+            try (final ThreadContext.StoredContext ctx = client.threadPool().getThreadContext().stashContext()) {
+                client.threadPool().getThreadContext().putTransient(ConfigConstants.OPENDISTRO_SECURITY_USER, originalUser);
+                handleDelete(request, client).accept(channel);
+            }
+        };
+        case GET -> channel -> {
+            try (final ThreadContext.StoredContext ctx = client.threadPool().getThreadContext().stashContext()) {
+                client.threadPool().getThreadContext().putTransient(ConfigConstants.OPENDISTRO_SECURITY_USER, originalUser);
+                handleGet(request, client).accept(channel);
+            }
+        };
+        default -> throw new IllegalArgumentException(request.method() + " not supported");
+    };
 }
Suggestion importance[1-10]: 8

Why: The StoredContext is closed before the returned RestChannelConsumer lambda executes asynchronously, meaning the transient user set via putTransient is no longer available when the handler actually runs. This is a real concurrency/lifecycle bug that could cause authorization failures.

Impact: Medium
Prevent race condition during token privileges map update

The tokenIdToActionPrivileges.clear() followed by forEachToken puts is not atomic.
Between the clear() and the subsequent put calls, concurrent requests may see an
empty map and incorrectly deny access to all API token users. Consider building a
new map and replacing the reference atomically, or using a read-write lock.

src/main/java/org/opensearch/security/privileges/PrivilegesConfiguration.java [94-115]

 apiTokenRepository.subscribeOnChange(() -> {
     SecurityDynamicConfiguration<ActionGroupsV7> actionGroupsConfiguration = configurationRepository.getConfiguration(
         CType.ACTIONGROUPS
     );
     FlattenedActionGroups flattenedActionGroups = new FlattenedActionGroups(actionGroupsConfiguration.withStaticConfig());
-    tokenIdToActionPrivileges.clear();
+    Map<String, ActionPrivileges> newMap = new ConcurrentHashMap<>();
     apiTokenRepository.forEachToken(
-        (tokenId, role) -> tokenIdToActionPrivileges.put(
+        (tokenId, role) -> newMap.put(
             tokenId,
-            ...
+            new SubjectBasedActionPrivileges(
+                role,
+                flattenedActionGroups,
+                new RuntimeOptimizedActionPrivileges.SpecialIndexProtection(
+                    specialIndices::isSystemIndex,
+                    specialIndices::isSystemIndex,
+                    RuntimeOptimizedActionPrivileges.SpecialIndexProtection.IndicesNeedingSpecialRoles.DISABLED
+                ),
+                false
+            )
         )
     );
+    tokenIdToActionPrivileges.clear();
+    tokenIdToActionPrivileges.putAll(newMap);
 });
Suggestion importance[1-10]: 7

Why: The clear() followed by putAll pattern creates a window where concurrent requests see an empty tokenIdToActionPrivileges map, potentially denying access to all API token users. Building a new map and swapping atomically is a valid fix for this race condition.

Impact: Medium
Prevent early exit when notifying multiple listeners

Throwing an exception inside the loop will prevent remaining listeners from being
notified. If one listener fails, subsequent listeners are silently skipped. Consider
logging the error and continuing to notify all listeners, or collecting exceptions
and re-throwing after all listeners have been called.

src/main/java/org/opensearch/security/action/apitokens/ApiTokenRepository.java [122-132]

 public synchronized void notifyAboutChanges() {
     for (TokenListener listener : tokenListener) {
         try {
             log.debug("Notify {} listener about change", listener);
             listener.onChange();
         } catch (Exception e) {
-            log.error("{} listener errored: " + e, listener, e);
-            throw ExceptionsHelper.convertToOpenSearchException(e);
+            log.error("{} listener errored: {}", listener, e.getMessage(), e);
         }
     }
 }
Suggestion importance[1-10]: 6

Why: Throwing an exception inside the loop causes remaining listeners to be skipped. The improved code logs the error and continues, which is more robust for multi-listener notification patterns.

Impact: Low
Guard against negative expiration duration

If expiry is in the past, Duration.between will return a negative value, which could
cause unexpected behavior. You should guard against this by ensuring
expiresInSeconds is not negative, or throw an exception if the expiry is already
past.

src/main/java/org/opensearch/security/authtoken/jwt/ExpiringBearerAuthToken.java [35]

-this.expiresInSeconds = Duration.between(Instant.now(), expiry.toInstant()).getSeconds();
+this.expiresInSeconds = Math.max(0, Duration.between(Instant.now(), expiry.toInstant()).getSeconds());
Suggestion importance[1-10]: 5

Why: If expiry is in the past, Duration.between returns a negative value for expiresInSeconds, which could cause unexpected behavior. Using Math.max(0, ...) is a reasonable defensive fix, though the impact depends on how expiresInSeconds is used downstream.

Impact: Low
Guard against null security index in matcher

The securityIndex variable might be null if not configured, which could cause
WildcardMatcher.from to behave unexpectedly or throw a NullPointerException. Ensure
securityIndex is validated before being passed to WildcardMatcher.from.

src/main/java/org/opensearch/security/compliance/ComplianceConfig.java [177]

-this.securityIndicesMatcher = WildcardMatcher.from(securityIndex, ConfigConstants.OPENSEARCH_API_TOKENS_INDEX);
+this.securityIndicesMatcher = WildcardMatcher.from(
+    securityIndex != null ? securityIndex : "",
+    ConfigConstants.OPENSEARCH_API_TOKENS_INDEX
+);
Suggestion importance[1-10]: 3

Why: The suggestion raises a valid concern about securityIndex potentially being null, but securityIndex is likely validated elsewhere in the codebase before reaching this constructor. The risk is low and the fix may be unnecessary depending on the broader context.

Impact: Low
General
Handle pagination for large token sets

Using a fixed size of 10,000 for the search query will silently truncate results if
there are more than 10,000 tokens, leading to incomplete cache population. Consider
using the scroll API or search_after pagination to handle large numbers of tokens,
or at minimum log a warning when the result count reaches the limit.

src/main/java/org/opensearch/security/action/apitokens/ApiTokenIndexHandler.java [102]

-searchRequest.source(new SearchSourceBuilder().size(10_000));
+SearchSourceBuilder sourceBuilder = new SearchSourceBuilder().size(10_000);
+searchRequest.source(sourceBuilder);
+// After receiving response, check if total hits exceed the page size and log a warning
Suggestion importance[1-10]: 4

Why: The fixed size of 10,000 could silently truncate results, but the improved_code doesn't actually implement pagination - it only restructures the same code and adds a comment. The suggestion identifies a real concern but the fix is incomplete.

Impact: Low
Fix inconsistency between stored and displayed max tokens

The getMaxTokens() method silently caps the value at 1000 without reflecting the
actual configured value, which could be confusing. Additionally, the toString()
method uses the raw maxTokens field instead of the capped value, creating an
inconsistency between what is displayed and what is enforced.

src/main/java/org/opensearch/security/securityconf/impl/v7/ConfigV7.java [519-521]

 public int getMaxTokens() {
     return Math.min(maxTokens, 1000);
+}
 
+@Override
+public String toString() {
+    return "ApiTokenSettings [ enabled="
+        + enabled
+        + ", max_tokens="
+        + getMaxTokens()
+        + ", max_expiration_seconds="
+        + maxExpirationSeconds
+        + "]";
+}
+
Suggestion importance[1-10]: 4

Why: The toString() method uses the raw maxTokens field while getMaxTokens() returns the capped value, creating a minor inconsistency. Using getMaxTokens() in toString() would make the displayed value consistent with the enforced value.

Impact: Low
Use AtomicReference for thread-safe map updates

The tokenIdToActionPrivileges field is declared volatile but is initialized once in
the constructor and never updated afterwards in the visible diff. If this map is
meant to be updated dynamically (e.g., when tokens change), reads and writes to the
map's contents are not thread-safe with just volatile. Consider using
AtomicReference<Map<String, ActionPrivileges>> for safe atomic updates, similar to
how actionPrivileges is handled.

src/main/java/org/opensearch/security/privileges/actionlevel/nextgen/PrivilegesEvaluatorImpl.java [131]

-private volatile Map<String, ActionPrivileges> tokenIdToActionPrivileges;
+private final AtomicReference<Map<String, ActionPrivileges>> tokenIdToActionPrivileges = new AtomicReference<>();
Suggestion importance[1-10]: 4

Why: The volatile keyword only guarantees visibility of the reference itself, not the map's contents. If tokenIdToActionPrivileges is updated dynamically, using AtomicReference would be safer, similar to how actionPrivileges is handled. However, if it's only set once in the constructor, the concern is less critical.

Impact: Low
Suggestions up to commit a85a39e
Possible issue
Avoid race condition during privilege map rebuild

The tokenIdToActionPrivileges.clear() followed by individual put calls is not
atomic. A concurrent request between the clear() and the completion of all put()
calls will see an empty or partially populated map, causing valid tokens to be
temporarily denied. Consider building a new map and replacing the reference
atomically using an AtomicReference.

src/main/java/org/opensearch/security/privileges/PrivilegesConfiguration.java [94-115]

 apiTokenRepository.subscribeOnChange(() -> {
     SecurityDynamicConfiguration<ActionGroupsV7> actionGroupsConfiguration = configurationRepository.getConfiguration(
         CType.ACTIONGROUPS
     );
     FlattenedActionGroups flattenedActionGroups = new FlattenedActionGroups(actionGroupsConfiguration.withStaticConfig());
-    tokenIdToActionPrivileges.clear();
+    Map<String, ActionPrivileges> newMap = new ConcurrentHashMap<>();
     apiTokenRepository.forEachToken(
-        (tokenId, role) -> tokenIdToActionPrivileges.put(
+        (tokenId, role) -> newMap.put(
             tokenId,
-            ...
+            new SubjectBasedActionPrivileges(
+                role,
+                flattenedActionGroups,
+                new RuntimeOptimizedActionPrivileges.SpecialIndexProtection(
+                    specialIndices::isSystemIndex,
+                    specialIndices::isSystemIndex,
+                    RuntimeOptimizedActionPrivileges.SpecialIndexProtection.IndicesNeedingSpecialRoles.DISABLED
+                ),
+                false
+            )
         )
     );
+    tokenIdToActionPrivileges.clear();
+    tokenIdToActionPrivileges.putAll(newMap);
 });
Suggestion importance[1-10]: 7

Why: The clear() followed by individual put() calls on tokenIdToActionPrivileges creates a window where concurrent requests see an empty map, causing valid tokens to be temporarily denied. Building a new map and swapping it atomically is a meaningful correctness improvement for a security-sensitive code path.

Impact: Medium
Prevent early exit when notifying multiple listeners

Throwing an exception inside the loop will prevent subsequent listeners from being
notified. If one listener fails, all remaining listeners are silently skipped.
Consider logging the error and continuing to notify all listeners, or collecting
failures and reporting them after the loop completes.

src/main/java/org/opensearch/security/action/apitokens/ApiTokenRepository.java [122-132]

 public synchronized void notifyAboutChanges() {
     for (TokenListener listener : tokenListener) {
         try {
             log.debug("Notify {} listener about change", listener);
             listener.onChange();
         } catch (Exception e) {
             log.error("{} listener errored: " + e, listener, e);
-            throw ExceptionsHelper.convertToOpenSearchException(e);
         }
     }
 }
Suggestion importance[1-10]: 6

__

Why: Throwing an exception inside the loop causes subsequent listeners to be skipped. The improved code removes the throw and only logs the error, ensuring all listeners are notified. This is a valid correctness concern, though the impact depends on how many listeners are registered in practice.

Low
Guard against negative expiration duration

The expiresInSeconds is calculated at construction time using Instant.now(), which
means it captures a snapshot of the remaining time and will become stale as time
passes. If the expiry date is in the past, Duration.between will return a negative
value, which could cause unexpected behavior. Consider validating that the expiry is
in the future before computing the duration.

src/main/java/org/opensearch/security/authtoken/jwt/ExpiringBearerAuthToken.java [35]

-this.expiresInSeconds = Duration.between(Instant.now(), expiry.toInstant()).getSeconds();
+long seconds = Duration.between(Instant.now(), expiry.toInstant()).getSeconds();
+this.expiresInSeconds = Math.max(0, seconds);
Suggestion importance[1-10]: 5


Why: If expiry is in the past, Duration.between returns a negative value for expiresInSeconds, which could cause unexpected behavior. Adding Math.max(0, seconds) is a reasonable defensive guard, though the scenario may be unlikely in practice.
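The clamped computation is easy to verify in isolation with `java.time`; `ExpiryClamp` is a hypothetical helper name, and passing `now` explicitly just makes the behavior testable.

```java
import java.time.Duration;
import java.time.Instant;

public class ExpiryClamp {
    // Whole seconds until expiry, clamped to zero when the expiry is already past.
    public static long remainingSeconds(Instant now, Instant expiry) {
        return Math.max(0, Duration.between(now, expiry).getSeconds());
    }
}
```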

Low
Fix race condition in token name uniqueness check

The tokenNameExists check is performed against the in-memory cache, which is only
eventually consistent with the index. Two concurrent requests with the same token
name could both pass this check before either writes to the index, resulting in
duplicate token names. Consider enforcing uniqueness at the index level (e.g., using
a deterministic document ID based on the token name) to prevent this race condition.

src/main/java/org/opensearch/security/action/apitokens/ApiTokenAction.java [203-206]

+// Consider using a deterministic document ID (e.g., hash of token name) in indexTokenMetadata
+// to enforce uniqueness at the index level and avoid TOCTOU race conditions.
 if (apiTokenRepository.tokenNameExists(tokenName)) {
     sendErrorResponse(channel, RestStatus.BAD_REQUEST, "A token with name '" + tokenName + "' already exists.");
     return;
 }
Suggestion importance[1-10]: 4


Why: The TOCTOU race condition is a real concern, but the improved_code is identical to the existing_code with only a comment added, making it a documentation suggestion rather than an actual fix. The score reflects the valid concern but penalizes the lack of a concrete code change.
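The deterministic-ID idea can be sketched with JDK crypto alone. `TokenDocId` is a hypothetical helper; the point is that two concurrent creates for the same name target the same document ID, so an index write with the create op type (which fails if the ID already exists) closes the check-then-write race.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

public class TokenDocId {
    // Derives a stable document ID from the token name, so duplicate names
    // collide at the index level instead of relying on a cache check.
    public static String docIdFor(String tokenName) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(tokenName.getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(hash);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }
}
```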

Low
General
Handle token lists exceeding search result size limit

Using a fixed page size of 10,000 will silently drop tokens beyond that limit. If
the number of API tokens exceeds 10,000, the cache will be incomplete and valid
tokens may be rejected. Consider using scroll/search_after pagination or enforcing a
hard cap via the maxTokens configuration to ensure all tokens are loaded.

src/main/java/org/opensearch/security/action/apitokens/ApiTokenIndexHandler.java [102]

-searchRequest.source(new SearchSourceBuilder().size(10_000));
+searchRequest.source(new SearchSourceBuilder().size(10_000).trackTotalHits(true));
+// TODO: implement pagination (scroll or search_after) to handle more than 10,000 tokens
Suggestion importance[1-10]: 4


Why: The fixed size(10_000) limit could silently truncate results if tokens exceed that count. However, the improved_code only adds trackTotalHits(true) and a TODO comment without actually implementing pagination, making it a marginal improvement at best.
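The search_after idea reduces to a cursor loop: fetch a page, remember the last item, and pass it back as the cursor for the next request until a short page signals the end. The sketch below abstracts the fetch behind a function so it runs with the JDK alone; `PagedLoader` is hypothetical, and a real implementation would pair `search_after` with a point-in-time and a stable sort key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

public class PagedLoader {
    // Repeatedly fetches pages of at most pageSize items, feeding the last item
    // of each page back in as the cursor for the next fetch.
    public static <T> List<T> loadAll(int pageSize, BiFunction<T, Integer, List<T>> fetchPage) {
        List<T> all = new ArrayList<>();
        T cursor = null;
        while (true) {
            List<T> page = fetchPage.apply(cursor, pageSize);
            all.addAll(page);
            if (page.size() < pageSize) {
                return all; // a short page means there are no more results
            }
            cursor = page.get(page.size() - 1);
        }
    }
}
```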

Low
Ensure token privileges map is properly refreshed

The tokenIdToActionPrivileges field is initialized once from
dynamicDependencies.tokenIdToActionPrivileges() in the constructor but is never
updated afterward, making the volatile keyword ineffective. If the
token-to-privileges mapping can change at runtime (e.g., when tokens are created or
revoked), there should be a mechanism to refresh this map. Consider either making it
an AtomicReference with proper update logic or ensuring it is refreshed via the
updateConfiguration method.

src/main/java/org/opensearch/security/privileges/actionlevel/nextgen/PrivilegesEvaluatorImpl.java [131]

-private volatile Map<String, ActionPrivileges> tokenIdToActionPrivileges;
+private final AtomicReference<Map<String, ActionPrivileges>> tokenIdToActionPrivileges = new AtomicReference<>(Collections.emptyMap());
Suggestion importance[1-10]: 4


Why: The volatile keyword on tokenIdToActionPrivileges is noted as potentially ineffective if the map is never updated after construction. However, the suggestion to switch to AtomicReference with Collections.emptyMap() as default contradicts the constructor initialization from dynamicDependencies, and the full update mechanism is unclear from the diff alone.

Low
Add lower bound validation for max tokens

The getMaxTokens() method silently caps the value at 1000 without any validation or
warning, which could be confusing since the stored maxTokens value may differ from
what is returned. Additionally, there is no lower bound validation — a negative or
zero value for maxTokens would be returned as-is. Consider adding a lower bound
check and/or logging a warning when the configured value exceeds the cap.

src/main/java/org/opensearch/security/securityconf/impl/v7/ConfigV7.java [519-521]

 public int getMaxTokens() {
-    return Math.min(maxTokens, 1000);
+    return Math.min(Math.max(maxTokens, 1), 1000);
 }
Suggestion importance[1-10]: 3

__

Why: Adding a lower bound of 1 for maxTokens is a minor defensive improvement, but negative or zero values are unlikely to be configured in practice and the impact is low.

Low
Restrict internal helper method visibility

The waitForCompletion method has package-private visibility, but assertSuccess and
assertException are public and depend on it. If a subclass or external test utility
in a different package calls assertSuccess or assertException, waitForCompletion
will still be invoked correctly, but the inconsistent visibility is a design issue.
Consider making waitForCompletion private since it is an internal implementation
detail not intended for direct external use.

src/test/java/org/opensearch/security/util/ActionListenerUtils.java [60-69]

-void waitForCompletion() {
+private void waitForCompletion() {
     try {
         if (!latch.await(5, TimeUnit.SECONDS)) {
             fail("Test timed out waiting for response");
         }
     } catch (InterruptedException e) {
         Thread.currentThread().interrupt();
         fail("Test interrupted: " + e.getMessage());
     }
 }
Suggestion importance[1-10]: 3


Why: Making waitForCompletion private is a minor encapsulation improvement for a test utility class. The current package-private visibility is unlikely to cause real issues, but private better communicates intent.

Low
Suggestions up to commit d85f1eb
Category | Suggestion | Impact
Possible issue
Prevent race condition during token privileges map rebuild

The tokenIdToActionPrivileges.clear() followed by individual put calls is not
atomic. During the window between clear() and repopulation, concurrent requests will
find an empty map and may be incorrectly denied access. Consider building a new map
and replacing the reference atomically using an AtomicReference.

src/main/java/org/opensearch/security/privileges/PrivilegesConfiguration.java [94-115]

 apiTokenRepository.subscribeOnChange(() -> {
     SecurityDynamicConfiguration<ActionGroupsV7> actionGroupsConfiguration = configurationRepository.getConfiguration(
         CType.ACTIONGROUPS
     );
     FlattenedActionGroups flattenedActionGroups = new FlattenedActionGroups(actionGroupsConfiguration.withStaticConfig());
-    tokenIdToActionPrivileges.clear();
+    Map<String, ActionPrivileges> newMap = new ConcurrentHashMap<>();
     apiTokenRepository.forEachToken(
-        (tokenId, role) -> tokenIdToActionPrivileges.put(
+        (tokenId, role) -> newMap.put(
             tokenId,
-            ...
+            new SubjectBasedActionPrivileges(
+                role,
+                flattenedActionGroups,
+                new RuntimeOptimizedActionPrivileges.SpecialIndexProtection(
+                    specialIndices::isSystemIndex,
+                    specialIndices::isSystemIndex,
+                    RuntimeOptimizedActionPrivileges.SpecialIndexProtection.IndicesNeedingSpecialRoles.DISABLED
+                ),
+                false
+            )
         )
     );
+    tokenIdToActionPrivileges.clear();
+    tokenIdToActionPrivileges.putAll(newMap);
 });
Suggestion importance[1-10]: 7


Why: The clear-then-repopulate pattern on a ConcurrentHashMap creates a real race window where concurrent requests see an empty map and may be denied. The improved code correctly builds a new map first and then swaps, though using an AtomicReference would be even safer.

Medium
Avoid aborting listener notification on single failure

Throwing an exception inside the loop will prevent remaining listeners from being
notified. If one listener fails, subsequent listeners are silently skipped. Consider
logging the error and continuing to notify all listeners instead of re-throwing.

src/main/java/org/opensearch/security/action/apitokens/ApiTokenRepository.java [119-129]

 public synchronized void notifyAboutChanges() {
     for (TokenListener listener : tokenListener) {
         try {
             log.debug("Notify {} listener about change", listener);
             listener.onChange();
         } catch (Exception e) {
             log.error("{} listener errored: " + e, listener, e);
-            throw ExceptionsHelper.convertToOpenSearchException(e);
         }
     }
 }
Suggestion importance[1-10]: 6


Why: Throwing an exception inside the loop stops remaining listeners from being notified, which is a real correctness issue. The fix (removing the rethrow) is straightforward and accurate.

Low
Guard against negative expiration duration

The expiresInSeconds is calculated at construction time using Instant.now(), which
means it captures a snapshot of the remaining time rather than a dynamic value. If
expiry is in the past, Duration.between will return a negative value, which could
cause unexpected behavior. Consider adding a guard to handle past expiry dates,
e.g., using Math.max(0, ...).

src/main/java/org/opensearch/security/authtoken/jwt/ExpiringBearerAuthToken.java [35]

-this.expiresInSeconds = Duration.between(Instant.now(), expiry.toInstant()).getSeconds();
+this.expiresInSeconds = Math.max(0, Duration.between(Instant.now(), expiry.toInstant()).getSeconds());
Suggestion importance[1-10]: 6


Why: If expiry is in the past, Duration.between returns a negative value for expiresInSeconds, which could cause unexpected behavior. Using Math.max(0, ...) is a reasonable defensive guard, though in practice tokens with past expiry dates may be rejected earlier in the flow.

Low
General
Handle truncation when token count exceeds search limit

Using a fixed size of 10,000 for the search query will silently truncate results if
there are more than 10,000 tokens, leading to incomplete cache population. Consider
using scroll/search_after pagination or at minimum logging a warning when the result
count hits the limit.

src/main/java/org/opensearch/security/action/apitokens/ApiTokenIndexHandler.java [102]

-searchRequest.source(new SearchSourceBuilder().size(10_000));
+SearchSourceBuilder sourceBuilder = new SearchSourceBuilder().size(10_000);
+searchRequest.source(sourceBuilder);
+// After receiving results:
+// if (response.getHits().getTotalHits().value > 10_000) {
+//     LOGGER.warn("API token index contains more than 10,000 tokens; only the first 10,000 were loaded.");
+// }
Suggestion importance[1-10]: 4


Why: The concern about silent truncation at 10,000 tokens is valid, but the improved_code only adds a comment rather than implementing actual pagination, making it a low-impact suggestion. The issue is real but the fix is incomplete.

Low
Use atomic reference for thread-safe map replacement

The tokenIdToActionPrivileges field is declared volatile but is initialized from
dynamicDependencies.tokenIdToActionPrivileges() only once in the constructor. If the
map reference itself is replaced (e.g., on token updates), volatile provides
visibility guarantees, but if the map is mutated in place, it does not. Unlike the
legacy implementation which uses a plain Map, this inconsistency suggests the update
mechanism may not be thread-safe. Ensure the field is either updated atomically via
an AtomicReference or that the map is always replaced (not mutated) on updates.

src/main/java/org/opensearch/security/privileges/actionlevel/nextgen/PrivilegesEvaluatorImpl.java [131]

-private volatile Map<String, ActionPrivileges> tokenIdToActionPrivileges;
+private final AtomicReference<Map<String, ActionPrivileges>> tokenIdToActionPrivileges = new AtomicReference<>();
Suggestion importance[1-10]: 4


Why: The volatile keyword provides visibility guarantees for reference replacement but the suggestion to use AtomicReference would require changing all usages of the field. The concern about thread safety is valid but the improved code only shows the field declaration without updating the usage sites, making it incomplete.

Low
Enforce token name uniqueness beyond in-memory cache

tokenNameExists checks only the in-memory cache, which may not reflect the current
state of the index if the cache hasn't been refreshed yet. This can allow duplicate
token names to be created in a race condition or after a node restart before the
cache is populated. The uniqueness check should also be enforced at the index level.

src/main/java/org/opensearch/security/action/apitokens/ApiTokenAction.java [203-206]

+// Consider also enforcing uniqueness at the index level (e.g., using a unique constraint or a pre-check query),
+// as the in-memory cache may be stale or incomplete on a fresh node start.
 if (apiTokenRepository.tokenNameExists(tokenName)) {
     sendErrorResponse(channel, RestStatus.BAD_REQUEST, "A token with name '" + tokenName + "' already exists.");
     return;
 }
Suggestion importance[1-10]: 3


Why: The concern about stale cache is valid, but the improved_code only adds a comment without implementing any actual index-level check, making it a low-impact suggestion that doesn't change the code behavior.

Low
Enforce minimum bound on max tokens value

The getMaxTokens() method silently caps the value at 1000 without any validation or
warning, and the stored maxTokens field can still hold a value greater than 1000.
This discrepancy between the stored value and the returned value could cause
confusion. Consider also enforcing a minimum value (e.g., > 0) to prevent invalid
configurations.

src/main/java/org/opensearch/security/securityconf/impl/v7/ConfigV7.java [519-521]

 public int getMaxTokens() {
-    return Math.min(maxTokens, 1000);
+    return Math.min(Math.max(1, maxTokens), 1000);
 }
Suggestion importance[1-10]: 3


Why: Adding a minimum bound of 1 to getMaxTokens() prevents a zero or negative configuration, but this is a minor edge case with low practical impact since the default is 100 and the setter doesn't validate.

Low
Guard against null security index in matcher

The WildcardMatcher.from call uses the raw securityIndex string directly. If
securityIndex is null or empty (e.g., in test scenarios or misconfigured
environments), this could produce an unexpected matcher. Ensure securityIndex is
validated before being passed to WildcardMatcher.from to avoid silent mismatches.

src/main/java/org/opensearch/security/compliance/ComplianceConfig.java [177]

-this.securityIndicesMatcher = WildcardMatcher.from(securityIndex, ConfigConstants.OPENSEARCH_API_TOKENS_INDEX);
+this.securityIndicesMatcher = WildcardMatcher.from(
+    securityIndex != null ? securityIndex : "",
+    ConfigConstants.OPENSEARCH_API_TOKENS_INDEX
+);
Suggestion importance[1-10]: 3


Why: The null check for securityIndex is a defensive measure, but the existing codebase likely already validates this value before it reaches ComplianceConfig. This is a low-impact defensive improvement.

Low
Suggestions up to commit 39b5609
Category | Suggestion | Impact
Possible issue
Avoid race condition during token privilege map rebuild

The tokenIdToActionPrivileges.clear() followed by individual put calls is not
atomic. Between the clear() and the completion of all put calls, concurrent
privilege checks may see an empty or partially populated map, potentially denying
access to valid tokens. Consider building a new map and replacing the reference
atomically using an AtomicReference.

src/main/java/org/opensearch/security/privileges/PrivilegesConfiguration.java [94-115]

 apiTokenRepository.subscribeOnChange(() -> {
     SecurityDynamicConfiguration<ActionGroupsV7> actionGroupsConfiguration = configurationRepository.getConfiguration(
         CType.ACTIONGROUPS
     );
     FlattenedActionGroups flattenedActionGroups = new FlattenedActionGroups(actionGroupsConfiguration.withStaticConfig());
-    tokenIdToActionPrivileges.clear();
+    Map<String, ActionPrivileges> newMap = new ConcurrentHashMap<>();
     apiTokenRepository.forEachToken(
-        (tokenId, role) -> tokenIdToActionPrivileges.put(
+        (tokenId, role) -> newMap.put(
             tokenId,
-            ...
+            new SubjectBasedActionPrivileges(
+                role,
+                flattenedActionGroups,
+                new RuntimeOptimizedActionPrivileges.SpecialIndexProtection(
+                    specialIndices::isSystemIndex,
+                    specialIndices::isSystemIndex,
+                    RuntimeOptimizedActionPrivileges.SpecialIndexProtection.IndicesNeedingSpecialRoles.DISABLED
+                ),
+                false
+            )
         )
     );
+    tokenIdToActionPrivileges.clear();
+    tokenIdToActionPrivileges.putAll(newMap);
 });
Suggestion importance[1-10]: 7


Why: The clear() followed by individual put calls on a ConcurrentHashMap is not atomic and can expose an empty or partial map to concurrent readers. Building a new map and swapping it atomically is a valid fix for this race condition.

Medium
Prevent early exit when notifying multiple listeners

Throwing an exception inside the loop will prevent subsequent listeners from being
notified. If one listener fails, all remaining listeners are silently skipped.
Consider logging the error and continuing to notify all listeners, or collecting
failures and reporting them after the loop completes.

src/main/java/org/opensearch/security/action/apitokens/ApiTokenRepository.java [119-129]

 public synchronized void notifyAboutChanges() {
     for (TokenListener listener : tokenListener) {
         try {
             log.debug("Notify {} listener about change", listener);
             listener.onChange();
         } catch (Exception e) {
             log.error("{} listener errored: " + e, listener, e);
-            throw ExceptionsHelper.convertToOpenSearchException(e);
         }
     }
 }
Suggestion importance[1-10]: 6


Why: Throwing an exception inside the loop causes subsequent listeners to be skipped. The improved_code correctly removes the throw statement so all listeners are notified even if one fails, which is a valid improvement for robustness.

Low
Restrict API token endpoint by HTTP method

Returning true unconditionally for API_TOKEN_SUFFIX means that any request whose
suffix matches "api/apitokens" is always considered a restricted endpoint access,
regardless of the HTTP method. This will block legitimate API token management
requests (e.g., GET to list tokens). The restriction should be method-aware, similar
to the OBO token handling, to allow appropriate HTTP methods while blocking others.

src/main/java/org/opensearch/security/util/AuthTokenUtils.java [34-36]

 case API_TOKEN_SUFFIX:
-    // Don't want to allow any api token access
-    return true;
+    // Block write operations (POST, PUT, DELETE) to api tokens endpoint
+    return request.method() == POST || request.method() == RestRequest.Method.DELETE || request.method() == RestRequest.Method.PUT;
Suggestion importance[1-10]: 6


Why: The unconditional return true blocks all HTTP methods for the api/apitokens endpoint, which may be intentional per the comment "Don't want to allow any api token access" (i.e., API tokens cannot be used to manage API tokens). However, if GET requests should be allowed, this is a real issue. The suggestion is plausible but contradicts the explicit comment in the code.

Low
Guard against negative expiry duration

The expiresInSeconds is computed at construction time using Instant.now(), so it
captures the remaining duration only at the moment the object is created, not a
fixed expiry offset. If the expiry date is in the past, Duration.between will return
a negative value, which could cause unexpected behavior. Consider validating that
the expiry is in the future before computing the duration.

src/main/java/org/opensearch/security/authtoken/jwt/ExpiringBearerAuthToken.java [35]

-this.expiresInSeconds = Duration.between(Instant.now(), expiry.toInstant()).getSeconds();
+long seconds = Duration.between(Instant.now(), expiry.toInstant()).getSeconds();
+this.expiresInSeconds = seconds > 0 ? seconds : 0;
Suggestion importance[1-10]: 5


Why: If expiry is in the past, Duration.between returns a negative value for expiresInSeconds, which could cause unexpected behavior. The fix is valid and adds a reasonable guard, though this edge case may be unlikely in practice.

Low
Fix race condition in duplicate token name detection

The tokenNameExists check is performed against the in-memory cache, which is only
eventually consistent with the index. Under concurrent token creation requests, two
requests with the same name could both pass this check before either is written to
the index, resulting in duplicate token names. Consider enforcing uniqueness at the
index level (e.g., using a unique document ID derived from the name) rather than
relying solely on the cache check.

src/main/java/org/opensearch/security/action/apitokens/ApiTokenAction.java [203-206]

+// Enforce uniqueness at the index level by using the token name as the document ID,
+// or use an optimistic concurrency control mechanism, rather than relying solely on the in-memory cache check.
 if (apiTokenRepository.tokenNameExists(tokenName)) {
     sendErrorResponse(channel, RestStatus.BAD_REQUEST, "A token with name '" + tokenName + "' already exists.");
     return;
 }
Suggestion importance[1-10]: 4


Why: The race condition concern is valid, but the improved_code is identical to the existing_code (only adds a comment), so it doesn't actually implement the suggested fix and provides no functional improvement.

Low
Ensure volatile map is refreshed on token changes

The tokenIdToActionPrivileges field is declared volatile but is initialized from
dynamicDependencies.tokenIdToActionPrivileges() and never updated afterwards in the
shown code. If the token-to-privileges mapping can change at runtime (e.g., when
tokens are created or revoked), the map reference must be updated to reflect those
changes; otherwise, stale privilege data will be used for token-based requests.

src/main/java/org/opensearch/security/privileges/actionlevel/nextgen/PrivilegesEvaluatorImpl.java [131]

 private volatile Map<String, ActionPrivileges> tokenIdToActionPrivileges;
+// Ensure this field is updated whenever token privileges change, e.g., via a refresh/reload mechanism.
Suggestion importance[1-10]: 2


Why: The suggestion points out a potential staleness issue but the improved_code is identical to the existing_code (just adds a comment), making it a documentation-only suggestion with no actual code change. The concern may be valid but the suggestion doesn't provide a concrete fix.

Low
General
Handle potential truncation of large token result sets

Using a fixed size of 10,000 for the search query will silently truncate results ...

Signed-off-by: Craig Perkins <cwperx@amazon.com>
@github-actions

github-actions Bot commented May 5, 2026

Persistent review updated to latest commit 43930aa

Signed-off-by: Craig Perkins <cwperx@amazon.com>
@github-actions

github-actions Bot commented May 5, 2026

Persistent review updated to latest commit 78049ef

Signed-off-by: Craig Perkins <cwperx@amazon.com>
@github-actions

github-actions Bot commented May 5, 2026

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit 8b619cc.

Path | Line | Severity | Description
src/main/java/org/opensearch/security/action/apitokens/ApiTokenRepository.java | 147 | medium | Token revocation is not atomic across the cluster. revokeApiToken() updates the index and then calls notifyAboutChanges(), which triggers reloadApiTokensFromIndex() on each node via ApiTokenUpdateAction. Between the index write and the per-node cache flush there is a window where a revoked token remains valid in memory. A slow or partitioned node may continue accepting a revoked token indefinitely until its next successful reload.
src/main/java/org/opensearch/security/http/ApiTokenAuthenticator.java | 75 | low | Authentication failures (invalid token, expired token, metadata missing) are logged at log.error() rather than log.debug() or log.warn(). This will flood error logs under any brute-force or token-scanning attempt, potentially masking real errors and causing log-based denial-of-service noise.
src/main/java/org/opensearch/security/action/apitokens/ApiTokenIndexHandler.java | 105 | low | Token metadata is fetched with a hardcoded size of 10,000 results and no pagination. If the token index grows beyond this limit (possible given maxTokens is configurable up to 1,000 per config entry but the index itself has no enforced upper bound from the index handler), excess tokens are silently dropped from the in-memory cache, causing valid tokens to appear invalid.

The table above displays the top 10 most important findings.

Total: 3 | Critical: 0 | High: 0 | Medium: 1 | Low: 2


Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.


⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

@github-actions

github-actions Bot commented May 5, 2026

Persistent review updated to latest commit 8b619cc

Signed-off-by: Craig Perkins <cwperx@amazon.com>
@github-actions

github-actions Bot commented May 5, 2026

Persistent review updated to latest commit f0772a5

@cwperks cwperks added the skip-diff-analyzer Maintainer to skip code-diff-analyzer check, after reviewing issues in AI analysis. label May 5, 2026
Signed-off-by: Craig Perkins <cwperx@amazon.com>
@github-actions

github-actions Bot commented May 5, 2026

Persistent review updated to latest commit 39b5609

Signed-off-by: Craig Perkins <cwperx@amazon.com>
@github-actions

github-actions Bot commented May 5, 2026

Persistent review updated to latest commit d85f1eb

Signed-off-by: Craig Perkins <cwperx@amazon.com>
@github-actions

github-actions Bot commented May 5, 2026

Persistent review updated to latest commit a85a39e

@github-actions

github-actions Bot commented May 5, 2026

Persistent review updated to latest commit 67105b5


@nibix nibix left a comment


I checked out the integration with the new privilege evaluator. There are a few things to be considered, see below. If you'd need more background on this, we can have a quick conversation.

Comment thread src/main/java/org/opensearch/security/privileges/PrivilegesConfiguration.java Outdated
Comment thread src/main/java/org/opensearch/security/privileges/PrivilegesConfiguration.java Outdated
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 6, 2026

Persistent review updated to latest commit 5ead3fc

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 6, 2026

Failed to generate code suggestions for PR

Signed-off-by: Craig Perkins <cwperx@amazon.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 6, 2026

Persistent review updated to latest commit 42ecbe0


@DarshitChanpura DarshitChanpura left a comment


This is some great work! @cwperks .

I left a few comments and clarification questions.

static {
PARSER.declareString(constructorArg(), new ParseField(NAME_FIELD));
PARSER.declareString(constructorArg(), new ParseField(TOKEN_HASH_FIELD));
PARSER.declareStringArray(optionalConstructorArg(), new ParseField(CLUSTER_PERMISSIONS_FIELD));

what is the purpose of making cluster permissions optional too?

apiTokenRepository.getTokenCount(ActionListener.wrap(tokenCount -> {
ConfigV7 config = configurationRepository.getConfiguration(CType.CONFIG).getCEntry(CType.CONFIG.name());
int maxTokens = config.dynamic.api_tokens.getMaxTokens();
if (tokenCount >= maxTokens) {

at some point in future, we should add a schedule job to cleanup expired/revoked tokens.

sendErrorResponse(channel, RestStatus.BAD_REQUEST, "Token expiration duration must be positive.");
return;
}
long requestedExpirationSeconds = requestedExpiration / 1000;

we need to make it clear that this value must be passed in millis.

client.threadPool().getThreadContext().putHeader(ConfigConstants.OPENDISTRO_SECURITY_CONF_REQUEST_HEADER, "true");

SearchRequest searchRequest = new SearchRequest(ConfigConstants.OPENSEARCH_API_TOKENS_INDEX);
searchRequest.source(new SearchSourceBuilder().size(10_000));

we should add a filter to not get revoked/expired tokens?

}
}

// SHA-256 is sufficient for hashing high-entropy random tokens. Consider making configurable if algorithm rotation is needed.

we should consider this in next-iteration 😅

@JsonProperty("max_tokens")
private int maxTokens = 100;
@JsonProperty("max_expiration_seconds")
private long maxExpirationSeconds = 7776000;

why 90 day arbitrary limit here? Will this be documented?

* @return true if the request is from an API token, otherwise false
*/
public boolean isApiTokenRequest() {
return name != null && name.startsWith("token:");

this should be replaced with API_TOKEN_PREFIX?

apiTokenRepository.reloadApiTokensFromIndex(
ActionListener.wrap(
unused -> log.debug("API tokens loaded on node start"),
e -> log.warn("Failed to load API tokens on node start", e)

is there a mechanism where admin would know without looking at this warning?

maybe we show it in dashboard?

import static org.hamcrest.Matchers.equalTo;
import static org.junit.Assert.assertEquals;

public class ApiTokenTest {

we should add a test for testing all the optional arguments.

public AuditLogsRule auditLogsRule = new AuditLogsRule();

@Test
public void testApiTokenAuthenticationIsAudited() {

we should also add a test about any changes to the token being audited.


Labels

skip-diff-analyzer Maintainer to skip code-diff-analyzer check, after reviewing issues in AI analysis.


Development

Successfully merging this pull request may close these issues.

[RFC] Support for API Keys in OpenSearch Security Plugin

6 participants