Skip to content

Add include_metadata request parameter for PPL queries #5235#5412

Open
ishag4 wants to merge 2 commits into
opensearch-project:mainfrom
ishag4:issue-5235
Open

Add include_metadata request parameter for PPL queries #5235#5412
ishag4 wants to merge 2 commits into
opensearch-project:mainfrom
ishag4:issue-5235

Conversation

@ishag4
Copy link
Copy Markdown

@ishag4 ishag4 commented May 6, 2026

Description

Add a request-level parameter include_metadata to the PPL query API:

POST /_plugins/_ppl?include_metadata=true
{
"query": "source=logs | where level='ERROR' | fields * | head 10"
}
Result: All regular fields PLUS metadata fields (_id, _index, _score, etc.)

Related Issues

Resolves #5235

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 6, 2026

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
📝 TODO sections

🔀 Multiple PR themes

Sub-PR theme: Core engine support for include_metadata parameter propagation

Relevant files:

  • core/src/main/java/org/opensearch/sql/ast/statement/Query.java
  • core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.java
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
  • core/src/main/java/org/opensearch/sql/executor/QueryService.java
  • core/src/main/java/org/opensearch/sql/executor/execution/QueryPlan.java
  • core/src/main/java/org/opensearch/sql/executor/execution/QueryPlanFactory.java
  • core/src/test/java/org/opensearch/sql/executor/execution/QueryPlanFactoryTest.java
  • core/src/test/java/org/opensearch/sql/executor/execution/QueryPlanTest.java

Sub-PR theme: PPL API and parser changes for include_metadata request parameter

Relevant files:

  • plugin/src/main/java/org/opensearch/sql/plugin/request/PPLQueryRequestFactory.java
  • plugin/src/main/java/org/opensearch/sql/plugin/rest/RestPPLQueryAction.java
  • ppl/src/main/java/org/opensearch/sql/ppl/PPLService.java
  • ppl/src/main/java/org/opensearch/sql/ppl/domain/PPLQueryRequest.java
  • ppl/src/main/java/org/opensearch/sql/ppl/parser/AstStatementBuilder.java
  • ppl/src/main/java/org/opensearch/sql/ppl/utils/UnresolvedPlanHelper.java
  • ppl/src/test/java/org/opensearch/sql/ppl/PPLServiceTest.java
  • ppl/src/test/java/org/opensearch/sql/ppl/parser/AstStatementBuilderTest.java
  • ppl/src/test/java/org/opensearch/sql/ppl/utils/UnresolvedPlanHelperTest.java

⚡ Recommended focus areas for review

Missing closing brace

The test method buildQueryStatementWithFetchSizeAndSmallerHead appears to be missing its closing brace }. The new code at line 103 ends with the assertion but there is no closing brace for the method before the next @Test annotation at line 105. This would cause a compilation error.

public void buildQueryStatementWithFetchSizeAndSmallerHead() {
  // User query has head 3, fetchSize=10
  // Head(10) wraps Head(3), then Project(*) wraps on top
  // The inner head 3 limits first, so only 3 rows are returned
  assertEqualWithFetchSize(
      "source=t | head 3",
      10,
      new Query(project(head(head(relation("t"), 3, 0), 10, 0), AllFieldsExcludeMeta.of()), 0, PPL, false));
Inconsistent AllFields usage

In buildQueryStatementWithFetchSizeSmallerThanHead, the expected query uses AllFields.of() instead of AllFieldsExcludeMeta.of(). All other test cases in this file use AllFieldsExcludeMeta.of() for the default (non-include_metadata) case. This inconsistency may indicate a copy-paste error.

new Query(project(head(head(relation("t"), 100, 0), 5, 0), AllFields.of()), 0, PPL, false));
Logic Change

The handleAllFieldsProject method logic has been significantly restructured. Previously, tryToRemoveNestedFields was called only when the node was NOT AllFieldsExcludeMeta, and tryToRemoveMetaFields was called unconditionally. Now, for AllFieldsExcludeMeta, tryToRemoveMetaFields is called with excludeByForce=true but tryToRemoveNestedFields is NOT called. For AllFields, tryToRemoveNestedFields IS called, setProjectVisited(true) is set, and tryToRemoveMetaFields is called with false. The addition of context.setProjectVisited(true) inside this branch could affect other downstream logic that checks isProjectVisited. This should be carefully validated.

if (allFields instanceof AllFieldsExcludeMeta) {
  // Do NOT remove nested fields for AllFieldsExcludeMeta
  tryToRemoveMetaFields(context, true); // Force exclude metadata fields
} else {
  // For AllFields (include_metadata=true), include metadata fields
  tryToRemoveNestedFields(context);
  // Mark as project visited to prevent automatic metadata field removal
  context.setProjectVisited(true);
  // Don't force exclude metadata fields - let them remain
  tryToRemoveMetaFields(context, false);
}
Paginate with metadata

When pageSize.isPresent(), the includeMetadata flag is passed to queryService.execute wrapping the plan in Paginate. However, the Paginate constructor with highlightConfig is not used here (it passes highlightConfig from the field which is null for paginated plans). This is pre-existing behavior, but it's worth verifying that includeMetadata is correctly propagated through the Paginate path in the Calcite engine.

UnresolvedPlan plan,
Removed mock stubs

The doAnswer mock stubs for queryService.execute and queryService.explain have been removed from multiple tests. Without these stubs, the mock queryService will not invoke the response listeners, meaning getQueryListener(false) and getExplainListener(false) will never receive a response. If the tests previously relied on onResponse being called to assert success, removing the stubs may cause the tests to pass vacuously (no assertion triggered) rather than actually verifying correct behavior.

@Test
public void testExecuteShouldPass() {
  pplService.execute(
      new PPLQueryRequest("search source=t a=1", null, QUERY),
      getQueryListener(false),
      getExplainListener(false));
}

@Test
public void testExecuteCsvFormatShouldPass() {
  pplService.execute(
      new PPLQueryRequest("search source=t a=1", null, QUERY, "csv"),
      getQueryListener(false),
      getExplainListener(false));
}

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 6, 2026

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Fix inconsistent AllFields usage in test

This test case still uses AllFields.of() instead of AllFieldsExcludeMeta.of(), which
is inconsistent with all other test cases in the same file that were updated. Since
include_metadata is false, AllFieldsExcludeMeta.of() should be used here as well.

ppl/src/test/java/org/opensearch/sql/ppl/parser/AstStatementBuilderTest.java [113]

-new Query(project(head(head(relation("t"), 100, 0), 5, 0), AllFields.of()), 0, PPL, false));
+new Query(project(head(head(relation("t"), 100, 0), 5, 0), AllFieldsExcludeMeta.of()), 0, PPL, false));
Suggestion importance[1-10]: 8

__

Why: The test at line 113 still uses AllFields.of() instead of AllFieldsExcludeMeta.of(), which is inconsistent with all other updated test cases in the same file. Since includeMetadata is false, this is a genuine bug in the test that would cause it to fail.

Medium
General
Remove redundant side-effectful flag mutation

When allFields is AllFields (include_metadata=true), context.setProjectVisited(true)
is called before tryToRemoveMetaFields, but tryToRemoveMetaFields checks
!context.isProjectVisited() to decide whether to remove fields. Setting
projectVisited=true here means the condition !context.isProjectVisited() is false,
so tryToRemoveMetaFields with excludeByForce=false will do nothing — which is the
intended behavior. However, this side-effect-based logic is fragile; the early
return in tryToRemoveMetaFields when context.isIncludeMetadata() && !excludeByForce
already handles this case, making the setProjectVisited(true) call redundant and
potentially causing unintended side effects for subsequent pipeline stages.

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [497-507]

 if (allFields instanceof AllFieldsExcludeMeta) {
   // Do NOT remove nested fields for AllFieldsExcludeMeta
   tryToRemoveMetaFields(context, true); // Force exclude metadata fields
 } else {
   // For AllFields (include_metadata=true), include metadata fields
   tryToRemoveNestedFields(context);
-  // Mark as project visited to prevent automatic metadata field removal
-  context.setProjectVisited(true);
   // Don't force exclude metadata fields - let them remain
+  // context.isIncludeMetadata() check in tryToRemoveMetaFields handles this
   tryToRemoveMetaFields(context, false);
 }
Suggestion importance[1-10]: 4

__

Why: The context.setProjectVisited(true) call before tryToRemoveMetaFields is redundant because the early return in tryToRemoveMetaFields already handles the includeMetadata=true case, and the side effect on isProjectVisited could affect subsequent pipeline stages unexpectedly. Removing it would make the logic cleaner and less fragile.

Low
Ensure null highlightConfig for paginated execution

When executing a paginated plan, highlightConfig is always null (set to null in the
paginated constructor), so passing it here is harmless but the Paginate wrapper may
not support highlightConfig. More critically, the original code passed null for
highlightConfig in the non-paginated execute overload — verify that passing
highlightConfig (which is null for paginated plans) to the new 5-argument execute
method is handled correctly and doesn't break the paginated path.

core/src/main/java/org/opensearch/sql/executor/execution/QueryPlan.java [108]

-queryService.execute(new Paginate(pageSize.get(), plan), getQueryType(), highlightConfig, includeMetadata, listener);
+queryService.execute(new Paginate(pageSize.get(), plan), getQueryType(), null, includeMetadata, listener);
Suggestion importance[1-10]: 3

__

Why: The highlightConfig field is already null for paginated plans (set explicitly in the paginated constructor), so passing null explicitly vs. passing the field makes no functional difference. This is a minor clarity improvement but not a real bug.

Low

@ishag4
Copy link
Copy Markdown
Author

ishag4 commented May 8, 2026

Hi @LantaoJin @penghuo @RyanL1997 @Swiddis Could you please review?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEATURE] Add include_metadata request parameter for PPL queries

1 participant