FP-1.1: proposal to reduce runtime #5433

Open
karthikeya-remilla wants to merge 3 commits into openconfig:main from b4firex:fp-1.1-reduce-runtime

Conversation

@karthikeya-remilla
Contributor

Limit power admin tests to one eligible component per type.

Summary
This change updates the power admin down/up tests to select and validate a single eligible component for each hardware type instead of running the workflow against every matching component.

What changed:

  • TestFabricPowerAdmin
    ◦ Scan fabric components and pick the first one that is not empty, removable, and OPER_STATUS_ACTIVE.
    ◦ Skip the test if no eligible fabric is found.
  • TestLinecardPowerAdmin
    ◦ Apply the same selection logic for linecards.
    ◦ Skip if no eligible linecard is found.
  • TestControllerCardPowerAdmin
    ◦ Identify controller cards by redundant-role.
    ◦ Select the SECONDARY controller for the power cycle workflow.
    ◦ Skip if primary/secondary roles are not both present.
    ◦ Skip if the secondary controller is not active.
    ◦ Continue to await switchover-ready on the primary after the secondary recovers.
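
As a rough illustration of the fabric and linecard selection rule described above, here is a minimal sketch in plain Go. The `component` struct and `firstEligible` helper are hypothetical names invented for this example; the actual test reads these properties from gNMI telemetry rather than from a struct.

```go
package main

import "fmt"

// component is a hypothetical, simplified view of the telemetry the test
// reads for each fabric or linecard. It is not an ondatra or OpenConfig type.
type component struct {
	Name      string
	Empty     bool
	Removable bool
	Active    bool // true when oper-status is OPER_STATUS_ACTIVE
}

// firstEligible returns the name of the first component that is not empty,
// is removable, and is active. ok is false when no component qualifies,
// which is the case where the test would skip.
func firstEligible(cs []component) (name string, ok bool) {
	for _, c := range cs {
		if c.Empty || !c.Removable || !c.Active {
			continue
		}
		return c.Name, true
	}
	return "", false
}

func main() {
	fabrics := []component{
		{Name: "Fabric0", Empty: true},
		{Name: "Fabric1", Removable: true, Active: false},
		{Name: "Fabric2", Removable: true, Active: true},
		{Name: "Fabric3", Removable: true, Active: true},
	}
	if name, ok := firstEligible(fabrics); ok {
		fmt.Println("selected:", name)
	} else {
		fmt.Println("no eligible fabric; the test would skip")
	}
}
```

Returning on the first match is what bounds the runtime: the power cycle runs once per component type, however many eligible components the chassis holds.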

Why:
Running the power-admin workflow against every eligible fabric, linecard, and controller card makes the test's runtime grow with the number of installed components, as observed on a fully loaded chassis in https://partnerissuetracker.corp.google.com/issues/492974186. This change narrows the test to one representative eligible component per type while preserving the existing validation flow.

Behavioral impact
  • Test scope is reduced from “all eligible components” to “one eligible component per type”.
  • If no suitable component exists, the test now skips with a single top-level reason instead of creating multiple skipped subtests.

@karthikeya-remilla requested a review from a team as a code owner May 6, 2026 11:01
@karthikeya-remilla requested a review from AmrNJ May 6, 2026 11:01
@OpenConfigBot

OpenConfigBot commented May 6, 2026

Pull Request Functional Test Report for #5433 / 22e97ed

Virtual Devices

| Device | Test | Test Documentation | Job | Raw Log |
| --- | --- | --- | --- | --- |
| Arista cEOS | status | FP-1.1: Power admin DOWN/UP Test | | |
| Cisco 8000E | status | FP-1.1: Power admin DOWN/UP Test | | |
| Cisco XRd | status | FP-1.1: Power admin DOWN/UP Test | | |
| Juniper ncPTX | status | FP-1.1: Power admin DOWN/UP Test | | |
| Nokia SR Linux | status | FP-1.1: Power admin DOWN/UP Test | | |
| Openconfig Lemming | status | FP-1.1: Power admin DOWN/UP Test | | |

Hardware Devices

| Device | Test | Test Documentation | Raw Log |
| --- | --- | --- | --- |
| Arista 7808 | status | FP-1.1: Power admin DOWN/UP Test | |
| Cisco 8808 | status | FP-1.1: Power admin DOWN/UP Test | |
| Juniper PTX10008 | status | FP-1.1: Power admin DOWN/UP Test | |
| Nokia 7250 IXR-10e | status | FP-1.1: Power admin DOWN/UP Test | |


@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request optimizes the power admin down/up tests to significantly reduce execution time on fully loaded chassis. By selecting a single eligible component for fabric, linecard, and controller card types, the test suite avoids redundant workflows while maintaining necessary validation coverage. The changes also introduce more robust checks for controller card roles and operational status to ensure reliable test execution.

Highlights

  • Test Optimization: Reduced test runtime by limiting power admin validation to a single representative component per hardware type instead of iterating through all components.
  • Controller Card Logic: Updated controller card testing to specifically target the secondary controller, ensuring both primary and secondary roles are present and active before proceeding.
  • Improved Skip Handling: Replaced multiple subtest skips with a single top-level skip condition when no eligible components are found, simplifying test output.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request refactors the component selection logic in the power-admin-down-up test to improve robustness, specifically for fabric, linecard, and controller card components. The changes replace restrictive test-skipping patterns with more resilient discovery loops. The reviewer suggests further robustness improvements, including using gnmi.Lookup instead of gnmi.Get during discovery to handle missing telemetry gracefully, and replacing t.Fatalf with a logging-and-continue pattern when encountering unexpected redundant roles to prevent unnecessary test failures.

Comment thread feature/platform/tests/power_admin_down_up_test/power_admin_down_up_test.go Outdated
Comment thread feature/platform/tests/power_admin_down_up_test/power_admin_down_up_test.go Outdated
Comment thread feature/platform/tests/power_admin_down_up_test/power_admin_down_up_test.go Outdated
@karthikeya-remilla
Contributor Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request refactors the power administration tests for fabric, linecard, and controller card components by moving eligibility checks out of the subtests to pre-select a single valid component for testing. Feedback was provided to improve the robustness of the controller card selection logic by using gnmi.Lookup instead of gnmi.Get for operational status to prevent potential fatal errors and to better handle scenarios with multiple secondary controllers.

Comment on lines 105 to +129
 	for _, c := range cs {
-		t.Run(c, func(t *testing.T) {
-			role := gnmi.Get(t, dut, gnmi.OC().Component(c).RedundantRole().State())
-			if got, want := role, oc.Platform_ComponentRedundantRole_PRIMARY; got == want {
-				primary = c
-				t.Skipf("ControllerCard Component %s is PRIMARY, hence skipping", c)
-			}
-
-			oper := gnmi.Get(t, dut, gnmi.OC().Component(c).OperStatus().State())
-			if got, want := oper, oc.PlatformTypes_COMPONENT_OPER_STATUS_ACTIVE; got != want {
-				t.Skipf("ControllerCard Component %s is already INACTIVE, hence skipping", c)
-			}
-
-			powerDownUp(t, dut, c, oc.PlatformTypes_OPENCONFIG_HARDWARE_COMPONENT_CONTROLLER_CARD, 20*time.Minute)
-		})
+		role, ok := gnmi.Lookup(t, dut, gnmi.OC().Component(c).RedundantRole().State()).Val()
+		if !ok {
+			t.Logf("Controller card %q missing redundant-role telemetry; skipping during selection", c)
+			continue
+		}
+		switch role {
+		case oc.Platform_ComponentRedundantRole_PRIMARY:
+			primary = c
+		case oc.Platform_ComponentRedundantRole_SECONDARY:
+			secondary = c
+		default:
+			t.Logf("Controller card %q has unexpected redundant-role %v; skipping during selection", c, role)
+			continue
+		}
 	}
-	if primary != "" {
-		gnmi.Await(t, dut, gnmi.OC().Component(primary).SwitchoverReady().State(), 30*time.Minute, true)

+	if primary == "" || secondary == "" {
+		t.Skipf("Missing required controller roles: primary=%q secondary=%q", primary, secondary)
+	}
+
+	oper := gnmi.Get(t, dut, gnmi.OC().Component(secondary).OperStatus().State())
+	if oper != oc.PlatformTypes_COMPONENT_OPER_STATUS_ACTIVE {
+		t.Skipf("Secondary controller %q not active: got %v", secondary, oper)
+	}
Contributor


medium

The selection logic for the controller card can be improved for robustness and consistency with the Fabric and Linecard tests. Currently, it uses gnmi.Get for OperStatus after the loop (lines 126-129), which will cause a fatal error if the leaf is missing. Additionally, if multiple secondary controllers exist, the current logic might pick an inactive one and skip the test even if an active one is available. It is better to check for an active secondary during the selection loop using gnmi.Lookup.

	for _, c := range cs {
		role, ok := gnmi.Lookup(t, dut, gnmi.OC().Component(c).RedundantRole().State()).Val()
		if !ok {
			t.Logf("Controller card %q missing redundant-role telemetry; skipping during selection", c)
			continue
		}
		switch role {
		case oc.Platform_ComponentRedundantRole_PRIMARY:
			if primary == "" {
				primary = c
			}
		case oc.Platform_ComponentRedundantRole_SECONDARY:
			if secondary == "" {
				oper, ok := gnmi.Lookup(t, dut, gnmi.OC().Component(c).OperStatus().State()).Val()
				if ok && oper == oc.PlatformTypes_COMPONENT_OPER_STATUS_ACTIVE {
					secondary = c
				} else {
					t.Logf("Controller card %q is SECONDARY but not ACTIVE (got %v); skipping during selection", c, oper)
				}
			}
		default:
			t.Logf("Controller card %q has unexpected redundant-role %v; skipping during selection", c, role)
		}
	}

	if primary == "" || secondary == "" {
		t.Skipf("Missing required active controller roles: primary=%q secondary=%q", primary, secondary)
	}
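
The selection rule in the suggestion above can be tried without a device. The following is a minimal sketch in plain Go, not the test's actual code: the `card` struct, the `role` constants, and `selectControllers` are hypothetical, and a nil pointer stands in for a leaf that `gnmi.Lookup` would report as absent.

```go
package main

import "fmt"

// role is a hypothetical stand-in for the redundant-role enum.
type role int

const (
	rolePrimary role = iota + 1
	roleSecondary
)

// card is a hypothetical model of controller-card telemetry. A nil pointer
// models a missing leaf, the case a lookup-style read reports as !ok.
type card struct {
	Name   string
	Role   *role
	Active *bool
}

func rolePtr(r role) *role { return &r }
func boolPtr(b bool) *bool { return &b }

// selectControllers picks the first PRIMARY card and the first SECONDARY
// card whose oper-status is present and active. Cards with missing
// telemetry are passed over instead of failing the selection.
func selectControllers(cards []card) (primary, secondary string) {
	for _, c := range cards {
		if c.Role == nil {
			continue // redundant-role missing: skip this card
		}
		switch *c.Role {
		case rolePrimary:
			if primary == "" {
				primary = c.Name
			}
		case roleSecondary:
			if secondary == "" && c.Active != nil && *c.Active {
				secondary = c.Name
			}
		}
	}
	return primary, secondary
}

func main() {
	cards := []card{
		{Name: "RE0", Role: rolePtr(rolePrimary), Active: boolPtr(true)},
		{Name: "RE1", Role: rolePtr(roleSecondary)}, // oper-status missing
		{Name: "RE2"},                               // redundant-role missing
	}
	primary, secondary := selectControllers(cards)
	if primary == "" || secondary == "" {
		fmt.Printf("would skip: primary=%q secondary=%q\n", primary, secondary)
		return
	}
	fmt.Printf("power cycle %q, then await switchover-ready on %q\n", secondary, primary)
}
```

Checking the oper-status inside the loop means a missing or inactive leaf leads to one skip with a clear reason rather than a fatal error after the loop.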

Contributor Author


Multiple secondary controllers are not a feasible scenario, so handling that case is not a priority here.

