feat(anc): add check-hotfix subcommand to read hotfix pointer from LPS #8696

Open
Devinwong wants to merge 1 commit into main from devinwong/anc-check-hotfix-configmap

feat(anc): add check-hotfix subcommand to read hotfix pointer from LPS

fe768d2
Azure Pipelines / Agentbaker GPU E2E failed Jul 3, 2026 in 32m 30s

Build #20260703.4 had test failures

Tests

  • Failed: 45 (19.65%)
  • Passed: 184 (80.35%)
  • Other: 0 (0.00%)
  • Total: 229
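
As a sanity check on the summary above, the percentages can be recomputed from the raw counts. This is a minimal sketch: the `percent` helper and the `summary` dictionary are hypothetical names used for illustration and are not part of the pipeline or of this PR.

```python
def percent(count: int, total: int) -> str:
    """Format count/total as a percentage with two decimal places."""
    return f"{100 * count / total:.2f}%"


# Counts copied from the Tests summary of build #20260703.4.
summary = {"Failed": 45, "Passed": 184, "Other": 0}
total = sum(summary.values())

for name, count in summary.items():
    print(f"{name}: {count} ({percent(count, total)})")
print(f"Total: {total}")
```

The three counts sum to the reported total of 229, and the recomputed shares match the 19.65% and 80.35% shown in the summary.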

Annotations

Check failure on line 4332 in Build log

@azure-pipelines azure-pipelines / Agentbaker GPU E2E

Build log #L4332

Script failed with exit code: 1

Check failure on line 1 in Test_Ubuntu2204_GPUGridDriver/scriptless_nbc

Test_Ubuntu2204_GPUGridDriver/scriptless_nbc

Failed
Raw output
=== RUN   Test_Ubuntu2204_GPUGridDriver/scriptless_nbc
=== PAUSE Test_Ubuntu2204_GPUGridDriver/scriptless_nbc
=== CONT  Test_Ubuntu2204_GPUGridDriver/scriptless_nbc
    test_helpers.go:418: [8.197s] TAGS {Name:Test_Ubuntu2204_GPUGridDriver/scriptless_nbc ImageName:2204gen2containerd OS:ubuntu Arch:amd64 NetworkIsolated:false NonAnonymousACR:false GPU:true WASM:false BootstrapTokenFallback:false KubeletCustomConfig:false Scriptless:false VHDCaching:false MockAzureChinaCloud:false VMSeriesCoverageTest:false}
    test_helpers.go:229: [8.197s] → running scenario...
    test_helpers.go:246: [8.197s] using cluster abe2e-kubenet-v5-150ee in rg=abe2e-westus3 sub=8ecadfc9-d1a3-4ea4-b844-0d9f87e4d7c8
    test_helpers.go:247: [8.198s] portal: https://portal.azure.com/#@microsoft.onmicrosoft.com/resource/subscriptions/8ecadfc9-d1a3-4ea4-b844-0d9f87e4d7c8/resourceGroups/abe2e-westus3/providers/Microsoft.ContainerService/managedClusters/abe2e-kubenet-v5-150ee/overview
    test_helpers.go:279: [8.223s] → preparing AKS node...
    vmss.go:531: [8.223s] → creating VMSS 4e7n-2026-07-03-ubuntu2204gpugriddriverscriptlessnbc...
    vmss.go:435: [9.744s] VMSS portal link: https://ms.portal.azure.com/#@microsoft.onmicrosoft.com/resource/subscriptions/8ecadfc9-d1a3-4ea4-b844-0d9f87e4d7c8/resourceGroups/MC_abe2e-westus3_abe2e-kubenet-v5-150ee_westus3/providers/Microsoft.Compute/virtualMachineScaleSets/4e7n-2026-07-03-ubuntu2204gpugriddriverscriptlessnbc/overview
    vmss.go:441: [9.744s] Managed cluster portal link: https://ms.portal.azure.com/#@microsoft.onmicrosoft.com/resource/subscriptions/8ecadfc9-d1a3-4ea4-b844-0d9f87e4d7c8/resourceGroups/MC_abe2e-westus3_abe2e-kubenet-v5-150ee_westus3/providers/Microsoft.ContainerService/managedClusters/abe2e-kubenet-v5-150ee/overview
    vmss.go:564: [29.297s] VM will be automatically deleted after the test finishes, to preserve it for debugging purposes set KEEP_VMSS=true or pause the test with a breakpoint before the test finishes or failed
    vmss.go:568: [29.297s] SSH Instructions: (may take a few minutes for the VM to be ready for SSH)
        ========================
        az network bastion ssh --target-resource-id "/subscriptions/8ecadfc9-d1a3-4ea4-b844-0d9f87e4d7c8/resourceGroups/MC_abe2e-westus3_abe2e-kubenet-v5-150ee_westus3/providers/Microsoft.Compute/virtualMachineScaleSets/4e7n-2026-07-03-ubuntu2204gpugriddriverscriptlessnbc/virtualMachines/0" --name "abe2e-shared-bastion" --resource-group abe2e-westus3 --auth-type ssh-key --username azureuser --ssh-key /tmp/private-key-2922552056
        
    bastionssh.go:304: [281.847s] Attempt 1/5 establishing SSH over bastion to 10.220.112.107
    vmss.go:618: [284.126s] VM reached running state
    vmss.go:588: [284.126s] ✓ creating VMSS 4e7n-2026-07-03-ubuntu2204gpugriddriverscriptlessnbc done (275.9s)
    kube.go:160: [284.126s] → waiting for node 4e7n-2026-07-03-ubuntu2204gpugriddriverscriptlessnbc to be ready...
    kube.go:182: [284.237s] node 4e7n-2026-07-03-ubuntu2204gpugriddriverscriptlessnbc000000 is ready. Taints: [{"key":"node.kubernetes.io/network-unavailable","effect":"NoSchedule","timeAdded":"2026-07-03T04:46:18Z"}] Conditions: [{"type":"NetworkUnavailable","status":"True","lastHeartbeatTime":"2026-07-03T04:46:18Z","lastTransitionTime":"2026-07-03T04:46:18Z","reason":"NodeInitialization","message":"Waiting for cloud routes"},{"type":"MemoryPressure","status":"False","lastHeartbeatTime":"2026-07-03T04:46:16Z","lastTransitionTime":"2026-07-03T04:46:15Z","reason":"KubeletHasSufficientMemory","message":"kubelet has sufficient memory available"},{"type":"DiskPressure","status":"False","lastHeartbeatTime":"2026-07-03T04:46:16Z","lastTransitionTime":"2026-07-03T04:46:15Z","reason":"KubeletHasNoDiskPressure","message":"kubelet has no disk pressure"},{"type":"PIDPressure","status":
... [The stack trace has been truncated as it exceeded the maximum allowed size. Please refer to the complete log available in the Test Run attachments for full details.]

Check failure on line 1 in Test_Ubuntu2404_NvidiaDevicePluginRunning/default

Test_Ubuntu2404_NvidiaDevicePluginRunning/default

Failed
Raw output
=== RUN   Test_Ubuntu2404_NvidiaDevicePluginRunning/default
=== PAUSE Test_Ubuntu2404_NvidiaDevicePluginRunning/default
=== CONT  Test_Ubuntu2404_NvidiaDevicePluginRunning/default
    azure.go:480: [0.000s] Looking up images in https://ms.portal.azure.com/#@microsoft.onmicrosoft.com/resource/subscriptions/c4c3550e-a965-4993-a50c-628fd38cd3e1/resourceGroups/aksvhdtestbuildrg/providers/Microsoft.Compute/galleries/PackerSigGalleryEastUS/images/aks-ubuntu-containerd-24.04-gen2/overview
    azure.go:569: [38.348s] Image version /subscriptions/c4c3550e-a965-4993-a50c-628fd38cd3e1/resourceGroups/aksvhdtestbuildrg/providers/Microsoft.Compute/galleries/PackerSigGalleryEastUS/images/2404gen2containerd/versions/1.1783016979.17372 is already in region westus3
    vhd.go:347: [38.348s] got version by tag branch=refs/heads/main: https://ms.portal.azure.com/#@microsoft.onmicrosoft.com/resource/subscriptions/c4c3550e-a965-4993-a50c-628fd38cd3e1/resourceGroups/aksvhdtestbuildrg/providers/Microsoft.Compute/galleries/PackerSigGalleryEastUS/images/aks-ubuntu-containerd-24.04-gen2/versions/1.1783016979.17372/overview
    test_helpers.go:418: [38.348s] TAGS {Name:Test_Ubuntu2404_NvidiaDevicePluginRunning/default ImageName:2404gen2containerd OS:ubuntu Arch:amd64 NetworkIsolated:false NonAnonymousACR:false GPU:true WASM:false BootstrapTokenFallback:false KubeletCustomConfig:false Scriptless:false VHDCaching:false MockAzureChinaCloud:false VMSeriesCoverageTest:false}
    test_helpers.go:229: [38.348s] → running scenario...
    test_helpers.go:246: [38.348s] using cluster abe2e-kubenet-v5-150ee in rg=abe2e-westus3 sub=8ecadfc9-d1a3-4ea4-b844-0d9f87e4d7c8
    test_helpers.go:247: [38.348s] portal: https://portal.azure.com/#@microsoft.onmicrosoft.com/resource/subscriptions/8ecadfc9-d1a3-4ea4-b844-0d9f87e4d7c8/resourceGroups/abe2e-westus3/providers/Microsoft.ContainerService/managedClusters/abe2e-kubenet-v5-150ee/overview
    test_helpers.go:279: [38.351s] → preparing AKS node...
    vmss.go:531: [38.352s] → creating VMSS mvar-2026-07-03-ubuntu2404nvidiadevicepluginrunningdefaul...
    vmss.go:435: [38.812s] VMSS portal link: https://ms.portal.azure.com/#@microsoft.onmicrosoft.com/resource/subscriptions/8ecadfc9-d1a3-4ea4-b844-0d9f87e4d7c8/resourceGroups/MC_abe2e-westus3_abe2e-kubenet-v5-150ee_westus3/providers/Microsoft.Compute/virtualMachineScaleSets/mvar-2026-07-03-ubuntu2404nvidiadevicepluginrunningdefaul/overview
    vmss.go:441: [38.812s] Managed cluster portal link: https://ms.portal.azure.com/#@microsoft.onmicrosoft.com/resource/subscriptions/8ecadfc9-d1a3-4ea4-b844-0d9f87e4d7c8/resourceGroups/MC_abe2e-westus3_abe2e-kubenet-v5-150ee_westus3/providers/Microsoft.ContainerService/managedClusters/abe2e-kubenet-v5-150ee/overview
2026/07/03 04:42:20 Using VM extension version 1.465 for extension type Compute.AKS.Linux.AKSNode in region westus3
    vmss.go:564: [56.816s] VM will be automatically deleted after the test finishes, to preserve it for debugging purposes set KEEP_VMSS=true or pause the test with a breakpoint before the test finishes or failed
    vmss.go:568: [56.816s] SSH Instructions: (may take a few minutes for the VM to be ready for SSH)
        ========================
        az network bastion ssh --target-resource-id "/subscriptions/8ecadfc9-d1a3-4ea4-b844-0d9f87e4d7c8/resourceGroups/MC_abe2e-westus3_abe2e-kubenet-v5-150ee_westus3/providers/Microsoft.Compute/virtualMachineScaleSets/mvar-2026-07-03-ubuntu2404nvidiadevicepluginrunningdefaul/virtualMachines/0" --name "abe2e-shared-bastion" --resource-group abe2e-westus3 --auth-type ssh-key --username azureuser --ssh-key /tmp/private-key-2922552056
        
    bastionssh.go:304: [369.253s] Attempt 1/5 establishing SSH over bastion to 10.220.112.11
    vmss.go:618: [370.672s] VM reached running state
    vmss.go:588: [370
... [The stack trace has been truncated as it exceeded the maximum allowed size. Please refer to the complete log available in the Test Run attachments for full details.]

Check failure on line 1 in Test_Ubuntu2204_NvidiaDevicePlugin_Daemonset

Test_Ubuntu2204_NvidiaDevicePlugin_Daemonset

Failed
Raw output
=== RUN   Test_Ubuntu2204_NvidiaDevicePlugin_Daemonset
=== PAUSE Test_Ubuntu2204_NvidiaDevicePlugin_Daemonset
=== CONT  Test_Ubuntu2204_NvidiaDevicePlugin_Daemonset
--- FAIL: Test_Ubuntu2204_NvidiaDevicePlugin_Daemonset (0.00s)

Check failure on line 1 in Test_Ubuntu2204_NvidiaDevicePlugin_Daemonset/default

Test_Ubuntu2204_NvidiaDevicePlugin_Daemonset/default

Failed
Raw output
=== RUN   Test_Ubuntu2204_NvidiaDevicePlugin_Daemonset/default
=== PAUSE Test_Ubuntu2204_NvidiaDevicePlugin_Daemonset/default
=== CONT  Test_Ubuntu2204_NvidiaDevicePlugin_Daemonset/default
    test_helpers.go:418: [8.208s] TAGS {Name:Test_Ubuntu2204_NvidiaDevicePlugin_Daemonset/default ImageName:2204gen2containerd OS:ubuntu Arch:amd64 NetworkIsolated:false NonAnonymousACR:false GPU:true WASM:false BootstrapTokenFallback:false KubeletCustomConfig:false Scriptless:false VHDCaching:false MockAzureChinaCloud:false VMSeriesCoverageTest:false}
    test_helpers.go:229: [8.208s] → running scenario...
    test_helpers.go:246: [8.208s] using cluster abe2e-kubenet-v5-150ee in rg=abe2e-westus3 sub=8ecadfc9-d1a3-4ea4-b844-0d9f87e4d7c8
    test_helpers.go:247: [8.208s] portal: https://portal.azure.com/#@microsoft.onmicrosoft.com/resource/subscriptions/8ecadfc9-d1a3-4ea4-b844-0d9f87e4d7c8/resourceGroups/abe2e-westus3/providers/Microsoft.ContainerService/managedClusters/abe2e-kubenet-v5-150ee/overview
    test_helpers.go:279: [8.235s] → preparing AKS node...
    vmss.go:531: [8.244s] → creating VMSS oc88-2026-07-03-ubuntu2204nvidiadeviceplugindaemonsetdefa...
    vmss.go:435: [9.777s] VMSS portal link: https://ms.portal.azure.com/#@microsoft.onmicrosoft.com/resource/subscriptions/8ecadfc9-d1a3-4ea4-b844-0d9f87e4d7c8/resourceGroups/MC_abe2e-westus3_abe2e-kubenet-v5-150ee_westus3/providers/Microsoft.Compute/virtualMachineScaleSets/oc88-2026-07-03-ubuntu2204nvidiadeviceplugindaemonsetdefa/overview
    vmss.go:441: [9.777s] Managed cluster portal link: https://ms.portal.azure.com/#@microsoft.onmicrosoft.com/resource/subscriptions/8ecadfc9-d1a3-4ea4-b844-0d9f87e4d7c8/resourceGroups/MC_abe2e-westus3_abe2e-kubenet-v5-150ee_westus3/providers/Microsoft.ContainerService/managedClusters/abe2e-kubenet-v5-150ee/overview
    vmss.go:564: [28.462s] VM will be automatically deleted after the test finishes, to preserve it for debugging purposes set KEEP_VMSS=true or pause the test with a breakpoint before the test finishes or failed
    vmss.go:568: [28.462s] SSH Instructions: (may take a few minutes for the VM to be ready for SSH)
        ========================
        az network bastion ssh --target-resource-id "/subscriptions/8ecadfc9-d1a3-4ea4-b844-0d9f87e4d7c8/resourceGroups/MC_abe2e-westus3_abe2e-kubenet-v5-150ee_westus3/providers/Microsoft.Compute/virtualMachineScaleSets/oc88-2026-07-03-ubuntu2204nvidiadeviceplugindaemonsetdefa/virtualMachines/0" --name "abe2e-shared-bastion" --resource-group abe2e-westus3 --auth-type ssh-key --username azureuser --ssh-key /tmp/private-key-2922552056
        
    bastionssh.go:304: [371.340s] Attempt 1/5 establishing SSH over bastion to 10.220.112.102
    vmss.go:618: [373.426s] VM reached running state
    vmss.go:588: [373.427s] ✓ creating VMSS oc88-2026-07-03-ubuntu2204nvidiadeviceplugindaemonsetdefa done (365.2s)
    kube.go:160: [373.427s] → waiting for node oc88-2026-07-03-ubuntu2204nvidiadeviceplugindaemonsetdefa to be ready...
    kube.go:182: [373.528s] node oc88-2026-07-03-ubuntu2204nvidiadeviceplugindaemonsetdefa000000 is ready. Taints: [{"key":"node.kubernetes.io/network-unavailable","effect":"NoSchedule","timeAdded":"2026-07-03T04:47:27Z"}] Conditions: [{"type":"NetworkUnavailable","status":"True","lastHeartbeatTime":"2026-07-03T04:47:27Z","lastTransitionTime":"2026-07-03T04:47:27Z","reason":"NodeInitialization","message":"Waiting for cloud routes"},{"type":"MemoryPressure","status":"False","lastHeartbeatTime":"2026-07-03T04:47:49Z","lastTransitionTime":"2026-07-03T04:47:19Z","reason":"KubeletHasSufficientMemory","message":"kubelet has sufficient memory available"},{"type":"DiskPressure","status":"False","lastHeartbeatTime":"2026-07-03T04:47:49Z","lastTransitionTime":"2026-07-03T04:47:19Z","reason":"KubeletHasNoDiskPressure","message":"
... [The stack trace has been truncated as it exceeded the maximum allowed size. Please refer to the complete log available in the Test Run attachments for full details.]