feat(gpu): use aks-gpu-cuda-lts (R580 LTS) for the managed CUDA driver#8811

Merged
ganeshkumarashok merged 2 commits into main from ganesh/gpu-cuda-lts
Jul 1, 2026

fix(gpu): point the renovate rule at aks-gpu-cuda-lts

7b47e66
Azure Pipelines / Agentbaker E2E failed Jul 1, 2026 in 44m 10s

Build #20260701.18 had test failures

Details

Tests

  • Failed: 21 (4.93%)
  • Passed: 405 (95.07%)
  • Other: 0 (0.00%)
  • Total: 426
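
The percentages in the summary above follow directly from the raw counts. As a quick check, here is a minimal sketch that recomputes them; the function name `result_percentages` is hypothetical and not part of the pipeline:

```python
def result_percentages(failed: int, passed: int, other: int = 0) -> dict:
    """Return each outcome's share of the total, rounded to two decimals."""
    total = failed + passed + other
    if total == 0:
        raise ValueError("no test results to summarize")
    counts = {"failed": failed, "passed": passed, "other": other}
    # Each share is count / total, expressed as a percentage.
    return {name: round(100 * count / total, 2) for name, count in counts.items()}


# Counts taken from the build summary above.
print(result_percentages(failed=21, passed=405))
# {'failed': 4.93, 'passed': 95.07, 'other': 0.0}
```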

Annotations

Check failure on line 2188 in Build log

@azure-pipelines azure-pipelines / Agentbaker E2E

Build log #L2188

Script failed with exit code: 1

Check failure on line 1 in Test_AzureLinuxV3_CSE_FullInstallPerformance/default

@azure-pipelines azure-pipelines / Agentbaker E2E

Test_AzureLinuxV3_CSE_FullInstallPerformance/default

Failed
Raw output
=== RUN   Test_AzureLinuxV3_CSE_FullInstallPerformance/default
=== PAUSE Test_AzureLinuxV3_CSE_FullInstallPerformance/default
=== CONT  Test_AzureLinuxV3_CSE_FullInstallPerformance/default
    test_helpers.go:418: [0.000s] TAGS {Name:Test_AzureLinuxV3_CSE_FullInstallPerformance/default ImageName:AzureLinuxV3gen2 OS:azurelinux Arch:amd64 NetworkIsolated:false NonAnonymousACR:false GPU:false WASM:false BootstrapTokenFallback:false KubeletCustomConfig:false Scriptless:false VHDCaching:false MockAzureChinaCloud:false VMSeriesCoverageTest:false}
    test_helpers.go:229: [0.000s] → running scenario...
    test_helpers.go:246: [0.000s] using cluster abe2e-kubenet-v5-150ee in rg=abe2e-westus3 sub=8ecadfc9-d1a3-4ea4-b844-0d9f87e4d7c8
    test_helpers.go:247: [0.000s] portal: https://portal.azure.com/#@microsoft.onmicrosoft.com/resource/subscriptions/8ecadfc9-d1a3-4ea4-b844-0d9f87e4d7c8/resourceGroups/abe2e-westus3/providers/Microsoft.ContainerService/managedClusters/abe2e-kubenet-v5-150ee/overview
    test_helpers.go:279: [0.003s] → preparing AKS node...
    vmss.go:531: [0.004s] → creating VMSS c7as-2026-07-01-azurelinuxv3csefullinstallperformancedefa...
    vmss.go:435: [0.226s] VMSS portal link: https://ms.portal.azure.com/#@microsoft.onmicrosoft.com/resource/subscriptions/8ecadfc9-d1a3-4ea4-b844-0d9f87e4d7c8/resourceGroups/MC_abe2e-westus3_abe2e-kubenet-v5-150ee_westus3/providers/Microsoft.Compute/virtualMachineScaleSets/c7as-2026-07-01-azurelinuxv3csefullinstallperformancedefa/overview
    vmss.go:441: [0.226s] Managed cluster portal link: https://ms.portal.azure.com/#@microsoft.onmicrosoft.com/resource/subscriptions/8ecadfc9-d1a3-4ea4-b844-0d9f87e4d7c8/resourceGroups/MC_abe2e-westus3_abe2e-kubenet-v5-150ee_westus3/providers/Microsoft.ContainerService/managedClusters/abe2e-kubenet-v5-150ee/overview
    vmss.go:564: [4.636s] VM will be automatically deleted after the test finishes, to preserve it for debugging purposes set KEEP_VMSS=true or pause the test with a breakpoint before the test finishes or failed
    vmss.go:568: [4.636s] SSH Instructions: (may take a few minutes for the VM to be ready for SSH)
        ========================
        az network bastion ssh --target-resource-id "/subscriptions/8ecadfc9-d1a3-4ea4-b844-0d9f87e4d7c8/resourceGroups/MC_abe2e-westus3_abe2e-kubenet-v5-150ee_westus3/providers/Microsoft.Compute/virtualMachineScaleSets/c7as-2026-07-01-azurelinuxv3csefullinstallperformancedefa/virtualMachines/0" --name "abe2e-shared-bastion" --resource-group abe2e-westus3 --auth-type ssh-key --username azureuser --ssh-key /tmp/private-key-1226342948
        
    bastionssh.go:304: [166.053s] Attempt 1/5 establishing SSH over bastion to 10.220.112.65
    vmss.go:618: [167.130s] VM reached running state
    vmss.go:588: [167.130s] ✓ creating VMSS c7as-2026-07-01-azurelinuxv3csefullinstallperformancedefa done (167.1s)
    kube.go:160: [167.131s] → waiting for node c7as-2026-07-01-azurelinuxv3csefullinstallperformancedefa to be ready...
    kube.go:182: [167.255s] node c7as-2026-07-01-azurelinuxv3csefullinstallperformancedefa000000 is ready. Taints: [{"key":"node.kubernetes.io/network-unavailable","effect":"NoSchedule","timeAdded":"2026-07-01T22:46:44Z"}] Conditions: [{"type":"NetworkUnavailable","status":"True","lastHeartbeatTime":"2026-07-01T22:46:44Z","lastTransitionTime":"2026-07-01T22:46:44Z","reason":"NodeInitialization","message":"Waiting for cloud routes"},{"type":"MemoryPressure","status":"False","lastHeartbeatTime":"2026-07-01T22:47:06Z","lastTransitionTime":"2026-07-01T22:46:35Z","reason":"KubeletHasSufficientMemory","message":"kubelet has sufficient memory available"},{"type":"DiskPressure","status":"False","lastHeartbeatTime":"2026-07-01T22:47:06Z","lastTransitionTime":"2026-07-01T22:46:35Z","reason":"KubeletHasNoDiskPressure","message":"
... [The stack trace has been truncated as it exceeded the maximum allowed size. Please refer to the complete log available in the Test Run attachments for full details.]

Check failure on line 1 in Test_AzureLinuxV3_CSE_FullInstallPerformance

@azure-pipelines azure-pipelines / Agentbaker E2E

Test_AzureLinuxV3_CSE_FullInstallPerformance

Failed
Raw output
=== RUN   Test_AzureLinuxV3_CSE_FullInstallPerformance
=== PAUSE Test_AzureLinuxV3_CSE_FullInstallPerformance
=== CONT  Test_AzureLinuxV3_CSE_FullInstallPerformance
--- FAIL: Test_AzureLinuxV3_CSE_FullInstallPerformance (0.00s)

Check failure on line 1 in Test_Flatcar_Scriptless

@azure-pipelines azure-pipelines / Agentbaker E2E

Test_Flatcar_Scriptless

Failed
Raw output
=== RUN   Test_Flatcar_Scriptless
=== PAUSE Test_Flatcar_Scriptless
=== CONT  Test_Flatcar_Scriptless
--- FAIL: Test_Flatcar_Scriptless (0.02s)
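
The `--- FAIL:` lines in the raw output above have a regular shape (test name, then elapsed seconds in parentheses), so the failing tests can be pulled out of a full log mechanically. The following is a minimal sketch, assuming the complete log has been saved as plain text; the names `FAIL_LINE` and `failed_tests` are hypothetical helpers, not part of the Agentbaker E2E tooling:

```python
import re

# Matches lines such as "--- FAIL: Test_Flatcar_Scriptless (0.02s)".
# Leading whitespace is allowed because subtest results are indented.
FAIL_LINE = re.compile(r"^\s*--- FAIL: (\S+) \((\d+(?:\.\d+)?)s\)\s*$")


def failed_tests(log_text: str) -> list:
    """Return (test name, elapsed seconds) for every FAIL line in the log."""
    results = []
    for line in log_text.splitlines():
        match = FAIL_LINE.match(line)
        if match:
            results.append((match.group(1), float(match.group(2))))
    return results


# Sample copied from the raw output above.
sample = """\
=== RUN   Test_Flatcar_Scriptless
=== PAUSE Test_Flatcar_Scriptless
=== CONT  Test_Flatcar_Scriptless
--- FAIL: Test_Flatcar_Scriptless (0.02s)
"""
print(failed_tests(sample))
# [('Test_Flatcar_Scriptless', 0.02)]
```

The elapsed time on a parent test covers only its own body, which is why the parents here report close to zero seconds while the `/default` subtests run for minutes.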

Check failure on line 1 in Test_Flatcar_Scriptless/default

@azure-pipelines azure-pipelines / Agentbaker E2E

Test_Flatcar_Scriptless/default

Failed
Raw output
=== RUN   Test_Flatcar_Scriptless/default
=== PAUSE Test_Flatcar_Scriptless/default
=== CONT  Test_Flatcar_Scriptless/default
    test_helpers.go:418: [0.000s] TAGS {Name:Test_Flatcar_Scriptless/default ImageName:flatcargen2 OS:flatcar Arch:amd64 NetworkIsolated:false NonAnonymousACR:false GPU:false WASM:false BootstrapTokenFallback:false KubeletCustomConfig:false Scriptless:false VHDCaching:false MockAzureChinaCloud:false VMSeriesCoverageTest:false}
    test_helpers.go:229: [0.000s] → running scenario...
    test_helpers.go:246: [0.000s] using cluster abe2e-kubenet-v5-150ee in rg=abe2e-westus3 sub=8ecadfc9-d1a3-4ea4-b844-0d9f87e4d7c8
    test_helpers.go:247: [0.000s] portal: https://portal.azure.com/#@microsoft.onmicrosoft.com/resource/subscriptions/8ecadfc9-d1a3-4ea4-b844-0d9f87e4d7c8/resourceGroups/abe2e-westus3/providers/Microsoft.ContainerService/managedClusters/abe2e-kubenet-v5-150ee/overview
    test_helpers.go:279: [0.003s] → preparing AKS node...
    vmss.go:531: [0.004s] → creating VMSS z4gt-2026-07-01-flatcarscriptlessdefault...
    vmss.go:435: [0.218s] VMSS portal link: https://ms.portal.azure.com/#@microsoft.onmicrosoft.com/resource/subscriptions/8ecadfc9-d1a3-4ea4-b844-0d9f87e4d7c8/resourceGroups/MC_abe2e-westus3_abe2e-kubenet-v5-150ee_westus3/providers/Microsoft.Compute/virtualMachineScaleSets/z4gt-2026-07-01-flatcarscriptlessdefault/overview
    vmss.go:441: [0.218s] Managed cluster portal link: https://ms.portal.azure.com/#@microsoft.onmicrosoft.com/resource/subscriptions/8ecadfc9-d1a3-4ea4-b844-0d9f87e4d7c8/resourceGroups/MC_abe2e-westus3_abe2e-kubenet-v5-150ee_westus3/providers/Microsoft.ContainerService/managedClusters/abe2e-kubenet-v5-150ee/overview
    vmss.go:564: [4.519s] VM will be automatically deleted after the test finishes, to preserve it for debugging purposes set KEEP_VMSS=true or pause the test with a breakpoint before the test finishes or failed
    vmss.go:568: [4.519s] SSH Instructions: (may take a few minutes for the VM to be ready for SSH)
        ========================
        az network bastion ssh --target-resource-id "/subscriptions/8ecadfc9-d1a3-4ea4-b844-0d9f87e4d7c8/resourceGroups/MC_abe2e-westus3_abe2e-kubenet-v5-150ee_westus3/providers/Microsoft.Compute/virtualMachineScaleSets/z4gt-2026-07-01-flatcarscriptlessdefault/virtualMachines/0" --name "abe2e-shared-bastion" --resource-group abe2e-westus3 --auth-type ssh-key --username azureuser --ssh-key /tmp/private-key-1226342948
        
    bastionssh.go:304: [165.855s] Attempt 1/5 establishing SSH over bastion to 10.220.112.81
    vmss.go:618: [166.968s] VM reached running state
    vmss.go:588: [166.968s] ✓ creating VMSS z4gt-2026-07-01-flatcarscriptlessdefault done (167.0s)
    kube.go:160: [166.968s] → waiting for node z4gt-2026-07-01-flatcarscriptlessdefault to be ready...
    kube.go:182: [167.090s] node z4gt-2026-07-01-flatcarscriptlessdefault000000 is ready. Taints: [{"key":"node.kubernetes.io/network-unavailable","effect":"NoSchedule","timeAdded":"2026-07-01T22:49:15Z"}] Conditions: [{"type":"NetworkUnavailable","status":"True","lastHeartbeatTime":"2026-07-01T22:49:15Z","lastTransitionTime":"2026-07-01T22:49:15Z","reason":"NodeInitialization","message":"Waiting for cloud routes"},{"type":"MemoryPressure","status":"False","lastHeartbeatTime":"2026-07-01T22:49:13Z","lastTransitionTime":"2026-07-01T22:49:11Z","reason":"KubeletHasSufficientMemory","message":"kubelet has sufficient memory available"},{"type":"DiskPressure","status":"False","lastHeartbeatTime":"2026-07-01T22:49:13Z","lastTransitionTime":"2026-07-01T22:49:11Z","reason":"KubeletHasNoDiskPressure","message":"kubelet has no disk pressure"},{"type":"PIDPressure","status":"False","lastHeartbeatTime":"2026-07-01T22:49:13Z","lastTransitionTime":"2026-07-01T22:49:11Z","reason":"KubeletHasSufficientPID","m
... [The stack trace has been truncated as it exceeded the maximum allowed size. Please refer to the complete log available in the Test Run attachments for full details.]