improve Azure SDK VCR replay handling and polling behavior #1312

Open
rigalGit wants to merge 1 commit into main from jitendra/vcr-fix

Conversation

@rigalGit rigalGit commented Apr 2, 2026

Community Note

  • Please vote on this PR by adding a 👍 reaction to the original PR to help the community and maintainers prioritize for review
  • Please do not leave comments along the lines of "+1", "me too" or "any updates", they generate extra noise for PR followers and do not help prioritize for review

Description

Refines the current VCR integration and simplifies some of the VCR logic. These changes make VCR replay handling more explicit, improve replay-with-new-episodes behavior, and ensure that replay misses surface immediately instead of going through the normal retry/backoff paths.

Detailed AI-generated implementation summary: Gist Link

The existing integration already supported skipping the polling delay, but replay mode detection still depended on complex reflection, and replay misses could be misclassified as retryable failures. That could lead to unnecessary waits or stuck tests when a cassette interaction was missing.

  • Added TransportMode so that VCR mode checks no longer depend on reflection-based transport inspection.
  • Detect replayed responses from the X-Go-Azure-SDK-VCR-Replay header, which the provider sets when VCR returns a response from disk.
  • Polling delay handling now prefers response-based replay detection when an HTTP response is present. This is important for ReplayWithNewEpisodes, where a run can contain both replayed and live HTTP calls. In those cases, the provider VCR recorder sets the replay header on replayed interactions, and skip-delay is applied only to those recorded responses rather than to the entire transport mode.
  • Extracted default HTTP transport construction into a shared helper so that the provider and the SDK use the same HTTP transport in both live and VCR modes.
  • Prevent pollers and retry-on-error flows from retrying or backing off exponentially on VCR replay misses.
  • Updated the dataplane and resourcemanager pollers to avoid treating replay-miss errors as dropped connections.
  • Corrected poller error wrapping from %+v to %w in the dataplane pollers.
  • Expanded test coverage for replay detection, replay misses, polling delay behavior, and poller error handling.
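
As a rough illustration of the response-based detection described in the list above, the sketch below skips the polling delay only when a response carries the replay header. Only the header name comes from this description. The function names, the fallback flag, and the rule that any non-empty header value counts as "replayed" are assumptions made for the example, not the SDK's actual API.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// replayHeader is the header name quoted in the PR description.
const replayHeader = "X-Go-Azure-SDK-VCR-Replay"

// newResponse is a hypothetical helper that builds a response with or
// without the replay marker. The value "true" is an assumption.
func newResponse(replayed bool) *http.Response {
	resp := &http.Response{Header: http.Header{}}
	if replayed {
		resp.Header.Set(replayHeader, "true")
	}
	return resp
}

// isReplayed reports whether resp carries the replay marker.
// Treating any non-empty value as a match is an assumption.
func isReplayed(resp *http.Response) bool {
	return resp != nil && resp.Header.Get(replayHeader) != ""
}

// pollDelay prefers the response when one is present and falls back to the
// transport-level mode (transportIsReplay, hypothetical) only when it is nil.
func pollDelay(resp *http.Response, transportIsReplay bool, interval time.Duration) time.Duration {
	if resp != nil {
		if isReplayed(resp) {
			return 0 // replayed from the cassette: no need to wait
		}
		return interval // live call, even inside a mixed replay/record run
	}
	if transportIsReplay {
		return 0
	}
	return interval
}

func main() {
	interval := 10 * time.Second
	fmt.Println(pollDelay(newResponse(true), false, interval))
	fmt.Println(pollDelay(newResponse(false), true, interval))
	fmt.Println(pollDelay(nil, true, interval))
}
```

In a mixed run, the second call shows the point of the change: a live response keeps its normal delay even though the transport as a whole is in a replay mode.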

Testing

  1. Added unit tests covering the new VCR replay detection, replay-miss handling, and polling behavior changes.
  2. Ran the relevant existing and newly added unit tests.
  3. Validated the changes locally with the Azure provider in:
     • record mode
     • replay mode
     • replay mode after modifying the Terraform configuration to intentionally trigger an "interaction not found" error

Test logs from the local validation runs:


~/go/src/github.com/hashicorp/azure-vcr main *9 !19 ?8 ❯ make acctests SERVICE='resource' TESTARGS='-count=1 -run=TestAccResourceGroup_basic' TESTTIMEOUT='90m' TC_TEST_VIA_VCR=record                       05:36:45
==> Checking that code complies with gofmt requirements...
==> Checking that Custom Timeouts are used...
==> Checking that acceptance test packages are used...
TF_ACC=1 go test -v ./internal/services/resource -count=1 -run=TestAccResourceGroup_basic -timeout 90m -ldflags="-X=github.com/hashicorp/terraform-provider-azurerm/version.ProviderVersion=acc"
=== RUN   TestAccResourceGroup_basic
=== PAUSE TestAccResourceGroup_basic
=== CONT  TestAccResourceGroup_basic
--- PASS: TestAccResourceGroup_basic (79.55s)
PASS
ok  	github.com/hashicorp/terraform-provider-azurerm/internal/services/resource	81.305s
~/g/s/githu/h/azure-vcr main *9 !19 ?8 ❯ make acctests SERVICE='resource' TESTARGS='-count=1 -run=TestAccResourceGroup_basic' TESTTIMEOUT='90m' TC_TEST_VIA_VCR=replay
==> Checking that code complies with gofmt requirements...
==> Checking that Custom Timeouts are used...
==> Checking that acceptance test packages are used...
TF_ACC=1 go test -v ./internal/services/resource -count=1 -run=TestAccResourceGroup_basic -timeout 90m -ldflags="-X=github.com/hashicorp/terraform-provider-azurerm/version.ProviderVersion=acc"
=== RUN   TestAccResourceGroup_basic
=== PAUSE TestAccResourceGroup_basic
=== CONT  TestAccResourceGroup_basic
--- PASS: TestAccResourceGroup_basic (53.74s)
PASS
ok  	github.com/hashicorp/terraform-provider-azurerm/internal/services/resource	55.480s
~/g/s/githu/h/azure-vcr main *9 !19 ?8 ❯ make acctests SERVICE='resource' TESTARGS='-count=1 -run=TestAccResourceGroup_basic' TESTTIMEOUT='90m' TC_TEST_VIA_VCR=replay
==> Checking that code complies with gofmt requirements...
==> Checking that Custom Timeouts are used...
==> Checking that acceptance test packages are used...
TF_ACC=1 go test -v ./internal/services/resource -count=1 -run=TestAccResourceGroup_basic -timeout 90m -ldflags="-X=github.com/hashicorp/terraform-provider-azurerm/version.ProviderVersion=acc"
=== RUN   TestAccResourceGroup_basic
=== PAUSE TestAccResourceGroup_basic
=== CONT  TestAccResourceGroup_basic
    testcase.go:203: Step 1/3 error: Error running apply: exit status 1

        Error: checking for presence of existing Resource Group (Subscription: "00000000-0000-0000-0000-000000000000"
        Resource Group Name: "acctestRG22-204501010329659324"): Get "https://management.azure.com/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/acctestRG22-204501010329659324?api-version=2023-07-01": requested interaction not found

          with azurerm_resource_group.test,
          on terraform_plugin_test.tf line 35, in resource "azurerm_resource_group" "test":
          35: resource "azurerm_resource_group" "test" {

--- FAIL: TestAccResourceGroup_basic (12.94s)
FAIL
FAIL	github.com/hashicorp/terraform-provider-azurerm/internal/services/resource	14.746s
FAIL
make: *** [acctests] Error 1
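
The third run above fails after roughly 13 seconds with "requested interaction not found" rather than waiting through retries. A minimal sketch of that fail-fast idea follows. The sentinel errors, `isReplayMiss`, and `retry` are hypothetical stand-ins, not the SDK's real types; only the error text is taken from the log. The sketch also shows why wrapping with `%w` matters: `errors.Is` can see through a `%w` wrap but not through a `%+v` one.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Hypothetical sentinels. The replay-miss text matches the log above.
var (
	errInteractionNotFound = errors.New("requested interaction not found")
	errTransient           = errors.New("connection reset")
)

// isReplayMiss reports whether err wraps the replay-miss sentinel.
func isReplayMiss(err error) bool {
	return errors.Is(err, errInteractionNotFound)
}

// retry calls fn up to attempts times, doubling the wait between tries,
// but returns immediately on success or on a replay miss.
// It returns the number of calls made and the last error.
func retry(attempts int, wait time.Duration, fn func() error) (int, error) {
	var err error
	for i := 1; i <= attempts; i++ {
		if err = fn(); err == nil || isReplayMiss(err) {
			return i, err
		}
		if i < attempts {
			time.Sleep(wait)
			wait *= 2
		}
	}
	return attempts, err
}

func main() {
	calls, err := retry(5, time.Millisecond, func() error {
		// %w keeps the sentinel reachable through errors.Is.
		return fmt.Errorf("polling: %w", errInteractionNotFound)
	})
	fmt.Println(calls, isReplayMiss(err)) // prints "1 true"

	// %+v flattens the error into text, so the sentinel is no longer detectable.
	flattened := fmt.Errorf("polling: %+v", errInteractionNotFound)
	fmt.Println(isReplayMiss(flattened)) // prints "false"
}
```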

This is a (please select all that apply):

  • Bug Fix
  • New Feature
  • Enhancement
  • Breaking Change

Related Issue(s)

Fixes #0000

Rollback Plan

If a change needs to be reverted, we will publish an updated version of the provider.

Changes to Security Controls

Are there any changes to security controls (access controls, encryption, logging) in this pull request? If so, explain.

Note

If this PR changes meaningfully during the course of review please update the title and description as required.

@rigalGit rigalGit requested a review from a team as a code owner April 2, 2026 00:46
@github-actions github-actions Bot added the release-once-merged The SDK should be released once this PR is merged label Apr 2, 2026

2 participants