diff --git a/app/forms/concerns/hyrax/redirects_field_behavior.rb b/app/forms/concerns/hyrax/redirects_field_behavior.rb index bac11ef490..a43662ea64 100644 --- a/app/forms/concerns/hyrax/redirects_field_behavior.rb +++ b/app/forms/concerns/hyrax/redirects_field_behavior.rb @@ -54,7 +54,7 @@ def deserialize!(params) # Builds plain hashes (the persisted shape) from the submitted # `redirects_attributes` payload. Drops rows marked for destruction # or with a blank path. Normalizes paths to the canonical form - # stored in the uniqueness ledger. + # stored in the redirects table. def redirects_attributes_populator(fragment:, **_options) return unless respond_to?(:redirects) return unless Hyrax.config.redirects_active? diff --git a/app/models/hyrax/redirect_path.rb b/app/models/hyrax/redirect_path.rb index 6254f5eb3a..811385c75b 100644 --- a/app/models/hyrax/redirect_path.rb +++ b/app/models/hyrax/redirect_path.rb @@ -1,7 +1,7 @@ # frozen_string_literal: true module Hyrax - # ActiveRecord-backed uniqueness ledger for redirect paths. + # ActiveRecord-backed redirects table for global path uniqueness. # # Maintained by Hyrax::Transactions::Steps::SyncRedirectPaths and # Hyrax::Transactions::Steps::RemoveRedirectPaths; queried by diff --git a/documentation/redirects.md b/documentation/redirects.md index dd93144892..7930784efb 100644 --- a/documentation/redirects.md +++ b/documentation/redirects.md @@ -124,7 +124,7 @@ When the config is off, the loader filters `redirects.yaml` out of the schema se ## Path normalization -Every redirect path in Hyrax — whether typed into a form, looked up by the resolver, written to the uniqueness ledger, or queried by the validator — passes through `Hyrax::RedirectPathNormalizer.call`. This is the single source of truth for "what does this path look like on disk?". Normalization rules: +Every redirect path in Hyrax — whether typed into a form, looked up by the resolver, written to the redirects table, or queried by the validator — passes through `Hyrax::RedirectPathNormalizer.call`. This is the single source of truth for "what does this path look like on disk?". Normalization rules: 1. If the input parses as a URL with a scheme and host (e.g. `https://old.example.edu/handle/12345/678`), keep only the path component. 2. Strip query strings (`?utm_source=foo`) and fragments (`#section`). @@ -136,7 +136,7 @@ The normalizer is idempotent — `normalize(normalize(x)) == normalize(x)` — a - `Hyrax::RedirectsFieldBehavior#redirects_attributes_populator` normalizes each entry's path on form submission, before validation. The normalized form is what the validator sees and what the resource persists. - `Hyrax::RedirectsController#show` (the resolver) normalizes the incoming request path before the Solr lookup, so `/foo/` and `/foo` both resolve. - `Hyrax::RedirectsLookup` normalizes its input on construction, so callers can pass any reasonable form. -- `Hyrax::Transactions::Steps::SyncRedirectPaths` normalizes paths before writing to the ledger, as defense in depth. +- `Hyrax::Transactions::Steps::SyncRedirectPaths` normalizes paths before writing to the redirects table, as defense in depth. A user pasting a full URL from an old DSpace page (`https://old.example.edu/handle/123?utm=email`) sees the form quietly accept and persist `/handle/123`. @@ -172,9 +172,9 @@ end The validator rejects any redirect path that equals one of these prefixes, or starts with one followed by `/` (so `/admin` is reserved, and `/admin/anything` is reserved, but `/administrator` would *not* be reserved by the `/admin` entry). -### Uniqueness lookup and the `hyrax_redirect_paths` ledger +### Uniqueness lookup and the `hyrax_redirect_paths` table -Global uniqueness is enforced by a Postgres table, `hyrax_redirect_paths`, which has a unique B-tree index on `path`. The table is a derived ledger of every redirect path currently in use, and the unique index gives the hard guarantee that no two records can share a path even under concurrent saves. A second non-unique index on `resource_id` supports the per-resource sync described below. +Global uniqueness is enforced by a Postgres table, `hyrax_redirect_paths`, which has a unique B-tree index on `path`. The table is a derived record of every redirect path currently in use, and the unique index gives the hard guarantee that no two records can share a path even under concurrent saves. A second non-unique index on `resource_id` supports the per-resource sync described below. `Hyrax::RedirectsLookup` is the single point of truth for "is this path taken?". It queries the table: @@ -184,12 +184,12 @@ SELECT 1 FROM hyrax_redirect_paths WHERE path = ? AND resource_id <> ? LIMIT 1; The validator calls `Hyrax::RedirectsLookup.taken?(path, except_id: record.id)` to give the user friendly feedback at form-submit time. If two simultaneous requests both pass validation (because both checked the table before either committed), the unique index rejects the second one at insert time and the enclosing transaction returns `Failure`. -### Sync between `redirects` attribute and the ledger +### Sync between `redirects` attribute and the redirects table -The ledger is kept in sync by two `dry-transaction` steps composed into the create/update/destroy transactions: +The redirects table is kept in sync by two `dry-transaction` steps composed into the create/update/destroy transactions: -- `Hyrax::Transactions::Steps::SyncRedirectPaths` — runs after the resource is saved (in `WorkCreate`, `WorkUpdate`, `CollectionCreate`, `CollectionUpdate`). Deletes the resource's existing rows and reinserts the current redirect set in a single DB transaction. On `ActiveRecord::RecordNotUnique` (race lost), returns `Failure([:redirect_path_collision, ...])`, which short-circuits the enclosing transaction and surfaces back to the controller. No-op when either the config or the Flipflop is off — there's no point writing ledger rows when the feature isn't actively in use. -- `Hyrax::Transactions::Steps::RemoveRedirectPaths` — runs before `delete_resource` in `WorkDestroy` and `CollectionDestroy`. Clears the resource's rows so deleted resources don't leave dangling claims on redirect paths. Gated only on the config (not the Flipflop): cleanup must happen regardless of whether the feature is currently in active use, so that an admin toggling the Flipflop off mid-deployment doesn't leave orphaned rows that could later collide with new resources after a re-enable. +- `Hyrax::Transactions::Steps::SyncRedirectPaths` — runs after the resource is saved (in `WorkCreate`, `WorkUpdate`, `CollectionCreate`, `CollectionUpdate`). Deletes the resource's existing rows and reinserts the current redirect set in a single DB transaction. On `ActiveRecord::RecordNotUnique` (race lost), returns `Failure([:redirect_path_collision, ...])`, which short-circuits the enclosing transaction and surfaces back to the controller. On any other `ActiveRecord::StatementInvalid` (missing table, schema drift, connection drop), logs the error and returns `Failure([:redirect_path_sync_error, ...])` rather than raising — the resource is already persisted by the time this step runs, so a soft-fail is preferable to a 500 with partial state. No-op when either the config or the Flipflop is off. +- `Hyrax::Transactions::Steps::RemoveRedirectPaths` — runs before `delete_resource` in `WorkDestroy` and `CollectionDestroy`. Clears the resource's rows so deleted resources don't leave dangling claims on redirect paths. Same `StatementInvalid` treatment as the sync step: returns `Failure([:redirect_path_remove_error, ...])` instead of raising. Gated only on the config (not the Flipflop): cleanup must happen regardless of whether the feature is currently in active use, so that an admin toggling the Flipflop off mid-deployment doesn't leave orphaned rows that could later collide with new resources after a re-enable. Neither step does anything when the config is off, so adopters who don't enable the redirects feature pay no cost for them. @@ -287,5 +287,5 @@ Alternatively, gate the inclusion of the calling code itself on `Hyrax.config.re - `Hyrax::RedirectValidator` (`app/validators/hyrax/redirect_validator.rb`) — the form-level entry validator. - `Hyrax::RedirectPathNormalizer` (`app/services/hyrax/redirect_path_normalizer.rb`) — canonical-form normalization for redirect paths. - `Hyrax::RedirectsLookup` (`app/services/hyrax/redirects_lookup.rb`) — the uniqueness lookup against `hyrax_redirect_paths`. -- `Hyrax::RedirectPath` (`app/models/hyrax/redirect_path.rb`) — ActiveRecord model for the `hyrax_redirect_paths` ledger. -- `Hyrax::Transactions::Steps::SyncRedirectPaths` and `Hyrax::Transactions::Steps::RemoveRedirectPaths` (`lib/hyrax/transactions/steps/`) — the transaction steps that keep the ledger in sync with each resource's `redirects` attribute. +- `Hyrax::RedirectPath` (`app/models/hyrax/redirect_path.rb`) — ActiveRecord model for the `hyrax_redirect_paths` table. +- `Hyrax::Transactions::Steps::SyncRedirectPaths` and `Hyrax::Transactions::Steps::RemoveRedirectPaths` (`lib/hyrax/transactions/steps/`) — the transaction steps that keep the redirects table in sync with each resource's `redirects` attribute. diff --git a/lib/hyrax/transactions/steps/remove_redirect_paths.rb b/lib/hyrax/transactions/steps/remove_redirect_paths.rb index d20ef71d75..fa22fa3d17 100644 --- a/lib/hyrax/transactions/steps/remove_redirect_paths.rb +++ b/lib/hyrax/transactions/steps/remove_redirect_paths.rb @@ -5,7 +5,7 @@ module Hyrax module Transactions module Steps # A `dry-transaction` step that removes a resource's rows from the - # `hyrax_redirect_paths` uniqueness ledger. Runs as part of the destroy + # `hyrax_redirect_paths` redirects table. Runs as part of the destroy # transactions so deleted resources don't leave dangling claims on # redirect paths. # @@ -16,8 +16,12 @@ class RemoveRedirectPaths # @param [Valkyrie::Resource] resource # @return [Dry::Monads::Result] def call(resource) - Hyrax::RedirectPath.where(resource_id: resource.id.to_s).delete_all if removable?(resource) + return Success(resource) unless removable?(resource) + Hyrax::RedirectPath.where(resource_id: resource.id.to_s).delete_all Success(resource) + rescue ActiveRecord::StatementInvalid => e + Hyrax.logger.error("[redirects] remove_redirect_paths failed: #{e.message}") + Failure([:redirect_path_remove_error, e.message]) end private diff --git a/lib/hyrax/transactions/steps/sync_redirect_paths.rb b/lib/hyrax/transactions/steps/sync_redirect_paths.rb index d274d9f8c6..d407397e11 100644 --- a/lib/hyrax/transactions/steps/sync_redirect_paths.rb +++ b/lib/hyrax/transactions/steps/sync_redirect_paths.rb @@ -5,7 +5,7 @@ module Hyrax module Transactions module Steps # A `dry-transaction` step that mirrors a saved resource's `redirects` - # entries into the `hyrax_redirect_paths` uniqueness ledger. The unique + # entries into the `hyrax_redirect_paths` redirects table. The unique # index on `path` enforces global uniqueness at the DB level — if a # concurrent save already claimed a path, the insert raises # ActiveRecord::RecordNotUnique and this step returns Failure, which @@ -26,6 +26,9 @@ def call(object) Success(object) rescue ActiveRecord::RecordNotUnique => e Failure([:redirect_path_collision, e.message]) + rescue ActiveRecord::StatementInvalid => e + Hyrax.logger.error("[redirects] sync_redirect_paths failed: #{e.message}") + Failure([:redirect_path_sync_error, e.message]) end private diff --git a/spec/lib/hyrax/transactions/steps/remove_redirect_paths_spec.rb b/spec/lib/hyrax/transactions/steps/remove_redirect_paths_spec.rb index 1cd8745dc1..191ca939c1 100644 --- a/spec/lib/hyrax/transactions/steps/remove_redirect_paths_spec.rb +++ b/spec/lib/hyrax/transactions/steps/remove_redirect_paths_spec.rb @@ -35,5 +35,20 @@ expect(step.call(resource)).to be_success end end + + context 'when the underlying redirects table query raises StatementInvalid' do + before do + allow(Hyrax::RedirectPath).to receive(:where) + .and_raise(ActiveRecord::StatementInvalid, 'PG::UndefinedTable: relation "hyrax_redirect_paths" does not exist') + allow(Hyrax.logger).to receive(:error) + end + + it 'returns Failure with a redirect_path_remove_error tag and logs the error' do + result = step.call(resource) + expect(result).to be_failure + expect(result.failure.first).to eq(:redirect_path_remove_error) + expect(Hyrax.logger).to have_received(:error).with(/remove_redirect_paths failed/) + end + end end end diff --git a/spec/lib/hyrax/transactions/steps/sync_redirect_paths_spec.rb b/spec/lib/hyrax/transactions/steps/sync_redirect_paths_spec.rb index 41d0a78f73..18f7988fae 100644 --- a/spec/lib/hyrax/transactions/steps/sync_redirect_paths_spec.rb +++ b/spec/lib/hyrax/transactions/steps/sync_redirect_paths_spec.rb @@ -36,6 +36,21 @@ end end + context 'when the underlying redirects table query raises StatementInvalid' do + before do + allow(Hyrax::RedirectPath).to receive(:where) + .and_raise(ActiveRecord::StatementInvalid, 'PG::UndefinedTable: relation "hyrax_redirect_paths" does not exist') + allow(Hyrax.logger).to receive(:error) + end + + it 'returns Failure with a redirect_path_sync_error tag and logs the error' do + result = step.call(resource) + expect(result).to be_failure + expect(result.failure.first).to eq(:redirect_path_sync_error) + expect(Hyrax.logger).to have_received(:error).with(/sync_redirect_paths failed/) + end + end + context 'when the redirects feature is inactive' do before { allow(Hyrax.config).to receive(:redirects_active?).and_return(false) }