Initial commit adding the ability to use TextServices #623

Draft
JackLewis-digirati wants to merge 18 commits into develop from feature/textServiceNoIngest

Conversation

JackLewis-digirati commented Jun 15, 2026

Collaborator

Resolves #617

What does this change?

This PR enables pipelines in iiif-presentation and implements a connection to the text services project.

Note

This PR now wraps database operations in a transaction so that a failure to submit a manifest rolls everything back. Additionally, because text services requires the manifest to be available in S3 from the moment of submission, the staging manifest is saved to S3 before pipeline submission. If submission fails, this is rolled back by deleting the staging manifest from S3.
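The rollback behaviour described in this note can be sketched roughly as follows. This is a language-neutral illustration in Python, not the PR's code: `InMemoryDb`, `InMemoryStaging`, `submit_with_rollback` and the `submit_job` callable are all hypothetical stand-ins for the real database transaction, the S3 staging location and the text services client.

```python
class InMemoryDb:
    """Hypothetical stand-in for a transactional database."""

    def __init__(self):
        self.committed = {}
        self._pending = None

    def begin(self):
        self._pending = dict(self.committed)

    def save_manifest(self, manifest_id, body):
        self._pending[manifest_id] = body

    def commit(self):
        self.committed, self._pending = self._pending, None

    def rollback(self):
        self._pending = None


class InMemoryStaging:
    """Hypothetical stand-in for the S3 staging location."""

    def __init__(self):
        self.objects = {}

    def put(self, key, body):
        self.objects[key] = body

    def delete(self, key):
        self.objects.pop(key, None)


def submit_with_rollback(db, staging, submit_job, manifest_id, body):
    """Stage the manifest before submitting; undo both stores on failure."""
    db.begin()
    staged = False
    try:
        db.save_manifest(manifest_id, body)
        # The staged copy must exist before the pipeline sees the job.
        staging.put(manifest_id, body)
        staged = True
        submit_job(manifest_id)
        db.commit()
    except Exception:
        db.rollback()
        if staged:
            # Object storage is outside the DB transaction, so it needs
            # an explicit compensating delete.
            staging.delete(manifest_id)
        raise
```

The point of the sketch is that the object store does not take part in the database transaction, so the delete has to be issued explicitly in the failure path.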

Database Migration

Note

Details of migration. What does it change? Is it breaking/non-breaking?

  • What it does: adds the pipeline_jobs table that tracks jobs being submitted to text-services, while being extendable to other pipelines in the future
  • Breaking Change? No

Configuration Changes

Note

This PR introduces configuration changes.

| Service | AppSetting | Required? | Description | Default |
|---|---|---|---|---|
| API | TextServices:BuilderApiUri | N | The location of the text services builder | null |
| BackgroundHandler | TextServices:SearchApiUri | N | The location of the text services search | null |
| BackgroundHandler | AWS:SQS:TextJobQueueName | N | The SQS queue holding completed text services jobs | null |

@JackLewis-digirati

Collaborator Author

What this PR does

Implements the text-services pipeline integration (issue #617). When a manifest is created or updated with a pipeline property containing a text step, the API:

  1. Creates a PipelineJob record (new DB table) tracking the job status
  2. Submits the job to the text-services API via TextServicesClient
  3. Returns 202 Accepted while the job is in flight (the staged manifest is held in S3)

Once the text service finishes, a background SQS handler (TextServiceJobCompletionMessageHandler) picks up the completion message and:

  • On success: reads the staged manifest from S3, merges any SearchService2 entries from the text-augmented manifest into it, writes the final manifest to the public bucket, and marks the job as Completed
  • On failure: marks the job as Failed, deletes the staged manifest from S3
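The two completion branches above can be sketched as follows. This is a Python illustration under stated assumptions, not the handler's actual code: the message shape (`job_id`, `succeeded`), the dict-based stores, and the `fetch_augmented` and `merge` callables are hypothetical, and how the real handler obtains the text-augmented manifest is not described in this PR text.

```python
def handle_completion(message, jobs, staging, public, fetch_augmented, merge):
    """Apply a completion message to the job record and the manifest stores.

    jobs, staging and public are plain dicts standing in for the
    pipeline_jobs table, the S3 staging location and the public bucket.
    """
    job = jobs[message["job_id"]]
    manifest_id = job["manifest_id"]

    if message["succeeded"]:
        staged = staging[manifest_id]              # read the staged manifest
        augmented = fetch_augmented(manifest_id)   # text-augmented manifest
        public[manifest_id] = merge(staged, augmented)
        job["status"] = "Completed"
    else:
        job["status"] = "Failed"
        staging.pop(manifest_id, None)             # drop the staged manifest

    return job["status"]
```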

Key design decisions

PipelineJob entity follows the same ManifestId? / CollectionId? pattern as Hierarchy — nullable FKs with a check constraint ensuring exactly one is set (num_nonnulls(manifest_id, collection_id) = 1). This allows the same table to be reused for collection-level pipeline jobs in future without a schema change.
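To illustrate the "exactly one of two nullable keys" rule, here is a small Python sketch that mirrors the check constraint in application code. The `PipelineJob` dataclass below is a hypothetical simplification (its `job_id` and `status` fields are assumptions), and `num_nonnulls` is a local re-implementation of the PostgreSQL function of the same name.

```python
from dataclasses import dataclass
from typing import Optional


def num_nonnulls(*values):
    """Count the arguments that are not None, like PostgreSQL's num_nonnulls."""
    return sum(value is not None for value in values)


@dataclass(frozen=True)
class PipelineJob:
    job_id: int
    status: str = "Pending"               # assumed field, for illustration
    manifest_id: Optional[str] = None
    collection_id: Optional[str] = None

    def __post_init__(self):
        # Mirrors: CHECK (num_nonnulls(manifest_id, collection_id) = 1)
        if num_nonnulls(self.manifest_id, self.collection_id) != 1:
            raise ValueError(
                "exactly one of manifest_id or collection_id must be set"
            )
```

With this shape, a job for a collection is just a row with `collection_id` set and `manifest_id` left null, which is why no schema change would be needed later.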

Only SearchService2 is merged from the text-augmented manifest — no other service types. The Search 2 context URL (http://iiif.io/api/search/2/context.json) is also merged into the base manifest's @context.
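A minimal sketch of this merge rule, written in Python over plain dicts rather than the project's manifest types. The function name `merge_search_services` is hypothetical. Two choices here are assumptions rather than something the PR states: the Search 2 context is only added when a `SearchService2` was actually found, and it is prepended so that the Presentation 3 context stays last in the array, as the Presentation API 3 specification requires.

```python
SEARCH2_CONTEXT = "http://iiif.io/api/search/2/context.json"


def merge_search_services(base, augmented):
    """Return a copy of base with SearchService2 entries from augmented."""
    merged = dict(base)
    found = [
        service
        for service in augmented.get("service", [])
        if service.get("type") == "SearchService2"
    ]
    if not found:
        return merged

    services = list(base.get("service", []))
    known_ids = {service.get("id") for service in services}
    for service in found:
        if service.get("id") not in known_ids:
            services.append(service)
            known_ids.add(service.get("id"))
    merged["service"] = services

    # @context may be a single string or a list of strings.
    context = base.get("@context", [])
    contexts = [context] if isinstance(context, str) else list(context)
    if SEARCH2_CONTEXT not in contexts:
        contexts.insert(0, SEARCH2_CONTEXT)
    merged["@context"] = contexts
    return merged
```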

ManifestContextExtensions.GetContextStrings() is a shared helper normalising the polymorphic manifest.Context field (string / JArray / JValue) into IEnumerable<string>, used by both ManifestMerger and the completion handler.
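The normalisation idea can be shown with a short Python analogue. This is not the C# helper itself: `get_context_strings` is a hypothetical name, a Python `str` or `list` stands in for the string / JArray / JValue cases, and skipping non-string entries (such as inline context objects) is an assumption.

```python
def get_context_strings(context):
    """Yield every context URI from a string, a list, or None."""
    if context is None:
        return
    if isinstance(context, str):
        yield context
        return
    for item in context:
        # Assumption: inline context objects are ignored.
        if isinstance(item, str):
            yield item
```

Callers can then test for a context with a plain membership check regardless of how the manifest expressed it, for example `"http://iiif.io/api/search/2/context.json" in get_context_strings(ctx)`.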

JackLewis-digirati and others added 2 commits June 24, 2026 16:20
Use AddDistinctById() helper as used elsewhere, updated it to return the
number of items added, change base type to make more accommodating and
add optional hook to alter item on add.

Set label on AutoComplete and SearchService if not set.

Rather than iterate and read contexts, add search2 context manually as
it's a published constant
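
The helper behaviour described in these commit messages (add only items with a new id, return how many were added, optionally alter each item as it is added) might look roughly like this in Python. `add_distinct_by_id` and `ensure_label` are hypothetical names over plain dicts, not the project's C# `AddDistinctById()` signature, and the default label text is invented for the example.

```python
def add_distinct_by_id(target, items, on_add=None):
    """Append items whose id is not already in target; return the count added."""
    existing = {item.get("id") for item in target}
    added = 0
    for item in items:
        item_id = item.get("id")
        if item_id in existing:
            continue
        if on_add is not None:
            on_add(item)  # optional hook to alter the item before it is added
        target.append(item)
        existing.add(item_id)
        added += 1
    return added


def ensure_label(service):
    """Example hook: set a label only when the service has none."""
    service.setdefault("label", {"en": ["Search"]})
```

Returning the count lets the caller decide whether follow-up work, such as adding the Search 2 context, is needed at all.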

Development

Successfully merging this pull request may close these issues.

Manifests not requiring IIIF-CS ingestion

2 participants