Display VTT transcripts in audio/video players #7418
Open
eltiffster wants to merge 43 commits into main from av-transcripts
Commits
a35ae68 Add transcript_ids property to file set model(s) (eltiffster)
9d63a58 Add transcript_ids to file set form (eltiffster)
6c55ba0 Copy changes from @kirkkwang's draft PR, WIP: Implement Transcription… (eltiffster)
2a4df1d Support transcriptions for ActiveFedora works (eltiffster)
ec91a73 Fix broken specs (eltiffster)
b4a9419 Normalize VTT file set language values to 2-letter language codes. (eltiffster)
757101b Simplify iiif_manifest_presenter_spec#annotates_content (eltiffster)
6a5da1d Add comments and rubocop fixes (eltiffster)
1d60f21 Add language field to file set form and move transcript ids form hint… (eltiffster)
3da19f1 Add VTT transcripts to default audio/video partials (eltiffster)
465ce27 Fix "l is not a function" error when displaying VTT transcripts in An… (eltiffster)
1371460 Fix "l is not a function" error for other file types. The implementat… (eltiffster)
5ed7239 Use the file set presenter to render the file set edit form instead o… (eltiffster)
148273e Allow transcriptions_controller to serve file types other than VTT. F… (eltiffster)
3067885 Fix syntax error in clover.js (eltiffster)
6dd9b6e Increase the height of Clover IIIF viewer so that thumbnails for work… (eltiffster)
5319280 Rename "transcriptions" to "transcripts" for consistency (eltiffster)
26de256 Account for ActiveTriples::Resource in a transcript's language field.… (eltiffster)
ff06bb5 Remove unused fallback label, since a file set should always have a t… (eltiffster)
1a10000 Make language optional again in IIIF manifest annotations, especially… (eltiffster)
d0074b0 Delete a stray, unnecessary comment (eltiffster)
8368731 Rename .valid_transcripts to .available_transcripts (eltiffster)
7ff49e6 Update file_set_form_helper_spec.rb (eltiffster)
b7dfbac Merge branch 'main' into av-transcripts (eltiffster)
2252b97 Fixes for Koppie: use file_ids_ssim.first instead of original_file_id (eltiffster)
5c3eca3 Fix and refactor file_set_form_helper_spec to use top-level context b… (eltiffster)
15a6c27 Rubocop fixes and add comment (eltiffster)
1bca09e Refactor file set form (eltiffster)
4738f74 Use cached parent for FileSetFormHelper instead of running an extra q… (eltiffster)
18d41a0 Add vtt file metadata to file_set_form_helper_spec (eltiffster)
e1cb16e When searching for available transcripts. use an fq filter to capture… (eltiffster)
3887e63 Merge branch 'main' into av-transcripts (eltiffster)
c5a6cfe Fix whitespace and add some clarifying comments. (eltiffster)
dd1fc7f Authorize transcript before streaming the file (eltiffster)
66e0607 Remove test for ActiveTriples::Resource in presenters/hyrax/displays_… (eltiffster)
f06ef78 Make `transcript_ids` property conditional on Hyrax.config.file_set_i… (eltiffster)
54d7d69 Update metadata profiles to fix allinson and koppie builds (eltiffster)
b9dc1ea Move the transcripts_ids form field to views/records/edit_fields so i… (eltiffster)
893f599 Move transcript_ids into the file set metadata schema (eltiffster)
5b914be Merge branch 'main' into av-transcripts (eltiffster)
cbc3d58 Update clover.js to v.3.6.0, which officially fixes https://github.co… (eltiffster)
d51a870 Merge branch 'main' into av-transcripts (eltiffster)
97da895 Merge branch 'main' into av-transcripts (orangewolf)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,31 @@ | ||
| # frozen_string_literal: true | ||
| | ||
| module Hyrax | ||
| class TranscriptsController < DownloadsController | ||
| def show | ||
| # Using the extracted text from the index is blocked | ||
| # by https://github.com/samvera/hyrax/issues/7410, so we | ||
| # need to get the original file instead. | ||
| file_metadata = find_file_metadata(file_set: Hyrax.query_service.find_by(id: params.require(:id))) | ||
| file = Hyrax.storage_adapter.find_by(id: file_metadata.file_identifier) | ||
| | ||
| prepare_file_headers_valkyrie(metadata: file_metadata, file: file) | ||
| response.headers['Access-Control-Allow-Origin'] = '*' | ||
| send_file file.disk_path, data_options(file_metadata) | ||
| end | ||
| | ||
| private | ||
| | ||
| def disposition | ||
| 'inline' | ||
| end | ||
| | ||
| def data_options(file_metadata) | ||
| { | ||
| type: "#{file_metadata.mime_type}; charset=utf-8", | ||
| filename: file_metadata.original_filename, | ||
| disposition: disposition | ||
| } | ||
| end | ||
| end | ||
| end |
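
To make the response shape concrete, here is a minimal sketch of the options hash that `data_options` builds for a VTT file. `FileMetadataStub` is a hypothetical stand-in for the real file metadata object, modelling only the two attributes the method reads, and the `disposition:` keyword is an illustration-only parameter in place of the controller's private `disposition` method.

```ruby
# Hypothetical stand-in for the file metadata object; only the two
# attributes read by data_options are modelled.
FileMetadataStub = Struct.new(:mime_type, :original_filename, keyword_init: true)

# Mirrors data_options from the diff above. The disposition is passed in
# as a keyword here (an assumption made so the sketch is self-contained).
def data_options(file_metadata, disposition: "inline")
  {
    type: "#{file_metadata.mime_type}; charset=utf-8",
    filename: file_metadata.original_filename,
    disposition: disposition
  }
end

metadata = FileMetadataStub.new(mime_type: "text/vtt", original_filename: "interview.en.vtt")
options = data_options(metadata)
puts options.inspect
```

With `disposition` set to `'inline'`, the browser is asked to display the file rather than save it, which is presumably what an embedded viewer fetching captions needs.
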
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,40 @@ | ||
| # frozen_string_literal: true | ||
| module Hyrax | ||
| module TranscriptsBehavior | ||
| extend ActiveSupport::Concern | ||
| | ||
| class_methods do | ||
| def available_transcripts(parent:, current_ability:) | ||
| member_ids = Hyrax.custom_queries.find_child_file_set_ids(resource: parent) | ||
| Hyrax::SolrQueryService.new | ||
| # Cast Valkyrie::IDs to strings | ||
| .with_ids(ids: member_ids.map(&:to_s).to_a) | ||
| .accessible_by(ability: current_ability, action: :edit) | ||
| .solr_documents( | ||
| # Using "has_model_ssim:*FileSet" in fq will return both FileSet | ||
| # and Hyrax::FileSet documents. In test mode, Koppie and Sirenia | ||
| # index file sets with has_model_ssim:Hyrax::FileSet. | ||
| # But in dev mode, they index file sets with | ||
| # has_model_ssim:FileSet instead. This query covers both cases. | ||
| fq: [mime_type_filter_query.to_s, "has_model_ssim:*FileSet"], | ||
| fl: "id,title_tesim", | ||
| rows: 1000 | ||
| ) | ||
| end | ||
| | ||
| private | ||
| | ||
| def mime_type_filter_query | ||
| valid_mime_types.map { |type| "mime_type_ssi:\"#{type}\"" }.join(" OR ") | ||
| end | ||
| | ||
| # According to IIIF, .srt and .ttml are also acceptable but may | ||
| # not be supported by viewers. Clover and Ramp are confirmed to work | ||
| # with .vtt. (https://iiif.io/api/cookbook/recipe/0219-using-caption-file/). | ||
| # When Hyrax supports Ramp, we may want to add "text/plain" (.srt) to this list. | ||
| def valid_mime_types | ||
| ["text/vtt"] | ||
| end | ||
| end | ||
| end | ||
| end |
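
As a rough illustration of the filter string passed in `fq`, the sketch below reproduces `mime_type_filter_query` as a standalone method. Taking the MIME types as an argument is an assumption made for the example; in the diff they come from `valid_mime_types`. The second call shows what the string would look like if `"text/plain"` were added, as the code comment suggests might happen later.

```ruby
# Standalone copy of the filter builder. The list is passed as an argument
# (an assumption for this sketch) instead of being read from valid_mime_types.
def mime_type_filter_query(valid_mime_types)
  valid_mime_types.map { |type| "mime_type_ssi:\"#{type}\"" }.join(" OR ")
end

single = mime_type_filter_query(["text/vtt"])
multiple = mime_type_filter_query(["text/vtt", "text/plain"])

puts single
puts multiple
```
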
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,24 @@ | ||
| # frozen_string_literal: true | ||
| | ||
| module Hyrax | ||
| module FileSetFormHelper | ||
| def render_transcript_ids_field?(file_set) | ||
| return unless file_set.persisted? | ||
| return if @parent.nil? | ||
| case file_set | ||
| when ActiveFedora::Base | ||
| file_set.video? || file_set.audio? | ||
| when Valkyrie::Resource | ||
| service = Hyrax::FileSetTypeService.new(file_set: file_set) | ||
| service.video? || service.audio? | ||
| end | ||
| end | ||
| | ||
| def transcript_ids_select_options | ||
| options = Forms::FileSetForm.available_transcripts(parent: @parent, current_ability: current_ability) | ||
| options.each_with_object({}) do |doc, hash| | ||
| hash[doc.title_or_label] = doc.id.to_s | ||
| end | ||
| end | ||
| end | ||
| end |
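
The select options are a hash keyed by each document's title. The following sketch, which uses a hypothetical `SolrDocStub` in place of real Solr documents, shows the resulting shape and one consequence of that choice: if two transcripts share a title, the later one replaces the earlier one. Whether duplicate titles occur in practice is not something the diff establishes.

```ruby
# Hypothetical stand-in for a Solr document with the two methods the helper calls.
SolrDocStub = Struct.new(:id, :title_or_label, keyword_init: true)

# Same each_with_object pattern as transcript_ids_select_options, with the
# documents passed in directly instead of being queried.
def transcript_select_options(docs)
  docs.each_with_object({}) do |doc, hash|
    hash[doc.title_or_label] = doc.id.to_s
  end
end

docs = [
  SolrDocStub.new(id: "abc123", title_or_label: "English captions"),
  SolrDocStub.new(id: "def456", title_or_label: "French captions"),
  SolrDocStub.new(id: "ghi789", title_or_label: "English captions")
]
select_options = transcript_select_options(docs)
puts select_options.inspect
```
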
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,16 @@ | ||
| # frozen_string_literal: true | ||
| module Hyrax | ||
| class FileSet | ||
| module Transcripts | ||
| extend ActiveSupport::Concern | ||
| | ||
| included do | ||
| if Hyrax.config.file_set_include_metadata? | ||
| property :transcript_ids, predicate: ::RDF::URI.new('http://vocabulary.samvera.org/ns#transcriptIds'), multiple: true do |index| | ||
| index.as :stored_sortable | ||
| end | ||
| end | ||
| end | ||
| end | ||
| end | ||
| end |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,35 @@ | ||
| # frozen_string_literal: true | ||
| | ||
| module Hyrax | ||
| module AnnotatesContent | ||
| extend ActiveSupport::Concern | ||
| | ||
| include DisplaysTranscripts | ||
| | ||
| def annotation_content | ||
| transcription_content if video? || audio? | ||
| end | ||
| | ||
| private | ||
| | ||
| def transcription_content | ||
| transcripts.map do |doc| | ||
| options = { | ||
| type: 'Text', | ||
| motivation: 'supplementing', | ||
| body_id: transcript_url(doc, host: hostname, file_ext: file_ext(doc.mime_type)), | ||
| format: doc.mime_type, | ||
| label: doc.title_or_label | ||
| } | ||
| options[:language] = language_code(doc.language) if language_code(doc.language) | ||
| IIIFManifest::V3::AnnotationContent.new(**options) | ||
| end | ||
| end | ||
| | ||
| # If you change the accepted mime types in Hyrax::TranscriptsBehavior, | ||
| # you may also want to override this method | ||
| def file_ext(_mime_type) | ||
| "vtt" | ||
| end | ||
| end | ||
| end |
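
The comment on `file_ext` invites an override when more MIME types are accepted. One possible shape for such an override is sketched below. The `TRANSCRIPT_EXTENSIONS` constant and the `"text/plain"` entry are hypothetical; the diff itself only accepts `"text/vtt"` and always returns `"vtt"`.

```ruby
# Hypothetical lookup table. Only "text/vtt" is accepted by the code in this
# pull request; "text/plain" (.srt) is the future addition its comments mention.
TRANSCRIPT_EXTENSIONS = {
  "text/vtt" => "vtt",
  "text/plain" => "srt"
}.freeze

# Falls back to "vtt" for unknown types, matching the current fixed return value.
def file_ext(mime_type)
  TRANSCRIPT_EXTENSIONS.fetch(mime_type, "vtt")
end

puts file_ext("text/vtt")
puts file_ext("text/plain")
```
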
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,59 @@ | ||
| # frozen_string_literal: true | ||
| module Hyrax | ||
| module DisplaysTranscripts | ||
| extend ActiveSupport::Concern | ||
| | ||
| # @return [Array<SolrDocument>] the solr documents represented by | ||
| # the audio/video file set's transcript_ids | ||
| def transcripts | ||
| return [] if transcript_ids.blank? | ||
| @transcripts ||= begin | ||
| results = Hyrax::SolrQueryService.new | ||
| .accessible_by( | ||
| ability: (try(:current_ability) || ability), | ||
| action: :read | ||
| ) | ||
| .with_ids(ids: transcript_ids) | ||
| .solr_documents | ||
| sort_transcripts_by_language(results) | ||
| end | ||
| end | ||
| | ||
| def transcript_url(document, host: request.base_url, file_ext: "vtt") | ||
| Hyrax::Engine.routes.url_helpers.transcript_url(document.id, host: host, file_ext: file_ext) | ||
| end | ||
| | ||
| # Try our best to convert language field to an ISO 639-1 code for use in the IIIF manifest. | ||
| # @param [Array<String>] - a solr document's language field. Parseable values include an ISO 639-1 code, | ||
| # an ISO 639-3 code, or the English name for a language | ||
| # Examples: https://github.com/scsmith/language_list#examples | ||
| # @return [String or NilClass] - the 2-letter code, or nil if no value or value is unparseable | ||
| def language_code(language) | ||
| return if language.empty? | ||
| value = language.first | ||
| if URI.parse(value).scheme | ||
| # This is probably a Library of Congress languages URI | ||
| # like http://id.loc.gov/vocabulary/iso639-3/eng, which can | ||
| # be configured with the Questioning Authority gem. | ||
| # Try to extract the code from the URI. | ||
| LanguageList::LanguageInfo.find(value.split("/").last).try(:iso_639_1) | ||
| else | ||
| # Otherwise, assume it is a language code or name and try | ||
| # to convert it to a 2-letter code | ||
| LanguageList::LanguageInfo.find(value).try(:iso_639_1) | ||
| end | ||
| end | ||
| | ||
| private | ||
| | ||
| def sort_transcripts_by_language(results) | ||
| current_locale = I18n.locale.to_s | ||
| | ||
| # Sort alphabetically by language code | ||
| sorted = results.sort_by { |doc| language_code(doc.language) || '' } | ||
| | ||
| # Move current locale to front | ||
| sorted.partition { |doc| language_code(doc.language) == current_locale }.flatten | ||
| end | ||
| end | ||
| end |
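
The ordering produced by `sort_transcripts_by_language` can be seen in a small standalone sketch. `DocStub` is a hypothetical stand-in whose `code` attribute holds an already-resolved 2-letter code (or `nil`), so the example does not depend on the `language_list` gem or on `I18n`; the locale is passed in as an argument instead.

```ruby
# Hypothetical stand-in: `code` plays the role of language_code(doc.language).
DocStub = Struct.new(:id, :code, keyword_init: true)

# Same two steps as the method in the diff: sort alphabetically by code
# (nil treated as an empty string), then move the current locale to the front.
def sort_by_language(docs, current_locale:)
  sorted = docs.sort_by { |doc| doc.code || "" }
  sorted.partition { |doc| doc.code == current_locale }.flatten
end

docs = [
  DocStub.new(id: "t1", code: "fr"),
  DocStub.new(id: "t2", code: nil),
  DocStub.new(id: "t3", code: "en"),
  DocStub.new(id: "t4", code: "de")
]
ordered_ids = sort_by_language(docs, current_locale: "fr").map(&:id)
puts ordered_ids.inspect
```

Because `nil` codes are sorted as empty strings, transcripts without a parseable language come directly after the current-locale transcripts and ahead of the other languages.
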