# ----------------------------------- PCDM example adding a file to an object ---------------------------------
require "hydra/pcdm"
obj1 = Hydra::PCDM::Object.create
file1 = File.open("war_and_peace.pdf","r")
pcdm_file = Hydra::PCDM::File.new
pcdm_file.content = file1
obj1.files << pcdm_file
obj1.save

# ----------------------------------- Works example doing the same thing ---------------------------------
require "hydra/works"
genfil1 = Hydra::Works::GenericFile::Base.create
file1 = File.open("war_and_peace.pdf","r")
Hydra::Works::UploadFileToGenericFile.call(genfil1, file1)
genfil1.save

# ------------------- Works example doing the same thing + calling additional services -------------------
require "hydra/works"
genfil1 = Hydra::Works::GenericFile::Base.create
additional_services = [Hydra::Works::GenerateThumbnail]
file1 = File.open("war_and_peace.pdf","r")
Hydra::Works::UploadFileToGenericFile.call(genfil1, file1, additional_services: additional_services)
genfil1.save

# ------------------- Works example doing the same thing + calling additional services -------------------
# WHAT I EXPECTED BUT DOESN'T WORK
# --------------------------------
require "hydra/works"
genfil1 = Hydra::Works::GenericFile::Base.create
additional_services = [Hydra::Works::GenerateThumbnail,
                       Hydra::Works::ExtractFullText]
file1 = File.open("war_and_peace.pdf","r")
Hydra::Works::UploadFileToGenericFile.call(genfil1, file1, additional_services: additional_services)
genfil1.save

# ------------------- Works example doing the same thing calling derivatives outside upload -------------------
# WHAT I EXPECTED BUT DOESN'T WORK
# --------------------------------
require "hydra/works"
genfil1 = Hydra::Works::GenericFile::Base.create
additional_services = [Hydra::Works::GenerateThumbnail,
                       Hydra::Works::ExtractFullText]
file1 = File.open("war_and_peace.pdf","r")
Hydra::Works::UploadFileToGenericFile.call(genfil1, file1)
# create thumbnail - configured in models/concerns/generic_file/derivatives.rb#makes_derivatives
genfil1.create_derivatives
# create extracted text - Options:
#   1) move this to hydra-derivatives and update #makes_derivatives to generate this for appropriate file types
#   2) create a service object where this is one call and can be passed as an additional service
extracted_text = Hydra::Works::FullTextExtractionService.run(genfil1)
genfil1.build_extracted_text
genfil1.extracted_text.content = extracted_text
# QUESTION: Can makes_derivatives be overridden by classes using Hydra::Works::GenericFile to customize a different
#           set of derivatives to generate? models/concerns/generic_file/derivatives.rb#makes_derivatives becomes the default.
genfil1.save
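Option 2 in the comments above (a service object that does the extraction in one call and can be passed as an additional service) could look roughly like the sketch below. Everything in it is a stand-in: `SketchGenericFile`, `SketchExtractFullText`, and `SketchUpload` are hypothetical names invented for illustration, not Hydra::Works classes, and the sketch only assumes that each additional service responds to `call(generic_file)`.

```ruby
require "stringio"

# Stand-in for a generic file; NOT the Hydra::Works class.
class SketchGenericFile
  attr_accessor :content, :extracted_text
end

# Hypothetical service object: one call extracts the text and attaches it.
class SketchExtractFullText
  def self.call(generic_file)
    # A real service would delegate to a text extractor; this sketch just
    # reads the attached stream back as the "extracted" text.
    generic_file.content.rewind
    generic_file.extracted_text = generic_file.content.read
    generic_file
  end
end

# Hypothetical upload step: attach the file, then run each additional service.
class SketchUpload
  def self.call(generic_file, file, additional_services: [])
    generic_file.content = file
    additional_services.each { |service| service.call(generic_file) }
    generic_file
  end
end

gf = SketchUpload.call(SketchGenericFile.new, StringIO.new("Well, Prince..."),
                       additional_services: [SketchExtractFullText])
puts gf.extracted_text
```

With this shape, thumbnail generation and text extraction would share one calling convention, so the caller only has to choose which services to list.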
I would attach a thumbnail and extracted text like this based on the current code.
generic_file = Hydra::Works::GenericFile::Base.create
file = File.open("war_and_peace.pdf","r")
Hydra::Works::UploadFileToGenericFile.call(generic_file, file)
generic_file.create_derivatives
# This part isn't ideal - the service returns a string
extracted_text = Hydra::Works::FullTextExtractionService.run(generic_file)
generic_file.build_extracted_text
generic_file.extracted_text.content = extracted_text
There's a good example on the hydra-derivatives readme.
After attaching a file to a generic_file, call create_derivatives on it to create and attach the associated derivatives. The makes_derivatives block configures the processors and the derivative files/formats. The derivative processors create the derivative filestreams and then call the output service (the image processor calling the output_file_service is one example). The output service must conform to the documented signature: it takes 3 arguments, a generic_file, a filestream, and a destination name symbol.
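To make the three-argument signature concrete, here is a minimal sketch of an object with that shape. `InMemoryOutputFileService` and `SketchFile` are hypothetical names made up for this example, and the argument order simply follows the description above; this is not one of the shipped output services, and it keeps derivatives in a hash instead of persisting them.

```ruby
require "stringio"

# Hypothetical output service for illustration only.
class InMemoryOutputFileService
  # Derivatives keyed by [object id, destination name].
  def self.store
    @store ||= {}
  end

  # object:           the generic_file the derivative belongs to
  # stream:           IO-like object holding the derivative bytes
  # destination_name: symbol naming the derivative, e.g. :thumbnail
  def self.call(object, stream, destination_name)
    stream.rewind if stream.respond_to?(:rewind)
    store[[object.id, destination_name.to_sym]] = stream.read
  end
end

# Stand-in for a generic_file; only an id is needed for the sketch.
SketchFile = Struct.new(:id)

generic_file = SketchFile.new("gf-1")
InMemoryOutputFileService.call(generic_file, StringIO.new("thumbnail bytes"), :thumbnail)
puts InMemoryOutputFileService.store[["gf-1", :thumbnail]]
```

A service that attaches the derivative to the object or writes it to the filesystem would keep the same `call` shape and change only what happens to the stream.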
Three output file services are implemented: persist_derivative in Hydra::Works (a directly attached derivative), persist_basic_contained_output_file in Hydra::Derivatives, and persist_derivative in CurationConcerns (which writes derivatives to the filesystem).
I got the extracted_text bit from the CurationConcerns characterization service.
👍 looks good to me