@elrayle
Last active September 16, 2015 15:43
add files to pcdm/works
# ----------------------------------- PCDM example adding a file to an object ---------------------------------
require "hydra/pcdm"
obj1 = Hydra::PCDM::Object.create
file1 = File.open("war_and_peace.pdf","r")
pcdm_file = Hydra::PCDM::File.new
pcdm_file.content = file1
obj1.files << pcdm_file
obj1.save
# ----------------------------------- Works example doing the same thing ---------------------------------
require "hydra/works"
genfil1 = Hydra::Works::GenericFile::Base.create
file1 = File.open("war_and_peace.pdf","r")
Hydra::Works::UploadFileToGenericFile.call(genfil1, file1)
genfil1.save
# ------------------- Works example doing the same thing + calling additional services -------------------
require "hydra/works"
genfil1 = Hydra::Works::GenericFile::Base.create
additional_services = [Hydra::Works::GenerateThumbnail]
file1 = File.open("war_and_peace.pdf","r")
Hydra::Works::UploadFileToGenericFile.call(genfil1, file1, additional_services: additional_services)
genfil1.save
# ------------------- Works example doing the same thing + calling multiple additional services -------------------
# WHAT I EXPECTED BUT DOESN'T WORK
# --------------------------------
require "hydra/works"
genfil1 = Hydra::Works::GenericFile::Base.create
additional_services = [Hydra::Works::GenerateThumbnail,
Hydra::Works::ExtractFullText]
file1 = File.open("war_and_peace.pdf","r")
Hydra::Works::UploadFileToGenericFile.call(genfil1, file1, additional_services: additional_services)
genfil1.save
# ------------------- Works example doing the same thing calling derivatives outside upload -------------------
# WHAT I EXPECTED BUT DOESN'T WORK
# --------------------------------
require "hydra/works"
genfil1 = Hydra::Works::GenericFile::Base.create
additional_services = [Hydra::Works::GenerateThumbnail,
Hydra::Works::ExtractFullText]
file1 = File.open("war_and_peace.pdf","r")
Hydra::Works::UploadFileToGenericFile.call(genfil1, file1)
# create thumbnail - configured in models/concerns/generic_file/derivatives.rb#makes_derivatives
genfil1.create_derivatives
# create extracted text - Options:
# 1) move this to hydra-derivatives and update #makes_derivatives to generate this for appropriate file types
# 2) create a service object where this is one call and can be passed as an additional service
extracted_text = Hydra::Works::FullTextExtractionService.run(genfil1)
genfil1.build_extracted_text
genfil1.extracted_text.content = extracted_text
# QUESTION: Can makes_derivatives be overridden by classes using Hydra::Works::GenericFile to customize the set of
# derivatives to generate? models/concerns/generic_file/derivatives.rb#makes_derivatives would then be the default.
genfil1.save
@escowles
escowles commented Sep 4, 2015

👍 looks good to me

@tampakis
tampakis commented Sep 4, 2015

I would attach a thumbnail and extracted text like this based on the current code.

generic_file = Hydra::Works::GenericFile::Base.create
file = File.open("war_and_peace.pdf","r")
Hydra::Works::UploadFileToGenericFile.call(generic_file, file)
generic_file.create_derivatives

# This part isn't ideal - the service returns a string
extracted_text = Hydra::Works::FullTextExtractionService.run(generic_file)
generic_file.build_extracted_text
generic_file.extracted_text.content = extracted_text
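
One way to tidy up the three extraction lines above is to wrap them in a single service object, which is roughly what option 2 in the gist suggests. The following is only a sketch under stated assumptions: `AttachExtractedText`, `FakeGenericFile`, `FakeExtractionService`, and `ExtractedText` are hypothetical names that do not exist in hydra-works, and the stand-ins are there so the snippet runs without Fedora. Whether `UploadFileToGenericFile` would invoke an additional service through `call(generic_file)` is also an assumption.

```ruby
# Hypothetical stand-ins so the sketch runs without Fedora or hydra-works.
ExtractedText = Struct.new(:content)

class FakeGenericFile
  attr_reader :extracted_text

  # Mirrors the build_extracted_text call used in the thread.
  def build_extracted_text
    @extracted_text = ExtractedText.new
  end
end

# Stand-in for Hydra::Works::FullTextExtractionService, which returns a String.
class FakeExtractionService
  def self.run(_generic_file)
    "Well, Prince, so Genoa and Lucca are now just family estates..."
  end
end

# Hypothetical service object: one call that extracts the text and attaches it.
class AttachExtractedText
  def self.call(generic_file, extractor: FakeExtractionService)
    text = extractor.run(generic_file)
    generic_file.build_extracted_text
    generic_file.extracted_text.content = text
    generic_file
  end
end

gf = AttachExtractedText.call(FakeGenericFile.new)
puts gf.extracted_text.content
```

With real objects, the `extractor:` keyword would presumably point at `Hydra::Works::FullTextExtractionService`, and the caller would still need to save the generic file afterwards.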

There's a good example in the hydra-derivatives README.

After attaching a file to a generic_file, call create_derivatives on it to create and attach the associated derivatives. The makes_derivatives block configures the processors and the derivative files and formats. Each derivative processor creates the derivative filestream and then calls the output service (the image processor calling the output_file_service is one example). The output service must conform to the documented signature: it takes three arguments, a generic_file, a filestream, and a destination name symbol.

Three output file services are implemented: persist_derivative in Hydra::Works (a directly attached derivative), persist_basic_contained_output_file in Hydra::Derivatives, and persist_derivative in CurationConcerns (which writes derivatives to the filesystem).
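
To make the three-argument signature concrete, here is a minimal sketch of a custom output service. `InMemoryOutputService` is a hypothetical name, not one of the three services listed above, and it assumes the processor invokes the service through a `call` class method, which this thread does not confirm. It keeps derivatives in a Hash instead of persisting them, so it runs without Fedora.

```ruby
require "stringio"

# Hypothetical output service: stores derivatives in memory instead of persisting them.
class InMemoryOutputService
  # Maps each object to a Hash of destination name => derivative bytes.
  STORE = Hash.new { |hash, key| hash[key] = {} }

  # object           - the generic_file the derivative belongs to
  # file             - an IO-like stream holding the derivative bytes
  # destination_name - Symbol naming the derivative, e.g. :thumbnail
  def self.call(object, file, destination_name)
    STORE[object][destination_name] = file.read
  end
end

generic_file = Object.new # stand-in for a real generic_file
InMemoryOutputService.call(generic_file, StringIO.new("fake png bytes"), :thumbnail)
puts InMemoryOutputService::STORE[generic_file].keys.inspect
```

A service like this could be handy in tests, where writing derivatives to Fedora or the filesystem is unnecessary.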

@tampakis
tampakis commented Sep 4, 2015

I got the extracted_text bit from the CurationConcerns characterization service.
