@Hasstrup
Created November 5, 2024 21:42
# frozen_string_literal: true

require 'base64'
require 'tempfile'
require 'pdf/reader'

# The Creation service is responsible for creating a template from a given PDF file.
#
# It extracts instructions from the PDF, converts the PDF to HTML, and prepares parameters
# for template creation.
#
# @see Templates::Invoices::CreateInput for the expected input structure.
class Templates::Templates::Contexts::Creation < BaseService
  performs_checks

  # Initializes a new Creation service instance.
  #
  # @param [Templates::Invoices::CreateInput] input The input data containing file details and other attributes.
  # @return [void]
  def initialize(input:)
    @input = input
    @pages_instructions_map = {}
  end

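  # Processes the PDF file: extracts per-page instructions, converts the PDF to HTML,
  # and persists a Templates::Template record built from both.
  #
  # @return [Object] Whatever `safely_execute` (defined in BaseService) returns;
  #   on success it wraps the created template.
  def call
    safely_execute do
      after_extracting_pdf_instructions do
        succeed(::Templates::Template.create!(**template_create_params))
      end
    ensure
      # close! closes the handle and also unlinks the temporary file.
      tmpfile.close!
    end
  end
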
  private

  # `input` is listed here in case BaseService does not already expose it.
  attr_reader :input, :pages_instructions_map

  # Prepares parameters for creating a template.
  #
  # @return [Hash] A hash containing parameters for template creation.
  def template_create_params
    {
      reference_file_name: input.file_name,
      title: input.title,
      html_content: template_html_content,
      instructions: pages_instructions_map,
      user_id: input.user_id
    }
  end

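  # Converts the PDF content to HTML using MuPDF's `mutool convert`.
  #
  # @return [String] The converted HTML content of the PDF.
  # @raise [RuntimeError] If `mutool` is missing or exits with a non-zero status.
  def template_html_content
    # Write next to the tempfile rather than into the process's working directory.
    output_file = "#{tmpfile.path}.html"
    tmpfile.flush # make sure mutool sees the decoded bytes on disk
    # The argument-list form bypasses the shell, so the user-supplied file name
    # embedded in the tempfile path cannot inject commands.
    converted = system('mutool', 'convert', '-o', output_file, tmpfile.path)
    raise "mutool failed to convert #{input.file_name} to HTML" unless converted

    File.read(output_file)
  ensure
    File.delete(output_file) if output_file && File.exist?(output_file)
  end
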
  # Extracts instructions from the PDF pages and yields to the block.
  #
  # @yield [void] Yields control to the block after extraction is complete.
  def after_extracting_pdf_instructions
    reader.pages.each.with_index(1) do |page, page_number|
      receiver = ::PDF::Reader::RegisterReceiver.new
      page.walk(receiver)
      # The first callback only records the page number, which is already the
      # map key, so it is dropped before serializing.
      pages_instructions_map[page_number] = receiver.callbacks.drop(1).to_s
    end
    yield
  end

  # Initializes a PDF reader for the provided file.
  #
  # @return [PDF::Reader] The PDF reader instance.
  def reader
    @reader ||= ::PDF::Reader.new(tmpfile.path)
  end

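  # Creates a temporary file to hold the decoded PDF content.
  #
  # @return [Tempfile] The temporary file containing the PDF content.
  def tmpfile
    @tmpfile ||= begin
      # Passing [basename, extension] keeps the original extension on the temp path.
      extension = File.extname(input.file_name)
      tmp = Tempfile.new([File.basename(input.file_name, extension), extension])
      tmp.binmode
      tmp.write(Base64.decode64(input.file_base64))
      tmp.flush # push buffered bytes to disk before PDF::Reader or mutool open the path
      tmp
    end
  end
end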
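
The service writes a decoded Base64 payload to a Tempfile and then hands the file's path to other readers (PDF::Reader and `mutool`). The following is a minimal, standalone sketch of that hand-off using only the Ruby standard library. The helper names `decode_to_tempfile` and `mutool_convert_command` are hypothetical and not part of the gist; the sample payload is not a real PDF, and the `mutool` command is only built here, not executed.

```ruby
require 'base64'
require 'tempfile'

# Hypothetical helper (not in the gist): decodes a Base64 payload into a binary
# Tempfile whose path keeps the original file extension.
def decode_to_tempfile(file_name, file_base64)
  extension = File.extname(file_name)
  tmp = Tempfile.new([File.basename(file_name, extension), extension])
  tmp.binmode
  tmp.write(Base64.decode64(file_base64))
  tmp.flush # buffered bytes must reach disk before anything else opens tmp.path
  tmp
end

# Hypothetical helper (not in the gist): builds the mutool invocation as an
# argument list, which Kernel#system can run without going through a shell.
def mutool_convert_command(pdf_path, html_path)
  ['mutool', 'convert', '-o', html_path, pdf_path]
end

# Placeholder bytes for illustration only; a real caller would pass an encoded PDF.
payload = Base64.strict_encode64('%PDF-1.4 sample bytes')
tmp = decode_to_tempfile('invoice.pdf', payload)
begin
  puts File.binread(tmp.path)
  p mutool_convert_command(tmp.path, "#{tmp.path}.html")
ensure
  tmp.close! # close the handle and unlink the file
end
```

Reading the path back with `File.binread` stands in for what an external reader would see. Without the `flush`, a short payload can still be sitting in Ruby's write buffer when the path is opened elsewhere.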