# SolidusImporter::Row now has an `entity_id` field.
# It helps aggregate rows that belong to the same "entity" (Order for now) so we can process them together.
# With this, there's no longer any need to maintain a huge context inside the `ProcessRow` job.
# So how might we do it?

# lib/solidus_importer/process_row.rb
# 1. Associate the Row with an `Identifier` we can use to group related rows together
def prepare_rows(data)
  messages = check_data(data)

  if messages.empty?
    data.each do |row|
      # Since the Import carries metadata about the import type, we always know what kind of
      # entity a row represents (an Order, for example).
      @import.rows << ::SolidusImporter::Row.new(data: row.to_h, entity_id: row['Identifier'])
    end

    { success: true }
  else
    @import.update(state: :failed, messages: messages.join(', '))

    { success: false, messages: messages.join(', ') }
  end
end
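For context, here is a minimal sketch of the migration that could back the new field. The table name `solidus_importer_rows`, the `import_id` column, and the migration class name are assumptions based on the prose above, not taken from the actual extension:

```ruby
# Hypothetical migration, names assumed from the model names used in this gist.
class AddEntityIdToSolidusImporterRows < ActiveRecord::Migration[7.0]
  def change
    add_column :solidus_importer_rows, :entity_id, :string
    # Index assumes rows belong to an import via import_id; speeds up the grouping query below.
    add_index :solidus_importer_rows, [:import_id, :entity_id]
  end
end
```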
# 2. Instead of processing in-line, move to a background job
def process_rows(initial_context)
  groups_of_rows = @import.rows.created_or_failed.order(id: :asc).group_by(&:entity_id)

  groups_of_rows.each_value do |group|
    row_ids = group.map(&:id)

    # Since each Row now carries all the info we need, no `initial_context` is required.
    # This allows full background processing of the related Rows!
    ::SolidusImporter::ProcessRowGroupJob.perform_later(@import.id, row_ids)
  end
end
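To make the grouping step concrete, here is a small, purely illustrative example (the ids and entity ids are made up) of what `group_by(&:entity_id)` yields and why one job per group is enqueued:

```ruby
require 'ostruct'

# Made-up rows: two belong to the same order ("R123"), one to another ("R456").
rows = [
  OpenStruct.new(id: 1, entity_id: 'R123'),
  OpenStruct.new(id: 2, entity_id: 'R123'),
  OpenStruct.new(id: 3, entity_id: 'R456')
]

groups = rows.group_by(&:entity_id)
# => {"R123"=>[<id: 1>, <id: 2>], "R456"=>[<id: 3>]}

# process_rows above enqueues one job per group, passing only that group's row ids:
groups.each_value { |group| p group.map(&:id) }
# prints [1, 2] then [3]
```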
# 3. A new Job
```ruby
class SolidusImporter::ProcessRowGroupJob < ActiveJob::Base
  # Wraps ProcessRow so that it can be applied to a group of related rows.
  def perform(import_id, row_ids)
    import = ::SolidusImporter::Import.find(import_id)
    rows = ::SolidusImporter::Row.where(id: row_ids).to_a
    @importer = import.importer

    context = {}
    rows.each do |row|
      # Assumes ProcessRow#process(context) is the entry point; the context accumulated
      # from the previous rows of the group is passed along to the next one.
      context = ::SolidusImporter::ProcessRow.new(@importer, row).process(context)
    end

    # Key point! `after_import` now happens after a group of related rows is processed
    # (no longer after everything has been processed).
    @importer.after_import(context)
  end
end
```
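Since the job is a plain ActiveJob, it can be run asynchronously or inline. A small usage sketch (the `import` and `row_ids` variables are illustrative):

```ruby
# Asynchronous, as process_rows does above:
SolidusImporter::ProcessRowGroupJob.perform_later(import.id, row_ids)

# Inline, which can be handy in tests or when re-running a single failed group:
SolidusImporter::ProcessRowGroupJob.perform_now(import.id, row_ids)
```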
Does that mean that we need to add a new `Identifier` column to each CSV?

I gave this some thought during lunch, and we can infer the `entity_id` from the row itself, something like `data['Number'] || data['Slug'] || data['Email']`, etc., in a `before_create` hook.
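A minimal sketch of what that hook could look like, using the fallback columns mentioned above; the base class and callback placement are assumptions, not the actual implementation:

```ruby
module SolidusImporter
  class Row < ApplicationRecord
    before_create :infer_entity_id

    private

    # Fall back through the columns that identify the entity for each import type:
    # order number (orders), slug (products), email (customers).
    def infer_entity_id
      self.entity_id ||= data['Number'] || data['Slug'] || data['Email']
    end
  end
end
```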
One question: how are we planning to update the `Import#state` attribute? There are different ways. I was just wondering if you had thought about it.

I think that's another outstanding question. We could simply update the import to failed whenever there's an error. That wouldn't change much from the way it currently is, but then I wonder how to indicate that we need to reprocess, since failures would no longer occur inside the job that processes the Import itself. Which... is what is happening right now anyway during the Order import.
I'm still wondering whether this design will work or whether it will break too much stuff. Perhaps we even want to take the RowGroup concept a step further and expose it in the UI, with an option to Retry. I'm going to work on some minor errors in the extension for now, until I feel comfortable wading deeper into the water.
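One possible answer to the `Import#state` question, sketched only to make the option concrete: let the group job rescue per group and mark the import failed, reusing the same `update(state: :failed, messages: ...)` pattern that `prepare_rows` already uses. The rescue placement and error handling here are assumptions, not a decision:

```ruby
class SolidusImporter::ProcessRowGroupJob < ActiveJob::Base
  def perform(import_id, row_ids)
    # ... process the group as sketched above ...
  rescue StandardError => e
    # Mark the whole import as failed so the UI can surface it; the failed rows keep
    # their own state, so only their groups would need reprocessing.
    SolidusImporter::Import.find(import_id).update(state: :failed, messages: e.message)
    raise
  end
end
```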
This is GREAT! Something we've been talking about for a long time, and I think it will really help with performance and reliability. What would be nice is to make sure we can re-import the same file without creating duplicates.