Skip to content

Instantly share code, notes, and snippets.

@dkubb
Created October 11, 2011 06:48
Show Gist options
  • Save dkubb/1277447 to your computer and use it in GitHub Desktop.
Save dkubb/1277447 to your computer and use it in GitHub Desktop.
Example data migration using Veritas
class MyMigration < Veritas::Migration
def call
# Transformations
# ---------------
# 1) remove all columns except id, name and email
# 2) rename user_id to id
# 3) add a boolean active column, setting the default to false
# 4) add a constraint on name where it must be between 1 and 100 characters
schema[:users] = schema[:users].
project([ :user_id, :name, :email ]).
rename(:user_id => :id).
extend { |r| r.add([ :active, Boolean ], false) }.
restrict { {r| r.name.length.gte(1).and(r.name.length.lte(100)) }
# This generates the following DDL statement:
# ALTER TABLE `users`
# DROP COLUMN `notes`,
# RENAME COLUMN `user_id` TO `id`,
# ADD COLUMN `active` BOOLEAN NOT NULL DEFAULT FALSE,
# CHANGE COLUMN `name` VARCHAR(100) NOT NULL CHECK (CHAR_LENGTH(`name`) >= 1)
end
end
@dkubb
Copy link
Author

dkubb commented Oct 11, 2011

@emmanuel @solnic @blambeau This is kind of what I was planning for the veritas migrations interface. The DDL shown is for PostgreSQL.

I may expand it to support other relational operators, but those will mostly be for data migration; like creating new tables that are the result of joining multiple other relations, or for creating views.

@dkubb
Copy link
Author

dkubb commented Oct 11, 2011

This kind of feels like it will be a bit weird. I know I need to change my frame of reference so that I think of the relational operators as being a transformation of some base relation into another relation which is assigned back to the original name, thus causing the datastore to update the relation stored under that name.

In a truly relational database, this is roughly how you'd do it.. and similar to what Tutorial D does. In this specific case we're doing to walk those operations and translate them to the DDL statements that are going to be executed.

@solnic
Copy link

solnic commented Oct 11, 2011

this is cool but I'd definitely build a nicer DSL on top of that. I presume what you're showing here is a bit low-level, right?

@dkubb
Copy link
Author

dkubb commented Oct 11, 2011

@solnic yeah, probably. We'll see I guess. The main approach I've been using is to start with something that I can ensure is 100% correct, and then iterate until I'm happy. The first thing I have to get over is the unfamiliarity and use it in practice, until then I won't be able to make an informed decision about the API, I'll only be reacting to the uncomfortableness of using something new.

I am curious to see how simplified it can get. I haven't tested this yet, but I think it would take more code to describe all these concepts using the current AR or DM migration APIs.

@postmodern
Copy link

I would simplify the API down into the common migration use-cases:

  • Adding a new column with a default value (or NULL).
  • Adding a new column whose value is computed based off of other columns or tables. Example: adding a new NOT NULL foreign-key.
  • Changing a column's type, length or NULLness.
  • Updating (rewriting) a column's values.
  • Renaming a column / table.
  • Removing a column / table.
  • Adding an index.
  • Changing the type of an index (index / unique).
  • Removing an index.

@dkubb
Copy link
Author

dkubb commented Oct 14, 2011

@postmodern I think that would be great for a higher level API. I'm thinking about a DSL that's block-based, where you specify the table, and then inside a block specify the transformations you want to apply. I will probably be thinking in terms of a few basic classes of operations:

  • Adding
  • Removing
  • Changing
  • Renaming

I would think that tables, columns and indexes should have one method for each of these actions. Even if the actual operations in the DB consist of multiple DDL statements, I would rather work at a higher level and just think in terms of the operation. I really don't want to make something that isn't an abstraction at all, and just a 1:1 mapping to some DDL statements ... if that's the case we'd be better off just using straight SQL.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment