Paul Haddad (@paulghaddad)

@paulghaddad
paulghaddad / gist:9ac8456aaf1b12272650
Last active August 29, 2015 14:14
LevelUp 7: Employs YAGNI (and knows when to violate it)
1. Explain how YAGNI improves the quality of our code.
YAGNI improves our code in many ways, but the most important reasons are:
- There is less overall code. Less code is less likely to break or degrade into spaghetti code, and it is easier to comprehend.
- By limiting ourselves to what is absolutely needed right now, we can focus on building that feature as well as possible. With many competing priorities, it is more difficult to produce excellent code.
- If we implement a feature now that won't be needed until later, we probably don't yet understand its exact requirements. If we violate YAGNI and go ahead and build it now, it won't be as good as it would have been if we had waited to build it in the future, when we understand the requirements better.
- Our code is not polluted by speculative guesses that turned out to be wrong but were never removed.
In summary, YAGNI increases the quality of our code because it reduces our code to the absolute essentials. A smaller codebase will almost always be of higher quality because it is easier to understand, test, and maintain.
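As a contrived Ruby sketch of the idea (class and method names are made up): if the only requirement today is CSV export, YAGNI says to build just that, not a speculative multi-format exporter.
# What YAGNI asks for today: the one export format actually required.
class ReportExporter
  def to_csv(rows)
    rows.map { |row| row.join(",") }.join("\n")
  end
end
# What YAGNI warns against: speculative formats nobody has asked for yet.
# def to_xml(rows) ... end
# def to_pdf(rows) ... end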
@paulghaddad
paulghaddad / gist:9e29699b04323b9e47ed
Created February 5, 2015 23:06
LevelUp 7: Understands the Principle of Least Surprise
1. Explain several conventions in ruby we use to conform to the Principle of Least Surprise.
Ruby adheres to the Principle of Least Surprise in many ways. Here are several instances, in my opinion:
- Well-named methods whose function is obvious from their name, such as exists?, select, and reject.
- The use of ? and ! methods. If the convention is followed, programmers know ahead of time what will happen when these methods are called: a ? method returns a boolean, and a ! method does something destructive or otherwise surprising.
- The IO and File classes have several methods and parameters borrowed from Unix (open, close, r, w, etc.) that are no surprise to users of the Unix command line.
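A quick illustration of the ?/! convention (the array contents are arbitrary):
numbers = [3, 1, 2]
numbers.empty?  # => false -- a ? method returns a boolean
numbers.sort    # => [1, 2, 3] -- returns a new array; numbers is unchanged
numbers.sort!   # the ! signals that numbers itself is mutated in place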
@paulghaddad
paulghaddad / Queries
Last active August 29, 2015 14:14
LevelUp 6 - Yadda Part 3
Query 1:
[local] phaddad@yadda=# EXPLAIN ANALYZE SELECT * FROM top_beers_for_brewery WHERE brewery_name = 'Great Lakes' LIMIT 3;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=23196.29..23196.33 rows=3 width=72) (actual time=253.532..253.532 rows=3 loops=1)
   ->  Sort  (cost=23196.29..23196.64 rows=138 width=68) (actual time=253.530..253.530 rows=3 loops=1)
         Sort Key: (count(ratings.id))
         Sort Method: top-N heapsort  Memory: 25kB
         ->  HashAggregate  (cost=23190.01..23191.39 rows=138 width=68) (actual time=253.485..253.486 rows=10 loops=1)
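For reference, the same plan can be reproduced from a Rails console by sending the raw SQL through ActiveRecord (assuming the top_beers_for_brewery view is reachable from the app's database connection):
result = ActiveRecord::Base.connection.execute(<<-SQL)
  EXPLAIN ANALYZE
  SELECT * FROM top_beers_for_brewery
  WHERE brewery_name = 'Great Lakes'
  LIMIT 3;
SQL
result.each { |row| puts row["QUERY PLAN"] }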
@paulghaddad
paulghaddad / gist:22b50d936399b66989f6
Created February 5, 2015 14:23
LevelUp 6: Appropriately sizes database columns
1. Explain the problem with default column sizes and why we try to size columns properly.
Using default column sizes leads to an unnecessarily large database. If a column is larger than the data it needs to store, the database will be larger than it should be. Suppose a column stores state abbreviations: if the column type is "text" instead of "char(2)", each record uses much more space than necessary, not only in the table but in any index on that column as well. When this occurs in a database with many columns and records, it leads to the following problems:
- The database size is much larger than it needs to be.
- The database is less efficient when performing queries because it must read, store, and index larger values.
In addition, using a variable-length column when a fixed-length one will do adds overhead.
Therefore, using the default column sizes without thinking through the requirements of the data leads to wasted resources and a less efficient database.
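As a sketch, the state-abbreviation column above could be sized explicitly in a Rails migration (the table and column names here are hypothetical; with the PostgreSQL adapter, limit: 2 produces a varchar(2) rather than a char(2), but the sizing idea is the same):
class AddStateToAddresses < ActiveRecord::Migration
  def change
    # limit: 2 caps the column at two characters instead of unbounded text
    add_column :addresses, :state, :string, limit: 2
  end
end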
@paulghaddad
paulghaddad / gist:3a46a39a4d585f24c0dd
Created February 3, 2015 21:32
LevelUp 6: Knows why and how to explain a query and check index use
2. Explain what the cost numbers mean for a given query plan.
The cost consists of two numbers, displayed as follows: cost=0.00..458.00. 0.00 is the "Estimated Startup Cost", the cost expended before the output phase can begin (for a sort node, the time spent sorting). 458.00 is the "Estimated Total Cost", the cost to retrieve all the rows. It assumes the plan node is run to completion, although the node's parent may stop short of retrieving all of its rows (as a LIMIT does).
The costs are measured in arbitrary units, usually based on units of sequential disk page fetches. For a sequential scan, the cost is computed with the following equation:
(disk pages read * seq_page_cost) + (rows scanned * cpu_tuple_cost)
The goal of the query planner is to choose the plan with the lowest estimated cost. By running EXPLAIN ANALYZE on a query, you can compare the estimated costs and row counts against the actual times and row counts for each plan node. By modifying the query (or its indexes) with the goal of reducing those numbers, you can minimize the time the query takes.
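As a worked example, using PostgreSQL's default planner settings (seq_page_cost = 1.0, cpu_tuple_cost = 0.01), a sequential scan over a table stored in 358 disk pages and containing 10,000 rows would be estimated at:
(358 pages * 1.0) + (10,000 rows * 0.01) = 358 + 100 = 458.00
which is how a total cost like the 458.00 above arises. (The page and row counts here are illustrative, not measured from a real table.)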
@paulghaddad
paulghaddad / sample_migration.rb
Created January 20, 2015 15:02
LevelUp 6: Can write a Rails migration
1. Go back to the Rails project and write a migration, or paste a link to a migration you wrote.
Please see the second file on this Gist: sample_migration.rb.
2. Demonstrate how to add a unique index on two columns within a migration.
t.index([:column_1, :column_2], unique: true)
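Equivalently, outside of a create_table or change_table block, the standalone form is (table name hypothetical):
add_index :table_name, [:column_1, :column_2], unique: true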
3. Explain the cases where you can or cannot modify an existing migration.
@paulghaddad
paulghaddad / gist:20a8eb09d10a81887842
Last active August 29, 2015 14:13
LevelUp 6: Can look up and use basic psql commands
1. Demonstrate the Rails command to enter the psql shell.
rails dbconsole
2. Demonstrate the command to get help on other psql commands.
psql --help for general help on psql.
From within psql, you can use \? to get the help on psql internal commands or \help for SQL commands.
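A few other psql meta-commands that come up constantly (all built into psql; the table name below is hypothetical):
\l             lists the databases on the server
\dt            lists the tables in the current database
\d some_table  describes a table's columns and indexes
\q             quits the shell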
@paulghaddad
paulghaddad / gist:baa2233208fdf2d5f526
Last active August 29, 2015 14:13
LevelUp 5: Uses semantic classes and IDs, and knows what deserves an ID
1. Name a few things that are often given IDs, but probably don't deserve them.
Unless an element will be targeted by JavaScript, it shouldn't need an ID. IDs are a highly specific CSS selector, and if you write your CSS well they shouldn't be needed at all. Common elements that are given IDs are the header, footer, and sidebar regions of a page, whether they use the HTML5 elements or a div; IDs aren't needed for these.
In addition to increasing specificity, using IDs for your CSS couples your presentation with your interaction. If you need to change or move an ID because your JavaScript changes, your CSS will also need to change. It is best to use classes for CSS and IDs for JavaScript.
@paulghaddad
paulghaddad / gist:5c36c21e9e3de2a80ca8
Last active August 29, 2015 14:13
LevelUp 6: Calculates mostly everything in the database, knows why
1. Explain why we prefer to calculate everything in the database.
It is common to summarize data by loading all the records from the database and performing the summary in Ruby code, but this leads to poor performance because needless data and objects are created and manipulated. Building ActiveRecord objects is expensive; by writing the SQL directly, you get back plain Ruby arrays, which are much less expensive. In addition, you are in control of the SQL you write, instead of letting ActiveRecord automatically generate it for you.
2. Give an example of a common operation in Rails that might be nonperformant if calculated at the app level, and then show how to calculate it properly using the database.
- Example 1: Let's say we want to get all of the ids for an ActiveRecord model. The non-performant way would be to use #select on the model, which still instantiates an ActiveRecord object for every row:
User.select(:id)
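The preview cuts off here, but the database-side counterpart would presumably be pluck (or its shorthand ids), which runs a single SELECT and hands back a plain Ruby array without instantiating any ActiveRecord objects:
User.pluck(:id)  # SELECT "users"."id" FROM "users"  => [1, 2, 3, ...]
User.ids         # shorthand for User.pluck(:id)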
@paulghaddad
paulghaddad / gist:64f49e82ac27cd840879
Last active August 29, 2015 14:13
LevelUp 6: Understands data quality, and how it has bitten us in the past
1. Define data quality. How do we know that we are capturing and delivering the right data?
Data quality is the degree to which the data we bring into our system is reliable and complete (we aren't missing data).
High-quality data has the following traits, although this list is not exhaustive:
- Free from incomplete or corrupted records. Such records can lead to situations where expected information is missing, or to errors such as attempts to divide by zero.
- Free from duplicate records.