- install psql and pgadmin
- set postgres password to access maintenance db; password is initially undefined, preventing login via postgres user account
sudo -u postgres psql postgres
alter user postgres with password 'postgres';
/etc/postgresql/13/main/postgresql.conf
edit config file, changing user's setting from md5
or peer
to trust
, log in, change password, return setting to previous value
- In /etc/postgresql/[version#]/main/pg_hba.conf change to:
# Database administrative login by Unix domain socket
local all [username] trust
- Restart the postgres service
sudo service postgresql restart
- Log in with same user
sudo -u [username] psql
- Update password
ALTER USER [username] PASSWORD '[password]';
$ createdb ls_burger
$ psql -d ls_burger < ls_burger.sql
$ psql -d ls_burger
\!
\! clear
Meta Command | Description | Example |
---|---|---|
\c $dbname | Connect to database $dbname. | \c blog_development |
\d | Describe available relations | |
\d $name | Describe relation $name | \d users |
? | List of console commands and options | |
\h | List of available SQL syntax Help topics | |
\h $topic | SQL syntax Help on syntax for $topic | \h INSERT |
\e | open query/statement currently being worked on in an editor | |
\q | Quit |
Command | Description |
---|---|
\conninfo | Info. about your connection to the DB |
\q | quit psql REPL |
\l(ist) | list all databases |
CREATE DATABASE [name] | create a new database |
psql -d [db_name] | connect to a database |
\c(onnect) | connect to a different database |
DROP DATABASE [db_name] | a database |
CREATE TABLE table_name() | create a new table in a database |
\dt | list all tables or relations in a database |
\d table_name | lists a particular table's schema |
DROP TABLE [IF EXISTS] table_name | delete a table if it exists (avoids 'does not exist' error messages) |
ALTER TABLE table_name HOW TO CHANGE THE TABLE additional arguments | change a table |
SELECT Clause | Notes |
---|---|
ORDER BY column_name [ASC, DESC] | Orders the data selected by a column name within the associated table. Data can be ordered in descending or ascending order; if neither are specified, the query defaults to ascending order. |
WHERE column_name [>,=, <=, =, <>] value | Filters a query result based on some comparison between a column's value and a specified literal value. There are several comparison operators available for use, from "greater than" to "not equal to". |
WHERE expression1 [AND, OR] expression2 | Filters a query result based whether one expression is true [and,or] another expression is true. |
WHERE string_column LIKE '%substring' | Filters a query result based on whether a substring is contained within string_column's data and has any number of characters before that substring. Those characters are matched using the wildcard %. % doesn't have to come before a substring, you can also put it after one as well, matching the substring first and then any number of characters after that substring. |
Command | Notes |
---|---|
INSERT INTO | table_name (column1_name, column2_name, ...) VALUES (data_for_column1, data_for_column2, ...); creates a new record in table_name with the specified columns and their associated values. |
ALTER TABLE table_name ADD UNIQUE (column_name); | Adds a constraint to table_name that prevent non-unique values from being added to the table for column_name |
ALTER TABLE table_name ADD CHECK (expression); | Adds a constraint to table_name that prevents new rows from being added if they don't pass a check based on a specified expression. |
Column Data Type | Description |
---|---|
serial | This data type is used to create identifier columns for a PostgreSQL database. These identifiers are integers, auto-incrementing, and cannot contain a null value. |
char(N) | This data type specifies that information stored in a column can contain strings of up to N characters in length. If a string less than length N is stored, then the remaining string length is filled with space characters. |
varchar(N) | This data type specifies that information stored in a column can contain strings of up to N characters in length. If a string less than length N is stored, then the remaining string length isn't used. |
boolean | This is a data type that can only contain two values "true" or "false". In PostgreSQL, boolean values are often displayed in a shorthand format, t or f |
integer or INT | An integer is simply a "whole number." An example might be 1 or 50, -50, or 792197 depending on what storage type is used. |
decimal(precision, scale) | The decimal type takes two arguments, one being the total number of digits in the entire number on both sides of the decimal point (the precision), the second is the number of the digits in the fractional part of the number to the right of the decimal point (the scale). |
timestamp | The timestamp type contains both a simple date and time in YYYY-MM-DD HH:MM:SS format. |
date | The date type contains a date but no time. |
Ooperator | Description |
---|---|
IS NULL | |
IS NOT NULL | |
BETWEEN | |
NOT BETWEEN | |
IS DISTINCT FROM | |
IS NOT DISTINCT FROM |
$ psql postgres
psql (10.12 (Ubuntu 10.12-0ubuntu0.18.04.1))
Type "help" for help.
postgres=#
NOTE: Utility functions (like createdb) are executed from the command prompt while SQL statements are executed from within the psql console.
Command | Description |
---|---|
createdb [db_name] | create a new database |
dropdb [db_name] | a database |
ls_burger=# SELECT * FROM orders;
ls_burger=# SELECT side FROM orders;
ls_burger=# SELECT drink, side FROM orders;
ls_burger=# SELECT * FROM orders WHERE id = 1;
ls_burger=# SELECT customer_name FROM orders WHERE side = 'Fries';
CREATE TABLE table_name (
column_1_name column_1_data_type [constraints, ...],
column_2_name column_2_data_type [constraints, ...],
.
.
.
constraints
);
CREATE TABLE users (id serial UNIQUE NOT NULL, username char(25), enabled boolean DEFAULT TRUE);
ALTER TABLE users RENAME TO all_users;
ALTER TABLE all_users RENAME COLUMN username TO full_name;
ALTER TABLE all_users ALTER COLUMN full_name TYPE varchar(25);
ALTER TABLE celebrities ALTER COLUMN date_of_birth TYPE date USING date_of_birth::date;
general form for adding a column constraint:
ALTER TABLE table_name ALTER COLUMN column_name SET constraint clause
ALTER TABLE all_users ALTER COLUMN full_name SET NOT NULL;
ALTER TABLE table_name ADD CONSTRAINT constraint_name constraint clause
ALTER TABLE animals ADD CONSTRAINT unique_binomial_name UNIQUE (binomial_name);
removing both column and table constraints:
ALTER TABLE table_name DROP CONSTRAINT constraint_name
ALTER TABLE all_users ALTER COLUMN id DROP DEFAULT;
ALTER TABLE all_users ADD COLUMN last_login timestamp NOT NULL DEFAULT NOW();
ALTER TABLE all_users DROP COLUMN enabled;
DROP TABLE all_users;
CREATE TYPE spectral_type_enum AS ENUM ('O', 'B', 'A', 'F', 'G', 'K', 'M');
ALTER TABLE stars
ALTER COLUMN spectral_type TYPE spectral_type_enum
USING spectral_type::spectral_type_enum;
-- to delete an enum:
SELECT * FROM pg_enum;
-- note typid and label for those you wish to delete, then:
DELETE FROM pg_enum en WHERE en.enumtypid=124 AND en.enumlabel='unique';
CREATE TABLE flights (
id serial PRIMARY KEY,
flight_code varchar(8) NOT NULL,
airline_id int,
FOREIGN KEY (airline_id) REFERENCES airlines(id)
);
-- alternately, to define it later:
CREATE TABLE flights (
id serial PRIMARY KEY,
flight_code varchar(8) NOT NULL,
airline_id int
);
ALTER TABLE flights
ADD FOREIGN KEY (airline_id)
REFERENCES airlines(id);
-- and to delete it (may need to look up constraint name via \d flights):
ALTER TABLE flights
DROP CONSTRAINT flights_airline_id_fkey;
SELECT [*, (column_name1, column_name2, ...)]
FROM table_name WHERE (condition);
SELECT calls.when, calls.duration, contacts.first_name
FROM calls
INNER JOIN contacts
ON calls.contact_id = contacts.id
WHERE (contacts.first_name || ' ' || contacts.last_name) != 'William Swift';
INSERT INTO table_name (column1_name, column2_name, ...)
VALUES (data_for_column1, data_for_column2, ...);
INSERT INTO users (full_name, enabled)
sql_book-# VALUES ('John Smith', false);
INSERT INTO users (full_name)
sql_book-# VALUES ('Jane Smith'), ('Harry Potter');
ALTER TABLE users ADD CHECK (full_name <> '');
-
An INNER JOIN takes every row from the first table and joins it with every matching row from the second table.
-
A LEFT OUTER JOIN takes every row from the first table and joins it with every row from the second table, inserting empty (null) values for non-matching rows on the right side.
-
A RIGHT OUTER JOIN does the same, except taking every row from the second table and filling null values for non-matching rows in the left table.
-
A CROSS JOIN outputs joins of every possible combination from the left and right tables.
-- casts an integer to a float
COUNT(DISTINCT customers.id)::float
-- converts integer to text to search for numbers beginning with 3:
SELECT * FROM parts WHERE part_number::text LIKE '3%';
SELECT team, sum(score) AS team_score
FROM player_scores
GROUP BY team;
-- Note that in a statement using Group by, any columns referenced must either appear in the Group by clause itself or must be used in an aggregate function; if score was not used by sum or team was not in the Group by clause, this statement would produce an error.
SELECT [*, (column_name1, column_name2, ...)]
FROM table_name WHERE (condition)
ORDER BY column_name;
SELECT name FROM countries
ORDER BY population DESC LIMIT 1 OFFSET 1;
/* Gives country name with second-largest population */
SELECT full_name, enabled FROM users
ORDER BY full_name;
SELECT full_name, enabled FROM users
ORDER BY full_name DESC;
SELECT full_name, enabled FROM users
ORDER BY enabled DESC, id DESC;
SELECT * FROM users WHERE full_name = 'Harry Potter' OR enabled = 'false';
any # of chars, followed by 'Smith':
SELECT * FROM users WHERE full_name LIKE '%Smith';
-- any single char, followed by 'Smith':
SELECT * FROM users WHERE full_name LIKE '_Smith';
NOTE: SIMILAR TO
is another pattern-matching operator, which uses regexes
SELECT id FROM items
WHERE ROW('Painting', 100.00, 250.00) =
ROW(name, initial_price, sales_price);
LIMIT
is the basis, along with Offset, of pagination ('showing results 1-12 of 24')
LIMIT
determines the number of results to show out of the total that come back.
OFFSET
determines where to start, i.e., an offset of 1 starts at the second result, and so on
DISTINCT
filters out duplicates
SELECT * FROM users LIMIT 1;
SELECT topic, author, publish_date, category, replies_count, likes_count, last_activity_date FROM posts LIMIT 12 OFFSET 12;
SELECT DISTINCT binomial_name FROM animals;
Function | Example | Notes |
---|---|---|
length | SELECT length(full_name) FROM users; | This returns the length of every user's name. You could also use length in a WHERE clause to filter data based on name length. |
trim | SELECT trim(leading ' ' from full_name) FROM users; | If any of the data in our full_name column had a space in front of the name, using the trim function like this would remove that leading space. |
Function | Example | Notes |
---|---|---|
date_part | SELECT full_name, date_part('year', last_login) FROM users; | date_part allow us to view a table that only contains a part of a user's timestamp that we specify. The above query allows us to see each user's name along with the year of the last_login date. Sometimes having date/time data down to the second isn't needed |
age | SELECT full_name, age(last_login) FROM users; | The age function, when passed a single timestamp as an argument, calculates the time elapsed between that timestamp and the current time. The above query allows us to see how long it has been since each user last logged in. |
Function | Example | Notes |
---|---|---|
count | SELECT count(id) FROM users; | Returns the number of values in the column passed in as an argument. This type of function can be very useful depending on the context. We could find the number of users who have enabled account, or even how many users have certain last names if we use the above statement with other clauses. |
sum | SELECT sum(id) FROM users; | Not to be confused with count. This sums numeric type values for all of the selected rows and returns the total. |
min | SELECT min(last_login) FROM users; | This returns the lowest value in a column for all of the selected rows. Can be used with various data types such as numeric, date/ time, and string. |
max | SELECT max(last_login) FROM users; | This returns the highest value in a column for all of the selected rows. Can be used with various data types such as numeric, date/ time, and string. |
avg | SELECT avg(id) FROM users; | Returns the average (arithmetic mean) of numeric type values for all of the selected rows. |
SELECT enabled, count(id) FROM users GROUP BY enabled;
enabled | count
---------+-------
f | 1
t | 4
(2 rows)
SELECT COUNT(DISTINCT tickets.customer_id)
/ COUNT(DISTINCT customers.id)::float * 100
AS percent
FROM customers
LEFT OUTER JOIN tickets
ON tickets.customer_id = customers.id;
SELECT COUNT(id)
AS total_films,
COUNT(DISTINCT director_id) AS total_directors
FROM films;
ALTER TABLE films ADD CONSTRAINT director_name
CHECK (length(director) >= 3 AND position(' ' in director) > 0);
SELECT substr(email, strpos(email, '@') + 1) as domain, COUNT(id)
FROM people
GROUP BY domain
ORDER BY count DESC;
SELECT date, ROUND((high + low) / 2.0, 1) as average
FROM temperatures
WHERE date BETWEEN '2016-03-02' AND '2016-03-08';
SELECT title, extract("year" from current_date) - "year" AS age
FROM films
ORDER BY age ASC;
SELECT binomial_name FROM animals
encyclopedia-# ORDER BY length(binomial_name) DESC
encyclopedia-# LIMIT 1;
-- returns binomial_name string of greatest length
SELECT first_name
FROM celebrities
WHERE date_part('year', date_of_birth) = 1958;
-- select first names of celebrities born in 1958
SELECT max(max_age_years) FROM animals;
SELECT avg(max_weight)
FROM animals;
-- returns the average of the max_weights column values
SELECT count(id) FROM countries;
SELECT sum(population) FROM countries;
SELECT conservation_status, count(id)
FROM animals
GROUP BY conservation_status;
SELECT avg(burger_cost) FROM orders WHERE side = 'Fries';
SELECT min(side_cost) FROM orders
WHERE side IS NOT NULL;
SELECT side, count(id)
FROM orders
WHERE side = 'Fries'
OR side = 'Onion Rings'
GROUP BY side;
- HAVING is a means to sort after calling a function
- NOT IN is a filter to apply to results
SELECT customer_id, SUM(amount)
FROM payment
WHERE customer_id NOT IN(184,87,477)
GROUP BY customer_id
HAVING SUM(amount) > 100
SELECT store_id,COUNT(*)
FROM customer
GROUP BY store_id
HAVING COUNT(*) > 300
Use Select
first to confirm you've got the correct items, then change Select
to Update
or Delete
with the same modifiers, ensuring you're only affecting the rows you intend
UPDATE table_name SET [column_name1 = value1, ...]
WHERE (expression);
ALTER TABLE celebrities
RENAME TO singers;
/* change the name of the table */
UPDATE users SET enabled = false;
UPDATE users SET enabled = true
WHERE full_name = 'Harry Potter' OR full_name = 'Jane Smith';
DELETE FROM table_name WHERE (expression);
-- specific row
DELETE FROM users
sql_book-# WHERE full_name='Harry Potter' AND id > 3;
-- all rows
DELETE FROM users;
-- approximating deletion of a column with null
UPDATE table_name SET column_name1 = NULL
WHERE (expression);
-- Delete cascade (multiple tables, see below)
DROP TABLE public.stars CASCADE;
/* must be accompanied by ON DELETE CASCADE at the end of foreign key declaration, i.e.: */
ALTER TABLE plane
ADD COLUMN star_id integer NOT NULL REFERENCES stars (id) ON DELETE CASCADE;
ALTER TABLE users ADD PRIMARY KEY (id);
/*
one to one: User has one address
*/
CREATE TABLE addresses (
user_id int, -- Both a primary and foreign key
street varchar(30) NOT NULL,
city varchar(30) NOT NULL,
state varchar(30) NOT NULL,
PRIMARY KEY (user_id),
FOREIGN KEY (user_id) REFERENCES users (id) ON DELETE CASCADE
);
INSERT INTO addresses (user_id, street, city, state) VALUES
(1, '1 Market Street', 'San Francisco', 'CA'),
(2, '2 Elm Street', 'San Francisco', 'CA'),
(3, '3 Main Street', 'Boston', 'MA');
CREATE TABLE books (
id serial PRIMARY KEY,
title varchar(100) NOT NULL,
author varchar(100) NOT NULL,
published_date timestamp NOT NULL,
isbn char(12) UNIQUE
);
/*
one to many: Book has many reviews
*/
CREATE TABLE reviews (
id serial PRIMARY KEY,
book_id integer NOT NULL,
reviewer_name varchar(255),
content varchar(255),
rating integer,
published_date timestamp DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (book_id) REFERENCES books(id) ON DELETE CASCADE
);
INSERT INTO books (id, title, author, published_date, isbn) VALUES
(1, 'My First SQL Book', 'Mary Parker', '2012-02-22 12:08:17.320053-03', '981483029127'),
(2, 'My Second SQL Book', 'John Mayer', '1972-07-03 09:22:45.050088-07', '857300923713'),
(3, 'My First SQL Book', 'Cary Flint', '2015-10-18 14:05:44.547516-07', '523120967812');
INSERT INTO reviews (id, book_id, reviewer_name, content, rating, published_date) VALUES
(1, 1, 'John Smith', 'My first review', 4, '2017-12-10 05:50:11.127281-02'),
(2, 2, 'John Smith', 'My second review', 5, '2017-10-13 15:05:12.673382-05'),
(3, 2, 'Alice Walker', 'Another review', 1, '2017-10-22 23:47:10.407569-07');
CREATE TABLE checkouts (
id serial,
user_id int NOT NULL,
book_id int NOT NULL,
checkout_date timestamp,
return_date timestamp,
PRIMARY KEY (id),
FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE,
FOREIGN KEY (book_id) REFERENCES books(id) ON DELETE CASCADE
);
INSERT INTO checkouts (id, user_id, book_id, checkout_date, return_date) VALUES
(1, 1, 1, '2017-10-15 14:43:18.095143-07', NULL),
(2, 1, 2, '2017-10-05 16:22:44.593188-07', '2017-10-13 13:0:12.673382-05'),
(3, 2, 2, '2017-10-15 11:11:24.994973-07', '2017-10-22 17:47:10.407569-07'),
(4, 5, 3, '2017-10-15 09:27:07.215217-07', NULL);
CREATE TABLE continents (
id serial PRIMARY KEY,
continent_name varchar(50)
);
ALTER TABLE countries
DROP COLUMN continent;
ALTER TABLE countries
ADD COLUMN continent_id integer;
ALTER TABLE countries
ADD FOREIGN KEY (continent_id)
REFERENCES continents(id);
CREATE TABLE customers (
id serial PRIMARY KEY,
customer_name varchar(100)
);
CREATE TABLE email_addresses (
customer_id integer PRIMARY KEY,
customer_email varchar(50),
FOREIGN KEY (customer_id)
REFERENCES customers (id)
ON DELETE CASCADE);
INSERT INTO customers (customer_name)
VALUES ('James Bergman'),
('Natasha O''Shea'),
('Aaron Muller');
INSERT INTO email_addresses (customer_id, customer_email)
VALUES (1, '[email protected]'),
(2, '[email protected]');
-
inner joins - results in a set of records which match in both tables (Default behavior of JOIN not otherwise specified)
-
full outer joins - selects all elements from either table (OR); values from the left-side table without corresponding values in the right will fill the right values with
null
;WHERE
can be used to get the antithesis of an inner join, i.e., just values unique to left or right but not common to bothSELECT * FROM TableB FULL OUTER JOIN TableA ON TableA.col_match = TableB.col_match WHERE TableA.id IS null OR TableB.id IS null;
-
-
left outer joins - results in the set of records that are in the left table; if there's no match w/right table, results are
null
; i.e., Table A AND (Table A AND Table B) - things in Table A and things both in Table A and Table B -
right joins - same as left join, but switching tables around
-
union - used to combine the result sets of two or more
SELECT
statements -
subquery - a subquery is the embedding of one
SELECT
statement within another, which can often be used as a replacement for aJOIN
-
**self join **- a
JOIN
of two copies of the same table, using sameJOIN
syntax using the same table on left and right sides
/* Join syntax */
SELECT [table_name.column_name1, table_name.column_name2,..] FROM table_name1
join_type JOIN table_name2 ON (join_condition);
/* example */
SELECT colors.color, shapes.shape FROM
colors JOIN shapes ON
colors.id = shapes.color_id;
/* (Inner) Join:
common elements of the tables
*/
SELECT users.*, addresses.*
FROM users
INNER JOIN addresses
ON (users.id = addresses.user_id);
/* LEFT (OUTER) JOIN
joins all rows from LEFT table to the other */
SELECT users.*, addresses.*
FROM users
LEFT JOIN addresses
ON (users.id = addresses.user_id);
*/
```sql
/* RIGHT (OUTER) JOIN
joins all rows from RIGHT table to the other */
SELECT reviews.book_id, reviews.content,
reviews.rating, reviews.published_date,
books.id, books.title, books.author
FROM reviews RIGHT JOIN books ON (reviews.book_id = books.id);
/* FULL JOIN */
/* CROSS JOIN
returns all possible combinations of rows from the tables being joined*/
SELECT * FROM users CROSS JOIN addresses;
/* this will return every use combined with every address */
- a view is a saved query, allowing reuse without having to create it again
- can run queries against a view as though it is an actual table
- it doesn't take up any extra space as it uses the original tables
CREATE VIEW manufacturing.product_details AS
SELECT products.product_id,
products.name AS products_name,
products.manufacturing__cost,
categories.name AS category_name,
categories.market
FROM manufacturing.products JOIN manufacturing.categories
ON products.category_id - categories.category_id;
- allows assigning database ownership/access to a specific username
CREATE SCHEMA manufacturing
AUTHORIZATION postgres;
/* join 3 tables using INNER JOINs */
SELECT users.full_name, books.title, checkouts.checkout_date
FROM users
INNER JOIN checkouts ON (users.id = checkouts.user_id)
INNER JOIN books ON (books.id = checkouts.book_id);
SELECT u.full_name, b.title, c.checkout_date
FROM users AS u
INNER JOIN checkouts AS c ON (u.id = c.user_id)
INNER JOIN books AS b ON (b.id = c.book_id);
SELECT u.full_name, b.title, c.checkout_date
FROM users u
INNER JOIN checkouts c ON (u.id = c.user_id)
INNER JOIN books b ON (b.id = c.book_id);
sql_book=# SELECT count(id) AS "Number of Books Checked Out"
sql_book-# FROM checkouts;
Number of Books Checked Out
-----------------------------
4
(1 row)
SELECT u.full_name FROM users u
WHERE u.id NOT IN (SELECT c.user_id FROM checkouts c);
full_name
-------------
Harry Potter
(1 row)
SELECT title FROM books WHERE author_id =
(SELECT id FROM authors WHERE name = 'William Gibson');
SELECT 1 WHERE EXISTS
(SELECT id FROM books
WHERE isbn = '9780316005388');
/*
ls_burger queries
*/
/* returns a list of all orders and their associated product items */
SELECT o.*, p.*
FROM orders o JOIN order_items i
ON o.id = i.order_id
JOIN products p
ON i.product_id = p.id;
/* return the id of any order that includes Fries */
SELECT o.id
FROM orders o JOIN order_items i
ON o.id = i.order_id
JOIN products p
ON i.product_id = p.id
WHERE p.product_name = 'Fries';
/* build on the previous query to return the name of any customer who ordered fries, returned in a column called 'Customers who like Fries'. Don't repeat the same customer name more than once in the results. */
SELECT DISTINCT c.customer_name AS "Customers who like Fries"
FROM customers c JOIN orders o
ON c.id = o.customer_id
JOIN order_items as oi
ON o.id = oi.order_id
JOIN products AS p
ON oi.product_id = p.id
WHERE p.product_name = 'Fries';
/* returns the total cost of Natasha O'Shea's orders */
SELECT sum(p.product_cost)
FROM customers c JOIN orders o
ON c.id = o.customer_id
JOIN order_items oi
ON o.id = oi.order_id
JOIN products p
ON oi.product_id = p.id
WHERE c.customer_name = 'Natasha O''Shea';
/* returns the name of every product included in an order along side the number of times it has been ordered */
SELECT p.product_name, COUNT(oi.id)
FROM products p JOIN order_items oi
ON p.id = oi.product_id
GROUP BY p.product_name;
Use consistent and descriptive identifiers and names. Make judicious use of white space and indentation to make code easier to read. Store ISO-8601 compliant time and date information (YYYY-MM-DD HH:MM:SS.SSSSS). Try to use only standard SQL functions instead of vendor specific functions for reasons of portability. Keep code succinct and devoid of redundant SQL—such as unnecessary quoting or parentheses or WHERE clauses that can otherwise be derived. Include comments in SQL code where necessary. Use the C style opening /* and closing */ where possible otherwise precede comments with -- and finish them with a new line.
CamelCase—it is difficult to scan quickly. Descriptive prefixes or Hungarian notation such as sp_ or tbl. Plurals—use the more natural collective term where possible instead. For example staff instead of employees or people instead of individuals.
Quoted identifiers—if you must use them then stick to SQL92 double quotes for portability (you may need to configure your SQL server to support this depending on vendor). Object oriented design principles should not be applied to SQL or database structures.
Ensure the name is unique and does not exist as a reserved keyword. Keep the length to a maximum of 30 bytes—in practice this is 30 characters unless you are using multi-byte character set. Names must begin with a letter and may not end with an underscore. Only use letters, numbers and underscores in names. Avoid the use of multiple consecutive underscores—these can be hard to read. Use underscores where you would naturally include a space in the name (first name becomes first_name). Avoid abbreviations and if you have to use them make sure they are commonly understood.
SELECT first_name
FROM staff;
Use a collective name or, less ideally, a plural form. For example (in order of preference) staff and employees. Do not prefix with tbl or any other such descriptive prefix or Hungarian notation. Never give a table the same name as one of its columns and vice versa. Avoid, where possible, concatenating two table names together to create the name of a relationship table. Rather than cars_mechanics prefer services.
Always use the singular name. Where possible avoid simply using id as the primary identifier for the table. Do not add a column with the same name as its table and vice versa. Always use lowercase except where it may make sense not to such as proper nouns.
Should relate in some way to the object or expression they are aliasing. As a rule of thumb the correlation name should be the first letter of each word in the object’s name. If there is already a correlation with the same name then append a number. Always include the AS keyword—makes it easier to read as it is explicit. For computed data (SUM() or AVG()) use the name you would give it were it a column defined in the schema.
Spaces should be used to line up the code so that the root keywords all end on the same character boundary. This forms a river down the middle making it easy for the readers eye to scan over the code and separate the keywords from the implementation detail. Rivers are bad in typography, but helpful here.
(SELECT f.species_name,
AVG(f.height) AS average_height, AVG(f.diameter) AS average_diameter
FROM flora AS f
WHERE f.species_name = 'Banksia'
OR f.species_name = 'Sheoak'
OR f.species_name = 'Wattle'
GROUP BY f.species_name, f.observation_date)
UNION ALL
(SELECT b.species_name,
AVG(b.height) AS average_height, AVG(b.diameter) AS average_diameter
FROM botanic_garden_flora AS b
WHERE b.species_name = 'Banksia'
OR b.species_name = 'Sheoak'
OR b.species_name = 'Wattle'
GROUP BY b.species_name, b.observation_date);
Joins should be indented to the other side of the river and grouped with a new line where necessary.
SELECT r.last_name
FROM riders AS r
INNER JOIN bikes AS b
ON r.bike_vin_num = b.vin_num
AND b.engine_tally > 2
INNER JOIN crew AS c
ON r.crew_chief_last_name = c.last_name
AND c.chief = 'Y';
Subqueries should also be aligned to the right side of the river and then laid out using the same style as any other query. Sometimes it will make sense to have the closing parenthesis on a new line at the same character position as its opening partner—this is especially true where you have nested subqueries.
SELECT r.last_name,
(SELECT MAX(YEAR(championship_date))
FROM champions AS c
WHERE c.last_name = r.last_name
AND c.confirmed = 'Y') AS last_championship_year
FROM riders AS r
WHERE r.last_name IN
(SELECT c.last_name
FROM champions AS c
WHERE YEAR(championship_date) > '2008'
AND c.confirmed = 'Y');
Make use of BETWEEN where possible instead of combining multiple statements with AND. Similarly use IN() instead of multiple OR clauses. Where a value needs to be interpreted before leaving the database use the CASE expression. CASE statements can be nested to form more complex logical structures. Avoid the use of UNION clauses and temporary tables where possible. If the schema can be optimised to remove the reliance on these features then it most likely should be.
SELECT CASE postcode
WHEN 'BN1' THEN 'Brighton'
WHEN 'EH1' THEN 'Edinburgh'
END AS city
FROM office_locations
WHERE country = 'United Kingdom'
AND opening_time BETWEEN 8 AND 9
AND postcode IN ('EH1', 'BN1', 'NN1', 'KW1');
Where possible do not use vendor specific data types—these are not portable and may not be available in older versions of the same vendor’s software. Only use REAL or FLOAT types where it is strictly necessary for floating point mathematics otherwise prefer NUMERIC and DECIMAL at all times. Floating point rounding errors are a nuisance!
Default values must follow the data type declaration and come before any NOT NULL statement.
Deciding the column(s) that will form the keys in the definition should be a carefully considered activity as it will effect performance and data integrity.
- The key should be unique to some degree.
- Consistency in terms of data type for the value across the schema and a lower likelihood of this changing in the future.
- Can the value be validated against a standard format (such as one published by ISO)? Encouraging conformity to point 2.
- Keeping the key as simple as possible whilst not being scared to use compound keys where necessary.
It is a reasoned and considered balancing act to be performed at the definition of a database. Should requirements evolve in the future it is possible to make changes to the definitions to keep them up to date. Defining constraints
Once the keys are decided it is possible to define them in the system using constraints along with field value validation. General
Tables must have at least one key to be complete and useful. Constraints should be given a custom name excepting UNIQUE, PRIMARY KEY and FOREIGN KEY where the database vendor will generally supply sufficiently intelligible names automatically.
- Specify the primary key first right after the CREATE TABLE statement.
- Constraints should be defined directly beneath the column they correspond to. Indent the constraint so that it aligns to the right of the column name.
- If it is a multi-column constraint then consider putting it as close to both column definitions as possible and where this is difficult as a last resort include them at the end of the CREATE TABLE definition.
- If it is a table level constraint that applies to the entire table then it should also appear at the end.
- Use alphabetical order where ON DELETE comes before ON UPDATE.
- If it make senses to do so align each aspect of the query on the same character position. For example all NOT NULL definitions could start at the same character position. This is not hard and fast, but it certainly makes the code much easier to scan and read.
- Use LIKE and SIMILAR TO constraints to ensure the integrity of strings where the format is known.
- Where the ultimate range of a numerical value is known it must be written as a range CHECK() to prevent incorrect values entering the database or the silent truncation of data too large to fit the column definition. In the least it should check that the value is greater than zero in most cases.
- CHECK() constraints should be kept in separate clauses to ease debugging.
CREATE TABLE staff (
PRIMARY KEY (staff_num),
staff_num INT(5) NOT NULL,
first_name VARCHAR(100) NOT NULL,
pens_in_drawer INT(2) NOT NULL,
CONSTRAINT pens_in_drawer_range
CHECK(pens_in_drawer >= 1 AND pens_in_drawer < 100)
);
- Object oriented design principles do not effectively translate to relational database designs—avoid this pitfall.
- Placing the value in one column and the units in another column. The column should make the units self evident to prevent the requirement to combine columns again later in the application. Use CHECK() to ensure valid data is inserted into the column.
- EAV (Entity Attribute Value) tables—use a specialist product intended for handling such schema-less data instead.
- Splitting up data that should be in one table across many because of arbitrary concerns such as time-based archiving or location in a multi-national organisation. Later queries must then work across multiple tables with UNION rather than just simply querying one table.
adding an index to a table:
CREATE INDEX index_name ON table_name (field_name);
$ sudo yum install postgresql96 postgresql96-server
$ sudo service postgresql96 initdb
$ sudo service postgresql start
$ sudo -u postgres createuser -s $LOGNAME
- Rows are collected into a virtual derived table, including any tables used in JOIN clauses.
- Rows are filtered using WHERE conditions.
- Rows are divided into groups specified by GROUP BY clause.
- Groups are filtered using HAVING conditions.
- Each element in the select list is evaluated, including any functions and associated with the name of the column they are from or the name of the last function evaluated unless a different name is specified with AS.
- Result set is sorted as specified in any ORDER BY clause.
- Sorted results are LIMITed or OFFSET.
\password [ username ]
Changes the password of the specified user (by default, the current user). This command prompts for the new password, encrypts it, and sends it to the server as an ALTER ROLE command. This makes sure that the new password does not appear in cleartext in the command history, the server log, or elsewhere.
--password
Force psql to prompt for a password before connecting to a database.
This option is never essential, since psql will automatically prompt for a password if the server demands password authentication. However, psql will waste a connection attempt finding out that the server wants a password. In some cases it is worth typing -W to avoid the extra connection attempt.
Note that this option will remain set for the entire session, and so it affects uses of the meta-command \connect as well as the initial connection attempt.
https://www.postgresql.org/docs/9.2/libpq-pgpass.html
The file .pgpass in a user's home directory or the file referenced by PGPASSFILE can contain passwords to be used if the connection requires a password (and no password has been specified otherwise). On Microsoft Windows the file is named %APPDATA%\postgresql\pgpass.conf (where %APPDATA% refers to the Application Data subdirectory in the user's profile).
This file should contain lines of the following format:
hostname:port:database:username:password
TBD
Command | Description |
---|---|
PG.connect(dbname: "a_database") | Create a new PG::Connection object |
connection.exec("SELECT * FROM table_name") | execute a SQL query and return a PG::Result object |
sql = "INSERT INTO ... VALUES ($1, $2)" | |
connection.exec_params(sql, [var1, var2]) | safely insert variable values into a sql statement (user input) |
result.values | Returns an Array of Arrays containing values for each row in result |
result.fields | Returns the names of columns as an Array of Strings |
result.ntuples | Returns the number of rows in result |
result.each(&block) | Yields a Hash of column names and values to the block for each row in result |
result.each_row(&block) | Yields an Array of values to the block for each row in result |
result[index] | Returns a Hash of values for row at index in result |
result.field_values(column) | Returns an Array of values for column, one for each row in result |
result.column_values(index) | Returns an Array of values for column at index, one for each row in result |
https://deveiate.org/code/pg/PG/Result.html
#! /usr/bin/env ruby
require 'pg'
require 'io/console'
class ExpenseData
def initialize
@db_connection = PG.connect(dbname: "ls185")
setup_schema
end
def setup_schema
# check if expenses table exists: query returns 1 (exists) or 0 (doesn't exist)
table_exists = @db_connection.exec <<~SQL
SELECT COUNT(*)
FROM information_schema.tables
WHERE table_schema = 'public' AND table_name = 'expenses';
SQL
if table_exists[0]["count"] == "0"
@db_connection.exec <<~SQL
CREATE TABLE expenses (
id serial PRIMARY KEY,
amount numeric(6, 2) NOT NULL CHECK(amount >= 0.01),
memo text NOT NULL,
created_on date NOT NULL
);
SQL
end
end
def list_expenses
result = @db_connection.exec("SELECT * FROM expenses ORDER BY created_on;")
sql = "SELECT SUM(amount) FROM expenses;"
total = @db_connection.exec(sql).values[0][0]
display_count(result)
display_expenses(result) if result.ntuples > 0
end
def add_expense(amount, memo)
date = Date.today
sql = "INSERT INTO expenses (amount, memo, created_on) VALUES ($1, $2, $3);"
@db_connection.exec_params(sql, [amount, memo, date])
end
def search_expenses(term)
sql = "SELECT * FROM expenses WHERE memo ILIKE $1"
result = @db_connection.exec_params(sql, ["%#{term}%"])
display_count(result)
display_expenses(result) if result.ntuples > 0
end
def delete_expense(input_id)
sql = "SELECT * FROM expenses WHERE id = $1;"
number_included = @db_connection.exec_params(sql, [input_id])
if number_included.ntuples == 1
sql = "DELETE FROM expenses WHERE id=$1;"
@db_connection.exec_params(sql, [input_id])
else
abort "You must use a valid id number from the list."
end
end
def delete_all_expenses
@db_connection.exec("DELETE FROM expenses;")
puts "All expenses have been deleted."
end
private
def display_count(expenses)
count = expenses.ntuples
if count == 0
puts "There are no expenses."
else
puts "There #{count > 1 ? "are" : "is"} #{count} expense#{"s" if count > 1}."
end
end
def display_expenses(expenses)
expenses.each do |tuple|
columns = [ tuple["id"].rjust(3),
tuple["created_on"].rjust(10),
tuple["amount"].rjust(12),
tuple["memo"] ]
puts columns.join(" | ")
end
total = @db_connection.exec("SELECT SUM(amount) FROM expenses;").values[0][0]
puts "------------------------------------"
puts "Total\t\t\t #{total}"
end
end
class CLI
def initialize
@application = ExpenseData.new
end
def run(args)
command = args.shift
case command
when "list"
@application.list_expenses
when "add"
amount = args[0]
memo = args[1]
abort "You must provide an amount and memo." unless amount && memo
@application.add_expense(amount, memo)
when "search"
term = args[0]
abort "You must provide a search term." unless term
@application.search_expenses(term)
when "delete"
input_id = args[0]
abort "You must provide a valid id number." unless input_id
@application.delete_expense(input_id)
when "clear"
puts "This will remove all expenses. Are you sure? (y/n)"
response = $stdin.getch
@application.delete_all_expenses if response == "y"
else
display_help
end
end
def display_help
puts <<~HELP
An expense recording system
Commands:
add AMOUNT MEMO [DATE] - record a new expense
clear - delete all expenses
list - list all expenses
delete NUMBER - remove expense with id NUMBER
search QUERY - list expenses with a matching memo field
HELP
end
end
CLI.new.run(ARGV)
- Install Pry
gem install pry pry-doc
- Install pg in pry
require "pg"
- Connect to a db
db = PG.connect(dbname: "films")
- Retrieve all available methods in current context
ls
- Show info. on a method
show-method PG.connect
- Store a query result in a variable
result = db.exec "SELECT * FROM films;" # ; is optional. never send more than one statement at a time.
result.values # show rows returned (arrays)
result.values.size # returns length of values array
result.ntuples # same as above
ls result # list all methods available
show-method [method-name] # show info. about a method
result.fields # show column names
- Manipulate Results
result.each do |tuple|
puts "#{tuple["title"]} came out in #{tuple["year"]}"
end
# when calling each, block will be yielded a hash with keys as the names of the columns and values as data in that row
# there is also an each_row which passes an array instead of a hash, and has normal Ruby array methods available to it
result.each_row do |row|
puts "#{row[1]} came out in #{row[2]}"
end
# you can also retrieve a specific row as a hash via
result[2]
# note that all values are strings and must be typecast to their original data types
# you can return an array of the values of a particular column and give them an alias name as follows:
result.field_values("duration")
# this returns the same array of values of a particular column, but instead of giving an alias, you give the index of the desired column:
result.column_values(4)
$ psql -d ls_burger < ls_burger.sql
$ pg_dump -d sql-course -t weather --inserts > dump.sql
-- when is a reserved word, so in order to use it as data, it must be put in quotes to escape its special meaning
INSERT INTO calls ("when", duration, contact_id) VALUES ('2016-01-17 11:52:00', 175, 26);
-- Generating a sequence
CREATE SEQUENCE counter;
-- Retrieve next value from a sequence
SELECT nextval('counter');
-- Remove a sequence
DROP SEQUENCE counter;
-- Create a sequence of even numbers
CREATE SEQUENCE even_counter INCREMENT BY 2 MINVALUE 2;
SELECT nextval('even_counter');
-- names of sequences
CREATE TABLE regions (id serial PRIMARY KEY, name text, area integer);
-- name: regions_id_seq
- embeds a query within a query so that queries can be chained together to get sub-results from one set of results
- can be used sometimes as an easier-to-read/understand alternative to
JOIN
s
SELECT student, grade
FROM test_scores
WHERE grade > (SELECT AVG(grade)
FROM test_scores)