Created
August 5, 2009 12:32
-
-
Save hgmnz/162668 to your computer and use it in GitHub Desktop.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# PostgreSQL Notes | |
For first time Postgres users, the tutorial section on the documentation is a must read: http://www.postgresql.org/docs/8.4/interactive/tutorial.html | |
### Installation | |
Easiest way is to use your package manager (yum, apt-get, etc). On MacOS, macports made it easy. Building the source is straightforward as well. There are plenty of resources online... | |
### Get acquainted with psql | |
To enter the Postgres command prompt | |
psql -U your_username -d the_database | |
If there is a user in your database that matches your unix username, the -U parameter is optional (it will use your username). On development machines it is a good idea to create the matching postgres user. In that case, it is not needed to specify a username and password on database.yml. | |
From within a rails app with a database.yml configured, script/dbconsole works as expected (drops you into psql). | |
Within the psql command prompt you can list databases with \l, describe database objects using \d, connect to remote databases using \c, and many other useful tasks. Don't be shy about using \? for help, and \h <command> for help on a specific command. | |
A useful trick is to use \x before running a query that will return many columns. \x activates expanded mode. Try it. | |
### Rubygem/postgres adapter. | |
There are many adapters in the wild, some native and some pure ruby. The one that is known to work for me both on Mac OS and on Linux is the following. In the dark windows days, easiest way is to use the pure ruby driver, although with some work the native driver can also be installed (involved moving DLLs from the postgres installation directory to the ruby bin directory, if I remember correctly). | |
gem list --local |grep postgres | |
postgres (0.7.9.2008.01.28) | |
### Create db_user role (user) | |
sudo -u postgres createuser --no-superuser --createdb --no-createrole --login --pwprompt --encrypted -h 127.0.0.1 -p 5432 db_user | |
## Restore Database | |
### Create an empty database from template0. | |
Via SQL (ie: in a psql prompt) | |
create database db_name owner=db_user template=template0 encoding='utf8'; | |
Via createdb | |
createdb -U postgres -O db_user -T template0 -E 'utf8' db_name | |
### Drop a DB | |
dropdb -U postgres db_name | |
## Backup Database | |
### Dump Entire Database to Custom Format | |
pg_dump -U postgres -Fc -v -f db_name-20081107.bak db_name | |
Optional: -Z9 (maximum compression ratio) | |
### Dump users and people to Custom Format | |
pg_dump -U postgres -t users -t people -Fc -v -f db_name-user_and_people-20081107.bak db_name | |
### Restore from custom format. | |
pg_restore -U postgres -d db_name -v db_name.bak | |
pg_restore also takes options for restoring specific objects, if the database dump uses the custom format (-Fc) | |
### Size of Database | |
select pg_size_pretty( pg_database_size('db_name') ); | |
### Size of a table and its indexes | |
select pg_size_pretty( pg_total_relation_size('some_table') ); | |
### Troubleshooting | |
On Linux, if the server will not start, we can force the standard error to be redirected to standard output in order to troubleshoot properly: | |
sudo -u postgres pg_ctl -D /var/lib/pgsql/data start | |
This is particularly useful when tweaking the shared_buffers parameter. If postgresql doesn't start, usually shmmaxm must be increased, and the "request size" on the error log will tell you exactly what this kernel param must be set to. | |
### DBLink | |
DBLink allows you to execute queries on a remote database. It is part of the contrib modules. | |
#### DBLink Installation | |
First, make sure you've installed the postgresql-contrib package. This includes dblink.sql which is installed by doing: | |
psql db_name -U postgres < /usr/share/pgsql/contrib/dblink.sql | |
#### Querying via DBLink | |
Unfortunately, all column types of a query must be specified. For example: | |
select users.* | |
from dblink('dbname=db_name host=<remote_ip_addr> password=<remote_password> user=<remote_user>', 'select * from users') | |
as users(id integer, login varchar(255), email varchar(255), | |
crypted_password varchar(40), salt varchar(40), created_at timestamp, | |
updated_at timestamp, remember_token varchar(255), | |
remember_token_expires_at timestamp, person_id integer, | |
user_status_id integer); | |
#### Moving data from a remote server to the local DB | |
This is as simple as an insert ... from ..., only the from portion comes from a remote server using dblink: | |
insert into users select remote_users.* | |
from dblink('dbname=db_name host=<remote_ip_address> password=<remote_password> user=<remote_user>', 'select * from users') | |
as remote_users(id integer, login varchar(255), email varchar(255), | |
crypted_password varchar(40), salt varchar(40), created_at timestamp, | |
updated_at timestamp, remember_token varchar(255), remember_token_expires_at timestamp, | |
person_id integer, user_status_id integer); | |
### Basic Server Setup on Linux | |
This section is mostly useful for setting up a server for production use: | |
Set password | |
Change the PostgreSQL postgres user password, we will not be able to access the server otherwise. As the "postgres" Linux user, execute the psql command: | |
sudo -u postgres psql template1 | |
At the new prompt, change the password | |
ALTER USER postgres WITH PASSWORD 'new_password'; | |
\q | |
### Update postgresql.conf File | |
Set the listen_address option to '*' if this server will be accessed from other hosts. | |
### Performance tuning | |
Increase "System V IPC" shared memory min and max on linux kernel: | |
See the following: | |
http://www.postgresql.org/docs/current/static/kernel-resources.html | |
As a rule of thumb, set the shmmax value to 1/4 of the available memory on the system, in bytes. If this is a dedicated postgres server, set to half the RAM. | |
For example, server has 4GB of RAM: | |
4GB/4 = 1GB = 1024MB = 1048576 KB = 1073741824 B | |
sudo /sbin/sysctl -w kernel.shmmax=1073741824 | |
sudo /sbin/sysctl -w kernel.shmall=2097152 | |
edit the following file, and add the two parameters above (kernel.shm*=*) | |
/etc/sysctl.conf | |
To check what these values are set to: | |
cat /proc/sys/kernel/shmmax | |
cat /proc/sys/kernel/shmall | |
For shared_buffers, start with 1/4 of available RAM. cache_miss statistics can tell if you need more | |
more buffers needed for many concurrent queries and many CPUs. | |
shared_buffers: 1/4 RAM. | |
RAM available for queries: | |
effective_cache_size: set it to 2/3 the RAM on the server (a normal conservative setting). | |
The less max_connections, the better (each connection allocates a chunk of RAM). Think of the need for this server: | |
max_connections = 32 | |
Turns of WAL logging. Risky, but improves performance (don't do this until we've thoroughly tested) - if server crashes (unlikely), there will be no log available to restore latest transactions. | |
fsync=off #trying this on a staging server for now. | |
Write less to the WAL | |
checkpoint_segments = 64 (16 = checkpoint every 512MB. For bulk loading or DW, settings of 64/1GB or over are common) | |
checkpoint_completion_target = 0.90 (aim to finish writing WAL by the time 90% of the next checkpoint is ready. Note that this only works for postgresql version >= 8.3) | |
checkpoint_timeout = defaults to 5min, can be increased too (I've used 25min successfuly) | |
How much statistics to collect, helps query planner to use indexes. | |
default_statistics_target = from 200 - 1000 for large databases | |
This setting is used almost exclusively by the vacuum process, but also for create index and adding foreign keys. | |
you can additionally use log_temp_files to figure out if sorts are using disk instead of fitting in memory. | |
This setting can be set at load time, like work_mem | |
Larger values are common while performing bulk updates or DB restores. | |
maintenance_work_mem = 1/4 of your RAM # for version >= 8.3 | |
maintenance_work_mem = 1/8 of your RAM # for version < 8.3 | |
Default work_mem value is a whimpy 1MB. Tells postgres how much memory is available before creating temp files during sorts | |
for queries that require significant sorting and hash operations. | |
You may set this value on the fly, for example: | |
<code> | |
CONNECT | |
set work_mem = 1024MB | |
select distinct * from ... order by ... | |
DISCONNECT | |
</code> | |
In general, OS swapping == too much work_mem, while caching sorts in pg_temp == not enough work_mem. | |
work_mem = 32MB | |
Which method to use for commits to disk of WAL. This is not conservative, but popular. | |
wal_sync = open_sync | |
How much memory used in shared memory for WAL data: | |
wal_buffers = 32MB # do this for SMP systems (and otherwise) and DW with 4GB on server. OLTP => 16MB | |
this setting is all about how expensive it is to for your disks to seek a random disk page. Lower it to 2.5 to encourage the query optimizer to use index scans. The default of 4.0 is too conservative for modern hardware. | |
random_page_cost = 2.5 | |
This effectively double checks data log writes. Turn off if you have reliable hardware and filesystem | |
full_page_writes = false/off | |
Support "batch commits", for applications with many concurrent writers | |
commit_delay = 250 # I haven't tried this. | |
deadlock_timeout = 2000 | |
Minimize logging on production servers | |
log_error_verbosity = terse #writes lss than default | |
log_min_messages = warning for dev servers, error for prod servers. | |
log_statement = 'none' #don't log queries on prod servers | |
syslog_facility = 'LOCAL2' | |
syslog_ident = 'postgres' | |
log_connections = off | |
log_disconnections = off | |
log_duration = off |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment