- Introduction
The official instructions for installing Nominatim are complete, but brief in places, and several steps must be changed in the Amazon Linux environment (which is roughly CentOS / Red Hat). The steps below are a rough record of what I did to get it working, but I didn't keep perfect track, so you shouldn't rely on them as a shell script. Just follow each step, make sure it worked, and hopefully you'll need to adapt very little (version numbers, for one thing). (I also skip in and out of root, but you can be more careful if you like.)
- Setting up the EC2 instance
There's plenty of information on setting up Amazon EC2 instances elsewhere. I chose an r3.2xlarge machine (61 GB memory, 8 vCPUs, $0.70/hour), based on several-year-old suggestions that you need at least 32 GB of memory for the install, and the assumption that OSM has grown since then. I attached 2 x 750GB EBS volumes (as /dev/sdf and /dev/sdg, later striped together as RAID0), again based on older size estimates plus an allowance for growth. The root volume can be relatively small, but you might want to allow, say, 10-20GB for source data files. Alternatively, you can store them on the large EBS volumes, as I ended up having to do because I left the root volume at the default 8GB.
The total install time was reasonable, but I'm sure this isn't an optimal configuration. I considered other storage options like provisioned IOPS EBS and instance-attached storage, which may have sped up disk-bound tasks, but decided to stick with plain vanilla EBS in the end.
Log in to your running EC2 instance. You may want to invoke screen or an equivalent so that nothing quits if you get disconnected, as several commands will run for days.
- Setting up disk storage
These commands construct a RAID0 striped volume over the two EBS volumes, create the filesystem, and arrange for it to be mounted at boot time as /vol.
sudo su
mdadm --create --verbose /dev/md0 --level=stripe --raid-devices=2 /dev/sdf /dev/sdg
mkfs.ext4 /dev/md0
mkdir /vol
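# The array was just created as /dev/md0 (after a reboot it may be reassembled under another name, e.g. /dev/md127)
mount -t ext4 /dev/md0 /vol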
cp /etc/fstab /etc/fstab.orig
echo "/dev/md0 /vol ext4 defaults,nofail 0 2" >> /etc/fstab
mount -a
Reference: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/raid-config.html
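To confirm what the steps above produced, a small check like the following may help. It is only a sketch: mount_source is a helper name made up for this example, and the commented commands assume the array and mount point used above.

```shell
# Print the device mounted at a given mount point.
# The optional second argument (default /proc/mounts) exists only to make testing easy.
mount_source() {
    awk -v mp="$1" '$2 == mp { print $1 }' "${2:-/proc/mounts}"
}

# On the instance one might then run (commented out so the block is safe to paste):
#   mount_source /vol     # e.g. /dev/md0, or /dev/md127 after a reboot
#   cat /proc/mdstat      # should list a raid0 array built from both EBS devices
```

If mount_source prints nothing for /vol, the volume is not mounted and the fstab entry is worth rechecking before putting any data on it.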
- Install postgres + postgis
The standard repository packages won't work, so you'll need to get other packages and compile some things from source. It's pretty straightforward though.
Edit /etc/yum.repos.d/amzn-main.repo and add the following line to the [amzn-main] block:
exclude=postgresql*
Then install postgres.
cd ~/
wget http://yum.postgresql.org/9.3/redhat/rhel-6-x86_64/pgdg-redhat93-9.3-1.noarch.rpm
rpm -ivh pgdg-redhat93-9.3-1.noarch.rpm
yum install postgresql93 postgresql93-server postgresql93-devel postgresql93-contrib
# I just symlinked the existing data directory to my mounted volume
rm -r /var/lib/pgsql/9.3/data/
ln -s /vol /var/lib/pgsql/9.3/data
# Give the postgres user ownership of the data directory
chown postgres:postgres /vol
chmod 700 /vol
# Initialize the db
service postgresql-9.3 initdb
# Start the service
service postgresql-9.3 start
# exit from su
exit
And install postgis and dependencies.
sudo yum install gcc make gcc-c++ libtool libxml2-devel libpng libtiff
cd ~/
# Download GEOS and install
wget http://download.osgeo.org/geos/geos-3.4.2.tar.bz2
tar xjf geos-3.4.2.tar.bz2
cd geos-3.4.2
./configure
make
sudo make install
# Download Proj.4 and install
cd ~/
wget http://download.osgeo.org/proj/proj-4.8.0.tar.gz
tar xzf proj-4.8.0.tar.gz
cd proj-4.8.0
### Either: if you don't want Python bindings
./configure
### Or: if you do want Python bindings (which you need to import US street number data)
sudo yum install python26-devel.x86_64
./configure --with-python
make
sudo make install
# Download and install GDAL
cd ~/
wget http://download.osgeo.org/gdal/1.10.1/gdal-1.10.1.tar.gz
tar -xvzf gdal-1.10.1.tar.gz
cd gdal-1.10.1
./configure
make
sudo make install
# Download and install JSON-C library
cd ~/
wget https://s3.amazonaws.com/json-c_releases/releases/json-c-0.11.tar.gz
tar -xvzf json-c-0.11.tar.gz
cd json-c-0.11
./configure
make
sudo make install
# Download and install PostGIS
cd ~/
wget http://download.osgeo.org/postgis/source/postgis-2.1.2.tar.gz
tar -xvzf postgis-2.1.2.tar.gz
cd postgis-2.1.2
./configure --with-pgconfig=/usr/pgsql-9.3/bin/pg_config --with-geosconfig=/usr/local/bin/geos-config --with-gdalconfig=/usr/local/bin/gdal-config
make
sudo make install
# update your libraries
sudo su
echo /usr/local/lib >> /etc/ld.so.conf
ldconfig
Reference: http://overtronic.com/2013/12/how-to-install-postgresql-with-postgis-on-amazon-ec2-linux/
- Nominatim dependencies
yum --enablerepo=epel install git make automake gcc gcc-c++ libtool
yum --enablerepo=epel install php-pgsql php php-pear php-pear-DB libpqxx-devel
yum --enablerepo=epel install bzip2-devel libxml2-devel protobuf-c-devel lua-devel
# These were installed from source above: proj-devel geos-devel proj-epsg
- Postgres config for install
Edit /vol/postgresql.conf and make the following changes. I just took the examples from the official install guide and increased them a bit to reflect the larger memory size.
shared_buffers (4GB)
maintenance_work_mem (16GB/10GB)
work_mem (50MB)
effective_cache_size (24GB)
synchronous_commit = off
checkpoint_segments = 100
checkpoint_timeout = 10min
checkpoint_completion_target = 0.9
For the initial import, I also set:
fsync = off
full_page_writes = off
This one seems less certain, but I also had no problems with:
autovacuum = off
These last three changes will be reverted after installation per the official instructions.
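Because these settings are edited once now and again after the import, a small helper can make the changes repeatable. This is a sketch rather than anything from the official instructions: set_pg_conf is a made-up name, the file path is whatever your data directory uses (here /vol/postgresql.conf), and values containing | or & would need escaping.

```shell
# Set "key = value" in a postgresql.conf-style file.
# Rewrites an existing assignment (commented out or not); appends one if none exists.
set_pg_conf() {
    key="$1"; value="$2"; file="$3"
    pattern="^[[:space:]]*#\{0,1\}[[:space:]]*${key}[[:space:]]*="
    if grep -q "$pattern" "$file"; then
        # Write via a temp file, then copy back so ownership and mode are kept.
        sed "s|${pattern}.*|${key} = ${value}|" "$file" > "$file.tmp" &&
            cat "$file.tmp" > "$file" &&
            rm -f "$file.tmp"
    else
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}

# Example, as root (commented out so the block is safe to paste):
#   set_pg_conf fsync off /vol/postgresql.conf
#   set_pg_conf full_page_writes off /vol/postgresql.conf
#   set_pg_conf autovacuum off /vol/postgresql.conf
#   service postgresql-9.3 restart
```

The same helper can switch the three settings back to on once the import has finished.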
- Nominatim main installation
As ec2-user (exit the root shell first if you are still in it):
cd ~/
wget http://www.nominatim.org/release/Nominatim-2.3.0.tar.bz2
tar xvf Nominatim-2.3.0.tar.bz2
cd Nominatim-2.3.0
./configure --with-postgresql=/usr/pgsql-9.3/bin/pg_config
make
Edit settings/local.php and copy in the following:
<?php
// Paths
@define('CONST_Postgresql_Version', '9.3');
@define('CONST_Postgis_Version', '2.1');
@define('CONST_Path_Postgresql_Contrib', '/usr/pgsql-9.3/share/contrib');
Download some optional data (I wanted everything):
wget --output-document=data/wikipedia_article.sql.bin http://www.nominatim.org/data/wikipedia_article.sql.bin
wget --output-document=data/wikipedia_redirect.sql.bin http://www.nominatim.org/data/wikipedia_redirect.sql.bin
wget --output-document=data/gb_postcode_data.sql.gz http://www.nominatim.org/data/gb_postcode_data.sql.gz
cd /
sudo -u postgres createuser -s ec2-user
createuser -SDR www-data
cd ~/
chmod +x ~
chmod +x ~/Nominatim-2.3.0
chmod +x ~/Nominatim-2.3.0/module
sudo su
mkdir /vol/planet
chown ec2-user /vol/planet
chmod a+rx /vol
# exit from su
exit
Then you can do a test run with a small country:
### Test Luxembourg
wget --output-document=/vol/planet/luxembourg-latest.osm.pbf http://download.geofabrik.de/europe/luxembourg-latest.osm.pbf
cd ~/Nominatim-2.3.0
./utils/setup.php --osm-file /vol/planet/luxembourg-latest.osm.pbf --all --osm2pgsql-cache 18000 2>&1 | tee setup.log
# If all is good, then start over
dropdb nominatim
Before proceeding to the full install:
wget --output-document=/vol/planet/planet-latest.osm.pbf http://download.bbbike.org/osm/planet/planet-latest.osm.pbf
wget --output-document=/vol/planet/planet-latest.osm.pbf.md5 http://download.bbbike.org/osm/planet/planet-latest.osm.pbf.md5
# Check the md5 checksum to ensure we downloaded ok
md5sum --check /vol/planet/planet-latest.osm.pbf.md5
Warning: this next command takes days.
time ./utils/setup.php --osm-file /vol/planet/planet-latest.osm.pbf --all --osm2pgsql-cache 18000 2>&1 | tee setup.log
On my EC2 configuration, it took just under 5 days. Rank 28 and Rank 30 indexing took the longest.
real 6997m2.441s
user 409m31.204s
sys 88m19.924s
At the end of this, 800GB was used in total across my RAID0 volume.
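To put the real figure from the timing output into more familiar units, here is a small worked calculation. It is a sketch: format_minutes and instance_cost are names invented for this example, the seconds are dropped, and the cost covers instance hours only at the $0.70/hour rate quoted earlier (EBS storage is not included).

```shell
# Convert a duration in whole minutes to "Nd Nh Nm".
format_minutes() {
    echo "$(( $1 / 1440 ))d $(( ($1 % 1440) / 60 ))h $(( $1 % 60 ))m"
}

# Instance cost in dollars for a run of $1 minutes at an hourly rate of $2.
instance_cost() {
    awk -v m="$1" -v rate="$2" 'BEGIN { printf "%.2f\n", m / 60 * rate }'
}

format_minutes 6997        # prints: 4d 20h 37m
instance_cost 6997 0.70    # prints: 81.63
```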
Then you can install the extras:
# Add special phrases
./utils/specialphrases.php --countries > specialphrases_countries.sql
psql -d nominatim -f specialphrases_countries.sql
./utils/specialphrases.php --wiki-import > specialphrases.sql
psql -d nominatim -f specialphrases.sql
And set up the website:
# Set up website
sudo mkdir -m 755 /var/www/nominatim
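# The nginx user is created by the nginx package (installed below); install nginx first if this chown fails
sudo chown nginx /var/www/nominatim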
./utils/setup.php --create-website /var/www/nominatim
Edit settings/local.php and add/edit:
@define('CONST_Website_BaseURL', '/nominatim/');
I used nginx as the HTTP server:
sudo yum install nginx
sudo yum install php-fpm
psql -d nominatim -c 'ALTER USER "www-data" RENAME TO "nginx"'
As root, edit /etc/php-fpm.d/www.conf to include:
; Comment out the tcp listener and add the unix socket
;listen = 127.0.0.1:9000
listen = /var/run/php5-fpm.sock
; Ensure that the daemon runs as the correct user
listen.owner = nginx
listen.group = nginx
listen.mode = 0666
As root, edit /etc/nginx/nginx.conf
# Edit to include, within the http { ... server { ... } } block that is already defined:
index index.html index.htm index.php;
#root /usr/share/nginx/html;
root /var/www;
#location / {
#}
location ~ [^/]\.php(/|$) {
    fastcgi_split_path_info ^(.+?\.php)(/.*)$;
    if (!-f $document_root$fastcgi_script_name) {
        return 404;
    }
    fastcgi_pass unix:/var/run/php5-fpm.sock;
    fastcgi_index index.php;
    # Note: this next line is super important or you'll get empty responses with no error message!
    include fastcgi.conf;
}
And then hopefully you can run:
sudo /etc/init.d/php-fpm start
sudo /etc/init.d/nginx start
At this point, all going well, you should be able to connect to http://yourhost/nominatim and see the OSM Nominatim web page.
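Beyond loading the page in a browser, a quick search request from the command line can confirm that PHP and the database are answering. The following is a sketch under a few assumptions: nominatim_search_url is a helper name invented here, the host is a placeholder, only spaces in the query are encoded, and the path follows the /nominatim/ base URL configured above.

```shell
# Build a search URL for the Nominatim web API (search.php with q and format parameters).
nominatim_search_url() {
    host="$1"; query="$2"
    encoded=$(printf '%s' "$query" | sed 's/ /+/g')   # only spaces are encoded in this sketch
    printf 'http://%s/nominatim/search.php?q=%s&format=json\n' "$host" "$encoded"
}

# Example, on the server once nginx and php-fpm are running (commented out here):
#   curl -s "$(nominatim_search_url localhost 'Luxembourg')"
```

A JSON array in the response suggests the whole chain is working; an empty body usually points back at the fastcgi.conf include noted above.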
- TIGER files for US street numbers (optional)
This apparently helps Nominatim geocode street numbers more accurately. Unlike the other options above, this takes a substantial amount of time and space to run.
cd ~/Nominatim-2.3.0/data
mkdir -p TIGER2013/EDGES
# raw files are about 10GB, but will eventually expand to quite a bit more in SQL statements
wget -P TIGER2013/EDGES ftp://ftp2.census.gov/geo/tiger/TIGER2013/EDGES/*
# These next two steps took 24 hours together
# (run them from the Nominatim source directory, not from data/)
cd ~/Nominatim-2.3.0
./utils/imports.php --parse-tiger-2011 data/TIGER2013/EDGES/
./utils/setup.php --import-tiger-data
psql -d nominatim -c 'GRANT SELECT ON location_property_tiger TO "nginx"'
At this stage df looked like this, and my generous 1.5TB of EBS was looking like a good choice.
Filesystem      1K-blocks       Used Available Use% Mounted on
/dev/xvda1        8123812    4931312   3092252  62% /
devtmpfs         15701944         72  15701872   1% /dev
tmpfs            15710020          0  15710020   0% /dev/shm
/dev/md127     1548045540 1180848836 288537172  81% /vol
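Since the imports run for days and the volume ended up this full, it may be worth checking free space before starting each long step. A minimal sketch, assuming the POSIX df -Pk output layout; has_free_kb is a name made up for this example and the threshold in the usage line is arbitrary.

```shell
# Succeed only if the filesystem holding path $1 has at least $2 KB available.
has_free_kb() {
    avail=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$avail" -ge "$2" ]
}

# Example (commented out; /vol exists only on the instance set up above):
#   has_free_kb /vol 100000000 || echo "less than roughly 100GB free on /vol"
```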
- Post install configuration
Revert some of the changes in /vol/postgresql.conf:
fsync = on
full_page_writes = on
autovacuum = on
And then run:
# Postgres gets upset otherwise
sudo chmod go-rx /vol
sudo chkconfig --add postgresql-9.3
sudo chkconfig --add php-fpm
sudo chkconfig --add nginx
sudo service php-fpm start
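# restart (rather than start) so the postgresql.conf changes above take effect if the server is already running
sudo service postgresql-9.3 restart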
sudo service nginx start
At this point you should be good to go. I haven't set up automatic updating, so if you proceed with that you'll have to follow the official Nominatim guide and adapt as necessary.
Thanks for the setup instructions :-)
Did you end up benchmarking this instance? How many RPS were you able to churn out?
At locationiq.org, we host 100+ OSM servers and we didn't have too much luck with AWS; moved to bare metal.