Here I describe my experience in setting up the LSST pipeline Gen3 butler on an IUCAA server.
mkdir -p lsst_stack
cd lsst_stack
curl -OL https://raw.githubusercontent.com/lsst/lsst/master/scripts/newinstall.sh
bash newinstall.sh -ct
source loadLSST.bash
eups distrib install -t v23_0_0_rc2 lsst_distrib
curl -sSL https://raw.githubusercontent.com/lsst/shebangtron/master/shebangtron | python
setup lsst_distrib
The LSST gen3 pipeline requires a database to be set up. This can be done with sqlite3, although that may not be suited for heavy processing; creating a sqlite3 database is as simple as creating an empty file with that name. Here, however, I describe the setup of a postgresql server. This does not require root access.
- Download the latest postgresql server from: https://www.postgresql.org/
- Untar the file and change the directory to the untarred one
./configure --prefix=$HOME
make -j 20
make install
cd contrib
make install
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/lib
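If LD_LIBRARY_PATH is empty or unset when the library directory is appended, the result starts with an empty entry, which the dynamic linker interprets as the current directory. A minimal sketch that avoids this, assuming the libraries ended up in $HOME/lib as with the --prefix used above (append_path is a hypothetical helper name):

```shell
# append_path CURRENT DIR
# Print CURRENT with DIR appended, adding the ":" separator only when
# CURRENT is non-empty.
append_path() {
    printf '%s\n' "${1:+$1:}$2"
}

# Assumption: PostgreSQL was configured with --prefix=$HOME.
LD_LIBRARY_PATH=$(append_path "${LD_LIBRARY_PATH:-}" "$HOME/lib")
export LD_LIBRARY_PATH
```

Adding the same lines to ~/.bashrc makes the setting persist across logins.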
Now initialize the database. Change ~/gen3_db to the location where you want your database to reside.
$HOME/bin/initdb -D ~/gen3_db
Open the file ~/gen3_db/postgresql.conf, set listen_addresses to the address(es) the server should listen on, and raise max_connections:
listen_addresses = '*'
max_connections = 1200
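The two settings can also be applied from the shell instead of an editor. The following is only a sketch: set_conf is a hypothetical helper (not a PostgreSQL tool), and PGDATA_DIR is an assumed variable pointing at the directory given to initdb. It relies on the fact that PostgreSQL uses the last occurrence of a parameter when it appears more than once in the file.

```shell
# set_conf KEY VALUE FILE  (hypothetical helper)
# Drop any active "KEY = ..." line from FILE and append the new setting.
set_conf() {
    sc_key=$1; sc_value=$2; sc_file=$3
    sc_tmp=$(mktemp) || return 1
    # Keep every line except an active setting for KEY; commented-out
    # defaults are left alone.
    grep -v -E "^[[:space:]]*${sc_key}[[:space:]]*=" "$sc_file" > "$sc_tmp" || true
    # Append the new value at the end of the file.
    printf '%s = %s\n' "$sc_key" "$sc_value" >> "$sc_tmp"
    # Write back in place so the file keeps its permissions.
    cat "$sc_tmp" > "$sc_file"
    rm -f "$sc_tmp"
}

# Assumption: the database directory is ~/gen3_db unless PGDATA_DIR says otherwise.
conf="${PGDATA_DIR:-$HOME/gen3_db}/postgresql.conf"
if [ -f "$conf" ]; then
    set_conf listen_addresses "'*'" "$conf"
    set_conf max_connections 1200 "$conf"
fi
```

Running the helper twice leaves a single line per key, so it is safe to re-run.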
In pg_hba.conf, add the following line (this assumes your InfiniBand network is on 192.168.1.XXX addresses; otherwise change it appropriately):
host all all 192.168.1.0/24 md5
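If you script this step, it helps to append the rule only when it is not already present, so that re-running the setup does not duplicate it. A small sketch, where add_line_once is a hypothetical helper and PGDATA_DIR is an assumed variable for the database directory:

```shell
# add_line_once LINE FILE  (hypothetical helper)
# Append LINE to FILE unless an identical line is already there.
add_line_once() {
    grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# Assumption: the database directory is ~/gen3_db unless PGDATA_DIR says otherwise.
hba="${PGDATA_DIR:-$HOME/gen3_db}/pg_hba.conf"
rule="host all all 192.168.1.0/24 md5"
if [ -f "$hba" ]; then
    add_line_once "$rule" "$hba"
fi
```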
Now start the postgresql server:
$HOME/bin/pg_ctl -D ~/gen3_db -l logfile start
First create the database, then open it in psql, add the btree_gist extension, and set a password:
createdb gen3
~/bin/psql gen3
gen3=# CREATE EXTENSION btree_gist;
gen3=# \password
Now that this has been set up, you can change every trust authentication entry in ~/gen3_db/pg_hba.conf to md5 and reload the server ($HOME/bin/pg_ctl -D ~/gen3_db reload) for the change to take effect. This way all access will be password based. You can supply the password through the environment variable $PGPASSWORD or write it in clear text in a $HOME/.pgpass file.
export PGPASSWORD=YourPassword
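For the .pgpass route, each line has the form hostname:port:database:username:password, and the file must not be readable by group or others (mode 0600), otherwise it is ignored. A minimal sketch; write_pgpass is a hypothetical helper and all values in the example call are placeholders:

```shell
# write_pgpass FILE HOST PORT DB USER PASSWORD  (hypothetical helper)
# Append one entry to FILE and make sure only the owner can read it.
# Note: a ":" or "\" inside any field must be escaped with a backslash.
write_pgpass() {
    (
        umask 077   # a newly created file is private from the start
        printf '%s:%s:%s:%s:%s\n' "$2" "$3" "$4" "$5" "$6" >> "$1"
    )
    chmod 600 "$1"
}

# Example with placeholder values (5432 is the default PostgreSQL port):
# write_pgpass "$HOME/.pgpass" server_ip_address 5432 gen3 username YourPassword
```

Unlike $PGPASSWORD, this keeps the password out of the environment of every process you start.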
Now let us create a space for the gen3 repository.
mkdir $HOME/gen3_repo
Set up the butler and register the instrument (in our case Subaru HSC).
echo "registry:" > reg.yaml
echo " db: postgresql://username@server_ip_address/gen3" >> reg.yaml
DIR=$HOME/gen3_repo
butler create $DIR --seed-config reg.yaml --override
butler register-instrument $DIR lsst.obs.subaru.HyperSuprimeCam
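The seed configuration written by the echo commands can also be generated with a here-document, which keeps the YAML indentation visible. This is only a sketch: write_registry_config is a hypothetical helper, and the user, host and database names are the same placeholders as above.

```shell
# write_registry_config FILE USER HOST DBNAME  (hypothetical helper)
# Write a registry seed config pointing the butler at a PostgreSQL database.
# No password appears in the URL; it is taken from PGPASSWORD or ~/.pgpass.
write_registry_config() {
    cat > "$1" <<EOF
registry:
  db: postgresql://$2@$3/$4
EOF
}

# Placeholder values; replace with your own user name and server address.
write_registry_config reg.yaml username server_ip_address gen3
cat reg.yaml
```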
If you have access to the gen3-shared-repo-admin tools, skip this and go down one section.
Let us now ingest some raw data from the directory $HOME/Subaru_rawdata. Depending upon the size of your data, this can take a really long time.
butler --progress ingest-raws $DIR $HOME/Subaru_rawdata -t direct > rawingest.log 2>&1 &
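For long-running jobs like this one it is convenient to wrap the background-and-log pattern in a small function. The shell processes redirections from left to right, so the log file redirection has to come before 2>&1 for both stdout and stderr to land in the file. A sketch, with run_logged and RUN_LOGGED_PID as hypothetical names:

```shell
# run_logged LOGFILE COMMAND [ARGS...]  (hypothetical helper)
# Start COMMAND in the background with stdout and stderr both written to
# LOGFILE, and remember its process id in RUN_LOGGED_PID.
run_logged() {
    rl_log=$1
    shift
    "$@" > "$rl_log" 2>&1 &
    RUN_LOGGED_PID=$!
}

# Example (commented out because it needs the LSST stack to be set up):
# run_logged rawingest.log butler --progress ingest-raws "$DIR" "$HOME/Subaru_rawdata" -t direct
# tail -f rawingest.log                          # follow the progress
# wait "$RUN_LOGGED_PID"; echo "exit status: $?" # check how it ended
```

Call the function directly rather than inside $(...), otherwise the job belongs to a subshell and wait cannot see it.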
Define each exposure as a single visit using the next command:
butler define-visits $DIR HSC
Since the raws have already been ingested, we do not have to ingest them again. So create a file called skipraws.py and add the following to it:
# skipraws.py file
import lsst.obs.base.gen2to3.convertRepo
assert type(config)==lsst.obs.base.gen2to3.convertRepo.ConvertRepoConfig, 'config is of type %s.%s instead of lsst.obs.base.gen2to3.convertRepo.ConvertRepoConfig' % (type(config).__module__, type(config).__name__)
config.datasetIgnorePatterns=["raw"]
config.doMakeUmbrellaCollection=False
config.doExpandDataIds=False
If you have a gen2 root directory that holds your previous processing, say from HSCpipe, you can use it here and ingest the skymaps, reference catalogs and calibrations with the next command:
GEN2ROOT=$HOME/gen2root
butler --progress convert $DIR --gen2root $GEN2ROOT -C skipraws.py -t direct > gen2convert.log 2>&1 &
If you do not have one, you can download the calibration data from https://www.subarutelescope.org/Observing/Instruments/HSC/calib_data.html. You then need to create a gen2 repo and ingest these calibrations into it first, following the instructions at https://hsc.mtk.nao.ac.jp/pipedoc/pipedoc_8_e/ . Specifically, you need to initialize the repository and use ingestRaws to ingest a couple of exposures. Then ingest all the calibrations (CALIB, SKY, FLAT, BIAS, DARK) following the procedure written there. Once you have all this, you can get your skymaps, refcats and calibrations from this gen2 repository using the command above.
After this you can also inherit any of your reruns one after the other. For more complicated rerun ingestion you should take a look at the script convert.py available here https://github.com/lsst/obs_base/blob/master/python/lsst/obs/base/script/convert.py and play around with it.
Next we can use some butler-admin tools to ingest data into this repository. The repository containing these tools, however, is invitation only at the moment.
mkdir $HOME/github
cd $HOME/github
git clone git@github.com:lsst-dm/gen3_shared_repo_admin.git