@mohammadsalem
Last active June 18, 2024 11:26
List of problems, solutions and tricks for DSpace
DSpace docker solr data volume for the first time
  • Build and create the dspace container without the volume
  • Copy the solr data out of the container: docker cp dspace:/dspace/solr solrData
  • Re-create the dspace container with the volume ./solrData:/dspace/solr
DSpace docker solr data volume (when solr data already exists)
  • Build and create the dspace container with the volume
  • Enter the dspace container as root: docker exec -it dspace bash
  • Change the owner of the solr data: chown -R dspace:dspace /dspace/solr
  • Exit and restart the container: docker restart dspace

Note: run the commands whoami and id CURRENT_USER inside the container to get the current user and its user and group IDs, then change the ownership of the directory to these numeric IDs so that it does not clash with names from the host!
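The lookup described in the note can be sketched as follows, assuming it is run inside the container as the user that should own the solr data. The variable names are illustrative, and the chown command is only printed here, not executed.

```shell
#!/bin/bash
# Look up the numeric user and group IDs of the current user.
uid=$(id -u)
gid=$(id -g)
owner="$uid:$gid"

# Print the chown command with numeric IDs instead of names, so a
# same-named user on the host cannot be confused with this one.
echo "chown -R $owner /dspace/solr"
```

Review the printed command, then run it as root inside the container.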

DSpace solr data merge
  • Copy the old data to the solrData folder: cp -r ../old_temp solrData/
  • Enter the dspace container as root: docker exec -it dspace bash
  • Change the owner of the solr data: chown -R dspace:dspace /dspace/solr
  • Exit and restart the container: docker restart dspace
  • Enter the dspace container as the dspace user: docker exec -it -u dspace dspace /bin/bash
  • Check that all the solr segments are OK and remove corrupted ones (warning: this will remove corrupted segments!)
$ java -cp /usr/local/tomcat/webapps/solr/WEB-INF/lib/lucene-core-4.10.2.jar -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex /dspace/solr/old_temp/statistics/data/index -fix
$ java -cp /usr/local/tomcat/webapps/solr/WEB-INF/lib/lucene-core-4.10.2.jar -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex /dspace/solr/old_temp/oai/data/index -fix
$ java -cp /usr/local/tomcat/webapps/solr/WEB-INF/lib/lucene-core-4.10.2.jar -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex /dspace/solr/old_temp/search/data/index -fix
$ java -cp /usr/local/tomcat/webapps/solr/WEB-INF/lib/lucene-core-4.10.2.jar -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex /dspace/solr/old_temp/authority/data/index -fix
  • Copy the current solr data to a temp directory
$ mkdir /dspace/solr/new_temp && cp -r /dspace/solr/statistics /dspace/solr/oai /dspace/solr/search /dspace/solr/authority /dspace/solr/new_temp/
  • Remove the data from the original path and merge solr data
$ rm -rf /dspace/solr/statistics/data/index/* && java -cp /usr/local/tomcat/webapps/solr/WEB-INF/lib/lucene-core-4.10.2.jar:/usr/local/tomcat/webapps/solr/WEB-INF/lib/lucene-misc-4.10.2.jar org/apache/lucene/misc/IndexMergeTool /dspace/solr/statistics/data/index /dspace/solr/new_temp/statistics/data/index /dspace/solr/old_temp/statistics/data/index
$ rm -rf /dspace/solr/oai/data/index/* && java -cp /usr/local/tomcat/webapps/solr/WEB-INF/lib/lucene-core-4.10.2.jar:/usr/local/tomcat/webapps/solr/WEB-INF/lib/lucene-misc-4.10.2.jar org/apache/lucene/misc/IndexMergeTool /dspace/solr/oai/data/index /dspace/solr/new_temp/oai/data/index /dspace/solr/old_temp/oai/data/index
$ rm -rf /dspace/solr/search/data/index/* && java -cp /usr/local/tomcat/webapps/solr/WEB-INF/lib/lucene-core-4.10.2.jar:/usr/local/tomcat/webapps/solr/WEB-INF/lib/lucene-misc-4.10.2.jar org/apache/lucene/misc/IndexMergeTool /dspace/solr/search/data/index /dspace/solr/new_temp/search/data/index /dspace/solr/old_temp/search/data/index
$ rm -rf /dspace/solr/authority/data/index/* && java -cp /usr/local/tomcat/webapps/solr/WEB-INF/lib/lucene-core-4.10.2.jar:/usr/local/tomcat/webapps/solr/WEB-INF/lib/lucene-misc-4.10.2.jar org/apache/lucene/misc/IndexMergeTool /dspace/solr/authority/data/index /dspace/solr/new_temp/authority/data/index /dspace/solr/old_temp/authority/data/index
  • Run the following tasks to clean and index solr data
$ /dspace/bin/dspace oai import -o
$ /dspace/bin/dspace index-discovery
$ /dspace/bin/dspace index-discovery -o
$ /dspace/bin/dspace index-authority
$ /dspace/bin/dspace stats-util -i
$ /dspace/bin/dspace stats-util -o
$ /dspace/bin/dspace sub-daily
$ /dspace/bin/dspace filter-media
  • Exit and restart the container: docker restart dspace
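The check and merge commands in this procedure differ only in the core name, so they can be generated in a loop. The following is a minimal sketch that only prints the commands (a dry run). The SOLR_LIB and CORES variables and both function names are illustrative and not part of DSpace; the paths and jar versions are the ones used in the steps above.

```shell
#!/bin/bash
# Paths taken from the commands above; adjust to your installation.
SOLR_LIB=/usr/local/tomcat/webapps/solr/WEB-INF/lib
CORES="statistics oai search authority"

# Print the CheckIndex command for one core of the old data ($1 = core name).
check_cmd() {
    echo "java -cp $SOLR_LIB/lucene-core-4.10.2.jar -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex /dspace/solr/old_temp/$1/data/index -fix"
}

# Print the IndexMergeTool command for one core ($1 = core name).
merge_cmd() {
    echo "java -cp $SOLR_LIB/lucene-core-4.10.2.jar:$SOLR_LIB/lucene-misc-4.10.2.jar org/apache/lucene/misc/IndexMergeTool /dspace/solr/$1/data/index /dspace/solr/new_temp/$1/data/index /dspace/solr/old_temp/$1/data/index"
}

# First all checks, then all merges, matching the order of the steps.
for core in $CORES; do check_cmd "$core"; done
for core in $CORES; do merge_cmd "$core"; done
```

Printing instead of executing keeps the sketch harmless. The copy to new_temp and the emptying of each target index still have to happen between the check and the merge, as in the steps above.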
Script to import DSpace database IMPORT_DSPACE_DB.sh
#!/bin/bash

echo "$(tput setaf 6)ARE YOU SURE? <yes/no>"$(tput sgr 0)
read -r sure
# Quote the variable so that an empty answer does not break the test
if [ "$sure" != 'yes' ]; then
   exit
fi

echo "$(tput setaf 6)AGAIN ARE YOU SURE? <yes/no>"$(tput sgr 0)
read -r sureagain
if [ "$sureagain" != 'yes' ]; then
   exit
fi

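# Stop if the backup directory is missing
cd ./db_backup || exit 1

echo "$(tput setaf 6)Enter database number <<default: last downloaded database>>"$(tput sgr 0)
read -r num

# Default to the number in the name of the most recently modified file
if [ -z "$num" ]; then
    num=$(ls -t | awk '{printf("%s",$0);exit}' | tr -d '[:alpha:]\-\.')
fi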

if [ -f "dspace-$num.dump" ]; then
   echo "$(tput setaf 5)File dspace-$num.dump exists."$(tput sgr 0)
elif [ -f "dspace-$num.dump.tar.gz" ]; then
   echo "$(tput setaf 5)Extracting the file..."$(tput sgr 0)
   tar -xvzf "dspace-$num.dump.tar.gz"
else
   echo "$(tput setaf 5)File not found"$(tput sgr 0)
   exit 1
fi

echo "$(tput setaf 6)Stopping dspace container..."$(tput sgr 0)
docker stop dspace

echo "$(tput setaf 6)Dropping database..."$(tput sgr 0)
docker exec dspace_db dropdb -U postgres dspace

echo "$(tput setaf 6)Creating database..."$(tput sgr 0)
docker exec dspace_db createdb -U postgres -O dspace --encoding=UNICODE dspace

# Temporarily grant the createuser attribute to the dspace user for the restore
echo "$(tput setaf 6)Granting createuser to dspace user..."$(tput sgr 0)
docker exec dspace_db psql -U postgres dspace -c 'alter user dspace createuser;'

echo "$(tput setaf 6)Copying database..."$(tput sgr 0)
docker cp dspace-$num.dump dspace_db:/

echo "$(tput setaf 6)Importing database..."$(tput sgr 0)
docker exec dspace_db pg_restore -U postgres -d dspace /dspace-$num.dump

# Revoke the attribute granted before the restore
echo "$(tput setaf 6)Revoking createuser from dspace user..."$(tput sgr 0)
docker exec dspace_db psql -U postgres dspace -c 'alter user dspace nocreateuser;'

echo "$(tput setaf 6)Vacuuming database..."$(tput sgr 0)
docker exec dspace_db vacuumdb -U postgres dspace

echo "$(tput setaf 6)Updating sequences..."$(tput sgr 0)
docker cp dspace:/dspace/etc/postgres/update-sequences.sql .
docker cp update-sequences.sql dspace_db:/
docker exec dspace_db psql -U dspace -f /update-sequences.sql dspace

echo "$(tput setaf 6)Cleaning up..."$(tput sgr 0)
# Both files were copied to / in the container; no TTY is needed to remove them
docker exec dspace_db rm "/dspace-$num.dump"
docker exec dspace_db rm /update-sequences.sql
rm "dspace-$num.dump"

echo "$(tput setaf 6)Starting dspace container..."$(tput sgr 0)
docker start dspace

echo "$(tput setaf 6)Finish"$(tput sgr 0)
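The default database number in the script comes from the name of the most recently modified file in db_backup. The same idea can be sketched with shell parameter expansion instead of tr. The function name latest_dump_number is illustrative, and the sketch assumes that every file in the directory is named dspace-<number>.dump or dspace-<number>.dump.tar.gz.

```shell
#!/bin/bash
# Print the number of the most recently modified backup in directory $1.
latest_dump_number() {
    newest=$(ls -t "$1" | head -n 1)   # newest file name comes first
    newest=${newest#dspace-}           # strip the "dspace-" prefix
    echo "${newest%%.*}"               # strip everything from the first dot
}

# Demonstration on a throwaway directory with two fake backups.
demo_dir=$(mktemp -d)
touch -t 202001010000 "$demo_dir/dspace-1577836800.dump.tar.gz"
touch -t 202101010000 "$demo_dir/dspace-1609459200.dump"
latest_dump_number "$demo_dir"   # prints 1609459200
rm -r "$demo_dir"
```

Unlike the tr pipeline, this keeps any digits that follow the first dot out of the result, but it relies on the naming assumption stated above.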
Script to create DSpace database backup IMPORT_DSPACE_DB.sh
#!/bin/bash
db_name="dspace-$(date +%s).dump"

echo "$(tput setaf 6)Creating full backup $db_name.tar.gz ..."$(tput sgr 0)

docker exec -i dspace_db pg_dump -U dspace -Fc -f "/$db_name" dspace
# Stop if the backup directory is missing
cd ./db_backup/ || exit 1
docker cp "dspace_db:/$db_name" .
tar -czvf "$db_name.tar.gz" "$db_name"

echo "$(tput setaf 6)Cleaning up..."$(tput sgr 0)
docker exec dspace_db rm "/$db_name"
rm "$db_name"
echo "$(tput setaf 6)Backup finished."$(tput sgr 0)
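Since the script deletes the uncompressed dump after archiving it, it may be worth confirming first that the archive is readable and really contains the dump. A minimal sketch using only standard tar and grep options; the function name archive_contains is illustrative and the file in the demonstration is a stand-in for a real dump.

```shell
#!/bin/bash
# Succeed if archive $1 is a readable gzip tarball that contains file $2.
archive_contains() {
    tar -tzf "$1" 2>/dev/null | grep -Fxq "$2"
}

# Demonstration with a throwaway file standing in for a real dump.
work=$(mktemp -d)
cd "$work" || exit 1
echo "fake dump" > dspace-123.dump
tar -czf dspace-123.dump.tar.gz dspace-123.dump
if archive_contains dspace-123.dump.tar.gz dspace-123.dump; then
    echo "archive OK"
else
    echo "archive is missing the dump"
fi
cd / && rm -r "$work"
```

In the backup script, such a check would sit between the tar command and the rm of the uncompressed dump.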
Hide Item Metadata Fields

Fields named here are hidden in the following places UNLESS the logged-in user is an Administrator:

  1. XMLUI metadata XML view, and Item splash pages (long and short views).
  2. JSPUI Item splash pages
  • Add a property to /dspace/config/dspace.cfg in the form metadata.hide.SCHEMA.ELEMENT.QUALIFIER = true, e.g. metadata.hide.mel.partner.id = true
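The property line can be built and appended with a small helper. This is only a sketch: the function names are illustrative, the demonstration writes to a temporary file instead of the real dspace.cfg, and only the three-part SCHEMA.ELEMENT.QUALIFIER form shown above is covered.

```shell
#!/bin/bash
# Build the config line that hides one field.
# $1 = schema, $2 = element, $3 = qualifier
hide_property() {
    echo "metadata.hide.$1.$2.$3 = true"
}

# Append the line to config file $1 unless the exact line is already there.
# $2 = schema, $3 = element, $4 = qualifier
add_hide_property() {
    line=$(hide_property "$2" "$3" "$4")
    grep -Fxq "$line" "$1" || echo "$line" >> "$1"
}

# Demonstration on a throwaway file; running it twice adds the line once.
cfg=$(mktemp)
add_hide_property "$cfg" mel partner id
add_hide_property "$cfg" mel partner id
cat "$cfg"   # prints: metadata.hide.mel.partner.id = true
rm "$cfg"
```

DSpace typically needs a restart before changes to dspace.cfg take effect.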
Solr view statistics
  • See https://alanorth.github.io/cgspace-notes/2019-04/ and search for "Holy shit".
  • Ensure that the file set as dbfile = /dspace/config/GeoLite2-City.mmdb in /dspace/config/modules/usage-statistics.cfg exists, otherwise hits won't be recorded.
  • Ensure that hits reach the DSpace container with the client IP address. In my case, changing proxy_set_header XForwardedFor $proxy_add_x_forwarded_for; to proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; in nginx did it.
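Both checks from this list can be scripted. The sketch below uses illustrative function names and a throwaway file in place of a real nginx configuration; the GeoLite path is the one given above and will be reported as missing on any machine where it does not exist.

```shell
#!/bin/bash
# Report whether the GeoLite database file $1 exists.
check_geo_db() {
    if [ -f "$1" ]; then echo "found: $1"; else echo "missing: $1"; fi
}

# Report whether nginx config file $1 sets the X-Forwarded-For header.
check_forwarded_header() {
    if grep -Eq 'proxy_set_header[[:space:]]+X-Forwarded-For[[:space:]]' "$1"; then
        echo "X-Forwarded-For is set"
    else
        echo "X-Forwarded-For is not set"
    fi
}

# Demonstration on a throwaway config that contains the misspelled header.
conf=$(mktemp)
echo 'proxy_set_header XForwardedFor $proxy_add_x_forwarded_for;' > "$conf"
check_forwarded_header "$conf"   # prints: X-Forwarded-For is not set
check_geo_db /dspace/config/GeoLite2-City.mmdb
rm "$conf"
```

The header check is a plain text match, so it does not tell whether the directive sits in the server or location block that actually proxies to DSpace.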
DSpace logs compress