
There are two ways to install DSpace. The first is to run it in Docker, which is easier and preferred. The second consists of downloading all the necessary software in matching versions, configuring it, compiling, installing and running it.

RUNNING DSPACE IN DOCKER

  • Install Docker Desktop

  • Docker Compose v2 is required. On Linux it does not always come by default, so if necessary install it following the official guide.

  • All necessary files are in the frontend repository, so first check out the repository from GitHub.

git clone https://github.com/ufal/dspace-angular
cd dspace-angular
  • In order to run DSpace in Docker, a .env file with environment variables is required in the front-end root folder (dspace-angular/). There are two basic scenarios that require slightly different configurations; an example .env file for each scenario (Localhost/Public) is given below.
    • The frontend uses several variables to compile the backend address, all starting with the common prefix DSPACE_REST_. They are exposed in the .env + compose files under different names (and all of them have default values):
      • SSL - exposed as DSPACE_SSL, set to true if connecting to the backend over https
      • HOST - exposed as DSPACE_HOST, the hostname
      • PORT - exposed as DSPACE_REST_PORT, the backend port
      • NAMESPACE - exposed as DSPACE_REST_NAMESPACE, the REST path prefix
    • The frontend doesn't need to be deployed in the root (/); you can influence this with DSPACE_UI_NAMESPACE (also see the section below on custom namespaces)
    • The backend "needs to know" where it is deployed and which frontends it can communicate with. This is configured via two URLs:
      • REST_URL - this is basically what you compile from SSL, HOST, REST_PORT and REST_NAMESPACE (see the composition sketch after this list)
      • UI_URL - compile this from SSL, HOST and UI_NAMESPACE
    • The sample .env files also contain an INSTANCE variable that is used in container names, published ports, etc.; it makes it possible to run multiple DSpace instances on the same machine
    • The REST_URL must be accessible to your users (during SSR the express server in the frontend container also uses it)
    • The S3 portion in the samples is not needed if you are not using S3 storage; it is only there to surface the relevant variables. If you are not using S3, make sure to include the following in your local.cfg:
s3.download.direct.enabled = false
s3.upload.by.parts.enabled = false
sync.storage.service.enabled = false
  • After setting up the .env file, run Docker and create users.

  • After the Docker containers have started, don't forget to set up Nginx as described below, so that DSpace can be reached from remote hosts.
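
For illustration, the two URLs are composed from the parts above like this (placeholder values, not taken from the samples below):

# DSPACE_SSL=true, DSPACE_HOST=repo.example.org, DSPACE_REST_PORT=443, DSPACE_REST_NAMESPACE=/server
#   -> REST_URL=https://repo.example.org:443/server
# DSPACE_SSL=true, DSPACE_HOST=repo.example.org, UI_PORT=443, DSPACE_UI_NAMESPACE=/
#   -> UI_URL=https://repo.example.org:443/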


a) Localhost .env set-up

When running on localhost, the frontend MUST run in development mode. An example .env file:

INSTANCE=0
DSPACE_HOST=localhost
DSPACE_VER=dspace-7_x
DSPACE_SSL=false

# NOTE!: The line below is NECESSARY for localhost: the front-end must run in development mode
FE_CMD=yarn start:dev

# Please do not edit the following variables unless you know what you are doing
DOCKER_OWNER=ufal
DSPACE_UI_IMAGE=${DOCKER_OWNER}/dspace-angular:$DSPACE_VER
DSPACE_REST_IMAGE=${DOCKER_OWNER}/dspace:$DSPACE_VER

DSPACE_REST_PORT=808${INSTANCE}
UI_PORT=400${INSTANCE}

DSPACE_REST_NAMESPACE=/server
DSPACE_UI_NAMESPACE=/

REST_URL=http://${DSPACE_HOST}:${DSPACE_REST_PORT}${DSPACE_REST_NAMESPACE}
UI_URL=http://${DSPACE_HOST}:${UI_PORT}${DSPACE_UI_NAMESPACE}

# ===== S3 Storage =====
S3_STORAGE=1
S3_ENABLED=true
S3_BUCKET=YOUR_BUCKET_NAME          # replace with your S3 bucket
S3_SUBFOLDER=assetstore
S3_ACCESS=YOUR_ACCESS_KEY           # replace with your S3 access key
S3_SECRET=YOUR_SECRET_KEY           # replace with your S3 secret key
S3_ENDPOINT=YOUR_S3_ENDPOINT_URL    # replace with your S3 endpoint URL
S3_RELATIVE_PATH=true
S3_PATH_STYLE_ACCESS=false
S3_REGION_NAME=

b) Public .env set-up

Example of .env in frontend:

INSTANCE=0
DSPACE_HOST=example.com
DSPACE_VER=dspace-7_x
DSPACE_SSL=true

# If you want to run the front-end in development mode, uncomment the next line
# (NOTE!: this is NECESSARY when running on localhost)
# FE_CMD=yarn start:dev

# Please do not edit the following variables unless you know what you are doing
DOCKER_OWNER=ufal
DSPACE_UI_IMAGE=${DOCKER_OWNER}/dspace-angular:$DSPACE_VER
DSPACE_REST_IMAGE=${DOCKER_OWNER}/dspace:$DSPACE_VER

DSPACE_REST_PORT=8${INSTANCE}
UI_PORT=8${INSTANCE}

DSPACE_REST_NAMESPACE=/server
DSPACE_UI_NAMESPACE=/

REST_URL=http://${DSPACE_HOST}:${DSPACE_REST_PORT}${DSPACE_REST_NAMESPACE}
UI_URL=http://${DSPACE_HOST}:${UI_PORT}${DSPACE_UI_NAMESPACE}

# ===== S3 Storage =====
S3_STORAGE=1
S3_ENABLED=true
S3_BUCKET=YOUR_BUCKET_NAME          # replace with your S3 bucket
S3_SUBFOLDER=assetstore
S3_ACCESS=YOUR_ACCESS_KEY           # replace with your S3 access key
S3_SECRET=YOUR_SECRET_KEY           # replace with your S3 secret key
S3_ENDPOINT=YOUR_S3_ENDPOINT_URL    # replace with your S3 endpoint URL
S3_RELATIVE_PATH=true
S3_PATH_STYLE_ACCESS=false
S3_REGION_NAME=

# If you want to set up JAVA_OPTS
# Server memory limit (4GB)
# JAVA_OPTS=-Xmx4g

You may need to change DSPACE_REST_PORT to something else, e.g. 443. Feel free to leave out the $INSTANCE part and just use the port number. Remember to replace S3_BUCKET, S3_ACCESS, S3_SECRET, S3_ENDPOINT and S3_REGION_NAME with your own configuration values. In both versions it is possible to modify the first section of values. INSTANCE is an arbitrary number; it makes it possible to run several DSpace instances on the same machine. Be sure to use a different project name (the -p parameter for Docker Compose) for each instance, and check that your machine has sufficient resources (CPU, RAM) for that.
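
For example, running two instances side by side might look like this (an illustrative sketch; the project names are placeholders and the ports follow the localhost-style pattern 400${INSTANCE}/808${INSTANCE}):

# checkout A: .env contains INSTANCE=0  ->  UI_PORT=4000, DSPACE_REST_PORT=8080
docker compose --env-file .env -p dspace-instance-0 -f docker/docker-compose.yml -f docker/docker-compose-rest.yml up -d
# checkout B: .env contains INSTANCE=1  ->  UI_PORT=4001, DSPACE_REST_PORT=8081
docker compose --env-file .env -p dspace-instance-1 -f docker/docker-compose.yml -f docker/docker-compose-rest.yml up -d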

DSPACE_VER refers to the image tag; most of the available tags are in this list: Docker Tags

If your reverse proxy is on a different machine, add HOST_IP=a.b.c.d to your .env, where a.b.c.d is the IP of the interface that your reverse proxy can reach.
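
For example (the IP is purely illustrative):

HOST_IP=192.0.2.10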


Example: Public .env file

INSTANCE=0
DSPACE_HOST=example.com
DSPACE_VER=dspace-7_x
DSPACE_SSL=true

DOCKER_OWNER=ufal
DSPACE_UI_IMAGE=${DOCKER_OWNER}/dspace-angular:$DSPACE_VER
DSPACE_REST_IMAGE=${DOCKER_OWNER}/dspace:$DSPACE_VER

DSPACE_REST_PORT=443
UI_PORT=8443

DSPACE_REST_NAMESPACE=/server
DSPACE_UI_NAMESPACE=/

REST_URL=https://${DSPACE_HOST}:${DSPACE_REST_PORT}${DSPACE_REST_NAMESPACE}
UI_URL=https://${DSPACE_HOST}:${UI_PORT}${DSPACE_UI_NAMESPACE}

S3_STORAGE=1
S3_ENABLED=true
S3_BUCKET=my-dspace-bucket
S3_SUBFOLDER=assetstore
S3_ACCESS=AKIAIOSFODNN7EXAMPLE
S3_SECRET=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
S3_ENDPOINT=https://s3.example.com
S3_RELATIVE_PATH=true
S3_PATH_STYLE_ACCESS=false
S3_REGION_NAME=us-east-1

or check the lindat .env file sample

Run Docker

After setting up the .env file, run the commands to start Docker (you can replace dspace-project-name with something suitable for you):

docker compose --env-file .env -f docker/docker-compose.yml -f docker/docker-compose-rest.yml pull

For development, where you change and rebuild frequently, you probably want the images explicitly rebuilt; use --build:

docker compose --env-file .env -p dspace-project-name -f docker/docker-compose.yml -f docker/docker-compose-rest.yml up -d --build

If you do not want the images rebuilt, add --no-build instead to start as quickly as possible:

docker compose --env-file .env -p dspace-project-name -f docker/docker-compose.yml -f docker/docker-compose-rest.yml up -d --no-build

Now you should be able to open $UI_URL (http://localhost:4000/ if you haven't changed it) in your browser. It takes a while before everything starts.
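
To see whether everything has come up, you can check the containers and follow their logs (a sketch; the service name dspace is an assumption based on the Compose files and may differ in your checkout):

docker compose --env-file .env -p dspace-project-name -f docker/docker-compose.yml -f docker/docker-compose-rest.yml ps
docker compose --env-file .env -p dspace-project-name -f docker/docker-compose.yml -f docker/docker-compose-rest.yml logs -f dspace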

To add an administrator and other users, use the following commands, with exactly the same Docker Compose files and .env as above.

docker compose --env-file .env -p dspace-project-name -f docker/docker-compose.yml -f docker/docker-compose-rest.yml -f docker/cli.yml run --rm dspace-cli create-administrator -e [email protected] -f firstname -l lastname -p password -c en -o organization
docker compose --env-file .env -p dspace-project-name -f docker/docker-compose.yml -f docker/docker-compose-rest.yml -f docker/cli.yml run --rm dspace-cli user --add -m [email protected] -g givenname -s surname -l en -p password -o organization

It is of course possible to change the parameter values: -e / -m set the email, -f the first name, -g the given name, -l the last name, -s the surname, -p the password and -o the organization. Use only the arguments shown above and just modify the values as needed.

In the folder with the Docker Compose files (the docker/ folder in the commands above) it is also possible to place a config.prod.yml file for the front-end and a local.cfg file for the back-end.
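
A minimal sketch of what these files might contain, assuming the standard DSpace 7 configuration keys (the values are illustrative; adjust them to your set-up):

# docker/config.prod.yml (front-end)
rest:
  ssl: true
  host: example.com
  port: 443
  nameSpace: /server

# docker/local.cfg (back-end)
dspace.ui.url = https://example.com
dspace.server.url = https://example.com/server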


Defining a custom namespace

See the Custom-namespace page.

NOTE: the namespaces (subpaths) in the sample .env files above and the defaults in aai.js don't match; have a look at custom namespaces as well if you plan to use DiscoJuice. There's a dedicated page: Discojuice set up.


Avoiding deleting volumes

The main rule is simply to be careful. When a volume is mounted on another disk, Docker does not allow the volume to be removed. Instead, an error is displayed: Error response from daemon: remove <volume-name>: Unable to remove a directory outside of the local volume root /var/lib/docker: /<path-to-docker-storage>/volumes/test/_data. You can use this fact to add another layer of protection for volumes by placing them on another disk (which is sometimes necessary anyway, due to data size). It can be done simply by symlinking /var/lib/docker/volumes to a chosen place on another disk. But be sure to test it before relying on it.
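
One way to do this (a sketch, assuming a systemd-based host and /data/docker-volumes as the target directory; stop Docker first and test afterwards):

sudo systemctl stop docker
sudo mv /var/lib/docker/volumes /data/docker-volumes
sudo ln -s /data/docker-volumes /var/lib/docker/volumes
sudo systemctl start docker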


RUNNING DSPACE WITHOUT DOCKER

The original installation instructions for vanilla DSpace are available, but they are quite long and extensive, and some parts are not necessary. They also cover several possible versions, so here is a shortened list. Consult the original instructions if anything is unclear.


Required software

Make sure you know where the required software is installed/extracted and that you are able to access it.


Installation

  • create a database

    • go to the database installation folder
    • createuser --username=postgres --no-superuser --pwprompt dspace
    • createdb --username=postgres --owner=dspace --encoding=UNICODE dspace
    • psql --username=postgres -c "CREATE EXTENSION pgcrypto;" dspace
  • download DSpace sources (this repo)

  • edit configuration in dspace/config/clarin-dspace.cfg (and other configs)

  • use the command mvn clean install in the repo root

  • (go to /dspace/target/dspace-installer)

  • use the command ant fresh_install in <dspace-repo>/dspace/target/dspace-installer

    • the above command creates a DSpace installation in a new folder; by default it is C:/dspace or /dspace.
    • locate it and make sure this command created it.
    • from now on, we will refer to it as <dspace-installation-folder>
  • (go to DSpace installation folder )

  • use the command bin/dspace database migrate force in <dspace-installation-folder>

  • create an administrator with bin/dspace create-administrator in <dspace-installation-folder>

  • copy everything from webapps/* to <tomcat>/webapps

  • copy the Solr cores: cp -R [dspace]/solr/* [solr]/server/solr/configsets

  • download frontend sources

  • use the command yarn install in the frontend sources folder
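
Condensed, the backend part of the steps above looks roughly like this (a sketch; paths in angle brackets are placeholders):

createuser --username=postgres --no-superuser --pwprompt dspace
createdb --username=postgres --owner=dspace --encoding=UNICODE dspace
psql --username=postgres -c "CREATE EXTENSION pgcrypto;" dspace

cd <dspace-repo>
mvn clean install
cd dspace/target/dspace-installer
ant fresh_install

cd <dspace-installation-folder>
bin/dspace database migrate force
bin/dspace create-administrator
cp -R webapps/* <tomcat>/webapps
cp -R solr/* <solr>/server/solr/configsets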


Running DSpace

  • make sure your database is running (it should be running automatically)
  • (go to frontend sources)
  • use the command yarn start in <frontend-source>
  • start solr by solr start
  • start tomcat by using catalina run
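
Put together, one sensible start-up order is (a sketch; paths in angle brackets are placeholders, catalina.sh applies to Linux):

<solr>/bin/solr start
<tomcat>/bin/catalina.sh run
cd <frontend-source> && yarn start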

Notes

The .env file can contain the following additional variables to configure S3:

S3_STORAGE=1
S3_ENABLED=true

S3_RELATIVE_PATH=false
S3_BUCKET=docker-dummy-bucket
S3_SUBFOLDER=
S3_ACCESS=myaccestoken
S3_SECRET=mysecretpasswordtoken
S3_REGION_NAME=us-east-1

These variables are valid since version 7.5. The first two must remain as they are in order to enable S3; the rest can (and should) be modified to match your storage.


Nginx

The whole server block should look like this:

server {
        listen 80;
        server_name dspace.url;
        location / {
            proxy_pass http://localhost:4000;
        }
        location /server/ {
            proxy_set_header Host $http_host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

            proxy_pass http://localhost:8080;
        }
}

This assumes the following:

  • DSpace is run in Docker
  • The back-end runs on port 8080
  • The front-end runs on port 4000
  • the settings in .env or config specify the following addresses:
    • DSPACE_UI_URL: dspace.url
    • DSPACE_REST_URL: dspace.url/server/

Of course, if some ports are different, change them in the configuration.
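
After changing the configuration, validate and reload Nginx (standard commands, assuming a systemd-based host):

sudo nginx -t
sudo systemctl reload nginx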

TODO: document necessary headers (such as X-Forwarded-Proto and X-Forwarded-Port) and ref https://github.com/dataquest-dev/DSpace/issues/536


CMDI data for machines

CLARIN installations must be able to return just the CMDI metadata. Add the following to the location / block of the Nginx configuration above.

# placed in location block of DSpace frontend

# redirect .../handle/123456/123456?format=cmdi to .../cmdi/oai-metadata..., which returns just an XML file with the metadata
# the ? at the end of the rewrite target stops nginx from appending the original query parameters
if ($query_string ~* "format=cmdi"){
    rewrite ^/(.*)handle/(.*)$ http://$http_host/server/cmdi/oai-metadata?metadataPrefix=cmdi&handle=$2? redirect;
}

# if an HTTP request to .../handle/123456/123456 contains a header like "Accept: application/x-cmdi+xml", redirect
# to the same target as above.
# $http_<name_of_header> gives access to any request header, in this case Accept
if ($http_accept ~ "(.*cmdi.xml*)"){
    rewrite ^/(.*)handle/(.*)$ http://$http_host/server/cmdi/oai-metadata?metadataPrefix=cmdi&handle=$2? redirect;
}

Assuming the FE and BE are behind the same host (proxy), you can instead use:

    # CMDI content - replace repository-ng with your path prefix, or tweak the regexp as above
    if ($arg_format ~* "cmdi"){
        rewrite ^/repository-ng/handle/(.*)$ /repository-ng/server/cmdi/oai-metadata?metadataPrefix=cmdi&handle=$1? last;
    }

    if ($http_accept = "application/x-cmdi+xml"){
        rewrite ^/repository-ng/handle/(.*)$ /repository-ng/server/cmdi/oai-metadata?metadataPrefix=cmdi&handle=$1? last;
    }
    # /CMDI content

Check

To check the first part, use a command like

curl -k https://dspacehost.com/handle/1234/56789?format=cmdi -L

To check the second part, use a command like

curl -k https://dspacehost.com/handle/1234/56789 -L -H "Accept: application/x-cmdi+xml"


TODO shibboleth configuration

start from https://github.com/ufal/clarin-dspace/issues/1032#issuecomment-2066469795


CRON jobs

Create a wrapper script run-cli-command-88.sh with the following content:

#!/bin/bash
sudo docker exec -w /dspace/bin dspace8 ./dspace "$@"

and make it executable: chmod +x /path/to/run-cli-command-88.sh

0 23 * * * cd /app && ./run-cli-command-88.sh oai import

20 0 * * * cd /app && ./run-cli-command-88.sh index-discovery

1 3 * * * cd /app && ./run-cli-command-88.sh subscription-send -f D

2 3 * * 0 cd /app && ./run-cli-command-88.sh subscription-send -f W

3 3 1 * * cd /app && ./run-cli-command-88.sh subscription-send -f M

0 4 1 * * cd /app && ./run-cli-command-88.sh cleanup

30 0 * * * cd /app && ./run-cli-command-88.sh health-report -e <YOUR_EMAIL>

or use a system crontab; for example, /etc/cron.d/lindatrepo could contain:

MAILTO=root
RUNCMD="docker compose -p lindatrepo exec dspace /dspace/bin/dspace"

0 23 * * * root $RUNCMD oai import

20 0 * * * root $RUNCMD index-discovery

1 3 * * * root $RUNCMD subscription-send -f D

2 3 * * 0 root $RUNCMD subscription-send -f W

3 3 1 * * root $RUNCMD subscription-send -f M

0 4 1 * * root $RUNCMD cleanup -v

30 0 * * * root $RUNCMD health-report -e <YOUR_EMAIL>

NOTE

Avoid running any /dspace/bin/dspace commands around midnight. That is when log rotation happens, and we have seen logs lost (probably due to multiple log rotations).


NOTES

OAI

Please check whether OAI shows items after adding (or harvesting) many of them. They should be visible in the OAI interface. If something is wrong, either an empty page or an error number with a short description is shown. Check the Apache Tomcat logs; try the folders /dspace/log and tomcat/logs (in Docker it is usually /usr/local/tomcat/logs).
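
A quick way to probe the OAI interface from the command line (assuming the standard DSpace 7 OAI endpoint under /server/oai; replace the host with your own):

curl -k "https://dspace.url/server/oai/request?verb=Identify"
curl -k "https://dspace.url/server/oai/request?verb=ListRecords&metadataPrefix=oai_dc"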
