This post summarizes the process of importing CSV data into Neo4j running on Docker Compose.
Importing CSV data worked as expected with a direct installation and with a standalone Docker container, but failed under Docker Compose. The import eventually succeeded after several attempts and a few issues along the way. This post records the approach that worked.
Preparation #
Creating CSV Data for Import #
First, prepare the CSV data to import. The following three files, taken from the official Neo4j manual, are used.
- movies.csv

  movieId:ID,title,year:int,:LABEL
  tt0133093,"The Matrix",1999,Movie
  tt0234215,"The Matrix Reloaded",2003,Movie;Sequel
  tt0242653,"The Matrix Revolutions",2003,Movie;Sequel

- actors.csv

  personId:ID,name,:LABEL
  keanu,"Keanu Reeves",Actor
  laurence,"Laurence Fishburne",Actor
  carrieanne,"Carrie-Anne Moss",Actor

- roles.csv

  :START_ID,role,:END_ID,:TYPE
  keanu,"Neo",tt0133093,ACTED_IN
  keanu,"Neo",tt0234215,ACTED_IN
  keanu,"Neo",tt0242653,ACTED_IN
  laurence,"Morpheus",tt0133093,ACTED_IN
  laurence,"Morpheus",tt0234215,ACTED_IN
  laurence,"Morpheus",tt0242653,ACTED_IN
  carrieanne,"Trinity",tt0133093,ACTED_IN
  carrieanne,"Trinity",tt0234215,ACTED_IN
  carrieanne,"Trinity",tt0242653,ACTED_IN
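Since every `:START_ID` and `:END_ID` in roles.csv has to match an ID in one of the node files, a quick consistency check before importing can save a failed run. The following is a minimal sketch, not part of the original setup: `check_roles` is a hypothetical helper name, and the simple comma split assumes that no quoted field contains a comma, which holds for the sample files above.

```shell
#!/bin/bash
# check_roles DIR: report relationship endpoints that have no matching node ID.
# Hypothetical helper; assumes the three file names used in this post and
# that no quoted field contains a comma.
check_roles() {
  local dir="$1"
  awk -F, '
    FNR == 1 || NF == 0 { next }     # skip each header row and blank lines
    FILENAME ~ /roles\.csv$/ {       # relationship rows: $1 is :START_ID, $3 is :END_ID
      if (!($1 in ids)) { print "unknown :START_ID " $1; bad = 1 }
      if (!($3 in ids)) { print "unknown :END_ID " $3; bad = 1 }
      next
    }
    { ids[$1] = 1 }                  # node rows: the ID is the first column
    END { exit bad + 0 }             # non-zero exit status if anything was unknown
  ' "$dir/movies.csv" "$dir/actors.csv" "$dir/roles.csv"
}
```

Calling `check_roles` with the directory that holds the three CSV files prints nothing and returns 0 when every endpoint resolves. The node files are passed to `awk` first so that all IDs are known before roles.csv is read.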
Preparing docker-compose.yml #
Create the following docker-compose.yml file:
version: '3'
services:
  neo4j:
    image: neo4j:latest
    ports:
      - 57474:7474
      - 57687:7687
    volumes:
      - ./volumes/neo4j/data:/data
      - ./volumes/neo4j/logs:/logs
      - ./volumes/neo4j/import:/import
      - ./volumes/neo4j/script:/script
    environment:
      - EXTENSION_SCRIPT=/script/import_data.sh
- Image: Uses the latest Neo4j image.
- Ports: Maps host ports 57474 and 57687 to the container's ports 7474 (HTTP) and 7687 (Bolt).
- Volumes: Mounts host directories for persistence.

  Host Directory | Container Directory | Purpose |
  ---|---|---|
  ./volumes/neo4j/data | /data | Stores Neo4j data |
  ./volumes/neo4j/logs | /logs | Stores logs |
  ./volumes/neo4j/import | /import | Stores import data |
  ./volumes/neo4j/script | /script | Stores import scripts |

- Environment: The EXTENSION_SCRIPT variable specifies a script that runs before Neo4j starts. The Neo4j Docker image supports it for environment initialization, configuration settings, credential loading, and dynamic modifications to neo4j.conf. When EXTENSION_SCRIPT is set, the entrypoint code of the Neo4j Docker image is executed first, followed by the specified script, and finally Neo4j itself starts. With Docker Compose, running the CSV import after Neo4j has started results in errors. The import script is therefore specified in EXTENSION_SCRIPT, which ensures that the import completes before Neo4j starts.
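The ordering described above (entrypoint code, then the extension script, then the database) can be illustrated without Docker. The following is a simplified stand-in and not the Neo4j image's actual entrypoint code. It assumes the entrypoint loads the script with `.` (source), which is what allows a script to end early with a top-level `return`. The names `fake_entrypoint`, `extension.sh`, and `demo_dir` are hypothetical.

```shell
#!/bin/bash
# Simplified stand-in for an entrypoint; NOT the Neo4j image's actual code.
demo_dir="$(mktemp -d)"

# Hypothetical extension script: it skips its work when a marker file exists.
cat > "$demo_dir/extension.sh" <<'EOF'
if [ -f "$demo_dir/done" ]; then
  echo "extension: skip"
  return   # valid only because this file is sourced, not executed
fi
echo "extension: import"
touch "$demo_dir/done"
EOF

fake_entrypoint() {
  echo "entrypoint: setup"
  . "$demo_dir/extension.sh"   # sourced, so it runs inside the entrypoint's shell
  echo "entrypoint: start database"
}

fake_entrypoint   # first run: the extension does its work
fake_entrypoint   # second run: the extension returns early
```

In both runs the "start database" line comes last, and on the second run the early `return` ends only the sourced script rather than the whole entrypoint.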
Preparing the Import Script #
The following import script, import_data.sh, is specified in EXTENSION_SCRIPT:
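#!/bin/bash
# Exit on errors and unset variables; -C refuses to overwrite files via redirection
set -euC

# Skip the import if a previous run already completed.
# `return` works here only if the entrypoint sources this script;
# it ends the script without stopping startup
if [ -f /import/done ]; then
  echo "Skip import process"
  return
fi

# Remove any existing database so the import starts from an empty store
echo "Start the database deletion process"
rm -rf /data/databases
rm -rf /data/transactions
echo "Complete the database deletion process"

echo "Start the data import process"
bin/neo4j-admin database import \
  --nodes=/import/movies.csv \
  --nodes=/import/actors.csv \
  --relationships=/import/roles.csv \
  neo4j
echo "Complete the data import process"

# Mark the import as done so that later restarts skip it
touch /import/done

# The script runs as root, so hand the files back to the neo4j user
echo "Start ownership change"
chown -R neo4j:neo4j /data
chown -R neo4j:neo4j /logs
echo "Complete ownership change"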
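Because the script skips the import whenever /import/done exists, repeating the import after editing the CSV files means removing that marker on the host side first. A minimal sketch follows, assuming the volume layout from the docker-compose.yml above; `reset_import_marker` is a hypothetical helper name, not something the Neo4j image provides.

```shell
#!/bin/bash
# reset_import_marker [IMPORT_DIR]: remove the marker so the next start re-imports.
# Hypothetical helper; the default path assumes the volume mapping used in this post.
reset_import_marker() {
  local import_dir="${1:-./volumes/neo4j/import}"
  if [ -f "$import_dir/done" ]; then
    rm -f "$import_dir/done"
    echo "Marker removed; the next start will re-import"
  else
    echo "No marker found; the import will run on the next start"
  fi
}
```

A typical sequence would be `docker-compose down`, then `reset_import_marker`, then `docker-compose up -d`. Note that the import script deletes /data/databases and /data/transactions before importing, so any data added after the previous import is lost.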
Running Neo4j with Docker Compose #
Placing Files #
The following directory structure is used:
(Neo4j directory)
├── docker-compose.yml
└── volumes
    └── neo4j
        ├── import
        │   ├── actors.csv
        │   ├── movies.csv
        │   └── roles.csv
        └── script
            └── import_data.sh
Starting Neo4j #
Navigate to the directory containing docker-compose.yml and execute:
docker-compose up -d
Confirming Neo4j Startup #
Access http://localhost:57474 and connect with:
Field | Value |
---|---|
Connect URL | neo4j://localhost:57687 |
Database | |
Authentication Type | Username / Password |
Username | neo4j |
Password | neo4j |
Running a Cypher query such as MATCH (n) RETURN n should then display the imported nodes.
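To confirm that nothing was dropped, the totals returned by `MATCH (n) RETURN count(n)` and `MATCH ()-[r]->() RETURN count(r)` can be compared with the number of data rows in the CSV files. The sketch below computes those expected totals on the host. `expected_counts` is a hypothetical helper name, and it assumes one record per line with a single header row per file, as in the sample data.

```shell
#!/bin/bash
# expected_counts DIR: print the node and relationship totals implied by the CSV files.
# Hypothetical helper; assumes one record per line and one header row per file.
expected_counts() {
  local dir="$1"
  local nodes rels
  # FNR > 1 skips each file's header row; NF skips blank lines
  nodes="$(awk 'FNR > 1 && NF' "$dir/movies.csv" "$dir/actors.csv" | wc -l)"
  rels="$(awk 'FNR > 1 && NF' "$dir/roles.csv" | wc -l)"
  # Arithmetic expansion strips any padding that wc adds on some platforms
  echo "nodes=$((nodes)) relationships=$((rels))"
}
```

For the three sample files in this post, the totals are 6 nodes (3 movies and 3 actors) and 9 relationships.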
Conclusion #
The process of importing CSV data into Neo4j on Docker Compose is documented. Key issues encountered and their resolutions:
- Neo4j on Docker Compose requires data import before startup.
  - Resolved by executing the import via EXTENSION_SCRIPT.
- EXTENSION_SCRIPT runs with root privileges, causing ownership issues.
  - Resolved by changing ownership to neo4j after the import.