This post summarizes the process of importing CSV data into Neo4j running on Docker Compose.
When importing CSV data into Neo4j, a direct installation and a standalone Docker container both worked as expected, but the same import failed under Docker Compose. After several attempts the import succeeded, though a few issues came up along the way. This post records the approach that worked.
Preparation #
Creating CSV Data for Import #
The CSV data for import is prepared. The following three files, as presented in the official Neo4j manual, are used.
- movies.csv

      movieId:ID,title,year:int,:LABEL
      tt0133093,"The Matrix",1999,Movie
      tt0234215,"The Matrix Reloaded",2003,Movie;Sequel
      tt0242653,"The Matrix Revolutions",2003,Movie;Sequel

- actors.csv

      personId:ID,name,:LABEL
      keanu,"Keanu Reeves",Actor
      laurence,"Laurence Fishburne",Actor
      carrieanne,"Carrie-Anne Moss",Actor

- roles.csv

      :START_ID,role,:END_ID,:TYPE
      keanu,"Neo",tt0133093,ACTED_IN
      keanu,"Neo",tt0234215,ACTED_IN
      keanu,"Neo",tt0242653,ACTED_IN
      laurence,"Morpheus",tt0133093,ACTED_IN
      laurence,"Morpheus",tt0234215,ACTED_IN
      laurence,"Morpheus",tt0242653,ACTED_IN
      carrieanne,"Trinity",tt0133093,ACTED_IN
      carrieanne,"Trinity",tt0234215,ACTED_IN
      carrieanne,"Trinity",tt0242653,ACTED_IN
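Because roles.csv refers to nodes only by ID, it can be worth checking before the import that every :START_ID and :END_ID is defined in one of the node files. The following is a minimal sketch, not part of Neo4j: `check_role_ids` is a hypothetical helper, and it assumes the node ID is the first column of each node file, that :START_ID and :END_ID are columns 1 and 3 of the relationship file, and that no quoted field contains a comma (all true for the sample data above).

```shell
#!/bin/bash
# Hypothetical helper (not part of Neo4j): print every ID used in a
# relationship file that is not defined in any of the given node files.
# Assumptions: node ID is column 1; :START_ID is column 1 and :END_ID is
# column 3 of the relationship file; no quoted field contains a comma.
check_role_ids() {
  local rels="$1"
  shift
  # Node files are read first so all IDs are known before the
  # relationship file is checked.
  awk -F',' -v rels="$rels" '
    FNR == 1 { next }                         # skip the header of each file
    FILENAME != rels { known[$1] = 1; next }  # node file: remember the ID
    {
      if (!($1 in known)) print "unknown start id: " $1
      if (!($3 in known)) print "unknown end id: " $3
    }
  ' "$@" "$rels"
}

# Usage (prints nothing when every ID resolves):
#   check_role_ids roles.csv movies.csv actors.csv
```

An empty output means every relationship points at a known node. For real data with commas inside quoted fields, a proper CSV parser would be needed instead of a plain comma split.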
Preparing docker-compose.yml #
The following docker-compose.yml file is created:

    version: '3'
    services:
      neo4j:
        image: neo4j:latest
        ports:
          - 57474:7474
          - 57687:7687
        volumes:
          - ./volumes/neo4j/data:/data
          - ./volumes/neo4j/logs:/logs
          - ./volumes/neo4j/import:/import
          - ./volumes/neo4j/script:/script
        environment:
          - EXTENSION_SCRIPT=/script/import_data.sh

- Image: Uses the latest Neo4j image.
- Ports: Maps host ports 57474 and 57687 to the container ports 7474 and 7687.
- Volumes: Mounts host directories for persistence.

  | Host Directory | Container Directory | Purpose |
  |---|---|---|
  | ./volumes/neo4j/data | /data | Stores Neo4j data |
  | ./volumes/neo4j/logs | /logs | Stores logs |
  | ./volumes/neo4j/import | /import | Stores import data |
  | ./volumes/neo4j/script | /script | Stores import scripts |

- Environment: The EXTENSION_SCRIPT variable specifies a script that initializes the environment before Neo4j starts. The Neo4j Docker image allows specifying a script using EXTENSION_SCRIPT, which can be used for environment initialization, configuration settings, credential loading, and dynamic modifications to neo4j.conf. When EXTENSION_SCRIPT is set, the entrypoint code of the Neo4j Docker image is executed first, followed by the specified script, and finally Neo4j itself starts. In the case of CSV data import using Docker Compose, executing the import process after Neo4j starts results in errors. Therefore, the import script is specified in EXTENSION_SCRIPT, ensuring that the import process is completed before Neo4j starts.
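The startup order described above can be illustrated with a toy model. This is not the real entrypoint of the Neo4j image: `toy_entrypoint` and `start_server` are hypothetical stand-ins, and the sketch assumes the extension script is sourced (run inside the entrypoint's own shell) rather than executed as a separate process. Under that assumption an extension script can stop early with `return` without ending the entrypoint, which is what the import script in the next section relies on.

```shell
#!/bin/bash
# Toy model of the startup order; NOT the actual Neo4j entrypoint.
# start_server and toy_entrypoint are hypothetical stand-ins.
start_server() {
  echo "3: server starts"
}

toy_entrypoint() {
  echo "1: entrypoint setup"
  # Assumption: the extension script is sourced, so it runs in this shell
  # and may use "return" to stop early without ending the entrypoint.
  if [ -n "${EXTENSION_SCRIPT:-}" ] && [ -f "${EXTENSION_SCRIPT}" ]; then
    . "${EXTENSION_SCRIPT}"
  fi
  start_server
}

# Usage, with a throwaway extension script:
#   printf 'echo "2: extension script"\nreturn\necho "never printed"\n' > /tmp/ext.sh
#   EXTENSION_SCRIPT=/tmp/ext.sh toy_entrypoint
```

In this model the server step always runs last, so anything the extension script does, such as an offline import, is finished before the server would open the database.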
Preparing the Import Script #
The following import script is specified in EXTENSION_SCRIPT:
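
    #!/bin/bash
    set -euC

    # Skip everything if a previous run already completed the import.
    if [ -f /import/done ]; then
      echo "Skip import process"
      return
    fi

    # Remove any existing store so the import starts from an empty database.
    echo "Start the database deletion process"
    rm -rf /data/databases
    rm -rf /data/transactions
    echo "Complete the database deletion process"

    # Import the node and relationship files into the "neo4j" database.
    echo "Start the data import process"
    bin/neo4j-admin database import \
      --nodes=/import/movies.csv \
      --nodes=/import/actors.csv \
      --relationships=/import/roles.csv \
      neo4j
    echo "Complete the data import process"

    # Leave a marker file so later container starts skip the import.
    touch /import/done

    # The script runs as root, so hand the files back to the neo4j user.
    echo "Start ownership change"
    chown -R neo4j:neo4j /data
    chown -R neo4j:neo4j /logs
    echo "Complete ownership change"

Running Neo4j with Docker Compose #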
Placing Files #
The following directory structure is used:

    (Neo4j directory)
    ├── docker-compose.yml
    └── volumes
        └── neo4j
            ├── import
            │   ├── actors.csv
            │   ├── movies.csv
            │   └── roles.csv
            └── script
                └── import_data.sh

Starting Neo4j #
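Before starting the container, it can help to confirm that the files from the layout above are in place, since a missing CSV or script only shows up later in the container logs. The following pre-flight check is a sketch: `check_layout` is a hypothetical helper, and the file list simply mirrors the directory structure shown above.

```shell
#!/bin/bash
# Hypothetical pre-flight check: report which expected files are missing
# under the given project directory (defaults to the current directory).
# Returns 0 when everything is present, 1 otherwise.
check_layout() {
  local root="${1:-.}" missing=0 f
  for f in docker-compose.yml \
           volumes/neo4j/import/actors.csv \
           volumes/neo4j/import/movies.csv \
           volumes/neo4j/import/roles.csv \
           volumes/neo4j/script/import_data.sh; do
    if [ ! -f "$root/$f" ]; then
      echo "missing: $f"
      missing=1
    fi
  done
  return "$missing"
}

# Usage, from the directory containing docker-compose.yml:
#   check_layout . && echo "layout looks complete"
```

The data and logs directories are left out of the check on purpose, because they hold no files until the container has run.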
Navigate to the directory containing docker-compose.yml and execute:

    docker-compose up -d

Confirming Neo4j Startup #
Access http://localhost:57474 and connect with:
| Field | Value |
|---|---|
| Connect URL | neo4j://localhost:57687 |
| Database | |
| Authentication Type | Username / Password |
| Username | neo4j |
| Password | neo4j |
Upon running a Cypher query, the imported data should be visible.
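To judge whether the import is complete, the expected totals can be derived from the CSV files themselves: each non-header line is one node or one relationship. The sketch below uses a hypothetical helper, `count_records`, and assumes no record spans more than one line. Its results can be compared with what queries such as `MATCH (n) RETURN count(n)` and `MATCH ()-[r]->() RETURN count(r)` report in the browser.

```shell
#!/bin/bash
# Hypothetical helper: count data records across one or more CSV files,
# skipping the header line of each file and ignoring blank lines.
# Assumption: every record fits on a single line.
count_records() {
  awk 'FNR > 1 && NF { n++ } END { print n + 0 }' "$@"
}

# Usage with the sample files from this post:
#   count_records movies.csv actors.csv   # expected node count: 6
#   count_records roles.csv               # expected relationship count: 9
```

For the sample data this gives 6 nodes (3 movies and 3 actors) and 9 relationships.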
Conclusion #
The process of importing CSV data into Neo4j on Docker Compose is documented. Key issues encountered and their resolutions:
- Neo4j on Docker Compose requires data import before startup.
  - Resolved by executing import via EXTENSION_SCRIPT.
- EXTENSION_SCRIPT runs with root privileges, causing ownership issues.
  - Resolved by changing ownership to neo4j after import.