Importing CSV Data into Neo4j on Docker Compose

Neo4j
Makoto Morinaga

This post records how to import CSV data into Neo4j running under Docker Compose.

Importing CSV data worked as expected on a direct installation and on a standalone Docker container, but failed under Docker Compose. The import eventually succeeded after several attempts. This post records the approach that worked and the issues encountered along the way.

Preparation

Creating CSV Data for Import

The import uses the following three CSV files, taken from the official Neo4j manual.

  • movies.csv

    movies.csv
    movieId:ID,title,year:int,:LABEL
    tt0133093,"The Matrix",1999,Movie
    tt0234215,"The Matrix Reloaded",2003,Movie;Sequel
    tt0242653,"The Matrix Revolutions",2003,Movie;Sequel
  • actors.csv

    actors.csv
    personId:ID,name,:LABEL
    keanu,"Keanu Reeves",Actor
    laurence,"Laurence Fishburne",Actor
    carrieanne,"Carrie-Anne Moss",Actor
  • roles.csv

    roles.csv
    :START_ID,role,:END_ID,:TYPE
    keanu,"Neo",tt0133093,ACTED_IN
    keanu,"Neo",tt0234215,ACTED_IN
    keanu,"Neo",tt0242653,ACTED_IN
    laurence,"Morpheus",tt0133093,ACTED_IN
    laurence,"Morpheus",tt0234215,ACTED_IN
    laurence,"Morpheus",tt0242653,ACTED_IN
    carrieanne,"Trinity",tt0133093,ACTED_IN
    carrieanne,"Trinity",tt0234215,ACTED_IN
    carrieanne,"Trinity",tt0242653,ACTED_IN
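
Because roles.csv refers to nodes only through the :START_ID and :END_ID columns, a mistyped ID would otherwise surface only at import time. The following is a minimal sketch of a pre-check, assuming the simple layout shown above: a header on the first line, the node ID in column 1 of the node files, and no embedded commas in the ID columns. The function name missing_ids is made up for illustration.

```shell
# missing_ids NODE_CSV REL_CSV COLUMN
# Prints every value found in column COLUMN of REL_CSV (header skipped) that
# does not appear in column 1 of NODE_CSV (header skipped).
# Assumption: IDs are plain comma-separated values without quotes or commas.
missing_ids() {
  local node_csv=$1 rel_csv=$2 column=$3
  awk -F, -v col="$column" '
    FNR == 1 { next }                 # skip the header line of each file
    NR == FNR { ids[$1] = 1; next }   # first file: remember the node IDs
    !($col in ids) { print $col }     # second file: report unknown IDs
  ' "$node_csv" "$rel_csv" | sort -u
}

# Usage with the files from this post (no output means no dangling IDs):
#   missing_ids actors.csv roles.csv 1   # actors referenced by :START_ID
#   missing_ids movies.csv roles.csv 3   # movies referenced by :END_ID
```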

Preparing docker-compose.yml

The following docker-compose.yml file is created:

docker-compose.yml
version: '3'

services:
  neo4j:
    image: neo4j:latest
    ports:
      - 57474:7474
      - 57687:7687
    volumes:
      - ./volumes/neo4j/data:/data
      - ./volumes/neo4j/logs:/logs
      - ./volumes/neo4j/import:/import
      - ./volumes/neo4j/script:/script
    environment:
      - EXTENSION_SCRIPT=/script/import_data.sh
  • Image: Uses the latest Neo4j image.

  • Ports: Maps host ports 57474 and 57687 to container ports 7474 (HTTP, used by Neo4j Browser) and 7687 (Bolt).

  • Volumes: Mounts host directories for persistence.

    Host Directory          Container Directory  Purpose
    ./volumes/neo4j/data    /data                Stores Neo4j data
    ./volumes/neo4j/logs    /logs                Stores logs
    ./volumes/neo4j/import  /import              Stores import data
    ./volumes/neo4j/script  /script              Stores import scripts
  • Environment: EXTENSION_SCRIPT names a script that the Neo4j Docker image runs before Neo4j starts. It can be used for environment initialization, configuration settings, credential loading, and dynamic modifications to neo4j.conf. When EXTENSION_SCRIPT is set, the image's entrypoint code runs first, then the specified script, and finally Neo4j itself. Under Docker Compose, running the import after Neo4j has started results in errors, so the import script is specified in EXTENSION_SCRIPT to ensure the import completes before Neo4j starts.
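
The ordering described for EXTENSION_SCRIPT can be illustrated with a small stand-in. This is only a sketch: mock_entrypoint is a hypothetical function, not the image's real entrypoint, and it assumes the extension script is sourced into the entrypoint's shell (which would also explain why a script can leave early with `return`).

```shell
# mock_entrypoint: simplified stand-in for the image's entrypoint.
# Only the ordering is modelled; the real entrypoint does much more.
mock_entrypoint() {
  echo "1. entrypoint setup"
  # Assumption: the extension script is sourced, i.e. run in this same shell,
  # so a `return` inside it hands control back here instead of ending startup.
  if [ -f "${EXTENSION_SCRIPT:-}" ]; then
    . "${EXTENSION_SCRIPT}"
  fi
  echo "3. start Neo4j"
}

# Usage: point EXTENSION_SCRIPT at any script and call the function, e.g.
#   EXTENSION_SCRIPT=./my_ext.sh
#   mock_entrypoint
```

Anything the extension script does, such as an offline import, therefore finishes before the final step runs.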

Preparing the Import Script

The following import script is specified in EXTENSION_SCRIPT:

import_data.sh
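#!/bin/bash
set -euC

# Skip everything if a previous run already imported the data.
# `return` (rather than `exit`) assumes the entrypoint sources this script.
if [ -f /import/done ]; then
    echo "Skip import process"
    return
fi

# The import needs an empty target, so remove any existing database files.
echo "Start the database deletion process"
rm -rf /data/databases
rm -rf /data/transactions
echo "Complete the database deletion process"

echo "Start the data import process"
bin/neo4j-admin database import \
  --nodes=/import/movies.csv \
  --nodes=/import/actors.csv \
  --relationships=/import/roles.csv \
  neo4j
echo "Complete the data import process"

# Marker file checked at the top of this script on later starts.
touch /import/done

# The script runs as root, so hand the generated files back to the neo4j user.
echo "Start ownership change"
chown -R neo4j:neo4j /data
chown -R neo4j:neo4j /logs
echo "Complete ownership change"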
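
The script deletes the existing database before it reads any CSV file, so a missing file would leave the container with neither the old data nor the new. One possible guard, sketched below, checks the inputs first. The helper require_files is hypothetical and not part of the original script. How the image's entrypoint reacts to a non-zero status from the extension script is not covered here.

```shell
# require_files DIR FILE...
# Returns 0 only if every FILE exists in DIR and is non-empty; otherwise
# names the offending files on stderr and returns 1.
require_files() {
  local dir=$1 missing=0 name
  shift
  for name in "$@"; do
    if [ ! -s "$dir/$name" ]; then
      echo "Missing or empty: $dir/$name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Intended use inside import_data.sh, before the rm -rf lines:
#   require_files /import movies.csv actors.csv roles.csv || return 1
```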

Running Neo4j with Docker Compose

Placing Files

The following directory structure is used:

directory tree
(Neo4j directory)
├── docker-compose.yml
└── volumes
    └── neo4j
        ├── import
        │   ├── actors.csv
        │   ├── movies.csv
        │   └── roles.csv
        └── script
            └── import_data.sh
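
The layout above can be created with a few mkdir calls. A minimal sketch follows, with make_layout as an illustrative name. The data and logs directories are not shown in the tree, and Docker normally creates missing bind-mount directories itself, so creating them here is optional.

```shell
# make_layout [BASE]
# Creates the host directories that docker-compose.yml mounts, below BASE
# (default: the current directory). Existing directories are left untouched.
make_layout() {
  local base=${1:-.}
  mkdir -p "$base/volumes/neo4j/data" \
           "$base/volumes/neo4j/logs" \
           "$base/volumes/neo4j/import" \
           "$base/volumes/neo4j/script"
}

# Usage, from the directory that holds docker-compose.yml:
#   make_layout
# Then copy the three CSV files into volumes/neo4j/import and
# import_data.sh into volumes/neo4j/script.
```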

Starting Neo4j

Navigate to the directory containing docker-compose.yml and execute:

Terminal
docker-compose up -d
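
Because the import runs before Neo4j starts, the browser port may not answer immediately after the command returns. The progress messages echoed by the import script should appear in the output of docker-compose logs neo4j. For scripted setups, a small polling helper can wait for the HTTP port. The sketch below uses a hypothetical wait_until function, and the curl example assumes curl is installed on the host.

```shell
# wait_until TRIES DELAY COMMAND...
# Runs COMMAND until it succeeds, at most TRIES times, sleeping DELAY seconds
# after each failed attempt. Returns 0 on success, 1 if every attempt failed.
wait_until() {
  local tries=$1 delay=$2 i=0
  shift 2
  while [ "$i" -lt "$tries" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example: poll the mapped HTTP port for up to about a minute.
#   wait_until 30 2 curl -sf -o /dev/null http://localhost:57474
```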

Confirming Neo4j Startup

Access http://localhost:57474 and connect with:

Field                Value
Connect URL          neo4j://localhost:57687
Database             (left empty)
Authentication Type  Username / Password
Username             neo4j
Password             neo4j

Upon running a Cypher query, the imported data should be visible.
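
A quick sanity check is to compare the counts in the database with the number of data rows in the CSV files. The sample files contain 3 movies, 3 actors, and 9 roles, so 6 nodes and 9 relationships are expected. The sketch below counts the rows on the host. The data_rows helper is illustrative and assumes one record per line.

```shell
# data_rows FILE...
# Prints the total number of non-empty data rows (lines after each file's
# header) across the given CSV files. Assumes one record per line.
data_rows() {
  awk 'FNR > 1 && NF { n++ } END { print n + 0 }' "$@"
}

# With the files from this post, run inside volumes/neo4j/import:
#   data_rows movies.csv actors.csv   # expected node count
#   data_rows roles.csv               # expected relationship count
# Compare with the results of these Cypher queries in Neo4j Browser:
#   MATCH (n) RETURN count(n)
#   MATCH ()-[r]->() RETURN count(r)
```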

Conclusion

This post documented how to import CSV data into Neo4j on Docker Compose. The key issues encountered and their resolutions were:

  • Neo4j on Docker Compose requires data import before startup.
    • Resolved by executing import via EXTENSION_SCRIPT.
  • EXTENSION_SCRIPT runs with root privileges, causing ownership issues.
    • Resolved by changing ownership to neo4j after import.
