
Getting Started

ksrinivs64 edited this page Feb 4, 2015 · 32 revisions

Getting Started with PostgreSQL

Set up Quetzal with Docker on Linux (recommended)

  • Install docker using your package manager (instructions are available on the docker website). The following instructions were tested on Ubuntu 14.10.

  • Clone the code from the github repository.

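  • Change to the PostgreSQL docker directory (this assumes the repository was cloned into ~/git/quetzal): `cd ~/git/quetzal/com.ibm.research.quetzal.core/docker/postgresql`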

  • Build the base docker image for the project: `sudo docker build --no-cache --rm -t "ibmresearch/quetzal_postgres" .` This step builds a docker image with Ubuntu as the OS, pre-populated with the quetzal code as cloned from the git repository, and with the PostgreSQL server and client installed. The `--rm` option removes the intermediate containers after a successful build, `-t` sets the tag given to the image so we can re-use it later, and `--no-cache` tells docker to build from scratch without consulting cached layers left over from earlier builds.

  • Copy the nt file you want to load into the host directory /tmp/test: `mkdir -p /tmp/test && cp test.nt /tmp/test/test.nt`

  • `sudo docker run -i -t -v /tmp/test:/tmp/test ibmresearch/quetzal_postgres /bin/bash`. This step runs the docker container, logs you in as the postgres user, and puts you in a directory called /data. The directory /tmp/test on the host machine is mapped to `/tmp/test` inside the container.

  • `cp /tmp/test/test.nt /data` so the rest of the scripts can access the nt file to load.

  • `bash quetzal/com.ibm.research.quetzal.core/docker/postgresql/runLoadPostgres.sh`

You should see output like:

`INSERT INTO kb_TOPKSTATS(TYPE, GRAPH , CNT)select 'graph',gid,count(*) as COUNT from quetzal.kb_DS group by GID having count(*) > 100000 order by count(*) fetch first 5000 rows only INSERT INTO kb_TOPKSTATS(TYPE , CNT)VALUES('nr_triples',5)`
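One quick sanity check is to compare the `nr_triples` value in this output with the number of triples in the input file. The sketch below counts lines that are neither blank nor comments. It assumes one triple per line, as in the N-Triples format, and it assumes that `nr_triples` reports the number of loaded triples, which the name suggests but this page does not confirm. `count_triples` is a hypothetical helper, not part of Quetzal.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count candidate triples in an N-Triples file.
# A line counts as a triple if it is neither blank nor a '#' comment.
count_triples() {
  # -v inverts the match, -c prints the number of selected lines,
  # -E enables extended regular expressions.
  # grep exits non-zero when the count is 0, so ignore its exit status.
  grep -cvE '^[[:space:]]*(#.*)?$' "$1" || true
}

# When a file name is given on the command line, print its count.
if [ "$#" -gt 0 ]; then
  count_triples "$1"
fi
```

For example, save this as a script and pass it `test.nt` before copying the file. If the loader collapses duplicate triples, the two numbers may legitimately differ.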

  • Your dataset is now loaded.
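The host-side steps above (build the image, stage the nt file, start the container) can be collected into one script. This is a minimal sketch rather than anything shipped with Quetzal: the function names and the `DRY_RUN` and `HOST_DIR` variables are hypothetical, and it assumes /tmp/test is a directory on the host. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch of the host-side setup steps. Names below are illustrative only.
# By default the commands are printed, not executed; set DRY_RUN=0 to run them.

IMAGE="ibmresearch/quetzal_postgres"   # image tag used in the build step
HOST_DIR="${HOST_DIR:-/tmp/test}"      # host directory mounted into the container
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Build the image. Call this from the directory that contains the Dockerfile.
build_image() {
  run sudo docker build --no-cache --rm -t "$IMAGE" .
}

# Copy the N-Triples file given as $1 into the directory that will be mounted.
stage_file() {
  run mkdir -p "$HOST_DIR"
  run cp "$1" "$HOST_DIR/"
}

# Start an interactive container with HOST_DIR mounted at the same path.
start_container() {
  run sudo docker run -i -t -v "$HOST_DIR:$HOST_DIR" "$IMAGE" /bin/bash
}

# When an nt file is given on the command line, go through all three steps.
if [ "$#" -gt 0 ]; then
  build_image
  stage_file "$1"
  start_container
fi
```

Running the script with `test.nt` as its argument from the docker/postgresql directory prints the three commands for review. The steps inside the container (copying the file to /data and running runLoadPostgres.sh) are still done by hand at the container prompt.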