WIP - 13 dockerize #14

Open
wants to merge 6 commits into
base: master
12 changes: 12 additions & 0 deletions Dockerfile
@@ -0,0 +1,12 @@
FROM python:3-alpine

WORKDIR /usr/src/app

COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt

COPY data data/
COPY *.py *.py.sample ./
RUN mv dvconfig.py.sample dvconfig.py

CMD [ "python", "./create_sample_data.py" ]
59 changes: 59 additions & 0 deletions Jenkinsfile
@@ -0,0 +1,59 @@
void setBuildStatus(String message, String state) {
    step([
        $class: "GitHubCommitStatusSetter",
        reposSource: [$class: "ManuallyEnteredRepositorySource", url: "${env.GIT_URL}"],
        contextSource: [$class: "ManuallyEnteredCommitContextSource", context: "ci/docker/dataverse-sample-data"],
        errorHandlers: [[$class: "ChangingBuildStatusErrorHandler", result: "UNSTABLE"]],
        statusResultSource: [ $class: "ConditionalStatusResultSource", results: [[$class: "AnyBuildResult", message: message, state: state]] ]
    ]);
}

pipeline {
    agent any
    environment {
        DOCKER_IMAGE_NAME = "iqss/dataverse-sample-data"
        DOCKER_IMAGE_TAG = "build-${env.BRANCH_NAME}"
        DOCKER_WORKDIR = "."
        DOCKER_HUB_CRED = "dockerhub-dataversebot"
        DOCKER_REGISTRY = "https://registry.hub.docker.com"
    }
    stages {
        stage('build') {
            when {
                anyOf {
                    branch 'master'
                    branch 'PR-14'
                }
            }
            steps {
                script {
                    docker_image = docker.build("${env.DOCKER_IMAGE_NAME}:${env.DOCKER_IMAGE_TAG}", "--pull ${env.DOCKER_WORKDIR}")
                }
            }
        }
        stage('push') {
            when {
                anyOf {
                    branch 'master'
                    branch 'PR-14'
                }
            }
            steps {
                script {
                    // Push master image to latest tag
                    docker.withRegistry("${env.DOCKER_REGISTRY}", "${env.DOCKER_HUB_CRED}") {
                        docker_image.push("latest")
                    }
                }
            }
        }
    }
    post {
        success {
            setBuildStatus("Image build and push succeeded", "SUCCESS");
        }
        failure {
            setBuildStatus("Image build or push failed", "FAILURE");
        }
    }
}
30 changes: 30 additions & 0 deletions README.md
@@ -47,6 +47,36 @@ All of the steps above can be automated on a fresh installation of Dataverse on

For more information on spinning up Dataverse on AWS (especially if you don't have the `aws` executable installed), see http://guides.dataverse.org/en/latest/developers/deployment.html

## Usage in automated processes without API key
Sometimes you don't want to retrieve an API key manually, like at demo times
or when you do automatic deployments of sample data.

For these cases, a script `get_api_token.py` has been added for your convenience.

You can use it as follows (just an example):
```
API_TOKEN=`python get_api_token.py` python create_sample_data.py
```

The script understands two environment variables in addition to those from
`dvconfig.py`:
* `DATAVERSE_USER`
  * Username of the user whose API token we want to retrieve
  * Defaults to `dataverseAdmin`
* `DATAVERSE_PASSWORD`
  * Either a cleartext password or a path to a file containing the cleartext
    password for the user
  * Defaults to `admin1` (usable for dataverse-ansible and dataverse-kubernetes)

On successful retrieval, the script prints the API token to standard output.
Errors are written to standard error, so you will see them at the top of
a failed attempt to load the data.

Please be aware that you will have to enable the
[:AllowApiTokenLookupViaApi](http://guides.dataverse.org/en/latest/installation/config.html#allowapitokenlookupviaapi)
configuration option in your Dataverse to use this script. You can disable it
again after deploying the data, no harm done.
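
When the deployment is driven from Python rather than a shell, the one-liner shown earlier in this section could be sketched as follows. This is a minimal illustration and not part of the PR: the helper name `run_with_api_token` is hypothetical, and the sketch only assumes that the first command prints the token to standard output and that the second command reads `API_TOKEN` from its environment.

```python
import os
import subprocess
import sys


def run_with_api_token(token_cmd=None, load_cmd=None):
    """Run token_cmd, then load_cmd with API_TOKEN set to token_cmd's output.

    Hypothetical helper for illustration only; returns the
    subprocess.CompletedProcess of load_cmd (its stdout is captured).
    """
    token_cmd = token_cmd or [sys.executable, "get_api_token.py"]
    load_cmd = load_cmd or [sys.executable, "create_sample_data.py"]

    # check=True raises CalledProcessError when the lookup exits non-zero,
    # so the loader is never started with an empty token. stderr is not
    # captured, so the lookup's error messages stay visible.
    lookup = subprocess.run(token_cmd, check=True, stdout=subprocess.PIPE, text=True)
    token = lookup.stdout.strip()

    # Copy the current environment and add the token for the child process.
    env = dict(os.environ, API_TOKEN=token)
    return subprocess.run(load_cmd, env=env, stdout=subprocess.PIPE, text=True)
```

Called without arguments, the helper runs `get_api_token.py` and then `create_sample_data.py` with the current interpreter. Unlike the shell one-liner, it stops before loading any data if the token lookup fails.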

## Contributing

We love contributors! Please see our [Contributing Guide][] for ways you can help.
54 changes: 32 additions & 22 deletions dvconfig.py.sample
@@ -1,25 +1,35 @@
base_url = 'http://localhost:8080'
api_token = ''
import os

# Create the base_url from different parts (very useful on K8s) or just
# read it completely from a single env var. Defaults to "http://localhost:8080".
host = os.getenv('DATAVERSE_SERVICE_HOST', 'localhost')
port = os.getenv('DATAVERSE_SERVICE_PORT_HTTP', '8080')
proto = os.getenv('DATAVERSE_SERVICE_PORT_PROTO', 'http')
subpath = os.getenv('DATAVERSE_SERVICE_SUBPATH', '')
base_url = os.getenv('BASE_URL', proto+'://'+host+':'+port+subpath)

api_token = os.getenv('API_TOKEN', '')

Review comment (Author): Yes it will! But Ansible allows for a regex, so maybe we can change that one?


# sample data will be created in the following order
sample_data = [
'data/dataverses/open-source-at-harvard/open-source-at-harvard.json',
'data/dataverses/open-source-at-harvard/dataverses/dataverse-project/dataverse-project.json',
'data/dataverses/open-source-at-harvard/dataverses/dataverse-project/datasets/dataverse-irc-metrics/dataverse-irc-metrics.json',
'data/dataverses/ecastro/ecastro.json',
'data/dataverses/ecastro/datasets/this-is-my-test-dataset/this-is-my-test-dataset.json',
'data/dataverses/manchester/manchester.json',
'data/dataverses/manchester/datasets/test-dataset/test-dataset.json',
'data/dataverses/HCPDS/HCPDS.json',
'data/dataverses/HCPDS/datasets/reproductive-health-laws-around-the-world/reproductive-health-laws-around-the-world.json',
'data/dataverses/cms/cms.json',
'data/dataverses/cms/datasets/cmssampledata/cmssampledata.json',
'data/dataverses/scholcommlab/scholcommlab.json',
'data/dataverses/scholcommlab/datasets/diabeticconnect/diabeticconnect.json',
'data/dataverses/ubiquity-press/ubiquity-press.json',
'data/dataverses/ubiquity-press/dataverses/jopd/jopd.json',
'data/dataverses/ubiquity-press/dataverses/jopd/datasets/flynn-effect-in-estonia/flynn-effect-in-estonia.json',
'data/dataverses/ubiquity-press/dataverses/jopd/datasets/bafacalo/bafacalo.json',
'data/dataverses/open-source-at-harvard/datasets/open-source-at-harvard/open-source-at-harvard.json',
'data/dataverses/king/king.json',
'data/dataverses/king/datasets/cause-of-death/cause-of-death.json',
]
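
To make the precedence of the environment variables in `dvconfig.py.sample` easier to see, here is a small sketch of the same logic as a function. The name `build_base_url` and the `env` parameter are assumptions for illustration; the PR itself reads `os.getenv` at module level.

```python
import os


def build_base_url(env=os.environ):
    """Assemble the Dataverse base URL the way dvconfig.py.sample does."""
    host = env.get("DATAVERSE_SERVICE_HOST", "localhost")
    port = env.get("DATAVERSE_SERVICE_PORT_HTTP", "8080")
    proto = env.get("DATAVERSE_SERVICE_PORT_PROTO", "http")
    subpath = env.get("DATAVERSE_SERVICE_SUBPATH", "")
    # BASE_URL, when set, overrides the URL assembled from the parts.
    return env.get("BASE_URL", f"{proto}://{host}:{port}{subpath}")


# No variables set: falls back to the documented default.
print(build_base_url({}))  # http://localhost:8080
# Individual parts set (hypothetical values).
print(build_base_url({"DATAVERSE_SERVICE_HOST": "dataverse",
                      "DATAVERSE_SERVICE_SUBPATH": "/dv"}))  # http://dataverse:8080/dv
# BASE_URL wins over all parts.
print(build_base_url({"BASE_URL": "https://demo.example.org"}))  # https://demo.example.org
```

Passing the environment as a mapping keeps the function testable without touching the real process environment.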
33 changes: 26 additions & 7 deletions get_api_token.py
@@ -1,14 +1,33 @@
from pyDataverse.api import Api
import json
import dvconfig
from pathlib import Path
import os
import sys

base_url = dvconfig.base_url
api_token = dvconfig.api_token
api = Api(base_url, api_token)
username = 'dataverseAdmin'
password = 'admin1'
api = Api(base_url, '')

username = os.getenv('DATAVERSE_USER', 'dataverseAdmin')
password = os.getenv('DATAVERSE_PASSWORD', 'admin1')
# On K8s or with Docker we should get secrets from files, not env vars
if Path(password).is_file():
    # Read the secret from the file and drop the trailing newline
    with open(password, 'r') as f:
        password = f.read().strip()

endpoint = '/builtin-users/' + username + '/api-token'
params = {}
params['password'] = password
resp = api.get_request(endpoint, params=params, auth=True)
api_token = resp.json()['data']['message']
print(api_token)

resp = api.get_request(endpoint, params=params, auth=False)
if resp.json()['status'] == "OK":
    api_token = resp.json()['data']['message']
    print(api_token)
    sys.exit(0)
else:
    print("ERROR receiving API token:", file=sys.stderr)
    print(resp.json(), file=sys.stderr)
    print("Did you enable :AllowApiTokenLookupViaApi configuration option?", file=sys.stderr)
    print("See http://guides.dataverse.org/en/latest/installation/config.html#allowapitokenlookupviaapi", file=sys.stderr)
    sys.exit(1)
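
The "password or path to a password file" handling in `get_api_token.py` can be isolated into a small function. The following is a sketch under that assumption; `resolve_secret` is a hypothetical name and does not appear in the PR.

```python
import tempfile
from pathlib import Path


def resolve_secret(value):
    """Return the stripped content of the file at `value`, or `value` itself.

    Mirrors the check in get_api_token.py: if the string names an existing
    file, it is treated as a secret file; otherwise it is used literally.
    """
    path = Path(value)
    if path.is_file():
        return path.read_text().strip()
    return value


with tempfile.TemporaryDirectory() as tmp:
    secret_file = Path(tmp) / "password"
    secret_file.write_text("s3cret\n")
    # Path to an existing file: the file content is used, newline stripped.
    print(resolve_secret(str(secret_file)))
    # Not a path to an existing file: the value is used as the password.
    print(resolve_secret("no-such-file-for-this-demo"))
```

One consequence of this design is worth keeping in mind: a literal password that happens to match the name of an existing file relative to the working directory would be read as a file.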