ElasticSearch Indexing #2

Open: wants to merge 11 commits into base: master
2 changes: 1 addition & 1 deletion .env
@@ -1,4 +1,4 @@
COMPOSE_PROJECT_NAME=metacpan
PLACK_ENV=development
PGDB=pgdb:5432
-API_SERVER=morbo -l http://*:5000 -w app.psgi -w bin -w lib -w templates --verbose
+API_SERVER="morbo -l http://*:5000 -w app.psgi -w bin -w lib -w templates --verbose"
118 changes: 118 additions & 0 deletions .github/workflows/indexing.yml
@@ -0,0 +1,118 @@
# This is a basic workflow to help you get started with Actions

name: Build ElasticSearch Index

# Controls when the workflow will run
on:
# Triggers the workflow on pull request events, but only for the master branch
pull_request:
branches: [ master ]
# Allows you to run this workflow manually from the Actions tab
workflow_dispatch:
# workflow_dispatch:
# branches:
# - master

# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
# This workflow contains a single job called "build-index"
build-index:
# The type of runner that the job will run on
runs-on: ubuntu-20.04

# env:
# Environment Variables for the Indexation Script
# ES_TEST: "elasticsearch:9200"
# HARNESS_ACTIVE: 0
# TEST_VERBOSE: 0

# Steps represent a sequence of tasks that will be executed as part of the job
steps:
# Check that the Docker Engine is running
# (to start it instead: sudo systemctl start docker)
- name: Check Docker Engine Status
run: |
sudo systemctl status docker -l

# Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it
- uses: actions/checkout@v2

- name: Listing Directory Contents
run: |
echo 'User:' $(whoami)
echo 'Working Directory:' $(pwd)
echo 'Directory Content:'
ls -lah

- name: Clone MetaCPAN API Project
run: |
mkdir -p src ; cd src
git clone https://github.com/bodo-hugo-barwich/metacpan-api.git metacpan-api
cd metacpan-api
git fetch && git checkout no-47_wrong-index && git pull && git log -1 | sed -re 's/@/ at /' | tr -s '<>' "'"
cd ../../

# Build Container Images
# sudo chmod a+w cache log perl5
- name: Build Container Images with 'docker-compose'
run: |
mkdir -p cpan src ; chmod a+w cpan -R ; cd src
git clone https://github.com/metacpan/metacpan-web.git metacpan-web
git clone https://github.com/metacpan/metacpan-grep-front-end.git metacpan-grep-front-end
git clone https://github.com/metacpan/metacpan-cpan-extracted-lite.git metacpan-cpan-extracted-lite
ln -s metacpan-cpan-extracted-lite metacpan-cpan-extracted
cd ../
docker volume create --driver local --opt device=:$(pwd)/src/metacpan-cpan-extracted --opt o=bind --opt type=none metacpan_git_shared
echo "Docker: Volumes listing ..."
docker volume ls
docker volume inspect metacpan_git_shared
echo "Docker: Images building ..."
docker-compose up --build --no-start traefik
docker-compose up --build --no-start elasticsearch
docker-compose up --build --no-start pgdb
docker-compose up --build --no-start api
echo "Docker: Images listing ..."
docker image ls
docker image inspect metacpan/metacpan-api
echo "Docker: Volumes listing ..."
docker volume ls
docker volume inspect metacpan_elasticsearch
echo "Docker: Service 'api' starting ..."
docker-compose up -d api

# Check Container Health: exit the test when any container has failed
- name: Check Container Health
run: |
constat=`docker-compose ps 2>&1`
echo -e "Container Status Complete:\n$constat"
constat=`echo "$constat" | sed -re 's/[[:space:]][[:space:]]+/|/g' | cut -d"|" -f1,3 | grep "|"`
echo -e "Container Status Up:\n$constat"
echo "Component 'api' Logs:"
docker-compose logs api
confailed=`echo "$constat" | grep -i "exit" | wc -l`
if [ $confailed -ne 0 ]; then exit 1 ; fi
echo "Component 'elasticsearch' Logs:"
docker-compose logs elasticsearch

# Test the Elastic Search Indexation
- name: Run the Elastic Search Indexation
run: |
echo "Indexation: starting ..."
docker-compose exec -T api index-cpan.sh
elasticsearchport=`docker-compose ps 2>&1 | grep -i elasticsearch | grep -ioE ":[0-9]+\->" | grep -ioE "[0-9]+" | sed -n 1p`
export ES_PORT=$elasticsearchport
echo "ElasticSearch Port: '$ES_PORT'"
echo "ElasticSearch Version:"
curl -v "localhost:$ES_PORT" 2>&1
echo "Indexation: Indices showing ..."
docker-compose exec -T api curl -v 'elasticsearch:9200/_cat/indices'
echo "Indexation: Activity Log showing ..."
cat src/metacpan-api/var/log/metacpan.log
doccount=`curl "localhost:$ES_PORT/_cat/indices" 2>&1 | grep -i "cpan_v1_01" | awk '{print $6}'`
alias=`curl "localhost:$ES_PORT/_cat/aliases" 2>&1 | grep -i "cpan_v1_01" | cut -d" " -f1,2`
if [ -z "$doccount" ]; then doccount=0 ; fi
if [ -z "$alias" ]; then alias="none" ; fi
echo "Indexation: Document Count '$doccount'"
echo "Indexation: Alias '$alias'"
if [ $doccount -eq 0 ]; then exit 1 ; fi
if [ "$alias" != "cpan cpan_v1_01" ]; then exit 1 ; fi
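Review note: the document-count check above reads the sixth whitespace-separated column of `_cat/indices`, and that column layout can differ between Elasticsearch versions. A minimal sketch of a less position-dependent variant follows, assuming the cat API's `h` parameter is used to request only the `index` and `docs.count` columns. The helper name `doc_count_for_index` is hypothetical and not part of this pull request.

```shell
#!/bin/sh
# Hypothetical helper: print the docs.count for one index.
# Reads lines of the form "<index> <docs.count>" on stdin
# and prints 0 when the index is not listed.
doc_count_for_index() {
    awk -v idx="$1" '$1 == idx { count = $2 } END { print count + 0 }'
}

# Possible use inside the workflow step (ES_PORT as exported there):
#   doccount=$(curl -s "localhost:$ES_PORT/_cat/indices?h=index,docs.count" \
#       | doc_count_for_index cpan_v1_01)
```

Because the helper always prints a number, the separate empty-string fallback for `doccount` would no longer be needed with this approach.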
44 changes: 43 additions & 1 deletion bin/index-cpan.sh
@@ -1,10 +1,52 @@
#!/bin/sh

-./bin/run bin/metacpan mapping --delete
MODULE=$(basename "$0")

NOW=$(date +"%F %T")
echo "${NOW} I ${MODULE}: Indices re-creating ..."

#sdeletelog=`echo "" 2>&1`
./bin/run bin/metacpan mapping --delete "$@"
ideleters=$?

NOW=$(date +"%F %T")
echo "${NOW} I ${MODULE}: Re-creation finished with [$ideleters]"
#echo "${NOW} I ${MODULE}: Re-creation Log:\n'$sdeletelog'"

if [ $ideleters -ne 0 ]; then
echo "${NOW} E ${MODULE}: Re-creation failed!"

exit $ideleters
fi

NOW=$(date +"%F %T")
echo "${NOW} I ${MODULE}: ElasticSearch - Info collecting ..."

sinfolog=`./bin/run bin/metacpan mapping --show_cluster_info "$@" 2>&1`
iinfors=$?

NOW=$(date +"%F %T")
echo "${NOW} I ${MODULE}: Info finished with [$iinfors]"
echo "${NOW} I ${MODULE}: Info Log:\n'$sinfolog'"

if [ $iinfors -ne 0 ]; then
echo "${NOW} E ${MODULE}: ElasticSearch unavailable!"

exit $iinfors
fi

NOW=$(date +"%F %T")
echo "${NOW} I ${MODULE}: Packages downloading ..."

/bin/partial-cpan-mirror.sh

NOW=$(date +"%F %T")
echo "${NOW} I ${MODULE}: Indices rebuilding ..."

./bin/run bin/metacpan release /CPAN/authors/id/
./bin/run bin/metacpan latest
./bin/run bin/metacpan author
./bin/run bin/metacpan permission

NOW=$(date +"%F %T")
echo "${NOW} I ${MODULE}: done."
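Review note: the script repeats the `NOW=$(date ...)` and `echo` pair before every message. One possible way to reduce the repetition, as a sketch only, is a small `log` function that stamps each message at the moment it is printed. The function name and its argument order are assumptions for illustration and are not part of this change.

```shell
#!/bin/sh
# Script name without its directory part (same role as MODULE above).
MODULE=${0##*/}

# Hypothetical helper: log LEVEL MESSAGE...
# Prints "<date> <time> <LEVEL> <module>: <message>" with a fresh timestamp.
log() {
    level=$1
    shift
    printf '%s %s %s: %s\n' "$(date +'%F %T')" "$level" "$MODULE" "$*"
}

# Example calls, mirroring messages from the script above:
#   log I "Indices re-creating ..."
#   log E "Re-creation failed!"
```

Using `printf` instead of `echo` also keeps the handling of backslash sequences such as `\n` consistent across different `/bin/sh` implementations.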
39 changes: 0 additions & 39 deletions configs/metacpan-web/metacpan_web.conf

This file was deleted.

8 changes: 8 additions & 0 deletions configs/metacpan-web/metacpan_web_local.conf
@@ -0,0 +1,8 @@
api = http://api:5000
api_public = http://localhost:5000
source_host = http://localhost:5000
web_host = http://localhost:5001

<View::Xslate>
cache_dir = /var/tmp/templates
</View::Xslate>
8 changes: 5 additions & 3 deletions docker-compose.yml
@@ -72,8 +72,8 @@ services:
source: web_carton
target: /carton
- type: bind
-source: ./configs/metacpan-web/metacpan_web.conf
-target: /metacpan-web/metacpan_web.conf
+source: ./configs/metacpan-web/metacpan_web_local.conf
+target: /metacpan-web/metacpan_web_local.conf
read_only: true
- type: bind
source: ./src/metacpan-web
@@ -366,6 +366,8 @@ services:
PG_TAG: "${PG_VERSION_TAG:-9.6-alpine}"
environment:
POSTGRES_PASSWORD: metacpan
POSTGRES_USERNAME: metacpan123
POSTGRES_DB: metacpan
networks:
- database
healthcheck:
@@ -396,7 +398,7 @@ services:
#

mongodb:
-image: mongo:latest
+image: mongo:4.4.9
networks:
- mongo
healthcheck:
2 changes: 1 addition & 1 deletion grep.env
@@ -1,4 +1,4 @@
# grep service - development environment

GREP_SITE_PORT=3001
-GREP_PLACKUP_SERVER_ARGS=-E development -R lib,bin
+GREP_PLACKUP_SERVER_ARGS="-E development -R lib,bin"