depahelix2021/es-hack-1000

es-hack-1000

Create an Elasticsearch cluster in Azure, host MongoDB, get data from Traackr, do something cool with maps. Scale up.

## Setup servers

- Create a GitHub repo.
- Go to the Azure portal and create some nodes named cm-es-9200.cloudapp.net through cm-es-9204.cloudapp.net, so that each node name matches the port it is going to use.
- Set up MongoDB on the first node.
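The node naming scheme above can be sketched as a short Python helper. This is only an illustration: the function name `es_nodes` is hypothetical, and the code assumes that each node listens on the port embedded in its hostname, as the setup notes suggest.

```python
# Port range taken from the node names in the setup notes (9200 through 9204).
FIRST_PORT = 9200
LAST_PORT = 9204


def es_nodes(first_port=FIRST_PORT, last_port=LAST_PORT):
    """Return (hostname, port) pairs, one per node.

    Assumption: each node uses the port that appears in its hostname.
    """
    return [
        (f"cm-es-{port}.cloudapp.net", port)
        for port in range(first_port, last_port + 1)
    ]


if __name__ == "__main__":
    # Print the host:port list, e.g. to paste into a client configuration.
    for host, port in es_nodes():
        print(f"{host}:{port}")
```

Keeping the port in the hostname means the node list can be regenerated from two numbers when the cluster is scaled up.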

## MongoDB

- Follow http://docs.mongodb.org/manual/tutorial/install-mongodb-on-red-hat-centos-or-fedora-linux/
- Create a /etc/yum.repos.d/mongodb.repo file to hold the following configuration information for the MongoDB repository:

[mongodb]
name=MongoDB Repository
baseurl=http://downloads-distro.mongodb.org/repo/redhat/os/x86_64/
gpgcheck=0
enabled=1

Then become root with "sudo su -" and run:

yum install mongo-10gen mongo-10gen-server
mongo cluster-7-data-00.sl.hackreduce.net:28953/traackr
> db.posts.find()
> db.influencers.find()

Install Mongo on Windows: just download the 64-bit version, unpack the zip file, and put the Mongo bin directory in your PATH.

mvn package
mvn assembly:assembly
cluster-7-data-00.sl.hackreduce.net:28953

cd target
java -jar hackday*

java -jar hackday-mongo-loader.jar -c posts -d traackr -m cluster-7-data-00.sl.hackreduce.net:28953

cd java/mondo-data
mvn package; mvn assembly:assembly; java -jar target/hackday-mongo-loader.jar -c influencers -d traackr -m cluster-7-data-00.sl.hackreduce.net:28953 -o 10

## Project Home

[here](https://github.com/depahelix/es-hack-1000)

## Goal 1

Run git clone https://github.com/hackreduce/elasticsearch-hackathon and build the project with Maven. See: [elasticsearch-hackathon](https://github.com/hackreduce/elasticsearch-hackathon)

## Goal 2

Index some data.

## Section 3

hack/reduce elasticsearch sept 2013

elasticsearch-hackathon

ElasticSearch Hackathon Material

Prerequisites

All attendees:

ElasticSearch Installation

It's recommended that you download and play with Elasticsearch locally if only to get familiar with the basic commands.

http://www.elasticsearch.org/guide/reference/setup/installation/

Available Datasets

Elasticsearch Data

Traackr Data

This data is loaded in MongoDB so that you can re-index it into ES in any way you find interesting:

  • Loaded on Mongo instance on cluster-7-data-00.sl.hackreduce.net
  • Mongo URI: mongodb://cluster-7-data-00.sl.hackreduce.net:28953
  • Database name: traackr
  • Two collections are available: posts and influencers
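As a rough sketch, the connection details listed above can be combined into the `host:port/database` form that the `mongo` shell accepts. The helper name `shell_target` is hypothetical; the URI and database name are the ones given in this list.

```python
from urllib.parse import urlsplit

# Values copied from the dataset description above.
MONGO_URI = "mongodb://cluster-7-data-00.sl.hackreduce.net:28953"
DATABASE = "traackr"


def shell_target(uri=MONGO_URI, database=DATABASE):
    """Turn a mongodb:// URI into the host:port/database form used by the mongo shell."""
    parts = urlsplit(uri)
    return f"{parts.hostname}:{parts.port}/{database}"


if __name__ == "__main__":
    # Prints the argument to pass to the mongo shell.
    print(shell_target())
```

Parsing the URI instead of hard-coding the host and port separately keeps a single source of truth if the instance moves.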

Useful Links

Elasticsearch Clients

Elasticsearch Plugin Examples

Indexing Data into Elasticsearch

Loading data from MongoDB

  • Two skeleton projects are available to get you up and running right away: Java or Python
  • Using the Java driver
  • Java Driver Examples Code
  • Using the Python driver
  • Python driver tutorial
  • How to connect to the Hack/Reduce MongoDB Shell via local client:
    • Install MongoDB in your local environment
      • Ubuntu / Debian: sudo apt-get update; sudo apt-get install mongodb
      • Fedora / RedHat: sudo yum install mongodb
      • Test if installed successfully: mongo --version
      • Connect to Mongo instance on cluster-7-data-00.sl.hackreduce.net: mongo cluster-7-data-00.sl.hackreduce.net:28953/traackr
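Once documents have been read from MongoDB, one way to re-index them is through the Elasticsearch bulk API, which takes newline-delimited JSON: an action line followed by the document itself. The sketch below only builds that request body from MongoDB-style dictionaries, using the standard library. The function name `bulk_payload`, the index name `traackr`, and the sample field names are assumptions for illustration, not part of the hackathon material. Elasticsearch releases from the time of this hackathon may also expect a `_type` in the action line, which is omitted here.

```python
import json


def bulk_payload(docs, index="traackr"):
    """Build a bulk-API body (newline-delimited JSON) from MongoDB-style documents.

    Assumption: every document carries Mongo's "_id" field.
    """
    lines = []
    for doc in docs:
        source = dict(doc)  # copy so the caller's document is not modified
        # Mongo's _id goes into the action line rather than the document body.
        doc_id = str(source.pop("_id"))
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        # default=str covers values json cannot encode directly, such as dates.
        lines.append(json.dumps(source, default=str))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n" if lines else ""


if __name__ == "__main__":
    # Hypothetical sample documents; real field names depend on the Traackr data.
    sample = [{"_id": 1, "title": "first post"}, {"_id": 2, "title": "second post"}]
    print(bulk_payload(sample), end="")
```

The resulting string would be sent as the body of a POST to the cluster's `_bulk` endpoint; batching many documents per request is usually much faster than indexing them one at a time.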
