
Need better way of specifying software versions #4

Open

tpiscitell (Collaborator) opened this issue Jul 2, 2015 · 1 comment

Keeping a hard-coded version number in common.sh is tedious because as new versions are released, old ones are purged from the Apache mirrors. It would be nice if this didn't break every time a new version of HBase/Storm/Hadoop/etc. was released.

Idea:

  1. If any version of $software exists in vagrant/resources/tmp, use that.
  2. If not, get the closest Apache mirror of $software and see what versions are available.
  3. Do some "fuzzy matching" to identify which version to use (e.g. version 0.98.* of HBase; a sketch follows this list).
  4. wget the identified version.
  5. Carry on...
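
A minimal sketch of the fuzzy matching in step 3 (Python 2, to match the script in the comment below; pick_version and the sample version list are hypothetical):

import fnmatch

def pick_version(available, pattern):
    # Keep only versions matching a shell-style pattern like '0.98.*'.
    matches = [v for v in available if fnmatch.fnmatch(v, pattern)]
    if not matches:
        return None
    # Compare numerically, so '0.98.12' beats '0.98.9'.
    numeric = lambda v: tuple(int(p) for p in v.split('.') if p.isdigit())
    return max(matches, key=numeric)

print pick_version(['0.94.27', '0.98.9', '0.98.12', '1.0.1'], '0.98.*')
# -> 0.98.12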

chokha commented Jul 6, 2015

Let me know if this works. It addresses steps 2 and 3: given a mirror URL, it walks the directory listing and prints a valid link to the first available binary it finds.

Usage example:
python list_apache.py http://www.webhostingreviewjam.com/mirror/apache/hbase
python list_apache.py http://www.webhostingreviewjam.com/mirror/apache/hive

import sys
import urllib
import re

# Match one line of an Apache directory listing:
# a link, a "dd-Mon-yyyy hh:mm" timestamp, and a size ('-' for a directory).
parse_re = re.compile(r'href="([^"]*)".*(..-...-.... ..:..).*?(\d+[^\s<]*|-)')

response_code = -1

def list_apache_dir(url):
    # Walk an Apache mirror listing recursively and print the first
    # *bin.tar.gz link that answers with HTTP 200.
    global response_code
    try:
        html = urllib.urlopen(url).read()
    except IOError as e:
        print 'error fetching %s: %s' % (url, e)
        return
    if not url.endswith('/'):
        url += '/'
    dirs = []
    for name, date, size in parse_re.findall(html):
        if name.endswith('/'):
            dirs.append(name)
            continue
        if name.endswith('bin.tar.gz'):
            print url + name
            response_code = urllib.urlopen(url + name).code
            if response_code == 200:  # found a live binary, stop searching
                return
    for subdir in dirs:
        print
        list_apache_dir(url + subdir)
        if response_code == 200:  # a deeper call already found one
            return

for url in sys.argv[1:]:
    print
    response_code = -1  # reset between command-line URLs
    list_apache_dir(url)
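
For step 2 (finding the closest mirror without hard-coding one), apache.org's closer.cgi service can be asked directly; a sketch in the same Python 2 style, assuming the as_json=1 parameter and the "preferred"/"path_info" fields it returned at the time (the /hbase/ path is just an example):

import json
import urllib

# Ask apache.org for the mirror closest to this client; with as_json=1
# the response is JSON rather than an HTML landing page.
info = json.load(urllib.urlopen(
    'https://www.apache.org/dyn/closer.cgi?path=/hbase/&as_json=1'))
print info['preferred'] + info['path_info']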
