Related to issue#335, Prevent concurrent malheur and support malheur … #343

SeanKim777 · 2016-11-02T01:59:39Z

Related to issue #335
Allow malheur running in 'cluster' and 'increment' mode depending on number of analysis reports collected.
Run malheur in 'increment' will reduce its execution time for newly created mist report
Also, prevent concurrent malheur running by use pid file configured in reporting.conf

…running mode between increment and cluster

mallorybobalice · 2016-11-02T09:24:11Z

hey, thanks very much for this. I'll try have a thorough read of the commit by the end of the week and to test it on our test box syncing a 2-3k samples to it. (while it's no subs for Brad and co revieweing it -who knows maybe I'll find something interesting bug wise - in a positive peer review way)

Is there any way we could also add a force malheur analysis button in the summary dashboard page or a helper script just in case we want to force it? ( guess I could always vary the threshold )

Is there anything i need to know about the pull request before i try to merge and configure it on the test system?

mallorybobalice · 2016-11-02T10:43:22Z

(e.g. wipe existing malheur report or output folders or leave as is?)

spender-sandbox · 2016-11-02T11:28:37Z

I'm not sure about this change. For starters, wouldn't a simple multiprocessing Lock make more sense than maintaining a pid file?

Also, if I remember right, when I was initially testing with malheur, increment mode didn't give me good results (but I was testing at the time with a small number of samples). Have you run a comparison to see if the increment mode with 10k samples or so in the db will show a similarity with a recent sample that also had increment mode run on it?

-Brad

SeanKim777 · 2016-11-03T01:33:26Z

@mallorybobalice, not sure whether we need force malheur scan under my change because all the analsysis will have its own mist data generated and increment mode scanned if there are more than 10K analysis(default value for threshold) mist data existing.

@spender-sandbox I just found multiprocessing Lock usage on other module after I have committed changes...my bad, will change if I have spare time shortly...but its functional :)
I have compared with increment VS cluster mode then the result rendered on Django web UI was same. Basically cluster id eg. C000-0123 in malheur.txt was same. The only thing required for increment mode was merge increment mode output file to malheur.txt(previous scan result) (commented on source)
Average scan time with increment mode for my dataset (now 20K analyses) is about 20~30 seconds but as @rieck mentioned on #335 it is expected to take similar amount of time for any new analysis mist data which is a lot better than cluster mode

…ning and minor typos

spender-sandbox · 2016-11-03T17:25:30Z

That cleaned it up nicely -- I am still a bit concerned about this though, as this lock should be highly contentious. If someone's running 100 instances of process.py fully loaded, and to complete processing the lock must be held, and one instance holds it for 20-30 seconds at a time, then 100 instances all at the malheur stage will take close to an hour to complete. That seems like a showstopper to me.

-Brad

SeanKim777 · 2016-11-03T23:36:28Z

@spender-sandbox You are right. I didn't thought about that many instances may running because I only using single instance. but tested with 3 instances though

So it would better to register Malheur Application As a Scheduled Service(use cron or ??) to be executed every configured interval with option to scan directory which will hold bunch of mist data created by malheur.py module during that interval

Current malheur.py module need to be changed to just put newly created mist data into a directory which will be used by MAASS. Directory name should be changed dynamically for each running of MAASS.

Also, I don't think malheur in Cuckoo need to be run in a 'cluster' mode and this may cause more trouble in the future. but need to think about compatibility with older version of cuckoo

I am look forward to your comment about this and please give me what kind of tool(?) for MAASS would be good? cron? or anything else I don't know?

mallorybobalice · 2016-11-13T10:09:38Z

@spender-sandbox - Brad, is there any way you could please help Sean fix this for process.py compatibility?

I guess you could say retention.py should be a cron job (yet it's a reporting module), so I don't quite see why malheur.py in batch mode can't be a reporting module ala - 'trigger , put a bookmark type thing in, don't lock up any processing jobs- anything after comes in the next run ' ?
thoughts?

mallorybobalice · 2016-11-21T08:43:12Z

@spender-sandbox final bump :( ?

Related to issue#335, Prevent concurrent malheur and support malheur …

7c814ec

…running mode between increment and cluster

Changed PID to multiprocessing.Lock to prevent concurrent malheur run…

aa78f4a

…ning and minor typos

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Related to issue#335, Prevent concurrent malheur and support malheur … #343

Related to issue#335, Prevent concurrent malheur and support malheur … #343

SeanKim777 commented Nov 2, 2016

mallorybobalice commented Nov 2, 2016

mallorybobalice commented Nov 2, 2016

spender-sandbox commented Nov 2, 2016

SeanKim777 commented Nov 3, 2016

spender-sandbox commented Nov 3, 2016

SeanKim777 commented Nov 3, 2016 •

edited

Loading

mallorybobalice commented Nov 13, 2016 •

edited

Loading

mallorybobalice commented Nov 21, 2016

Related to issue#335, Prevent concurrent malheur and support malheur … #343

Are you sure you want to change the base?

Related to issue#335, Prevent concurrent malheur and support malheur … #343

Conversation

SeanKim777 commented Nov 2, 2016

mallorybobalice commented Nov 2, 2016

mallorybobalice commented Nov 2, 2016

spender-sandbox commented Nov 2, 2016

SeanKim777 commented Nov 3, 2016

spender-sandbox commented Nov 3, 2016

SeanKim777 commented Nov 3, 2016 • edited Loading

mallorybobalice commented Nov 13, 2016 • edited Loading

mallorybobalice commented Nov 21, 2016

SeanKim777 commented Nov 3, 2016 •

edited

Loading

mallorybobalice commented Nov 13, 2016 •

edited

Loading