proposal on the rules #8

Open
ruppde opened this issue Nov 5, 2024 · 5 comments
Labels
enhancement New feature or request

Comments

@ruppde
Contributor

ruppde commented Nov 5, 2024

hi,

did a test drive with your yara rules, and while they find malware and nasty things, they produce too many false positives to be usable. The ReactOS live CD has 144 hits, the /usr/sbin of debian has 125 hits. So scanning a complete hard drive of a system infected with maybe 3 malware files would produce something like 10,000 false positives. there's just no way to find those needles in the haystack.

proposals to bring that number down:

  • scan some goodware and remove the strings which just hit too often
  • for rules with many strings, it might be ok to switch the yara condition from "any of them" to "2 of them"
  • for rules with few strings, consider merging several rules with a similar topic into one and also requiring 2 of them. for example, there is string18_cat_greyware_tool_keyword: cat /etc/passwd, which will also be in lots of legitimate scripts. but if string31_net_greyware_tool_keyword: net localgroup admin is also in the same file, that's rather unusual.
  • add a condition of filesize < 10MB to avoid matching on huge legitimate files which simply contain a great many strings
  • for windows binary hacktools, add a condition of "uint16(0) == 0x5a4d" to match only binaries. otherwise the rules will also match on e.g. emails and the browser cache of pentesters, which merely mention a tool.
    (linux ELF is uint16(0) == 0x457f; a combined check for linux ELF and macos Mach-O binaries is ( uint32be(0) == 0x7f454c46 or uint16(0) == 0xfeca or uint16(0) == 0xfacf or uint32(0) == 0xbebafeca ))
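
A minimal sketch of how the proposals above could look in practice. It is not taken from the project's ruleset: the rule names and the `hypothetical_*` strings are invented for illustration, and only the two greyware keywords come from the example above.

```yara
// Sketch only: rule names and placeholder strings are assumptions.

// Merged greyware keywords: one hit alone is common in legitimate scripts,
// both in the same small file is unusual.
rule example_greyware_recon_combo
{
    strings:
        $cat_passwd     = "cat /etc/passwd"
        $net_localgroup = "net localgroup admin" nocase
    condition:
        filesize < 10MB
        and 2 of them
}

// Windows hacktool: only match files that start with the "MZ" header,
// so emails or browser cache that merely mention the tool are skipped.
rule example_windows_hacktool_strict
{
    strings:
        $s1 = "hypothetical_tool_name" ascii wide nocase
        $s2 = "hypothetical_tool_banner" ascii wide nocase
        $s3 = "hypothetical_tool_flag" ascii wide nocase
    condition:
        uint16(0) == 0x5a4d
        and filesize < 10MB
        and 2 of them
}
```

With these conditions, a shell script that only contains `cat /etc/passwd` no longer produces a hit, and neither does a text file that mentions two of the tool strings but is not a PE binary. For a linux tool the header check would be `uint16(0) == 0x457f` instead.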

some repos for testing:

malware:
https://github.com/Flangvik/SharpCollection
https://github.com/tennc/webshell

goodware:
ReactOS LiveCD: https://reactos.org/download/
any linux live DVD

sorry, that's a bunch of work, but I think it's really needed to make this project usable.

best regards
arnim

@mthcht
Owner

mthcht commented Nov 6, 2024

Hi Arnim,

Appreciate the feedback and all the detailed suggestions. I usually don’t run these YARA rules on entire drives; I target directories of collected artifacts and logs and automatically ingest the JSON results into Splunk for analysis. I know it picks up a lot (even strings from Notepad...), but broad coverage followed by extensive triage was my intention. That said, you're not the first to mention this, so I’ll set up a dedicated directory or another branch this week with all your modifications to make it more usable!

Thanks again for the input 🙏

Best regards,

Update: After testing across full disks on multiple operating systems, I identified several bad keywords that shouldn’t be present. I'm implementing adjustments.

@mthcht mthcht self-assigned this Nov 12, 2024
@mthcht mthcht added the enhancement New feature or request label Nov 12, 2024
@ruppde
Contributor Author

ruppde commented Nov 12, 2024

thanks. you could easily create two different rulesets: one for hunting, with more false positives, and a stricter one that might be usable in e.g. yara-forge

@mthcht
Owner

mthcht commented Nov 17, 2024

Alright, a new ruleset is available here: https://github.com/mthcht/ThreatHunting-Keywords-yara-rules/tree/main/yara_rules_binaries_strict @ruppde I would love to hear your feedback on this one

@mthcht mthcht assigned ruppde and unassigned mthcht Nov 18, 2024
@ruppde
Contributor Author

ruppde commented Nov 18, 2024

cool, lots better. you could still fix these 3 warnings that yara prints at startup:

warning: rule "rule_cobaltstrike_offensive_tool_keyword" in all.yara(90424): rule is slowing down scanning
warning: rule "rule_DynastyPersist_offensive_tool_keyword" in all.yara(104429): string "$string19_DynastyPersist_offensive_tool_keyword" may slow down scanning
warning: rule "rule_nmap_offensive_tool_keyword" in all.yara(172331): rule is slowing down scanning

rule_cobaltstrike_offensive_tool_keyword and rule_nmap_offensive_tool_keyword probably just contain too many strings.

I've used the rule on a bunch of cobalt strike samples and these are the only 4 strings found:

beacon.dll
beacon.x64.dll
cobaltstrike
cobaltstrike-

ok, that doesn't mean the rest are useless; they might be found in some BOF or whatever.

some strings are redundant, e.g. /cobaltstrike/ already matches everything that /cobaltstrike\-/, /\-cobaltstrike/ and /cobaltstrike\./ would match, and the same goes for some more further down.
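
To illustrate the redundancy, here is a sketch with a made-up rule name and string identifiers. The `nocase` modifier is an assumption and may differ from what the generated rules use.

```yara
// Sketch only: rule name, identifiers and modifiers are assumptions.
rule example_cobaltstrike_redundant_strings
{
    strings:
        $broad   = /cobaltstrike/ nocase
        // each of these can only match where $broad already matches
        $narrow1 = /cobaltstrike\-/ nocase
        $narrow2 = /\-cobaltstrike/ nocase
        $narrow3 = /cobaltstrike\./ nocase
    condition:
        any of them
}
```

Under `any of them` the three narrower patterns never change the result. If the condition were switched to `2 of them`, a single occurrence of `cobaltstrike-` would satisfy both `$broad` and `$narrow1` and count as two hits, which is another reason to drop the overlapping strings.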

maybe just remove string19_DynastyPersist_offensive_tool_keyword if yara is slowed down by the \s (it can't find any 4-byte atom for Aho-Corasick, see https://github.com/Neo23x0/YARA-Performance-Guidelines#1-compiling-the-rules)

judging from the remaining false positives, it might be better to just remove the rules for such common linux tools as whoami, dd and wireshark?

binwalk is a tool that is mostly used by researchers, and no attacker would use it on a victim's machine. so unless someone really gets their hands on a real attacker machine, there will be many, many false positives before the rule is of any use.

there are also many regexes, which could be normal strings, e.g.:
$string7_proxychains_offensive_tool_keyword = /\/proxychains\.conf/ nocase ascii wide

nocase and wide also slow down scanning and are probably not needed for linux tools. linux is case sensitive, so an attacker can't obfuscate commands or paths by changing their case without breaking them. and utf16 strings are extremely rare on linux.

so if there's no reason to use a regex, maybe just generate:
$string7_proxychains_offensive_tool_keyword = "/proxychains.conf" ascii wide
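
Side by side, the variants discussed here could look as follows. This is a sketch with a made-up rule name and string identifiers; the last variant assumes the pattern is only relevant on linux, and a real rule would keep just one of the three.

```yara
// Sketch only: rule name and identifiers are assumptions.
rule example_proxychains_conf_variants
{
    strings:
        // current output: a regex although the pattern has no wildcards
        $as_regex   = /\/proxychains\.conf/ nocase ascii wide
        // same literal as a plain text string
        $as_text    = "/proxychains.conf" ascii wide
        // linux-only variant: case-sensitive, ascii only (the default)
        $linux_only = "/proxychains.conf"
    condition:
        any of them
}
```

Dropping `nocase` changes what is detected: a file containing `/ProxyChains.conf` would still match `$as_regex` but not the two text strings, so the change only makes sense where case variations are not expected.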

@mthcht
Owner

mthcht commented Nov 19, 2024

Thanks! I already planned to use regexes only for patterns that require them (mainly those with wildcards in the middle). For the rest, I'll use plain strings since they're much faster!

I'll remove binwalk as suggested. For the greyware tools, if you prefer not to detect them, you can use offensive_tools.yara, which only includes offensive tools. The all.yara file will contain everything: offensive tools, greyware, and additional signatures.

Regarding the Linux-specific modifications, I'll need to first categorize the patterns that apply solely to Linux systems. It might take some time, but I'll get it done.

As for the three problematic rules, I'll see what I can do! Thanks for pointing them out!

todo:

  • Remove binwalk.
  • Replace unnecessary regex patterns with plain strings for efficiency.
  • Add new tags: #linux and #windows to ThreatHunting-Keywords.
  • Modify the conversion script to remove nocase, ascii, and wide modifiers for patterns tagged with #linux.
  • Improve the following rules:
    • rule_nmap_offensive_tool_keyword (will remain slow because it detects all existing Nmap vulnerability scripts)
    • rule_DynastyPersist_offensive_tool_keyword
    • rule_cobaltstrike_offensive_tool_keyword (will remain slow because it includes detection for every existing Cobalt Strike BOF)
