Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pointers smear memory protection #1243

Open
wants to merge 2 commits into
base: develop
Choose a base branch
from

Conversation

gcmoreira
Copy link
Contributor

Context

When working on Linux Page Cache and IDR I realized that Volatility3 doesn't really check if a pointer is valid when doing something like:

if not ptr:
    continue

What that actually does is check if ptr != 0.

For instance, when the pointer is zero:

>>> type(inode)
<class 'volatility3.framework.objects.Pointer'>

>>> inode
0

>>> inode or True
True

As shown above, in those cases, it works as expected: inode evaluates to False, and True is returned due to the or statement.

However, when smear memory happens, the above code will not abort, potentially leading to a crash or incorrect behavior.

>>> type(inode)
<class 'volatility3.framework.objects.Pointer'>

>>> inode
29810

>>> inode or True
29810

The above means that inode evaluated to True and the or True wasn't executed.

Volatility3 code assesment

From analyzing the Volatility3 source code, the following three cases were identified:

  1. Volatility3 pointers don't have a is_valid() method as in Volatility2. It does have an is_readable() method, however, it's used in only a few places, and it requires the developer to be aware of this method, something that doesn't seem to be the case for most of us.

  2. Developers who are unaware of the is_readable() method are using layer.is_valid(ptr.vol.offset, ptr.vol.size) or simply layer.is_valid(ptr.vol.offset) instead to verify pointers. However, this requires first obtaining the respective layer, and since it is not used widely, it is likely unknown to most developers.

  3. Finally, the remaining developers are assuming and trusting that with simply doing the following will be enough. This applies to most cases in the Volatility3 source code:

if not ptr:
    continue

Benefits of this Pull Request

This PR takes advantage of the Python3's built-in object truth value testing to verify a pointer has a valid address in its memory layer.
This will provide immediate benefits to those using the source code in (3) and will allow to simplify the source code in (1), (2) and future implementations.
Refer to 305eeda to see how the changes in this PR can benefit the framework.

@ikelos
Copy link
Member

ikelos commented Aug 25, 2024

This is a significant shift in semantics. I'm not immediately against it, but I definitely want to consider the ramifications. This could technically differ from the current functionality (if 0 happens to be mapped, or people are using it just to check for 0 rather than readability). It will require at least a MINOR bump of the framework, and possibly a MAJOR bump too. It would also be good to identify places in existing plugins where this functionality is already expected (so 3) and or one of 1 or 2 to see how prevalent of an issue this is.

I'm sure there's instances where windows says something is a pointer, but actually use it to just hold a number, and where the 0 test might be used legitimately which this functionality may now break. Similarly pointers in python are just integers, and this change will make the integer logic and pointer logic diverge.

It wasn't called is_valid because that says whether the object is valid in the space it's been constructed on (which is true even for a bad pointer) so is_readable talks about the thing the pointer points to. I'd sooner we tried to make people more aware of is_readable as the way pointers are supposed to be tested. This can be through documentation and/or more use throughout the codebase (finding examples of layer.is_valid(ptr.vol.offset)) but it's a lot less involved than changing the underlying semantics of a type used everywhere (in both our code and plugins written by other people). I'm not sure the convenience/reduction in a few characters is worth it?

@gcmoreira
Copy link
Contributor Author

If, in a hypothetical scenario, a pointer to 0 is readable in a specific layer and someone wants to check for 0 specifically, why are they evaluating it as a boolean? shouldn't they use ptr == 0?

I'm sure there's instances where windows says something is a pointer, but actually use it to just hold a number, and where the 0 test might be used legitimately which this functionality may now break.

Could you point me out to where you think this would break so that I can test it? In Linux, the XArray and RadixTree extensively use pointers as values. These pointers, when evaluated in a boolean context, will return False because their values are augmented with additional tags in the unused bits, such as in a page-aligned pointer. However, no changes were needed for this to work with this PR. Since the changes here don't convert the integer in a boolean, this only works like that when it's evaluated as a boolean. You can still perform the same operations as before, just like with an integer + - * / & << >> etc.

Similarly pointers in python are just integers, and this change will make the integer logic and pointer logic diverge.

I assume you're talking about pointers in Volatility3, not Python... yeah we talked about this already. Pointers already diverged from integers from the moment we inherited from integers and added methods and states.

Anyway, if you don't think it has a chance of being merged, fair enough, let's close this PR and not waste any more time.
Again, I can test it with any Windows plugin you indicate. It also passes all the Windows tests, which isn't a bulletproof guarantee, but it's definitely a good sign.

@ikelos
Copy link
Member

ikelos commented Aug 25, 2024

If, in a hypothetical scenario, a pointer to 0 is readable in a specific layer and someone wants to check for 0 specifically, why are they evaluating it as a boolean? shouldn't they use ptr == 0?

Yes, but you're arguing for change, saying that both are doable actually holds to keep things as they are, rather than uproot them. Why are you evaluating an int as a bool and expecting it to tell you whether it's valid or not? That's against how people expect integers to work (whether you expect pointers to just work that way or not). You didn't know that is_readable existed and were surprised when it wasn't used. Surely changing the default way that a standard python type works is harder to document?

Could you point me out to where you think this would break so that I can test it? In Linux, the XArray and RadixTree extensively use pointers as values. These pointers, when evaluated in a boolean context, will return False because their values are augmented with additional tags in the unused bits, such as in a page-aligned pointer. However, no changes were needed for this to work with this PR. Since the changes here don't convert the integer in a boolean, this only works like that when it's evaluated as a boolean. You can still perform the same operations as before, just like with an integer + - * / & << >> etc.

I don't have specific examples, but I recall from the past instances where a struct said it was a pointer but actually that was just to get a structure that fit the size of a register but didn't actually point to anything in particular. That's only a recollection though, but you still haven't convinced me why we should be changing a fundamental test for what's supposed to be an integer, rather than just using the provided method that's already in place?

Similarly pointers in python are just integers, and this change will make the integer logic and pointer logic diverge.
I assume you're talking about pointers in Volatility3, not Python... yeah we talked about this already. Pointers already diverged from integers from the moment we inherited from integers and added methods and states.

Yes, I meant pointers in volatility (and technically pointers in C, I think). They are just integers. I disagree that augmenting integers is diverging. Adding methods and states just augments an integer, but essentially is still works the same way. This changes the way the underlying type works. I don't recall if we cap pointers to only live within their address space, but I suspect/hope we don't.

Anyway, if you don't think it has a chance of being merged, fair enough, let's close this PR and not waste any more time. Again, I can test it with any Windows plugin you indicate. It also passes all the Windows tests, which isn't a bulletproof guarantee, but it's definitely a good sign.

I think it's very unlikely I'll merge it, but it does raise a good point and I haven't ruled it out. At the moment though, the advantages (particularly compared to is_readable) don't outweigh the possible risks. What I think would be the best way forward is to find all the existing places where pointers are tested using 1, 2 and 3 originally mentioned (if any are tested on the offset of the pointer, that should always return true, it's the value of the pointer that needs to be tested not its offset). A PR that changes all those to is_readable will be helpful either way and is extremely likely to get merged, because it'll show us how often it happens, and it'll mean that if we do decide to make the change finding places where it would help has already been done. It'll also help other users and us keep it in mind when reviewing PRs to get people to do their pointer tests that way and find it easily when copy code to use in our own...

gcmoreira added a commit to gcmoreira/volatility3 that referenced this pull request Sep 13, 2024
- lsof plugin: source code refactored
- lsof plugin: Added the 'device' column to complete the inode information. An inode number is specific to the filesystem/device it belongs to.
- lsof/sockstat plugins: Add threads support. Threads may or may not share the file descriptor table with the thread group leader, depending on whether the CLONE_FILES flag is included in the clone() syscall. Also, once started, a thread can unshare the fd table with its parent. Refer to the unshare() libc syscall wrapper man page, unshare(2). Additionally, note that the Linux lsof command in user space includes thread listings by default as well. Now, there are two columns to identify the thread group ID  (PID) and the task/thread ID (TID).
- Added inode getters from both, the dentry and file structs. From kernels +3.9 the file struct cached the inode pointer. So, when possible, we get this value.
- Improve smear protection in these plugins and various APIs (volatilityfoundation#1243)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants