Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added type hints #849

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Conversation

marvinvanaalst
Copy link

Since #848 contained too many non-typing related changes, here is the cleaner version.

I commented out the type hints that require the typing-extensions package like Literal[0, 1, 2, 3] for opt in llvmpy.passes.create_pass_manager_builder. If you don't have anything against adding that dependency, I'd happily add it back in, as it does carry additional information.

I had to punt on several occasions and go for type: ignore, mainly because mypy / pylance don't like mixin classes.
There are also a number of typecasts that I did. I usually added a FIXME such that they are double-checked.

@gmarkall let me know if I forgot to clean up something, it has been a long day ;)

@marvinvanaalst
Copy link
Author

So CI seems to fail for test test_simple (llvmlite.tests.test_ir.TestBuildInstructions)

with the following difference (so essentially in a single repr there is a list instead of a tuple)

<ir.Instruction 'res' of type 'i32', opname 'add', operands [<ir.Argument '.1' of type i32>, <ir.Argument '.2' of type i32>]>
<ir.Instruction 'res' of type 'i32', opname 'add', operands (<ir.Argument '.1' of type i32>, <ir.Argument '.2' of type i32>)>

This comes from replace_usage in Instruction. Here I identified the type list[Value] for operands, so I removed the tuple call

class Instruction(NamedValue, _HasMetadata):
     operands: list[Value]

    def replace_usage(self, old: Constant, new: Constant) -> None:
        if old in self.operands:
            ops: list[Value] = []
            for op in self.operands:
                ops.append(new if op is old else op)
            self.operands = ops  # remove tuple() here
            self._clear_string_cache()

However, that tuple type doesn't really make sense in the broader spectrum, since in CallInstr self.operands is assigned to, why isn't allowed for tuples

class CallInstr(Instruction):

    @callee.setter
    def callee(self, newcallee: Function) -> None:
        self.operands[0] = newcallee

I'd argue that list[Value] is the correct type, what do you folks think about that?

There is also an additional error due to llvmlite.tests.test_binding.TestObjectFile::test_get_section_content, but that only seems to be about ObjectFileRef and SectionIteratorRef, which I only annotated, so I don't really get why that one fails.

@gmarkall
Copy link
Member

(Labelled as Waiting on Reviewer as this is waiting for some feedback about the CI issues)

Apologies for the delay in getting back to you about this - The 0.39 / Numba 0.56 release has occupied the majority of maintainers' time recently.

@enathang
Copy link

enathang commented Feb 1, 2023

Hello! I have a similar issue of wanting type annotations for llvmlite and ended up here :)

Is there a plan to circle back to this PR eventually? I imagine it's lower priority than supporting new LLVM versions but would be a nice QoL update. Happy to contribute

@gmarkall
Copy link
Member

gmarkall commented Feb 6, 2023

@enathang Thanks for the ping, and for offering to contribute - as none of the maintainers are very familliar with type annotations, it would be great if you could guide us a little - do you have any thoughts on the comments in #849 (comment)? I think being able to address them will help un-stick this PR.

@enathang
Copy link

enathang commented Feb 23, 2023

Hey @gmarkall ! I'm also not super familiar with type annotations, but I'll throw my hat in the ring regardless :)

The main issue of the comment doesn't seem to be a type annotation issue per-se. Mostly, it's "should the operands for Instruction be represented as a tuple or a list?" Based on the current unit tests, they're represented as a tuple. However, the code treats it as a list, including mutating specific elements of it (which is impossible for tuples because they're static but fine for lists because they're dynamic.)

If we want to stick with tuples, the question is: why are we trying to mutate the tuple (and why is python letting us)? Is it actually a list under the hood or is it truly a tuple?

If we want to switch to a list, the trade-off is mostly performance afaik. Tuples have nice memory layout properties that lists do not.

So the path forward is:

  1. Understand why Python is letting a tuple be mutated
  2. See if the example above is the only place where a mutation occurs
  3. Try switching it to a list throughout and sanity-check performance
  4. Decide based on 2 and 3 which approach to take
  5. Win?

@marvinvanaalst can correct me if I've missed anything, mostly I'm rephrasing his comments for my own understanding.

I'll try to get everything up and running where I can answer (1). If there are any un-obvious reasons why the operands were set to be tuples ([sklam] was the original implementer according to git blame), I'd love to hear their opinion.

@gmarkall
Copy link
Member

gmarkall commented Mar 6, 2023

@enathang Thanks for thinking about this and adding some detailed thoughts and a path forward!

If we want to stick with tuples, the question is: why are we trying to mutate the tuple (and why is python letting us)? Is it actually a list under the hood or is it truly a tuple?

My guess would be that it's supposed to be a list - presumably the tests that pass a tuple avoid causing anything to happen that would result in them being mutated? I'd guess that the right way forward is to just say that they should be lists, and annotate as such.

I think most uses I've noticed in Numba pass lists to instructions - I have a vague recollection of trying to convert some lists to tuples in the Numba codebase for the general reasons that tuples tend to be preferable to lists, and running into issues as a result.

I'll try to get everything up and running where I can answer (1). If there are any un-obvious reasons why the operands were set to be tuples ([sklam] was the original implementer according to git blame), I'd love to hear their opinion.

That would be great - let us know if you need any guidance, and I look forward to seeing what you discover!

@marvinvanaalst
Copy link
Author

The main issue of the comment doesn't seem to be a type annotation issue per-se. Mostly, it's "should the operands for Instruction be represented as a tuple or a list?" Based on the current unit tests, they're represented as a tuple. However, the code treats it as a list, including mutating specific elements of it (which is impossible for tuples because they're static but fine for lists because they're dynamic.)

Well yes, it's a OOP design issue really. CallInstr is a subclass of Instruction, so the expectation is that if operands is a list[Value] in the parent class, it will also be that type in all children. And with type annotations that has to be enforced, while Python itself doesn't care about you creating invariants.

If we want to stick with tuples, the question is: why are we trying to mutate the tuple (and why is python letting us)? Is it actually a list under the hood or is it truly a tuple?

In replace_usage the tuple itself is never mutated. In the original code a temporary list is created and then a tuple is made from that list again. In callee it is, but I'm guessing that people have just always used CallInstr with lists, so it just hasn't crashed yet. But yes, if someone called replace_usage and then later on sets the callee property that should crash and is a bug, apparently just not a bug that comes up a lot.

I'd also still be open to help out on this issue, we will just have to see how much work it is going forward to merge this.

Decide based on 2 and 3 which approach to take

A third option is to use a type that works for both, like Iterable, but that some functions may need to be changed. replaced_usage would work with any Iterable interface, callee doesn't because it requires mutation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants