nbtlib 2.0
#156
Comments
I recently diagnosed a performance problem where serializing a structure file with millions of entries was taking 60 seconds. I'd like to make At the same time I also want to keep |
There's another type of feedback I got a few times over the years, and it's that the wrappers are annoying, especially for beginners. They expect to be able to use Also for schemas maybe there's a way to leverage or get inspiration from the way |
About SNBT support and the nbt path parser in |
Just some quick feedback before I have time for more in-depth discussions (right now I'm the one who's busy, lol). And, btw, loved this idea of a discussion issue before 2.0!
Yeah, serialization is really slow. Not that I care about SNBT. But before you think otherwise, know that My mcworldlib deals with Worlds, so it too deals with millions of entries, considering each So I did some benchmarks using several NBT libraries (Amulet-NBT, twoolie's NBT, Tk's pynbt) and to my surprise nbtlib came first among pure-python implementations (closely tied with pynbt). Twoolie's is noticeably slower (~ -30%), and Amulet came last (~ -40%). But of course, none can compete with Amulet's Cython implementation, which is 4-5x faster. I'll soon post the benchmark script in my library, if you're interested. Main point is: nbtlib's
If one can't use wheels (why?), it will build from source, what's the problem? |
I've also benchmarked serialization alternatives, and man, MCC Chunk |
I also think a lot about this, but any solutions I came up with require creating a The simplest implementation strategy I'm thinking of is something similar to this pseudocode:
def __setitem__(self, key, value) -> None:
    if not isinstance(value, Base) and key in self:  # and isinstance(self[key], Base) too?
        # Overwriting an existing key with a non-tag value: cast to the old value's type (if old is a tag)
        value = self[key].__class__(value)
    super().__setitem__(key, value)
But that's a 1-2 Or we can let the user select this "auto-cast" mode per instance:
@property
def auto_cast(self) -> bool:
    return self._auto_cast
@auto_cast.setter
def auto_cast(self, value: bool) -> None:
    # Keep a flag for __setitem__ to check; rebinding self.__setitem__ on the instance has no effect on obj[key] = value
    self._auto_cast = value
This could default to |
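To make the auto-cast idea above concrete, here's a minimal runnable sketch. The `Tag`, `Int`, `String` and `AutoCastCompound` classes are hypothetical stand-ins made up for illustration, not nbtlib's actual `Base` or `Compound`, and the per-instance switch is stored as a plain attribute.

```python
class Tag:
    """Stand-in for a tag base class (hypothetical, not nbtlib's Base)."""


class Int(Tag, int):
    pass


class String(Tag, str):
    pass


class AutoCastCompound(dict):
    """Dict that casts plain values to the type of the tag they overwrite."""

    def __init__(self, *args, auto_cast=True, **kwargs):
        super().__init__(*args, **kwargs)
        self.auto_cast = auto_cast  # per-instance switch

    def __setitem__(self, key, value):
        if self.auto_cast and not isinstance(value, Tag):
            old = self.get(key)
            if isinstance(old, Tag):
                # Reuse the type of the tag being overwritten
                value = type(old)(value)
        super().__setitem__(key, value)


data = AutoCastCompound({"count": Int(1), "name": String("a")})
data["count"] = 5       # plain int, cast to Int
data["new"] = 7         # no previous tag, stored as-is
data.auto_cast = False
data["name"] = "b"      # auto-cast disabled, stays a plain str
```

One caveat with this approach: `dict.update()` and `dict.setdefault()` don't go through an overridden `__setitem__`, so they would need the same treatment.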
I kinda lose confidence in code the longer it sits around unattended and since the core of the library is now a few years old I'm really itching to do a refresh haha. But I actually didn't know how well When it comes to performance I'd primarily try to improve parsing and writing raw nbt data, as you mentioned I agree that's the most critical use-case. And I won't unnecessarily break things, I really like how the API feels so it's really more about potentially refactoring some lower-level elements or finding fast paths.
A lot of windows users have python installed but no Visual C++ so if there aren't pre-built binaries available for them they won't be able to install the package. Also I've seen people interested in running their utilities on mobile or in the browser with Brython and I'm pretty sure C extensions are a massive headache for this. Also I forgot about
Haha yeah sometimes I kind of wish Mojang wasn't stuck with nbt too! |
The |
Will take a look, never heard of mypyc, sounds awesome. But meanwhile... there's always the Amulet approach:
# in nbt.__init__.py
try:
    from nbt_cy import (  # a Cython nbt.pyx module
        a,
        b,
        ...,
    )
except ImportError:
    from nbt_py import (  # a pure-Python nbt_py fallback module
        a,
        b,
        ...
    ) |
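A runnable variant of the same fallback pattern, as a sketch only: `load_backend` is a helper name made up here, `nbt_cy_hypothetical` stands in for a compiled module that isn't installed, and the stdlib `json` module plays the role of the pure-Python fallback.

```python
import importlib


def load_backend(candidates):
    """Return the first module in `candidates` that imports successfully."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # try the next, slower, implementation
    raise ImportError(f"none of {candidates!r} could be imported")


# "nbt_cy_hypothetical" stands in for a compiled module that is not installed;
# "json" stands in for the pure-Python fallback that always exists.
backend = load_backend(["nbt_cy_hypothetical", "json"])
print(backend.__name__)  # prints "json"
```

Listing the candidates in order of preference keeps the public API identical whichever implementation ends up loaded.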
And I'm really glad you are focusing on parsing/raw NBT. But yeah, We need good benchmarks on large data. Grab that 3.8MB MCC chunk in my |
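For comparing candidate fast paths on large data, a small harness along these lines could do. Everything here is an assumption for illustration: `best_time` and the two `write_longs_*` functions are made-up names, and the workload is just an array of big-endian 64-bit integers (the layout of a long array payload), not a real chunk file.

```python
import struct
import timeit


def best_time(func, *args, repeat=5, number=1):
    """Return the fastest per-call time in seconds over `repeat` runs."""
    timer = timeit.Timer(lambda: func(*args))
    return min(timer.repeat(repeat=repeat, number=number)) / number


def write_longs_loop(values):
    # One struct.pack call per value
    return b"".join(struct.pack(">q", v) for v in values)


def write_longs_bulk(values):
    # One struct.pack call for the whole array
    return struct.pack(f">{len(values)}q", *values)


values = list(range(100_000))
# Both candidates must produce identical bytes before timing means anything
assert write_longs_loop(values) == write_longs_bulk(values)
for func in (write_longs_loop, write_longs_bulk):
    print(f"{func.__name__}: {best_time(func, values):.4f}s")
```

Taking the minimum over several runs rather than the mean filters out most of the noise from other processes.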
That's really sweet, thanks! I'm gonna make good use of these test files, I'm about to start messing around with mypyc. |
So another thing I'm thinking about for improving |
Minecraft is, but I'm not ;-) I'm planning on using Of course, this trouble is only worth it if msgpack is considerably faster than nbt's write(). And it is. Tenfold. *: man that thing is a beast, it's even faster (and larger) than CPython's native |
I would surely benefit from it, but it sounds like a lot of trouble for possibly very little gain. I mean, I was really impressed by Amulet's Cython implementation. It's so fast that it really discouraged me from trying any non-trivial optimization on pure-python code. Too bad Amulet's API is so bad, with its outdated pymclevel heritage (that everyone seems to mimic). I even considered learning Cython just for the sake of making a |
Hello, I'm not sure if you still want to work on this or if you took a break, do you have any updates? |
I have a local branch that I revisit from time to time to experiment with some of the stuff mentioned in the thread, but I haven't had the occasion to focus on nbtlib in a while. I'm always working on some other project and when I need to use nbtlib it still does its job pretty well, so in practice the things I want to improve don't really bother me enough to make me drop whatever I'm working on to do the nbtlib revamp. However I really want to get it done at some point. It's kind of weighing on me mentally but I really care about doing it right. I think I'll try to set some time aside for nbtlib in September. It's getting kinda ridiculous, I hope I'll be able to close this issue before its second birthday... |
This tracks a number of issues that should be resolved before making 2.x releases stable. I think the recent feedback shows that nbtlib could benefit from a little revamp. Proper static typing is long overdue and I'd like to take some time to experiment with optimizations to speed everything up a bit.
- Path cannot add or init integers, only strings and other Paths #146
- File is not writing the trailing End tag #153
- read_numeric() to vastly increase parse() performance for all tags #150
- Path.from_parts() or similar to allow tuple constructor like pathlib.Path #157
Also this could be the perfect occasion to revisit old issues. It would be interesting to revisit the idea of nbtlib.contrib once the main issues have been addressed (#60). Also I'm not sure about the status of the setup.py situation anymore (#54). And finally, I've always struggled with writing documentation. Maybe we can come up with a better strategy, or find a way to break it down and make a proper roadmap for this (#16).
If anything else comes up I'll also add it in here.
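For context on the trailing End tag item (#153): in the binary NBT format a compound's payload is a sequence of named tags terminated by a single End tag byte (0x00), and that applies to the root compound of a file too. Here's a minimal sketch of what a writer has to emit. The function names are made up for illustration and this is not nbtlib's implementation; it only handles Int children and ASCII names (real NBT uses Java's modified UTF-8 for names).

```python
import io
import struct

TAG_END, TAG_INT, TAG_COMPOUND = 0, 3, 10


def write_name(buf, name):
    # Names are length-prefixed with an unsigned big-endian short
    raw = name.encode("utf-8")
    buf.write(struct.pack(">H", len(raw)))
    buf.write(raw)


def write_root_compound(name, ints):
    """Serialize a root compound whose children are all Int tags."""
    buf = io.BytesIO()
    buf.write(bytes([TAG_COMPOUND]))
    write_name(buf, name)
    for key, value in ints.items():
        buf.write(bytes([TAG_INT]))
        write_name(buf, key)
        buf.write(struct.pack(">i", value))
    buf.write(bytes([TAG_END]))  # the trailing End tag closing the compound
    return buf.getvalue()


payload = write_root_compound("", {"x": 1})
print(payload.hex(" "))  # 0a 00 00 03 00 01 78 00 00 00 01 00
```

Without that final 0x00 byte a reader has no way to tell where the root compound stops.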