Return to Tralla La or: RPM in C++ #2983
-
I have some experience with this; we did a similar thing in rpm-ostree starting around coreos/rpm-ostree#2336 (comment). There was a lot involved; see the various PRs at https://github.com/coreos/rpm-ostree/issues?q=label%3Aproject%2Fc%2B%2B+is%3Aclosed from around that time (I didn't label all of them, there were a lot). Dealing with NUL-terminated strings versus C++ strings and Rust strings definitely was and is a notable pain. Probably the biggest ergonomic problem was exceptions... Rust and C align there, and "standard" C++ doesn't. Of course it's possible to use C++ without exceptions... but it's an ecosystem bifurcation.
Yes. Not to be That Person but... I dunno, I have a pretty strong opinion that Rust is the right choice for systems software, and for those bits you are rewriting, it can pay for itself. Rust is also very complex and had a huge learning curve for me (and others). The trap with C++ in a nutshell, IMO, is that it can feel very high level, yet low-level things like string_view use-after-frees will just come and bite you.
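To make the string_view point concrete, here is a minimal sketch with hypothetical function names (none of this comes from rpm or rpm-ostree). The dangling variant is left as a comment, because actually running it would be undefined behavior; the live code shows the owning alternative and the copy needed before handing a view to a C function that expects a NUL-terminated string.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical helper that returns a temporary std::string.
std::string make_name(const std::string& pkg, const std::string& ver) {
    return pkg + "-" + ver;
}

// BUG (do not do this): the temporary returned by make_name() is destroyed
// at the end of the full expression, so the view dangles immediately.
//   std::string_view bad = make_name("rpm", "6.0");

// Returns the part before the first '-' as a view into `full`.
// The caller must keep the underlying string alive while using the result.
std::string_view base_name(std::string_view full) {
    return full.substr(0, full.find('-'));
}

// A string_view is not guaranteed to be NUL-terminated, so copy it into an
// owning std::string before calling a C function such as strlen().
std::size_t length_via_c_api(std::string_view name) {
    std::string owned(name);
    assert(owned.c_str()[owned.size()] == '\0');  // std::string guarantees this
    return std::strlen(owned.c_str());
}
```

Used with an owning string that outlives the view, for example `std::string full = make_name("rpm", "6.0");`, the call `base_name(full)` is safe. Nothing in the type system stops the commented-out variant from compiling, which is the trap.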
-
(Threading this) do you have a link to these discussions? There's a ton of work on bootstrapping Rust (and systems in general) on self-hosting OSes/distributions. The Guix one is pretty cool. And for sure Rust gets handled as part of that; it's well documented and understood. I get that people who want to do something similar for self-hosting rpm-based systems would want to build rpm pretty early and switch the bootstrap process over to using it. But I'd imagine there's so much that needs to be done without RPM before that point that maintaining such a system would be quite close to maintaining two different build systems anyway, and having Rust in that set, in addition to the other compilers and toolchains that are already needed, seems like not a large addition.
-
I understand that rpm is not ready for Rust today. But I wonder if rpm will be ready for Rust in 3 or 4 years. In particular, the Rust ecosystem is quickly becoming a fundamental dependency. For instance, Rust is already used in the Linux kernel. So far, it is only used by non-essential code, but it is only a question of time before it is also used in essential code. Given this, my expectation is that Rust's bootstrapping story will improve even more in the near future. The flip side is that it's easier to use Rust with C than it is with C++, as @cgwalters pointed out regarding error handling. As such, moving to C++ now will probably make it harder to move to Rust later. So, my suggestion is: since you've already waited 15 years for C++, what do you think about waiting another 2 or 3 years to see what happens with Rust?
-
If somebody wants to rewrite rpm from scratch in Rust in another 15 years when I'm retired, be my guest.
-
Let me put it this way: if we had a will to move to Rust, maybe sitting out a few more years waiting for technical matters to sort themselves out would be a meaningful option. That's just not the case.
-
For what it's worth, I'm excited about the transition to C++, because as a C++ programmer, I feel much more comfortable working my way through the code and cleaning things up using the things I know well. So I'm looking forward to this!
-
And we're live now with #3028 🥳 🙈 For the morbidly curious, there was certainly more to clean up before that, the major preparatory work being in That was quite the push up the hill, it'll be nice to get to the real thing now.
-
Ever since the RPM upstream reboot in 2006, we've been striving to replace the old-school, hand-written pointer gymnastics with general purpose APIs in the codebase. We've come a long way, but all this time we've been dreaming about richer data structures than C has to offer. Of course in C, you roll your own, but then it's time wasted on crafting basic tools instead of solving the actual problem you had. We believe RPM is being held back by the current implementation language, and we don't want to play Robinson Crusoe anymore.
Few people know this, but I actually ported RPM to C++ in 2010. Ported in the sense that it builds with a C++ compiler, and then ported some internal data structures to C++ native ones to get a feel for things. There were advantages certainly, but I came out of the experience with a resounding "ugh no" conclusion. The world just wasn't ready for that, for several reasons. It didn't stop the dreaming though, and the C++ topic has reared its head a few times in the intervening years.
It happened again just recently, and this time we woke up to a language that seems almost like a distant cousin to the C++ I cursed at in 2010 (and before). Ever since C++11, the usability of the language has improved by leaps and bounds, whereas C has completely stagnated at C99. Which is at least part of the reason I haven't paid much attention to C++ either.
With the first major RPM package format update in two decades on the horizon, what better opportunity could there possibly be?
This is NOT a rewrite from scratch operation, not even a rewrite, just a mere implementation detail. For RPM 6.0 in 2025, the plan is to enable use of C++ in the codebase. Fully taking advantage of it is for the following years. Here's the initial sketch for the transition, which will happen after 4.20 is branched (in the coming weeks):
As in: open the door, step through it and then close it. There will be no looking back.
The work on step 1 is already underway. Much of the required cleanup work from the "2010 expedition" was merged into the RPM codebase despite remaining in C, so this initial transition step is mostly about adding explicit casts in the places that C++ requires and C doesn't. Some new incompatibilities have appeared since then of course.
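As an illustration of the kind of cast meant here, consider allocation through `malloc()`: C converts `void *` to any object pointer implicitly, while C++ rejects that conversion. The sketch below uses a hypothetical `entry` struct and helper names, not actual rpm code.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical C-style record, not an actual rpm structure.
struct entry {
    int tag;
    char *name;
};

// Valid C, but rejected by a C++ compiler (no implicit void* -> entry*):
//   struct entry *e = malloc(sizeof(*e));

struct entry *entry_new(int tag, const char *name) {
    assert(name != nullptr);
    // The explicit cast is what C++ requires; this cast syntax is also valid C.
    struct entry *e = (struct entry *)std::malloc(sizeof(*e));
    if (e == nullptr)
        return nullptr;
    std::size_t len = std::strlen(name) + 1;
    e->tag = tag;
    e->name = (char *)std::malloc(len);
    if (e->name == nullptr) {
        std::free(e);
        return nullptr;
    }
    std::memcpy(e->name, name, len);
    return e;
}

void entry_free(struct entry *e) {
    if (e != nullptr) {
        std::free(e->name);
        std::free(e);
    }
}
```

The code is still C in spirit; only the casts change, which is why this kind of step can be done mechanically before any native C++ is introduced.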
The public C API will remain of course, and will be the only available API in 6.0. There will eventually be a C++ native API alongside it though. For 6.0, the only C++ related goal is to enable use of C++ in the major data structures of rpm internally. In the process some amount of code will of course be moved to native C++, but there are no coverage goals.
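A rough sketch of what "C API outside, C++ inside" can look like, assuming an opaque handle and made-up `demoTable*` names (these are illustrative only and not part of the rpm API). The struct behind the handle holds a `std::map`, and every entry point catches exceptions so that none escape into C callers.

```cpp
#include <cassert>
#include <map>
#include <string>

// Internal C++ representation behind the opaque handle (hypothetical).
struct demoTable_s {
    std::map<std::string, std::string> values;
};

extern "C" {

// C callers only ever see this pointer type.
typedef struct demoTable_s *demoTable;

demoTable demoTableNew(void) {
    try {
        return new demoTable_s;
    } catch (...) {
        return nullptr;  // report failure the C way
    }
}

// Returns 0 on success, -1 on error.
int demoTableSet(demoTable t, const char *key, const char *val) {
    if (t == nullptr || key == nullptr || val == nullptr)
        return -1;
    try {
        t->values[key] = val;
        assert(t->values.count(key) == 1);
        return 0;
    } catch (...) {
        return -1;  // exceptions must not cross the C boundary
    }
}

// Returned pointer stays valid until the entry is changed or the table freed.
const char *demoTableGet(demoTable t, const char *key) {
    if (t == nullptr || key == nullptr)
        return nullptr;
    try {
        auto it = t->values.find(key);
        return it == t->values.end() ? nullptr : it->second.c_str();
    } catch (...) {
        return nullptr;
    }
}

void demoTableFree(demoTable t) {
    delete t;
}

}  // extern "C"
```

In a real header the struct would only be forward-declared for C consumers. The point of the sketch is that the data structure can change without the C signatures changing.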
Of course, moving to C++ is no panacea. It is a ridiculously large and complicated language, and there will be quite the learning curve for us. But we do believe that this will eventually pay back big time in terms of developer productivity for developers both old and new, and in co-operation with other teams and projects in the RPM stack.
We have no intentions of going haywire with abstract design patterns here. The main interest initially is to gain access to more advanced data structures and RAII, but certainly there are places in rpm that will be nicer in proper OOP.
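For readers less familiar with RAII, here is a small sketch of the idea, assuming a C-style function that returns `malloc()`-allocated memory. All names are hypothetical. A `std::unique_ptr` with a custom deleter releases the buffer on every exit path, and `std::vector<std::string>` replaces a hand-rolled array of strings.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <string>
#include <vector>

// Deleter that releases malloc()-allocated memory with free().
struct FreeDeleter {
    void operator()(void *p) const noexcept { std::free(p); }
};
using c_string_ptr = std::unique_ptr<char, FreeDeleter>;

// Stand-in for a C function returning a malloc()ed string (hypothetical).
char *c_style_dup(const char *s) {
    assert(s != nullptr);
    std::size_t len = std::strlen(s) + 1;
    char *copy = static_cast<char *>(std::malloc(len));
    if (copy != nullptr)
        std::memcpy(copy, s, len);
    return copy;
}

// Copies the non-empty inputs into a vector, skipping empty ones.
std::vector<std::string> collect(const std::vector<const char *>& inputs) {
    std::vector<std::string> out;
    for (const char *in : inputs) {
        c_string_ptr tmp(c_style_dup(in));  // freed when tmp goes out of scope
        if (!tmp)
            continue;                       // allocation failed
        if (tmp.get()[0] == '\0')
            continue;                       // early exit, still no leak
        out.emplace_back(tmp.get());
    }
    return out;
}
```

The equivalent C needs a `free()` before each `continue` and at the end of the loop body; here the cleanup is tied to scope instead.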
This transition also concludes our build infra upgrade: first we made the leap from autotools to cmake, then from fakechroot to containers, and as the final piece, the codebase itself gets dragged into this millennium. The initial plan is to target C++17, as this seems mature and widely available.
There are of course a thousand details not answered by this announcement, so ask away.