This is a transcript of What's Up With That Episode 1, a 2022 video discussion between Sharon ([email protected]) and Dana ([email protected]).
The transcript was automatically generated by speech-to-text software. It may contain minor errors.
Welcome to the first episode of What’s Up With That, all about pointers! Our special guest is C++ expert Dana. This talk covers smart pointer types we have in Chrome, how to use them, and what can go wrong.
Notes:
Links:
- Life of a Vulnerability
- MiraclePtr
0:00 SHARON: Hi, everyone, and welcome to the first installment of "What's Up With That", the series that demystifies all things Chrome. I'm your host, Sharon, and today's inaugural episode will be all about pointers. There are so many types of types - which one should I use? What can possibly go wrong? Our guest today is Dana, who is one of our Base and C++ OWNERS and is currently working on introducing Rust to Chromium. Previously, she was part of bringing C++11 support to the Android NDK and then to Chrome. Today, she'll be telling us what's up with pointers. Welcome, Dana!
00:31 DANA: Thank you, Sharon. It's super exciting to be here. Thank you for letting me be on your podcast thingy.
00:36 SHARON: Yeah, thanks for being the first episode. So let's just jump right in. So when you use pointers wrong, what can go wrong? What are the problems? What can happen?
00:48 DANA: So pointers are a big cause in security problems for Chrome, and that's what we mostly think about when things go wrong with pointers. So you have a pointer to some thing, like you've pointed to a goat. And then you delete the goat, and you allocate some new thing - a cow. And it gets stuck in the same spot. Your pointer didn't change. It's still pointing to what it thinks is a goat, but there's now a cow there. And so when you go to use that pointer, you use something different. And this is a tool that malicious actors use to exploit software, like Chrome, in order to gain access to your system, your information, et cetera.
01:39 SHARON: And we want to avoid those. So what's that general type of attack called?
01:39 DANA: That's a Use-After-Free because you have freed the goat and replaced it with a cow. And you're using your pointer, but the thing it pointed to was freed. There are other kinds of pointer badness that can happen. If you take a pointer and you add some number to it, or you go to an offset off the pointer - say you have an array of five things, and you go and read element 20, or minus 2, or something - now you're reading out of bounds of that memory allocation. And that's not good. These are both memory safety bugs that occur a lot with pointers.
02:23 SHARON: Today, we'll be mostly looking at the Use-After-Free kind of bugs. We definitely see a lot of those. And if you want to see an example of one being used, Dana has previously done a talk called, "Life of a Vulnerability." It'll be linked below. You can check that out. So that being said, should we ever be using just a regular raw pointer in C++ in Chrome?
02:41 DANA: First of all, let's call them native pointers. You will see them called raw pointers a lot in literature and stuff. But later on, we'll see why that could be a bit ambiguous in this context. So we'll call them a native pointer. So should you use a native pointer? If you don't want a Use-After-Free, if you don't want a problem like that, no. However, there is a performance implication with using smart pointers, and so the answer is yes. The style guide that we have right now takes this pragmatic approach of saying you should use native pointers for giving access to an object. So if you're passing them as a function parameter, you can share it as a pointer or a reference, which is like a pointer with slightly different rules. But you should not store native pointers as fields in objects because that is a place where they go wrong a lot. And you should not use a native pointer to express ownership. So before C++11, you would just say, this is my pointer, use a comment, say this one is owning it. And then if you wanted to pass the ownership, you just passed this native pointer over to something else as an argument, and put a comment saying this is passing ownership. And you just kind of hoped it worked out. But it's very difficult. It requires the programmer to understand the whole system to do it correctly. There is no help. So in C++11, the type called `std::optional_ptr` - or sorry, `std::unique_ptr` - was introduced. And this is expressing unique ownership. That's why it's called `unique_ptr`. It's just going to hold your pointer, and when it goes out of scope, it gets deleted. It can't be copied because it's unique ownership. But it can be moved around. And so if you're going to express ownership of an object in the heap, you should use a `unique_ptr`.
04:48 SHARON: That makes sense. And that sounds good. So you mentioned smart pointers before. You want to tell us a bit more about what those are? It sounds like `unique_ptr` is one of those.
04:55 DANA: Yes, so a smart pointer - which can also be referred to as a pointer-like object, perhaps as a subset of them - is a class that holds a pointer inside of it and mediates access to it in some way. So `unique_ptr` mediates access by saying I own this pointer, I will delete this pointer when I go away, but I'll give you access to it. So you can use the arrow operator or the star operator to get at the underlying pointer. And you can construct them out of native pointers as well. So that's an example of a smart pointer. There's a whole bunch of smart pointers, but that's the general idea: it's going to add something to what a native pointer is, while giving you access to it in some way.
05:40 SHARON: That makes sense. That's kind of what our main thing is going to be about today, because if you look around in Chrome, you'll see a lot of these wrapper types. It'll be a `unique_ptr` and then a type. And you'll see so many types of these, and talking to other people, and myself, I find this all very confusing. So we'll cover some of the more common types today. We just talked about unique pointers. Next, let's talk about `absl::optional`. So why don't you tell us about that.
06:10 DANA: So that's actually a really good example of a pointer-like object that's not actually holding a pointer, so it's not a smart pointer. But it looks like one. So this is the distinction. So `absl::optional` - also known as `std::optional` if you're not working in Chromium, and at some point, we will hopefully migrate to it - holds an object inside of it by value instead of by pointer. This means that the object is held in the space allocated for the `optional`. So the size of the `optional` is the size of the thing it's holding, plus some space for a presence flag. Whereas a `unique_ptr` holds only a pointer, and its size is the size of a pointer. And then the actual object lives elsewhere. So that's the difference in how you can think about them. But otherwise, they do look quite similar. An `optional` is unique ownership because it's literally holding the object inside of it. However, an `optional` is copyable if the object inside is copyable, for instance. So it doesn't have quite the same semantics. And it doesn't require a heap allocation the way `unique_ptr` does, because it's storing the memory in place. So if you have an `optional` on the stack, the object inside is also right there on the stack. That's good or bad, depending what you want. If you're worried about your object sizes, not so good. If you're worried about the cost of memory allocation and free, good. So this is the trade-off between the two.
07:51 SHARON: Can you give any examples of when you might want to use one versus the other? You mentioned some general trade-offs, but any specific examples? Because I've definitely seen use cases where `unique_ptr` is used when maybe an `optional` makes more sense, or vice versa. Maybe it's just because someone didn't know about it or it was chosen that way. Do you have any specific examples?
08:14 DANA: So one place where you might use a `unique_ptr`, even though `optional` is maybe the better choice, is because of forward declarations. Because an `optional` holds the type inside of it, it needs to know the type's size, which means it needs to know the full declaration of that type, or the whole definition of that type. And a `unique_ptr` doesn't, because it's just holding a pointer, so it only needs to know the size of a pointer. And so if you have a header file, and you don't want to include another header file, and you just want to forward declare the types, you can't stick an `optional` of that type right there because you don't know how big it's supposed to be. So that might be a case where it's maybe not the right choice, but for other constraining reasons, you choose to use a `unique_ptr` there. And you pay the cost of a heap allocation and free as a result. But when would you use an `optional`? So `optional` is fantastic for returning a value sometimes. I want to do this thing, and I want to give you back a result, but I might fail. Or sometimes there's no value to give you back. Typically, before C++ - what are we on now, was it 14 it came in? I'm going to say it wrong. That's OK. Before we had `absl::optional`, you would have to do different tricks. So you would pass in a native pointer as a parameter and return a bool as the return value to say, did I populate the pointer. And yes, that works. But it's easy to mess it up. It also generates less optimal code. Pointers cause the optimizer to have troubles. And it doesn't express as nicely what your intention is: I return this thing, sometimes. And so in place of using this pointer plus bool, you can put that into a single type and return an `optional`. Similar for holding something as a field, where you want it to be held inline in your class, but you don't always have it present - you can do that with an `optional` now, where you would probably have used a pointer before. Or a `union` or something, but that gets even more tricky. And then another place you might use it is as a function argument. However, that's usually not the right choice for a function argument. Why? Because the `optional` holds the value inside of it. Constructing an `optional` requires constructing the whole object inside of it. And so that's not free. It can be arbitrarily expensive, depending on what your type is. And if the caller of your function doesn't already have an `optional`, they have to go and construct one to pass it to you. And that's a copy or move of that inner type. So generally, if you're going to receive a parameter maybe sometimes, the right way to spell that is just to pass it as a native pointer, which can be null when it's not present.
11:29 SHARON: Hopefully that clarifies some things for people who are trying to decide which one best suits their use case. So moving on from that, some people might remember from a couple of years ago that instead of being called `absl::optional`, it used to be called `base::optional`. And do you want to quickly mention why we switched from `base` to `absl`? And you mentioned even switching to `std::optional`. Why this transition?
11:53 DANA: Yeah, absolutely. So as the C++ standards come out, we want to use them, but we can't until our toolchain is ready. What's our toolchain? It's our compiler and our standard library - and unfortunately, we have more than one compiler that we need to worry about. So we have the NaCl compiler. Luckily, we just have Clang for the compiler choice we really have to worry about. But we do have to wait for these things to be ready, and for the code base to be ready to turn on the new standard, because sometimes there are some non-backwards compatible changes. But we can forward port stuff out of the standard library into base. And so we've done that. We have a bunch of C++20 backports in base now. We had C++17 backports before. We turned on 17, so now they should hopefully be gone. And so `base::optional` was an example of a backport, from while `optional` was still considered experimental in the standard library. We adopted use of `absl` since then, and `absl` also had, essentially, a backport of the `optional` type inside of it, for presumably the same reasons. And so why have two when you can have one? That's a pretty good rule. And so we deprecated the `base` one, removed it, and moved everything to the `absl` one. One thing to note here, possibly of interest, is that we often add security hardening to things in `base`. And so sometimes something is available in the standard library, but we choose not to use it and use the version in `base` or `absl` instead, because we have extra hardening checks. And so part of the process of removing `base::optional` and moving to `absl::optional` was ensuring those same security hardening checks are present in `absl`. And we're going to have to do the same thing to stop using `absl` and start using the standard one. And that's currently a work in progress.
13:48 SHARON: So let's go through some of the `base` types, because that's definitely where most of these kind of wrapper types live. So let's just start with one that I learned about recently, and that's `scoped_refptr`. What's that? When should we use it?
13:59 DANA: So `scoped_refptr` is kind of your Chromium equivalent to `shared_ptr` in the standard library. So if you're familiar with that, it's quite similar, but it has some slight differences. So what is `scoped_refptr`? It gives you shared ownership of the underlying object. And it's a smart pointer. It holds a pointer to an object that's allocated in the heap. When all the `scoped_refptr`s that point to the same object are gone, it'll be deleted. So it's like `unique_ptr`, except it can be copied to add to your ref count, basically. And when all of them are gone, it's destroyed. And it gives access to the underlying pointer in exactly the same ways. Oh, but why is it different than `shared_ptr`? I did say it is. `scoped_refptr` requires the type that is held inside of it to inherit from `RefCounted` or `RefCountedThreadSafe`. `shared_ptr` doesn't require this. Why? So `shared_ptr` sticks an allocation beside your object and puts the ref count there. So the ref count is external to your object, which is stored and owned by the shared pointer. Chromium took the position of doing intrusive ref counting instead. So because we inherit from a known type, we stick the ref count in that base class, `RefCounted` or `RefCountedThreadSafe`. And so that is enforced by the compiler. You must inherit from one of these two in order to be stored and owned in a `scoped_refptr`. What's the difference? `RefCounted` is the default choice, but it's not thread safe. So the ref counting is cheap. It's the more performant one, but if you have a `scoped_refptr` on two different threads owning the same object, their ref counting will race, can be wrong, and you can end up with a double free - which is another way that pointers can go wrong, two things freeing the same thing - or you could end up with potentially not freeing it at all, probably. I guess I've never checked if that's possible. But they can race, and then bad things happen. Whereas `RefCountedThreadSafe` gives you atomic ref counting. So atomic means that across all threads, they're all going to have the same view of the state. And so it can be used across threads and be owned across threads. And the tricky part there is that the last thread that drops its reference to that object is where it's going to be destroyed. So if your object's destructor does things that you expect to happen on a specific thread, you have to be super careful that you synchronize which thread that last reference is going away on, or it could explode in a really flaky way.
17:02 SHARON: This sounds useful in other ways. What are some more design things to consider, in terms of when a `scoped_refptr` is useful and does help enforce things that you want to enforce, like relative lifetimes of certain objects?
17:15 DANA: Generally, we recommend that you don't use ref counting if you can help it. And that's because it's hard to understand when the object is going to be destroyed, like I kind of alluded to with the thread situation. Even in a single-threaded situation, how do you know which one is the last reference? And is this object going to outlive that other object? Maybe, sometimes. It's not super obvious. It's a little more clear with a `unique_ptr`, at least local to where that `unique_ptr`'s destruction is. But with a `scoped_refptr`, there's usually no way to say this is the last one, so I know the object is gone after this thing is gone. Maybe it is, maybe it's not, often. So it's a bit tricky. However, there are scenarios when you truly want a bunch of things to have access to a piece of data, and you want that data to go away when nobody needs it anymore. And so that is your use case for a `scoped_refptr`. It is nicer when the thing with shared ownership is not doing a lot of interesting things, especially in its destructor, because of the complexity that's involved in shared ownership. But you're welcome to shoot yourself in the foot with this one if you need to.
18:33 SHARON: We're hoping to help people not shoot themselves in the foot. So use `scoped_refptr` carefully, is the lesson there. So you mentioned `shared_ptr`. Is that something we see much of in Chrome, or is that something that we generally try to avoid in terms of things from the standard library?
18:51 DANA: That is something that is banned in Chrome. And that's just basically because we already have `scoped_refptr`, and we don't want two of the same thing. There have been various times where people have brought up, why do we need to have both? Can we just use `shared_ptr` now? And nobody's ever done the kind of analysis needed to make that kind of decision. And so we stay with what we're at.
19:18 SHARON: If you want to do that, there's someone that'll tell you what to do. So something that I came across when I was using `scoped_refptr` is that you need a WeakPtrFactory to create such a pointer. So weak pointers and WeakPtr factories are one of those things that you see a lot in Chrome, and one of these `base` things. So tell us a bit about weak pointers and their factories.
19:42 DANA: So WeakPtr and WeakPtrFactory have a bit of an interesting history. Their major purpose is for asynchronous work. Chrome is basically a large asynchronous machine, and what does that mean? It means that we break all of the work of Chrome up into small pieces of work. And every time you've done a piece, you go and say, OK, I'm done. And when the next piece is ready, run this thing. And maybe that next thing is a user input event, maybe it's a reply from the network, whatever it might be. And there's just a ton of steps in things that happen in Chrome. Like, a navigation has a request, a response, maybe another request - some redirects, whatever. That's an example of tons of smaller asynchronous tasks that all happen independently. So what goes wrong with asynchronous tasks? You don't have a continuous stack frame. What does that mean? So if you're just running some synchronous code, you make a variable, you go off and you do some things, you come back. Your variable is still here, right? You're in this stack frame and you can keep using it. Now you have asynchronous tasks. You make a variable, you go and do some work, and you are done your task. Boop, your stack's gone. You come back later, you're going to continue. You don't have your variable anymore. So any state that you want to keep across your various tasks has to be stored - what we call bound in - with that task. If that's a pointer, that's especially risky. So we talked earlier about Use-After-Frees. Well, you can, I hope, imagine how easy it is to stick a pointer into your state. This pointer is valid, I'm using it. I go away, I come back when? I don't know, sometime in the future. And I'm going to go use this pointer. Is it still around? I don't own it. I didn't use a `unique_ptr`. So who owns it? How do they know that I have a task waiting to use it? Well, unless we have some side channel communicating that, they don't. And how do I know if they've destroyed it, if we don't have some side channel communicating that? I don't know. And so I'm just going to use this pointer and bad things happen. Your bank account is gone.
22:06 SHARON: No! My bank account!
22:06 DANA: I know. So what's the side channel? The side channel that we have is WeakPtr. So a WeakPtr and WeakPtrFactory provide this communication mechanism where WeakPtrFactory watches an object, and when the object gets destroyed, the WeakPtrFactory inside of it is destroyed. And that sets this little bit that says, I'm gone. And then when your asynchronous task comes back with its pointer, but it's a WeakPtr inside of it and tries to run, it can be like, am I still here? If the WeakPtrFactory was destroyed, no, I'm not. And then you have a choice of what to do at that point. Typically, we're like, abandon ship. Don't do anything here. This whole task is aborted. But maybe you do something more subtle. That's totally possible.
22:59 SHARON: I think the example I actually meant to say that uses a WeakPtrFactory is a SafeRef, which is another base type. So tell us a bit about SafeRefs.
23:13 DANA: WeakPtr is cool because of the side channel that you can examine. So you can say, are you still alive, dear object? And it can tell you, no, it's gone. Or yeah, it's here. And then you can use it. The problem with this shows up in places where you, as the code author, want to believe that this object is actually always there, but you don't want a security bug if you're wrong. And it doesn't even mean that you're wrong now. Sometime later, someone can change code, unrelated to where this is, where the ownership happens, and break you. And maybe they don't know all the users of a given object, and they're changing its lifetime in some subtle way, maybe not even realizing they are. Suddenly you're eventually seeing security bugs. And so that's why native pointers can be pretty scary. And so SafeRef is something we can use instead of a native pointer to protect you against this type of bug. It's built on top of WeakPtr and WeakPtrFactory. That's its relationship, but its purpose is not the same. So what SafeRef does is it says - SafePtr?
24:31 SHARON: SafeRef.
24:31 DANA: SafeRef.
24:31 SHARON: I think there's also a safe pointer, but there -
24:38 DANA: We were going to add it. I'm not sure if it's there yet. But so, two differences between SafeRef and WeakPtr then. Ref versus ptr: it can't be null. So it's like a reference wrapper. But the other difference is you can't observe whether the object is actually alive or not. So it has the side channel, but it doesn't show it to you. Why would you want that? If the information is there anyway, why wouldn't you want to expose it? And the reason is because you are documenting that you, as the author, understand and expect that this pointer is always valid at this time. What if it turns out it's not valid? What do you do? If it's a WeakPtr, people tend to say, we don't know if it's valid. It's a WeakPtr. Let's check. Am I valid? And if I'm not, return. And what does that result in? It results in adding a branch to your code. You do that over, and over, and over, and over, and static analysis - which is what we as humans have to do; we're not running the program, we're reading the code - can't really tell what will happen because there are so many things that could happen. We could exit here, we could exit there, we could exit here. Who knows. And that makes it increasingly hard to maintain and refactor the code. So SafeRef gives you the option to say this is always going to be valid. You can't check it. So if it's not valid, go fix that bug somewhere else. It should be valid here.
26:16 SHARON: So what kind of -
26:16 DANA: The assumptions are broken.
26:16 SHARON: So what kind of errors happen when that assumption is broken? Is that a crash? Is that a DCHECK kind of thing?
26:22 DANA: For SafeRef and for WeakPtr, if you try to use it without checking it, or write it incorrectly, they will crash. And crashing in this case means a safe crash. It's not going to lead to a security bug. It's literally just terminating the program.
26:41 SHARON: Does that also mean you get a sad tab as a user? Like when the little sad file comes up?
26:47 DANA: Yep. It would. If you're in the render process, you take it down. It's a sad tab. So that's not great. It's better than a security bug. Because your options here are don't write bugs. Ideal. I love that idea, but we know that bugs happen. Use a native pointer, security problem. Use a WeakPtr, that makes sense if you want it to sometimes not be there. But if you want it to always be there - because you have to make a choice now of what you're supposed to do if it's not, and it makes the code very hard to understand. And you're only going to find out it can't be there through a crash anyhow. Or use a SafeRef. And it's going to just give you the option to crash. You're going to figure out what's wrong and make it no longer do that.
27:38 SHARON: I think wanting to guarantee the lifetime of some other thing seems like a pretty common situation that you might come across. So I'm sure there are many cases for many people to be adding SafeRefs to make their code a bit safer, and also to ensure that if something does go wrong, it's not leading to a memory bug that could be exploited for who knows how long. Because we don't always hear about those. If it crashes, and you can reliably crash, at least you know it's there. You can fix it. If it's not, we're hoping that one of our VRP vulnerability researchers finds it and reports it, but that doesn't always happen. So if we can know about these things, that's good. So another new type in `base` that people might have been seeing recently is `raw_ptr`, which is maybe why earlier we were saying let's call them native pointers, not raw pointers. Because the difference between `raw_ptr` and raw pointer - very easy to mix those up. So why don't you tell us a bit about `raw_ptr`s?
28:40 DANA: So `raw_ptr` is really cool. It's a non-owning smart pointer. So that's kind of like WeakPtr or SafeRef - these are also non-owning. And it's actually very similar in inspiration to WeakPtr. So it has a side channel where it can see if the thing it's pointing to is alive or gone. So for WeakPtr, it talks to the WeakPtrFactory and says, "am I deleted?" And for `raw_ptr`, what it does is it keeps a reference count, kind of like `scoped_refptr`, but it's a weak reference count. It's not owning. And it keeps this reference count in the memory allocator. So Chrome has its own memory allocator for `new` and `delete` called PartitionAlloc. And that lets us do some interesting stuff. And this is one of them. And so what happens is, as long as there is a `raw_ptr` around, this reference count is non-zero. So even if you go and you delete the object, the allocator knows there is some pointer to it. It's still out there. And so it doesn't free it. It holds it. And it poisons the memory - that just means it's going to write some bit pattern over it, so it's not really useful anymore. It basically re-initializes the memory. And so later, if you go and use this `raw_ptr`, you get access to just dead memory. It's there, but it's not useful anymore. You're not going to be able to create security bugs in the same way. Because when we first started talking about a Use-After-Free - you have your goat, you free it, a cow is there, and now your pointer is pointing at the wrong thing - you can't do that, because as long as there's this `raw_ptr` to your goat, the goat can be gone, but nothing else is going to come back here. It's still taken by that poisoned memory until all the `raw_ptr`s are gone. So that's their job: to protect us from a Use-After-Free being exploitable. It doesn't necessarily crash when you use it incorrectly; you just get to use this bad memory inside of it. If you try to use it as a pointer, then you're using a bad pointer, and you're going to probably crash. But it's a little bit different than a WeakPtr, which is going to deterministically crash as soon as you try to use it when it's gone. It's really just a protection or a mitigation against security exploits through Use-After-Free. And then we recently just added `raw_ref`, which is really the same as `raw_ptr`, except addressing nullability. So smart pointers in C++ have historically all allowed a null state. That's representative of what native pointers did in C and C++. And so this is kind of just bringing that along in this obvious, historical way. But if you look at other languages that have been able to break with history and make their own choices fresh, we see that they make choices like not having null pointers, not having null smart pointers. And that increases the readability and the understanding of your code greatly. So just like for WeakPtr, how we said, we just check if it's there or not, and if it's not, we return, and so on - if you were thinking of a timeline, every time you touch a WeakPtr, your timeline splits. And so you get this exponential timeline of possible states that your software's in. That's really intense. Whereas every time you can not do that - say this can't be null, so instead of WeakPtr, you're using SafeRef; this can't be not here or null, actually - WeakPtr can just be straight up null - this is always present - then you don't have a split in your timeline, and that makes it a lot easier to understand what your software is doing. And so `raw_ptr` followed this historical precedent. It lets you have a null value inside of it. And `raw_ref` is our kind of modern answer to this new take on nullability. And so `raw_ref` is a reference wrapper, meaning it holds a reference inside of it, conceptually, meaning it just can't be null. That is just basically - it's a pointer, but it can't be null.
33:24 SHARON: So these do sound the most straightforward to use. So basically, if you're not sure - for your class members at least - any time you would use a native pointer or an ampersand, basically you should always just put those in either a `raw_ptr` or a `raw_ref`, right?
33:45 DANA: Yeah, that's what our style guide recommends, with one nuance. So because `raw_ptr` and `raw_ref` interact with the memory allocator, they have the ability to be turned on or off dynamically at runtime. And there's a performance hit in keeping this reference count around. And so at the moment, they are not turned on in the renderer process, because it's a really performance-critical place. And the impact of security bugs there is a little less than in the browser process, where you just immediately get access to the whole system. And so we're working on turning it on there. But if you're writing code that's only in the renderer process, then there's no point in using it. And we don't recommend that you use it. But the default rule is yes. Don't use a native pointer, don't use a native reference. As a field in an object, use a `raw_ptr`, use a `raw_ref`. Prefer `raw_ref` - prefer something with fewer states, always, because you get fewer branches in your timeline. And then you can make it `const` if you don't want it to be able to rebind to a new object, if you don't want the pointer to change. Or you can make it mutable if you want to be able to.
34:58 SHARON: So you did mention that these types are ref counted, but earlier you said that you should avoid ref counting things. So -
35:04 DANA: Yes.
35:11 SHARON: So what's the balance there? Is it because with a `scoped_refptr`, you're a bit more involved in the ref counting, or is it just, this is - we've done it for you, you can use it, this is OK?
35:19 DANA: No, this is a really good question. Thank you for asking that. So there are two kinds of ref counts going on here. I tried to kind of allude to it, but it's great to make it clear. So `scoped_refptr` is a strong ref count, meaning the ref count owns the object. So the destructor runs, and the object is gone and deleted, when that ref count goes to 0. `raw_ref` and `raw_ptr` are a weak ref count. They could even be pointing to something owned in a `scoped_refptr`. So they can exist at the same time. You can have both kinds of ref counts going at the same time. A weak ref count, in this case, is holding the memory alive so that it doesn't get reused. But it's not keeping the object in that memory alive. And so from a programming-state point of view, the weak refs don't matter. They're helping protect you from security bugs. When things go wrong, when a bug happens, they're helping to make it less impactful. But they don't change your program in a visible way. Whereas strong references do. That destructor's timing is based on when the ref count goes to 0 for a strong reference. So that's the difference between these two.
36:46 SHARON: So when you say don't use ref counting, you mean don't use strong ref counting.
36:46 DANA: I do, yes.
36:51 SHARON: And if you want to learn more about the raw pointer types,
raw_ptr and raw_ref, that's all part of the MiraclePtr project, and there's a
talk about that from BlinkOn. I'll link that below also. So in terms of other
base types, there's a new one that's called base::expected. I haven't even
really seen this around. So can you tell us a bit more about how we use that,
and what that's for?
37:09 DANA: base::expected is a backport from C++23, I want to say. The
proposal for std::expected actually cites a Rust type as inspiration, which is
called Result in Rust. And it's a lot like optional, so it's used for return
values. And it's more or less kind of a replacement for exceptions. Chrome
doesn't even compile with exceptions enabled, so we've never relied on
exceptions to report errors. But we have to do complicated things, like using
optional to return a bool or an enum, and then maybe some value. And so this
kind of compresses all that down into a single type, but it's got more state
than just an optional. So expected gives you two choices: it either returns
your value, like optional can, or it returns an error. And so that's the
difference between optional and expected - you can give a full error type. And
so this is really useful when you want to give more context on what went
wrong, or why you're not returning the value. So it makes a lot of sense in
stuff like file IO. You're opening a file, and it can fail for various
reasons, like I don't have permission, it doesn't exist, whatever. And so in
that case, the way you would express that in a modern way would be to return a
base::expected of your file handle or file class, and as an error, some
enumerator, perhaps, or even an object that has additional state beyond just
"I couldn't open the file" - maybe a string about why you couldn't open the
file, or something like this. And so it gives you a way to return a structured
error result.
39:05 SHARON: That sounds useful in lots of cases. So all of these types are making up for what is basically lacking in C++, which is memory safety. C++ does a lot. It's been around for a long time. Most of Chrome is written in it. But there are all these memory issues, and a lot of our security bugs are a result of this. So you are working on bringing Rust to Chromium. Why is that a good next step? Why does that solve these problems we're currently facing?
39:33 DANA: So Rust has some very cool properties to it. Its first property that is really important to this conversation is the way that it handles pointers, which in Rust would be treated pretty much exclusively as references. And what Rust does is it requires you to tell the compiler the relationships between the lifetimes of your references. And the outcome of this additional knowledge to the compiler is memory safety. And so what does that mean? It means that you can't write a Use-After-Free bug in Rust unless you're going into the unsafe part of the language, which is where scariness exists. But you don't need to go there to write a normal program. So we'll ignore it. And so what that means is you can't write the bug. And that doesn't just apply to me - I like to believe I can write C++ without a bug; that's not true, but I would love to believe that. It means that later, when I come back and refactor my code, or someone comes who's never seen this before and fixes some random bug somewhere related to it, they can't introduce a Use-After-Free either. Because if they do, the compiler is like, hey - this reference is going to outlive the thing it points to. You can't use it. Sorry. And so there's this whole class of bugs that you never have to debug, you never ship, they never affect users. And so this is a really nice promise, really appealing for a piece of software like Chrome, where our basic purpose is to handle arbitrary and adversarial data. You want to be able to go on some web page, maybe it's hostile, maybe not. You just get a link. You want to be able to click that link and trust that even if it's really hostile and wanting to destroy you, it can't. Chrome is that safety net for you. And so Rust is that kind of safety net for our code, to say no matter how you change it over time, it's got your back. You can't introduce this kind of bug.
42:03 SHARON: So this Rust project sounds really cool. If people want to learn more or get involved - if you're into the whole languages, memory safety kind of thing - where can people go to learn more?
42:09 DANA: So if you're interested in helping out with our Rust experiment, then you can look for us in the Rust channel on Slack. If you're interested in C++ language stuff, you can find us in the CXX channel on Slack, as well as the [email protected] mailing list. And there is, of course, the [email protected] mailing list if you want to use email to reach us as well.
42:44 SHARON: Thank you very much, Dana. There will be notes from all of this also linked in the description box. And thank you very much for this first episode.
42:52 DANA: Thanks, Sharon. This was fun.