Reading for 11/2: SELF #411

AliceSzzze · 2023-10-31T04:33:46Z

Hi all! Here is the discussion thread for An Efficient Implementation of SELF, a Dynamically-Typed Object-Oriented Language Based on Prototypes! Looking forward to all your thoughts and comments :)

stephenverderame · 2023-11-01T23:02:10Z

A common complaint with template instantiation in languages like C++ is the code bloat, and it sounds like SELF is essentially doing a similar thing with compiling special versions of methods for every receiver. I guess it's not as bad as C++ since message splitting would allow the compiler to reuse parts of the different methods, but at its core, SELF seems to be generating a lot of copies of methods. The paper does mention employing some limitations on inlining based on method or block size which does help prevent a combinatorial explosion for methods that might themselves call a bunch of other methods, but other than that, it doesn't seem like the authors address this too much.

From a quick glance, it seems that more recent JITs employ a set of heuristics for determining when to inline things, reuse parts of traces, and compile code to avoid excessive code bloat.

Overall, it seems (from a very cursory survey) that code bloat in language mechanisms like C++ templates gets much more "press coverage" than potentially duplicated bits of code in JITs. I'm not sure if that's because the contexts in which C++ is generally used care more about this type of thing, or because the primary concern is executable size. Or maybe it's not too much of a concern with JITs because they might only compile function and argument-type pairs that are hot, or only small, potentially reusable bits of a function.

A final unrelated thought is that I found their invention of a new metric (MiMs) kind of funny, especially being pretty unfamiliar with message-passing paradigms. They didn't really use it to compare to other compilers, so I suppose it wasn't really concerning. What was more concerning was the comparisons of wall clock and CPU time, despite their assertions that the comparison is valid. If the wall clock and CPU times were basically identical for the SELF benchmarks as they claim, then why not put them in the paper?

3 replies

willwng Nov 2, 2023

I think you make a great point - I would like to have seen some more benchmarks, especially ones showing how performance scales as the number of objects/types of objects increases. I would assume that most of the benchmarks (the Stanford integer benchmarks), with the exception of the Richards benchmark, don't really stress the number of objects. If the "integer" operations are always "hot", then this is probably the best-case scenario for message splitting and inlining.

I'm also confused about why only the wall/real time was used for the Smalltalk benchmarks - I'm a little more skeptical since they are using transliterations for the Smalltalk and SELF benchmarks.

collinzrj Nov 2, 2023

One technique they apply to reduce the code size is to compile customized code lazily. At the initial compilation, only code blocks whose types have been inferred get multiple versions. Then, at run time, the SELF compiler traces the execution of code blocks, keeps track of the type information, and uses that information to produce more efficient code.
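
To make the "customize lazily" idea concrete, here is a minimal sketch in Python. It is not the paper's implementation: "compiling" is simulated by building a closure, and the names `compiled`, `compile_for`, and `send` are all hypothetical. The point is only that a specialized version exists for a (method, receiver type) pair once that pair has actually been executed.

```python
# Hypothetical sketch: "compilation" is simulated by returning a closure.
compiled = {}          # (method name, receiver type) -> specialized code
compile_count = 0      # how many specialized versions were ever produced


def compile_for(name, receiver_type):
    """Pretend to compile a version of `name` customized to one receiver type."""
    global compile_count
    compile_count += 1
    if name == "double":
        if receiver_type is int:
            return lambda x: x << 1      # integer-only fast path
        return lambda x: x + x           # generic path for other types
    raise AttributeError(name)


def send(name, receiver):
    """Dispatch a message, compiling the customized version on first use."""
    key = (name, type(receiver))
    if key not in compiled:
        compiled[key] = compile_for(name, type(receiver))
    return compiled[key](receiver)


results = [send("double", 21), send("double", 4), send("double", "ab")]
```

Only two versions are produced here (one for `int`, one for `str`), even though three sends occur; receiver types that never show up at run time cost nothing.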

sampsyo Nov 2, 2023
Maintainer

To state the maybe-obvious, I can think of a couple of straightforward reasons why the code-size expansion of specialization in a JIT is different from copious C++ template monomorphization:

- It can plausibly only affect the code that is actually executed, rather than conservatively expanding everything up front. This is more or less what @collinzrj said.
- It only affects the code size in memory, not on disk. So the downsides are (1) just plain larger memory footprint for the process, and (2) instruction cache pressure. (1) seems unlikely to be a big deal compared to the heap. (2) is probably the more important one and the reason why heuristics are required to avoid over-specialization.

keikun555 · 2023-11-02T00:51:00Z

In the conclusions the authors write:

> Researchers seeking to improve performance should improve their compilers instead of compromising their languages.

How much do we agree with this statement?

6 replies

vivianyyd Nov 2, 2023

The last bit of this comment reminds me of a talk I heard about work on the OCaml compiler by JS, which would allow users to specify memory layout (kinds) and whether values can be stack allocated (modes). Their goal was to make these features opt-in, so code written by the typical user just looks "normal".

I guess this is different from what the authors of this paper are aiming for, which is performance while specifically keeping things we want from dynamically-typed languages. But is there any opportunity for taking a similar "opt-in to more restrictions" approach for the problems the authors try to solve?

zachary-kent Nov 2, 2023

I don't think I really agree with this, or with the insinuation the authors make that statically typed languages somehow "compromise" flexibility. Designing languages (and type systems) around performance, like affine types in Rust, has obviously seen great success.

vivianyyd Nov 2, 2023

That may be true, but I think one of the reasons Python, for example, is quite popular is how straightforward it is to write (no type annotations). I think having to type less is always going to be a win for somebody.

sampsyo Nov 2, 2023
Maintainer

Thanks for highlighting this quote! I think it is a really under-emphasized notion that, perhaps especially today, seems downright radical. I think it's really easy to see the downsides & limitations of this philosophy, but the fundamental trends behind the ideology are also pretty clear: computers keep getting faster, and the kinds of people who want to program them keep expanding. The radical notion is that we should be thinking first and foremost about how to meet people where they are to let them express what they want to write, and then take on the job of making that as efficient as we can.

In an indirect way, it reminds me a little bit about the economic reason why so many big companies have compiler teams, even if they are not compiler companies. As just one example, Stripe has a compiler team that makes a type checker for Ruby. In broad strokes, you can imagine Stripe making a choice between two options with the same outcome:

1. Ask everyone in the entire company working on backend stuff to just work harder to make their code more correct/secure.
2. Develop one reusable tool that can enhance the productivity/correctness/security of all backend teams' work.

While it's hard to compare, it seems totally obvious that—at least at some level of correctness objective—option 2 is a way cheaper route to the same outcome. The Ruby language is a "narrow waist" where cross-cutting improvements can be implemented. In this case, those improvements are in the form of a helpful correctness tool, but exactly the same economic trade-off exists for performance (e.g., why Meta has a substantial compiler team making the Hack VM go faster).

I think this proposition is similar to the Self paper's ideology because it suggests that we try to make tools that work better for programmers, rather than making programmers work to adapt to their tools. Of course, the Self philosophy as stated is more absolutist: never compromise the developer's experience in exchange for anything. But aspects of this philosophy seem very powerful to me.

rcplane Nov 2, 2023

These sentiments also apply to recent developments of generative AI chat and large language models (LLMs).
Avoiding compromise in language usability and user onboarding is an integral part of the rising popularity of generative AI chat tools, which allow more people to quickly and easily generate short programs or even entire web applications that closely follow common patterns encountered during model training. Accepting hand drawings, speech-to-text, and natural English-language phrases as programming inputs clearly increases the flexibility of user input compared to a standard IDE compilation toolchain.
Compiler improvement for performance using LLMs has so far focused more on selecting compiler options than on direct code generation, because LLM output is non-deterministic and prone to hallucination. Direct-generation work like AlphaCode and AlphaTensor can find faster programs, although slowly and in more limited settings than traditional compilation infrastructure. Arguably the largest code performance push using LLMs, and machine learning more generally, directly targets chip design to accelerate LLM inference. While this is not specifically an improvement to a compiler, compiler performance simulations can be used in the process, so existing compilers are used as fully as possible for optimization.

ryanwmao · 2023-11-02T02:08:35Z

The authors seem to tackle the challenge of implementing a dynamically-typed language efficiently, which is no small feat.
I thought that it was especially interesting to see a focus on prototypes as a basis for the language. How might this have impacted the design and usage of objects in SELF? What advantages and disadvantages does this paradigm offer in terms of expressiveness and ease of development?

5 replies

MelindaFang-code Nov 2, 2023

I was also wondering the same thing. It seems that prototype-based languages are seldom used in real life (as far as I know, only JavaScript is inspired by SELF). I feel like prototype-based models allow people to implement class-like behavior, but with fewer restrictions. In essence, an object is just an array of slots, and some slots are marked as parent slots if we want to define some relationship.
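
Here is a rough sketch of that "slots plus parent slots" picture in Python, under simplifying assumptions: the `Obj` class and its `lookup`, `send`, and `clone` methods are made-up names, and the parent search is a plain depth-first walk rather than SELF's actual lookup rules.

```python
class Obj:
    """An object is a bag of named slots; some slots are marked as parents."""

    def __init__(self, slots=None, parents=None):
        self.slots = dict(slots or {})
        self.parents = dict(parents or {})   # parent slots: name -> Obj

    def lookup(self, name):
        """Search own slots first, then each parent in turn (depth-first)."""
        if name in self.slots:
            return self.slots[name]
        for parent in self.parents.values():
            try:
                return parent.lookup(name)
            except KeyError:
                pass
        raise KeyError(name)

    def send(self, name, *args):
        """Data slots answer their value; method slots run with the receiver."""
        value = self.lookup(name)
        return value(self, *args) if callable(value) else value

    def clone(self):
        """New objects come from copying an existing one; there are no classes."""
        return Obj(self.slots, self.parents)


# Shared behavior lives in an ordinary object that others name as a parent.
point_traits = Obj({"sum": lambda self: self.send("x") + self.send("y")})
point = Obj({"x": 0, "y": 0}, parents={"parent": point_traits})

p = point.clone()
p.slots["x"], p.slots["y"] = 3, 4
```

Sending `sum` to `p` finds the method in the parent but runs it against `p`'s own `x` and `y`, which is the class-like behavior without an actual class.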

collinzrj Nov 2, 2023

I think prototypes are actually very popular in current programming languages. In Java, a class can inherit from a prototype; in Swift, a class can implement a protocol, which I think is basically a prototype. In Rust, a struct implements a trait instead of inheriting from a class. Nested inheritance also makes code complicated, and there are actually some discussions arguing against inheritance.

Enochen Nov 2, 2023

I'm not familiar with Swift, but I don't see the prototype object model being used in either Java or Rust. The distinction the paper makes about the fact that SELF is based on the prototype object model is that it has no classes. SELF clearly supports inheritance via its prototype model, just from a parent object rather than a parent class (which is arguably messier than class-based inheritance, an already messy concept due to the nested inheritance problem mentioned). I really like Rust's way of dealing with this problem (with traits), but I don't understand how it would be similar to SELF's philosophy/paradigm (to me they are polar opposites).

Another top-level comment mentioned the paper's stance of not trying to "compromise their language". In this case dealing with prototypes certainly allows for greater degrees of freedom if the programmer knows what they are doing. You are in a way encouraged to make all these "spin-off" objects with their own random properties, and everything should still work so long as you as a programmer can keep everything in your head and never make mistakes.

As probably the biggest contemporary language of this "class", JavaScript is incredibly flexible because of this kind of design, but comes with the tradeoff of complexity, such as when there are these long prototype chains that compose the object you're dealing with. In my own experience I see most people ignoring that language feature as much as possible when they are working in the JS ecosystem, especially once type systems are involved.

sampsyo Nov 2, 2023
Maintainer

I do agree that comparing Rust traits or Swift protocols to prototype-based inheritance may be a strained analogy. JavaScript is certainly the standard-bearer for Self-style prototypes, and even it has classes these days (presumably because prototypes, while powerful, are actually kind of confusing for people).

> As probably the biggest contemporary language of this "class",

I see what you did there. 😂

collinzrj Nov 2, 2023

I think I see the problem: a prototype in Java seems to be a different thing from a prototype in SELF.

matth2k · 2023-11-02T02:36:49Z

Wow this paper was very disorienting for me to read, because I did not know what Smalltalk is.

> Also unlike Smalltalk, SELF accesses state solely by sending messages; there is no special syntax for accessing a variable or changing its value.

At this point, I was already lost. The Wikipedia page for SELF definitely helped explain why prototype-based programming exists. Moreover, it helped elaborate how messages are a unified way to interact with both fields and methods:

> Note that there is no distinction in Self between fields and methods: everything is a slot. Since accessing slots via messages forms the majority of the syntax in Self, many messages are sent to "self", and the "self" can be left off (hence the name).

Hope these two points of clarification help. After this little background research, I was much better equipped to read the paper.

This paper is definitely very interesting because the whole programming paradigm and model of execution is very unusual to me.

2 replies

sampsyo Nov 2, 2023
Maintainer

Maybe an important bit of context that would help with grokking this paper is that Smalltalk/Self are from the era where object-oriented programming was, like, the hottest thing ever. In that sense, Self is an experiment in pure OOP: there is truly nothing you can do in the language that isn't a method call on an object. Heck, even an if statement is actually a method call on a boolean object! And you certainly can't access fields with some special non-method lookup construct; those are of course calls to accessor methods. Pure OOP was supposed to be the savior of all programmers.
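
As a small illustration of "an if statement is a method call on a boolean object", here is a sketch in Python. The class and method names (`TrueObj`, `FalseObj`, `if_true_if_false`, `less_than`, `smaller`) are invented for the example, and the lambdas stand in for blocks.

```python
class TrueObj:
    def if_true_if_false(self, then_block, else_block):
        # The true object always evaluates the first block.
        return then_block()


class FalseObj:
    def if_true_if_false(self, then_block, else_block):
        # The false object always evaluates the second block.
        return else_block()


def less_than(a, b):
    """Stand-in for a primitive comparison that answers a boolean *object*."""
    return TrueObj() if a < b else FalseObj()


def smaller(a, b):
    # No `if` statement here: the boolean receiver decides which block runs.
    return less_than(a, b).if_true_if_false(lambda: a, lambda: b)
```

Every conditional is then an ordinary dynamically dispatched send, which is why inlining sends on booleans matters so much for a compiler of such a language.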

jdroob Nov 2, 2023

Learning about SELF was also quite a paradigm shift for me, as all of my understanding of OOP up to this point used the idea of a class as a jumping-off point. It was (and still kind of is) hard for me to wrap my head around prototype-based programming; however, SELF's dynamic inheritance is fascinating. It's easy to see how SELF affords the programmer more flexibility.

Lastly, from section 8.2: Open Issues

> Finally, our implementation of type prediction hard-wires both the message names and the predicted type; a more dynamic implementation that used dynamic profile information or analysis of the SELF inheritance hierarchy might produce better, more adapting results.

From this very interesting talk by one of the authors, it looks like later implementations of SELF did use dynamic profile information to perform dynamic optimizations which is pretty cool to learn about, especially after the Dynamic Compilers lecture from Tuesday.

collinzrj · 2023-11-02T02:51:03Z

The message-passing style of object-oriented programming reminds me of Objective-C, which is also heavily influenced by Smalltalk. In my understanding, Objective-C is built on top of C, with the goal of adding object-oriented features to C. C is a static language, but Objective-C wants to add some dynamic features. In that case, message passing seems to be an interesting paradigm that allows us to have dynamic features in a static language. What are some other languages using message passing to implement their object-oriented features?

1 reply

sampsyo Nov 2, 2023
Maintainer

Very much so. All method calls (message sends) in ObjC are semantically dispatched based on the string name, which is very much a Smalltalk idea.

It's a little hard to identify exactly what is a message send and what is simply a virtual function call, but on one end of the spectrum we have Python (pretty message-send-y, stuff is semantically dispatched based on the object's attribute dictionary) and on the other end we have C++ virtual methods (clearly just an indirect call based on a vtable lookup).
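
For the "message-send-y" end of that spectrum, a small Python sketch: the `send` helper and the `Duck` and `Recorder` classes are hypothetical, while `getattr` and `__getattr__` are standard Python. The selector is looked up by name at call time, so a receiver can even respond to messages it never declared.

```python
class Duck:
    def speak(self):
        return "quack"


class Recorder:
    """Handles any message, even ones it never declared."""

    def __init__(self):
        self.log = []

    def __getattr__(self, name):
        # Python calls this only when normal attribute lookup fails.
        def handler(*args):
            self.log.append((name, args))
            return None
        return handler


def send(receiver, selector, *args):
    """Look the selector up by its string name at call time, then invoke it."""
    return getattr(receiver, selector)(*args)


recorder = Recorder()
spoken = send(Duck(), "speak")
send(recorder, "fly", 3)
```

A C++ virtual call cannot do what `Recorder` does: the set of callable methods is fixed by the static type, and the call is an indexed load from a table rather than a lookup by name.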

I don't know if this is straining the analogy too much, but how about the actor model as an extreme example of message-based calls, where the messages are possibly sent across a distributed system from machine to machine?

obhalerao · 2023-11-02T03:15:34Z

I echo the sentiment of not being familiar with Smalltalk, and as such not being familiar with this particular programming paradigm. Regardless, I found the authors' description of the prototype-based programming paradigm interesting to read through, despite not quite fully grasping it. I'm curious both about what its common use cases are in the modern day (if they exist) and, in general, about what would be a good way to better understand this paradigm.

2 replies

zachary-kent Nov 2, 2023

JavaScript is probably the most widely used prototype-oriented language

sampsyo Nov 2, 2023
Maintainer

The lessons from this paper, IMO, actually have very little to do with prototype-based inheritance. It is definitely true that JavaScript compilers must directly do something like the paper's approach to handling this. But the more important parts, and the reason this is on the 6120 reading list, apply to any dynamic language. Some examples include:

- the pointer tagging tricks (reusing the low-order bits that would otherwise go unused in pointers)
- type specialization ("customized compilation") and type prediction
- inlining as an extremely powerful and general way to eliminate "fundamental" dynamism overhead
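
The pointer tagging trick in the first item can be sketched as follows. This is a simulation on Python integers, not real machine words: the 2-bit tag width and the specific tag values are assumptions chosen for illustration, all helper names are hypothetical, and overflow is ignored because Python integers are unbounded.

```python
TAG_BITS = 2
TAG_MASK = (1 << TAG_BITS) - 1
INT_TAG = 0b00      # assumed tag value for small integers
PTR_TAG = 0b01      # assumed tag value for heap references


def box_int(n):
    """Store an integer in the upper bits, leaving the tag in the low bits."""
    return (n << TAG_BITS) | INT_TAG


def unbox_int(word):
    return word >> TAG_BITS


def box_ptr(address):
    """Word-aligned addresses have zero low bits, so the tag fits there."""
    assert (address & TAG_MASK) == 0, "address must be aligned"
    return address | PTR_TAG


def is_int(word):
    return (word & TAG_MASK) == INT_TAG


def add_tagged(a, b):
    """With a zero integer tag, two tagged ints can be added without untagging."""
    if is_int(a) and is_int(b):
        return a + b            # (x << 2) + (y << 2) == (x + y) << 2
    raise TypeError("slow path: an operand is not a small integer")


total = unbox_int(add_tagged(box_int(20), box_int(22)))
```

The type test is a mask and compare on the value itself, with no memory access, which is what makes it cheap enough to put in front of every arithmetic operation.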

alifarahbakhsh · 2023-11-02T03:59:35Z

The combination of customized compilation, inlining, and splitting gives rise to a powerful set of static techniques for optimizing performance. It raises the question of how one could leverage these techniques, presumably partially, with run-time information. Specifically, adding some instrumentation at compile time to gather profiling information at run time seems to me to be a good fit for the methods of this paper, especially since their compilation time could be a problem for big enough SELF programs. One can imagine carrying out some inlining, then gathering run-time information to, say, predict the receivers of messages more accurately, and then returning to do more inlining. It would be nice if the benchmarks covered compiling efficiency as well, but I guess they were not worried about that.

3 replies

sampsyo Nov 2, 2023
Maintainer

Indeed; this kind of thing is exactly where high-performance compilers for dynamic languages have gone since this paper. Profiling the run-time types of values turns out to be a great match for this kind of type-specializing compilation.

jiahanxie353 Nov 2, 2023

Agree, and I also thought about integrating runtime profiling information, which could take performance optimization to the next level. I feel like by using profiling and optimizing adaptively, we could, for instance, prioritize inlining for methods that are profiled to be hotspots during execution.

> It would be nice if the benchmarks covered compiling efficiency as well

Totally agree. I think it'd be beneficial to outline the compiling efficiency and consider the trade-offs between compilation time and runtime performance, particularly once profiling has been introduced.

As far as I know, programming languages research is pretty theoretical and mathy. So I was also wondering: when PL researchers design new languages or new language features, to what extent do they consider compilation efficiency (since compiling is more "downstream") and the systems perspective?

NgaiJustin Nov 2, 2023

Agree with Jiahan here. It is interesting to think about the implications of compilation time. In the context of AOT compilation, sacrifices can be made on compilation time for better runtime performance. However, it is also important to consider developer efficiency. Especially for large programs, compilation time plays a crucial role in the overall development process, and lengthy compilation times can be frustrating and hinder productivity. This is perhaps a problem space that incremental compilers try to tackle.

he-andy · 2023-11-02T06:01:39Z

When reading this paper, I thought about what advantages a prototype-based language like SELF has over a more classical object-oriented programming language like C++, as the language design itself seems very different from anything else that is taught in university. Some exploration led me to the "fragile base class problem," where, in complicated inheritance structures, seemingly innocuous modifications to base classes may cause the derived classes to malfunction. This is especially important in legacy codebases where refactoring is in order, but a break to the base classes could be costly and cause a much more intrusive refactor/rewrite. With dynamic linking, under prototype-based languages, we can change superclasses without affecting pre-compiled binaries, and overall, inheritance decisions are more flexible because they can be made at run time. I've heard the quote that "all problems in computer science can be solved by another level of indirection" -- and I wonder how much that applies in such a case comparing SELF to a language like C++? Is this concept still relevant under these dynamically compiled languages?
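
For anyone who has not run into the fragile base class problem, here is a small made-up example in Python (the `Bag`, `BagV2`, and `CountingBag` names are hypothetical). `BagV2` stands in for a later, seemingly harmless revision of the base class: it produces the same contents but no longer routes `add_all` through `add`.

```python
class Bag:
    def __init__(self):
        self.items = []

    def add(self, item):
        self.items.append(item)

    def add_all(self, items):
        for item in items:
            self.add(item)            # version 1: implemented via add()


class BagV2(Bag):
    """Stand-in for a revised base class with an 'equivalent' add_all."""

    def add_all(self, items):
        self.items.extend(items)      # version 2: no longer calls add()


def make_counting(base):
    """Define the same subclass against whichever base version is in use."""
    class CountingBag(base):
        def __init__(self):
            super().__init__()
            self.count = 0

        def add(self, item):
            self.count += 1           # relies on add_all calling add()
            super().add(item)

    return CountingBag


def count_after_add_all(base):
    bag = make_counting(base)()
    bag.add_all([1, 2, 3])
    return bag.count
```

Against `Bag` the subclass counts 3 items; against `BagV2` it counts 0, even though the subclass source did not change and the base class still stores the same items.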

4 replies

emwangs Nov 2, 2023

Yeah, I think there's a lot of interesting discussion about how technical debt can actually manifest. I guess the one clear reason why someone might choose a compiled language over a dynamic language is that the resulting code runs faster, but dynamic languages have a fairly important advantage in that developer velocity is much faster. I feel like it is a reasonable tradeoff between the two; the strength of a compiled OOP language is that a strong and well-designed inheritance structure can make things easier in the future, but coming up with that design requires more initial time investment, and if it doesn't work out, then it's much more costly to change (imagine if I had a complicated inheritance structure that all extended off of one Base at the top, then I'd be more vulnerable to the "fragile base class problem" you mentioned). Interesting point to think about!

rcplane Nov 2, 2023

This comparison between dynamic method overloading decisions in SELF and C++ reminds me of how Python is implemented on top of a C API, so that programmers only have to worry about high-level semantics in Python, including lots of dynamic behaviors, while the low-level implementation handles dispatch through indirection that actually looks up method names in object dictionaries. The popularity of Python for scientific and numerical computing may indicate a solution arrived at by adding indirection.
Another contrasting point in the design space is RPython, a subset of Python amenable to static analysis, which specifically aims to remove dynamic ambiguities for the sake of PyPy's performance improvements.

SanjitBasker Nov 2, 2023

I echo the same sentiment as @emwangs on the tradeoffs involved. I have had some good experiences using linters/static analyzers with Python, and I think that a lot of the advantages/disadvantages are realized based on programmers' practices with the language (you can write some very robust Python code with the right tools, but you can also write some very unsafe Java/C++ code if you use reflection mechanisms and such).

evanmwilliams Nov 2, 2023

I also think it's interesting how many optimizations the writers needed to implement to try and achieve the same performance gains that languages that are implemented statically have. These were quite interesting optimizations (I particularly enjoyed reading about inline caching), but it does signal that dynamically typed languages come with a trade off.
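
For readers who have not seen inline caching before, here is a rough single-entry (monomorphic) version in Python. It is only a sketch of the general idea, not the mechanism from the paper: `CallSite` is a hypothetical name, and Python's `getattr` on the receiver's type stands in for the full method lookup.

```python
class CallSite:
    """One message-send site with a single-entry inline cache."""

    def __init__(self, selector):
        self.selector = selector
        self.cached_type = None
        self.cached_method = None
        self.misses = 0

    def send(self, receiver, *args):
        receiver_type = type(receiver)
        if receiver_type is not self.cached_type:
            # Miss: do the full lookup and remember the result for next time.
            self.misses += 1
            self.cached_type = receiver_type
            self.cached_method = getattr(receiver_type, self.selector)
        # Hit (or freshly filled): call the remembered method directly.
        return self.cached_method(receiver, *args)


site = CallSite("upper")
outputs = [site.send(s) for s in ["a", "b", "c"]]
misses_after_strings = site.misses
bytes_output = site.send(b"x")      # a different receiver type forces a miss
```

The bet is that most call sites see the same receiver type over and over, so the lookup cost is paid once and every later send is a type check plus a direct call.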

It was also interesting to think about the context of this paper. OOP languages have been such a fundamental tool for the past 40 years, and it seems like languages nowadays are just starting to trend away from them. Like Andy mentioned with the fragile base class problem, there are a lot of things that object-oriented programming forces programmers into that can lead to either leaky abstractions or other unforeseen bugs. OOP still has a strong influence on a lot of other languages, though, showing why it's important to understand the different ways to implement it well.

xalbt · 2023-11-02T06:46:51Z

Like many of the other commenters, I found SELF to be really fascinating as a language. It has inspired many of the most widely used languages and language frameworks today, like JavaScript and some implementations of the JVM. I think one really fascinating aspect of SELF is its emphasis on an "exploratory programming environment," where programmers are encouraged to think of the language and IDE as one entity. It allowed programmers to directly modify objects while the program was running, both to explore the program more and to help with debugging. (The paper also has sections about directly supporting this with incremental recompilation, while still maintaining good speed and low memory usage.) This feature seems to be quite rare and unique, but extremely useful, in today's languages. Most debuggers allow for some minimal version of this, where you can stop and step through code, execute arbitrary expressions, and modify program state, but you usually aren't able to modify the code directly without rerunning (and potentially recompiling) the entire program. I think the JavaScript web console and interactive notebooks (like Jupyter) also capture some of this feature, but not completely. I was wondering why this feature is no longer part of modern languages. Is it because the existing debuggers and tools are enough? Or is the feature prohibitively expensive or infeasible to implement while still being efficient? Or is it simply not useful enough?

2 replies

bcarlet Nov 2, 2023

As someone who is largely unconvinced by SELF's other supposed benefits, I totally agree that this is one area that seems really useful. When debugging (or just trying to understand) code, you'll often want to (dynamically) modify the code itself in nontrivial ways. While this may be possible in modern dynamic languages, it's clearly not built into the language in the same way it is in SELF. As for why this feature isn't common today, I wonder if part of the problem is that such a feature would be really easy to abuse if used outside of an exploratory context, ultimately leading to software that's hard to understand. From a language design perspective, it might be a bit like including goto in your language. It gives the programmer a ton of control and flexibility, but you know that if you include it in your language then people are going to do horrible things with it.

sampsyo Nov 2, 2023
Maintainer

It's absolutely true that this specific part of the Smalltalk/Self vision (such deep integration of the language and the GUI IDE that they are basically inextricable from each other) has really not survived to the present day. The original vision was very different from the way we think of IDEs today: there is the program, and there is the IDE that helps you edit the program, and eventually you intend to run the program (but the IDE is no longer involved at that point). In the Smalltalk paradigm, the program basically runs in the IDE. There are graphical representations of things like objects in the program's heap. It's a wild world.

Check out Squeak for a version of this idea that you can run on modern systems.

20ashah · 2023-11-02T08:28:35Z

As someone who has primarily worked in more rigid, statically typed languages such as Java, I found SELF to be really interesting and a completely different way of thinking than I am used to. Being able to adapt when the structure of objects and the class hierarchy are expected to change dynamically at run time is a really cool idea; however, I am having some trouble thinking about it in a more concrete setting. What is a specific example where the class hierarchy is not predefined, and the structure of objects can dynamically change during runtime? Initially I was thinking of some sort of UI framework where the structure of graphics elements can dynamically change based on certain actions that the user performs.

1 reply

sampsyo Nov 2, 2023
Maintainer

Perhaps one inheritor of the idea that regular programs should exploit extreme language dynamism is Racket, which embraces a lot of run-time metaprogramming (together with an advanced macro system) to build up whole new mini-languages for specific purposes embedded in a single programming environment.

rcplane · 2023-11-02T12:09:56Z

The authors' efficiency improvements reminded me a lot of recent Python 3.11 changes to reduce frame generation and customize operations for execution speedup. Incidentally, the changelog makes note of a new "Self" type annotation from PEP 673, reinforcing indications that the concerns of the reading are really quite persistent.

0 replies
