Commit

pod/perlguts pod/perlhacktips - various updates and new content
bulk88 committed Jan 1, 2025
1 parent 8f5aa22 commit c57bbf2
Showing 2 changed files with 181 additions and 23 deletions.
45 changes: 37 additions & 8 deletions pod/perlguts.pod
Expand Up @@ -60,6 +60,8 @@ may not be usable in all circumstances.
A numeric constant can be specified with L<perlapi/C<INT16_C>>,
L<perlapi/C<UINTMAX_C>>, and similar.

See also L<perlhacktips/"Portability problems">.

=for apidoc_section $integer
=for apidoc Ayh ||IV
=for apidoc_item ||I8
Expand Down Expand Up @@ -2943,8 +2945,32 @@ The context-free version of Perl_warner is called
Perl_warner_nocontext, and does not take the extra argument. Instead
it does C<dTHX;> to get the context from thread-local storage. We
C<#define warner Perl_warner_nocontext> so that extensions get source
compatibility at the expense of performance. (Passing an arg is
cheaper than grabbing it from thread-local storage.)
compatibility at the expense of performance. Passing an arg is
much cheaper than fetching it from the OS's thread-local storage
API with a function call.

But consider this: if there is a choice between C<Perl_croak> and
C<Perl_croak_nocontext>, which one do you pick? Which one is
more efficient? Is it even possible to make the C<if(assert_failed)> test true
and enter the conditional branch that calls C<Perl_croak>?

Maybe only from a test file. Maybe not. Your C<Perl_croak> branch is probably
unreachable until you add a new bug, so the performance of
C<Perl_croak_nocontext> compared to C<Perl_croak> doesn't matter. The C<dTHX;>
call inside the slower C<Perl_croak_nocontext> will never execute in anyone's
normal control flow. If the error branch never executes, optimize what does
execute. By removing the C<aTHX> arg, you save 4-12 bytes of space and 1-3 CPU
instructions on a cold branch, because the call expression invoking
C<Perl_croak_nocontext> passes one less argument than the one invoking
C<Perl_croak>. The CPU has less to jump over now.

The rationale that C<Perl_croak_nocontext> is better than C<Perl_croak> applies
only to C<Perl_croak>, and nowhere else except for the deprecated
L<perlapi/"Perl_die_nocontext"> and L<perlapi/"Perl_die"> pair and, as a third
case, L<perlapi/"Perl_warn">. Whether it applies to L<perlapi/"Perl_warn"> is
debatable.

It doesn't apply to C<Perl_form>, C<Perl_mess>, or the PP keyword die, implemented
as C<Perl_op_die(OP * op)>, which could be normal control flow.
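
The cold-branch argument can be sketched in plain C. This is a hypothetical stand-in, not Perl's implementation: C<interp_t>, C<set_current_interp>, C<fail_ctx>, C<fail_nocontext> and C<checked_div> are invented names, C11 C<_Thread_local> stands in for the thread-local lookup that C<dTHX;> performs, and unlike a real croak the failure function returns to its caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PerlInterpreter. */
typedef struct interp { int error_count; } interp_t;

/* Stand-in for the thread-local slot that dTHX would read. */
static _Thread_local interp_t *current_interp;

void set_current_interp(interp_t *it) { current_interp = it; }

/* Context-taking variant: every call site must pass the pointer. */
void fail_ctx(interp_t *my_interp, const char *msg) {
    (void)msg;                      /* a real croak would format and longjmp */
    my_interp->error_count++;
}

/* Context-free variant: one less argument at every call site. The
 * lookup cost is paid only if the error path actually runs. */
void fail_nocontext(const char *msg) {
    interp_t *my_interp = current_interp;   /* the "dTHX" step */
    fail_ctx(my_interp, msg);
}

/* Hot function with a cold error branch: the shorter call costs
 * nothing on the path that normally executes. */
int checked_div(int a, int b, int *out) {
    if (b == 0) {
        fail_nocontext("division by zero");
        return -1;
    }
    *out = a / b;
    return 0;
}
```

In this sketch the thread-local read happens only inside C<fail_nocontext>, so a caller that never divides by zero never pays for it.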

You can ignore [pad]THXx when browsing the Perl headers/sources.
Those are strictly for use within the core. Extensions and embedders
Expand All @@ -2971,11 +2997,12 @@ argument somehow. The kicker is that you will need to write it in
such a way that the extension still compiles when Perl hasn't been
built with MULTIPLICITY enabled.

There are three ways to do this. First, the easy but inefficient way,
which is also the default, in order to maintain source compatibility
with extensions: whenever F<XSUB.h> is #included, it redefines the aTHX
and aTHX_ macros to call a function that will return the context.
Thus, something like:
There are three ways to do this. The first and easiest way is to use Perl's legacy
code compatibility layer, which is also the default. Production grade code
and code intended for CPAN should never use this mode. To maintain
source compatibility with very old extensions, whenever F<XSUB.h> is #included,
it redefines the aTHX and aTHX_ macros to call a function that will return the
context. Thus, something like:

sv_setiv(sv, num);

Expand All @@ -2990,7 +3017,9 @@ or to this otherwise:

You don't have to do anything new in your extension to get this; since
the Perl library provides Perl_get_context(), it will all just
work.
work, but each XSUB will be much slower. Benchmarks have shown that using the
compatibility layer and Perl_get_context() takes 3x more wall time in the best
case, and 8.5x in the worst case.
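
Why the compatibility layer costs more can be illustrated with a small standalone C sketch. Everything here is hypothetical and only mimics the shape of the mechanism: C<get_context()> stands in for Perl_get_context(), C<CTX_COMPAT> for a redefined aTHX, and C<set_iv_ctx> for a context-taking API function. The counter makes the number of lookups visible.

```c
#include <assert.h>

typedef struct interp { long last_iv; } interp_t;   /* hypothetical */

static interp_t the_interp;
static int context_fetches;        /* counts simulated context lookups */

/* Stand-in for Perl_get_context(): a function call per lookup. */
static interp_t *get_context(void) {
    context_fetches++;
    return &the_interp;
}

/* The underlying function always takes the context as first arg. */
static void set_iv_ctx(interp_t *my_interp, long num) {
    my_interp->last_iv = num;
}

/* Compatibility mode: the context macro expands to a function call,
 * so every API call pays for its own lookup. */
#define CTX_COMPAT get_context()
#define set_iv_compat(num) set_iv_ctx(CTX_COMPAT, (num))

/* Returns how many lookups three API calls cost in compat mode. */
int three_calls_compat(void) {
    int before = context_fetches;
    set_iv_compat(1);
    set_iv_compat(2);
    set_iv_compat(3);
    return context_fetches - before;
}

/* Explicit mode: fetch once, then pass the pointer along. */
int three_calls_explicit(void) {
    int before = context_fetches;
    interp_t *my_interp = get_context();
    set_iv_ctx(my_interp, 1);
    set_iv_ctx(my_interp, 2);
    set_iv_ctx(my_interp, 3);
    return context_fetches - before;
}

long last_iv(void) { return the_interp.last_iv; }
```

The sketch only counts lookups; it does not try to reproduce the wall time figures quoted above.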

The second, more efficient way is to use the following template for
your Foo.xs:
159 changes: 144 additions & 15 deletions pod/perlhacktips.pod
Expand Up @@ -53,25 +53,101 @@ supported"> for further discussion about context.

Not compiling with -DDEBUGGING

The DEBUGGING define exposes more code to the compiler, therefore more
ways for things to go wrong. You should try it.
The DEBUGGING define exposes more code to the compiler and turns on Perl's
asserts, so there are more ways for things to go wrong. A Perl built with
the C<DEBUGGING> define will be visibly slower in the shell and every other
subsystem. C<DEBUGGING> is only for development of XS modules or core code,
never for production use, but its maximum error checking is crucial for
good new code. You should try it.

=item *

Introducing (non-read-only) globals

Do not introduce any modifiable globals, truly global or file static.
They are bad form and complicate multithreading and other forms of
concurrency. The right way is to introduce them as new interpreter
variables, see F<intrpvar.h> (at the very end for binary
compatibility).
Introducing (non-read-only) globals and statics

Do not introduce any modifiable C globals, whether truly global variables
declared with C<extern> visibility or per C file globals declared with C<static>
visibility. They are bad form, are not memory safe, and complicate multithreading
and other forms of concurrency. XS modules have a dedicated, simple API to create
their own thread-safe global variables, see
L<perlxs/Safely Storing Static Data in XS>. But the interpreter core can't use
that API.

The interpreter currently does not use any atomic intrinsic functions offered
by a C compiler. Instead, Perl's thread-safe serialization is done with an
internal API with names like C<MUTEX_INIT()> and C<MUTEX_LOCK()>.

Historically, atomic operations didn't exist on most CPU archs that Perl runs on.
Where they existed, atomic APIs were always OS and vendor specific, and never
portable.

As of 5.35.5, perl dropped support for a strict C89 compiler and moved to
a minimum requirement of C89+some C99. See L</C99>. C11 standardized some
atomics for the first time in the optionally implemented C<stdatomic.h>.
Patches are welcome to add a portable atomic API, with fallbacks to
C<MUTEX_LOCK()>.

The right way to introduce a new C global variable will usually be to add
it as a new interpreter variable. See F<intrpvar.h>. Since 5.10.0, adding,
removing, or changing the size of any interpreter variable is not supported
and is undefined behavior. Recompiling XS modules is required.

There are some loopholes to this policy if you are writing unstable
experiments. These loopholes can never be used in stable code, whether for the
interpreter or for XS modules. The loopholes may temporarily work just long
enough to finish the experiment. Remember, failure to get a C<SEGV> or a
fatal C<panic:> error doesn't mean you didn't introduce a bug
or corrupt a random malloc() block.

Between 5.10.0 and 5.21.5, there was a provision that adding 1 new
variable at the end of F<intrpvar.h>, as the very last member, was always binary
compatible with older XS modules. This was intended only for stable
maintenance releases, for example a new maintenance release 5.18.1 loading an
XS module compiled against header files from 5.18.0. Remember that a newer
5.18.1 core loading an XS binary compiled against 5.17.10 or 5.16.0 isn't allowed.

So if cutting off current struct members in F<intrpvar.h> didn't introduce a
crash, you saved some time in your experiment, and it was good luck.

Starting with 5.21.6, stricter checking was added to match the definition of
F<intrpvar.h> as understood by each build of the perl interpreter binary or the
C<libperl> binary against the definition of F<intrpvar.h> as understood
when the XS module's shared library file was compiled.

The exact sanity check requires the length of the struct that
C<PerlInterpreter *> aka C<my_perl> points to, that is
C<sizeof(PerlInterpreter)> or C<sizeof(*my_perl)>, to be identical between
core and an XS module, regardless of whether it's a non-threaded or
threaded build of perl. If the C compile time byte lengths don't match at
runtime, the L<perldiag/"%s: loadable library and perl binaries are mismatched (got %s handshake key %p, needed %p)">
error happens.
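
The size comparison can be sketched in standalone C. This shows only the idea of comparing struct sizes, not the real handshake code: C<struct interp_core>, C<struct interp_old_module> and C<handshake_ok> are hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical interpreter struct as the core was compiled. */
struct interp_core {
    void *stack;
    long  flags;
    void *new_var;          /* member added on a hacking branch */
};

/* The same struct as seen by a module built against older headers,
 * before new_var was appended. */
struct interp_old_module {
    void *stack;
    long  flags;
};

/* Core-side check: the module passes the size it was compiled with.
 * Returns 1 if the sizes agree, 0 if the binaries are mismatched. */
int handshake_ok(size_t module_interp_size) {
    return module_interp_size == sizeof(struct interp_core);
}
```

A module rebuilt against the new headers would pass C<sizeof(struct interp_core)> and be accepted; the stale one is rejected before it can touch the struct.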

For 5.21.6 and up, the way to add a new interpreter global variable while
hacking on the interpreter, without recompiling XS, is to rename, repurpose, or
turn into a union a current variable from F<intrpvar.h>, without changing its size,
alignment, or offset.
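
A minimal sketch of the union approach, assuming a made-up layout: C<struct vars_before>, C<struct vars_after> and their members are hypothetical and are not taken from F<intrpvar.h>. The compile-time checks show what "without changing its size, alignment, or offset" means in practice.

```c
#include <assert.h>
#include <stddef.h>

/* Original layout (hypothetical). */
struct vars_before {
    int   depth;
    void *old_hook;          /* the variable being repurposed */
    long  counter;
};

/* Patched layout: old_hook now shares storage with an experimental
 * pointer of the same size and alignment. */
struct vars_after {
    int   depth;
    union {
        void       *old_hook;
        const char *experiment_name;
    } u;
    long  counter;
};

/* Compile-time proof that nothing moved or grew. */
_Static_assert(sizeof(struct vars_before) == sizeof(struct vars_after),
               "struct size changed");
_Static_assert(offsetof(struct vars_before, old_hook) ==
               offsetof(struct vars_after, u),
               "repurposed member moved");
_Static_assert(offsetof(struct vars_before, counter) ==
               offsetof(struct vars_after, counter),
               "later member moved");
_Static_assert(_Alignof(struct vars_before) == _Alignof(struct vars_after),
               "struct alignment changed");
```

If any assertion fails, the build stops, which is a cheaper signal than a mismatched-binaries error at load time.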

Something easier, if speed doesn't matter: put your new experimental pointer or
integer into the former backend of C<MY_CXT_INIT>. It is an C<HV*> named
L<perlapi/"PL_modglobal">.

If speed is important, add a new pointer member to F<intrpvar.h> just once in your
branch, recompile all your XS modules once, and always keep the private patch
in your repo. Shrinking or growing the allocation behind a pointer from C<Newx()>
doesn't trip the 5.21.6 and up interpreter global struct size check.

Take a look at the backend of the C<MY_CXT_INIT> API. The backend is
2 variables, C<PL_my_cxt_list> and C<PL_my_cxt_size>. Nothing prevents
the C<perl_construct>, C<perl_clone_using>, C<perl_destruct> group from being
changed to always take ownership of index 0 of the array of C<void *>s that is
stored at C<PL_my_cxt_list>, before the first call to C<newXS()> or PP code.

Introducing read-only (const) globals is okay, as long as you verify
with e.g. C<nm libperl.a|egrep -v ' [TURtr] '> (if your C<nm> has
BSD-style output) that the data you added really is read-only. (If it
is, it shouldn't show up in the output of that command.)

If you want to have static strings, make them constant:
Const static strings are less efficient than double quoted string literals.
But if you really want to have static strings, at minimum make sure they are
declared C<const>:

static const char etc[] = "...";

Expand All @@ -81,14 +157,60 @@ right combination of C<const>s:
static const char * const yippee[] =
{"hi", "ho", "silver"};

C requires that C<static const char []> arrays have unique addresses in an
equality test. The linker is prohibited from merging and de-duplicating
const static arrays with identical length and data content. This is B<not> true
for double quoted C string literals, which linkers can efficiently de-dupe.
If a string literal is very long, or its contents decrease the
readability of other code, and you want an alternate token or symbol for that
string, use a C<#define Msg "long Msg">. In practice, 2 references to C<"..."> will
get merged to 1 copy stored in the binary image.

static const char etc[] = "...";

This will never be merged in the final binary. In this case, there would be
2 copies of C<"..."> at 2 different addresses, each taking 4 bytes, inside one
C<perl.bin> or C<libperl.so> or XS binary.

Sometimes this inefficiency is a feature. It goes like this. Declare a
C<static const char []> array, place the pointer to that static array
into a larger global-like or malloc-ed structure, and return control. Sometime
later, you regain control and check the global-like or malloc-ed structure.
Is the C<const char *> still the same address as your C<static const char []>
array or not? Whether or not you see the same address in the future can be
used as a tag, flag, or status.

Because of the guaranteed different address, any arbitrary core or XS code that
overwrites the C<const char *> member with a C<""> literal of identical contents
would be detected.

Perl uses this method inside C<L<perlapi/"PL_sv_yes">>,
C<L<perlapi/"PL_sv_no">>, and C<L<perlapi/"PL_sv_zero">>. These 3 set
C<SvPVX> to the exported const char arrays C<PL_Yes>, C<PL_No>, and C<PL_Zero>.
The addresses of these 3 const char arrays have special meaning, and will never
test C<==> true against the address of a string literal with the same contents.
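
The address-as-tag technique can be sketched in standalone C. The names C<tag_yes>, C<tag_copy>, C<struct slot> and the helper functions are hypothetical; the sketch relies only on the rule that two distinct array objects have distinct addresses even when their contents are identical.

```c
#include <assert.h>
#include <string.h>

/* Two arrays with identical contents are still two distinct objects. */
static const char tag_yes[]  = "yes";
static const char tag_copy[] = "yes";

/* Hypothetical holder struct standing in for a larger structure. */
struct slot { const char *label; };

/* Hand out a slot tagged with our private array. */
struct slot make_tagged_slot(void) {
    struct slot s = { tag_yes };
    return s;
}

/* Simulates foreign code storing a string with the same contents. */
void overwrite_with_copy(struct slot *s) {
    s->label = tag_copy;
}

/* Returns 1 only if the slot still points at our private array.
 * This compares addresses, not characters. */
int slot_is_untouched(const struct slot *s) {
    return s->label == tag_yes;
}
```

After C<overwrite_with_copy()> a C<strcmp> still reports equal text, but the pointer comparison reports the change.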

=item *

Not exporting your new function

Some platforms (Win32, AIX, VMS, OS/2, to name a few) require any
function that is part of the public API (the shared Perl library) to be
explicitly marked as exported. See the discussion about F<embed.pl> in
L<perlguts>.
function, or any const or read-write process-global data variable, that is part
of the public API (the shared Perl library) to be explicitly marked as exported.
C symbols do not cross between different binary disk files on these platforms
unless explicitly exported. If a public API macro uses a non-public API
function or process global variable, the non-public API C symbol has to be
exported so the OS shared library runtime linkers can load XS modules.

Starting in 5.37.1, support for C<__attribute__((visibility("hidden")))> was added.
This brought explicit export marking semantics for shared library C symbols to
almost all compilers and platforms. This greatly helps if the compiler has LTO,
since heuristic automatic inlining of any function becomes possible, along with
removal of unused functions that are neither static nor marked as exported.

See the discussion about F<embed.pl> in L<perlguts>. Export marking is done
by editing F<embed.fnc> for functions. For data variables, export marking
is done through F<perl.h>, F<globvar.sym>, and F<perlvars.h>.

=item *

Expand Down Expand Up @@ -609,6 +731,12 @@ to be I<exactly> 32 bits (they are I<at least> 32 bits), nor are they
guaranteed to be C<int> or C<long>. If you explicitly need 64-bit
variables, use C<I64> and C<U64>.

If you are writing CPAN code, you need to support older compilers and Perls
without 64-bit integers. For CPAN code only, you must check the HAS_QUAD define and
guard off your C<I64> and C<U64> code if they aren't implemented on that system.
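
A minimal sketch of such a guard, under stated assumptions: in a real XS file HAS_QUAD and C<U64> come from F<perl.h>, which is not included here, so a standalone compile of this block builds only the fallback branch. C<big_count> and C<big_add> are hypothetical names, and the two-halves fallback is just one possible design.

```c
#include <assert.h>

#ifdef HAS_QUAD
/* Native path: only valid when perl.h has defined HAS_QUAD and U64. */
typedef U64 big_count;
big_count big_add(big_count a, unsigned long b) { return a + b; }
#else
/* Fallback path: two 32-bit halves kept in unsigned longs, which C
 * guarantees to be at least 32 bits wide. Callers keep lo <= 2^32-1. */
typedef struct { unsigned long hi, lo; } big_count;

big_count big_add(big_count a, unsigned long b) {
    const unsigned long mask = 0xFFFFFFFFUL;
    unsigned long lo = (a.lo + (b & mask)) & mask;  /* wraps mod 2^32 */
    if (lo < (a.lo & mask))          /* carry out of the low half */
        a.hi = (a.hi + 1) & mask;
    a.lo = lo;
    return a;
}
#endif
```

Code that uses C<big_count> stays the same on both kinds of system; only the guarded definitions differ.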

See L<perlguts/"What is an E<quot>IVE<quot>?">

=item *

Assuming one can dereference any type of pointer for any type of data
Expand All @@ -626,7 +754,8 @@ Lvalue casts
(int)*p = ...; /* BAD */

Simply not portable. Get your lvalue to be of the right type, or maybe
use temporary variables, or dirty tricks with unions.
use temporary variables, C<*(int*)&p = ...;>, or dirty tricks with unions.
Remember about alignment, size, and compiling as C++.
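
Two of the portable alternatives can be sketched in standalone C. The helper names are hypothetical. The first converts the value rather than the lvalue, and the second copies bytes with C<memcpy>, which avoids both the lvalue cast and any alignment assumption about the destination.

```c
#include <assert.h>
#include <string.h>

/* Store an int through a long* without an lvalue cast: convert the
 * value being assigned, not the thing being assigned to. */
void store_int(long *p, int value) {
    *p = (long)value;
}

/* Overwrite the first sizeof(int) bytes of a byte buffer. memcpy
 * works for any destination alignment. */
void store_int_bytes(unsigned char *buf, int value) {
    memcpy(buf, &value, sizeof value);
}

/* Read the int back through a properly typed temporary. */
int load_int_bytes(const unsigned char *buf) {
    int tmp;
    memcpy(&tmp, buf, sizeof tmp);
    return tmp;
}
```

Both forms also compile as C++, which the cast-on-the-left form does not.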

=item *

Expand Down Expand Up @@ -1310,7 +1439,7 @@ similar output to CPAN module L<B::Debug>.

# finish this later #

=head2 Using gdb to look at specific parts of a program
=head2 Using gdb to look at specific parts of Perl code

With the example above, you knew to look for C<Perl_pp_add>, but what
if there were multiple calls to it all over the place, or you didn't