Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix confusion between raw samples and CDR in PSMX #1889

Merged

Conversation

eboasson
Copy link
Contributor

@eboasson eboasson commented Nov 29, 2023

There were various issues causing confusion in the contents and size of
loans:

  • dds_write_impl performed a look-ahead and could decide to allocate
    memory for a PSMX loan, only taking into account the type to decide
    whether to allocate based on the raw sample size or the serialized size,
    but failing to account for the possibility of a serialized key value.

  • What used to be the "force serialization" flag was interpreted by
    serdata_default as (also) requiring that the data in a loan would be in
    serialized form, even though that wasn't the case. It will handle
    serialized data just fine even if it is not necessary, but there is no
    reason for serializing it into shared memory if the data can just be
    memcpy'd around.

  • This resulted in serdata_default using slightly different rules for
    deciding whether or not to serialize and so could (attempt to) put
    serialized data in a loan allocated for a raw sample.

    The metadata would be set correctly, and the amount of memory copied
    was that of the destination buffer, so as long as the CDR was no
    longer than the raw sample (unlikely) and the serdata buffer happened
    to be large enough, it was ok. But those conditions are not always
    satisfied.

Taken together, this meant: out-of-bounds reads and inefficiencies.

This commit refactors dds_write_impl, taking the burden of doing things
with the contents of the loans from the serdata implementation (it
leaves the interface unchanged), instead doing it in dds_write_impl
where all the relevant information is available.

It also moves the publishing via PSMX to before the network part.
This ensures that the latency via shared memory is not affected by the
speed with which the network can absorb the data.

In this refactored version, supporting zero-copy within the process (and
without PSMX) was missing. That has also been added.

This PR is dependent on #1890 for compiling warning-free (hence its inclusion of cc47314)

src/core/ddsc/src/dds__write.h Fixed Show fixed Hide fixed
src/core/ddsc/src/dds_write.c Fixed Show fixed Hide fixed
@eboasson eboasson changed the title Fix confusion between raw samples and CDR in PSMX + consistent checking timestamps in write-like functions Fix confusion between raw samples and CDR in PSMX Nov 30, 2023
Copy link
Contributor

@dpotman dpotman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor remarks, and an issue that I think causes PSMX delivery not to work in case a writer loan on a PSMX enabled writer is requested for a non-memcpy-safe type.


struct dds_serdata_default *d;
if (serialize_data)
if (will_require_cdr || !type->is_memcpy_safe)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

type cannot be non-memcpy safe at this point

if (serdata != NULL)
{
if (ret == DDS_RETCODE_OK)
ret = dds_write_basic_impl (thrst, wr, serdata);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe rename dds_write_basic_impl to dds_write_impl_deliver_via_ddsi ?

//
// II. heap loan => assert (is_memcpy_safe)
// a. psmx only
// - allocate PSMX loan, memcpy, free heap loan, deliver loan via PSMX
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is merely a theoretical case, which cannot occur because the loan will always be a PSMX loan in this case. So may be useful to add a remark here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just checked dds_request_writer_loan, in case of a PSMX writer and a non-memcpy safe type, it will return a heap loan, which probably is not a good idea when there are pointers in the data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that that's probably a rather bad idea because it would be totally unclear what the lifetime of the pointed-to data would be. I believe it should always have rejected it and we can treat it as a bug.

struct ddsi_sertype * const sertype = wr->m_topic->m_stype;
assert (wr->m_endpoint.psmx_endpoints.length == 1); // FIXME: support multiple PSMX instances
assert (sertype->is_memcpy_safe || ddsi_serdata_size (sd) >= 4);
const uint32_t loan_size = (sertype->is_memcpy_safe ? sertype->sizeof_type : ddsi_serdata_size (sd) - 4);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is_memcpy_safe is always false at this point

Copy link
Contributor

@dpotman dpotman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

There were various issues causing confusion in the contents and size of
loans:

* dds_write_impl performed a look-ahead and could decide to allocate
  memory for a PSMX loan, only taking into account the type to decide
  whether to allocate based on the raw sample size or the serialized size,
  but failing to account for the possibility of a serialized key value.

* What used to be the "force serialization" flag was interpreted by
  serdata_default as (also) requiring that the data in a loan would be in
  serialized form, even though that wasn't the case.  It will handle
  serialized data just fine even if it is not necessary, but there is no
  reason for serializing it into shared memory if the data can just be
  memcpy'd around.

* This resulted in serdata_default using slightly different rules for
  deciding whether or not to serialize and so could (attempt to) put
  serialized data in a loan allocated for a raw sample.

  The metadata would be set correctly, and the amount of memory copied
  was that of the destination buffer, so as long as the CDR was no
  longer than the raw sample (unlikely) and the serdata buffer happened
  to be large enough, it was ok.  But those conditions are not always
  satisfied.

Taken together, this meant: out-of-bounds reads and inefficiencies.

This commit refactors dds_write_impl, taking the burden of doing things
with the contents of the loans from the serdata implementation (it
leaves the interface unchanged), instead doing it in dds_write_impl
where all the relevant information is available.

There was also some confusion in the handling of loans of complex types,
where the application would be given a heap loan but the serdata
implementation did not handle this correctly throughout.  The code now
supports it properly.  Note that returning a heap loan even for a PSMX
writer is deliberate: this way, the types, memory allocation routines
and lifetimes to be used for any reference in the type is independent of
the use of PSMX plugins.

It also moves the publishing via PSMX to *before* the network part.
This ensures that the latency via shared memory is not affected by the
speed with which the network can absorb the data.

In this refactored version, supporting zero-copy within the process (and
without PSMX) was missing.  That has also been added.

Signed-off-by: Erik Boasson <[email protected]>
also attempt to detect a failure to match because of an event loop that
got stuck without sending out discovery data

Signed-off-by: Erik Boasson <[email protected]>
This runs over:

- memcpy-safe types and non-memcpy-safe types
- use of writer loans
- use of reader loans
- use of PSMX
- writing and disposing (sample vs key)

That should provide basic test coverage for all combinations that the
code distinguishes between.

The type definitions are chosen so that raw samples and CDR have
different memory layouts and one cannot be confused for the other.

Signed-off-by: Erik Boasson <[email protected]>
@eboasson eboasson merged commit 3c6a2c3 into eclipse-cyclonedds:master Dec 8, 2023
8 of 22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants