Making SdoArray iterator behave equal to ODArray and *Record #430

sveinse · 2024-05-14T14:27:44Z

Iterating over an SdoArray and iterating over an ODArray object reveals different results.

# Load an OD that contains "MyArray" at 0x2000 with sub elements
#    0: Number of elements = 2
#    1: Element 1
#    2: Element 2
node: RemoteNode

# From object_dictionary (iterate over ODArray)
a = [i for i in node.object_dictionary["MyArray"]]   # Results in [0, 1, 2]

# From SDO (iterate over SdoArray)
b = [i for i in node.sdo["MyArray"]]  # Results in [1, 2]

The SdoRecord behaves expectantly:

a = [i for i in node.object_dictionary["MyRecord"]]   # Results in [0, 1, 2]
b = [i for i in node.sdo["MyRecord"]]  # Results in [0, 1, 2]

This PR adds that the iterator for SdoArray include index 0. This is consistent with the iterator of ODArray, ODRecord and SdoRecord.

erlend-aasland · 2024-05-14T16:08:39Z

I'm not sure this is correct. PDO indexes are explicitly documented as starting with 1, not 0. I would assume the same being true for SDOs. Anyway, this is a breaking change; doc should be amended according to the new behaviour, iff this is an acceptable change.

sveinse · 2024-05-14T17:12:39Z

The iterator for PDOs; Map() (or PdoMap() after PR #431) produces a sequence of PdoVariable() objects, not indexes. This is different from SDO and OD where the iterators for SdoArray, SdoRecord, ODArray and ODRecord issues index numbers. So they are different.

The only change here is if one iterates over SdoArray. The __getitem__ mechanisms, such as obj[1] remains unchanged. And if it weren't clear: Iterating over ODArray will produce a different result than iterating over SdoArray and this was how I uncovered the difference.

erlend-aasland · 2024-05-14T17:28:06Z

The iterator for PDOs; Map() (or PdoMap() after PR #431) produces a sequence of PdoVariable() objects, not indexes. This is different from SDO and OD where the iterators for SdoArray, SdoRecord, ODArray and ODRecord issues index numbers. So they are different.

Right, that was not immediately obvious to me; thanks for the clarification.

The only change here is if one iterates over SdoArray. The __getitem__ mechanisms, such as obj[1] remains unchanged. And if it weren't clear: Iterating over ODArray will produce a different result than iterating over SdoArray and this was how I uncovered the difference.

Well, it is still a behavioural change for code that iterates over SdoArray :)

acolomb · 2024-05-16T08:58:00Z

The difference in behavior is definitely strange and I couldn't find an explanation why this was explicitly chosen for SDO only. Maybe @christiansandberg can clarify?

Conceptually, I'd rather have the iterator be concerned with only the values in the SdoArray. Getting the count of items as the first entry seems unnatural for an array. We should think about making the behavior shown here also the default for the ODArray.

But I do see a conceptual difference, as the OD describes the object entries and the count is one of those objects. But when accessing the values in the array via SDO, it makes sense to iterate only over the actual contents starting from subindex 1.

Either way, we should make sure the .keys(), .values() and .items() views behave the same as when creating an iterator directly. Is that guaranteed in those methods' inherited implementations?

sveinse · 2024-05-16T16:09:41Z

With the change in this PR the following are observed:

Object	`.keys()`	`.values()`	`.items()`	Iterable	Comment
`ODVariable`	❌ No	❌ No	❌ No	❌ No
`ODArray`	✅ Yes	✅ Yes	✅ Yes	✅ Yes	Includes element 0, which is length
`ODRecord`	✅ Yes	✅ Yes	✅ Yes	✅ Yes	Includes element 0, which is length
`SdoVariable`	❌ No	❌ No	❌ No	❌ No
`SdoArray`	✅ Yes	✅ Yes	✅ Yes	✅ Yes	Includes element 0, which is length ℹ️
`SdoRecord`	✅ Yes	✅ Yes	✅ Yes	✅ Yes	Includes element 0, which is length
`PdoMap`	❌ No	❌ No	❌ No	✅ Yes	Object is not Mapping-like (dict-like), but it has `__getitem__`
`PdoVariable`	❌ No	❌ No	❌ No	❌ No

ℹ️ NOTE: If this PR is not accepted, the SdoArray will not include element 0.

Conceptually, I'd rather have the iterator be concerned with only the values in the SdoArray. Getting the count of items as the first entry seems unnatural for an array. We should think about making the behavior shown here also the default for the ODArray.

The CANopen standard supports this: "Sub-index 0 is of UNSIGNED8 and therefore not part of the ARRAY/RECORD data". (Paraphrased from 7.4.3 in CiA 301). So perhaps the iterator shoud omit index 0, however directly requesting it (__getitem__) should return it? If we are making the change, I'd argue that we want all objects to have the same behavior. Array and Records.

That said, in Python, it is unexpected behavior that a Mapping-like object returns an iterator doesn't iterate over all of its contents.

christiansandberg · 2024-05-16T16:17:29Z

One thing to take into consideration is that indices are dynamically sized. That's why you need to read the actual value of Last Subindex to know how many items there are. So that's why I didn't think it made sense to include Last Subindex as well as that is already included in the returned range.

christiansandberg · 2024-05-16T16:43:39Z

Maybe it would be more logic to exclude it in the records instead? It might make sense to exclude it from the Object Dictionary as well but it could be argued that Last Subindex is part of it and should be presented.

sveinse · 2024-05-16T16:54:04Z

Yes, it depends on which "hat" you're wearing: If you look at it from the object contents perspective, subindex 0 is superflous. If you look at what subobjects an array/record contains, then subindex 0 is definitely a part of the object.

I start to think that the most logical solution would be that the iterators omit subindex 0, but it can be explicitly requested with __getitem__. I'm a sucker for consistency, so if we do this, I'd argue we do it on arrays, records, SDO and OD objects.

While we're at it, perhaps PdoMap should be made Mapping, like so it supports dict-like behavior too?

sveinse · 2024-05-17T09:56:49Z

One thing to take into consideration is that indices are dynamically sized. That's why you need to read the actual value of Last Subindex to know how many items there are. So that's why I didn't think it made sense to include Last Subindex as well as that is already included in the returned range.

In SdoArray the length is read with reading subindex 0 and not considering the number of actual objects: (Which creates its own problems for async, as __len__ is a dunder that can't do async ops.)

    def __len__(self) -> int:
        return self[0].raw

In SdoRecord it use the length of the object and not the contents of subindex 0:

    def __len__(self) -> int:
        return len(self.od)

For completeness: ODArray and ODRecord also determine its length this way.

I'm happy to change the SdoArray.__len__ to equal the other implementations.

Maybe it would be more logic to exclude it in the records instead? It might make sense to exclude it from the Object Dictionary as well but it could be argued that Last Subindex is part of it and should be presented.

Yes. According to CANopen spec, subindex 0 is not a part of neither of them.

acolomb · 2024-05-17T10:08:17Z

Just a quick comment, need some more time to look at this thoroughly. I'm in favor of switching the len dunder to not do I/O and instead rely on the known OD structure. For one it's unexpected and I also noticed that will be a pain point / blocker for async. If one wants to know the content of subindex 0 explicitly, it's no trouble to request it via getitem.

Skipping it while iterating sounds reasonable and if the spec is clear on that point for both records and arrays, let's follow that consistently as well.

I'm still undecided on whether to remove it from OD iterators as well. That is a format description, where the element should definitely be noted. SDO on the other hand concerns the content, which does not include the length as the spec says.

christiansandberg · 2024-05-17T10:39:41Z

Iteration using SDO should probably be explicitly provided by method(s) to make it clear what you will get, rather than implicitly as today. Then it would be easier to provide an async equivalent.

sveinse · 2024-05-17T10:43:16Z

There is a suble detail here regarding arrays that we need to consider: Where is the authoritative information about an arrays length stored? Is it the OD or the actual device?

    # From SdoArray
    def __iter__(self) -> Iterable[int]:
        return iter(range(1, len(self)+1))
    def __len__(self) -> int:
        return len(self.od)

Trying this on real devices, I quickly noticed that this relies on the OD keeping the count of the length of the array. However, the HW might have different length. So the iterator does not reflect the actual remote object. I need to read subindex 0 to get the actual length. So what should the iterator reflect here?

Sidenote: If there are any nodes that doesn't update its subindex 0 to indicate array length, then the only solution is to iterate until a SdoAbortedError.

So for that use having the current code is more accurate, yet it opens up the problem of blocking IO in unexpected operations.

    def __len__(self) -> int:
        return self[0].raw

Do we need a similar mechanism as with PDO where you read the configuration into memory? I.e. for SdoArray it would mean to cache the length read from subindex 0.

sveinse · 2024-05-17T12:00:58Z

If I may add a comment with my async hat on in the hopes we can think a bit ahead:

If we can avoid IO in __len__() and __iter__() in SdoArray, then the regular iterator can be used as-is for async. Otherwise additional async def alen(), def __aiter__() and async def aiter() methods will be needed. The user will need to do [i async for i in sdo_obj] instead of list(sdo_obj).

acolomb · 2024-05-23T09:58:26Z

You're certainly right in that we should treat the array length as dynamic, assuming that the OD does not contain the correct value. In fact, the OD might not even contain definitions for all array members. In that regard, iterating based on the actual subindex 0 count retrieved from the device seems correct.

So maybe doing this based on Python's __iter__ and __len__ is not the right approach after all? I'd say we can keep it in a blocking manner as long as it doesn't stand in the way of async by being used implicitly for other functions. I.e. if len(sdo) is the only blocking operation, that just needs to be documented. If it negatively affects other operations where blocking I/O is completely unexpected, then we should change it (maybe even into something like raise NotImplementedError).

As a replacement, a separate iterator class would be a clean approach. There we can cache the length at an appropriate time, and have regular and async-friendly implementations side by side or within the same class. Or even keep the current interface for blocking and only an SdoAsyncIterator class, separately.

sveinse · 2024-05-23T12:02:29Z

The following code is how its done in the async port. It presents a regular and async iterator. It does use the built-in dunders thou. So one cannot use a regular iterator within an async context. When the iterator relies on IO to get the length, I think the iterators will always need to be colored (regular and async) separate.

    def __iter__(self) -> Iterable[int]:
        return iter(range(1, len(self) + 1))

    async def aiter(self):
        for i in range(1, await self.alen() + 1):
            yield i

    def __aiter__(self):
        return self.aiter()

    def __len__(self) -> int:
        return self[0].raw

    async def alen(self) -> int:
        return await self[0].aget_raw()

acolomb · 2024-08-15T22:48:33Z

This discussion got a little bit derailed by mixing different topics: semantics, implementation characteristics (like unexpected blocking I/O), and async compatibility. I think we should focus on one at a time.

Regarding semantics, I think we had a rough consensus:

There is a difference in meaning / asymmetry between iterating over OD stuff and actual data (via SDO). Iterating over an ODRecord or ODArray in my view means "give me all descriptions of documented sub-objects, one at a time".

Take for example an array where some members are reserved for future usage, skipping a subindex. These may not even show up in an EDS file, thus have no entries in our ODArray. The application hint in CiA 306, section 4.6.3.2 supports this. In theory, even subindex 0 could be missing from that. Thus iterating over an ODArray (or ODRecord) simply returns the OD knowledge, where subindex 0 could be present or not, as with any other subindex. Asking for "the length of the array" in the OD is not really meaningful, as we can get only the DefaultValue for subindex 0, which might differ from the actual length in the device.

Thus ODArray.__len__() correctly returns the number of described sub-objects, and ODArray.__iter__() correctly walks through exactly that many. All good, and the same applies to ODRecord.

For SDO, we're concerned with the actual data on the device. The number of entries (array length) should reflect the actual content of subindex 0. Whether cached or blocking I/O doesn't matter right now, it's an implementation detail. Iterating over an SdoArray should yield exactly as many entries as that subindex specifies (1...n), therefore excluding the count itself. Any entry undefined in the OD is generated on the fly by __getitem__() (but not stored back) based on the subindex 1 entry. If that's not possible, it raises a KeyError exception. If the device reports highest subindex as 0 but still returns array entries, that is buggy and may be worked around by explicitly requesting entries, but letting the application handle the expected range of subindices, from the OD or wherever. Built-in iteration works only for sensibly implemented devices, sorry.¹

Currently SdoArray.__len__() correctly returns the online value of highest subindex (= count) and SdoArray.__iter__() correctly yields that many entries, even if undefined in the OD (copied from subindex 1).

For SdoRecord, it's a bit different (next asymmetry). The data types of sub-objects without an OD entry cannot be guessed, thus the only usable members are the ones we actually know from the OD. That defines the set of (and consequently, the number of) sub-objects usable with standard SDO functions. If the device reports higher subindices being supported (through the live subindex 0 count), or if there are gaps between known subs, those unknown entries could still be handled via sdo.upload() / .download() without trying to interpret the result through an OD schema. But iterating should exclude such "unusable" entries.

Currently SdoRecord.__len__() defers to ODRecord.__len__(), which includes the subindex 0 count value. That should rather mirror the SdoArray and skip the highest subindex (= count) value itself: return len(self.od) - int(0 in self.od). Consequently, SdoRecord.__iter__() should yield that many entries, and filter out the count as well: return filter(None, iter(self.od)).
Contrary to what we're used to with Python containers, SdoRecord.__contains__(0) still correctly indicates True for the count if the OD contains subindex 0, because __getitem__(0) would then succeed. That's distinct from the iteration in this case.
It makes sense that these iterators yield ints, because that's the common compatible key format between OD and SDO. On the other hand, PDOs do not share that enumeration namespace and therefore can do the more useful thing, returning the PdoVariables on iteration. Note that making them dict-like would contradict that pattern, thus unlikely to happen, to keep compatibility.

Summing up, I think this PR goes in the wrong direction, rather SdoRecord should be fixed to exclude the count value (bullet 4.). Everything else is implemented correctly based on valid reasons and following the standard. Turning PdoMap into a Mapping (bullet 6.) would be nice, but yields unexpected iteration behavior, inconsistent with other mappings. There are many easy ways to work around this, like enumerate(pdomap) instead of pdomap.items().

Sorry for the lengthy post. I'd be grateful for short assessments focusing on the six bullet points.

The way I implemented my embedded SDO server code, it knows only about the non-zero subindices (possibly non-consecutive). When subindex zero is requested, it searches for the highest known subindex, writes that to a shared temporary variable, and handles it as if that variable were requested directly. Saves memory and typing for the count entry descriptors and always matches exactly the "highest subindex supported" semantics. ↩

Making SdoArray iterator behave equal to ODArray and *Record

29514b6

acolomb mentioned this pull request Aug 16, 2024

Skip the "highest subindex" entry when iterating over SdoRecord #538

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Making SdoArray iterator behave equal to ODArray and *Record #430

Making SdoArray iterator behave equal to ODArray and *Record #430

sveinse commented May 14, 2024

erlend-aasland commented May 14, 2024

sveinse commented May 14, 2024

erlend-aasland commented May 14, 2024

acolomb commented May 16, 2024 •

edited

Loading

sveinse commented May 16, 2024 •

edited

Loading

christiansandberg commented May 16, 2024

christiansandberg commented May 16, 2024

sveinse commented May 16, 2024

sveinse commented May 17, 2024

acolomb commented May 17, 2024 •

edited

Loading

christiansandberg commented May 17, 2024

sveinse commented May 17, 2024 •

edited

Loading

sveinse commented May 17, 2024 •

edited

Loading

acolomb commented May 23, 2024 •

edited

Loading

sveinse commented May 23, 2024

acolomb commented Aug 15, 2024 •

edited

Loading

Making SdoArray iterator behave equal to ODArray and *Record #430

Are you sure you want to change the base?

Making SdoArray iterator behave equal to ODArray and *Record #430

Conversation

sveinse commented May 14, 2024

erlend-aasland commented May 14, 2024

sveinse commented May 14, 2024

erlend-aasland commented May 14, 2024

acolomb commented May 16, 2024 • edited Loading

sveinse commented May 16, 2024 • edited Loading

christiansandberg commented May 16, 2024

christiansandberg commented May 16, 2024

sveinse commented May 16, 2024

sveinse commented May 17, 2024

acolomb commented May 17, 2024 • edited Loading

christiansandberg commented May 17, 2024

sveinse commented May 17, 2024 • edited Loading

sveinse commented May 17, 2024 • edited Loading

acolomb commented May 23, 2024 • edited Loading

sveinse commented May 23, 2024

acolomb commented Aug 15, 2024 • edited Loading

Footnotes

acolomb commented May 16, 2024 •

edited

Loading

sveinse commented May 16, 2024 •

edited

Loading

acolomb commented May 17, 2024 •

edited

Loading

sveinse commented May 17, 2024 •

edited

Loading

sveinse commented May 17, 2024 •

edited

Loading

acolomb commented May 23, 2024 •

edited

Loading

acolomb commented Aug 15, 2024 •

edited

Loading