-
Notifications
You must be signed in to change notification settings - Fork 6.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DRAFT: Add specifically-named child nodes to DT bindings #79751
Conversation
This would allow removing nodes with a
Trying to spot a few edge cases... Is any of this valid? child-nodes:
- name: some-child-node1
include: "some-binding.yaml"
- name: some-child-node1
include: "some-binding.yaml" child-nodes:
- include: "some-binding.yaml" If not, would it be possible to use the yaml parser to avoid these edge cases if desired: child-nodes:
some-child-node1:
include: "some-binding.yaml"
some-child-node1: # <- YAML parsing error: duplicate entry
include: "some-binding.yaml" If it is possible to have both an child-nodes:
some-child-node1:
include: "common-properties.yaml"
properties:
custom1:
type: int
custom2:
type: int
custom3:
type: int Thank you for diving down this challenge, this addresses everything I met on my side. |
Yes, the goal is to be able to define nodes in the same way as other projects using DT without having to put a zephyr-specific
IMO these are different class of problems in that these things probably don't belong in DT at all, but I'm not familiar with USB in general or in Zephyr to understand why these are being made. If I had to guess these are probably are a use case of the "multi instance static software configuration" effort being led by @fgrandel .
Same with this, it's one of the biggest offenders of the idea of "DT is for hardware only". A lot of people have different opinions on what that means, but I don't think anyone will be willing or able to defend the idea that a linker section is an aspect of hardware. Someday, this needs to go, IMO.
Yes, I think it would be better to have that binding have something basically like: child-nodes:
- name: ports
child-binding:
include: "port.yaml"
Maybe, the power situation is more complicated for multiple reasons
No, the point is the nodes are unique, as if they are properties, just like in linux/dtschema, you wouldn't have two properties with the same name. You're right that this should be checked.
No, this is not the purpose, we still have
So basically you are proposing to make the
It should be possible, I think I tried this when I was messing with some bindings to check the code in the PR. And I think it would still be possible with a mapping just as well as a sequence. Luckily the code that already exists for initializing bindings was pretty well abstracted so that as you can see in the current draft, it was very simple to account for the new location of the "sub-bindings", they are initialized the same way. |
Thank you for the extended answer! This helps me understand what to aim for in the linked PR as well.
It seems to be this one, or linked from this one:
Rather impressive to see how light the changes are for such a feature. |
What you want is type composition. This is related but not the same. It's a very important topic, though, and I'm currently just holding back an already finished PR for it because a preliminary discussion is stuck in arch WG. Composition isn't so straightforward, though. Are you aware of the different variants in typing languages (union, intersection, ...) and schema languages (e.g. the intricacies of allOf, anyOf, oneOf, conflict resolution, etc.)? We shouldn't introduce this ad hoc. |
Please be aware that named children and child bindings are conceptually just special cases of the more general pattern based definition of subschemas. Child bindings are actually a legacy, they should have been defined via wildcard property names (globbing or regexp) to keep them more in line with other typing approaches. We should therefore try to not deviate from keeping property names together. See my comments on Discord. This also has readability advantages as the schema follows the document structure more naturally, if we avoid unnecessary top-level keys. |
Absolutely, compatibility with Linux, u-Boot and others is very relevant and as @decsny has said, it's currently my highest priority to take pressure away from DT by introducing a safe haven for software settings. We should also be aware that whether to introduce an explicit compatible to a child vs. defining a multi-level schema based on a parent compatible is a modeling decision. It semantically expresses different things. Therefore I'm very happy to see the capabilities of our binding language extended, to express both. Side note: JSON schema vocabulary as chosen by Linux is rather awkward for large, highly flexible schemas like ours. It was designed to define small, rigidly structured documents with simple primitive types. What we need is a compositional typing system, not a document description language. Just saying this so that I'm not being misunderstood as trying to propagate JSON schema as our sole inspiration. We should rather learn from highly compositional, dynamic typing systems like TypeScript or Python's typing language as opposed to rather rigid inheritance-heavy systems like those of Java or C++. This is what I have in mind when I think about our binding syntax. |
I actually thought this too and completely agree about it, but not sure what was the best approach because personally I wanted to keep changes to DT tooling minimal and didn't feel like reworking / deprecating Speaking of wildcards, a specific example of that is actually one of the next things I wanted to do, which was regex matching compatible strings (I'm guessing this is something that can also be generalized in some way) because right now there are a lot of bindings being introduced for like a series of parts that are like part12345634 and like 20 bindings files with a slightly different number at the end so that someone can match against many different compatibles. This is very annoying and doesn't scale, obviously |
If it's for the risk of breaking external code, I'd say, no problem, as long as we keep the changes inside edtlib. See the related statements by myself and @dottspina in #78095.
Fully agree. Deprecation would go overboard. Just not introducing more of the same pattern is enough IMO.
If it's also about additional effort, let me know. But if I understand you correctly, that's not so much your concern and you seem to agree that both syntaxes can be supported with almost the same amount of extra code. In any case feel free to delegate to me, if you want. I've been reworking that library so deeply recently that such a change would not weigh in at all on my side.
Are you sure that composition wouldn't be the better approach to handle this? Maybe this is just an artifact generated by us currently preferring inheritance over composition? It's seems to be uncommon to have wildcards in typing systems - I'm actually aware of none that does it, not even the most dynamic ones like TypeScript. The only things that come to mind are generics and advice patterns from AOP. The first seems to be the wrong tool for what you seem to look for and all experiences I made with the second were extremely hard to debug as you cannot grep for types any more. (Although AOP is a great tool for crosscutting concerns, but that's not what your example seems to require. In our case I could think of attaching compatibles to a lot of nodes based on a single statement matching nodes by a regex. That could become relevant for config, but certainly not for DT.) IIUC, using composition in your example would simply mean attaching both, the generic base binding and the simple vendor-specific binding, to the node with the more specific binding taking precedence over the more generic one. That's exactly what multi-compatible was designed for in DT and how I implemented it in my PoC. Having many small vendor-specific binding files and corresponding compatibles is not a problem IMO. What makes this so This encapsulation problem can be fixed by introducing a naming convention like xyz.binding.yaml and allowing bindings everywhere in the tree rather than collecting them in a central folder (which I never liked because it really breaks encapsulation). Ok, sorry, I gave a longer response to your tangent than I probable should have. But maybe this is already enough to satisfy your requirement and close the topic? Otherwise we should certainly open a separate issue/PR for it. |
If you want to try to tackle this issue, it's up to you, I'm not going to suggest not to, but, since this is going to change the DT binding schema, I think it should be at least discussed separately from whichever deep changes you are mentioning.
I always prefer composition over inheritance, I am allergic to inheritance in general, but I am failing to see what it has to do with what we're talking about? Can you clarify where the inheritance is happening? Also, to be clear, I'm NOT talking about putting regex in the DTS. I mean to do like linux once again, where in a DT binding they can regex match the compatible strings. Example: (https://www.kernel.org/doc/Documentation/devicetree/bindings/eeprom/at24.yaml) (maybe of interest to @henrikbrixandersen)
Yes, having multiple compatibles is already supported and is expected according to DT spec
Vendor-specific bindings are obviously fine, what I'm talking about is a vendor making a lot of binding files just to be able to match a lot of different compatibles to the same driver, and while it would make sense just to use one compatible for all of them, I think the motivation was to make it clear which exact part numbers were supported by a driver. Although this probably should just be handled by a comment in the binding or driver kconfig or something rather than having a ton of different strings for compatible possible in DT. Although, still, makes me wonder why does linux think there is an advantage to regex defined compatibles.
I am not sure I agree that this is a problem, and not sure what you are trying to encapsulate. I think having all the bindings in one place just makes them easier to search through. And also, linux does it this way as well (which is not really a justification but just pointing out that it is maybe what someone coming from linux would expect, and also I find is super helpful for me to search for examples of things in their bindings)
I'm not sure what did you mean by close the topic |
Agreed. Let's just leave it open. Who gets around to finding time first, fixes it.
I misunderstood the problem. With your additional explanation the problem seems to be even more trivial.
Sure, I was also talking of regexes in bindings. My (possibly imprecise) understanding is that a "binding" is just a "type". A type can be considered dual to a predicate that divides the overall configuration space ("all possible terms") in configurations ("terms") that match the type (valid configuraton) and those that don't match (validation error). It's more a matter of what you want to achieve ("proof") whether you express your validation rules in one or the other representation. From that perspective a regex doesn't conflict with type theory. It just creates an infinite space of aliases for a single type. It's like having a struct in C but for whatever reason you then generate an inifinite number of typedefs such that the following two declarations would point to the same underlying struct: struct atmel_24c_139 x;
struct atmel_24cs_ser x; This is something that we'd never accept for a good reason, right? It's very unusual in both typing languages and mathematics alike because it's simply not practical from a human perspective (and the human perspective is all that counts in naming).
Just on the DTS side, but on the binding side we don't support composition of compatibles. We only accept a single binding per node right now. But as you correctly say: composition is not the topic here.
That's exactly right. It's as simple as that. It's horrible design that breaks basic principles of informatin hiding and abstraction. A well-known antipattern that we shouldn't accept at all as we wouldn't accept a generic base class in python depending on all of its subclasses. I mean, they can have as many compatibles as they like in DTS to designate their subtypes, but their driver must not depend on it if it uses the generic programming model on purpose. DTS clearly defines a correspondence of compatibles with hardware programming models, not with part numbers.
No idea. Just guessing:
I think it would be a case for regex types if we could somehow support Linux bindings verbatim. But I don't see that we are anywhere close to such a thing and should therefore confidently reject starting to get closer to Linux bindings in a place where we have very good arguments that the design problem lies with Linux. I'd rather propose we concentrate first on the many other legitimate improvements we can make to get closer to linux bindings (e.g. the one proposed in this PR). It is trivial anyway to comment behind a generic type that the corresponding Linux type is a regex if we want to document the relation to Linux.
Yes I see this trade off. This argument was only related to composition, though, and I'm open to discuss the pros and cons. In application programming frameworks there have been both philosophies:
The latter is more prominent in large projects with many stakeholders, while the former is rather common in small projects that actually only represent a single "feature" or well-defined problem solution in itself. As I consider our project to be rather "feature driven" (an application is just a big choice of features - aka drivers and upper subsystems), I was leaning towards the "encapsulation by feature" approach. Searching through bindings would of course also work with a proper naming pattern. But I have no really strong opinion on that. Hope that this became clear with the more differentiated view I actually have.
I think if we agree that using regexes to describe lots of aliases for the same type is an anti-pattern, then it would be enough to summarize the argument (i.e. the essence from our exchange) in a dedicated issue and close that issue right away as "won't fix". |
@fgrandel I will reply to your comment more later but for now, I wanted to just mention that my idea for the regex compatible was to have another top level key for it, then the drivers still use the main compatible which is actually generated in the macros. Personally though I think this is a bit awkward/hacky still, and I guess I am actually in favor of solving all those problems like we said, just use a compatible that works (that's actually the point of a compatible in DT spec, just to match to a driver), and maybe add some comment in the binding that suggests which other part numbers work with the compatible. The issue has come up a few times and people got upset at the idea of enabling a part made by one vendor under a binding prefixed with another vendor name, maybe silly but that was the motivation. examples: |
To be honest I completely agree with the reasoning that @henrikbrixandersen laid out in #73136 (assuming the other cases are similar - didn't have time to look at them all). The only thing that could be proposed is to find a generic name for the programming model, but only if it actually is documented as such somewhere. If Atmel were the first to introduce the programming model and were so successful with it that they set a de-facto standard then other vendors will have to live with it - after all they are riding the wave for a reason. I even looked at the Linux source code and I think their design is horrible. It requires a change to the driver code for every newly introduced part, even if it's only two parameters that change. That is what DT is for. So we should not orient ourselves to Linux blindly in that case. Never saw such bullshit in Linux code before: Linux code example
That's a prime example of how bad design in one place causes bad design in another place. They are just working around a design error if that's the reason why they have regex aliases. It's exactly what I meant by "a parent class knowing about all subclasses that instantiate it". No idea how they got this code past reviewers. |
On a practical level, there might need some interleaving between wildcard "child-bindings" and specific "child-nodes", such as in the video driver class, and I assume several others:
This could require to say "every child of child-nodes:
- name: port
child-bindings:
properties:
endpoint-attr-1:
type: int
endpoint-attr-2:
type: int
endpoint-attr-3:
type: int Maybe other syntaxes evoked permit to solve it too. |
Right. That's one example why I prefer that we don't introduce a top-level property for child nodes and even thought about deprecating the child-binding syntax in favor of a more general wildcard syntax in the future. These could then be mixed flexibly, are easy to read and reason about and also easy to enforce on the implementation side. properties:
port:
type: object # or node
required: true
properties:
"endpoint@[0-9]+":
type: object
min-items: 1
max-items: 5
properties:
endpoint-attr-1:
type: int
endpoint-attr-2:
type: int
endpoint-attr-3:
type: int Of course the same can be achieved with top-level properties like "child-nodes" or "child-bindings", but IMO in a less readable way. And readability is all what a language and abstraction is good for. What we actually want is impose arbitrary composable restrictions on nodes and their subnodes starting from the point where a compatible is set. Being able to composition schemas is one of the advantages of our compositional binding approach over less flexible "document description" languages like JSON-schema or pykwalify that require a full document description. The possibility to mix complex type restrictions ("compatibles") into the source places us midground between pure "schema-driven" languages like protobuf or Thrift and "schemaless" languages like JSON or YAML. And that's exactly what we want IMO. |
closing this for now but might try again another time |
In dtschema (DT binding language used by other projects such as linux), it is possible to declare a child node as a "property" of another node, basically like this:
Example:
Panels including display-timing as a property
Zephyr only has child-binding, which currently applies to all child nodes without a compatible, so in order to process the devicetree properly, we require to put a compatible property string on the display-timing node, which we then don't even use anywhere in the C code in tree. This is therefore then a minor source incompatibility with other projects like linux.
https://docs.zephyrproject.org/latest/build/dts/api/bindings/display/panel/panel-timing.html
To fix this, in zephyr DT python tooling, it seems like it would be easier to introduce this DTS semantic/syntax capability in a different way, by making a new top level key in the zephyr DT bindings that is a list of child nodes with specific names, instead of as a property, like this:
This draft shows a basic attempt at doing that that which is not really meant for serious code review right now, I just whipped this together in a few minutes, but I am looking for any feedback on the idea before I go off and try to implement it more seriously and show some examples and write documentation and test cases.