Future? Roadmap? #6
Comments
Sorry for the delay, I had my 11th exams.
No real roadmap yet, I am still fixing bugs and issues. If you have any ideas, let's do it.
Take your time, no need to hurry.
I'm afraid my ideas are both sharply incompatible with certain goals of pipescript (judging based on readme and examples) but 100% compatible with certain other goals from the same readme and examples 😓. So I'm not sure how you'll perceive the following largely unexplored ideas:
This has many use cases as you know - from easy data exploration through easy writing of true apps up to having no-effort UIs (implicit insertion of [...]). Crazy, isn't it? I also have some basics of the corresponding syntax, but that's not important compared to having a clear mutual agreement on goals 😉.
I don't know much about data manipulation (splitting & muxing are going over my head), but I am happy to learn. It would be very good for the pipe-script project to have a real goal.
Ok, if you haven't settled on any firm goals yet, then I'd be more than happy to provide some input for an interesting technology and also a thesis (I have experience in mentoring university students as an external/non-university person). What I'm describing is a bit like creating a streaming data computation graph using a simple syntax.
ad 2) I want it to have general computational power. Any signal processing needs feedback, any simulation of digital HW (e.g. circuits from logic gates) needs feedback, etc. I have some very shallow thoughts on how to handle the complexity this might lead to. So I'm aware of certain edge cases (most of which should be "solved" by supporting both sync & async computation).
ad 3 & 4) Here I'd like to achieve "partial ordering". Imagine you have an expensive computation you want to run on many cores but you have only one input. Then I'd like to have an option to use the input as a common queue for work-stealing - each core would steal the current sample/item (or a batch of those for speed) from the queue. Assume that each core does a chain of operations (not only one) and that these operations might again need to "split" across several other cores (i.e. nested "splitting"). Then I'd like to assemble the output from all these cores into one queue again. Here I'd like to have 2 options - either assemble the results as they come (in "random" order) or assemble them in the original order (e.g. based on nested tagging done when splitting).
ad 5) I'm imagining this in a discrete domain (i.e. no continuous/smooth function but rather a step function). So I'd define it as a measure relative to other streams (closer to how network traffic shaping is done, i.e. by delaying some samples based on a given measure to achieve the desired "average" throughput).
ad 6) This is one of many potential (conceptual) solutions to the problem of outputting more than one stream from a command - a merge of two streams of 64bit ints could e.g. result in a stream of structures containing two 64bit ints (and padding if needed). Reshaping would then add, remove, swap or rename "fields" (actually representing streams at one point in time) in such a structure. Reshaping shall ideally encompass both the physical representation (imagine a C struct with fields, padding and alignment) and the logical representation (names of fields, their order if any, value constraints over all fields together - please do not confuse these with types, as types specify constraints only over each field separately and are usually very limited). For synchronous computation, it's clear how to populate such a struct. For async I'd simply take the newly computed value for a given field and then take the most recent old values of all other fields to populate the struct. But again, this is just a crazy, not very thought-through idea (it actually emerged from the need to allow all commands to run in parallel - hence so much copying).
ad 8 & 9) This probably deserves more explanation due to the narrow semantics of this topic in most programming languages. What I mean is "true" synchronicity and asynchronicity as seen in the physical world (though in the real world everything seems to be async, unless we talk about things like quantum entanglement etc.; usually we define things to happen synchronously if the measured difference of absolute time is less than [...]). Also, this seems to be one of the rather major differences compared to many (all I know so far) "pipe languages". I imagine that I could specify a set of subgraphs where the subgraphs would either await each other to provide one new sample as a result of their computation (this is synchronous) or not await each other and just proceed further (this is asynchronous). I have some very preliminary syntax ideas for this (unrelated to pipe-script and thus incompatible with it). But the main point is the semantics as described.
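A minimal sketch of how the two merge modes from (6) and (8 & 9) could behave, written in Python only as a stand-in (this is not the proposed syntax). The names `Pair`, `merge_sync` and `merge_async` are hypothetical, and the async input is simplified to a single list of `(field, value)` arrival events instead of truly concurrent streams:

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, Optional, Tuple


@dataclass(frozen=True)
class Pair:
    """Hypothetical 'struct' holding one sample of each merged stream."""
    a: Optional[int]
    b: Optional[int]


def merge_sync(a: Iterable[int], b: Iterable[int]) -> Iterator[Pair]:
    """Synchronous merge: wait until both streams delivered one new sample."""
    for x, y in zip(a, b):
        yield Pair(x, y)


def merge_async(events: Iterable[Tuple[str, int]]) -> Iterator[Pair]:
    """Asynchronous merge: emit a struct on every arrival, filling the
    other field with its most recent old value (None if none arrived yet)."""
    latest: Dict[str, Optional[int]] = {"a": None, "b": None}
    for field, value in events:
        latest[field] = value
        yield Pair(latest["a"], latest["b"])


sync_out = list(merge_sync([1, 2, 3], [10, 20, 30]))
# Arrival order is an assumption for illustration: stream "a" is faster than "b".
async_out = list(merge_async([("a", 1), ("b", 10), ("a", 2), ("a", 3), ("b", 20)]))
```

In the sync case every struct contains two fresh samples; in the async case the slower stream's last value is simply repeated, which is where the extra copying mentioned in (6) comes from.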
ad 10) Assuming our language will deliberately not specify how the streams are implemented on the inside, there has to be a way to read (as a source) & write (as a sink) data in a specific format. If we had reshaping as suggested in (6), this wouldn't be a problem, as reshaping covers the physical representation (endianness, sizes, paddings, signedness, etc.). But as I wrote, it's just an idea and something else (better) might be used. In that case we'll need an explicit generic way to e.g. read & write serialized data (e.g. in JSON or BSON or FlatBuffers or any other of those gazillions of existing & future formats).
ad 11) Here I'd actually be fine even with a more constrained variation - like requiring to know at compile time what the to-be-deleted or to-be-added graph will look like and only caring about its (de)instantiation at run time.
ad 13) My idea is to forget completely about "errors being errors" and simply treat any error from a given instance of a command as yet another full-featured output queue (i.e. the command would be seen as a multiple-output "subgraph").
ad 14) This is merely a note that e.g. in case the graph(s) should be deployed as a distributed computing system (cores connected with some type of unreliable network), there might be a need to specify which of the two commands with a pipe in between shall (re)initiate the communication.
ad 15) I didn't think of all the edge cases, but my initial idea was oriented towards syntax rather than semantics. I see "streams of streams" as simply a "trunk" or "set" of streams, with the emphasis that a command (regardless of whether it is a source or a sink or both (aka filter)) has the liberty to "open" and/or "close" any new or existing stream at any time. In other words, this point merely reiterates many of the other points above 😉.
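To illustrate (13), here is a small Python sketch (again only a stand-in, not the proposed syntax) where a command has two ordinary outputs, results and errors, and the error output can be piped into another command like any other stream. `run_command` is a hypothetical helper and plain lists stand in for queues:

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

In = TypeVar("In")
Out = TypeVar("Out")


def run_command(
    fn: Callable[[In], Out], inputs: Iterable[In]
) -> Tuple[List[Out], List[Tuple[In, str]]]:
    """Apply fn to every sample; failures go to a second output instead of
    aborting the command. Lists stand in for the two output queues."""
    results: List[Out] = []
    errors: List[Tuple[In, str]] = []
    for item in inputs:
        try:
            results.append(fn(item))
        except Exception as exc:  # any failure becomes a sample on the error output
            errors.append((item, type(exc).__name__))
    return results, errors


# A parsing command with one bad sample in its input.
results, errors = run_command(int, ["1", "2", "x", "4"])

# The error output is just another stream, so it can feed a further command.
report, report_errors = run_command(lambda e: f"bad input: {e[0]!r}", errors)
```

Nothing here distinguishes the error output from the normal one except its position, which is the point of treating the command as a multiple-output subgraph.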
ad 16) This "heterogeneous or not" is partially a philosophical question, partially an implementation detail and partially an important decision affecting how the pipe language shall be structured 😉. It's just another possible solution to multiple outputs from one command (in addition to the "struct" concept explained in the context of reshaping in (6)). We could actually put samples representing different outputs into one stream, but in a serial manner instead of "next to each other" as described in (6). This way any command would first need to read the sample and dynamically determine its "type" at run time (incl. length/size, padding, etc.), because generally it wouldn't be easily possible to determine this with some clever analysis up front at compile time. This might have some performance impact - probably less copying than the "struct" approach but also more processing due to a "switch/branch" inside nearly every single command.
Again, this is not thought-through and all rather preliminary... Any comments, ideas, thoughts very much welcome and appreciated! My highly preliminary syntax for such a language should be closer to this than to others:
Some underlying ideas:
I know there are numerous inconsistencies and open questions in this syntax, so feel free to speak up (and ideally offer potential solutions 😉). Key takeaways: everything is a stream (even if many have only one indefinitely repeating sample), everything is inherently parallel, there is a separate compile-time evaluation of subgraphs and a separate run-time evaluation of subgraphs, cycles in subgraphs are allowed (necessary). Let me know what you think, I'm all ears!
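To make the "inherently parallel" split/reassemble idea from (3 & 4) a bit more concrete, here is a rough Python sketch (a stand-in only, not the proposed syntax). Assumptions: `split_and_merge` is a hypothetical helper, threads stand in for cores (in CPython they will not give real CPU parallelism for pure-Python work), tagging is a single flat index rather than the nested tagging mentioned above, and batching is left out:

```python
import queue
import threading
from typing import Callable, List, Sequence, Tuple


def split_and_merge(
    items: Sequence[int], work: Callable[[int], int], workers: int = 4
) -> List[Tuple[int, int]]:
    """Workers take tagged samples from one shared input queue and push
    (tag, result) to one shared output queue in arrival order."""
    inbox: "queue.Queue[Tuple[int, int]]" = queue.Queue()
    outbox: "queue.Queue[Tuple[int, int]]" = queue.Queue()
    for tagged in enumerate(items):  # tag = original position in the stream
        inbox.put(tagged)

    def worker() -> None:
        while True:
            try:
                tag, item = inbox.get_nowait()  # grab the next available sample
            except queue.Empty:
                return  # input exhausted
            outbox.put((tag, work(item)))

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    arrival: List[Tuple[int, int]] = []
    while True:
        try:
            arrival.append(outbox.get_nowait())
        except queue.Empty:
            return arrival


arrival = split_and_merge(list(range(20)), lambda n: n * n)
unordered = [value for _, value in arrival]        # option 1: as they come
ordered = [value for _, value in sorted(arrival)]  # option 2: original order via tags
```

The only difference between the two assembly options is whether the tags are used when draining the output queue, so the choice could plausibly be a per-merge setting rather than a property of the workers.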
future goals for now then
Did I get it right?
What I described is a standalone new language unrelated to your current [...]. So the question rather is whether you want to just take some of the ideas and implement them in [...]. Which of these 2 options would you choose?
Yeah, you are right, pipe-script is not designed for this. Let's start from scratch. 👍
Ok, will you create a new repository so that I can subscribe and closely follow your development?
Here you go.
I just wanted to show the discussion we had in the new repo to someone, but found out it doesn't exist any more. Did you make the repo non-public or delete it completely? Do you have any archive of our discussion?
Sorry, but I have deleted the repo.
That's a pity - some of your work and mine is gone. So let's take this as a life lesson - never delete data; archive or hide it instead.
I wonder what the future and the roadmap look like?
I also have some (very) crazy ideas, but I'd first like to learn about the goals and roadmap, if any 😉.