
research #83

Closed
lena-hinrichsen opened this issue Apr 12, 2022 · 5 comments

@lena-hinrichsen
Member

No description provided.

@joschrew

I am using this ticket to write down what I have done so far. I have started to implement parts of the API and have also thought about its design. This could also be called research.

In my opinion, we should implement the API if we want to use it in any form. I think this is the only way to notice the weaknesses and flaws that the API will most likely have. A mock could possibly be enough, depending on how detailed it is. I am also not sure whether what I am doing counts as an implementation or is still a mock, since I am not aiming for performance.

My approach is to do a minimal implementation first. I am doing this not only to find out how the API works and what is still missing; I think it is also a good way to get to know OCR-D better, and thus another step towards "stepping into" OCR-D. I first want to provide OCR-D functionality with a Docker container, similar to what is shown in the user guide. Docker could also be used to run workflows. I realize that the performance of this will be very poor. But that is also the reason (i.e. that I want to use Docker for the workflow) why I don't see a direct link to the workflow server (pull request from Robert) for now.

  • The test-repo is here: https://github.com/joschrew/ocrd-webapi-test
  • What I have done so far:
    • first I created the stubs of the functions from the WebAPI with a tool to have a starting point: https://github.com/koxudaxi/fastapi-code-generator. Very small changes were necessary so that it starts.
    • then I implemented individual functions:
      • get discovery: implemented first because it is an easy starting point
      • post workspace: upload a workspace
      • get workspaces: list uploaded workspaces
      • post processor: start a processor

@tdoan2010

Hi, I've checked your work. Thank you.

I will take over the code from here. In my opinion, there is a more urgent topic that I hope you can investigate: how to create a Nextflow workflow via a REST endpoint. In short, please focus on this endpoint only.

You don't have to deliver a fully working endpoint at this time. Just come up with ideas for how to implement it: which options are available, what the request body should look like, etc.
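As one idea for what such a request body could look like: an ordered list of OCR-D processor calls that the server renders into a Nextflow script. All field names below are illustrative assumptions, not a settled design:

```python
# One possible shape for the workflow-creation request body, plus a
# naive renderer turning it into a sequential Nextflow script.
workflow_request = {
    "name": "binarize-and-segment",
    "steps": [
        {"processor": "ocrd-cis-ocropy-binarize",
         "input": "OCR-D-IMG", "output": "OCR-D-BIN"},
        {"processor": "ocrd-tesserocr-segment-region",
         "input": "OCR-D-BIN", "output": "OCR-D-SEG"},
    ],
}

def to_nextflow(req: dict) -> str:
    """Render the request as a very simple sequential Nextflow script."""
    lines = ["#!/usr/bin/env nextflow", ""]
    for i, step in enumerate(req["steps"]):
        lines += [
            f"process step_{i} {{",
            "    script:",
            '    """',
            f'    {step["processor"]} -I {step["input"]} -O {step["output"]}',
            '    """',
            "}",
            "",
        ]
    return "\n".join(lines)
```

A REST endpoint could then accept this JSON, run it through such a renderer, and store the resulting `.nf` file; chaining the processes via channels is deliberately left out of this sketch.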

Some weeks ago, I thought about using DolphinNext to handle these workflow endpoints because the functions we need were already implemented there. So, that might be a starting point for you.

@tdoan2010

@MehmedGIT if you have any idea how to create a Nextflow workflow via a REST endpoint, please let us know.

@MehmedGIT

In the best case, I guess, we are aiming for a good and extensive REST API, not just a single workflow POST method that we then try to parse into a Nextflow script. For example, we should be able to define separate Nextflow processes (or rather OCR-D processors) with POST methods, and then combine the previously defined processors into a single workflow with another POST method. In other words: building the Nextflow workflow script step by step and manipulating it with REST requests. If we do the parsing at the process level, it should be easier to parse. However, I am not sure about this approach either. It may end up making workflow creation a more challenging task for users if there is no WebUI to ease the building of workflows.
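The step-by-step idea can be sketched without any web framework: each `add_step` call below stands in for one POST request defining a processor, and `build()` for the final POST combining them. Class name, method names, and the pseudo-script format are all hypothetical:

```python
# Sketch of incremental workflow building, as discussed above.
class WorkflowBuilder:
    def __init__(self) -> None:
        self.steps: list[tuple[str, str, str]] = []

    def add_step(self, processor: str, input_grp: str, output_grp: str):
        # In the real API, this would be a POST registering one process.
        self.steps.append((processor, input_grp, output_grp))
        return self

    def build(self) -> str:
        # In the real API, this would be the POST that combines the
        # registered steps. The output is a pseudo-script, not valid
        # Nextflow; it only illustrates the ordering.
        calls = "\n".join(
            f"    {p} -I {i} -O {o}" for p, i, o in self.steps)
        return f"#!/usr/bin/env nextflow\n\n// generated workflow\n{calls}\n"
```

Because each request only touches one process, the server-side state stays small and the final combination step is where validation and channel wiring would happen.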

DolphinNext does not provide a good API for that purpose. Check here. From my current understanding, what they do is gather data via the WebUI and convert that data into a single POST request ("Create a Run"), which is then submitted to the server and parsed into a Nextflow script. I still have not found how exactly they do the parsing; I am checking that. Ideally, everything that DolphinNext's WebUI does, we should also be able to do with a series of REST requests to the OCR-D server.

@tdoan2010

This is the protocol of our meeting. In short, we agreed that:

  • All implementation projects agree to use Nextflow.
  • Therefore, Nextflow scripting language (Groovy) can be used as an exchange format.
  • OCR-D/core will provide a tool to validate the workflow.
    • Validating means that the workflows must not contain script tasks, but only calls to ocrd- processors.
  • We need to check Robert's work and decide what to do with it.
  • OCR-D/core needs to provide a mechanism so that all processors can expose their REST endpoints.
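A rough sketch of what the agreed validation could look like: walk the script blocks of a Nextflow file and reject any shell line that is not an ocrd- processor call. This is a line-based approximation under the assumption that script bodies are delimited by triple quotes; a real validator would parse the Groovy/Nextflow syntax properly:

```python
def validate_ocrd_workflow(nf_script: str) -> list[str]:
    """Return a list of violations: shell lines inside script blocks
    that do not invoke an ocrd- processor. Sketch only; assumes
    triple-quoted script bodies and no nested quoting."""
    errors = []
    in_script = False
    for line in nf_script.splitlines():
        stripped = line.strip()
        if stripped == '"""':
            in_script = not in_script  # enter/leave a script body
            continue
        if in_script and stripped and not stripped.startswith("ocrd-"):
            errors.append(f"non-OCR-D call: {stripped}")
    return errors
```

An empty result means the workflow passed; each entry otherwise names an offending line, which would let the tool report all violations at once instead of failing on the first.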
