Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove hyphens from pipeline names in Nextflow workflows #3365

Open
adamrtalbot opened this issue Dec 18, 2024 · 12 comments
Open

Remove hyphens from pipeline names in Nextflow workflows #3365

adamrtalbot opened this issue Dec 18, 2024 · 12 comments

Comments

@adamrtalbot
Copy link
Contributor

Description of feature

There's a common convention outside of nf-core to call your pipeline nf-${thing}, but in doing this we add a hyphen which is an invalid Nextflow workflow name.

It's an easy fix shortly after creation to remove the hyphen, but we can replace all hyphens with strings while creating the subworkflow names.

@adamrtalbot
Copy link
Contributor Author

We could also reduce the number of workflows so this only affects one workflow in total, but this is a bigger issue.

I.e. the hierarchy would be from:

  1. anonymous workflow (main.nf)
  2. NF_CORE_PIPELINE_NAME (main.nf)
  3. PIPELINE_NAME (workflows/main.nf)

To:

  1. anonymous workflow (main.nf)
  2. PIPELINE_NAME (workflows/main.nf)

Cut out the middle man workflow.

@colinbrislawn
Copy link

I just ran into this when using nf-core pipelines create

(I'm not sure if this is the best place to open pipelines create issues, so feel free to move or x-link this post!)

User story:

I'll make a new pipeline and I'll call it ex-ample !

Then, the first time I run it:

ERROR ~ Script compilation error
- file : github/colinbrislawn/ex-ample/main.nf
- cause: Not a valid include module name @ line 16, column 1.
   include { EX-AMPLE  } from './workflows/ex-ample'
   ^

Script compilation error
- file : github/colinbrislawn/ex-ample/main.nf
- cause: Workflow definition syntax error -- A string identifier must be provided after the `workflow` keyword @ line 28, column 9.
   workflow COLINBRISLAWN_EX-AMPLE {

@adamrtalbot
Copy link
Contributor Author

adamrtalbot commented Jan 17, 2025

@colinbrislawn the easy workaround is to replace all EX-AMPLE with EX_AMPLE in every *.nf file, which will fix the issue but preserve all the ex-ample pipeline names in other files like schemas etc. Straightforward with vscode, you can use find and replace.

Do it as a separate commit and Git will sort out template merging in future.

This shouldn't be too hard a problem to solve but someone needs to find time to do it.

@colinbrislawn
Copy link

colinbrislawn commented Jan 17, 2025

Understood!

I left a comment because nf-core pipelines create generated invalid code and this is the closest existing issue I could find.

If there is a better place to put this story please let me know!

@awgymer
Copy link
Contributor

awgymer commented Jan 21, 2025

We issue a warning:

Your workflow name is not lowercase without punctuation. This may cause Nextflow errors.\nConsider changing the name to avoid special characters.

But:

  • only gets issued when you've already gone through the whole TUI and created the pipeline
  • we know which characters will break nextflow (anything not covered by Java Identifiers AFAICT)

I am against silently munging people's chosen pipeline name but I don't see why we should continue allowing people to choose names which violate the known allowed set.

I am not sure how possible it is to add validation on the TextInput for workflow name in the TUI but at the very least I think we could migrate the current warning for nf-core pipelines into the same hard error as we use for nf-core pipelines.

@colinbrislawn
Copy link

I don't see why we should continue allowing people to choose names which violate the known allowed set.

👍

@adamrtalbot
Copy link
Contributor Author

Repo names often have to obey a "standard", dictated by outside groups. If you're other programming team insist on calling everything {lang}-{proj}-{purpose} you don't get much choice in what it's called.

It’s straightforward to update the workflow name to ensure it’s syntactically correct without impacting anything outside the repository. In fact, the pipeline name isn’t required within the repository itself at all. Personally, I would name the workflow in workflows/main.nf as PIPELINE_NAME, as that’s the only instance where the pipeline name is referenced within the repository code.

@awgymer
Copy link
Contributor

awgymer commented Jan 21, 2025

The repo name is far less linked to the concept of the pipeline than the name and code (and probably shouldn't be managed by nf-core anyway).

Would be easier to just leave that up to the people who need a custom repo name (which isn't tied to the pipeline at all anyway) rather than randomly changing the users' supplied pipeline name and maintaining the code logic which goes along with that.

As a kebab-case fan for repo names that's what I have always done anyway 🤷🏼‍♂️

@adamrtalbot
Copy link
Contributor Author

That's exactly what this ticket is for. The nf-core tooling bakes in the pipeline name in many places, often putting it in a URL (e.g. github.com/org/pipeline-name). In the current form, the easiest thing is to create a pipeline with an invalid name (e.g. kebab case) then use the first commit to fix the workflow name.

From a recent template I made the pipeline name is used in:

  • .nf-core.yml
  • CHANGELOG
  • main.nf*
  • nextflow_schema.json
  • nextflow.config
  • README.md
  • CONTRIBUTING.md
  • feature_request.yml
  • multiqc_config.yml
  • base.config
  • test.config
  • test_full.config
  • usage.md
  • subworkflows/local/utils_{pipelinename}_pipeline*

Only the ones marked with an asterisk can be different from the repo name. By happy coincidence, they are also the ones that can't have a hyphen. I'd also argue the last one can be named something much better.

@awgymer
Copy link
Contributor

awgymer commented Jan 22, 2025

Ok you've persuaded me there, I see the issue regarding the github links.

I still think I would want to raise the issue to the user before they proceed to create their pipeline (i.e. with some sort of warning on the tui when you enter the name in the box)

Additionally this reminds me of another gripe I have with the pipeline - the github links are a pain to change if you host elsewhere. Should make a feature request to allow users to customise their repo URL root 🤔

@adamrtalbot
Copy link
Contributor Author

Should make a feature request to allow users to customise their repo URL root 🤔

Hmm yes. I use Github now but I remember this being a pain.

Now I think about it, we could think of it like this. Currently we have a pipeline name, which is used in multiple contexts, but really we have a project name which covers external facing stuff and the workflow name, which is internal.

If we imagine a system where we can import Nextflow pipelines, we can't do this because it's ambiguous:

include { github.com/nf-core/rnaseq }

We have to do this:

include { RNASEQ } from github.com/nf-core/rnaseq

Therefore we should have two fields, pipeline name and workflow name, where workflow name nearly always equals pipeline name but not necessarily.

Stream of conscious over.

@colinbrislawn
Copy link

We could uncouple workflow name, pipeline name, and repo name.

My request is that anything that is coupled should be validated across all of them. Invalid for one means invalid for all to avoid traps later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants