Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The '%w' and '%i' syntax for arrays is just a bad idea #908

Open
nello opened this issue Nov 14, 2022 · 11 comments
Open

The '%w' and '%i' syntax for arrays is just a bad idea #908

nello opened this issue Nov 14, 2022 · 11 comments

Comments

@nello
Copy link

nello commented Nov 14, 2022

Why are these forms favoured? They're harder to read than the original syntax and easily lead to errors with commas. I've personally lost hours chasing non-existent bugs thanks to this "feature" because the eye naturally skips over commas in lists.

I get that people seem to love new shiny things, but this is just a poor decision IMHO.

@pirj
Copy link
Member

pirj commented Nov 14, 2022

You must be referring to the %w guideline:

Prefer %w to the literal array syntax when you need an array of words (non-empty strings without spaces and special characters in them). Apply this rule only to arrays with two or more elements.

with the following examples:

# bad
STATES = ['draft', 'open', 'closed']

# good
STATES = %w[draft open closed]

Do you have some counterexamples in mind?

As the style guide strives to follow the community usage of the Ruby language,

A style guide that reflects real-world usage gets used

would you like to harvest real-world-ruby-apps meta-repo to have some evidence that the literal string array syntax is used more often than %w?

@nello
Copy link
Author

nello commented Nov 14, 2022

Thanks for the reply.

The example that keeps occurring for me is the process of adding another token to a %w-ified list. As array elements in nearly all languages have commas between them, silently automatically adding the comma into the token just causes pain.

The other one is long lines:

STATES = %w[This is an example of what happens when you wrap over a line and there are more words than I,
or anyone else can parse in a gestalt]

Is there a way to harvest real-world usage from before the recommendation was added to the style guide? It has an undue influence for better or worse, as it gets rolled into tooling (especially IntelliJ/RubyMine in my case).

@pirj
Copy link
Member

pirj commented Nov 14, 2022

silently automatically adding the comma into the token just causes pain

The guideline non-ambiguously tells to "strings without ... special characters in them", where a comma is a special character. Such usage would be detected by a corresponding cop and would save you from the pain:

Within %w/%W, quotes and ',' are unnecessary and may be unwanted in the resulting strings

Are there other real daily usage examples that are problematic for you?

Is there a way to harvest real-world usage from before the recommendation was added to the style guide?

Things are as they are now. We don't doubt much about the effectiveness of boolean logic vs tri-state logic, using null, or byte length, even though, as an after-thought, those might not have been the best design decisions. I'm just stating that the guide reflects modern real-life usage, even though early iterations of the guide might have affected real-world usage.

@bbatsov
Copy link
Collaborator

bbatsov commented Nov 14, 2022

Agreed. We generally try to avoid affecting real-world usage, but rather reflect it in the guide. Btw, I'm not a big fan of %-literals myself and I've often publicly criticized them, but we can't deny they have always been quite popular in the wild.

@nello
Copy link
Author

nello commented Nov 14, 2022

Re the "corresponding cop" suggestion I'm pretty dubious that fixing issues exacerbated by tooling with more tooling is a virtuous cycle. I think that the guide has a degree of influence it's wrong to minimise. If a practice is poor, is popularity enough to make it a recommendation?

@pirj
Copy link
Member

pirj commented Nov 14, 2022

Are you confident enough to state that a certain practice is poor, basing on just your personal experience? If you look at this real-world- meta-repos, you would notice that not all of them use RuboCop, and they might even use alternative Ruby style guides. We are not affiliated with those meta-repos, and use them as an open data reference.

If a practice is poor, is popularity enough to make it a recommendation?

I understand what you mean. Yes. As Bozhidar mentions above about his personal preference to avoid %w, I have my personal preferences, too, that are going against what the style guide tells, usage of or/and being a notable example. Still, we try to be objective and unbiased towards personal preferences.

@Drenmi
Copy link
Contributor

Drenmi commented Nov 23, 2022

The argument for this is reduced git churn for multiline lists, as only one line will be touched when re-ordering, preserving git blame, etc. To that end, the argument of problems with commas apply equally to the "normal" form, as it has happened quite frequently at work that someone re-ordered an array literal and forgot to move the comma. 😅

To me the strongest argument against using this style is that it can't be consistently applied. I prefer rules that aren't applied conditionally, like this one is as it requires the qualifier "non-empty strings without spaces and special characters in them".

Personally I'm on the fence about which one is "better". There's clearly a case to be made for either. But as it's a community style guide, I think we need to tolerate that the community might not always share our personal preferences.

@JohnCoates
Copy link

would you like to harvest real-world-ruby-apps meta-repo to have some evidence that the literal string array syntax is used more often than %w?

@pirj I'm interested in this issue as well. Is there a guide for harvesting?

@pirj
Copy link
Member

pirj commented May 9, 2023

@JohnCoates something like:

  1. Clone it with sub modules (might require fixing refs as there are dead ones, and ones with renamed integration branches)
  2. Write a cop that would raise an offense for %i usage
  3. Write a cop that would detect %w
  4. Write a cop that would detect []
  5. Run those three cops separately against the metarepo. You may need to remove all .rubocop*.yml files from there.
  6. Compare the number of offences for each

there was a tool that would save you on steps 2-5 that just goes through a directory and prints all matched usages https://jonatas.github.io/fast/, but I have never used it.
Grep would not be a viable option.

@pirj
Copy link
Member

pirj commented May 9, 2023

easily lead to errors with commas

I believe there is or should be a cop that would catch punctuation inside percent literals, as they certainly are unintentional in most cases if not always.

@marknuzz
Copy link

marknuzz commented Oct 3, 2024

The argument for this is reduced git churn for multiline lists, as only one line will be touched when re-ordering, preserving git blame, etc.

@Drenmi IMO that's not a great argument, ideally you'd be using git-blame-ignore-revs which has been available since 2019.

That being said, it sounds like the OP doesn't believe in linters (which have been in use since 1978) and I feel that's the root cause of the OP's problem, not the style guide (which as others pointed out, also addresses the scenario). Linters are the only realistic way to ensure a consistent application of a style guide. By definition, if you aren't using them, you're not applying the style guide. The OP's admission that not all the rules are being followed is proof of that. So I feel this issue is moot.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants