owcc -O2 (-O3) with -mtune=i686 does not create cmovl or cmovg (wcc386 gets -onatxhl+ -6r) #1308

Open
winspool opened this issue Jul 24, 2024 · 2 comments
Labels
CG Code generator enhancement

Comments

@winspool
Contributor

I tried a code example from a YouTube video and found
that OW does not recognize the branchless code
and always emits a branch for the other examples.

The original source had a branchless version and a version with "if".
I added a third, commonly used style with a conditional:

/* original example: branchless code */
int smaller_branchless(int a, int b)
{
    return a* (a<b)  +  b*(b<=a);
}

/* my extension: A very common usage with a conditional */
int smaller_cond(int a, int b)
{
    return (a < b) ? a : b;
}

/* original example: with if */
int smaller_if(int a, int b)
{
    if (a < b)
        return a;
    else
        return b;
}
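All three variants return the smaller of the two arguments, which is why a compiler is free to emit identical code for them. A minimal sketch that checks this over a small range follows; the helper names `count_mismatches` and `check_variants` and the chosen range are assumptions made for illustration and are not part of the original example.

```c
#include <assert.h>

/* The three variants from above, repeated so this block compiles on its own. */
int smaller_branchless(int a, int b) { return a * (a < b) + b * (b <= a); }
int smaller_cond(int a, int b)       { return (a < b) ? a : b; }
int smaller_if(int a, int b)         { if (a < b) return a; else return b; }

/* Hypothetical helper: counts disagreements between the variants
   for all pairs (a, b) with lo <= a, b <= hi. */
int count_mismatches(int lo, int hi)
{
    int a, b, bad = 0;

    for (a = lo; a <= hi; a++) {
        for (b = lo; b <= hi; b++) {
            int expected = smaller_if(a, b);
            if (smaller_cond(a, b) != expected)
                bad++;
            if (smaller_branchless(a, b) != expected)
                bad++;
        }
    }
    return bad;
}

/* Hypothetical self-check: aborts if any pair in the range disagrees. */
void check_variants(void)
{
    assert(count_mismatches(-100, 100) == 0);
}
```

Calling `check_variants()` from a test program is enough to exercise all three functions; the range is kept small so that the multiplication in the branchless variant cannot overflow.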

clang (-O2/-O3) detects the branchless version and produces the same code for all 3 variants:

   0:	8b 44 24 08          	mov    0x8(%esp),%eax
   4:	8b 4c 24 04          	mov    0x4(%esp),%ecx
   8:	39 c1                	cmp    %eax,%ecx
   a:	0f 4c c1             	cmovl  %ecx,%eax
   d:	c3                   	ret

With gcc (-O2/-O3), the code for the branchless version is a bit longer,
but gcc creates the same code for the conditional and the "if" version:

  20:	8b 44 24 08          	mov    0x8(%esp),%eax
  24:	8b 54 24 04          	mov    0x4(%esp),%edx
  28:	39 d0                	cmp    %edx,%eax
  2a:	0f 4f c2             	cmovg  %edx,%eax
  2d:	c3                   	ret

OpenWatcom has a small advantage here because of the selected register calling convention (-6r),
but it does not detect what the branchless code is doing,
and for the conditional source example and the "if" source example,
the code produced by OW (owcc -mtune=i686 with -O2 or with -O3) always contains a branch:

0020  39 D0				cmp		eax,edx
0022  7D 01				jge		L$1
0024  C3				ret
0025				L$1:
0025  89 D0				mov		eax,edx
0027  C3				ret

I expect that the cost of the branching code produced by OW
is smaller on recent CPUs (speculative execution, register renaming, branch prediction, ...)
than it was on the Pentium Pro processor generation (when the "cmov" family of instructions appeared),
but when a branch can be avoided easily, it should be (see clang and gcc).
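As an illustration of what "avoiding the branch" means at the source level, the selection can also be written with a bit mask instead of a multiplication or a conditional. This is only a sketch of that well-known idiom: the names `smaller_mask` and `check_smaller_mask` are hypothetical, the code assumes a two's complement `int`, and it makes no claim about the code OW generates for it.

```c
#include <assert.h>

/* Hypothetical helper, not from the original example.
   Assumes two's complement int, so that -1 has all bits set. */
int smaller_mask(int a, int b)
{
    int mask = -(a < b);          /* all ones if a < b, otherwise zero */

    return b ^ ((a ^ b) & mask);  /* a if mask is all ones, b if mask is zero */
}

/* Hypothetical self-check for a few representative inputs. */
void check_smaller_mask(void)
{
    assert(smaller_mask(3, 5) == 3);
    assert(smaller_mask(5, 3) == 3);
    assert(smaller_mask(4, 4) == 4);
    assert(smaller_mask(-7, 2) == -7);
}
```

The mask form mirrors what a cmov does in hardware: both values are available, and the comparison result only selects which one is kept.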

@winspool
Contributor Author

winspool commented Jul 24, 2024

I used Compiler Explorer and verified that icc (2021.10) and msvc (17.10)
create similar branchless code for the conditional example and the "if" example.

https://msvc.godbolt.org/z/YYdv41x97

@jmalak jmalak added enhancement CG Code generator labels Jul 25, 2024
@jmalak
Member

jmalak commented Jul 25, 2024

Thanks for your info.
