Skip to content

Conversation

@vtjnash
Copy link
Contributor

@vtjnash vtjnash commented Jan 19, 2026

Guard .align directive to avoid internal compiler error on AArch64 Windows with clang.

See: llvm/llvm-project#149547
Update of #5566 to fix remaining issues
See: #5076

Guard .align directive to avoid internal compiler error on
AArch64 Windows with clang.

See: llvm/llvm-project#149547
See: OpenMathLib#5076

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
" beq 3f //dot_kernel_F1 \n"
#ifndef _MSC_VER
/* https://github.com/llvm/llvm-project/issues/149547 */
#if !(defined(__clang__) && defined(OS_WINDOWS))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the _MSC_VER to be equivalent here (it will never match plain MSVC as that can't handle the inline assembly anyway, but it matches any LLVM derivative that uses the clang-cl shim) ?

Copy link
Contributor Author

@vtjnash vtjnash Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be similar? Hard to check exactly if all libc specify that clang+_WIN32 is exactly the same as MSC when this needs to do is to specify clang+_WIN32 and exclude MSVC. It is just very confusing to intentionally use a macro to work around a bug in one specific compiler by specifying that this should be disabled for a completely different compiler

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh well, I'm not opposed to switching to your version if it feels less confusing. My idea was that it would catch any LLVM-derived compiler that goes through the MSVC backend as well, not immediately sure if those would also define __clang__

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thought is that eventually it gets fixed in llvm, and then these checks can be changed to an exact __clang_major__ check

There is clang/c2, which is clang with msvc backend (https://devblogs.microsoft.com/cppblog/clang-c2-we-need-your-advice/), but that wouldn't have this bug. I don't think there is an implementation for using msvc frontend with llvm backend

@martin-frbg
Copy link
Collaborator

Thanks - did you happen to check if these changes are sufficient for using these files with WoA ? I think it is likely that they might need changes to the storing of the complex results (similar to the earlier zdot changes) but I won't get around to checking this myself today

@vtjnash
Copy link
Contributor Author

vtjnash commented Jan 19, 2026

Thanks - did you happen to check if these changes are sufficient for using these files with WoA ?

I'm not precisely sure what you want checked. It compiles and seems to pass tests. clang has always supported c99 just fine (unlike MSVC, which still is buggy):

OpenBLAS/common.h

Lines 579 to 583 in 5ffbf38

/* C99 supports complex floating numbers natively, which GCC also offers as an
extension since version 3.0. If neither are available, use a compatible
structure as fallback (see Clause 6.2.5.13 of the C99 standard). */
#if ((defined(__STDC_IEC_559_COMPLEX__) || __STDC_VERSION__ >= 199901L || \
(__GNUC__ >= 3 && !defined(__cplusplus))) && !(defined(FORCE_OPENBLAS_COMPLEX_STRUCT))) && !defined(_MSC_VER)

@martin-frbg
Copy link
Collaborator

CI failures are spurious BTW

Thanks - did you happen to check if these changes are sufficient for using these files with WoA ?

I'm not precisely sure what you want checked. It compiles and seems to pass tests. clang has always supported c99 just fine (unlike MSVC, which still is buggy):

OpenBLAS/common.h

Lines 579 to 583 in 5ffbf38

/* C99 supports complex floating numbers natively, which GCC also offers as an
extension since version 3.0. If neither are available, use a compatible
structure as fallback (see Clause 6.2.5.13 of the C99 standard). */
#if ((defined(__STDC_IEC_559_COMPLEX__) || __STDC_VERSION__ >= 199901L || \
(__GNUC__ >= 3 && !defined(__cplusplus))) && !(defined(FORCE_OPENBLAS_COMPLEX_STRUCT))) && !defined(_MSC_VER)

Back in december I had to work around the OPENBLAS_COMPLEX_FLOAT macro in zdot_thunderx2t99.c not compiling (with LLVM21 on a WoA laptop I had on loan from Qualcomm), but could be the conditional in common.h is borked.
BTW the two NRM2 kernels had been "mostly" retired in 2024 due to low accuracy and inability to handle some corner cases - this will probably need a revisit from someone who really knows ARM assembly before they can be put to good use again.

@martin-frbg martin-frbg added this to the 0.3.32 milestone Jan 20, 2026
@martin-frbg martin-frbg merged commit 413e609 into OpenMathLib:develop Jan 20, 2026
93 of 102 checks passed
@vtjnash vtjnash deleted the jn/zdot_thunderx2t99-ICE branch January 20, 2026 13:41
@vtjnash
Copy link
Contributor Author

vtjnash commented Jan 20, 2026

It might be because I'm using mingw-w64 headers, which seem to have full c99 support, while it sounded like you might have been using MSVC headers?

@martin-frbg
Copy link
Collaborator

It might be because I'm using mingw-w64 headers, which seem to have full c99 support, while it sounded like you might have been using MSVC headers?

I guess that would explain it - I'm doing my best to keep environments separate so that both MSYS2 and "native" LLVM package builds will work on WoA.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants