Skip to content
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 1 addition & 7 deletions llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2016,14 +2016,8 @@ SDValue SelectionDAGBuilder::getValueImpl(const Value *V) {
// If this is an instruction which fast-isel has deferred, select it now.
if (const Instruction *Inst = dyn_cast<Instruction>(V)) {
Register InReg = FuncInfo.InitializeRegForValue(Inst);

std::optional<CallingConv::ID> CallConv;
auto *CB = dyn_cast<CallBase>(Inst);
if (CB && !CB->isInlineAsm())
CallConv = CB->getCallingConv();

RegsForValue RFV(*DAG.getContext(), TLI, DAG.getDataLayout(), InReg,
Inst->getType(), CallConv);
Inst->getType(), std::nullopt);
SDValue Chain = DAG.getEntryNode();
return RFV.getCopyFromRegs(DAG, FuncInfo, getCurSDLoc(), Chain, nullptr, V);
}
Expand Down
6 changes: 6 additions & 0 deletions llvm/lib/Target/X86/X86FastISel.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3265,6 +3265,12 @@ bool X86FastISel::fastLowerCall(CallLoweringInfo &CLI) {
bool Is64Bit = Subtarget->is64Bit();
bool IsWin64 = Subtarget->isCallingConvWin64(CC);

// If the return type is ilegal, don't bother to promote it, just fall back to
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic kind of makes sense, but this is too narrow, and I would expect would already logically happen. Why wouldn't the same apply for the input arguments? I thought fast isel already gave up on illegally typed operations, so why does this need a new special case?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't use mips as a reference on how to do anything

Comment thread
LuoYuanke marked this conversation as resolved.
Outdated
// DAG ISel.
MVT RetVT;
if (!isTypeLegal(CLI.RetTy, RetVT) && !CLI.RetTy->isVoidTy())
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably should be processing the processed return type, instead of this. AArch64 example:

if (!isTypeLegal(Val->getType(), VT) &&

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think CLI.OutVals is not the returned values. X86 has code to check it at https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/X86/X86FastISel.cpp#L3306

return false;

// Call / invoke instructions with NoCfCheck attribute require special
// handling.
if (CB && CB->doesNoCfCheck())
Expand Down
55 changes: 55 additions & 0 deletions llvm/test/CodeGen/X86/bf16-fast-isel.ll
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,10 @@ define i8 @test_direct_call(ptr %f) nounwind {
; CHECK: # %bb.0: # %entry
; CHECK-NEXT: pushq %rax
; CHECK-NEXT: callq foo@PLT
; CHECK-NEXT: pextrw $0, %xmm0, %eax
; CHECK-NEXT: shll $16, %eax
; CHECK-NEXT: movd %eax, %xmm0
; CHECK-NEXT: callq __truncsfbf2@PLT
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are redundant but functional correct. I think it's not a big problem for fast isel.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that's true? Adding a "redundant" fpext/fptrunc will canonicalize, which is not allowed for just function return / argument passing.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I think this may be a more general issue with SDAG getCopyFromParts/getCopyToParts. It does it via FP_EXTEND/FP_ROUND, which is not really correct, but also tricky to avoid (except maybe in this specific bfloat case, where the trunc could just be a shift, as long as we know that the low bits must be zero).

; CHECK-NEXT: callq bar@PLT
; CHECK-NEXT: popq %rcx
; CHECK-NEXT: retq
Expand All @@ -20,6 +24,10 @@ define i8 @test_fast_direct_call(ptr %f) nounwind {
; CHECK: # %bb.0: # %entry
; CHECK-NEXT: pushq %rax
; CHECK-NEXT: callq foo_fast@PLT
; CHECK-NEXT: pextrw $0, %xmm0, %eax
; CHECK-NEXT: shll $16, %eax
; CHECK-NEXT: movd %eax, %xmm0
; CHECK-NEXT: callq __truncsfbf2@PLT
; CHECK-NEXT: callq bar@PLT
; CHECK-NEXT: popq %rcx
; CHECK-NEXT: retq
Expand All @@ -36,6 +44,10 @@ define i8 @test_indirect_all(ptr %fptr, ptr %f) nounwind {
; CHECK-NEXT: movq %rdi, %rbx
; CHECK-NEXT: movq %rsi, %rdi
; CHECK-NEXT: callq foo@PLT
; CHECK-NEXT: pextrw $0, %xmm0, %eax
; CHECK-NEXT: shll $16, %eax
; CHECK-NEXT: movd %eax, %xmm0
; CHECK-NEXT: callq __truncsfbf2@PLT
; CHECK-NEXT: callq *%rbx
; CHECK-NEXT: popq %rbx
; CHECK-NEXT: retq
Expand All @@ -45,13 +57,56 @@ entry:
ret i8 %call2
}

define i8 @test_indirect_all2(ptr %fptr, ptr %f, i1 %cond) nounwind {
; CHECK-LABEL: test_indirect_all2:
; CHECK: # %bb.0: # %entry
; CHECK-NEXT: pushq %rbp
; CHECK-NEXT: pushq %rbx
; CHECK-NEXT: pushq %rax
; CHECK-NEXT: movl %edx, %ebp
; CHECK-NEXT: movq %rdi, %rbx
; CHECK-NEXT: movq %rsi, %rdi
; CHECK-NEXT: callq foo@PLT
; CHECK-NEXT: testb $1, %bpl
; CHECK-NEXT: je .LBB3_2
; CHECK-NEXT: # %bb.1: # %exit
; CHECK-NEXT: pextrw $0, %xmm0, %eax
; CHECK-NEXT: shll $16, %eax
; CHECK-NEXT: movd %eax, %xmm0
; CHECK-NEXT: callq __truncsfbf2@PLT
; CHECK-NEXT: callq *%rbx
; CHECK-NEXT: jmp .LBB3_3
; CHECK-NEXT: .LBB3_2: # %exit2
; CHECK-NEXT: movb $3, %al
; CHECK-NEXT: .LBB3_3: # %exit2
; CHECK-NEXT: addq $8, %rsp
; CHECK-NEXT: popq %rbx
; CHECK-NEXT: popq %rbp
; CHECK-NEXT: retq
entry:
%call = call bfloat @foo(ptr %f)
br i1 %cond, label %exit, label %exit2

exit:
%call2 = call zeroext i8 %fptr(bfloat %call)
ret i8 %call2

exit2:
ret i8 3
}


define i8 @test_fast_indirect_all(ptr %fptr, ptr %f) nounwind {
; CHECK-LABEL: test_fast_indirect_all:
; CHECK: # %bb.0: # %entry
; CHECK-NEXT: pushq %rbx
; CHECK-NEXT: movq %rdi, %rbx
; CHECK-NEXT: movq %rsi, %rdi
; CHECK-NEXT: callq foo@PLT
; CHECK-NEXT: pextrw $0, %xmm0, %eax
; CHECK-NEXT: shll $16, %eax
; CHECK-NEXT: movd %eax, %xmm0
; CHECK-NEXT: callq __truncsfbf2@PLT
; CHECK-NEXT: callq *%rbx
; CHECK-NEXT: popq %rbx
; CHECK-NEXT: retq
Expand Down
26 changes: 26 additions & 0 deletions llvm/test/CodeGen/X86/fast-isel-v4i1.ll
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 6
; RUN: llc --fast-isel -mcpu=znver5 < %s -mtriple=x86_64-unknown-unknown | FileCheck %s

define fastcc i16 @test() nounwind {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer this test file was called pr179100.ll for better reference

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revised.

; CHECK-LABEL: test:
; CHECK: # %bb.0: # %entry
; CHECK-NEXT: pushq %rax
; CHECK-NEXT: vpxor %xmm0, %xmm0, %xmm0
; CHECK-NEXT: xorl %edi, %edi
; CHECK-NEXT: callq foo@PLT
; CHECK-NEXT: vpslld $31, %xmm0, %xmm0
; CHECK-NEXT: xorl %edi, %edi
; CHECK-NEXT: vpmovd2m %xmm0, %k0
; CHECK-NEXT: vpxor %xmm0, %xmm0, %xmm0
; CHECK-NEXT: vpmovm2d %k0, %xmm1
; CHECK-NEXT: callq bar@PLT
; CHECK-NEXT: popq %rcx
; CHECK-NEXT: retq
entry:
%0 = call fastcc <4 x i1> @foo(ptr null, <4 x i1> zeroinitializer)
%1 = call fastcc i16 @bar(ptr null, <4 x i1> zeroinitializer, <4 x i1> %0)
ret i16 %1
}

declare fastcc <4 x i1> @foo(ptr, <4 x i1>)
declare fastcc i16 @bar(ptr, <4 x i1>, <4 x i1>)