4 changes: 4 additions & 0 deletions clang/lib/CodeGen/CGExprAgg.cpp
@@ -1978,6 +1978,10 @@ void AggExprEmitter::EmitCheckedBoundPointerArithmetic(
  LValue BaseLV = CGF.EmitAggExprToLValue(Base);
  EmitBoundPointerArithmetic(DestLV, BaseLV, Idx, IsSigned, IsSub);

  if (IsSub) {
    Idx = Builder.CreateNeg(Idx, "idx.neg");

Reviewer

I'm a little confused by this. IsSub is already being passed to the IsSubtraction parameter of EmitCheckedInBoundsGEP, so the method knows we are doing a subtraction. It seems surprising that we would also need to negate Idx.

Are we passing the wrong value to the SignedIndices parameter of EmitCheckedInBoundsGEP?

Author

I was pretty confused by this as well and looked at a number of callsites. Looking at the complete code you pasted above, it negates the index in the first case but not in the second (which might be a bug?). The last case is just a plain increment at least according to the comment.

I see the negation in other places as well.

    if (const VariableArrayType *vla
          = CGF.getContext().getAsVariableArrayType(type)) {
      llvm::Value *numElts = CGF.getVLASize(vla).NumElts;
      if (!isInc) numElts = Builder.CreateNSWNeg(numElts, "vla.negsize");
      llvm::Type *elemTy = CGF.ConvertTypeForMem(vla->getElementType());
      if (CGF.getLangOpts().PointerOverflowDefined)
        value = Builder.CreateGEP(elemTy, value, numElts, "vla.inc");
      else
        value = CGF.EmitCheckedInBoundsGEP(
            elemTy, value, numElts, /*SignedIndices=*/false, isSubtraction,
            E->getExprLoc(), "vla.inc");

    // Arithmetic on function pointers (!) is just +-1.
    } else if (type->isFunctionType()) {
      llvm::Value *amt = Builder.getInt32(amount);

      if (CGF.getLangOpts().PointerOverflowDefined)
        value = Builder.CreateGEP(CGF.Int8Ty, value, amt, "incdec.funcptr");
      else
        value =
            CGF.EmitCheckedInBoundsGEP(CGF.Int8Ty, value, amt,
                                       /*SignedIndices=*/false, isSubtraction,
                                       E->getExprLoc(), "incdec.funcptr");

    // For everything else, we can just do a simple increment.
    } else {
      llvm::Value *amt = Builder.getInt32(amount);
      llvm::Type *elemTy = CGF.ConvertTypeForMem(type);
      if (CGF.getLangOpts().PointerOverflowDefined)
        value = Builder.CreateGEP(elemTy, value, amt, "incdec.ptr");
      else
        value = CGF.EmitCheckedInBoundsGEP(
            elemTy, value, amt, /*SignedIndices=*/false, isSubtraction,
            E->getExprLoc(), "incdec.ptr");
    }

SignedIndices is independent of this. For example, in the test I wrote, SignedIndices = false and IsSub = true, while if I replace ptr - offset; with ptr - 3;, SignedIndices = true and IsSub = false (which makes sense).

Author

I will double-check this and add some more tests.

Reviewer

> The last case is just a plain increment at least according to the comment.

I think that comment is misleading: on that path, isSubtraction is defined as bool isSubtraction = !isInc;, so this can still be a decrement.
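
As an illustration of that point, here is a minimal standalone sketch (not Clang code). `IncDecParams` and `modelIncDec` are hypothetical names. The relation `isSubtraction = !isInc` is taken from the comment above; the initialisation `amount = isInc ? 1 : -1` is an assumption about how `amount` is set on this path.

```cpp
#include <cassert>

// Hypothetical stand-in for the two values the quoted increment/decrement
// path hands to EmitCheckedInBoundsGEP.
struct IncDecParams {
  int amount;         // index passed to the GEP
  bool isSubtraction; // flag passed alongside it
};

// `isSubtraction = !isInc` comes from the review thread.
// `amount = isInc ? 1 : -1` is an assumption made for this sketch.
constexpr IncDecParams modelIncDec(bool isInc) {
  return IncDecParams{isInc ? 1 : -1, !isInc};
}

// ++p: positive index, not a subtraction.
static_assert(modelIncDec(true).amount == 1 &&
                  !modelIncDec(true).isSubtraction,
              "increment");
// --p: the index is already negative *and* the subtraction flag is set.
static_assert(modelIncDec(false).amount == -1 &&
                  modelIncDec(false).isSubtraction,
              "decrement");
```

If that assumption holds, the "simple increment" path would also be passing an already-negated index together with the subtraction flag, which is consistent with the negation seen at the other callsites.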

Reviewer

> SignedIndices = false and IsSub = true, while if I replace ptr - offset; with ptr - 3; SignedIndices = true and IsSub = false (which makes sense).

That's interesting. It sounds like when we have a constant we are effectively treating this as ptr + (-3).

I'm still very suspicious, because telling EmitCheckedInBoundsGEP that:

  • We are doing subtraction
  • The index is an unsigned value

would suggest there isn't a need to negate the index at all. However, a quick glance at the implementation shows:

Value *
CodeGenFunction::EmitCheckedInBoundsGEP(llvm::Type *ElemTy, Value *Ptr,
                                        ArrayRef<Value *> IdxList,
                                        bool SignedIndices, bool IsSubtraction,
                                        SourceLocation Loc, const Twine &Name) {
  llvm::Type *PtrTy = Ptr->getType();

  llvm::GEPNoWrapFlags NWFlags = llvm::GEPNoWrapFlags::inBounds();
  if (!SignedIndices && !IsSubtraction)
    NWFlags |= llvm::GEPNoWrapFlags::noUnsignedWrap();

  Value *GEPVal = Builder.CreateGEP(ElemTy, Ptr, IdxList, Name, NWFlags);

  // stuff that does use `IsSubtraction` for computing overflow checking.


  return GEPVal;

It seems that, at least for the computed value we return, IsSubtraction isn't really used other than for setting the wrapping flags. This makes IsSubtraction seem like a bit of a footgun.
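
To make the footgun concrete, here is a minimal standalone C++ model of the behaviour described above. It is a sketch, not LLVM code: `modelHasNUW` and `modelGEPAddress` are hypothetical names. The only logic taken from the quoted implementation is the `!SignedIndices && !IsSubtraction` condition and the fact that the returned GEP adds whatever index it is given.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the quoted flag logic: the GEP gets the
// no-unsigned-wrap flag only when the indices are unsigned and the
// expression is not a subtraction.
constexpr bool modelHasNUW(bool SignedIndices, bool IsSubtraction) {
  return !SignedIndices && !IsSubtraction;
}

// Hypothetical model of the returned address: a GEP always *adds*
// Idx * ElemSize to the base, whatever IsSubtraction says. Unsigned
// arithmetic is used so that wrap-around is well defined in this model.
constexpr std::uint64_t modelGEPAddress(std::uint64_t Base, std::int64_t Idx,
                                        std::uint64_t ElemSize,
                                        bool /*IsSubtraction*/) {
  return Base + static_cast<std::uint64_t>(Idx) * ElemSize;
}

// IsSubtraction=true without negating the index still moves the pointer up.
static_assert(modelGEPAddress(0x1000, 3, 1, true) == 0x1003,
              "un-negated index is added");
// The caller has to negate the index to obtain base - 3.
static_assert(modelGEPAddress(0x1000, -3, 1, true) == 0x0FFD,
              "negated index subtracts");
// The flag only influences the wrapping flags.
static_assert(modelHasNUW(false, false), "unsigned addition gets nuw");
static_assert(!modelHasNUW(false, true), "subtraction drops nuw");
```

In this model the `IsSubtraction` argument never changes the address, so a caller that sets the flag but forgets to negate gets `base + idx`.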

Author

Yes, this seems to be bad API design, and fixing it would mean updating all the callsites. If you want, I can create a radar and fix that upstream, but we would also need a separate patch in this repo for the bounds-safety usages. What would you recommend?

@delcypher, Apr 29, 2026

The origin of this seems to be from https://reviews.llvm.org/D34121.

So perhaps this is expected behavior because the doxygen comments say

  /// Same as IRBuilder::CreateInBoundsGEP, but additionally emits a check to
  /// detect undefined behavior when the pointer overflow sanitizer is enabled.
  /// \p SignedIndices indicates whether any of the GEP indices are signed.
  /// \p IsSubtraction indicates whether the expression used to form the GEP
  /// is a subtraction.
  llvm::Value *EmitCheckedInBoundsGEP(llvm::Type *ElemTy, llvm::Value *Ptr,
                                      ArrayRef<llvm::Value *> IdxList,
                                      bool SignedIndices, bool IsSubtraction,
                                      SourceLocation Loc,
                                      const Twine &Name = "");

and IRBuilder::CreateInBoundsGEP doesn't have any notion of subtraction. Everything is an addition, and I think you have to manually negate any value in the IdxList first if you want to do negative indexing. Also, because IdxList is a list and not a single llvm::Value*, which of those should be subtracted?
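
A rough standalone sketch of why a single "subtract" flag is ambiguous for an index list. `modelTotalOffset` is a hypothetical helper, and the strides (32 and 4 bytes, as for an `int[4][8]` array) are illustrative assumptions, not values from this patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model: the byte offset of a GEP is the sum of
// index * stride over the whole index list. Each index carries its own
// sign; there is no list-wide "subtract" mode.
inline std::int64_t modelTotalOffset(const std::vector<std::int64_t> &IdxList,
                                     const std::vector<std::int64_t> &Strides) {
  assert(IdxList.size() == Strides.size());
  std::int64_t Total = 0;
  for (std::size_t I = 0; I < IdxList.size(); ++I)
    Total += IdxList[I] * Strides[I];
  return Total;
}
```

With strides `{32, 4}`, the index list `{1, -2}` gives an offset of 24 bytes while `{-1, 2}` gives -24, so the direction has to be encoded in each index rather than in one boolean.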

That being said, this is pretty confusing. I think we should at least file an issue upstream and try to clean this up in a separate PR.

I don't want to block fixing the -fbounds-safety/ubsan bug on that though.

Reviewer

Oh, something else to think about here is how this interacts with SanitizerKind::ArrayBounds, and whether that expects Idx to be negated or not.

Author

Yep, that also expects the index to be negated. I took a look at the code and a few callsites.

Author

I will update with more tests and resolve this conversation afterwards.

  }

  if (CGF.SanOpts.has(SanitizerKind::ArrayBounds))
    CGF.EmitBoundsCheck(E, Base, Idx, IdxType,
                        /*Accessed*/ false);
18 changes: 18 additions & 0 deletions clang/test/BoundsSafety/CodeGen/ubsan/ptr-overflow-subtraction.c
@@ -0,0 +1,18 @@
// RUN: %clang_cc1 -O0 -fbounds-safety -fsanitize=pointer-overflow -fsanitize-trap=pointer-overflow -emit-llvm %s -o - | FileCheck %s

#include <ptrcheck.h>

// Regression test for pointer subtraction generating a false-positive
// UBSan pointer overflow check when -fbounds-safety is enabled.
// The check must use the negated index so that (base - offset) ule base
// is checked, not (base + offset) ule base which is unconditionally false.
void ptr_sub(unsigned char * __bidi_indexable ptr, unsigned int offset) {
  unsigned char * __bidi_indexable ptr2 = ptr - offset;
  (void)ptr2;
}

// CHECK: %[[OFFSET:[a-z0-9]+]] = load i32
// CHECK: %[[NEG:[a-z0-9.]+]] = sub i32 0, %[[OFFSET]]
// CHECK: getelementptr{{.*}} %[[NEG]]
// CHECK: %[[EXT:[a-z0-9.]+]] = sext i32 %[[NEG]] to i64
// CHECK: call { i64, i1 } @llvm.smul.with.overflow.i64(i64 1, i64 %[[EXT]])
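
The false positive described in the test's header comment can be reproduced with a small standalone model. This is a sketch under stated assumptions, not the sanitizer's actual code: `modelSubCheckPasses` and `modelNegate` are hypothetical names, and the only behaviour modelled is the `ule base` comparison named in that comment, applied to an address the GEP computes by addition.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the unsigned-subtraction check described in the
// test comment: the computed address must be unsigned-less-or-equal to
// the base. The GEP always adds; unsigned wrap-around is well defined.
constexpr bool modelSubCheckPasses(std::uint64_t Base, std::uint64_t Idx) {
  return Base + Idx <= Base;
}

// Two's-complement negation of an unsigned offset, as the caller would do
// before handing the index to the GEP.
constexpr std::uint64_t modelNegate(std::uint64_t Offset) {
  return std::uint64_t{0} - Offset;
}

// base = 0x1000, offset = 16 is a perfectly valid `ptr - offset`.
// Without negation the check sees base + 16, which is above base: a
// false positive.
static_assert(!modelSubCheckPasses(0x1000, 16), "un-negated index trips");
// With negation the check sees base - 16, which is below base: it passes.
static_assert(modelSubCheckPasses(0x1000, modelNegate(16)), "negated passes");
// A subtraction that really goes below address 0 is still rejected.
static_assert(!modelSubCheckPasses(0x10, modelNegate(0x20)), "real overflow");
```

The model uses 64-bit values throughout for simplicity; it says nothing about the index widths in the IR the test checks.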
Comment on lines +14 to +18
Member

Do we want to use update_cc_test_checks?

Reviewer

I think this would be a good idea so that this test is easier to update when upstream inevitably changes the IR.

Member

Note that this is an -O0 test, so it is less fragile than the -O2 tests.

Author

I would rather keep this as is. I have not used that script in the past, but it creates around 30 CHECK lines, which I think is much more brittle than this.