fix ubsan issue involving mis-aligned copies#59
Conversation
* Rewrite the model inputs finding logistic * Put stateful shape handle in get input shape
.hpp files converted to .h
| T out; | ||
| std::memcpy(&out, get_input_op_params(index), sizeof(T)); | ||
| return out; | ||
| } |
There was a problem hiding this comment.
Why do we need to use memcpy if we are returning a scalar value anyways? Also, how about making this call more scalable so we can access any index?
template<typename T>
T get_input_op_param_aligned(size_t index, size_t i) const {
return ((const T *)(get_input_op_params(index)))[i];
}
This could be more aligned with what ggml api does as well:
https://github.com/ggml-org/llama.cpp/blob/57819b8d4b39d893408e51520dff3d47d1ebb757/ggml/src/ggml-impl.h#L151
There was a problem hiding this comment.
there's an assumption that a direct dereference is aligned to the size of the variable, so an int32_t* cast to a int64_t is only well-defined if the int32_t* naturally has an 8-byte alignment. I can move away from the memcpy since we know the underlying data is 4-byte aligned
I can go ahead and make it more scalable
There was a problem hiding this comment.
I see, but I was wondering how memcpy would fix this alignment since it should basically do a byte to byte copy regardless of the data types of src and dst pointers.
With the assumption that we want to read the first 8 bytes starting from the given memory address and return it's value as a size_t scalar value, I thought it should return the same value either case. Am I missing something?
Or if we intend to read 4 bytes (int32_t) from memory address, cast it to 8 byte scalar (size_t), then we should have been reading the value from an int32_t aligned pointer at first place.
I think @wine99 can explain better what was the reason of reading 8-byte value instead of 4.
There was a problem hiding this comment.
We are reading a size_t from the int32_t* op_params because, in the case of the VIEW op, the stored offset is indeed a size_t. See: https://github.com/ggml-org/llama.cpp/blob/cf23ee244717b7b41f092410991d0344b25620ea/ggml/src/ggml.c#L3649
|
Also, can you rebase the branch please? That should fix the failing tests. |
76e4057 to
e73b4d4
Compare
996b739 to
b6c83aa
Compare
84e2bdb to
1f1d900
Compare
36daf24 to
c6e06ee
Compare
fba54ea to
4f247b1
Compare
ae93ac3 to
efbc565
Compare
640d290 to
9f60666
Compare
UBSan was throwing some errors involving a misaligned read