|
Tagging subscribers to this area: @steveisok, @tommcdon, @dotnet/dotnet-diag |
|
I am not sure which one will go first. If these changes are independent, it would make more sense to submit them as independent PRs. |
ffa8a51 to 695f29e
jkotas
left a comment
One of the Microsoft maintainers will have to run internal VS debugger tests on this before it gets merged.
...coreclr/System.Private.CoreLib/src/System/Runtime/CompilerServices/RuntimeHelpers.CoreCLR.cs
|
Failures are known; the new one is an unrelated deadlettered leg. Results still match before/after #126809 (comment) |
|
Let me run the private diagnostic tests with this change. |
|
There are about 15 failed tests with this change (compared to the commit right before your change). Moreover, I've found that we also have some additional failures in these tests that one of your previous changes, #126222, has introduced. Since I have run those tests at one point during your PR, it seems that some changes made after that have introduced the problem. |
|
So, regarding the issues introduced by the old PR, the DacDbiInterfaceImpl::UnwindStackWalkFrame needs to be updated to skip frames of "System.Environment.CallEntryPoint". I've tried a quick hack (just comparing method name string) and found that fixes two of the three EH related failures. The remaining issue from the old PR is strange. The debugger can see a stack where the System.Environment.CallEntryPoint is at the top of the stack, with Main at the bottom. That doesn't make sense, I am looking at the test to see if I can isolate a standalone repro. |
I've found it actually does make sense: the frame on top of the stack is the filter funclet from System.Environment.CallEntryPoint. When the test breaks on an unhandled exception, it then performs a "step" command, which makes the debugger step into the filter funclet. It sounds like something that the diagnostics folks will need to think about, as I don't think we want the debugger to step into this internal filter when the user "steps" after the debugger breaks on an unhandled exception. |
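The "quick hack" mentioned above (comparing the method name string while walking the stack) can be pictured with a small standalone sketch. This is not DAC code: the `Frame` struct and the `skipInternalFrames` function are hypothetical, and only the method name `System.Environment.CallEntryPoint` comes from the discussion.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a managed frame as the debugger would see it.
struct Frame {
    std::string methodName;
};

// Method name taken from the discussion; matching on the string is the "quick hack".
static const std::string kEntryPointHelper = "System.Environment.CallEntryPoint";

// Returns the frames that would be reported to the user, dropping the runtime helper.
std::vector<Frame> skipInternalFrames(const std::vector<Frame>& walked) {
    std::vector<Frame> visible;
    for (const Frame& frame : walked) {
        if (frame.methodName == kEntryPointHelper) {
            continue;  // internal frame: hide it from the user-visible stack
        }
        visible.push_back(frame);
    }
    return visible;
}
```

A name comparison like this only hides the frame from the reported stack. It does not by itself address the separate stepping behavior with the filter funclet.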
Can we create a separate PR with fixes for these regressions? I would like to get to a clean state first before taking more risky changes. |
I was thinking of pushing the commits here for now and then splitting the PR once the private diagnostics tests pass. Cherry-picking these is simple, as they touch different files. This is to reduce the hassle for @janvorli of switching between branches. |
|
@am11 now regarding the failures introduced by the funceval changes. I can see a pattern of which the following is the simplest example. Running funceval in the testing debugger on a function that returns With your change: Stepping through the debugger itself, the difference is that with your change, the result type is CorElementType.ELEMENT_TYPE_CLASS. I have problems attaching VS to the testing debugger with the state before your change for some reason, but I think before it was CorElementType.ELEMENT_TYPE_BOOLEAN. |
|
@janvorli, it's interesting that vscode was showing me unboxed bools. I have just tested with VS 2022 and it is also showing me unboxed value: |
|
@am11 the bool issue is gone with your changes, but similar issues persist. With generics when the testing debugger prints a local variable: Actual: And I can actually see the same issue for plain structs. But I believe these were being boxed even before your change, so the issue must be something other than boxing. It is hard to debug it on the testing debugger side, as many things referring to the debuggee are COM objects and VS shows nothing interesting about those. I'll try to enable logging for LF_CORDB to see if it uncovers something interesting and possibly add extra logging. Or debug it on Linux, where I can attach lldb to the debuggee even if it is being debugged by the testing debugger. That isn't possible on Windows (except for noninvasive debugging, where you cannot step through code). |
|
Seems like generic types are printing as expected: @janvorli, I wonder if it's possible for us to add some tests for ICorDebug interface under src/tests? I'd imagine it shouldn't be impossible as it doesn't even need an actual debugger but rather mocked layer initializing and sending commands with code for funceval. |
|
@am11 I have found the issue is that your change treats special types like strings the same way as value types. I was misled into thinking the problem was boxing of the structs etc., but it turned out that the problematic cases were when the debugger used funceval to call ToString on a struct. Added logging shows that the funceval produced a string and, in the state before your change, it was setting boxing to Debugger::OnlyPrimitivesUnboxed. runtime/src/coreclr/debug/ee/funceval.cpp Lines 2932 to 2938 in 28deeb4 Then in this "if" below, it created the strong handle like it is done for value types. The IsElementTypeSpecial(retClassET) part of the condition makes it go into the "if" body runtime/src/coreclr/debug/ee/funceval.cpp Lines 2949 to 2957 in 28deeb4 I've tried to make your new code work that way, but apparently there is something else missing, as the debugger has printed "Error: Handle has been disposed". I have no idea what handle was being referred to yet. |
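To make the distinction in the comment above concrete, here is a toy model of the result-handling decision. It is a sketch under assumptions, not the code from funceval.cpp: the `ElemType` enum, `isElementTypeSpecial`, and `resultNeedsStrongHandle` are hypothetical names, and the exact set of "special" types is assumed. The only points taken from the discussion are that a string result passes the special-type check and gets a strong handle like a value type, while a primitive such as a bool is returned unboxed.

```cpp
// Simplified element-type tags; the names echo CorElementType but are illustrative only.
enum class ElemType { Boolean, I4, ValueType, Class, String, Object, SzArray };

// Toy version of the "special type" check named in the discussion (assumed shape).
bool isElementTypeSpecial(ElemType type) {
    return type == ElemType::Class || type == ElemType::String ||
           type == ElemType::Object || type == ElemType::SzArray;
}

// Hypothetical predicate: does the funceval result get a strong handle,
// rather than being copied out as an unboxed primitive?
bool resultNeedsStrongHandle(ElemType returnType) {
    const bool isValueType = (returnType == ElemType::ValueType);
    return isValueType || isElementTypeSpecial(returnType);
}
```

In this model a `String` result and a `ValueType` result both take the handle path, which is the similarity the comment describes. A `Boolean` result does not.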
Moved those changes to #126927 and reverted here. |
src/coreclr/debug/ee/funceval.cpp
Outdated
//
// Do Step 1e - Gather info from runtime about args (may trigger a GC).
//
// GC-protect all arg addresses as interior pointers before any GC-triggering
We are calling AllocateObject above that can trigger GC. How are the pointers inside GetArgData protected during that AllocateObject?
The pointers inside GetArgData are now GC-protected as interior pointers before AllocateObject (and all other GC-triggering calls). I've moved the pArgAddrs setup + GCPROTECT_BEGININTERIOR_ARRAY to the very top of DoNormalFuncEval, before ResolveFuncEvalGenericArgInfo and AllocateObject.
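The ordering concern in this exchange (protect references before any call that can trigger a GC) can be illustrated with a toy moving heap. This is a minimal sketch, not CoreCLR code: `ToyHeap`, `RootGuard`, and `argSurvivesAllocation` are hypothetical, slot indices stand in for interior pointers, and the real `GCPROTECT_BEGININTERIOR_ARRAY` macro is not modeled.

```cpp
#include <cstddef>
#include <vector>

// Toy moving collector: every allocation first "collects", which relocates all
// existing slots by one position and fixes up only the registered roots.
class ToyHeap {
public:
    void protect(std::size_t* root) { roots_.push_back(root); }
    void unprotect() { roots_.pop_back(); }

    // May relocate existing objects, like a GC-triggering allocation.
    std::size_t allocate(int value) {
        collect();
        slots_.push_back(value);
        return slots_.size() - 1;
    }

    int read(std::size_t index) const { return slots_.at(index); }

private:
    void collect() {
        slots_.insert(slots_.begin(), -1);          // shifts every live slot by one
        for (std::size_t* root : roots_) { ++*root; }  // only protected roots are updated
    }

    std::vector<int> slots_;
    std::vector<std::size_t*> roots_;
};

// RAII guard: registers a root on construction, unregisters it on destruction.
class RootGuard {
public:
    RootGuard(ToyHeap& heap, std::size_t* root) : heap_(heap) { heap_.protect(root); }
    ~RootGuard() { heap_.unprotect(); }
    RootGuard(const RootGuard&) = delete;
    RootGuard& operator=(const RootGuard&) = delete;

private:
    ToyHeap& heap_;
};

// Returns true if the argument reference is still valid after a later allocation.
bool argSurvivesAllocation(bool protectFirst) {
    ToyHeap heap;
    std::size_t arg = heap.allocate(42);
    if (protectFirst) {
        RootGuard guard(heap, &arg);  // protected BEFORE the GC-triggering call
        heap.allocate(7);
        return heap.read(arg) == 42;
    }
    heap.allocate(7);                 // arg was not registered, so it is now stale
    return heap.read(arg) == 42;
}
```

In this toy, `argSurvivesAllocation(true)` keeps the reference valid, while `argSurvivesAllocation(false)` reads a relocated slot. That is the same hazard the review question raises about `AllocateObject` running before the argument addresses are protected.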
|
@am11 even after your last change it doesn't work. I can see that DoNormalFuncEval took the expected path, the pDE->m_resultType was System.String, and the ucoGc.resultObj was of type System.String. So I've spent more time debugging it, and it turns out the problem is actually in the funceval argument processing. The debugger calls ToString on the struct via funceval, but we end up boxing the "this" argument. The original code didn't do that: dumping the type of *pObjectRefArray after the call to BoxFuncEvalThisParameter shows the struct in the old state, but the type of ucoGc.thisArg in your case is System.Object. |
|
It seems the original code would only box "this" argument if |
|
@janvorli, based on your description, I have pushed a change. When |




Built on top of #126542. Last part of #123864 before the final cleanups.