🍒 [lldb] SymbolLocatorSymStore 2/2 (http lookup)#12911
🍒 [lldb] SymbolLocatorSymStore 2/2 (http lookup)#12911weliveindetail wants to merge 30 commits intoswiftlang:swift/release/6.4.xfrom
Conversation
…#156306) This uses split DWARF and from looking at other tests, it should not be running on Darwin or Windows. It does pass using the DIA PDB plugin but I think this is misleading because it's not actually testing the intended feature. When the native PDB plugin is used it fails because it cannot set a breakpoint. I don't see a point to running this test on Windows at all. Native PDB plugin test failures are being tracked in llvm#114906.
link.exe discards DWARF information. Other linkers on non-Windows do not. Uses the same solution as TestFrameFunctionInlined.test. This test was failing with the native PDB plugin but shouldn't have been using PDB anyway (see llvm#114906). Passes with DWARF and lld.
…lvm#153160) Relands llvm#152295. Checking for the overloads did not account for them being out of order. For example, [the failed output](llvm#152295 (comment)) contained the overloads, but out of order. The last commit here fixes that by using `-DAG`. --------- Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>
…153382) Some functions don't have the `FunctionType` set. We can't look these up and won't be able to call them, so ignore them when caching the function names. This does fix the failures caused by llvm#153160 mentioned in llvm#153160 (comment). However, in `lldb-shell::msstl_smoke.cpp` there's another failure not introduced by llvm#153160 (fixed with llvm#153386).
This PR implements `SymbolFileNativePDB::AddSymbols` which adds public symbols to the symbol table. These symbols are found in the publics stream. It contains mangled names coupled with addresses. Addresses are a pair of (segment, offset). If I understood correctly, then the segment is the section ID from the COFF header. Sections are already [constructed](https://github.com/llvm/llvm-project/blob/c48ec7fb60b5e0b4100731d75f82ea63c0ec7b45/lldb/source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp#L1048) using this 1-based index ([MS docs](https://learn.microsoft.com/en-us/windows/win32/debug/pe-format#section-table-section-headers)). This allows us to use `section_list->FindSectionByID`.
In llvm#165604, a test was skipped on Windows, because the native PDB plugin didn't set sizes on symbols. While the test isn't compiled with debug info, it's linked with `-gdwarf`, causing a PDB to be created on Windows. This PDB will only contain the public symbols (written by the linker) and section information. The symbols themselves don't have a size, however the DIA SDK sets a size for them. It seems like, for these data symbols, the size given from DIA is the distance to the next symbol (or the section end). This PR implements the naive approach for the native plugin. The main difference is in function/code symbols. There, DIA searches for a corresponding `S_GPROC32` which have a "code size" that is sometimes slightly smaller than the difference to the next symbol.
When a PDB is loaded through `target symbols add <pdb-path>`, its `m_objectfile_sp` is an `ObjectFilePDB` instead of `ObjectFilePECOFF` (the debugged module). In both the native and DIA plugin, some paths assumed that `m_objectfile_sp` is the debugged module. With this PR, they go through `m_objfile_sp->GetModule()->GetObjectFile()`. For the DIA plugin, this lead to an assertion failure (llvm#169628 (comment)) and for both plugins, it meant that the symbol table wasn't loaded.
…vm#185658) Minimal infrastructure for a the SymbolLocator plugin that fetches debug info from Microsoft SymStore repositories. This can work cross-platform and for various debug info formats in principle, but the current plan is focussed on PE/COFF on Windows with debug info in PDB files. Once we have a stable first version, we'd like to add features like download, environment variables, caching and progress feedback for users. SymbolVendorPECOFF was tailored towards DWARF debug info so far. I added code to load the PDB path from the executable (it only checked gnu_debuglink so far) and not bail out if DWARF sections are missing, so that in the PDB case we still call AddSymbolFileRepresentation() in the very end of CreateInstance(). The API test in this patch mocks the directory layout from SymStore, so it doesn't depend on SymStore.exe from the Windows SDK. It runs on all platforms that link debug info in a PDB file, which is still just Windows, but it could be cross-platform in principle. ----- Relands with minor fixes: API tests create mocked SymStore in the test's build directory. One log instruction was moved. One more object access goes through module in SymbolFile.
…ib (NFC)" (llvm#186074) Relocate HTTPClient and HTTPServer from the Debuginfod library to llvm/Support/HTTP so they can be reused by other components. --------- Relanding with fixes in CMakeLists.txt to account for dependency to new LLVMSupportHTTP in tools. Relanding with one more fix in libSupportHTTP that adds it as a component in libLLVM.so --------- Co-authored-by: Alexandre Ganea <aganea@havenstudios.com> Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>
Referencing libSupportHTTP under LINK_LIBS of add_lldb_library() pulls in the archive even in a build configuration with LLVM_LINK_LLVM_DYLIB=On, where libSupportHTTP is part of libLLVM. This patch moves it to LINK_COMPONENTS to fix the issue.
Deleting the executable at the end of this API test-case fails with a permission error, likely because lldb still holds a reference to the EXE. Exit explicitly to avoid that.
…lvm#187072) Deleting the inferior binary after an API test-case causes issues on one of the Windows bots. The previous the fix attempt in ca15db1 didn't succeed. We have to use isolated subfolders for each test-case. This is achieved easily by disabling SHARED_BUILD_TESTCASE.
The initial version of SymbolLocatorSymStore supported servers only on local paths. This patch extends it to HTTP/HTTPS end-points. For that to work on Windows, we add a WinHTTP-based HTTP client backend in LLVM next to the existing CURL-based implementation. We don't add a HTTP server implementation, because there is no use right now. Test coverage for the LLVM part is built on llvm-debuginfod-find and works server-less, since it checks textual output of request headers. The existing CURL-based implementation uses the same approach. The LLDB API test for the specific SymbolLocatorSymStore feature spawns a HTTP server from Python. To keep the size of this patch within reasonable limits, the initial implementation of the SymbolLocatorSymStore feature is dump: There is no caching, no verification of downloaded files and no protection against file corruptions. We use a local implementation of LLVM's HTTPResponseHandler, but should think about extracting and reusing Debuginfod's StreamedHTTPResponseHandler in the future. The goal of this patch is a basic working implementation that is testable in isolation. We will iterate on it to improve it further. This should be fine since downloading from SymStores is not default-enabled yet. Users have to set their server URLs explicitly. --------- Co-authored-by: Alexandre Ganea <aganea@havenstudios.com>
…case (llvm#187705) Listening on all interfaces is probably not permitted on the bots and causes failures of llvm-debuginfod-find/headers-winhttp.test after 39d6bb2. Restricting them to localhost should fix that.
…lvm#187727) a3db68a seemed t be the obvious fix for the winhttp issue from 39d6bb2 in llvm-debuginfod-find, but there are still bots failing. This patch disables the test on all bots that cannot spawn an HTTP server in Python and record request headers. Ideally it turns all affected bots back to green and gives us an error message to investigate.
…ots temporarily (llvm#187753) Windows bots are still failing after a3db68a and d7dbba5. This test is new, let's take it off while we investigate.
Missed this when reviewing llvm#186986. This fixes the warnings to follow the [LLVM Coding Standards](https://llvm.org/docs/CodingStandards.html#error-and-warning-messages).
The cache path we get from the `llvm::sys::path::cache_directory()` library function is not writeable on some Windows bots as they point us to the Default user's AppData: C:\Users\Default\AppData\Local\lldb. This patch avoids the problem and downloads into the temporary directory that we query from `llvm::sys::path::system_temp_directory()`. Once we introduce real caching, this would have changed anyway.
…g on plain HTTP (llvm#188970) This patch only adds the condition, so the flags are applied only for HTTPS URLs. No change in implementation.
Since libSupportHTTP is part of the LLVM dylib, we must link it as a component now. Fixes llvm#189978
…rSymStore (llvm#187687) SymbolLocatorSymStore used a simple local implementation of HTTPResponseHandler so far. That was fine for basic usage, but it would cause issues down the line. This patch hoists the StreamedHTTPResponseHandler class from libDebuginfod to SupportHTTP and integrates it in SymbolLocatorSymStore. PDB file downloads will now be buffered on disk, which is necessary since they can be huge. We use the opportunity an stop logging 404 responses (file not found on server) and print warnings for all other erroneous HTTP responses. It was more complicated before, because the old response handler created the underlying file in any case. The new one does that only once the first content package comes in.
The mechanism to disable `SHARED_BUILD_TESTCASE` for tests that set `TEST_WITH_PDB_DEBUG_INFO` doesn't work. The property was set on the wrong object. This patch fixes it and moves the assignment after the for-loop, since the respective dict only exists there.
We explicitly link against the winhttp library by declaring it in CMake,
in llvm/lib/Support/HTTP/CMakeLists.txt. We therefore don't need to
specify linking against it through a pragma (which embeds a directive in
the object file).
This fixes warnings when building in mingw mode, where this kind of
pragma isn't supported:
llvm-project/llvm/lib/Support/HTTP/HTTPClient.cpp:149:9: warning:
unknown pragma ignored [-Wunknown-pragmas]
149 | #pragma comment(lib, "winhttp.lib")
| ^
The HTTP implementation depends on CURL and would preferably not be part of the LLVM dylib. This was not possible as a nested library under libSupport, because libSupport itself is part of the LLVM dylib. This patch moves the HTTP code into a separate top-level library that is independent from libSupport and excluded from the LLVM dylib.
…re (llvm#191782) The _NT_SYMBOL_PATH environment variable is the idiomatic way to set a system-wide lookup order of symbol servers and a local cache for SymStore. It holds a semicolon-separated list of entries in the following notations: * srv*[<cache>*]<source> sets a source and an optional explicit cache * cache*<cache> sets an implicit cache for all subsequent entries * all other entries are bare local directories Since symbol paths are closely intertwined with the caching of symbol files, this patch proposes support in LLDB for both features at once. ParseEnvSymbolPaths() implements the parsing logic, which processes entries of the symbol path string from left to right to create a series of LookupEntry objects that each store a source and a cache location. The source of a LookupEntry can be a local directory or an HTTP server address. The cache is a local directory or empty. This representation unifies the implicit vs. explicit caching options from the SymStore protocol. The lookup remains in LocateSymStoreEntry() which we now invoke for each LookupEntry. Here we distinguish sources between HTTP servers and local directories. The latter doesn't change. We just moved the logging to clearly express the new cases. For HTTP downloads we now check the cache first and add files to it after download. The download itself keeps targeting a temporary file to avoid corrupt cache entries in case of interruptions. After all, the SymStore protocol has no check-sums! We define a default cache path that is used as a fallback, if the symbol path doesn't specify any. It can be overridden through the plugin.symbol-locator.symstore.cache property. This is analog to Debuginfod and very handy since we want to move files out from temp after download. The default cache path is what we use if there are no other options. The symbol path notation in the SymStore protocol carries quite some legacy. The SymStoreTest unittest checks that we can parse all the obscure combinations of possible server and cache entries. --------- Co-authored-by: Nerixyz <nero.9@hotmail.de>
Using self-signed certificates is the only way forward for testing security features on the HTTPS path. As we don't want to allow any arbitrary certificate, we add a new property that pins a fingerprint and any self-signed certificate is only accepted if it matches this fingerprint.
|
@swift-ci please test |
|
Test failure on macOS seems unrelated to this PR: |
|
The Windows bot reports 3 test failures: (1) Tests in (2) The (3) Finally, the test for the SymStore feature itself failed: The build against stable/21.x doesn't reproduce (1) while it has the test enabled as well. |
@justice-adams-apple are the bots preventing running http requests to localhost? |
|
I split off the non-HTTP parts into #12917 |
|
@swift-ci please test macOS platform |
Upstream LLDB gained support for fetching debug info from SymStore with a new symbol locator implementation. This PR aims to bring the feature to Swift ahead of the next rebranch.
There are 3 kinds of cherry-picks here:
Commits from category 3 are independent from the others and were picked first. We need them, because the SymbolLocatorSymStore implementation required a change in the SymbolFilePDB/NativePDB plugins: They are loaded through the SymbolVendorPECOFF now and not fall back to the generic SymbolVendor anymore. This was a father simple change upstream, because it had switched the default to SymbolFileNativePDB already and test inconsistencies had been fixed. This change wasn't cherry-picked, so SymbolFilePDB is the default still and required these compatibility patches.
Commits from categories 1 and 2 are interleaved, because they are partially overlapping. Order matches upstream. LLVM test for WinHTTP use the existing Debuginfod infrastructure that was tailored for CURL. Applying the patches went pretty smooth. I saw conflicts due to permanent differences downstream that I could resolve cleanly, but nothing critical. I skipped two commits that are loosely related as they'd have blown up the patch:
I back-ported the 1-line tablegen diff from cdbe288 in my code. And eventually I didn't need fb36a54, because the symbol locator switched to LLDB_LOG anyway.