perf(sdb): optimize sdbWriteFileImp to eliminate lock contention and reduce I/O syscalls.#35286
xiao-77 wants to merge 1 commit into
Code Review
This pull request optimizes the SDB checkpointing process by introducing a write buffer to reduce syscalls, moving encryption key checks outside the main loop, and implementing a two-phase row snapshotting mechanism to minimize lock contention. Feedback highlights a potential data consistency issue due to the 'fuzzy checkpoint' behavior where locks are released before encoding. Additionally, improvements were suggested for error handling during key loading and optimizing memory management for the encryption buffer.
// Perf-A: Phase 2 — encode and write with no lock held.
// Pinned rows remain valid in memory even if concurrently deleted from the
// hash. Re-check status before encoding: a row that transitioned to DROPPED
// after the snapshot must not be written to the checkpoint file.
Releasing the table lock before encoding and writing rows introduces a "fuzzy checkpoint" behavior. While this eliminates lock contention, it means that concurrent updates to the same row could result in inconsistent data being written to the checkpoint file (e.g., a row containing a mix of old and new field values if the encoding process is not atomic relative to updates). If the system requires strict consistency for the checkpoint file, the lock should be held during the encoding phase, or an MVCC-like mechanism should be used.
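To make the trade-off concrete, here is a minimal sketch of the two-phase pattern described in the diff comment. It is not the PR's code: `Row`, `Table`, `snapshotRows`, `writeSnapshot`, and `snapshotDemo` are hypothetical names, a plain `atomic_flag` spinlock stands in for the table's read-write lock, and copying an `int` payload stands in for encoding and writing a row.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

enum { ROW_READY = 0, ROW_DROPPED = 1 }; /* hypothetical status values */

typedef struct {
  atomic_int refCount; /* pin count: keeps the row alive after unlock */
  atomic_int status;
  int        payload;  /* stand-in for the row's encodable fields */
} Row;

typedef struct {
  atomic_flag lock;    /* spinlock standing in for the table lock */
  Row        *rows[8];
  size_t      count;
} Table;

static void tableLock(Table *t) {
  while (atomic_flag_test_and_set(&t->lock)) { /* spin */ }
}
static void tableUnlock(Table *t) { atomic_flag_clear(&t->lock); }

/* Phase 1: hold the lock only long enough to pin and collect pointers. */
size_t snapshotRows(Table *t, Row **out, size_t cap) {
  size_t n = 0;
  tableLock(t);
  for (size_t i = 0; i < t->count && n < cap; i++) {
    atomic_fetch_add(&t->rows[i]->refCount, 1);
    out[n++] = t->rows[i];
  }
  tableUnlock(t);
  return n;
}

/* Phase 2: no lock held. Re-check status, "encode", then unpin.
 * Reading r->payload here is the fuzzy window the review describes:
 * a concurrent writer could change the row's fields mid-encode. */
size_t writeSnapshot(Row **snap, size_t n, int *sink) {
  size_t written = 0;
  for (size_t i = 0; i < n; i++) {
    Row *r = snap[i];
    if (atomic_load(&r->status) != ROW_DROPPED) {
      sink[written++] = r->payload;
    }
    atomic_fetch_sub(&r->refCount, 1);
  }
  return written;
}

/* Drops one row between the two phases and returns how many were written. */
size_t snapshotDemo(void) {
  Row a = {0, ROW_READY, 10};
  Row b = {0, ROW_READY, 20};
  Row c = {0, ROW_READY, 30};
  Table t = {ATOMIC_FLAG_INIT, {&a, &b, &c}, 3};
  Row *snap[8];
  int  sink[8];

  size_t n = snapshotRows(&t, snap, 8);
  atomic_store(&b.status, ROW_DROPPED); /* simulated concurrent drop */
  size_t written = writeSnapshot(snap, n, sink);

  assert(atomic_load(&a.refCount) == 0);
  assert(atomic_load(&b.refCount) == 0);
  assert(atomic_load(&c.refCount) == 0);
  return written;
}
```

The status re-check filters out rows dropped after the snapshot, but it does not protect against a row whose fields are modified while it is being encoded. Closing that window would need either the lock held through encoding or a per-row copy taken under the lock, as the comment above suggests.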
if (taosWaitCfgKeyLoaded() != 0) {
  code = terrno;
if (needed > encBufLen) {
  taosMemoryFree(encBuf);
  encBuf = taosMemoryMalloc(needed);
  if (encBuf == NULL) {
    code = terrno;
    encBufLen = 0;
  } else {
    encBufLen = needed;
  }
}
Instead of freeing and re-allocating the encryption buffer when a larger size is needed, consider using taosMemoryRealloc. This is generally more efficient as it may avoid a full copy and re-allocation if the current memory block can be extended in place.
if (needed > encBufLen) {
  char *tmp = taosMemoryRealloc(encBuf, needed);
  if (tmp == NULL) {
    code = terrno;
  } else {
    encBuf = tmp;
    encBufLen = needed;
  }
}
Pull request overview
This PR optimizes the mnode SDB checkpoint writer (sdbWriteFileImp) to reduce lock contention and cut down on write() syscalls during sdb.data generation.
Changes:
- Add a 256 KiB write buffer (sdbBufWrite/sdbFlushBuf) to batch writes and reduce syscall frequency.
- Move taosWaitCfgKeyLoaded() to a one-time check before row loops.
- Snapshot row pointers under a read lock and reuse an encryption buffer to avoid per-row malloc/free.
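As an illustration of the first item, the following is a rough sketch of how a fixed-size write buffer batches many small records into a few write() calls. It is not the PR's sdbBufWrite/sdbFlushBuf: `WBuf`, `wbufWrite`, `wbufFlush`, and `wbufDemo` are hypothetical names, and only the 256 KiB capacity is taken from the description above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define WBUF_CAP (256 * 1024) /* 256 KiB, the size named in the PR */

/* Hypothetical buffered writer, for illustration only. */
typedef struct {
  int    fd;
  size_t len;       /* bytes currently buffered */
  long   syscalls;  /* successful write() calls, counted for the demo */
  char   data[WBUF_CAP];
} WBuf;

/* Push everything buffered to the fd, tolerating short writes and EINTR. */
int wbufFlush(WBuf *b) {
  size_t off = 0;
  while (off < b->len) {
    ssize_t n = write(b->fd, b->data + off, b->len - off);
    if (n < 0) {
      if (errno == EINTR) continue;
      return -1;
    }
    b->syscalls++;
    off += (size_t)n;
  }
  b->len = 0;
  return 0;
}

/* Append to the buffer; only touch the fd when the buffer is full. */
int wbufWrite(WBuf *b, const void *src, size_t n) {
  const char *p = src;
  while (n > 0) {
    if (b->len == WBUF_CAP && wbufFlush(b) != 0) return -1;
    size_t room  = WBUF_CAP - b->len;
    size_t chunk = n < room ? n : room;
    memcpy(b->data + b->len, p, chunk);
    b->len += chunk;
    p += chunk;
    n -= chunk;
  }
  return 0;
}

/* Writes nRecs records of recLen bytes (capped at 256) to a temp file.
 * Returns the number of write() calls used, or -1 on failure. */
long wbufDemo(size_t recLen, int nRecs) {
  char rec[256];
  if (recLen > sizeof rec) recLen = sizeof rec;
  memset(rec, 'x', recLen);

  FILE *f = tmpfile();
  if (f == NULL) return -1;
  WBuf *b = calloc(1, sizeof *b);
  if (b == NULL) { fclose(f); return -1; }
  b->fd = fileno(f);

  long result = -1;
  int  ok = 1;
  for (int i = 0; i < nRecs && ok; i++) ok = (wbufWrite(b, rec, recLen) == 0);
  if (ok && wbufFlush(b) == 0 &&
      lseek(b->fd, 0, SEEK_END) == (off_t)recLen * nRecs) {
    result = b->syscalls;
  }
  free(b);
  fclose(f);
  return result;
}
```

Calling `wbufDemo(100, 10000)` writes ten thousand 100-byte records; an unbuffered writer would issue one write() per record, whereas here the count is bounded by the total size divided by the buffer capacity (plus any short writes). The caller must remember the final `wbufFlush`, otherwise the tail of the file is lost.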
// Perf-A: Phase 2 — encode and write with no lock held.
// Pinned rows remain valid in memory even if concurrently deleted from the
// hash. Re-check status before encoding: a row that transitioned to DROPPED
// after the snapshot must not be written to the checkpoint file.
int32_t rowCount = (int32_t)taosArrayGetSize(pRowList);
for (int32_t j = 0; j < rowCount; j++) {
  SSdbRow *pRow = *(SSdbRow **)taosArrayGet(pRowList, j);

if (tsMetaKey[0] != '\0') {
  newDataLen = ENCRYPTED_LEN(pRaw->dataLen);
  newData = taosMemoryMalloc(newDataLen);
  if (newData == NULL) {
    code = terrno;
    taosHashCancelIterate(hash, ppRow);
    sdbFreeRaw(pRaw);
    break;
  }
if (code == 0) {
  if (pRow->status == SDB_STATUS_DROPPED) {
    sdbPrintOper(pSdb, pRow, "not-write");
  } else {
    sdbPrintOper(pSdb, pRow, "write");

SSdbRaw *pRaw = (*encodeFp)(pRow->pObj);
if (pRaw == NULL) {
Description
Issue(s)
Checklist
Please check the items in the checklist if applicable.