Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 15 additions & 3 deletions db_stress_tool/db_stress_test_base.cc
Original file line number Diff line number Diff line change
Expand Up @@ -110,6 +110,10 @@ void StressTest::CleanUpColumnFamilies() {
delete cf;
}
column_families_.clear();
for (auto* cf : old_column_families_) {
delete cf;
}
old_column_families_.clear();
for (auto* cf : secondary_cfhs_) {
delete cf;
}
Expand Down Expand Up @@ -796,9 +800,17 @@ Status StressTest::SetOptions(ThreadState* thread) {
}

Options StressTest::GetOptions(int cf_id) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since you are here, we should probably fix the type, uint32_t is the right type for cf_id

auto cfh = column_families_[cf_id];
assert(cfh);
return db_->GetOptions(cfh);
Comment on lines -799 to -801
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old code treat CF id as the index of column_families_. This is right, if CF is never deleted and recreated. Given we do have code that recreate CF (although this is never called, that's why the bug was never triggered), we should not hold this assumption anymore. I think the fix is correct. Meantime, once we re-enable the drop/recreate test, accessing it would be better guarded by a mutex for thread safety. Although since it is just a pointer of array that is never resized, most hardware would guarantee thread safety on raw pointer read/write, but tsan might warn about this.
Meantime, on re-enable drop/recreate, I don't know whether the stress test handles this gracefully. Essentially whether it would create concurrent write/read failure on a dropped CF properly. Also the re-constructed expected state might need to be updated as well. In short, to re-enable this functionality test, it will likely require a lot more work than this fix. Meantime, dropping a CF concurrently while another thread is read/write it is likely rare in production. Application typically do synchronization by themself on concurrent drop. So I don't know how much value would this provide. If we do have this kind of use case, then I agree we should spend effort on fixing this.

// cf_id is the RocksDB internal column family ID, not the index into
// column_families_. After a CF drop+recreate, the new CF gets a higher ID
// that may exceed column_families_.size(). Search for the matching handle.
for (auto* cfh : column_families_) {
if (cfh && cfh->GetID() == static_cast<uint32_t>(cf_id)) {
return db_->GetOptions(cfh);
}
}
Comment on lines -799 to +810
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On cf id not found, we used to cause segfault, accessing nullptr. Now we just return default options_. Although with delete/recreate, this is a valid case we need to handle properly for remote compaction. I think it is better to surface CF not found error back to caller in remote compaction and have remote compaction just abort the compaction gracefully.

// CF was dropped and recreated with a new ID. Return base options since the
// remote compaction for the old CF will fail anyway.
return options_;
}

void StressTest::ProcessRecoveredPreparedTxns(SharedState* shared) {
Expand Down
3 changes: 3 additions & 0 deletions db_stress_tool/db_stress_test_base.h
Original file line number Diff line number Diff line change
Expand Up @@ -427,6 +427,9 @@ class StressTest {
Options options_;
SystemClock* clock_;
std::vector<ColumnFamilyHandle*> column_families_;
// Handles from dropped column families whose deletion is deferred because
// other threads may still hold pointers to them.
std::vector<ColumnFamilyHandle*> old_column_families_;
std::vector<std::string> column_family_names_;
std::atomic<int> new_column_family_name_;
int num_times_reopened_;
Expand Down
12 changes: 10 additions & 2 deletions db_stress_tool/no_batched_ops_stress.cc
Original file line number Diff line number Diff line change
Expand Up @@ -607,14 +607,22 @@ class NonBatchedOpsStressTest : public StressTest {
}
thread->shared->LockColumnFamily(cf);
Status s = db_->DropColumnFamily(column_families_[cf]);
delete column_families_[cf];
if (!s.ok()) {
fprintf(stderr, "dropping column family error: %s\n",
s.ToString().c_str());
thread->shared->SafeTerminate();
}
// Use a local variable for CreateColumnFamily to avoid setting
// column_families_[cf] to nullptr (CreateColumnFamilyImpl zeroes the
// output handle first). Other threads may concurrently read
// column_families_[cf] without holding the CF lock.
ColumnFamilyHandle* new_cfh = nullptr;
s = db_->CreateColumnFamily(ColumnFamilyOptions(options_), new_name,
&column_families_[cf]);
&new_cfh);
// Defer deletion of the old handle because other threads may still
// hold pointers to it. It will be cleaned up in CleanUpColumnFamilies.
old_column_families_.push_back(column_families_[cf]);
column_families_[cf] = new_cfh;
column_family_names_[cf] = new_name;
thread->shared->ClearColumnFamily(cf);
if (!s.ok()) {
Expand Down
Loading